tessl/pypi-jmcomic

Python API for accessing and downloading content from JMComic with Cloudflare bypass and plugin system.

Text and Data Processing

Utilities for text processing, HTML parsing, API response adaptation, image processing, and cryptographic operations. They support the core download and client functionality with URL and ID parsing, data extraction, content processing, and security operations.

Types

from typing import Dict, Any, List, Optional, Union, Pattern, Match

Capabilities

Text Processing Utilities

Text processing tools for URL handling, domain extraction, and ID parsing.

class JmcomicText:
    """
    Text processing utilities for URL parsing, domain extraction, and ID parsing.
    
    Provides essential text manipulation functions for working with JMComic
    URLs, domain names, and content identifiers.
    
    Static Methods:
    - parse_to_jm_id(text): Parse text to extract JM IDs
    - extract_domain(url): Extract domain from URL
    - normalize_url(url): Normalize URL format
    - is_valid_jm_id(jm_id): Validate JM ID format
    - parse_album_id(text): Extract album ID from text
    - parse_photo_id(text): Extract photo ID from text
    - clean_filename(filename): Clean filename for filesystem
    - format_title(title): Format title for display
    """
    
    @staticmethod
    def parse_to_jm_id(text: Union[str, int]) -> str:
        """
        Parse text or URL to extract JM ID.
        
        Handles various input formats including URLs, raw IDs,
        and text containing IDs.
        
        Parameters:
        - text: str or int - Text containing JM ID
        
        Returns:
        str - Extracted and normalized JM ID
        
        Raises:
        ValueError - If no valid ID found
        """
    
    @staticmethod
    def extract_domain(url: str) -> str:
        """
        Extract domain from URL.
        
        Parameters:
        - url: str - URL to parse
        
        Returns:
        str - Extracted domain name
        """
    
    @staticmethod
    def normalize_url(url: str) -> str:
        """
        Normalize URL format for consistent processing.
        
        Parameters:
        - url: str - URL to normalize
        
        Returns:
        str - Normalized URL
        """
    
    @staticmethod
    def is_valid_jm_id(jm_id: Union[str, int]) -> bool:
        """
        Validate JM ID format.
        
        Parameters:
        - jm_id: str or int - ID to validate
        
        Returns:
        bool - True if valid JM ID format
        """
    
    @staticmethod
    def clean_filename(filename: str) -> str:
        """
        Clean filename for filesystem compatibility.
        
        Removes or replaces invalid characters for safe file operations.
        
        Parameters:
        - filename: str - Original filename
        
        Returns:
        str - Cleaned filename safe for filesystem
        """

Usage examples:

# Parse various ID formats
jm_id = JmcomicText.parse_to_jm_id("https://example.com/album/123456")
jm_id = JmcomicText.parse_to_jm_id("123456")
jm_id = JmcomicText.parse_to_jm_id("Album ID: 123456")

# Validate IDs
is_valid = JmcomicText.is_valid_jm_id("123456")

# Clean filenames
safe_filename = JmcomicText.clean_filename("Album: Title with/invalid\\chars")
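
The calls above only show inputs. As a rough illustration of the kind of logic such helpers could contain, here is a minimal standalone sketch using only the standard library. The names `sketch_parse_id` and `sketch_clean_filename`, the regular expressions, and the choice of `_` as the replacement character are all assumptions made for this example; they are not taken from the library's source.

```python
import re
from typing import Union

# Hypothetical patterns for illustration; the library's real patterns are not shown here.
_ID_IN_URL = re.compile(r"/(?:album|photo)/(\d+)")
_BARE_DIGITS = re.compile(r"\d+")
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def sketch_parse_id(text: Union[str, int]) -> str:
    """Return the first numeric ID found in text, or raise ValueError."""
    text = str(text)
    # Prefer an ID that follows /album/ or /photo/ in a URL path,
    # then fall back to the first run of digits anywhere in the text.
    match = _ID_IN_URL.search(text) or _BARE_DIGITS.search(text)
    if match is None:
        raise ValueError(f"no ID found in {text!r}")
    # The URL pattern has one capture group; the fallback pattern has none.
    return match.group(match.lastindex or 0)


def sketch_clean_filename(filename: str, replacement: str = "_") -> str:
    """Replace characters that are commonly invalid in file names."""
    cleaned = _INVALID_CHARS.sub(replacement, filename)
    # Leading/trailing spaces and dots cause trouble on some filesystems.
    return cleaned.strip(" .")
```

Under these assumptions, a URL, a bare ID, and free text containing an ID all reduce to the same digit string, and input with no digits raises `ValueError`, matching the behavior the docstring describes.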

HTML Parsing and Pattern Matching

Tools for parsing HTML content and extracting data using regular expressions.

class PatternTool:
    """
    Regular expression utilities for HTML parsing and data extraction.
    
    Provides pre-compiled patterns and matching utilities for
    extracting structured data from HTML pages.
    
    Class Attributes:
    - ALBUM_ID_PATTERN: Pattern - Regex for album ID extraction
    - PHOTO_ID_PATTERN: Pattern - Regex for photo ID extraction
    - IMAGE_URL_PATTERN: Pattern - Regex for image URL extraction
    - TITLE_PATTERN: Pattern - Regex for title extraction
    
    Static Methods:
    - match_album_info(html): Extract album information from HTML
    - match_photo_info(html): Extract photo information from HTML
    - match_image_urls(html): Extract image URLs from HTML
    - find_all_matches(pattern, text): Find all regex matches
    """
    
    @staticmethod
    def match_album_info(html: str) -> Dict[str, Any]:
        """
        Extract album information from HTML content.
        
        Parameters:
        - html: str - HTML content to parse
        
        Returns:
        dict - Extracted album information
        """
    
    @staticmethod
    def match_photo_info(html: str) -> Dict[str, Any]:
        """
        Extract photo information from HTML content.
        
        Parameters:
        - html: str - HTML content to parse
        
        Returns:
        dict - Extracted photo information
        """
    
    @staticmethod
    def match_image_urls(html: str) -> List[str]:
        """
        Extract image URLs from HTML content.
        
        Parameters:
        - html: str - HTML content to parse
        
        Returns:
        List[str] - List of extracted image URLs
        """
    
    @staticmethod
    def find_all_matches(pattern: Pattern, text: str) -> List[Match]:
        """
        Find all regex matches in text.
        
        Parameters:
        - pattern: Pattern - Compiled regex pattern
        - text: str - Text to search
        
        Returns:
        List[Match] - List of regex match objects
        """
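
The stubs above do not show the patterns themselves. The following sketch illustrates how a `find_all_matches`-style helper and an image URL extractor could be built on `re.finditer`. The `IMG_SRC_PATTERN` regex and both function names are hypothetical stand-ins, and real pages may need a more tolerant pattern or a proper HTML parser.

```python
import re
from typing import List

# Illustrative pattern only; the actual IMAGE_URL_PATTERN is not shown in this document.
IMG_SRC_PATTERN = re.compile(r'<img[^>]+src="([^"]+)"')


def sketch_find_all_matches(pattern: re.Pattern, text: str) -> List[re.Match]:
    """Collect every non-overlapping match object for pattern in text."""
    return list(pattern.finditer(text))


def sketch_match_image_urls(html: str) -> List[str]:
    """Return the src attribute of each <img> tag, in document order."""
    return [m.group(1) for m in sketch_find_all_matches(IMG_SRC_PATTERN, html)]
```

Returning match objects rather than strings keeps group and position information available to callers, which is presumably why the documented signature returns `List[Match]`.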

Page Processing Tools

Specialized tools for processing HTML pages and extracting structured data.

class JmPageTool:
    """
    HTML page parsing and data extraction utilities.
    
    Provides high-level functions for parsing JMComic HTML pages
    and extracting structured data for albums, photos, and searches.
    
    Static Methods:
    - parse_album_page(html): Parse album detail page
    - parse_photo_page(html): Parse photo detail page  
    - parse_search_page(html): Parse search results page
    - parse_category_page(html): Parse category listing page
    - extract_pagination(html): Extract pagination information
    - extract_metadata(html): Extract page metadata
    """
    
    @staticmethod
    def parse_album_page(html: str) -> 'JmAlbumDetail':
        """
        Parse album detail page HTML to extract album information.
        
        Parameters:
        - html: str - Album page HTML content
        
        Returns:
        JmAlbumDetail - Parsed album with metadata and episodes
        """
    
    @staticmethod
    def parse_photo_page(html: str) -> 'JmPhotoDetail':
        """
        Parse photo detail page HTML to extract photo information.
        
        Parameters:
        - html: str - Photo page HTML content
        
        Returns:
        JmPhotoDetail - Parsed photo with metadata and images
        """
    
    @staticmethod
    def parse_search_page(html: str) -> 'JmSearchPage':
        """
        Parse search results page HTML.
        
        Parameters:
        - html: str - Search page HTML content
        
        Returns:
        JmSearchPage - Parsed search results with albums and pagination
        """
    
    @staticmethod
    def extract_pagination(html: str) -> Dict[str, Any]:
        """
        Extract pagination information from page.
        
        Parameters:
        - html: str - HTML content with pagination
        
        Returns:
        dict - Pagination data (current_page, total_pages, has_next, etc.)
        """
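
To make the shape of the pagination dictionary concrete, here is a small sketch that derives `current_page`, `total_pages`, and `has_next` from page links. It assumes, purely for illustration, that page links carry a `page=N` query parameter and that the caller already knows the current page; the real markup and the real extraction logic are not described in this document.

```python
import re
from typing import Any, Dict

# Hypothetical link format: any href containing ?page=N or &page=N.
_PAGE_LINK = re.compile(r"[?&]page=(\d+)")


def sketch_extract_pagination(html: str, current_page: int = 1) -> Dict[str, Any]:
    """Build a pagination dict from the page numbers linked in html."""
    linked_pages = [int(n) for n in _PAGE_LINK.findall(html)]
    # The highest linked page is treated as the last page; with no links,
    # the current page is assumed to be the only one.
    total_pages = max(linked_pages + [current_page])
    return {
        "current_page": current_page,
        "total_pages": total_pages,
        "has_next": current_page < total_pages,
    }
```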

API Response Processing

Tools for processing and adapting API responses from different client types.

class JmApiAdaptTool:
    """
    API response adaptation and transformation utilities.
    
    Handles conversion between different API response formats and
    standardizes data structures across client types.
    
    Static Methods:
    - adapt_album_response(response): Adapt album API response
    - adapt_photo_response(response): Adapt photo API response
    - adapt_search_response(response): Adapt search API response
    - normalize_response_data(data): Normalize response data format
    - validate_api_response(response): Validate API response structure
    """
    
    @staticmethod
    def adapt_album_response(response: Dict[str, Any]) -> 'JmAlbumDetail':
        """
        Adapt album API response to standard format.
        
        Parameters:
        - response: dict - Raw API response data
        
        Returns:
        JmAlbumDetail - Standardized album entity
        """
    
    @staticmethod
    def adapt_photo_response(response: Dict[str, Any]) -> 'JmPhotoDetail':
        """
        Adapt photo API response to standard format.
        
        Parameters:
        - response: dict - Raw API response data
        
        Returns:
        JmPhotoDetail - Standardized photo entity
        """
    
    @staticmethod
    def normalize_response_data(data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Normalize response data format across different APIs.
        
        Parameters:
        - data: dict - Raw response data
        
        Returns:
        dict - Normalized data structure
        """
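
One common way to normalize data across clients is to map differing field names onto a single canonical set. The sketch below shows that idea under invented assumptions: the alias table, the canonical key names, and the decision to store IDs as strings are all hypothetical and not taken from the library.

```python
from typing import Any, Dict

# Hypothetical alias table: key used by some client -> canonical key.
_KEY_ALIASES = {"id": "album_id", "aid": "album_id", "name": "title"}


def sketch_normalize(data: Dict[str, Any]) -> Dict[str, Any]:
    """Rename aliased keys and coerce the album ID to a string."""
    normalized: Dict[str, Any] = {}
    for key, value in data.items():
        normalized[_KEY_ALIASES.get(key, key)] = value
    # Some sources may send numeric IDs, others strings; keep one type.
    if "album_id" in normalized:
        normalized["album_id"] = str(normalized["album_id"])
    return normalized
```

With this approach, downstream code such as an `adapt_album_response`-style function only has to handle one key layout regardless of which client produced the data.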

Image Processing Tools

Comprehensive image processing utilities including decryption, format conversion, and manipulation.

class JmImageTool:
    """
    Image processing, decryption, and format conversion utilities.
    
    Provides tools for handling scrambled images, format conversion,
    and image manipulation operations.
    
    Static Methods:
    - decrypt_image(image_data, scramble_id): Decrypt scrambled image
    - is_image_scrambled(image_data): Check if image is scrambled
    - convert_image_format(image_data, target_format): Convert image format
    - resize_image(image_data, width, height): Resize image
    - get_image_info(image_data): Get image metadata
    - merge_images_vertical(images): Merge images vertically
    - optimize_image(image_data): Optimize image for size
    """
    
    @staticmethod
    def decrypt_image(image_data: bytes, scramble_id: int) -> bytes:
        """
        Decrypt scrambled image data.
        
        JMComic images are sometimes scrambled for protection.
        This function reverses the scrambling process.
        
        Parameters:
        - image_data: bytes - Scrambled image data
        - scramble_id: int - Scramble algorithm identifier
        
        Returns:
        bytes - Decrypted image data
        """
    
    @staticmethod
    def is_image_scrambled(image_data: bytes) -> bool:
        """
        Check if image data is scrambled.
        
        Parameters:
        - image_data: bytes - Image data to check
        
        Returns:
        bool - True if image appears to be scrambled
        """
    
    @staticmethod
    def convert_image_format(image_data: bytes, target_format: str) -> bytes:
        """
        Convert image to different format.
        
        Parameters:
        - image_data: bytes - Original image data
        - target_format: str - Target format ('JPEG', 'PNG', 'WEBP')
        
        Returns:
        bytes - Converted image data
        """
    
    @staticmethod
    def get_image_info(image_data: bytes) -> Dict[str, Any]:
        """
        Get image metadata and properties.
        
        Parameters:
        - image_data: bytes - Image data
        
        Returns:
        dict - Image information (width, height, format, size)
        """
    
    @staticmethod
    def merge_images_vertical(images: List[bytes]) -> bytes:
        """
        Merge multiple images vertically into single image.
        
        Parameters:
        - images: List[bytes] - List of image data to merge
        
        Returns:
        bytes - Merged image data
        """
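
This document does not describe the scrambling algorithm that `decrypt_image` reverses. To illustrate what reversing a scramble can look like, the sketch below assumes a simple scheme in which an image is cut into horizontal strips that are stored in reverse order. That scheme, the function names, and the use of plain lists of pixel rows instead of encoded image bytes are all assumptions for the example.

```python
from typing import List, Sequence

Row = Sequence[int]  # one row of pixel values


def _strip_heights(total_rows: int, num_strips: int) -> List[int]:
    """Equal strip heights, with the last strip taking any remainder."""
    base, extra = divmod(total_rows, num_strips)
    return [base] * (num_strips - 1) + [base + extra]


def _reverse_strips(rows: List[Row], heights: List[int]) -> List[Row]:
    """Split rows into strips of the given heights and reverse their order."""
    strips, start = [], 0
    for height in heights:
        strips.append(rows[start:start + height])
        start += height
    return [row for strip in reversed(strips) for row in strip]


def sketch_scramble(rows: List[Row], num_strips: int) -> List[Row]:
    """Assumed scrambling step: reverse the order of horizontal strips."""
    return _reverse_strips(rows, _strip_heights(len(rows), num_strips))


def sketch_unscramble(rows: List[Row], num_strips: int) -> List[Row]:
    """Inverse step: the oversized strip now comes first, so reverse the heights."""
    return _reverse_strips(rows, _strip_heights(len(rows), num_strips)[::-1])
```

The point of the sketch is that descrambling needs the same parameters that drove the scrambling (here `num_strips`), which is consistent with `decrypt_image` taking a `scramble_id` argument.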

Cryptographic Tools

Encryption and decryption utilities for API communications and data protection.

class JmCryptoTool:
    """
    Encryption/decryption utilities for API communications.
    
    Handles the encryption protocols used by JMComic mobile API
    and provides security functions for data protection.
    
    Static Methods:
    - encrypt_api_request(data): Encrypt API request data
    - decrypt_api_response(encrypted_data): Decrypt API response
    - generate_request_signature(data): Generate request signature
    - validate_response_signature(response): Validate response signature
    - hash_password(password): Hash password for authentication
    """
    
    @staticmethod
    def encrypt_api_request(data: Dict[str, Any]) -> bytes:
        """
        Encrypt API request data using JMComic protocol.
        
        Parameters:
        - data: dict - Request data to encrypt
        
        Returns:
        bytes - Encrypted request data
        """
    
    @staticmethod
    def decrypt_api_response(encrypted_data: bytes) -> Dict[str, Any]:
        """
        Decrypt API response data using JMComic protocol.
        
        Parameters:
        - encrypted_data: bytes - Encrypted response data
        
        Returns:
        dict - Decrypted response data
        """
    
    @staticmethod
    def generate_request_signature(data: Dict[str, Any]) -> str:
        """
        Generate request signature for API authentication.
        
        Parameters:
        - data: dict - Request data
        
        Returns:
        str - Generated signature
        """
    
    @staticmethod
    def validate_response_signature(response: Dict[str, Any]) -> bool:
        """
        Validate response signature for data integrity.
        
        Parameters:
        - response: dict - API response with signature
        
        Returns:
        bool - True if signature is valid
        """
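
The signing scheme used by the API is not specified in this document. As a generic illustration of how request signing and response validation pair up, the sketch below uses HMAC-SHA256 over a canonical JSON encoding. The secret, the `signature` field name, and both function names are hypothetical; the actual protocol may differ entirely.

```python
import hashlib
import hmac
import json
from typing import Any, Dict

# Hypothetical shared secret, for demonstration only.
DEMO_SECRET = b"demo-secret"


def sketch_sign(data: Dict[str, Any], secret: bytes = DEMO_SECRET) -> str:
    """Sign a dict with HMAC-SHA256 over its canonical JSON form."""
    # sort_keys and fixed separators make the encoding independent of key order.
    payload = json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sketch_validate(response: Dict[str, Any], secret: bytes = DEMO_SECRET) -> bool:
    """Check the 'signature' field against a signature of the other fields."""
    claimed = response.get("signature")
    if not isinstance(claimed, str):
        return False
    body = {k: v for k, v in response.items() if k != "signature"}
    expected = sketch_sign(body, secret)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(claimed.encode("utf-8"), expected.encode("utf-8"))
```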

Usage Examples

# Text processing
jm_id = JmcomicText.parse_to_jm_id("https://jmcomic.example/album/123456")
clean_name = JmcomicText.clean_filename("Album: Title/with\\invalid*chars")

# HTML parsing
album_info = PatternTool.match_album_info(html_content)
image_urls = PatternTool.match_image_urls(photo_html)

# Page processing
album = JmPageTool.parse_album_page(album_html)
search_results = JmPageTool.parse_search_page(search_html)

# Image processing
decrypted_image = JmImageTool.decrypt_image(scrambled_data, scramble_id)
image_info = JmImageTool.get_image_info(image_data)
converted_image = JmImageTool.convert_image_format(image_data, 'JPEG')

# API processing
album = JmApiAdaptTool.adapt_album_response(api_response)
normalized_data = JmApiAdaptTool.normalize_response_data(raw_data)

# Cryptographic operations
encrypted_request = JmCryptoTool.encrypt_api_request(request_data)
decrypted_response = JmCryptoTool.decrypt_api_response(encrypted_response)

Integration with Core Systems

The core download and client systems use these tools as follows:

  • Text tools are used throughout for ID parsing and URL handling
  • Pattern tools power the HTML client's data extraction
  • Page tools convert HTML pages to structured entities
  • API tools standardize responses across different client types
  • Image tools handle content processing in downloaders and plugins
  • Crypto tools secure API communications in the mobile client

Install with Tessl CLI

npx tessl i tessl/pypi-jmcomic
