
tessl/pypi-w3lib

tessl install tessl/pypi-w3lib@2.3.0

Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection

Agent Success

Agent success rate when using this tile

84%

Improvement

Agent success rate improvement when using this tile compared to baseline

0.91x

Baseline

Agent success rate without this tile

92%

HTML Entity Converter

Build a text processing utility that converts HTML entity references into their corresponding Unicode characters. The utility should handle various types of HTML entities that appear in web content and provide options for selective entity preservation.

Requirements

Your implementation should process HTML text containing entities and convert them to readable Unicode characters:

Entity Types: Support for named entities (like `&amp;`), decimal numeric entities (like `&#65;`), and hexadecimal numeric entities (like `&#x41;`).
Selective Preservation: Provide the ability to keep specific entities unconverted. For example, preserve `&lt;`, `&gt;`, and `&amp;` while converting others.
Entity Detection: Implement a check to determine if a given text contains any HTML entities before processing.
Illegal Character Handling: Handle removal of illegal XML/HTML character references according to standard specifications.
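
To make the three entity forms concrete, the short sketch below decodes one sample of each. It uses Python's standard-library `html.unescape` as a stand-in rather than w3lib, and the `samples` and `decoded` names are illustrative only; it shows what each form is expected to decode to, not how the utility must be implemented.

```python
import html

# One sample per entity form named in the requirements.
samples = {
    "named": "&amp;",         # named entity for "&"
    "decimal": "&#65;",       # decimal code point 65 -> "A"
    "hexadecimal": "&#x41;",  # hexadecimal code point 0x41 -> "A"
}

# Decode each sample with the standard library (stand-in for the real utility).
decoded = {kind: html.unescape(entity) for kind, entity in samples.items()}

for kind, char in decoded.items():
    print(f"{kind:12} {samples[kind]:8} -> {char}")
```

The decimal and hexadecimal samples refer to the same code point, so both decode to the same character.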

Test Cases

Converting text `"Hello &amp; goodbye"` produces `"Hello & goodbye"` @test
Converting text `"Price: &euro;100"` produces `"Price: €100"` @test
Converting text `"&lt;div&gt;"` with preserved entities ['lt', 'gt'] keeps `"&lt;div&gt;"` unchanged @test
Checking if "No entities here" contains entities returns False @test
Checking if `"Has &nbsp; entity"` contains entities returns True @test

Implementation

@generates

API

def convert_entities(text: str, keep: list[str] | None = None, remove_illegal: bool = True) -> str:
    """
    Convert HTML entities in text to Unicode characters.

    Args:
        text: The text containing HTML entities
        keep: Optional list of entity names to preserve (e.g., ['amp', 'lt', 'gt'])
        remove_illegal: Whether to remove illegal character references

    Returns:
        Text with entities converted to Unicode characters
    """
    pass

def has_entities(text: str) -> bool:
    """
    Check if text contains any HTML entities.

    Args:
        text: The text to check

    Returns:
        True if the text contains HTML entities, False otherwise
    """
    pass

Dependencies { .dependencies }

w3lib { .dependency }

Provides web utility functions for HTML processing.

@satisfied-by

tessl/pypi-w3lib
