or run

tessl search
Log in

Version

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
pypipkg:pypi/python-levenshtein@0.27.x
tile.json

tessl/pypi-python-levenshtein

tessl install tessl/pypi-python-levenshtein@0.27.0

Python compatibility wrapper for computing string edit distances and similarities using fast Levenshtein algorithms.

Agent Success

Agent success rate when using this tile

88%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.38x

Baseline

Agent success rate without this tile

64%

task.mdevals/scenario-8/

Duplicate Detection System

Build a duplicate detection system that identifies similar product entries in a catalog by calculating their similarity scores.

Requirements

Implement a system that can:

Duplicate Detection

  • calculate_similarity("Apple iPhone 13", "Apple iPhone 13") returns 1.0 @test
  • calculate_similarity("Samsung Galaxy S21", "Samsung Galaxy S22") returns a value between 0.8 and 1.0 @test
  • is_duplicate("MacBook Pro 16", "MacBook Pro 16-inch") returns True with default threshold @test
  • is_duplicate("iPhone", "Samsung") returns False @test
  • find_duplicates("Dell XPS 13", ["Dell XPS 13 Laptop", "HP Pavilion", "Dell XPS 15"]) returns ["Dell XPS 13 Laptop"] @test
  • find_duplicates("Sony TV", ["Samsung TV", "LG Monitor"]) returns empty list when no matches above threshold @test

Implementation

@generates

API

def calculate_similarity(product1: str, product2: str) -> float:
    """
    Calculate the similarity score between two product names.

    Args:
        product1: First product name
        product2: Second product name

    Returns:
        A float between 0.0 (completely different) and 1.0 (identical)
    """
    pass

def is_duplicate(product1: str, product2: str, threshold: float = 0.85) -> bool:
    """
    Determine if two products are duplicates based on similarity threshold.

    Args:
        product1: First product name
        product2: Second product name
        threshold: Minimum similarity score to consider as duplicate (default: 0.85)

    Returns:
        True if products are duplicates, False otherwise
    """
    pass

def find_duplicates(target: str, catalog: list[str], threshold: float = 0.85) -> list[str]:
    """
    Find all products in catalog that are potential duplicates of target.

    Args:
        target: The product name to check for duplicates
        catalog: List of product names to search through
        threshold: Minimum similarity score to consider as duplicate (default: 0.85)

    Returns:
        List of product names from catalog that are duplicates of target
    """
    pass

Dependencies { .dependencies }

python-Levenshtein { .dependency }

Provides fast string similarity computation.

@satisfied-by