tessl install tessl/pypi-python-levenshtein@0.27.0
Python compatibility wrapper for computing string edit distances and similarities using fast Levenshtein algorithms.
Agent Success (agent success rate when using this tile): 88%
Improvement (agent success rate improvement when using this tile compared to baseline): 1.38x
Baseline (agent success rate without this tile): 64%
Build a simple name matching system that identifies potential duplicate or similar names in a database.
You're building a system to help identify potential duplicate person records in a database. The system should use string similarity metrics to find names that are likely referring to the same person, even when there are typos, spelling variations, or slight differences.
Implement a name matching module that provides the following functionality:
Calculate Similarity: Compute a similarity score between two names. The score should range from 0.0 (completely different) to 1.0 (identical).
Find Matches Above Threshold: Given a target name and a list of candidate names, return all names whose similarity score is at or above a specified threshold, in their original order.
Find Best Match: Given a target name and a list of candidate names, return the single best matching name (the one with the highest similarity score).
Your implementation should handle:
The similarity algorithm should be particularly suited for proper nouns and names, giving appropriate weight to character matching and transpositions.
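The requirement above (weight on character matching and transpositions, suited to names) reads like a description of Jaro-Winkler similarity, although the task text does not name a metric. As a rough illustration only, here is a pure-Python sketch of that metric. The function names `jaro` and `jaro_winkler`, the `prefix_weight` default of 0.1, and the 0.7 boost threshold follow the commonly cited formulation and are assumptions, not something this task specifies.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: rewards shared characters, penalises transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two equal characters only count as a match if their positions
    # differ by at most `window`.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order are transpositions;
    # each swapped pair is counted once.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro similarity with a bonus for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    if score <= 0.7:  # assumed boost threshold; weak matches get no bonus
        return score
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_weight * (1.0 - score)
```

The prefix bonus is what makes this family of metrics a reasonable fit for names: typos tend to occur later in a name, so two strings that agree on their first few characters are pushed closer to 1.0.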
@generates
def calculate_similarity(name1: str, name2: str) -> float:
    """
    Calculate similarity score between two names.

    Args:
        name1: First name to compare (case-insensitive)
        name2: Second name to compare (case-insensitive)

    Returns:
        Float between 0.0 and 1.0 representing similarity
    """
    pass

def find_matches(target: str, candidates: list[str], threshold: float) -> list[str]:
    """
    Find all names in candidates that match target above the threshold.

    Args:
        target: Name to match against (case-insensitive)
        candidates: List of candidate names to search
        threshold: Minimum similarity score (0.0 to 1.0)

    Returns:
        List of names with similarity >= threshold, in original order
    """
    pass

def find_best_match(target: str, candidates: list[str]) -> str:
    """
    Find the single best matching name from candidates.

    Args:
        target: Name to match against (case-insensitive)
        candidates: List of candidate names to search

    Returns:
        The candidate name with the highest similarity score
    """
    pass

Provides string similarity and distance computation functions.
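
One possible way to fill in the stubs above is sketched below. It assumes the intended metric is Jaro-Winkler, which python-levenshtein exposes as `Levenshtein.jaro_winkler`; the task itself does not name the function to use. The `difflib` fallback is only there so the sketch runs without the package installed, and raising `ValueError` on an empty candidate list is an assumed behaviour that the spec leaves open.

```python
try:
    # Real python-levenshtein function; choosing it here is an assumption.
    from Levenshtein import jaro_winkler as _score
except ImportError:
    # Fallback so the sketch still runs; difflib uses a different metric,
    # so its scores will not equal Jaro-Winkler scores.
    from difflib import SequenceMatcher

    def _score(a: str, b: str) -> float:
        return SequenceMatcher(None, a, b).ratio()


def calculate_similarity(name1: str, name2: str) -> float:
    """Case-insensitive similarity in the range 0.0 to 1.0."""
    return float(_score(name1.lower(), name2.lower()))


def find_matches(target: str, candidates: list[str], threshold: float) -> list[str]:
    """Candidates scoring at or above the threshold, in original order."""
    return [c for c in candidates if calculate_similarity(target, c) >= threshold]


def find_best_match(target: str, candidates: list[str]) -> str:
    """Highest-scoring candidate; the first one wins on ties."""
    if not candidates:
        # Assumed behaviour: the spec does not say what to return here.
        raise ValueError("candidates must not be empty")
    return max(candidates, key=lambda c: calculate_similarity(target, c))
```

For example, `find_matches("Jon Smith", ["John Smith", "Zzz", "Jon Smyth"], 0.8)` keeps the two Smith variants and drops `"Zzz"`, and the returned names keep their original casing because lowercasing is applied only when scoring.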