tessl/pypi-python-levenshtein

Python compatibility wrapper for computing string edit distances and similarities using fast Levenshtein algorithms.

evals/scenario-5/task.md

String Consensus Finder

Build a tool that finds consensus strings from collections of text variants, useful for normalizing user-generated content with spelling variations.

Problem Description

You need to implement a function that analyzes a collection of text strings and identifies a representative "consensus" string that best represents the group. This is particularly useful when you have multiple variations of the same text (due to typos, different spellings, or minor differences) and need to find the most representative version.

The consensus string is the one that minimizes the total edit distance (the sum of its edit distances) to all strings in the collection. The collection is treated as a set: the order of the strings must not affect the result.
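
The objective can be written down compactly. The formula below is a sketch under one assumption that the test cases suggest but the description does not state outright: the consensus is picked from the input strings themselves (a "set median"). Here S is the input collection and d is the Levenshtein edit distance; both symbols are introduced only for this illustration.

```latex
\hat{s} \;=\; \operatorname*{arg\,min}_{s \in S} \; \sum_{t \in S} d(s, t)
```

For S = {"test", "text", "best"}, the totals are 2 for "test" (1 + 1), 3 for "text" (1 + 2), and 3 for "best" (1 + 2), so "test" is the consensus.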

Requirements

Implement a function find_consensus(strings) that:

  • Accepts a list of strings as input
  • Returns a single string that represents the consensus of the input collection
  • Chooses the returned string so that it minimizes the total edit distance to all input strings
  • Uses a set-based approach, so the order of strings in the input doesn't affect the result
  • Handles edge cases appropriately (an empty list, a single string, etc.)
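
The requirements above can be sketched in pure Python, without the library dependency, to make the intended behavior concrete. This is an illustrative reference, not the generated implementation. It assumes that the consensus is restricted to the input strings and that ties are broken by taking the lexicographically smallest candidate. The helper name `edit_distance` and that tie-break rule are assumptions, not part of the spec.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        previous = current
    return previous[-1]


def find_consensus(strings):
    """Return the input string with the smallest total distance to all inputs."""
    if not strings:
        return ""  # edge case from the spec: empty input gives ""

    def total_distance(candidate):
        return sum(edit_distance(candidate, other) for other in strings)

    # Iterating over the sorted distinct candidates makes ties resolve the
    # same way for every input order (assumed rule: smallest string wins).
    return min(sorted(set(strings)), key=total_distance)
```

Duplicates still count toward the total (the sum runs over the full input list), so a spelling that occurs more often pulls the consensus toward itself. Sorting only affects how ties are resolved.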

Examples

# Example 1: Finding consensus among product names
variants = ["iPhone", "iFone", "IPhone", "iphone"]
consensus = find_consensus(variants)
# Should return a string that minimizes distance to all variants

# Example 2: Normalizing user input
user_inputs = ["hello world", "helo world", "hello wrld", "hello world"]
consensus = find_consensus(user_inputs)
# Should return the most representative string

Test Cases

  • Given an empty list, the function returns an empty string @test
  • Given a single string "hello", the function returns "hello" @test
  • Given ["cat", "bat", "rat"], the function returns one of the input strings that minimizes total edit distance @test
  • Given ["test", "text", "best"], the function returns the string from the collection with the smallest total edit distance to the others ("test") @test

Implementation

@generates

API

def find_consensus(strings):
    """
    Finds a consensus string from a collection of string variants.

    Args:
        strings: A list of strings to analyze

    Returns:
        A string that represents the consensus of the input collection,
        minimizing total edit distance to all input strings
    """
    pass

Dependencies { .dependencies }

Levenshtein { .dependency }

Provides string edit distance and similarity computation capabilities, including specialized functions for finding representative strings from collections.
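
One plausible reading of "specialized functions for finding representative strings" is the package's `setmedian` function, which returns the member of a list with the smallest total edit distance to the rest. The sketch below uses it under that assumption. The pure-Python fallback is there only so the snippet still runs where the package is not installed, and sorting the input before the call is an assumed way to keep the result independent of input order.

```python
try:
    from Levenshtein import setmedian  # third-party Levenshtein package
except ImportError:
    setmedian = None  # fall back to the pure-Python path below


def _distance(a, b):
    """Pure-Python Levenshtein distance, used only by the fallback path."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        new_row = [i]
        for j, cb in enumerate(b, start=1):
            new_row.append(min(row[j] + 1, new_row[j - 1] + 1,
                               row[j - 1] + (ca != cb)))
        row = new_row
    return row[-1]


def find_consensus(strings):
    """Consensus as the set median of the inputs (assumed interpretation)."""
    if not strings:
        return ""  # handled here rather than relying on library behavior
    ordered = sorted(strings)  # canonical order, so input order cannot matter
    if len(ordered) == 1:
        return ordered[0]
    if setmedian is not None:
        return setmedian(ordered)
    # Fallback: same objective, computed directly.
    return min(ordered, key=lambda s: sum(_distance(s, t) for t in ordered))
```

When several strings tie for the minimum, which one is returned may differ between the library and the fallback, so callers should only rely on the result being one of the tied inputs.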

@satisfied-by

Install with Tessl CLI

npx tessl i tessl/pypi-python-levenshtein
