tessl/pypi-fuzzywuzzy

Fuzzy string matching library using Levenshtein Distance calculations for approximate string comparison and search

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/fuzzywuzzy@0.18.x (PyPI)

To install, run

npx @tessl/cli install tessl/pypi-fuzzywuzzy@0.18.0

FuzzyWuzzy

A Python library for fuzzy string matching using Levenshtein Distance calculations. FuzzyWuzzy enables developers to find similarities between strings through multiple algorithms including ratio matching, partial ratio matching, token-based comparisons, and weighted combinations of scoring methods.

Package Information

  • Package Name: fuzzywuzzy
  • Language: Python
  • Installation: pip install fuzzywuzzy
  • Optional Speedup: pip install fuzzywuzzy[speedup] (installs python-Levenshtein, which speeds up string matching roughly 4-10x but may give slightly different scores in some cases)

Core Imports

from fuzzywuzzy import fuzz      # Core fuzzy string comparison algorithms
from fuzzywuzzy import process   # Functions for processing collections of strings
from fuzzywuzzy import utils     # String processing and validation utilities

For advanced string processing:

from fuzzywuzzy.string_processing import StringProcessor

For optional high-performance matching (when python-Levenshtein is installed):

from fuzzywuzzy.StringMatcher import StringMatcher

Basic Usage

from fuzzywuzzy import fuzz, process

# Basic string similarity
similarity = fuzz.ratio("this is a test", "this is a test!")
print(similarity)  # 97

# Partial string matching
partial_sim = fuzz.partial_ratio("this is a test", "this is a test!")
print(partial_sim)  # 100

# Token-based comparison (handles word order)
token_sim = fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(token_sim)  # 100

# Find best match in a list
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
best = process.extractOne("new york jets", choices)
print(best)  # ('New York Jets', 100)

# Extract multiple matches
matches = process.extract("new york", choices, limit=2)
print(matches)  # [('New York Jets', 90), ('New York Giants', 90)]

Architecture

FuzzyWuzzy uses a layered approach to fuzzy string matching:

  • Core Algorithms (fuzz module): Basic ratio, partial ratio, and token-based algorithms
  • Processing Layer (process module): Functions for working with collections and finding best matches
  • Utilities Layer (utils module): String preprocessing, validation, and helper functions
  • Optional Performance Layer: Uses python-Levenshtein when available for significant speedup

The library automatically falls back to Python's difflib.SequenceMatcher when the optional Levenshtein library is not installed.
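The fallback described above can be sketched as an optional import. This is an illustration only, not the library's source: `similarity_ratio` and `BACKEND` are hypothetical names, and the sketch assumes the `Levenshtein` module installed by python-Levenshtein exposes a `ratio` function returning a value between 0 and 1.

```python
from difflib import SequenceMatcher

try:
    # Optional C-accelerated backend (assumed to be provided by python-Levenshtein).
    from Levenshtein import ratio as _fast_ratio
    BACKEND = "python-Levenshtein"
except ImportError:
    # Pure-Python fallback from the standard library.
    _fast_ratio = None
    BACKEND = "difflib"


def similarity_ratio(s1: str, s2: str) -> int:
    """Hypothetical helper: 0-100 similarity using whichever backend loaded."""
    if _fast_ratio is not None:
        score = _fast_ratio(s1, s2)
    else:
        score = SequenceMatcher(None, s1, s2).ratio()
    return int(round(100 * score))


print(BACKEND, similarity_ratio("this is a test", "this is a test!"))
```

Because the two backends implement matching differently, scores may not agree for every input, so it is worth pinning one backend when results need to be reproducible.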

Capabilities

Fuzzy String Algorithms

Core fuzzy string comparison functions including basic ratio matching, partial ratio matching, and advanced token-based algorithms that handle word order variations and set comparisons.

def ratio(s1: str, s2: str) -> int: ...
def partial_ratio(s1: str, s2: str) -> int: ...
def token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def WRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...

Full reference: fuzzy-algorithms.md
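To make the ideas behind partial and token-based matching concrete, here is a minimal standard-library sketch. It is an approximation under stated assumptions, not FuzzyWuzzy's implementation: all function names are hypothetical, scoring uses `difflib.SequenceMatcher`, partial matching uses a simple sliding window, and tokenization is a lowercase whitespace split with no punctuation handling.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """0-100 similarity of two strings via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def sketch_partial_ratio(a: str, b: str) -> int:
    """Best score of the shorter string against each same-length window of the longer."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0
    last_start = len(longer) - len(shorter)
    windows = (longer[i:i + len(shorter)] for i in range(last_start + 1))
    return max(simple_ratio(shorter, w) for w in windows)


def sketch_token_sort_ratio(a: str, b: str) -> int:
    """Sort the tokens of each string before comparing, so word order is ignored."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return simple_ratio(sorted_a, sorted_b)


def sketch_token_set_ratio(a: str, b: str) -> int:
    """Compare shared tokens against each string's shared-plus-unique tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    combined_a = (common + " " + " ".join(sorted(tokens_a - tokens_b))).strip()
    combined_b = (common + " " + " ".join(sorted(tokens_b - tokens_a))).strip()
    return max(
        simple_ratio(common, combined_a),
        simple_ratio(common, combined_b),
        simple_ratio(combined_a, combined_b),
    )


print(sketch_partial_ratio("this is a test", "this is a test!"))
print(sketch_token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
print(sketch_token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```

The set-based variant ignores repeated words, which is why a duplicated token that lowers the plain ratio does not lower the set ratio in this sketch.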

String Collection Processing

Functions for extracting best matches from collections of strings, including single best match extraction, multiple match extraction with limits, and fuzzy deduplication capabilities.

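# In fuzzywuzzy 0.18, default_processor is utils.full_process and default_scorer is fuzz.WRatio
def extractOne(query: str, choices, processor=default_processor, scorer=default_scorer, score_cutoff: int = 0): ...
def extract(query: str, choices, processor=default_processor, scorer=default_scorer, limit: int = 5): ...
def extractBests(query: str, choices, processor=default_processor, scorer=default_scorer, score_cutoff: int = 0, limit: int = 5): ...
def dedupe(contains_dupes: list, threshold: int = 70, scorer=fuzz.token_set_ratio): ...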

Full reference: string-processing.md
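The processor/scorer pattern behind these functions can be sketched with the standard library alone. This is a simplified illustration, not the library's code: `sketch_extract`, `sketch_extract_one`, and `simple_scorer` are hypothetical names, the scorer is a plain `difflib` ratio rather than a weighted one, and the processor is just `str.lower`, so the scores will generally differ from FuzzyWuzzy's.

```python
import heapq
from difflib import SequenceMatcher


def simple_scorer(a: str, b: str) -> int:
    """0-100 similarity of two strings via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def sketch_extract(query, choices, processor=str.lower, scorer=simple_scorer,
                   limit=5, score_cutoff=0):
    """Return up to `limit` (choice, score) pairs, best first."""
    processed_query = processor(query)
    scored = ((choice, scorer(processed_query, processor(choice))) for choice in choices)
    kept = [pair for pair in scored if pair[1] >= score_cutoff]
    return heapq.nlargest(limit, kept, key=lambda pair: pair[1])


def sketch_extract_one(query, choices, processor=str.lower, scorer=simple_scorer,
                       score_cutoff=0):
    """Return the single best (choice, score) pair, or None if nothing passes the cutoff."""
    best = sketch_extract(query, choices, processor, scorer, limit=1,
                          score_cutoff=score_cutoff)
    return best[0] if best else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(sketch_extract_one("new york jets", choices))
print(sketch_extract("new york", choices, limit=2))
```

Passing the processor and scorer as callables keeps the ranking logic independent of how strings are cleaned and compared, which is the same separation the `processor` and `scorer` parameters expose.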

Utilities and Helpers

String preprocessing, validation functions, and utility decorators for handling edge cases in fuzzy string matching operations.

def full_process(s: str, force_ascii: bool = False) -> str: ...
def validate_string(s) -> bool: ...
def make_type_consistent(s1, s2): ...

Full reference: utilities.md
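As a rough illustration of what this kind of preprocessing and validation involves, the sketch below normalizes a string and checks that something usable remains. It is an approximation, not the library's code: the `sketch_` names are hypothetical, and the ASCII handling here (dropping every non-ASCII character) is a simplifying assumption.

```python
import re


def sketch_full_process(s: str, force_ascii: bool = False) -> str:
    """Replace non-word characters with spaces, lowercase, and trim."""
    if force_ascii:
        # Simplification: drop any character that is not plain ASCII.
        s = s.encode("ascii", "ignore").decode("ascii")
    s = re.sub(r"\W", " ", s)
    return s.lower().strip()


def sketch_validate_string(s) -> bool:
    """True only for non-empty values that have a length."""
    try:
        return len(s) > 0
    except TypeError:
        return False


cleaned = sketch_full_process("  Hello, World!  ")
print(repr(cleaned), sketch_validate_string(cleaned))

# A string made only of punctuation is empty after processing,
# which is the edge case validation is meant to catch.
print(sketch_validate_string(sketch_full_process("!!!")))
```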