Fuzzy string matching library using Levenshtein Distance calculations for approximate string comparison and search
npx @tessl/cli install tessl/pypi-fuzzywuzzy@0.18.0

A Python library for fuzzy string matching using Levenshtein Distance calculations. FuzzyWuzzy enables developers to find similarities between strings through multiple algorithms including ratio matching, partial ratio matching, token-based comparisons, and weighted combinations of scoring methods.

pip install fuzzywuzzy

With the optional speedup extra (includes python-Levenshtein for a 4-10x performance improvement):
pip install fuzzywuzzy[speedup]

from fuzzywuzzy import fuzz # Core fuzzy string comparison algorithms
from fuzzywuzzy import process # Functions for processing collections of strings
from fuzzywuzzy import utils # String processing and validation utilities

For advanced string processing:
from fuzzywuzzy.string_processing import StringProcessor

For optional high-performance matching (when python-Levenshtein is installed):
from fuzzywuzzy.StringMatcher import StringMatcher

from fuzzywuzzy import fuzz, process
# Basic string similarity
similarity = fuzz.ratio("this is a test", "this is a test!")
print(similarity) # 97
# Partial string matching
partial_sim = fuzz.partial_ratio("this is a test", "this is a test!")
print(partial_sim) # 100
# Token-based comparison (handles word order)
token_sim = fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(token_sim) # 100
# Find best match in a list
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
best = process.extractOne("new york jets", choices)
print(best) # ("New York Jets", 100)
# Extract multiple matches
matches = process.extract("new york", choices, limit=2)
print(matches) # [("New York Jets", 90), ("New York Giants", 90)]

FuzzyWuzzy uses a layered approach to fuzzy string matching:
fuzz module: Basic ratio, partial ratio, and token-based algorithms
process module: Functions for working with collections and finding best matches
utils module: String preprocessing, validation, and helper functions

The library automatically falls back to Python's difflib.SequenceMatcher when the optional Levenshtein library is not installed.
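
To make the difflib fallback concrete, here is a minimal sketch of how a 0-100 similarity score can be derived from the standard library's SequenceMatcher alone. The helper name `difflib_ratio` is hypothetical and chosen for illustration; it approximates what a basic ratio looks like on the fallback path and is not presented as the library's own code.

```python
from difflib import SequenceMatcher


def difflib_ratio(s1: str, s2: str) -> int:
    """Hypothetical helper: scale SequenceMatcher's 0.0-1.0 ratio to 0-100."""
    matcher = SequenceMatcher(None, s1, s2)
    # ratio() is 2 * matched_chars / total_chars across both strings.
    return int(round(100 * matcher.ratio()))


score = difflib_ratio("this is a test", "this is a test!")
print(score)  # 97 (14 matched characters out of 29 in total)
```

The result agrees with the basic similarity example earlier in this document, which is what one would hope for when the optional accelerator is absent.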
Core fuzzy string comparison functions including basic ratio matching, partial ratio matching, and advanced token-based algorithms that handle word order variations and set comparisons.
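
As a rough illustration of how token-based comparison copes with word order and repeated words, the sketch below rebuilds the two ideas with only the standard library. All helper names (`_score`, `_tokens`, `token_sort_sketch`, `token_set_sketch`) are hypothetical, and the tokenization is deliberately simplified to lowercase ASCII letters and digits; treat it as an approximation of the technique rather than the library's implementation.

```python
import re
from difflib import SequenceMatcher


def _score(a: str, b: str) -> int:
    """Hypothetical helper: 0-100 similarity via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def _tokens(s: str) -> list:
    """Simplified tokenizer (an assumption): lowercase ASCII alphanumerics."""
    return re.findall(r"[a-z0-9]+", s.lower())


def token_sort_sketch(s1: str, s2: str) -> int:
    """Sort the tokens of each string, then compare the rejoined strings."""
    return _score(" ".join(sorted(_tokens(s1))), " ".join(sorted(_tokens(s2))))


def token_set_sketch(s1: str, s2: str) -> int:
    """Compare the shared tokens against each side's shared-plus-unique tokens."""
    t1, t2 = set(_tokens(s1)), set(_tokens(s2))
    common = " ".join(sorted(t1 & t2))
    combined1 = (common + " " + " ".join(sorted(t1 - t2))).strip()
    combined2 = (common + " " + " ".join(sorted(t2 - t1))).strip()
    return max(
        _score(common, combined1),
        _score(common, combined2),
        _score(combined1, combined2),
    )


plain = _score("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
sorted_score = token_sort_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
set_score = token_set_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear")
print(plain)         # 91: character order differs
print(sorted_score)  # 100: identical once tokens are sorted
print(set_score)     # 100: the duplicate token is ignored
```

Sorting removes sensitivity to word order, while the set variant additionally discounts duplicated or extra tokens, which is why it tends to score higher on strings of unequal length.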
def ratio(s1: str, s2: str) -> int: ...
def partial_ratio(s1: str, s2: str) -> int: ...
def token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def WRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...

Functions for extracting best matches from collections of strings, including single best match extraction, multiple match extraction with limits, and fuzzy deduplication capabilities.
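
The general pattern behind best-match extraction can be sketched as: score every choice against the query, drop anything below a cutoff, and keep the top results. The version below is a standard-library approximation; `simple_score` and `extract_sketch` are hypothetical names, and lowercasing stands in for a real preprocessing step. It is not the library's own code, and its scores will not necessarily match the library's default scorer.

```python
import heapq
from difflib import SequenceMatcher


def simple_score(a: str, b: str) -> int:
    """Hypothetical scorer: 0-100 similarity via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def extract_sketch(query, choices, scorer=simple_score, limit=5, score_cutoff=0):
    """Return up to `limit` (choice, score) pairs, best first."""
    # Lowercasing is an assumed, minimal preprocessing step.
    scored = ((choice, scorer(query.lower(), choice.lower())) for choice in choices)
    kept = [pair for pair in scored if pair[1] >= score_cutoff]
    return heapq.nlargest(limit, kept, key=lambda pair: pair[1])


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
best = extract_sketch("new york jets", choices, limit=1)
print(best)  # [('New York Jets', 100)]
```

Passing the scorer as a parameter mirrors the `scorer` argument in the signatures listed here, so a different comparison function can be swapped in without touching the selection logic.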
def extractOne(query: str, choices, processor=None, scorer=None, score_cutoff: int = 0): ...
def extract(query: str, choices, processor=None, scorer=None, limit: int = 5): ...
def extractBests(query: str, choices, processor=None, scorer=None, score_cutoff: int = 0, limit: int = 5): ...
def dedupe(contains_dupes: list, threshold: int = 70, scorer=None): ...

String preprocessing, validation functions, and utility decorators for handling edge cases in fuzzy string matching operations.
def full_process(s: str, force_ascii: bool = False) -> str: ...
def validate_string(s) -> bool: ...
def make_type_consistent(s1, s2): ...
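
For a sense of what this kind of preprocessing and validation involves, here is a small standard-library sketch. The names `full_process_sketch` and `validate_string_sketch` are hypothetical, and the exact normalization rules (replace non-word characters with spaces, lowercase, trim, optionally drop non-ASCII characters) are assumptions made for illustration rather than a statement of the library's behavior.

```python
import re


def full_process_sketch(s: str, force_ascii: bool = False) -> str:
    """Assumed normalization: strip punctuation, lowercase, trim whitespace."""
    if force_ascii:
        # Drop any character outside the ASCII range.
        s = s.encode("ascii", "ignore").decode("ascii")
    # Replace each non-word character with a single space.
    s = re.sub(r"\W", " ", s)
    return s.lower().strip()


def validate_string_sketch(s) -> bool:
    """True only for values that have a length greater than zero."""
    try:
        return len(s) > 0
    except TypeError:
        return False


cleaned = full_process_sketch("  Hello, World!  ")
print(repr(cleaned))                 # 'hello  world' (comma and space both become spaces)
print(validate_string_sketch(""))    # False
print(validate_string_sketch(None))  # False
```

Normalizing both inputs the same way before scoring keeps punctuation and casing differences from lowering a score, and the validation check lets callers skip empty or non-string values early.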