Fuzzy string matching library using Levenshtein Distance algorithms for approximate text comparison
Core fuzzy string matching functions that calculate similarity ratios between strings using various algorithms. All functions return integer scores from 0 (no match) to 100 (perfect match).
Simple string similarity using Levenshtein distance, providing a straightforward comparison between two strings without any preprocessing.
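To make the 0 to 100 scale concrete, here is a minimal sketch of one common way to turn an edit distance into a similarity score. It assumes an insertion/deletion-only distance (a substitution counts as 2), which works out to 2 * LCS / (len(s1) + len(s2)). The helpers `lcs_length` and `ratio_sketch` are hypothetical names for illustration; this is not thefuzz's implementation, which may differ in rounding or empty-string handling.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ratio_sketch(s1: str, s2: str) -> int:
    """Hypothetical helper: normalized similarity as an integer from 0 to 100."""
    total = len(s1) + len(s2)
    if total == 0:
        return 100  # assumption: two empty strings are treated as identical
    return round(100 * 2 * lcs_length(s1, s2) / total)


print(ratio_sketch("kitten", "sitting"))  # 62
print(ratio_sketch("abc", "abc"))         # 100
```

Identical strings score 100 and strings with no characters in common score 0, which matches the range described above.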
def ratio(s1: str, s2: str) -> int:
    """
    Calculate similarity ratio between two strings using Levenshtein distance.

    Args:
        s1: First string to compare
        s2: Second string to compare

    Returns:
        int: Similarity score from 0-100
    """

Finds the ratio of the most similar substring between two strings, useful when one string is contained within another or for partial matches.
def partial_ratio(s1: str, s2: str) -> int:
    """
    Calculate similarity ratio of the most similar substring.

    Args:
        s1: First string to compare
        s2: Second string to compare

    Returns:
        int: Similarity score from 0-100 based on best substring match
    """

Advanced scoring functions that split strings into tokens (words) and apply different matching strategies to handle word order differences and common variations.
def token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate similarity after sorting tokens alphabetically.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """
def partial_token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate partial similarity after sorting tokens alphabetically.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100 based on best partial match
    """
def token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate similarity using token set comparison.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """
def partial_token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate partial similarity using token set comparison.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100 based on best partial match
    """

Higher-level scoring functions: quick ratios (QRatio, UQRatio) and weighted ratios (WRatio, UWRatio) that combine several of the algorithms above and weight the results by the ratio of the string lengths.
def QRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Quick ratio comparison optimized for speed.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """
def UQRatio(s1: str, s2: str, full_process: bool = True) -> int:
    """
    Unicode-aware quick ratio comparison.

    Args:
        s1: First string to compare
        s2: Second string to compare
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """
def WRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Weighted ratio using multiple algorithms for best accuracy.

    Combines ratio, partial_ratio, token_sort_ratio, and token_set_ratio
    with intelligent weighting based on string length ratios.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Convert to ASCII before processing
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """
def UWRatio(s1: str, s2: str, full_process: bool = True) -> int:
    """
    Unicode-aware weighted ratio using multiple algorithms.

    Args:
        s1: First string to compare
        s2: Second string to compare
        full_process: Apply full string preprocessing

    Returns:
        int: Similarity score from 0-100
    """

from thefuzz import fuzz
# Simple string comparison
score = fuzz.ratio("hello world", "hello world!")
print(score)  # 96
# Partial matching - useful for substring matching
score = fuzz.partial_ratio("this is a test", "is a")
print(score)  # 100

from thefuzz import fuzz
# Handle word order differences
score = fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(score)  # 100
# Token set matching - handles duplicates and order
score = fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
print(score)  # 100

from thefuzz import fuzz
# WRatio provides the most accurate results by combining algorithms
score = fuzz.WRatio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Mets")
print(score)  # High score despite different word order
# Unicode support
score = fuzz.UWRatio("Café", "cafe")  # Handles accented characters

Install with Tessl CLI
npx tessl i tessl/pypi-thefuzz