Tessl Tile for pypi/thefuzz@0.22.0

# String Similarity Scoring

Core fuzzy string matching functions that calculate similarity ratios between strings using various algorithms. All functions return integer scores from 0 (no match) to 100 (perfect match).

## Capabilities

### Basic Ratio Scoring

Simple string similarity using Levenshtein distance, providing a straightforward comparison between two strings without any preprocessing.

```python { .api }
def ratio(s1: str, s2: str) -> int:
    """
    Calculate similarity ratio between two strings using Levenshtein distance.

    Args:
        s1: First string to compare
        s2: Second string to compare

    Returns:
        int: Similarity score from 0-100
    """
```
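
As a rough illustration of how a score of this kind can be derived, the sketch below computes a normalized similarity from the longest common subsequence, which corresponds to an edit distance that allows only insertions and deletions. The name `lcs_ratio_sketch` is hypothetical and the formula is an assumption about the general technique. thefuzz delegates the real computation to its backend, so results may not agree in every case.

```python
def lcs_ratio_sketch(s1: str, s2: str) -> int:
    """Hypothetical helper: 0-100 similarity based on the longest common subsequence."""
    total = len(s1) + len(s2)
    if total == 0:
        return 100  # this sketch's own convention for two empty strings

    # Classic dynamic-programming LCS, keeping only the previous row.
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        curr = [0]
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr

    lcs = prev[-1]
    # Each matched character counts once in each string, hence the factor 2.
    return round(100 * 2 * lcs / total)


print(lcs_ratio_sketch("hello world", "hello world!"))  # 96
```

Here 11 of the 23 total characters on each side line up, so the sketch yields 100 * 22 / 23, which rounds to 96.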

### Partial Ratio Scoring

Finds the ratio of the most similar substring between two strings, useful when one string is contained within another or for partial matches.

```python { .api }
def partial_ratio(s1: str, s2: str) -> int:
    """
    Calculate similarity ratio of the most similar substring.

    Args:
        s1: First string to compare
        s2: Second string to compare

    Returns:
        int: Similarity score from 0-100 based on best substring match
    """
```
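
The substring idea can be sketched as a sliding window: the shorter string is compared against every equal-length slice of the longer one and the best score wins. This is a simplified stand-in, not the library's implementation. The name `window_partial_ratio_sketch` is hypothetical, and `difflib.SequenceMatcher` from the standard library replaces the real character-level scorer.

```python
from difflib import SequenceMatcher


def window_partial_ratio_sketch(s1: str, s2: str) -> int:
    """Hypothetical helper: best 0-100 score of the shorter string against
    equal-length windows of the longer one."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 100 if not longer else 0  # this sketch's own convention

    best = 0.0
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        # difflib stands in for the real scorer here.
        score = SequenceMatcher(None, shorter, window, autojunk=False).ratio()
        best = max(best, score)
    return round(100 * best)


print(window_partial_ratio_sketch("this is a test", "is a"))  # 100
```

Because "is a" appears verbatim inside the longer string, one window matches exactly and the sketch returns 100.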

### Token-Based Scoring

Scoring functions that split strings into tokens (words) and apply different matching strategies to handle word order differences and repeated words.

```python { .api }
def token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate similarity after sorting tokens alphabetically.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """

def partial_token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate partial similarity after sorting tokens alphabetically.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100 based on best partial match
    """

def token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate similarity using token set comparison.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """

def partial_token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Calculate partial similarity using token set comparison.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100 based on best partial match
    """
```
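
The two token strategies can be sketched as follows. Token sort rejoins each string's words in alphabetical order before comparing. Token set compares the shared words against each string's shared-plus-unique words and keeps the best of the three pairings. The helper names are hypothetical, preprocessing is reduced to `lower().split()`, and `difflib.SequenceMatcher` stands in for the real scorer, so treat this as an approximation of the approach rather than the library's code.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    # difflib stands in for the real character-level scorer.
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def token_sort_sketch(s1: str, s2: str) -> int:
    """Sort each string's tokens, rejoin them, then compare."""
    sorted1 = " ".join(sorted(s1.lower().split()))
    sorted2 = " ".join(sorted(s2.lower().split()))
    return _ratio(sorted1, sorted2)


def token_set_sketch(s1: str, s2: str) -> int:
    """Compare shared tokens against each string's shared-plus-unique tokens."""
    tokens1, tokens2 = set(s1.lower().split()), set(s2.lower().split())
    shared = " ".join(sorted(tokens1 & tokens2))
    combined1 = (shared + " " + " ".join(sorted(tokens1 - tokens2))).strip()
    combined2 = (shared + " " + " ".join(sorted(tokens2 - tokens1))).strip()
    return max(
        _ratio(shared, combined1),
        _ratio(shared, combined2),
        _ratio(combined1, combined2),
    )


print(token_sort_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
print(token_set_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))  # 100
```

Using sets is what makes the repeated "fuzzy" irrelevant in the second call: both inputs reduce to the same four tokens.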

### Advanced Combination Algorithms

Scoring functions that preprocess their inputs and, in the weighted variants, combine several of the algorithms above, scaling each component score according to the relative lengths of the two strings.

```python { .api }
def QRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Quick ratio: applies ratio after preprocessing both strings.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """

def UQRatio(s1: str, s2: str, full_process: bool = True) -> int:
    """
    Unicode quick ratio: QRatio with force_ascii set to False.

    Args:
        s1: First string to compare
        s2: Second string to compare
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """

def WRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int:
    """
    Weighted ratio that returns the best score from several algorithms.

    Combines ratio, partial_ratio, token_sort_ratio, and token_set_ratio,
    scaling the component scores according to the ratio of the two
    string lengths.

    Args:
        s1: First string to compare
        s2: Second string to compare
        force_ascii: Remove non-ASCII characters before processing
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """

def UWRatio(s1: str, s2: str, full_process: bool = True) -> int:
    """
    Unicode weighted ratio: WRatio with non-ASCII characters preserved.

    Args:
        s1: First string to compare
        s2: Second string to compare
        full_process: Lowercase, replace non-alphanumeric characters
            with whitespace, and trim before comparing

    Returns:
        int: Similarity score from 0-100
    """
```
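
The length-based weighting can be sketched roughly as below. The thresholds (1.5 and 8) and scale factors (0.95, 0.90, 0.60) are assumptions modelled on the historical fuzzywuzzy logic, all helper names are hypothetical, and the token set scorers are left out for brevity. `difflib.SequenceMatcher` again stands in for the real scorer, so the numbers will not always match thefuzz.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> float:
    # difflib stands in for the real character-level scorer.
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def _partial(a: str, b: str) -> float:
    # Best score of the shorter string against equal-length windows of the longer.
    short, long_ = sorted((a, b), key=len)
    return max(
        _ratio(short, long_[i:i + len(short)])
        for i in range(len(long_) - len(short) + 1)
    )


def _sorted_tokens(s: str) -> str:
    return " ".join(sorted(s.split()))


def weighted_ratio_sketch(s1: str, s2: str) -> int:
    """Hypothetical helper: pick the best of several scaled component scores."""
    a, b = s1.lower().strip(), s2.lower().strip()
    if not a or not b:
        return 0
    unbase_scale, partial_scale = 0.95, 0.90  # assumed scale factors
    base = _ratio(a, b)
    len_ratio = max(len(a), len(b)) / min(len(a), len(b))

    if len_ratio < 1.5:
        # Similar lengths: use the full-string token scorer, slightly discounted.
        token = _ratio(_sorted_tokens(a), _sorted_tokens(b)) * unbase_scale
        return round(max(base, token))

    if len_ratio > 8:
        partial_scale = 0.60  # very different lengths: trust partial matches less
    partial = _partial(a, b) * partial_scale
    partial_token = (
        _partial(_sorted_tokens(a), _sorted_tokens(b)) * unbase_scale * partial_scale
    )
    return round(max(base, partial, partial_token))


print(weighted_ratio_sketch("New York Mets vs Atlanta Braves",
                            "Atlanta Braves vs New York Mets"))  # 95
```

The discount on token-based scores means a pure reordering tops out at 95 in this sketch, while character-for-character identical strings still reach 100 through the base ratio.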

## Usage Examples

### Basic Comparison

```python
from thefuzz import fuzz

# Simple string comparison
score = fuzz.ratio("hello world", "hello world!")
print(score)  # 96

# Partial matching - useful for substring matching
score = fuzz.partial_ratio("this is a test", "is a")
print(score)  # 100
```

### Token-Based Matching

```python
from thefuzz import fuzz

# Handle word order differences
score = fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(score)  # 100

# Token set matching - handles duplicates and order
score = fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
print(score)  # 100
```

### Advanced Algorithms

```python
from thefuzz import fuzz

# WRatio combines several algorithms and returns the best weighted score
score = fuzz.WRatio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Mets")
print(score)  # High score despite different word order

# Unicode support: UWRatio keeps non-ASCII characters during preprocessing,
# whereas WRatio removes them by default (force_ascii=True)
score = fuzz.UWRatio("Café", "cafe")
```
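
To see why the Unicode variants can score the same input differently, the sketch below approximates the preprocessing step with and without `force_ascii`. The name `preprocess_sketch` is hypothetical, and the rules shown (drop non-ASCII characters, replace non-alphanumerics with spaces, lowercase, trim) are a simplified assumption about what the library does internally.

```python
def preprocess_sketch(text: str, force_ascii: bool = True) -> str:
    """Hypothetical helper approximating the preprocessing step."""
    if force_ascii:
        # Assumption: non-ASCII characters are dropped, not transliterated.
        text = "".join(ch for ch in text if ord(ch) < 128)
    # Replace anything that is not a letter or digit with a space.
    text = "".join(ch if ch.isalnum() else " " for ch in text)
    return text.lower().strip()


print(preprocess_sketch("Café", force_ascii=True))   # caf
print(preprocess_sketch("Café", force_ascii=False))  # café
```

With `force_ascii=True` the accented letter disappears entirely, so the comparison runs on "caf". With `force_ascii=False` it runs on "café", and the accented character counts as a mismatch against "cafe".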
