tessl/pypi-fuzzywuzzy

Fuzzy string matching library using Levenshtein Distance calculations for approximate string comparison and search

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/fuzzywuzzy@0.18.x

To install, run

npx @tessl/cli install tessl/pypi-fuzzywuzzy@0.18.0

# FuzzyWuzzy

A Python library for fuzzy string matching using Levenshtein Distance calculations. FuzzyWuzzy enables developers to find similarities between strings through multiple algorithms including ratio matching, partial ratio matching, token-based comparisons, and weighted combinations of scoring methods.

## Package Information

- **Package Name**: fuzzywuzzy
- **Language**: Python
- **Installation**: `pip install fuzzywuzzy`
- **Optional Speedup**: `pip install fuzzywuzzy[speedup]` (includes python-Levenshtein for 4-10x performance improvement)

## Core Imports

```python
from fuzzywuzzy import fuzz  # Core fuzzy string comparison algorithms
from fuzzywuzzy import process  # Functions for processing collections of strings
from fuzzywuzzy import utils  # String processing and validation utilities
```

For advanced string processing:

```python
from fuzzywuzzy.string_processing import StringProcessor
```

For optional high-performance matching (when python-Levenshtein is installed):

```python
from fuzzywuzzy.StringMatcher import StringMatcher
```

## Basic Usage

```python
from fuzzywuzzy import fuzz, process

# Basic string similarity
similarity = fuzz.ratio("this is a test", "this is a test!")
print(similarity)  # 97

# Partial string matching
partial_sim = fuzz.partial_ratio("this is a test", "this is a test!")
print(partial_sim)  # 100

# Token-based comparison (handles word order)
token_sim = fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(token_sim)  # 100

# Find best match in a list
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
best = process.extractOne("new york jets", choices)
print(best)  # ("New York Jets", 100)

# Extract multiple matches
matches = process.extract("new york", choices, limit=2)
print(matches)  # [("New York Jets", 90), ("New York Giants", 90)]
```

## Architecture

FuzzyWuzzy uses a layered approach to fuzzy string matching:

- **Core Algorithms** (`fuzz` module): Basic ratio, partial ratio, and token-based algorithms
- **Processing Layer** (`process` module): Functions for working with collections and finding best matches
- **Utilities Layer** (`utils` module): String preprocessing, validation, and helper functions
- **Optional Performance Layer**: Uses python-Levenshtein when available for significant speedup

The library automatically falls back to Python's `difflib.SequenceMatcher` when the optional Levenshtein library is not installed.

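The fallback described above can be sketched as a guarded import. This is a minimal illustration rather than the library's own source: `simple_ratio` is a hypothetical helper, and rounding the matcher's ratio to an integer between 0 and 100 is an assumption modeled on the scores shown in Basic Usage.

```python
# Prefer the optional Levenshtein-backed matcher; fall back to the standard library.
try:
    from fuzzywuzzy.StringMatcher import StringMatcher as SequenceMatcher
except ImportError:
    # Raised when fuzzywuzzy or python-Levenshtein is not installed.
    from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Hypothetical helper: similarity of two strings as an integer from 0 to 100."""
    matcher = SequenceMatcher(None, s1, s2)
    return int(round(100 * matcher.ratio()))


print(simple_ratio("this is a test", "this is a test!"))  # 97
```

Because both matchers accept the same constructor arguments and expose `ratio()`, the calling code does not need to know which backend was selected.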

## Capabilities

### Fuzzy String Algorithms

Core fuzzy string comparison functions, including basic ratio matching, partial ratio matching, and token-based algorithms that tolerate word-order variations and compare token sets. Each function returns an integer score from 0 to 100.

```python { .api }
def ratio(s1: str, s2: str) -> int: ...
def partial_ratio(s1: str, s2: str) -> int: ...
def token_sort_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def token_set_ratio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
def WRatio(s1: str, s2: str, force_ascii: bool = True, full_process: bool = True) -> int: ...
```

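To illustrate why sorting tokens makes a comparison insensitive to word order, here is a simplified sketch built on the standard library's `difflib`. The `sketch_*` names and the normalization regex are assumptions for illustration only; the real `fuzz.token_sort_ratio` applies its own preprocessing and may score edge cases differently.

```python
import re
from difflib import SequenceMatcher


def sketch_ratio(s1: str, s2: str) -> int:
    """Plain character-level similarity as an integer from 0 to 100."""
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


def sketch_token_sort_ratio(s1: str, s2: str) -> int:
    """Compare strings after lowercasing, tokenizing, and sorting the tokens."""

    def normalize(s: str) -> str:
        # Assumed normalization: keep only lowercase letters and digits as tokens.
        tokens = re.sub(r"[^a-z0-9]+", " ", s.lower()).split()
        return " ".join(sorted(tokens))

    return sketch_ratio(normalize(s1), normalize(s2))


a = "fuzzy wuzzy was a bear"
b = "wuzzy fuzzy was a bear"
print(sketch_ratio(a, b) < 100)       # True: character order differs
print(sketch_token_sort_ratio(a, b))  # 100: same tokens once sorted
```

The plain ratio penalizes the swapped words, while the token-sorted variant reduces both inputs to the same string before comparing.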

[Fuzzy Algorithms](./fuzzy-algorithms.md)

### String Collection Processing

Functions for extracting best matches from collections of strings, including single best match extraction, multiple match extraction with limits, and fuzzy deduplication capabilities.

```python { .api }
# Module-level defaults: default_processor = utils.full_process, default_scorer = fuzz.WRatio
def extractOne(query: str, choices, processor=default_processor, scorer=default_scorer, score_cutoff: int = 0): ...
def extract(query: str, choices, processor=default_processor, scorer=default_scorer, limit: int = 5): ...
def extractBests(query: str, choices, processor=default_processor, scorer=default_scorer, score_cutoff: int = 0, limit: int = 5): ...
def dedupe(contains_dupes: list, threshold: int = 70, scorer=fuzz.token_set_ratio): ...
```

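The general shape of these functions (score every choice, drop those under a cutoff, keep the top results) can be sketched with the standard library alone. The names `sketch_extract` and `sketch_extract_one` and the lowercase-only scorer are hypothetical stand-ins, so the scores will not match those produced by the library's default `WRatio` scorer.

```python
import heapq
from difflib import SequenceMatcher


def sketch_score(query: str, choice: str) -> int:
    """Assumed scorer: case-insensitive character similarity from 0 to 100."""
    matcher = SequenceMatcher(None, query.lower(), choice.lower())
    return int(round(100 * matcher.ratio()))


def sketch_extract(query, choices, limit=5, score_cutoff=0):
    """Return up to `limit` (choice, score) pairs, best first."""
    scored = ((choice, sketch_score(query, choice)) for choice in choices)
    kept = [pair for pair in scored if pair[1] >= score_cutoff]
    return heapq.nlargest(limit, kept, key=lambda pair: pair[1])


def sketch_extract_one(query, choices, score_cutoff=0):
    """Return the single best (choice, score) pair, or None if nothing qualifies."""
    best = sketch_extract(query, choices, limit=1, score_cutoff=score_cutoff)
    return best[0] if best else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(sketch_extract_one("new york jets", choices))  # ('New York Jets', 100)
print(sketch_extract_one("zzz", choices, score_cutoff=90))  # None
```

Raising `score_cutoff` is a simple way to treat weak matches as "no match" instead of returning the least-bad candidate.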

[String Processing](./string-processing.md)

### Utilities and Helpers

String preprocessing, validation functions, and utility decorators for handling edge cases in fuzzy string matching operations.

```python { .api }
def full_process(s: str, force_ascii: bool = False) -> str: ...
def validate_string(s) -> bool: ...
def make_type_consistent(s1, s2): ...
```

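As a rough picture of what this kind of preprocessing involves, the sketch below normalizes a string before comparison and rejects empty input. The `sketch_*` functions are hypothetical approximations written with the standard library; in particular, the ASCII handling here simply drops every non-ASCII character, which may differ from what `utils.full_process` does with `force_ascii=True`.

```python
import re


def sketch_full_process(s: str, force_ascii: bool = False) -> str:
    """Replace non-word characters with spaces, lowercase, and trim."""
    if force_ascii:
        # Assumed simplification: discard anything outside the ASCII range.
        s = s.encode("ascii", "ignore").decode("ascii")
    s = re.sub(r"\W", " ", s)
    return s.lower().strip()


def sketch_validate_string(s) -> bool:
    """True only for values that have a length greater than zero."""
    try:
        return len(s) > 0
    except TypeError:
        return False


print(repr(sketch_full_process("  Hello, World!  ")))  # 'hello  world'
print(sketch_validate_string(""))  # False
```

Punctuation is replaced rather than removed, so `"Hello, World!"` keeps two spaces between the words; token-based scorers split on whitespace and are unaffected by that.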

[Utilities](./utilities.md)