
tessl/pypi-levenshtein

Python extension for computing string edit distances and similarities.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/levenshtein@0.27.x

To install, run

npx @tessl/cli install tessl/pypi-levenshtein@0.27.0

# Levenshtein

A high-performance Python C extension for computing various string edit distances and similarities. The library provides fast computation of Levenshtein (edit) distance, Hamming distance, Jaro and Jaro-Winkler similarities, along with detailed edit operations, string averaging, and sequence similarity analysis.

## Package Information

- **Package Name**: Levenshtein
- **Language**: Python
- **Installation**: `pip install levenshtein`
- **Requirements**: Python 3.9 or later

## Core Imports

```python
import Levenshtein
```

Common usage patterns:

```python
from Levenshtein import distance, ratio, editops, opcodes, median
```

## Basic Usage

```python
import Levenshtein

# Calculate edit distance between strings
dist = Levenshtein.distance("kitten", "sitting")
print(f"Edit distance: {dist}")  # Edit distance: 3

# Calculate similarity ratio (0.0 to 1.0)
similarity = Levenshtein.ratio("kitten", "sitting")
print(f"Similarity: {similarity:.2f}")  # Similarity: 0.62

# Get edit operations to transform one string to another
ops = Levenshtein.editops("kitten", "sitting")
print(ops)  # [('replace', 0, 0), ('replace', 4, 4), ('insert', 6, 6)]

# Find approximate median of multiple strings
strings = ["Levenshtein", "Levenhstein", "Levenshtien", "Levenstein"]
med = Levenshtein.median(strings)
print(f"Median: {med}")  # Median: Levenshtein
```

## Architecture

The Levenshtein library is built on the rapidfuzz library for its core distance algorithms and provides:

- **High Performance**: C extension implementation for fast computation
- **Multiple Metrics**: Support for various string distance and similarity measures
- **Edit Analysis**: Detailed edit operation sequences and transformations
- **String Averaging**: Median string calculation and string improvement algorithms
- **Compatibility**: SequenceMatcher-like interface; `opcodes` and `matching_blocks` return tuples with the same meaning as the output of `difflib.SequenceMatcher.get_opcodes()` and `get_matching_blocks()`
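To make the SequenceMatcher-like shape concrete, the sketch below uses only the standard library. It prints the 5-tuples produced by `difflib.SequenceMatcher.get_opcodes()` and replays them to rebuild the destination string. `Levenshtein.opcodes` is described as returning tuples with the same meaning, but the two libraries use different algorithms, so the exact operation sequences may differ. The `rebuild` helper is a hypothetical name introduced for illustration.

```python
from difflib import SequenceMatcher

source, destination = "kitten", "sitting"


def rebuild(opcodes, a, b):
    """Rebuild b from a by replaying (tag, i1, i2, j1, j2) opcodes."""
    pieces = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            pieces.append(a[i1:i2])      # unchanged run copied from the source
        elif tag in ("replace", "insert"):
            pieces.append(b[j1:j2])      # new text taken from the destination
        # "delete" drops a[i1:i2], so nothing is appended
    return "".join(pieces)


opcodes = SequenceMatcher(None, source, destination).get_opcodes()
for op in opcodes:
    print(op)  # (tag, start1, end1, start2, end2)

print(rebuild(opcodes, source, destination))  # sitting
```

Because the replay only relies on the tuple layout, the same helper should also work on any opcode list that follows this convention.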

## Capabilities

### String Distance and Similarity

Core functions for computing various string distance metrics and similarity scores, including Levenshtein distance, normalized similarity ratios, Hamming distance, and Jaro/Jaro-Winkler similarities.

```python { .api }
def distance(s1, s2, *, weights=(1, 1, 1), processor=None, score_cutoff=None, score_hint=None):
    """Calculate Levenshtein distance with custom operation weights."""

def ratio(s1, s2, *, processor=None, score_cutoff=None):
    """Calculate normalized indel similarity ratio [0, 1]."""

def hamming(s1, s2, *, pad=True, processor=None, score_cutoff=None):
    """Calculate Hamming distance (substitutions only)."""

def jaro(s1, s2, *, processor=None, score_cutoff=None):
    """Calculate Jaro similarity."""

def jaro_winkler(s1, s2, *, prefix_weight=0.1, processor=None, score_cutoff=None):
    """Calculate Jaro-Winkler similarity with prefix weighting."""
```

[String Distance and Similarity](./string-distance.md)
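As a rough illustration of what `distance` and `ratio` compute, here is a pure-Python reference sketch. It is not the library's implementation (which is a C extension built on rapidfuzz) and is far slower. It assumes `weights` is ordered as (insertion, deletion, substitution), and that the indel ratio equals one minus the distance with a substitution weight of 2, divided by the combined length. `weighted_distance` and `indel_ratio` are hypothetical names used only here.

```python
def weighted_distance(s1, s2, weights=(1, 1, 1)):
    """Reference dynamic program for a weighted Levenshtein distance.

    weights is assumed to be (insertion, deletion, substitution).
    """
    ins, dele, sub = weights
    # previous[j] holds the cost of turning the processed prefix of s1 into s2[:j]
    previous = [j * ins for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, start=1):
        current = [i * dele]  # turning s1[:i] into the empty string
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + dele,                          # delete c1
                current[j - 1] + ins,                        # insert c2
                previous[j - 1] + (0 if c1 == c2 else sub),  # keep or substitute
            ))
        previous = current
    return previous[-1]


def indel_ratio(s1, s2):
    """Normalized indel similarity in [0, 1] (substitution counted as 2)."""
    total = len(s1) + len(s2)
    if total == 0:
        return 1.0  # two empty sequences are treated as identical
    return 1.0 - weighted_distance(s1, s2, weights=(1, 1, 2)) / total


print(weighted_distance("kitten", "sitting"))                     # 3
print(weighted_distance("kitten", "sitting", weights=(1, 1, 2)))  # 5
print(round(indel_ratio("kitten", "sitting"), 2))                 # 0.62
```

The last value matches the `0.62` shown for `Levenshtein.ratio("kitten", "sitting")` in Basic Usage, which is why `ratio` can be lower than a score derived from the plain edit distance.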

### Edit Operations

Functions for analyzing and manipulating edit operation sequences that transform one string into another, including conversion between different operation formats and applying transformations.

```python { .api }
def editops(*args):
    """Find sequence of edit operations (triples) transforming one string to another."""

def opcodes(*args):
    """Find sequence of edit operations (5-tuples) like SequenceMatcher."""

def matching_blocks(edit_operations, source_string, destination_string):
    """Find identical blocks in two strings from edit operations."""

def apply_edit(edit_operations, source_string, destination_string):
    """Apply sequence of edit operations to transform a string."""
```

[Edit Operations](./edit-operations.md)
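To show how the `(operation, source_pos, dest_pos)` triples can be read, the following pure-Python sketch replays an edit-operation list by hand. It assumes the operations are ordered by position, as in the `editops("kitten", "sitting")` output shown in Basic Usage. `replay_editops` is a hypothetical helper for illustration; in real code the library's own `apply_edit` is the function to use.

```python
def replay_editops(ops, source, destination):
    """Apply (tag, source_pos, dest_pos) triples to source, left to right."""
    result = []
    src_i = 0  # next unread position in source
    for tag, spos, dpos in ops:
        # Copy the untouched characters that precede this operation.
        result.extend(source[src_i:spos])
        src_i = spos
        if tag == "replace":
            result.append(destination[dpos])  # swap in the destination character
            src_i += 1
        elif tag == "insert":
            result.append(destination[dpos])  # add a character, consume nothing
        elif tag == "delete":
            src_i += 1                        # skip the source character
        else:
            raise ValueError(f"unknown edit operation: {tag!r}")
    result.extend(source[src_i:])  # copy whatever follows the last operation
    return "".join(result)


ops = [("replace", 0, 0), ("replace", 4, 4), ("insert", 6, 6)]
print(replay_editops(ops, "kitten", "sitting"))            # sitting
print(replay_editops([("delete", 1, 1)], "abc", "ac"))     # ac
```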

### String Averaging and Median

Functions for computing approximate median strings, improving strings toward a target set, and calculating sequence and set similarity ratios for multiple strings.

```python { .api }
def median(strings, weights=None):
    """Find approximate median string from a list of strings."""

def quickmedian(strings, weights=None):
    """Fast approximate median string calculation."""

def median_improve(string, strings, weights=None):
    """Improve a string towards median of given strings."""

def seqratio(strings1, strings2):
    """Calculate similarity ratio between two string sequences."""

def setratio(strings1, strings2):
    """Calculate similarity ratio between two string sets."""
```

[String Averaging and Median](./string-averaging.md)
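A median string is usually understood as a string that minimizes the (optionally weighted) sum of edit distances to all inputs. The sketch below illustrates that objective in pure Python with a deliberately simple baseline: it only considers the input strings themselves as candidates. This is an assumption-laden illustration, not the algorithm behind `median` or `quickmedian`, which may return a string that does not appear in the input. The helper names are hypothetical.

```python
def levenshtein(s1, s2):
    """Plain unit-cost edit distance (row-by-row dynamic program)."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (c1 != c2),  # match or substitution
            ))
        previous = current
    return previous[-1]


def total_distance(candidate, strings, weights=None):
    """Weighted sum of distances from candidate to every string."""
    if weights is None:
        weights = [1] * len(strings)
    return sum(w * levenshtein(candidate, s) for s, w in zip(strings, weights))


def best_input(strings, weights=None):
    """Baseline: the input string with the smallest total distance."""
    return min(strings, key=lambda s: total_distance(s, strings, weights))


strings = ["SpSm", "mpamm", "Spam", "Spa", "Sua", "hSam"]
print(best_input(strings))              # Spam
print(total_distance("Spam", strings))  # 8
```

Passing `weights` lets some inputs count more than others, which mirrors the role of the optional `weights` argument in the signatures above.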

## Types

```python { .api }
from typing import Any, Callable, List, Tuple, Union

# Type aliases for function parameters
Sequence = Union[str, bytes, List[Any]]
Processor = Callable[[Sequence], Sequence]
EditOperation = Tuple[str, int, int]  # (operation, source_pos, dest_pos)
Opcode = Tuple[str, int, int, int, int]  # (operation, start1, end1, start2, end2)
MatchingBlock = Tuple[int, int, int]  # (source_pos, dest_pos, length)
```