# Core Anonymization

The `AnonymizerEngine` provides the primary anonymization functionality: it takes text and analyzer results, then applies configurable transformation operators to the detected PII entities.

## Capabilities

### Text Anonymization

Main anonymization method that processes text with PII entity locations and applies transformation operators.

```python { .api }
def anonymize(
    self,
    text: str,
    analyzer_results: List[RecognizerResult],
    operators: Optional[Dict[str, OperatorConfig]] = None,
    conflict_resolution: ConflictResolutionStrategy = ConflictResolutionStrategy.MERGE_SIMILAR_OR_CONTAINED
) -> EngineResult:
    """
    Anonymize method to anonymize the given text.

    Parameters:
    - text (str): The text to anonymize
    - analyzer_results (List[RecognizerResult]): Results from analyzer containing PII locations
    - operators (Optional[Dict[str, OperatorConfig]]): Configuration for anonymizers per entity type
    - conflict_resolution (ConflictResolutionStrategy): Strategy for handling overlapping entities

    Returns:
    EngineResult: Contains anonymized text and metadata about transformations
    """
```

**Usage Examples:**

```python
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig

engine = AnonymizerEngine()

# Basic replacement
result = engine.anonymize(
    text="My name is John Doe",
    analyzer_results=[RecognizerResult("PERSON", 11, 19, 0.9)],
    operators={"PERSON": OperatorConfig("replace", {"new_value": "[REDACTED]"})}
)
print(result.text)  # "My name is [REDACTED]"

# Multiple operators, one per entity type
operators = {
    "PERSON": OperatorConfig("replace", {"new_value": "[PERSON]"}),
    # from_end is passed explicitly; the mask operator validates it as a required bool
    "EMAIL_ADDRESS": OperatorConfig(
        "mask", {"masking_char": "*", "chars_to_mask": 5, "from_end": False}
    ),
    "PHONE_NUMBER": OperatorConfig("redact")
}

result = engine.anonymize(
    text="Contact John at john@email.com or 555-1234",
    analyzer_results=[
        RecognizerResult("PERSON", 8, 12, 0.9),
        RecognizerResult("EMAIL_ADDRESS", 16, 30, 0.9),
        RecognizerResult("PHONE_NUMBER", 34, 42, 0.8)
    ],
    operators=operators
)
```

### Operator Management

Add or remove custom anonymization operators at runtime.

```python { .api }
def add_anonymizer(self, anonymizer_cls: Type[Operator]) -> None:
    """
    Add a new anonymizer to the engine.

    Parameters:
    - anonymizer_cls (Type[Operator]): The anonymizer class to add
    """

def remove_anonymizer(self, anonymizer_cls: Type[Operator]) -> None:
    """
    Remove an anonymizer from the engine.

    Parameters:
    - anonymizer_cls (Type[Operator]): The anonymizer class to remove
    """
```

**Usage Example:**

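```python
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig
from presidio_anonymizer.operators import Operator, OperatorType

class CustomHasher(Operator):
    def operate(self, text, params=None):
        # Custom hashing logic
        return f"HASH_{hash(text)}"

    def validate(self, params=None):
        pass  # no parameters to validate

    def operator_name(self):
        return "custom_hasher"  # name referenced in OperatorConfig

    def operator_type(self):
        return OperatorType.Anonymize

engine = AnonymizerEngine()
engine.add_anonymizer(CustomHasher)

# Use the custom operator
operators = {"PERSON": OperatorConfig("custom_hasher")}
```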

### Available Operators

Get a list of all available anonymization operators.

```python { .api }
def get_anonymizers(self) -> List[str]:
    """
    Return a list of supported anonymizers.

    Returns:
    List[str]: Names of available anonymizer operators
    """
```

**Usage Example:**

```python
from presidio_anonymizer import AnonymizerEngine

engine = AnonymizerEngine()
available = engine.get_anonymizers()
# Contents and order may vary by version,
# e.g. ['replace', 'redact', 'mask', 'hash', 'encrypt', 'keep']
print(available)
```

## Conflict Resolution Strategies

When detected PII entities overlap, the engine applies one of the following conflict resolution strategies:

### MERGE_SIMILAR_OR_CONTAINED (Default)

Merges entities of the same type that overlap or are contained within each other.

```python
from presidio_anonymizer.entities import ConflictResolutionStrategy

result = engine.anonymize(
    text=text,
    analyzer_results=overlapping_results,
    conflict_resolution=ConflictResolutionStrategy.MERGE_SIMILAR_OR_CONTAINED
)
```
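
To make the merging idea concrete, here is a minimal, library-independent sketch. It is not Presidio's implementation: `Span` and `merge_similar` are hypothetical names, and keeping the highest score of the merged spans is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    entity_type: str
    start: int
    end: int
    score: float


def merge_similar(spans: List[Span]) -> List[Span]:
    """Merge same-type spans that overlap or contain one another."""
    merged: List[Span] = []
    for span in sorted(spans, key=lambda s: (s.entity_type, s.start, s.end)):
        last = merged[-1] if merged else None
        if last and last.entity_type == span.entity_type and span.start < last.end:
            # Overlapping or contained: widen the previous span, keep the best score.
            last.end = max(last.end, span.end)
            last.score = max(last.score, span.score)
        else:
            # Copy so the caller's spans are never mutated.
            merged.append(Span(span.entity_type, span.start, span.end, span.score))
    return sorted(merged, key=lambda s: s.start)


spans = [
    Span("PERSON", 0, 4, 0.6),      # e.g. "John"
    Span("PERSON", 0, 8, 0.9),      # e.g. "John Doe", contains the span above
    Span("LOCATION", 18, 23, 0.8),  # different type, left alone
]
print(merge_similar(spans))
```

Spans of different entity types are never merged here, even when they overlap.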

### REMOVE_INTERSECTIONS

Adjusts boundaries of overlapping entities to remove intersections, keeping higher-scored entities intact.

```python
result = engine.anonymize(
    text=text,
    analyzer_results=overlapping_results,
    conflict_resolution=ConflictResolutionStrategy.REMOVE_INTERSECTIONS
)
```
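
The boundary adjustment can be pictured with a small standalone sketch. It illustrates the idea only and is not the library's code: `remove_intersections` is a hypothetical helper, and it simplifies the case where a lower-scored span fully surrounds a higher-scored one by keeping only the leading part.

```python
from typing import List, Tuple

# (entity_type, start, end, score); a plain tuple keeps the sketch short.
Span = Tuple[str, int, int, float]


def remove_intersections(spans: List[Span]) -> List[Span]:
    """Trim lower-scored spans so that no two spans overlap."""
    kept: List[Span] = []
    # Visit higher scores first so they claim their full range.
    for etype, start, end, score in sorted(spans, key=lambda s: -s[3]):
        for _, k_start, k_end, _ in kept:
            if start < k_end and k_start < end:  # ranges intersect
                if start < k_start:
                    end = k_start   # keep the part before the kept span
                else:
                    start = k_end   # keep the part after the kept span
        if start < end:             # drop spans trimmed down to nothing
            kept.append((etype, start, end, score))
    return sorted(kept, key=lambda s: s[1])


overlapping = [("PERSON", 0, 10, 0.9), ("LOCATION", 6, 15, 0.5)]
print(remove_intersections(overlapping))
```

In this example the higher-scored `PERSON` span keeps its full range and the `LOCATION` span is trimmed to start where it ends.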

## Default Behavior

- **Default Operator**: If no operator is specified for an entity type, the "replace" operator is used
- **Default Parameters**: Each operator has sensible defaults (e.g., mask uses the "*" character)
- **Whitespace Merging**: Adjacent entities of the same type separated only by whitespace are merged
- **Sorted Processing**: Entities are processed in start position order for consistent results
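
The defaults listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the engine's code: `merge_across_whitespace` and `replace_with_defaults` are hypothetical helpers, the `<ENTITY_TYPE>` placeholder is assumed to be what the default "replace" operator emits when no `new_value` is given, and the sketch applies replacements from the end of the text backwards so that earlier offsets stay valid.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (entity_type, start, end)


def merge_across_whitespace(text: str, spans: List[Span]) -> List[Span]:
    """Join same-type neighbours separated only by whitespace."""
    merged: List[Span] = []
    for etype, start, end in sorted(spans, key=lambda s: s[1]):
        if merged:
            p_type, p_start, p_end = merged[-1]
            gap = text[p_end:start]
            if p_type == etype and p_end <= start and gap.strip() == "":
                merged[-1] = (etype, p_start, end)
                continue
        merged.append((etype, start, end))
    return merged


def replace_with_defaults(text: str, spans: List[Span]) -> str:
    """Replace each span with <ENTITY_TYPE>, working from the end of the text."""
    for etype, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        text = text[:start] + f"<{etype}>" + text[end:]
    return text


text = "Call John Doe in Paris"
spans = [("PERSON", 5, 9), ("PERSON", 10, 13), ("LOCATION", 17, 22)]
print(replace_with_defaults(text, merge_across_whitespace(text, spans)))
# prints: Call <PERSON> in <LOCATION>
```

The two `PERSON` spans collapse into one before replacement, which is why the output contains a single `<PERSON>` placeholder.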