Tessl Tile for pypi/presidio-anonymizer@2.2.0

# Core Anonymization

The `AnonymizerEngine` provides the primary anonymization functionality: it takes text and analyzer results and applies configurable transformation operators to the detected PII entities.

## Capabilities

### Text Anonymization

The main anonymization method processes text together with PII entity locations and applies transformation operators.

```python { .api }
def anonymize(
    self,
    text: str,
    analyzer_results: List[RecognizerResult],
    operators: Optional[Dict[str, OperatorConfig]] = None,
    conflict_resolution: ConflictResolutionStrategy = ConflictResolutionStrategy.MERGE_SIMILAR_OR_CONTAINED
) -> EngineResult:
    """
    Anonymize method to anonymize the given text.

    Parameters:
    - text (str): The text to anonymize
    - analyzer_results (List[RecognizerResult]): Results from analyzer containing PII locations
    - operators (Optional[Dict[str, OperatorConfig]]): Configuration for anonymizers per entity type
    - conflict_resolution (ConflictResolutionStrategy): Strategy for handling overlapping entities

    Returns:
    EngineResult: Contains anonymized text and metadata about transformations
    """
```

**Usage Examples:**

```python
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig

engine = AnonymizerEngine()

# Basic replacement
result = engine.anonymize(
    text="My name is John Doe",
    analyzer_results=[RecognizerResult("PERSON", 11, 19, 0.9)],
    operators={"PERSON": OperatorConfig("replace", {"new_value": "[REDACTED]"})}
)
print(result.text)  # My name is [REDACTED]

# Multiple operators
operators = {
    "PERSON": OperatorConfig("replace", {"new_value": "[PERSON]"}),
    # from_end is passed explicitly so the mask configuration is complete
    "EMAIL_ADDRESS": OperatorConfig("mask", {"masking_char": "*", "chars_to_mask": 5, "from_end": False}),
    "PHONE_NUMBER": OperatorConfig("redact")
}

result = engine.anonymize(
    text="Contact John at john@email.com or 555-1234",
    analyzer_results=[
        RecognizerResult("PERSON", 8, 12, 0.9),
        RecognizerResult("EMAIL_ADDRESS", 16, 30, 0.9),
        RecognizerResult("PHONE_NUMBER", 34, 42, 0.8)
    ],
    operators=operators
)
```
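
The `RecognizerResult` offsets in the examples above are character positions with an exclusive end (for instance, `11, 19` covers "John Doe"). In practice these usually come from an analyzer, but when writing tests or demos by hand it helps to compute them rather than count characters. The sketch below uses a hypothetical helper, `find_span`, which is not part of presidio-anonymizer; it relies only on the standard library.

```python
from typing import Optional, Tuple


def find_span(text: str, value: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first occurrence of value, end exclusive.

    Hypothetical helper for hand-built examples; not a Presidio API.
    """
    start = text.find(value)
    if start == -1:
        return None  # value does not occur in text
    return start, start + len(value)


print(find_span("My name is John Doe", "John Doe"))  # (11, 19)
```

The returned pair can be passed as the `start` and `end` arguments when constructing a result by hand.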

### Operator Management

Add or remove custom anonymization operators at runtime.

```python { .api }
def add_anonymizer(self, anonymizer_cls: Type[Operator]) -> None:
    """
    Add a new anonymizer to the engine.

    Parameters:
    - anonymizer_cls (Type[Operator]): The anonymizer class to add
    """

def remove_anonymizer(self, anonymizer_cls: Type[Operator]) -> None:
    """
    Remove an anonymizer from the engine.

    Parameters:
    - anonymizer_cls (Type[Operator]): The anonymizer class to remove
    """
```

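**Usage Example:**

```python
import hashlib
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig
from presidio_anonymizer.operators import Operator, OperatorType

class CustomHasher(Operator):
    def operate(self, text=None, params=None):
        # hashlib gives a digest that is stable across runs, unlike built-in hash()
        return f"HASH_{hashlib.sha256(text.encode()).hexdigest()}"

    def validate(self, params=None):
        pass  # this operator takes no parameters

    def operator_name(self):
        return "custom_hasher"  # the name referenced in OperatorConfig

    def operator_type(self):
        return OperatorType.Anonymize

engine = AnonymizerEngine()
engine.add_anonymizer(CustomHasher)

# Use the custom operator
operators = {"PERSON": OperatorConfig("custom_hasher")}
```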

### Available Operators

Get the list of all available anonymization operators.

```python { .api }
def get_anonymizers(self) -> List[str]:
    """
    Return a list of supported anonymizers.

    Returns:
    List[str]: Names of available anonymizer operators
    """
```

**Usage Example:**

```python
from presidio_anonymizer import AnonymizerEngine

engine = AnonymizerEngine()
available = engine.get_anonymizers()
print(available)  # e.g. a list including 'replace', 'redact', 'mask', 'hash', 'encrypt', 'keep'
```

## Conflict Resolution Strategies

When PII entities overlap, the engine applies one of the following conflict resolution strategies:

### MERGE_SIMILAR_OR_CONTAINED (Default)

Merges entities of the same type that overlap or are contained within each other.

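```python
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import ConflictResolutionStrategy, RecognizerResult

engine = AnonymizerEngine()
text = "My name is John Doe"
# Two PERSON results, one contained in the other
overlapping_results = [RecognizerResult("PERSON", 11, 19, 0.9), RecognizerResult("PERSON", 11, 15, 0.8)]

result = engine.anonymize(
    text=text,
    analyzer_results=overlapping_results,
    conflict_resolution=ConflictResolutionStrategy.MERGE_SIMILAR_OR_CONTAINED
)
```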
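
To make the merge rule described above concrete, here is a minimal standalone sketch. It is an illustration of the stated behavior, not the library's implementation: the `Span` dataclass and `merge_similar` function are hypothetical names, and keeping the highest score of the merged spans is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for an analyzer result."""
    entity_type: str
    start: int
    end: int
    score: float


def merge_similar(spans: List[Span]) -> List[Span]:
    """Merge same-type spans that overlap or contain one another."""
    merged: List[Span] = []
    # Group by type, then by position, so candidates for merging are adjacent
    for span in sorted(spans, key=lambda s: (s.entity_type, s.start)):
        last = merged[-1] if merged else None
        if last and last.entity_type == span.entity_type and span.start < last.end:
            last.end = max(last.end, span.end)        # extend to the union
            last.score = max(last.score, span.score)  # assumption: keep best score
        else:
            merged.append(Span(span.entity_type, span.start, span.end, span.score))
    return sorted(merged, key=lambda s: s.start)


spans = [
    Span("PERSON", 11, 19, 0.9),
    Span("PERSON", 11, 15, 0.8),    # contained in the first PERSON span
    Span("LOCATION", 13, 17, 0.5),  # different type, left untouched
]
print(merge_similar(spans))
```

Spans of different types are deliberately left alone here, since the strategy as described only merges entities of the same type.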

### REMOVE_INTERSECTIONS

Adjusts boundaries of overlapping entities to remove intersections, keeping higher-scored entities intact.

```python
# Reuses engine, text, and overlapping_results from the previous example
result = engine.anonymize(
    text=text,
    analyzer_results=overlapping_results,
    conflict_resolution=ConflictResolutionStrategy.REMOVE_INTERSECTIONS
)
```
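
The boundary adjustment described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's algorithm: `Span` and `remove_intersections` are hypothetical names, a lower-scored span that is fully covered by a higher-scored one is dropped, and a lower-scored span that surrounds a higher-scored one keeps only its left part.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for an analyzer result."""
    entity_type: str
    start: int
    end: int
    score: float


def remove_intersections(spans: List[Span]) -> List[Span]:
    """Trim lower-scored spans so that no two kept spans intersect."""
    kept: List[Span] = []
    # Highest score first, so stronger detections are never trimmed
    for cand in sorted(spans, key=lambda s: s.score, reverse=True):
        start, end = cand.start, cand.end
        for other in kept:
            if start < other.end and other.start < end:  # spans intersect
                if start < other.start:
                    end = other.start   # keep the part left of `other`
                else:
                    start = other.end   # keep the part right of `other`
        if start < end:  # drop spans that were trimmed to nothing
            kept.append(replace(cand, start=start, end=end))
    return sorted(kept, key=lambda s: s.start)


spans = [Span("PERSON", 0, 10, 0.9), Span("LOCATION", 5, 15, 0.6)]
print(remove_intersections(spans))
```

With this input the higher-scored `PERSON` span stays at `0-10`, and the `LOCATION` span is trimmed to `10-15`.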

## Default Behavior

- **Default Operator**: If no operator is specified for an entity type, the engine uses the `replace` operator
- **Default Parameters**: Each operator has sensible defaults (e.g., mask uses "*" character)
- **Whitespace Merging**: Adjacent entities of the same type separated only by whitespace are merged
- **Sorted Processing**: Entities are processed in start position order for consistent results
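
The whitespace-merging and sorted-processing rules listed above can be illustrated with a short standalone sketch. It is an assumption-laden illustration rather than the engine's code: `merge_across_whitespace` is a hypothetical function, and spans are plain `(entity_type, start, end)` tuples.

```python
from typing import List, Tuple

SpanTuple = Tuple[str, int, int]  # (entity_type, start, end), end exclusive


def merge_across_whitespace(text: str, spans: List[SpanTuple]) -> List[SpanTuple]:
    """Merge neighbouring same-type spans separated only by whitespace."""
    merged: List[SpanTuple] = []
    # Process in start-position order so neighbours are compared left to right
    for etype, start, end in sorted(spans, key=lambda s: s[1]):
        if merged:
            p_type, p_start, p_end = merged[-1]
            gap = text[p_end:start]
            if p_type == etype and p_end <= start and gap.strip() == "":
                merged[-1] = (etype, p_start, end)  # absorb into previous span
                continue
        merged.append((etype, start, end))
    return merged


text = "John  Doe lives in Paris"
spans = [("PERSON", 6, 9), ("PERSON", 0, 4), ("LOCATION", 19, 24)]
print(merge_across_whitespace(text, spans))
# [('PERSON', 0, 9), ('LOCATION', 19, 24)]
```

The two `PERSON` spans are joined because only spaces separate them, while the `LOCATION` span is left as is.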
