tessl/pypi-presidio-anonymizer

Presidio Anonymizer package - replaces analyzed text with desired values.


Deanonymization

The DeanonymizeEngine reverses anonymization that was performed with a reversible operator, such as encryption. It takes the anonymized text together with the metadata produced during anonymization and restores the original content.

Capabilities

Text Deanonymization

Main deanonymization method that reverses anonymization operations using operator metadata.

def deanonymize(
    self,
    text: str,
    entities: List[OperatorResult],
    operators: Dict[str, OperatorConfig]
) -> EngineResult:
    """
    Deanonymize text that was previously anonymized.

    Parameters:
    - text (str): The anonymized text to restore
    - entities (List[OperatorResult]): Metadata from original anonymization
    - operators (Dict[str, OperatorConfig]): Configuration for deanonymization operators

    Returns:
    EngineResult: Contains restored text and metadata about transformations
    """

Usage Example:

from presidio_anonymizer import AnonymizerEngine, DeanonymizeEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig

# First, anonymize with reversible operators
anonymizer = AnonymizerEngine()
original_text = "My credit card is 4111-1111-1111-1111"
analyzer_results = [RecognizerResult("CREDIT_CARD", 18, 37, 0.9)]

# Use encryption for reversible anonymization (the AES key must be 16, 24, or 32 characters)
encrypt_config = OperatorConfig("encrypt", {"key": "my-secret-key-32-characters-long"})
anonymize_result = anonymizer.anonymize(
    text=original_text,
    analyzer_results=analyzer_results,
    operators={"CREDIT_CARD": encrypt_config}
)

print(f"Anonymized: {anonymize_result.text}")

# Now deanonymize with the same key
deanonymizer = DeanonymizeEngine()
decrypt_config = OperatorConfig("decrypt", {"key": "my-secret-key-32-characters-long"})

deanonymize_result = deanonymizer.deanonymize(
    text=anonymize_result.text,
    entities=anonymize_result.items,  # Use original anonymization metadata
    operators={"CREDIT_CARD": decrypt_config}
)

print(f"Restored: {deanonymize_result.text}")  # Original text restored
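
To see why deanonymize needs the entities list and not just the text, the following is a minimal standard-library sketch of span-based restoration. It is not Presidio's implementation: restore_spans, the dict-based spans, and the vault lookup are hypothetical names chosen for illustration. It only shows the general idea that each metadata item locates a span, and that spans are best processed right to left so a length change does not shift offsets that are still pending.

```python
from typing import Callable, Dict, List


def restore_spans(text: str, spans: List[Dict], restore: Callable[[str], str]) -> str:
    """Apply `restore` to each [start, end) span of `text`.

    Spans are processed right to left so that a replacement with a
    different length does not shift the offsets of spans still to come.
    """
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        start, end = span["start"], span["end"]
        text = text[:start] + restore(text[start:end]) + text[end:]
    return text


# Toy reversible operator: a lookup table from placeholder to original value.
vault = {"<A>": "Alice", "<B>": "Bob Smith"}
anonymized = "Call <A> or <B> today"
spans = [{"start": 5, "end": 8}, {"start": 12, "end": 15}]

restored = restore_spans(anonymized, spans, lambda token: vault[token])
print(restored)
```

Without the offsets, the function would have no way to tell which parts of the text were transformed, which is the role the OperatorResult list plays in the real engine.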

Deanonymizer Management

Add or remove custom deanonymization operators at runtime.

def add_deanonymizer(self, deanonymizer_cls: Type[Operator]) -> None:
    """
    Add a new deanonymizer to the engine.

    Parameters:
    - deanonymizer_cls (Type[Operator]): The deanonymizer class to add
    """

def remove_deanonymizer(self, deanonymizer_cls: Type[Operator]) -> None:
    """
    Remove a deanonymizer from the engine.

    Parameters:
    - deanonymizer_cls (Type[Operator]): The deanonymizer class to remove
    """

Usage Example:

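from presidio_anonymizer import DeanonymizeEngine
from presidio_anonymizer.operators import Operator, OperatorType

class CustomDecrypter(Operator):
    def operate(self, text, params=None):
        return decrypt_with_custom_algorithm(text, params.get("key"))  # placeholder helper

    def validate(self, params=None):
        pass  # Raise here if required params (such as "key") are missing

    def operator_name(self):
        return "custom_decrypt"  # Illustrative name, passed to OperatorConfig

    def operator_type(self):
        return OperatorType.Deanonymize

deanonymizer = DeanonymizeEngine()
deanonymizer.add_deanonymizer(CustomDecrypter)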

Available Deanonymizers

Get the names of all available deanonymization operators.

def get_deanonymizers(self) -> List[str]:
    """
    Return a list of supported deanonymizers.

    Returns:
    List[str]: Names of available deanonymizer operators
    """

Usage Example:

deanonymizer = DeanonymizeEngine()
available = deanonymizer.get_deanonymizers()
print(available)  # e.g. ['decrypt', 'deanonymize_keep']; order may vary

Reversible Operators

Only certain operators support deanonymization:

Encrypt/Decrypt

Uses AES encryption with a secret key to enable full restoration. The key must be 128, 192, or 256 bits long, which means 16, 24, or 32 characters when passed as an ASCII string.

# Anonymization
encrypt_config = OperatorConfig("encrypt", {
    "key": "my-secret-key-32-characters-long"  # 32 characters = 256-bit key
})

# Deanonymization
decrypt_config = OperatorConfig("decrypt", {
    "key": "my-secret-key-32-characters-long"  # Must match
})
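
Because an invalid key length is an easy mistake to make, a small pre-flight check can catch it before the key reaches the operator. The sketch below uses only the standard library; is_valid_aes_key is a hypothetical helper, not part of Presidio, and it assumes string keys are encoded as UTF-8 before their size is checked.

```python
VALID_AES_KEY_BYTES = (16, 24, 32)  # AES-128, AES-192, AES-256


def is_valid_aes_key(key: str) -> bool:
    """Return True if `key` encodes to a valid AES key length in UTF-8."""
    return len(key.encode("utf-8")) in VALID_AES_KEY_BYTES


print(is_valid_aes_key("0123456789abcdef"))  # 16 bytes
print(is_valid_aes_key("too-short"))         # 9 bytes
```

Note that the check counts bytes, not characters, so a key containing non-ASCII characters may have a different length than it appears to.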

Keep/DeanonymizeKeep

Keeps text unchanged during both anonymization and deanonymization.

# Anonymization uses "keep"; deanonymization uses "deanonymize_keep"
keep_config = OperatorConfig("keep")
deanonymize_keep_config = OperatorConfig("deanonymize_keep")

Workflow Pattern

  1. Anonymize with reversible operators and keep the returned EngineResult
  2. Store its items (the anonymization metadata) alongside the anonymized text
  3. Deanonymize using the anonymized text, the stored metadata, and matching operators

# Step 1: Anonymize and save metadata
anonymize_result = anonymizer.anonymize(text, analyzer_results, operators)
anonymized_text = anonymize_result.text
anonymization_metadata = anonymize_result.items  # Save this!

# Step 2: Later, restore original text
deanonymize_result = deanonymizer.deanonymize(
    text=anonymized_text,
    entities=anonymization_metadata,  # Use saved metadata
    operators=deanonymize_operators
)
original_text = deanonymize_result.text
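
When anonymization and deanonymization happen in different processes, the metadata has to be persisted in between. The following is a minimal sketch of a JSON round trip using only the standard library. It assumes each metadata item has been reduced to a plain dict with start, end, and entity_type fields; dump_metadata, load_metadata, and the sample record are hypothetical, and converting to and from Presidio's OperatorResult objects is left out.

```python
import json
from typing import Dict, List

REQUIRED_FIELDS = {"start", "end", "entity_type"}


def dump_metadata(items: List[Dict]) -> str:
    """Serialize anonymization metadata (one dict per entity) to JSON."""
    return json.dumps(items)


def load_metadata(payload: str) -> List[Dict]:
    """Load metadata back and check the fields needed to locate each span."""
    items = json.loads(payload)
    for item in items:
        missing = REQUIRED_FIELDS - item.keys()
        if missing:
            raise ValueError(f"metadata item is missing fields: {sorted(missing)}")
    return items


# Hypothetical record for one encrypted CREDIT_CARD span (offsets are made up).
saved = dump_metadata([
    {"start": 18, "end": 82, "entity_type": "CREDIT_CARD", "operator": "encrypt"}
])
restored_items = load_metadata(saved)
print(restored_items[0]["entity_type"])
```

Validating on load means a truncated or hand-edited record fails early instead of producing a wrong restoration later.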

Limitations

  • Irreversible Operators: Replace, mask, redact, and hash cannot be deanonymized
  • Key Management: Encryption keys must be securely stored and matched exactly
  • Metadata Required: Original anonymization metadata (OperatorResult list) is required
  • Operator Consistency: Deanonymization operators must match anonymization operators
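
The first and last limitations can be checked up front. The sketch below, using only the standard library, maps each entity type's anonymizer name to its deanonymizer counterpart and fails if an irreversible operator was used. REVERSE_OPERATOR and reverse_operator_names are hypothetical names for illustration; the pairs come from the Reversible Operators section above.

```python
from typing import Dict

# Anonymizer name -> deanonymizer name, for the reversible pairs listed above.
REVERSE_OPERATOR = {"encrypt": "decrypt", "keep": "deanonymize_keep"}


def reverse_operator_names(anonymize_ops: Dict[str, str]) -> Dict[str, str]:
    """Map each entity type's anonymizer name to its deanonymizer name.

    Raises ValueError if any entity type used an irreversible operator.
    """
    irreversible = {
        entity: op for entity, op in anonymize_ops.items()
        if op not in REVERSE_OPERATOR
    }
    if irreversible:
        raise ValueError(f"cannot deanonymize: {irreversible}")
    return {entity: REVERSE_OPERATOR[op] for entity, op in anonymize_ops.items()}


mapping = reverse_operator_names({"CREDIT_CARD": "encrypt", "PERSON": "keep"})
print(mapping)
```

The resulting names could then be used to build the OperatorConfig objects passed to deanonymize, with the same key supplied for each decrypt entry.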
