Batch anonymize DICOM medical images by removing sensitive patient information (name, ID, birth date) while preserving image data for research use. Trigger when users need to de-identify medical imaging data, prepare DICOM files for research sharing, or remove PHI from radiology or other scanned images.
A clinical-grade tool for batch anonymization of DICOM medical images, removing patient identifiable information while preserving essential imaging data for research and analysis.
This skill anonymizes DICOM (Digital Imaging and Communications in Medicine) files by removing or replacing Protected Health Information (PHI) while maintaining the integrity of the medical image data. It supports batch processing of entire directories and generates audit logs for compliance documentation.
# Anonymize a single file
python scripts/main.py --input patient_scan.dcm --output anonymized.dcm
# Batch process a directory
python scripts/main.py --input /path/to/dicom/folder/ --output /path/to/output/ --batch
# Preserve study relationships with pseudonyms
python scripts/main.py --input scans/ --output clean/ --batch --preserve-studies
# Custom anonymization (keep age, remove birth date)
python scripts/main.py --input scan.dcm --output clean.dcm --keep-tags PatientAge

from scripts.main import DICOMAnonymizer
anonymizer = DICOMAnonymizer(preserve_studies=True)
result = anonymizer.anonymize_file("input.dcm", "output.dcm")
print(f"Tags anonymized: {len(result.anonymized_tags)}")
# Batch processing
results = anonymizer.anonymize_directory("input_folder/", "output_folder/")

| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| --input, -i | string | - | Yes | Input DICOM file or directory path |
| --output, -o | string | - | Yes | Output DICOM file or directory path |
| --batch, -b | flag | false | No | Enable batch/directory processing |
| --preserve-studies | flag | false | No | Maintain study relationships with pseudonyms |
| --keep-tags | string | - | No | Comma-separated list of tags to preserve |
| --remove-private | flag | true | No | Remove private/unknown tags |
| --audit-log | string | - | No | Path for JSON audit log |
| --overwrite | flag | false | No | Overwrite existing output files |
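To make the parameter table above concrete, here is a minimal `argparse` sketch that mirrors the documented options. It is not the actual parser in `scripts/main.py`. The helper names `split_tags` and `build_parser` are hypothetical, and exposing `--remove-private` through `argparse.BooleanOptionalAction` (which also creates `--no-remove-private`) is an assumption about how a flag with a default of `true` might be switched off.

```python
import argparse


def split_tags(value: str) -> list[str]:
    """Split a comma-separated tag list, dropping blanks and stray whitespace."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Anonymize DICOM files (sketch)")
    parser.add_argument("--input", "-i", required=True,
                        help="Input DICOM file or directory path")
    parser.add_argument("--output", "-o", required=True,
                        help="Output DICOM file or directory path")
    parser.add_argument("--batch", "-b", action="store_true",
                        help="Enable batch/directory processing")
    parser.add_argument("--preserve-studies", action="store_true",
                        help="Maintain study relationships with pseudonyms")
    parser.add_argument("--keep-tags", type=split_tags, default=[],
                        help="Comma-separated list of tags to preserve")
    # Assumption: a default-true flag needs a negated form to be disabled.
    parser.add_argument("--remove-private", action=argparse.BooleanOptionalAction,
                        default=True, help="Remove private/unknown tags")
    parser.add_argument("--audit-log", default=None,
                        help="Path for JSON audit log")
    parser.add_argument("--overwrite", action="store_true",
                        help="Overwrite existing output files")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(
    ["--input", "scans/", "--output", "clean/", "--batch", "--keep-tags", "PatientAge"]
)
print(args.batch, args.keep_tags, args.remove_private)
```

Parsing `--keep-tags` into a list at the boundary keeps the rest of the code free of string splitting.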
The following PHI tags are anonymized by default:
| Tag | Attribute | Action |
|---|---|---|
| (0010,0010) | PatientName | Removed / Replaced |
| (0010,0020) | PatientID | Hashed / Pseudonym |
| (0010,0030) | PatientBirthDate | Removed |
| (0010,0040) | PatientSex | Preserved (demographic research) |
| (0010,1010) | PatientAge | Preserved (calculated from birth date) |
| (0010,1020) | PatientSize | Preserved |
| (0010,1030) | PatientWeight | Preserved |
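The table notes that PatientAge is calculated from the birth date before the birth date is removed. The sketch below shows one way that calculation could look; it is not taken from the tool's source. The function name `dicom_age_string` and the `cap_at_90` option are hypothetical. The output uses the DICOM Age String form `nnnY`, and the optional cap reflects the HIPAA Safe Harbor rule that ages over 89 are aggregated into a single 90-or-older category.

```python
from datetime import date


def dicom_age_string(birth: date, study: date, cap_at_90: bool = False) -> str:
    """Return the age at the study date as a DICOM Age String in years, e.g. '043Y'."""
    if study < birth:
        raise ValueError("study date precedes birth date")
    years = study.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the study year.
    if (study.month, study.day) < (birth.month, birth.day):
        years -= 1
    # Hypothetical option: aggregate ages over 89 into a single 90+ bucket.
    if cap_at_90 and years > 89:
        years = 90
    return f"{years:03d}Y"


print(dicom_age_string(date(1980, 6, 15), date(2024, 1, 15)))  # 043Y
```

Computing the age against the study date rather than today's date keeps the value stable no matter when the anonymization is run.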
| Tag | Attribute | Action |
|---|---|---|
| (0008,0080) | InstitutionName | Removed |
| (0008,0081) | InstitutionAddress | Removed |
| (0008,0090) | ReferringPhysicianName | Removed |
| (0008,1048) | PhysiciansOfRecord | Removed |
| (0008,1050) | PerformingPhysicianName | Removed |
| (0008,1060) | NameOfPhysiciansReadingStudy | Removed |
| (0008,1070) | OperatorsName | Removed |
| Tag | Attribute | Action |
|---|---|---|
| (0008,0050) | AccessionNumber | Hashed / Removed |
| (0020,0010) | StudyID | Hashed (if preserve-studies) |
| (0020,000D) | StudyInstanceUID | Hashed (if preserve-studies) |
| (0020,000E) | SeriesInstanceUID | Hashed (if preserve-studies) |
| (0020,4000) | ImageComments | Removed |
| Tag | Attribute | Action |
|---|---|---|
| (0018,1030) | ProtocolName | Preserved / Anonymized |
| (0018,1000) | DeviceSerialNumber | Removed |
| (0008,1010) | StationName | Removed |
| (0008,0018) | SOPInstanceUID | Regenerated |
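The "Hashed / Pseudonym" and "Hashed (if preserve-studies)" actions in the tables above depend on a deterministic mapping: the same original value must always produce the same replacement, so files from one patient or study stay linked. The sketch below illustrates that idea using only the standard library. It is an assumption about the approach, not the tool's implementation. `PseudonymRegistry` and its methods are hypothetical names, the keyed HMAC-SHA256 is a swapped-in choice, and the replacement UIDs use the `2.25.` prefix that DICOM defines for UUID-derived UIDs (here fed with a 128-bit hash value rather than a true UUID).

```python
import hashlib
import hmac


class PseudonymRegistry:
    """Map original identifiers to stable replacements (illustrative sketch)."""

    def __init__(self, secret_key: bytes) -> None:
        self._key = secret_key
        self._pseudonyms: dict[str, str] = {}

    def _digest(self, value: str) -> str:
        # A keyed hash, so the mapping cannot be rebuilt without the secret.
        return hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()

    def pseudonym_for(self, patient_id: str) -> str:
        """Return ANON_0001, ANON_0002, ... with one pseudonym per distinct patient."""
        key = self._digest(patient_id)
        if key not in self._pseudonyms:
            self._pseudonyms[key] = f"ANON_{len(self._pseudonyms) + 1:04d}"
        return self._pseudonyms[key]

    def remap_uid(self, uid: str) -> str:
        """Derive a replacement UID that is identical for identical input UIDs."""
        # The first 128 bits of the digest as a decimal integer; at most 39 digits,
        # so the result stays well inside the 64-character UID limit.
        number = int(self._digest(uid)[:32], 16)
        return f"2.25.{number}"


registry = PseudonymRegistry(b"replace-with-a-real-secret")
print(registry.pseudonym_for("MRN123"))  # ANON_0001
print(registry.pseudonym_for("MRN123"))  # ANON_0001
```

Sequential pseudonyms are only stable within one registry instance, so a batch run would need to share a single registry (or persist it) across all files.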
{
  "timestamp": "2024-01-15T10:30:00Z",
  "input_file": "/path/to/original.dcm",
  "output_file": "/path/to/anonymized.dcm",
  "original_patient_id_hash": "sha256:abc123...",
  "pseudonym": "ANON_0001",
  "tags_anonymized": [
    {"tag": "(0010,0010)", "attribute": "PatientName", "action": "cleared"},
    {"tag": "(0010,0020)", "attribute": "PatientID", "action": "hashed"},
    {"tag": "(0010,0030)", "attribute": "PatientBirthDate", "action": "cleared"}
  ],
  "statistics": {
    "total_tags_processed": 150,
    "phi_tags_removed": 12,
    "private_tags_removed": 5,
    "image_data_preserved": true
  }
}

See references/requirements.txt for full dependency list.
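Audit entries in the JSON format shown above can be checked mechanically after a batch run. The following sketch, which assumes the field names from that sample and nothing else about the tool, reports which core patient tags are missing from an entry's `tags_anonymized` list. `REQUIRED_TAGS` and `missing_phi_tags` are hypothetical names chosen for illustration.

```python
import json

# Core patient tags that the sample audit entry records as anonymized.
REQUIRED_TAGS = {"(0010,0010)", "(0010,0020)", "(0010,0030)"}


def missing_phi_tags(entry: dict) -> set[str]:
    """Return the required tags that the audit entry does not list as anonymized."""
    recorded = {item["tag"] for item in entry.get("tags_anonymized", [])}
    return REQUIRED_TAGS - recorded


# A trimmed copy of the sample entry, parsed from JSON text.
sample_entry = json.loads("""
{
  "pseudonym": "ANON_0001",
  "tags_anonymized": [
    {"tag": "(0010,0010)", "attribute": "PatientName", "action": "cleared"},
    {"tag": "(0010,0020)", "attribute": "PatientID", "action": "hashed"},
    {"tag": "(0010,0030)", "attribute": "PatientBirthDate", "action": "cleared"}
  ]
}
""")

print(missing_phi_tags(sample_entry))  # set()
```

An empty result means every required tag was recorded. It does not prove the output file is free of PHI, which still needs review.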
⚠️ CRITICAL: This tool is designed as a helper, not a replacement for institutional review.
- references/dicom_standard_ps3.15.pdf - DICOM Standard Part 15: Security and System Management Profiles
- references/hipaa_deidentification_guide.pdf - HIPAA Safe Harbor de-identification standards
- references/phi_tags.json - Complete list of PHI-related DICOM tags
- references/requirements.txt - Python dependencies

Complex DICOM data structures, UID management, regulatory compliance requirements, potential pixel-data PHI.
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
# Python dependencies
pip install -r requirements.txt