Map unstructured biomedical text to standardized ontologies (SNOMED CT, MeSH, ICD-10) for terminology normalization and semantic interoperability. The skill extracts medical entities and maps them to standardized codes with confidence scoring.
Install with Tessl CLI:

```shell
npx tessl i github:aipoch/medical-research-skills --skill bio-ontology-mapper
```
Biomedical terminology normalization tool that maps free-text clinical and scientific concepts to standardized ontologies for semantic interoperability and data harmonization.
Key Capabilities:
✅ Use this skill when:
❌ Do NOT use when:
Integration:
- clinical-data-cleaner (data preparation), ehr-semantic-compressor (text extraction)
- clinical-data-cleaner (SDTM mapping), unstructured-medical-text-miner (NLP pipelines)

Extract and map biomedical entities to ontologies:
```python
from scripts.mapper import BioOntologyMapper

mapper = BioOntologyMapper()

# Map clinical text
result = mapper.map_text(
    text="Patient has diabetes and hypertension, taking metformin",
    ontologies=["snomed", "mesh", "rxnorm"],
    confidence_threshold=0.7
)

for entity in result.entities:
    print(f"{entity.text} → {entity.concept_id} ({entity.ontology})")
    print(f"  Preferred: {entity.preferred_term}")
    print(f"  Confidence: {entity.confidence:.2f}")
```

Supported Ontologies:
| Ontology | Domain | Use Case |
|---|---|---|
| SNOMED CT | Clinical | EHR interoperability |
| MeSH | Literature | PubMed indexing |
| ICD-10 | Billing | Diagnosis codes |
| LOINC | Labs | Test result standardization |
| RxNorm | Drugs | Medication normalization |
| HGNC | Genes | Gene name standardization |
Map concepts between different ontologies:
```python
# Cross-map SNOMED to ICD-10
translation = mapper.cross_map(
    source_id="22298006",  # SNOMED: Myocardial infarction
    source_ontology="snomed",
    target_ontology="icd10"
)
print(f"ICD-10: {translation.target_id} - {translation.target_term}")
# Output: I21.9 - Acute myocardial infarction, unspecified
```

Cross-Mapping Coverage:
Process large datasets:
```python
# Batch process CSV
results = mapper.batch_map(
    input_file="clinical_terms.csv",
    text_column="diagnosis_description",
    ontologies=["snomed", "icd10"],
    output_format="csv",
    max_workers=4
)
# Results include:
# - Original term
# - Mapped concept ID
# - Confidence score
# - Alternative mappings (if ambiguous)
```

Performance:
Assess mapping reliability:
```python
scoring = mapper.score_mapping(
    term="heart attack",
    candidate="22298006",  # Myocardial infarction
    factors=["string_similarity", "context_match", "frequency"]
)
print(f"Overall confidence: {scoring.confidence:.2f}")
print(f"Breakdown: {scoring.factors}")
```

Scoring Factors:
Scenario: Convert free-text diagnoses to SNOMED codes.
```shell
# Normalize clinical notes
python scripts/main.py \
  --input notes.csv \
  --column diagnosis_text \
  --ontology snomed \
  --threshold 0.8 \
  --output coded_diagnoses.csv
# Results: "heart attack" → 22298006 (Myocardial infarction)
```

Post-Processing:
Scenario: Map research paper keywords to MeSH.
```python
# Map keywords to MeSH
mesh_terms = mapper.map_to_mesh(
    keywords=["cancer immunotherapy", "checkpoint inhibitors", "PD-1"],
    include_tree_numbers=True,
    include_qualifiers=True
)
for term in mesh_terms:
    print(f"{term.input} → {term.descriptor}")
    print(f"  Tree: {term.tree_numbers}")
    print(f"  Entry terms: {term.synonyms}")
```

Scenario: Standardize medication names across datasets.
```python
# Normalize drug names
drugs = ["Tylenol", "Advil", "Motrin", "acetaminophen"]
for drug in drugs:
    result = mapper.map_to_rxnorm(drug)
    print(f"{drug} → {result.rxcui}: {result.name}")
# Tylenol → 161: Acetaminophen
# Advil → 5640: Ibuprofen
# Motrin → 5640: Ibuprofen
```

Scenario: Merge data from multiple hospital systems.
```shell
# Harmonize diagnoses from 3 hospitals
python scripts/main.py \
  --batch \
  --inputs "hospital_a.csv,hospital_b.csv,hospital_c.csv" \
  --target-ontology snomed \
  --cross-map-to icd10 \
  --output harmonized_data.csv
```

From free-text to coded database:
```python
from scripts.mapper import BioOntologyMapper
from scripts.validator import MappingValidator

# Initialize
mapper = BioOntologyMapper()
validator = MappingValidator()

# Step 1: Extract entities from text
clinical_note = "Patient has Type 2 diabetes and hypertension..."
entities = mapper.extract_entities(clinical_note)

# Step 2: Map to SNOMED
mappings = []
for entity in entities:
    mapping = mapper.map_to_snomed(
        entity.text,
        context=clinical_note,
        top_n=3
    )
    mappings.append(mapping)

# Step 3: Validate mappings
for mapping in mappings:
    validation = validator.validate(
        mapping,
        check_clinical_plausibility=True
    )
    if not validation.is_valid:
        print(f"Review needed: {mapping}")

# Step 4: Export to database format
db_records = [m.to_database_record() for m in mappings]
```

Pre-Mapping:
During Mapping:
Post-Mapping:
Before Production:
Mapping Errors:
❌ Abbreviation ambiguity → "MI" = Myocardial infarction OR Michigan
❌ Outdated terms → Old terminology not in current ontology
❌ False confidence → High score for wrong concept
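Abbreviation ambiguity in particular usually needs surrounding context to resolve. A minimal sketch of cue-word disambiguation follows; the sense inventory, cue sets, and function name are illustrative assumptions, not part of this skill's API:

```python
# Hypothetical sense inventory for the ambiguous abbreviation "MI":
# (concept_id, preferred_term, cue words that suggest this sense).
SENSES = {
    "MI": [
        ("22298006", "Myocardial infarction", {"chest", "pain", "troponin", "ecg"}),
        (None, "Michigan", {"state", "detroit", "address"}),
    ]
}

def disambiguate(abbrev: str, context: str):
    """Pick the sense whose cue words overlap the context the most."""
    words = set(context.lower().split())
    best = max(SENSES[abbrev], key=lambda sense: len(sense[2] & words))
    return best[:2]  # (concept_id, preferred_term)

print(disambiguate("MI", "elevated troponin and chest discomfort, MI suspected"))
```

Production systems typically use trained word-sense disambiguation models rather than hand-written cue sets, but the principle of scoring candidate senses against context is the same.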
Technical Issues:
❌ API failures → No local fallback
❌ Version mismatches → Different ontology versions
❌ PHI exposure → Sending patient data to external APIs
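A local cache with an offline fallback mitigates both the availability and the PHI-exposure risks above (cached and bundled lookups never leave the machine). A sketch of that pattern, with hypothetical function and variable names:

```python
from functools import lru_cache

# Hypothetical offline snapshot of frequent mappings, bundled with the skill.
LOCAL_SNAPSHOT = {"heart attack": ("22298006", "Myocardial infarction")}

def remote_lookup(term: str):
    """Stand-in for a UMLS/SNOMED API call; raises when the service is down."""
    raise ConnectionError("terminology service unreachable")

@lru_cache(maxsize=10_000)
def lookup(term: str):
    """Try the remote API first, then fall back to the bundled snapshot."""
    try:
        return remote_lookup(term)
    except ConnectionError:
        return LOCAL_SNAPSHOT.get(term)

print(lookup("heart attack"))
```

`lru_cache` also deduplicates repeat lookups within a batch run, which is the role caching.py plays in this skill's script layout.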
Available in references/ directory:

- snomed_ct_guide.md - SNOMED CT hierarchy and relationships
- mesh_structure.md - MeSH tree structure and qualifiers
- ontology_mappings.md - Crosswalks between systems
- nlp_best_practices.md - Biomedical text processing
- api_documentation.md - External service integration
- validation_datasets.md - Gold standard test sets

Located in scripts/ directory:

- main.py - CLI interface for mapping
- mapper.py - Core ontology mapping engine
- extractor.py - Named entity recognition
- cross_mapper.py - Ontology-to-ontology translation
- scorer.py - Confidence calculation
- batch_processor.py - Large dataset handling
- validator.py - Mapping quality checks
- caching.py - Local storage for frequent lookups

⚠️ Critical: Ontology mapping is for research and data integration, not clinical decision-making. Always validate mappings with domain experts before use in patient care contexts. Never process PHI without appropriate de-identification and compliance measures.
| Parameter | Type | Default | Description |
|---|---|---|---|
| --term | str | Required | Single term to map |
| --input | str | Required | Input file path |
| --output | str | Required | Output file path |
| --ontology | str | 'both' | Target ontology to map to |
| --threshold | float | 0.7 | Minimum confidence for accepted mappings |
| --format | str | 'json' | Output format |
| --use-api | str | Required | Use UMLS/MeSH APIs |
| --api-key | str | Required | API key for external terminology services |