Automatically convert uploaded drug application documents (Word/PDF) into an XML skeleton structure compliant with the eCTD 4.0/3.2.2 specifications.
scripts/main.py.
references/ for task-specific guidance.
python-docx>=0.8.11 # Word document parsing
PyPDF2>=3.0.0 # PDF text extraction
lxml>=4.9.0 # XML processing
See ## Usage above for related details.
cd "20260318/scientific-skills/Academic Writing/ectd-xml-compiler"
python -m py_compile scripts/main.py
python scripts/main.py --help
Example run plan:
CONFIG block or documented parameters if the script uses fixed settings.
Run python scripts/main.py with the validated inputs.
See ## Workflow above for related details.
scripts/main.py.
references/ contains supporting rules, prompts, or checklists.
Use this command to verify that the packaged script entry point can be parsed before deeper execution.
python -m py_compile scripts/main.py
Use these concrete commands for validation. They are intentionally self-contained and avoid placeholder paths.
python -m py_compile scripts/main.py
python scripts/main.py --help
eCTD (electronic Common Technical Document) is the standard established by ICH for submitting drug registration applications in electronic form to regulatory agencies such as the FDA and EMA.
This tool parses uploaded drug application documents (Word/PDF) and converts them into an XML skeleton structure compliant with the eCTD 4.0/3.2.2 specifications.
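As a rough illustration of the directory layout listed under eCTD/ in this document, the sketch below creates the module folders, one placeholder XML file per module, a dtd/ folder, and a bare index.xml. It is not taken from scripts/main.py: the function name create_skeleton, the root element name index, and the href attribute are assumptions made for illustration only, and a real eCTD backbone uses the element names defined by the applicable DTD or schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

MODULES = ["m1", "m2", "m3", "m4", "m5"]


def create_skeleton(root: Path) -> list:
    """Create the eCTD folder layout with placeholder XML files.

    Returns the list of XML files that were written.
    """
    created = []
    for name in MODULES:
        module_dir = root / name
        module_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical placeholder: a single empty <node> per module.
        module_xml = module_dir / f"{name}.xml"
        ET.ElementTree(ET.Element("node", {"id": name})).write(
            str(module_xml), encoding="utf-8", xml_declaration=True
        )
        created.append(module_xml)

    # Folder for DTD files; left empty in this sketch.
    (root / "dtd").mkdir(exist_ok=True)

    # Hypothetical master index that points at each module file.
    index = ET.Element("index")
    for name in MODULES:
        ET.SubElement(index, "node", {"href": f"{name}/{name}.xml"})
    index_xml = root / "index.xml"
    ET.ElementTree(index).write(
        str(index_xml), encoding="utf-8", xml_declaration=True
    )
    created.append(index_xml)
    return created
```

Calling create_skeleton(Path("./ectd-output")) would write the five module files plus index.xml under that folder; index-md5.txt is deliberately left out here because it depends on the final file contents.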
eCTD/
├── m1/ # Module 1: Administrative Information and Prescribing Information (region-specific)
│ ├── m1.xml
│ └── ...
├── m2/ # Module 2: CTD Summaries
│ ├── m2.xml
│ └── ...
├── m3/ # Module 3: Quality
│ ├── m3.xml
│ └── ...
├── m4/ # Module 4: Nonclinical Study Reports
│ ├── m4.xml
│ └── ...
├── m5/ # Module 5: Clinical Study Reports
│ ├── m5.xml
│ └── ...
├── index.xml # Master index file
├── index-md5.txt # MD5 checksum file
└── dtd/ # DTD files
python skills/ectd-xml-compiler/scripts/main.py [options] <input_files...>
| Argument | Description |
|---|---|
| input_files | Input Word/PDF file paths (supports multiple) |
| Option | Short | Description | Default |
|---|---|---|---|
| --output | -o | Output directory path | ./ectd-output |
| --module | -m | Target module (m1-m5, auto) | auto |
| --region | -r | Target region (FDA, EMA, ICH) | ICH |
| --version | -v | eCTD version (3.2.2, 4.0) | 4.0 |
| --dtd-path | -d | Custom DTD path | Built-in DTD |
| --validate | | Validate generated XML | False |
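For readers who want to see how the options in the table map onto a parser, here is a minimal argparse sketch. It only mirrors the documented flags and defaults; the function name build_parser is hypothetical, and the way scripts/main.py actually defines its arguments may differ (for example, this sketch represents the built-in DTD default as None).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Convert Word/PDF documents into an eCTD XML skeleton.",
    )
    parser.add_argument("input_files", nargs="+",
                        help="Input Word/PDF file paths (one or more)")
    parser.add_argument("-o", "--output", default="./ectd-output",
                        help="Output directory path")
    parser.add_argument("-m", "--module", default="auto",
                        choices=["m1", "m2", "m3", "m4", "m5", "auto"],
                        help="Target module")
    parser.add_argument("-r", "--region", default="ICH",
                        choices=["FDA", "EMA", "ICH"],
                        help="Target region")
    parser.add_argument("-v", "--version", default="4.0",
                        choices=["3.2.2", "4.0"],
                        help="eCTD version")
    # Assumption: None stands in for "use the built-in DTD".
    parser.add_argument("-d", "--dtd-path", default=None,
                        help="Custom DTD path")
    parser.add_argument("--validate", action="store_true",
                        help="Validate generated XML")
    return parser
```

With this sketch, build_parser().parse_args(["-r", "FDA", "-v", "3.2.2", "a.pdf"]) yields region "FDA", version "3.2.2", and the documented defaults for everything else.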
# Basic usage - auto-detect module
python skills/ectd-xml-compiler/scripts/main.py document1.docx document2.pdf
# Specify output directory and module
python skills/ectd-xml-compiler/scripts/main.py -o ./my-ectd -m m3 quality-doc.docx
# FDA submission format
python skills/ectd-xml-compiler/scripts/main.py -r FDA -v 3.2.2 *.pdf
# Validate generated XML
python skills/ectd-xml-compiler/scripts/main.py --validate submission.pdf
| Keyword Pattern | Target Module |
|---|---|
| Administrative, Label, Package Insert | m1 |
| Summary, summary, Overview | m2 |
| Quality, quality, CMC, API, Drug Product | m3 |
| Nonclinical, Toxicology, Pharmacokinetics | m4 |
| Clinical, clinical, Study, Trial | m5 |
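The keyword table above can be read as a simple lookup. The sketch below is one possible interpretation, not the script's actual logic: it assumes case-insensitive whole-word matching and assumes that modules are checked in table order, so the first module with a matching keyword wins. The names MODULE_KEYWORDS and detect_module are hypothetical.

```python
import re
from typing import Optional

# Keywords from the table, with case-only duplicates folded together
# because matching below is case-insensitive.
MODULE_KEYWORDS = {
    "m1": ["Administrative", "Label", "Package Insert"],
    "m2": ["Summary", "Overview"],
    "m3": ["Quality", "CMC", "API", "Drug Product"],
    "m4": ["Nonclinical", "Toxicology", "Pharmacokinetics"],
    "m5": ["Clinical", "Study", "Trial"],
}


def detect_module(text: str) -> Optional[str]:
    """Return the first module whose keyword appears in the text, else None."""
    for module, keywords in MODULE_KEYWORDS.items():
        for keyword in keywords:
            # Word boundaries keep "API" from matching inside "rapid" and
            # keep "Clinical" from matching inside "Nonclinical".
            pattern = rf"\b{re.escape(keyword)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                return module
    return None
```

Under these assumptions a title that mentions keywords from several modules resolves to the earliest one in the table, so passing --module explicitly is the safer choice for ambiguous documents.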
The generated eCTD skeleton contains:
index.xml: master index file containing references and sequence information for all modules.
m1.xml to m5.xml: XML skeleton for each module, containing:
elements (<leaf>, <node>)
cross-references (<cross-reference>)
index-md5.txt: MD5 checksum values for each file to ensure integrity.
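To make the checksum step concrete, the following sketch computes MD5 digests with the standard hashlib module. The line format it produces (digest, two spaces, relative path) is an assumption for illustration; this document does not specify how the tool lays out index-md5.txt, and the helper names md5_of_file and checksum_lines are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(root: Path) -> list:
    """Build one 'digest  relative/path' line per XML file under root."""
    lines = []
    for path in sorted(root.rglob("*.xml")):
        relative = path.relative_to(root).as_posix()
        lines.append(f"{md5_of_file(path)}  {relative}")
    return lines
```

Reading in chunks keeps memory use flat for large files; the checksums only detect accidental corruption, since MD5 is not suitable as a security control.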
# Install dependencies
pip install python-docx PyPDF2 lxml
Use the --validate option to validate the generated XML.
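What --validate checks is not spelled out here, so the sketch below shows only the most basic form of validation, a well-formedness check using the standard library. The function name check_well_formed is hypothetical. Validating against a DTD would be a further step, for which lxml (already listed as a dependency) provides the lxml.etree.DTD class.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return (True, "") if the text parses as XML, else (False, reason)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser's message includes the line and column of the problem.
        return False, str(exc)
    return True, ""
```

A well-formed document can still violate the eCTD structure, so a check like this is a first gate rather than a substitute for DTD or schema validation.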
MIT License
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
No additional Python packages are required beyond the dependencies listed above (python-docx, PyPDF2, lxml).
Every final response should make these items explicit when they are relevant:
If scripts/main.py fails, report the failure point, summarize what still can be completed safely, and provide a manual fallback.
This skill accepts requests that match the documented purpose of ectd-xml-compiler and include enough context to complete the workflow safely.
Do not continue the workflow when the request is out of scope, missing a critical input, or would require unsupported assumptions. Instead respond:
ectd-xml-compiler only handles its documented workflow. Please provide the missing required inputs or switch to a more suitable skill.
Use the following fixed structure for non-trivial requests:
If the request is simple, you may compress the structure, but still keep assumptions and limits explicit when they affect correctness.