Pythonic wrapper around RDKit with simplified interface and sensible defaults. Preferred for standard drug discovery including SMILES parsing, standardization, descriptors, fingerprints, clustering, 3D conformers, parallel processing. Returns native rdkit.Chem.Mol objects. For advanced control or custom parameters, use rdkit directly.
73
66%
Does it follow best practices?
Impact
81%
3.24x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./scientific-skills/datamol/SKILL.md
Quality
Discovery
82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, domain-specific description that clearly lists concrete capabilities and uses natural terminology from computational chemistry. Its main weakness is the lack of an explicit 'Use when...' clause, which would help Claude more reliably select this skill. The distinction from raw RDKit usage is a notable strength that reduces conflict risk.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about molecular processing, SMILES strings, chemical fingerprints, drug-likeness descriptors, or cheminformatics tasks that benefit from a simplified RDKit interface.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: SMILES parsing, standardization, descriptors, fingerprints, clustering, 3D conformers, parallel processing. Also specifies return type (native rdkit.Chem.Mol objects) and when to use RDKit directly instead. | 3 / 3 |
| Completeness | The 'what' is well covered with specific capabilities and return types. However, there is no explicit 'Use when...' clause or equivalent trigger guidance; the 'when' is only implied by listing the domain and capabilities. Per rubric guidelines, missing explicit trigger guidance caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords a user in this domain would use: 'RDKit', 'SMILES', 'drug discovery', 'fingerprints', 'descriptors', 'conformers', 'clustering', 'standardization'. These are the exact terms a computational chemist would mention. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche: cheminformatics/drug discovery with specific mention of RDKit, SMILES, fingerprints, conformers. Also explicitly distinguishes itself from using RDKit directly ('For advanced control or custom parameters, use rdkit directly'), which reduces conflict risk with a potential raw RDKit skill. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
50%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability with comprehensive, executable code examples covering the full datamol API surface. However, it is far too verbose for a SKILL.md file - it reads more like a tutorial or documentation page than a concise skill reference. Much of the inline content (SAR analysis, virtual screening, ML integration, troubleshooting) should either be in reference files or dramatically condensed, given that Claude can infer standard Python patterns.
Suggestions
Reduce the main file to ~100-150 lines by moving detailed examples (SAR analysis, virtual screening, ML integration, scaffold splitting) into reference files and keeping only the core API patterns inline.
Remove explanations of concepts Claude already knows: what SMILES/InChI are, how to use pandas groupby, basic sklearn usage, what Lipinski's rules are, etc.
Add explicit validation checkpoints to multi-step workflows, e.g., verify molecule count after standardization filtering, check that conformer generation succeeded before clustering.
Consolidate the 'Best Practices', 'Error Handling', and 'Troubleshooting' sections into a single brief section with only the non-obvious gotchas (like Butina clustering scale limitations).
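The validation-checkpoint suggestion above could look roughly like the following. This is a minimal, library-agnostic sketch, not code from the reviewed skill: `run_step`, `min_keep_fraction`, and the toy `parse_positive_int` step are hypothetical names chosen for illustration. It assumes each pipeline step signals failure by returning `None` or raising an exception.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def run_step(
    name: str,
    items: Iterable[T],
    step: Callable[[T], Optional[U]],
    min_keep_fraction: float = 0.5,  # hypothetical threshold, tune per workflow
) -> Tuple[List[U], List[T]]:
    """Apply `step` to every item, then check how many items survived."""
    items = list(items)
    kept: List[U] = []
    failed: List[T] = []
    for item in items:
        try:
            result = step(item)
        except Exception:
            result = None  # treat an exception like a failed item
        if result is None:
            failed.append(item)
        else:
            kept.append(result)
    # Checkpoint: stop the pipeline if too few items made it through this step.
    if items and len(kept) / len(items) < min_keep_fraction:
        raise ValueError(f"{name}: only {len(kept)}/{len(items)} items survived")
    return kept, failed


def parse_positive_int(text: str) -> Optional[int]:
    """Toy stand-in for a real step such as parsing or standardization."""
    value = int(text)  # raises ValueError on non-numeric input
    return value if value > 0 else None


kept, failed = run_step("parse", ["1", "2", "x", "-3"], parse_positive_int)
print(kept, failed)
```

In a datamol workflow the `step` argument would be the parsing, standardization, or conformer-generation call, provided that call returns `None` or raises on failure (worth confirming for each function used). Returning the failed inputs alongside the survivors gives batch operations the error-recovery feedback the review asks for.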
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~500+ lines, explaining many concepts Claude already knows (e.g., what SMILES is, how to iterate lists, basic pandas operations, sklearn usage). Sections like 'Integration with Machine Learning' and 'Troubleshooting' add little value. The overview paragraph explains what cheminformatics is. Much content could be condensed to API signatures and key patterns. | 1 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready code examples throughout. Every section includes concrete Python code with specific function calls, parameters, and expected outputs. The complete pipeline examples are particularly actionable. | 3 / 3 |
| Workflow Clarity | Multi-step pipelines are presented with numbered steps and clear sequencing (e.g., the complete pipeline and virtual screening sections). However, there are no explicit validation checkpoints - for example, the standardization pipeline doesn't verify that standardization succeeded before proceeding, and batch operations lack error recovery feedback loops. | 2 / 3 |
| Progressive Disclosure | The skill does reference external files (references/core_api.md, references/io_module.md, etc.) which is good, but the main file itself is a monolithic wall of content that inlines extensive code examples that could be in reference files. The overview should be much shorter with more content pushed to the reference documents. | 2 / 3 |
| Total | 8 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (705 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 9 / 11 Passed | |
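A line-count check like `skill_md_line_count` can be approximated locally before resubmitting the skill. The sketch below is an assumption-laden illustration rather than Tessl's actual validator: the function name `check_line_count` and the default limit of 500 lines are hypothetical, since the report only states that 705 lines triggers a warning.

```python
from typing import Tuple


def check_line_count(text: str, max_lines: int = 500) -> Tuple[int, str]:
    """Return the line count and a 'Passed'/'Warning' status.

    max_lines is an assumed threshold; the real validator's limit is not
    given in the report.
    """
    count = len(text.splitlines())
    status = "Warning" if count > max_lines else "Passed"
    return count, status


# Synthetic stand-in for a 705-line SKILL.md.
long_skill = "\n".join(f"line {i}" for i in range(705))
print(check_line_count(long_skill))
```

To check a real file, read it first (for example with `pathlib.Path("SKILL.md").read_text()`) and pass the contents in.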
25e1c0f