Diffusion-based molecular docking. Predict protein-ligand binding poses from PDB/SMILES, confidence scores, virtual screening, for structure-based drug design. Not for affinity prediction.
Overall: 76

Quality: 66% (Does it follow best practices?)
Impact: 94% (1.54x average score across 3 eval scenarios)
Validation: Passed (no known issues)

Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./scientific-skills/diffdock/SKILL.md
```

Quality
Discovery
82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, domain-specific description that clearly communicates concrete capabilities and uses precise technical terminology that its target users would naturally employ. Its main weakness is the absence of an explicit 'Use when...' clause, which caps completeness. The explicit exclusion ('Not for affinity prediction') is a nice touch for disambiguation.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about docking ligands to proteins, predicting binding poses, or performing virtual screening with PDB structures or SMILES strings.'
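The suggested clause could be dropped straight into the skill's frontmatter description. A minimal sketch, assuming a standard SKILL.md frontmatter layout (the exact schema may differ; everything except the quoted description text is illustrative):

```yaml
# Hypothetical SKILL.md frontmatter; only the description changes.
---
name: diffdock
description: >
  Diffusion-based molecular docking. Predict protein-ligand binding poses
  from PDB/SMILES, confidence scores, virtual screening, for structure-based
  drug design. Not for affinity prediction. Use when the user asks about
  docking ligands to proteins, predicting binding poses, or performing
  virtual screening with PDB structures or SMILES strings.
---
```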
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: predict protein-ligand binding poses, confidence scores, virtual screening, and explicitly scopes to structure-based drug design. Also includes a clear exclusion ('Not for affinity prediction'). | 3 / 3 |
| Completeness | Clearly answers 'what does this do' with specific capabilities and input formats, but lacks an explicit 'Use when...' clause or equivalent trigger guidance. The 'when' is only implied through the listed capabilities. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords a user in this domain would use: 'molecular docking', 'protein-ligand', 'binding poses', 'PDB', 'SMILES', 'virtual screening', 'structure-based drug design', 'confidence scores'. These are the exact terms a computational chemist would naturally mention. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: diffusion-based molecular docking is a very specific domain. The explicit exclusion of affinity prediction further sharpens its boundary. Unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is highly actionable with executable commands and clear examples for all major DiffDock workflows, which is its primary strength. However, it is significantly over-verbose, including explanatory content Claude doesn't need, sections that duplicate what should be in referenced files, and general knowledge padding. The workflow clarity would benefit from explicit validation checkpoints and error recovery patterns, particularly for batch operations.
Suggestions
Cut the content by ~50%: remove 'When to Use This Skill' (redundant with frontmatter), trim the Overview to 2-3 lines, remove explanations of what molecular docking/PDB/SMILES are, and move Troubleshooting, Advanced Techniques, and detailed parameter guidance entirely into the referenced files.
Add explicit validation checkpoints to batch workflows: after CSV validation ('Stop if errors found'), after docking ('Verify output directory contains expected number of rank_*.sdf files'), and before analysis ('Check that confidence_scores.txt exists for each complex').
Move the Resources/Helper Scripts descriptions, Citations, and Additional Resources sections into a separate reference file — these consume significant tokens without aiding task execution.
Consolidate the Best Practices list into the relevant workflow sections rather than a standalone 10-item list at the end.
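The validation-checkpoint suggestion could look something like the following shell sketch. The directory layout (one subdirectory per complex, each holding `rank*.sdf` pose files) is an assumption about DiffDock's default output and should be adjusted to the actual run:

```shell
# Post-docking validation checkpoint (sketch, assumed output layout).
# Returns non-zero so the calling workflow can "stop if validation fails".
validate_docking_output() {
  local results_dir="$1" expected_complexes="$2"
  local n_dirs n_missing=0
  # One output subdirectory per docked complex
  n_dirs=$(find "$results_dir" -mindepth 1 -maxdepth 1 -type d | wc -l)
  if [ "$n_dirs" -ne "$expected_complexes" ]; then
    echo "STOP: expected $expected_complexes complexes, found $n_dirs" >&2
    return 1
  fi
  for d in "$results_dir"/*/; do
    # Each complex should contain at least one ranked pose file
    if ! ls "$d"rank*.sdf >/dev/null 2>&1; then
      echo "STOP: no rank_*.sdf in $d" >&2
      n_missing=$((n_missing + 1))
    fi
  done
  [ "$n_missing" -eq 0 ]
}
```

A batch workflow would call this between the docking and analysis steps, e.g. `validate_docking_output results/ 50 || exit 1`.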
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines. It explains concepts Claude would know (what molecular docking is, what PDB files are, what SMILES strings are), includes a 'When to Use This Skill' section that restates the description, and provides extensive best-practices lists, citation formatting, and resource links that inflate token count without adding actionable value. The 'Key Distinction' note and confidence interpretation are useful, but much content could be cut by 50%+. | 1 / 3 |
| Actionability | The skill provides fully executable commands for all workflows (single docking, batch processing, virtual screening), complete with specific CLI flags, CSV format specifications, Python code for ensemble docking, and concrete bash commands for integration with GNINA. Commands are copy-paste ready with realistic examples. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced (batch processing has Step 1/Step 2, virtual screening has pre-compute then run), but validation checkpoints are weak. The batch workflow mentions using prepare_batch_csv.py to validate but doesn't enforce a 'stop if validation fails' pattern. There's no explicit validation step after docking completes (e.g., checking output files exist, verifying confidence scores are reasonable before proceeding to analysis). The recommended workflow at the end of 'Integration with Scoring Functions' is good but lacks error recovery loops. | 2 / 3 |
| Progressive Disclosure | The skill references several external files (references/confidence_and_limitations.md, references/parameters_reference.md, references/workflows_examples.md, scripts/, assets/) with clear descriptions of when to read them. However, the main SKILL.md itself contains too much inline content that should be in those reference files (e.g., the full troubleshooting section, detailed parameter customization, ensemble docking code, scoring function integration). The overview is not concise enough to serve as a true progressive disclosure entry point. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 Passed |
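The single warning could be cleared by adding the missing field. A hedged sketch, assuming the validator expects a nested `metadata.version` key in the SKILL.md frontmatter (check the Tessl skill spec for the exact schema and version format):

```yaml
# Hypothetical frontmatter addition; the value is illustrative.
---
name: diffdock
metadata:
  version: 1.0.0
---
```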
cbcae7b