Diffusion-based molecular docking. Predict protein-ligand binding poses from PDB/SMILES, confidence scores, virtual screening, for structure-based drug design. Not for affinity prediction.
Score: 76
Does it follow best practices? 66%
Impact: 94%
1.54x average score across 3 eval scenarios
Passed; no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./scientific-skills/diffdock/SKILL.md`

Quality
Discovery
82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, domain-specific description with excellent specificity and trigger term coverage for its target audience of computational chemists and drug designers. The explicit exclusion ('Not for affinity prediction') is a nice touch for disambiguation. The main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know precisely when to select this skill.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about molecular docking, predicting binding poses, or virtual screening with protein structures and small molecules.'
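As a sketch, the suggested clause could simply be appended to the skill's existing description in the SKILL.md frontmatter. Field names and layout here follow the common skill-file convention and are illustrative, not taken from the skill under review:

```yaml
# Illustrative SKILL.md frontmatter; the appended 'Use when...' sentence is
# the review's suggestion, everything else mirrors the existing description.
name: diffdock
description: >
  Diffusion-based molecular docking. Predict protein-ligand binding poses
  from PDB/SMILES, confidence scores, virtual screening, for
  structure-based drug design. Not for affinity prediction. Use when the
  user asks about molecular docking, predicting binding poses, or virtual
  screening with protein structures and small molecules.
```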
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: predict protein-ligand binding poses, confidence scores, virtual screening. Also specifies input formats (PDB/SMILES) and the broader domain (structure-based drug design), plus an explicit exclusion (not for affinity prediction). | 3 / 3 |
| Completeness | The 'what' is well covered (predict binding poses, confidence scores, virtual screening), but there is no explicit 'Use when...' clause or equivalent trigger guidance. The 'when' is only implied by the listed capabilities. Per rubric guidelines, a missing 'Use when...' clause caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords a computational chemist or drug designer would use: 'molecular docking', 'protein-ligand binding poses', 'PDB', 'SMILES', 'virtual screening', 'structure-based drug design', 'confidence scores'. These are the exact terms domain users would naturally mention. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: diffusion-based molecular docking is a very specific computational chemistry task. The explicit exclusion of affinity prediction further sharpens the boundary. Unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent actionability, with concrete, executable commands and clear examples for molecular docking workflows. However, it is significantly overlong, explaining concepts Claude already knows and duplicating content between the main body and the reference-file descriptions. Workflow clarity would also benefit from validation checkpoints integrated into the workflows themselves rather than a separate troubleshooting section.
Suggestions
Cut the Overview, 'When to Use This Skill', and 'Limitations and Scope' sections to 2-3 lines each—Claude doesn't need explanations of what molecular docking is or when docking is appropriate.
Remove the detailed descriptions of each reference file in the Resources section—a one-line summary per file with a Read tool reference is sufficient, as the current descriptions duplicate body content.
Add explicit validation steps within the core docking workflows (e.g., 'Verify output files exist and confidence scores are parsed before proceeding to analysis') rather than relying on a separate troubleshooting section.
Move the Best Practices list, Citations, and Additional Resources to a reference file—these consume significant tokens without adding actionable guidance for the core task.
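The integrated validation checkpoint suggested above could look roughly like the following. The output-file naming (`rank*_confidence*.sdf`) is an assumption about DiffDock's default results layout, so the pattern should be adjusted to match the installed version:

```python
import re
from pathlib import Path


def validate_docking_output(results_dir: str, min_poses: int = 1) -> dict[int, float]:
    """Fail fast if a docking run produced no poses or unparsable scores.

    Assumes DiffDock-style filenames such as 'rank1_confidence-0.21.sdf';
    adjust the regex for your DiffDock version's naming scheme.
    """
    out = Path(results_dir)
    if not out.is_dir():
        raise FileNotFoundError(f"docking output directory missing: {out}")

    # Extract (rank, confidence) pairs from each pose file name.
    pattern = re.compile(r"rank(\d+)_confidence(-?\d+(?:\.\d+)?)\.sdf$")
    scores: dict[int, float] = {}
    for sdf in out.glob("*.sdf"):
        m = pattern.search(sdf.name)
        if m:
            scores[int(m.group(1))] = float(m.group(2))

    if len(scores) < min_poses:
        raise RuntimeError(f"expected >= {min_poses} poses, parsed {len(scores)}")
    return scores
```

Calling this between the docking step and the analysis step gives the workflow the feedback loop the review asks for: a missing directory, an empty run, or a renamed output format surfaces immediately instead of during downstream scoring.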
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at over 400 lines. It explains concepts Claude would know (what molecular docking is, what PDB files are), includes a lengthy 'When to Use This Skill' section that restates obvious triggers, provides extensive resource descriptions that duplicate information already in the body, and includes a 10-item best practices list that largely restates earlier content. The overview section explains DiffDock's purpose redundantly. | 1 / 3 |
| Actionability | The skill provides fully executable commands with concrete examples throughout: specific CLI invocations with real flags, CSV format specifications, Python code for ensemble docking, and exact bash commands for scoring integration. Commands are copy-paste ready with realistic parameters and file paths. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced (batch processing has Step 1/Step 2, virtual screening has pre-compute then run), but validation checkpoints are weak. There is no explicit 'validate output before proceeding' step after docking runs. The batch CSV validation is good, but the main docking workflows lack feedback loops for error recovery; troubleshooting is separated into its own section rather than integrated into workflows. | 2 / 3 |
| Progressive Disclosure | The skill does reference external files (references/confidence_and_limitations.md, references/parameters_reference.md, references/workflows_examples.md) with clear signals, which is good. However, the main file itself is monolithic with too much inline content: the troubleshooting section, advanced techniques, detailed parameter customization, and extensive resource descriptions could be offloaded to reference files. The references section at the bottom essentially re-documents what the referenced files contain. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 (Passed) |
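The single warning is straightforward to clear: add a `metadata.version` field to the skill's frontmatter. A minimal sketch, assuming the spec's YAML frontmatter layout (the version string itself is a placeholder):

```yaml
# Hypothetical frontmatter addition to SKILL.md; bump the placeholder
# version whenever the skill's instructions change.
metadata:
  version: "0.1.0"
```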