Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.
78
67%
Does it follow best practices?
Impact
99%
1.67x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./scientific-skills/hypothesis-generation/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly articulates specific capabilities (hypothesis formulation, mechanism proposal, experiment design), provides explicit trigger conditions, and proactively disambiguates from related skills. The inclusion of boundary conditions with other skills is particularly effective for preventing misselection. Minor note: uses second person 'you have' in the trigger clause, but the primary capability descriptions use appropriate impersonal voice.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'formulate testable hypotheses with predictions', 'propose mechanisms', and 'design experiments to test them'. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (structured hypothesis formulation, propose mechanisms, design experiments) and 'when' ('Use when you have experimental observations or data and need to formulate testable hypotheses'). Also includes explicit disambiguation guidance for related skills (scientific-brainstorming, hypogenic). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'experimental observations', 'data', 'testable hypotheses', 'predictions', 'mechanisms', 'design experiments', 'scientific method'. These are terms a researcher or scientist would naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Explicitly distinguishes itself from related skills ('For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic'), creating a clear niche for structured hypothesis formulation from observations. This anti-overlap guidance is excellent. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
35%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill suffers primarily from severe verbosity—LaTeX formatting instructions dominate the document and are repeated multiple times (page overflow prevention appears in at least 3 places with near-identical content). The scientific methodology portions are reasonable in structure but read as generic scientific method guidance that Claude already knows, rather than providing novel, actionable instructions. The skill would benefit enormously from moving all LaTeX formatting details to the referenced FORMATTING_GUIDE.md and keeping only a brief pointer in the main file.
Suggestions
Move all LaTeX formatting details (page overflow prevention, box usage, citation requirements, compilation instructions) to the referenced FORMATTING_GUIDE.md and replace with a 2-3 line pointer, reducing the main file by ~60%.
Eliminate redundant content: the page overflow prevention strategy is stated 3 times with nearly identical wording—consolidate into one location.
Make the scientific workflow more actionable by providing a concrete worked example (e.g., a brief example observation → hypothesis → prediction → experiment chain) rather than abstract descriptions of what each step involves.
Add validation checkpoints between workflow steps, such as 'Before proceeding to Step 4, verify you have identified at least 2 conflicting findings or gaps from the literature' to create feedback loops.
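The checkpoint suggestion above could be sketched roughly as follows. This is a minimal illustration, not part of the reviewed skill: the `LiteratureReview` container and the `checkpoint_before_step_4` function are hypothetical names, and the threshold of 2 is taken from the example wording in the suggestion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LiteratureReview:
    """Hypothetical container for the output of the literature-search step."""
    conflicting_findings: List[str] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)


def checkpoint_before_step_4(review: LiteratureReview, minimum: int = 2) -> List[str]:
    """Return a list of problems; an empty list means it is safe to proceed."""
    problems = []
    found = len(review.conflicting_findings) + len(review.gaps)
    if found < minimum:
        problems.append(
            f"only {found} conflicting findings or gaps identified; "
            f"need at least {minimum}"
        )
    return problems


# Example: one conflicting finding is not enough, so the gate reports a problem
# and the workflow would loop back to the literature search.
review = LiteratureReview(
    conflicting_findings=["Study A and Study B disagree on effect direction"]
)
for problem in checkpoint_before_step_4(review):
    print("Checkpoint failed:", problem)
```

Returning a list of problems rather than raising an exception lets the agent report every unmet condition at once and decide whether to repeat the previous step.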
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~250+ lines. It over-explains LaTeX formatting details (page overflow prevention is repeated multiple times), explains basic scientific method concepts Claude already knows (what testability and falsifiability mean), and includes redundant sections (the page break strategy is stated at least 3 separate times with nearly identical content). | 1 / 3 |
| Actionability | There are some concrete elements like the xelatex compilation commands and the generate_schematic.py script call, but the core hypothesis generation workflow is largely abstract guidance ('identify the core observation,' 'consider multiple approaches') rather than executable steps. The LaTeX formatting instructions are specific but the actual scientific methodology portions read more like a textbook than actionable instructions. | 2 / 3 |
| Workflow Clarity | The 8-step workflow is clearly sequenced and numbered, which is good. However, there are no validation checkpoints or feedback loops between steps: no guidance on what to do if literature search yields insufficient evidence, if hypotheses fail quality evaluation, or if experimental designs reveal flaws in the hypotheses. The LaTeX compilation steps do include a concrete sequence but lack error handling. | 2 / 3 |
| Progressive Disclosure | The skill references multiple external files (hypothesis_quality_criteria.md, experimental_design_patterns.md, hypothesis_report_template.tex, FORMATTING_GUIDE.md, etc.) which is good progressive disclosure structure. However, the main SKILL.md itself contains massive inline content about LaTeX formatting that should be in the referenced FORMATTING_GUIDE.md instead, and no bundle files were provided to verify the references exist. The page overflow prevention section alone could be an entire separate reference document. | 2 / 3 |
| Total | | 7 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 10 / 11 Passed | |
cbcae7b