Applies the scientific method to debugging by helping users form specific, testable hypotheses, design targeted experiments, and systematically confirm or reject theories to find root causes. Use when a user says their code isn't working, they're getting an error, something broke, they want to troubleshoot a bug, or they're trying to figure out what's causing an issue. Concrete actions include isolating failing components, forming and testing hypotheses, analyzing error messages, tracing execution paths, and interpreting test results to narrow down root causes.
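One of the listed actions, isolating failing components, can be made concrete with a small sketch. The `process` function and its divide-by-zero bug below are hypothetical, invented purely for illustration; they are not from the skill itself:

```python
# A minimal sketch of "isolating failing components" by bisecting a
# failing input down to the smallest slice that still reproduces the bug.

def process(items):
    # Hypothetical component under test: fails whenever the input contains 0.
    return [100 // x for x in items]

def shrink_failing_input(items):
    """Bisect a failing input, keeping whichever half still fails,
    until the failure is isolated to a minimal slice."""
    while len(items) > 1:
        mid = len(items) // 2
        for half in (items[:mid], items[mid:]):
            try:
                process(half)
            except Exception:
                items = half  # this half alone reproduces the bug
                break
        else:
            break  # neither half fails alone; the combination is needed
    return items

print(shrink_failing_input([3, 7, 0, 5, 9]))  # [0]
```

Each bisection step is itself a small experiment: the "hypothesis" is that one half of the input is sufficient to reproduce the failure, and running `process` on that half confirms or rejects it.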
| Metric | Result | Notes |
|---|---|---|
| Overall | 87 | |
| Quality | 84% | Does it follow best practices? |
| Impact | Pending | No eval scenarios have been run |
| Validation | Passed | No known issues |
Quality
Discovery — 92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly articulates what the skill does (hypothesis-driven debugging) and when to use it (with natural user trigger phrases). The concrete actions are well-enumerated and the 'Use when' clause is explicit and comprehensive. The main weakness is that the trigger terms are very general debugging terms that could conflict with other coding/debugging skills, though the scientific method framing provides some differentiation.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'isolating failing components, forming and testing hypotheses, analyzing error messages, tracing execution paths, and interpreting test results to narrow down root causes.' Also describes the methodology (applying the scientific method to debugging). | 3 / 3 |
| Completeness | Clearly answers both 'what' (applies the scientific method to debugging: isolating components, forming hypotheses, analyzing errors, etc.) and 'when', with an explicit 'Use when...' clause listing multiple trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural user language: 'code isn't working', 'getting an error', 'something broke', 'troubleshoot a bug', 'figure out what's causing an issue'. These are phrases users would naturally say when encountering bugs. | 3 / 3 |
| Distinctiveness / Conflict Risk | While the scientific-method, hypothesis-driven approach is a distinctive angle, the trigger terms ('code isn't working', 'getting an error', 'troubleshoot a bug') are very broad and could easily overlap with any general coding-assistance or debugging skill. The niche is debugging methodology rather than a specific domain, which increases conflict risk. | 2 / 3 |
| Total | | 11 / 12 Passed |
Implementation — 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured debugging methodology skill with clear sequential workflow and concrete code examples. Its main weakness is length — the extended connection pool example, while illustrative, makes the document verbose for what could be a more concise methodological guide. The content is highly actionable and the workflow is clear with good feedback loops, but the inline bulk could benefit from progressive disclosure into separate files.
Suggestions
- Move the hypothesis tracking template and testing techniques sections into separate referenced files (e.g., TRACKING_TEMPLATE.md, TESTING_TECHNIQUES.md) to reduce the main skill's token footprint.
- Trim the connection pool running example — one compact example showing the full observe→hypothesize→predict→test→analyze cycle would suffice instead of the extended multi-step version.
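As a sketch of the shape such a compact example could take, here is one pass through the full observe→hypothesize→predict→test→analyze cycle. The pool simulation, class names, and numbers below are invented for illustration; they are not the skill's actual connection-pool example:

```python
# Observe: requests start failing once roughly ten calls have been made.
# Hypothesize: the connection pool is exhausted because connections are
#              acquired but never released.
# Predict: if true, the live-connection count grows with every call and
#          the call after the pool limit raises.

class Pool:
    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.limit:
            raise RuntimeError("pool exhausted")
        self.in_use += 1

    def release(self):
        self.in_use -= 1

def handler(pool):
    pool.acquire()  # bug under investigation: no matching release()

# Test: run the experiment and take the measurement.
pool = Pool(limit=10)
for _ in range(10):
    handler(pool)
print(pool.in_use)  # prints 10: every connection leaked, as predicted

# Analyze: in_use reached the limit with no releases, so the hypothesis
# is confirmed; the fix is to release in a finally block.
```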
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but includes some unnecessary verbosity. The extended examples (connection pool scenario) are illustrative but lengthy. The hypothesis tracking template is a full markdown template that adds bulk. Some sections like the decision tree in ASCII art could be more compact. | 2 / 3 |
| Actionability | The skill provides concrete, executable code examples for timing, data, and state hypothesis testing. The test plan examples are specific and copy-paste ready. The observation format, hypothesis tracking template, and decision tree all give Claude clear, actionable patterns to follow. | 3 / 3 |
| Workflow Clarity | The 5-step scientific debugging method (Observe → Hypothesize → Predict → Test → Analyze) is clearly sequenced with explicit validation at each stage. The decision tree provides a feedback loop for inconclusive results and rejected hypotheses. The predict step serves as a built-in verification checkpoint before and after testing. | 3 / 3 |
| Progressive Disclosure | The skill references other skills at the bottom (root-cause-analysis, trace-and-isolate, red-green-refactor), which is good, but the main content is a monolithic document. The hypothesis tracking template and testing techniques by type could be split into separate referenced files to keep the main skill leaner. | 2 / 3 |
| Total | | 10 / 12 Passed |
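The hypothesis tracking and decision-tree feedback loop described above could be represented in code roughly as follows. The class names, fields, and status strings are assumptions chosen for illustration, not the skill's actual template:

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    statement: str
    prediction: str
    status: str = "untested"  # untested | confirmed | rejected | inconclusive

@dataclass
class DebugLog:
    hypotheses: list = field(default_factory=list)

    def record(self, hypothesis, result):
        """Record a test outcome and return the next step, mirroring the
        decision tree's feedback loops."""
        hypothesis.status = result
        self.hypotheses.append(hypothesis)
        if result == "inconclusive":
            return "refine-test"      # loop back: design a sharper test
        if result == "rejected":
            return "new-hypothesis"   # loop back: form the next hypothesis
        return "root-cause-found"

log = DebugLog()
h = Hypothesis("pool is exhausted", "in_use hits the limit")
print(log.record(h, "confirmed"))  # root-cause-found
```

The point of such a structure is the explicit routing on "inconclusive" and "rejected": the loop only terminates when a hypothesis is confirmed, which is what makes the method systematic.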
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |