Socratic deep interview with mathematical ambiguity gating before explicit execution approval
32
27%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/deep-interview/SKILL.md
Quality
Discovery
7%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is heavily abstract and jargon-laden, failing to communicate concrete actions or provide natural trigger terms. It reads more like an academic concept label than a functional skill description. Without clear 'what' and 'when' guidance, Claude would struggle to select this skill appropriately from a pool of available skills.
Suggestions
Replace abstract jargon with concrete actions, e.g., 'Conducts structured interviews by asking clarifying questions to resolve ambiguous requirements before proceeding with task execution.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user's request is ambiguous, underspecified, or when requirements need clarification before coding or executing a task.'
Include natural keywords users might say, such as 'clarify requirements', 'ask questions first', 'confirm before proceeding', or 'ambiguous request'.
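The three suggestions above can be tried out with a small check. The following is a minimal sketch, not Tessl's actual scoring logic: `check_description`, `TRIGGER_TERMS`, and the `REVISED` wording are all hypothetical, assembled only from the example phrases quoted in the suggestions.

```python
# Hypothetical lint for a skill description. The checks mirror the suggestions
# above; they are an illustration, not how Tessl computes the Discovery score.
TRIGGER_TERMS = (
    "clarify requirements",
    "ask questions first",
    "confirm before proceeding",
    "ambiguous request",
)


def check_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it contains."""
    text = description.lower()
    return {
        "has_use_when": "use when" in text,
        "trigger_terms": [term for term in TRIGGER_TERMS if term in text],
    }


# The description under review, copied from the top of this report.
ORIGINAL = (
    "Socratic deep interview with mathematical ambiguity gating "
    "before explicit execution approval"
)

# One possible rewrite built from the suggested phrasing (illustrative only).
REVISED = (
    "Conducts structured interviews by asking clarifying questions to resolve "
    "ambiguous requirements before proceeding with task execution. "
    "Use when the user's request is ambiguous, underspecified, or when "
    "requirements need clarification before coding or executing a task, "
    "for example when asked to clarify requirements, ask questions first, "
    "or confirm before proceeding."
)

print(check_description(ORIGINAL))
print(check_description(REVISED))
```

A substring match like this is deliberately crude; it only shows that the original description contains neither a "Use when" clause nor any of the suggested keywords, while a rewrite along the suggested lines contains both.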
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses abstract, jargon-heavy language ('Socratic deep interview', 'mathematical ambiguity gating', 'explicit execution approval') without listing any concrete actions the skill performs. No specific capabilities like 'asks clarifying questions' or 'validates requirements' are mentioned. | 1 / 3 |
| Completeness | The description fails to clearly answer either 'what does this do' or 'when should Claude use it.' There is no 'Use when...' clause or equivalent trigger guidance, and the 'what' is expressed in abstract jargon rather than actionable terms. | 1 / 3 |
| Trigger Term Quality | The terms used are highly technical and unlikely to match natural user language. No user would naturally say 'mathematical ambiguity gating' or 'Socratic deep interview' when requesting help. There are no common, natural trigger terms present. | 1 / 3 |
| Distinctiveness Conflict Risk | The unusual terminology ('Socratic deep interview', 'mathematical ambiguity gating') is distinctive enough that it's unlikely to conflict with common skills, but the lack of clarity about what it actually does means it could either never trigger or trigger inappropriately. | 2 / 3 |
| Total | 5 / 12 Passed | |
Implementation
47%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill defines a sophisticated, well-structured interview workflow with clear phases, validation gates, and mathematical ambiguity scoring — the workflow clarity is its strongest dimension. However, it is severely over-engineered in length, repeating the approval-gating concept across multiple sections, explaining motivations Claude doesn't need, and including extensive reference tables inline rather than in separate files. The actionability is moderate: state schemas and formulas are concrete, but tool invocations lack complete executable examples.
Suggestions
Cut the content by at least 50%: remove 'Why_This_Exists', deduplicate the approval-gating explanation (appears in Steps, Advanced, and the pipeline diagram), and eliminate the Ambiguity Score Interpretation table which Claude can derive from the threshold logic.
Move the Advanced section (configuration schema, integration guides, brownfield weight tables, challenge agent reference table) into a separate REFERENCE.md or ADVANCED.md file and link to it from the main skill.
Replace the scoring prompt template with a more concise version — the current one re-explains dimensions that are already defined earlier in the document, adding ~40 lines of redundancy.
Add a concrete, complete example of a `state_write` call and a `Skill()` invocation with actual parameters rather than describing them abstractly.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at over 600 lines. It explains concepts Claude already understands (what Socratic questioning is, why clarity matters, and similar background for this domain), includes extensive pipeline diagrams and redundant tables, and repeats the same approval-gating concept across the Steps, Tool_Usage, Examples, Advanced, and Final_Checklist sections. The 'Why_This_Exists' section is pure rationale padding. Many sections could be cut by 50-70% without losing actionable content. | 1 / 3 |
| Actionability | The skill provides concrete JSON state structures, scoring formulas, question templates, and specific prompt injections for challenge agents; these are actionable. However, much of the guidance is procedural description rather than executable code. The scoring prompt is presented as a template but relies on an unspecified 'opus model' invocation mechanism. Tool calls like `state_write`, `Skill()`, and `Task()` are referenced but their exact APIs are assumed rather than demonstrated with complete examples. | 2 / 3 |
| Workflow Clarity | The multi-phase workflow is clearly sequenced (Phase 1 → Round 0 → Phase 2 loop → Phase 3 challenge agents → Phase 4 crystallize → Phase 5 bridge) with explicit validation checkpoints (ambiguity threshold gating, topology confirmation gate, soft/hard round limits, early exit warnings). The approval-gated pipeline diagram clearly shows three stages with explicit consent gates. Feedback loops for error recovery (stalled ambiguity → Ontologist mode, early exit with risk warning) are well-defined. | 3 / 3 |
| Progressive Disclosure | The content is largely monolithic: everything is in one massive file with no references to supporting bundle files. The Advanced section contains configuration, integration details, and reference tables that could be split into separate files. However, the use of XML-style sections and collapsible transcript in the spec template provides some internal structure. No bundle files are provided, so there's no external file organization to evaluate, but the sheer length of this single file suggests content should be distributed. | 2 / 3 |
| Total | 8 / 12 Passed | |
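The checkpoints named in the Workflow Clarity row (ambiguity threshold gating, soft/hard round limits, stalled-ambiguity recovery) can be sketched as a small decision function. This is a rough illustration under stated assumptions, not the skill's implementation: the report does not reproduce the skill's formula, weights, or limits, so every constant, the `1 - weighted mean clarity` scoring, and the names `ambiguity`, `next_step`, and `Decision` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# All constants are hypothetical placeholders, not values taken from the skill.
AMBIGUITY_THRESHOLD = 0.2   # proceed once ambiguity is at or below this
SOFT_ROUND_LIMIT = 5        # from this round on, offer an early exit with a warning
HARD_ROUND_LIMIT = 10       # never run more interview rounds than this
STALL_DELTA = 0.02          # a smaller round-to-round improvement counts as a stall


@dataclass
class Decision:
    action: str   # "crystallize", "ask", "offer_early_exit", "switch_mode", or "stop"
    reason: str


def ambiguity(clarity: dict[str, float], weights: dict[str, float]) -> float:
    """Assumed scoring: 1 minus the weighted mean of clarity scores in [0, 1]."""
    total = sum(weights.values())
    weighted = sum(weights[name] * clarity[name] for name in weights)
    return 1.0 - weighted / total


def next_step(history: list[float]) -> Decision:
    """Choose the next step from the per-round ambiguity scores recorded so far."""
    current, rounds = history[-1], len(history)
    if current <= AMBIGUITY_THRESHOLD:
        return Decision("crystallize", "ambiguity at or below threshold")
    if rounds >= HARD_ROUND_LIMIT:
        return Decision("stop", "hard round limit reached")
    if rounds >= 2 and history[-2] - current < STALL_DELTA:
        return Decision("switch_mode", "ambiguity stalled between rounds")
    if rounds >= SOFT_ROUND_LIMIT:
        return Decision("offer_early_exit", "soft round limit reached; warn about residual risk")
    return Decision("ask", "ambiguity still above threshold")


# Example: two hypothetical clarity dimensions with equal weights.
score = ambiguity({"goal": 0.9, "constraints": 0.5}, {"goal": 0.5, "constraints": 0.5})
print(round(score, 2), next_step([0.6, score]).action)
```

Ordering the checks this way means the threshold gate always wins, the hard limit overrides everything else, and a stall triggers a mode switch before the soft limit is offered; a real skill could reasonably order them differently.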
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (769 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
679b418