Use this skill any time someone wants to create, scaffold, build, fix, improve, benchmark, or optimize a Tessl/Claude skill — even if they don't say 'tessl' explicitly. If the request involves making a new skill ('create a skill for X', 'build me a skill that does Y', 'scaffold a skill called Z'), fixing or completing an existing one (missing tile.json, broken repo integration, low eval scores, description not triggering), or running and iterating on evals, invoke this skill. The full workflow covers: structured interview → SKILL.md + tile.json + rules/ scaffolding → README/CI repo integration → tessl tile lint → optional Tessl CLI pipeline (skill review, scenario generate/download, eval run) → hand-authored evals or LLM-as-judge fallback → benchmark logging. Do NOT use for: editing application code, debugging, refactoring, writing general documentation, or creating presentations.
90
88%: Does it follow best practices?
Impact: 91%, 1.26x. Average score across 3 eval scenarios.
Advisory: Suggest reviewing before use.
Quality
Discovery
100%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, well-crafted description that excels across all dimensions. It provides comprehensive trigger coverage with natural user phrasings, details the full workflow pipeline, and includes explicit exclusion criteria to minimize conflict with other skills. The only minor concern is that it uses second-person voice ('someone wants') in places, but the overall quality is high.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: create, scaffold, build, fix, improve, benchmark, optimize skills. Also details the full workflow steps including structured interview, SKILL.md + tile.json scaffolding, linting, CLI pipeline, eval run, and benchmark logging. | 3 / 3 |
| Completeness | Clearly answers both 'what' (the full workflow from interview to benchmark logging) and 'when' (explicit trigger guidance with example phrases like 'create a skill for X', plus a 'Do NOT use for' exclusion list). The 'Use this skill any time...' clause serves as an explicit trigger section. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'create a skill for X', 'build me a skill that does Y', 'scaffold a skill called Z', 'fixing', 'eval scores', 'description not triggering', 'tessl', plus explicit mention that it applies even without saying 'tessl'. Covers many natural phrasings. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche around Tessl/Claude skill creation. The explicit 'Do NOT use for' clause (editing application code, debugging, refactoring, general documentation, presentations) actively reduces conflict risk with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill for creating and optimizing Tessl skills. Its greatest strengths are the clear multi-phase workflow with explicit validation checkpoints, concrete CLI commands, and the comprehensive integrated example. The main weakness is verbosity — the skill could be more concise by reducing redundancy between non-negotiables and anti-patterns, and by leaning more heavily on the referenced companion files rather than summarizing their content inline.
Suggestions
Reduce redundancy by removing anti-patterns that merely restate non-negotiables (e.g., 'Modifying skill files during eval execution' duplicates non-negotiable #6) — or consolidate into a single authoritative list.
Move detailed fallback implementations (Phase 4 Path B LLM-as-Judge, Phase 3 Path M manual scenario authoring) into the referenced companion rule files to reduce SKILL.md length and better leverage progressive disclosure.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is comprehensive but verbose at ~400+ lines. Some sections (like the full interview table, detailed CLI pipeline steps, and the integrated example) earn their place, but there's redundancy: non-negotiable #6 is restated multiple times, and some explanations could be tightened. The anti-patterns section partially duplicates non-negotiables. | 2 / 3 |
| Actionability | Highly actionable throughout: specific CLI commands (`tessl tile lint`, `tessl scenario generate`), concrete file structures (tile.json schema, criteria.json with exact format), exact interview questions with fallback logic, and a complete integrated example showing input-to-output. Code snippets are copy-paste ready. | 3 / 3 |
| Workflow Clarity | Excellent multi-step workflow with clear phase sequencing (Interview → Scaffold → Lint → CLI pipeline → Eval → Optimize → Benchmark). Validation checkpoints are explicit (lint check after scaffold, completeness check after interview, gate check after eval). Feedback loops are well-defined (Phase 5.3 apply → re-eval → log → flag regressions). Non-negotiable #6 establishes a clear read-only boundary during eval execution. | 3 / 3 |
| Progressive Disclosure | References to companion rule files (scaffold-rules.md, activation-design.md, benchmark-loop.md, eval-runner.md) are well-signaled and one-level deep, which is good. However, no bundle files were provided, so we cannot verify these references resolve. The main SKILL.md itself is quite long and some content (like the full Phase 3 scenario schema or Phase 4 fallback details) could arguably live in the referenced rule files rather than being duplicated/summarized inline. | 2 / 3 |
| Total | | 10 / 12 Passed |
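The rubric totals in the two tables above can be checked with a short tally. This is a minimal sketch, assuming each dimension is scored out of 3 and the total is the plain sum of the dimension scores; the dictionary and function names are illustrative and not part of any Tessl tooling, and the sketch does not attempt to reproduce the headline percentages.

```python
# Scores copied from the Discovery and Implementation tables.
# Assumption: every dimension is scored out of 3 and totals are plain sums.
MAX_PER_DIMENSION = 3

discovery = {
    "Specificity": 3,
    "Completeness": 3,
    "Trigger Term Quality": 3,
    "Distinctiveness Conflict Risk": 3,
}

implementation = {
    "Conciseness": 2,
    "Actionability": 3,
    "Workflow Clarity": 3,
    "Progressive Disclosure": 2,
}


def tally(scores):
    """Return (points earned, points available) for one rubric table."""
    return sum(scores.values()), MAX_PER_DIMENSION * len(scores)


print("Discovery:", tally(discovery))            # Discovery: (12, 12)
print("Implementation:", tally(implementation))  # Implementation: (10, 12)
```

The two points missing from Implementation come from Conciseness and Progressive Disclosure, which are the dimensions the suggestions above target.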
Validation
100%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
a283f77