Start Tilt dev environment in tmux, monitor bootstrap to healthy state, fix Tiltfile bugs without hard-coding or fallbacks. Use when starting tilt, debugging Tiltfile errors, or bootstrapping a dev environment.
90
88%
Does it follow best practices?
Impact
92%
1.33x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly identifies specific actions (starting Tilt in tmux, monitoring bootstrap, fixing Tiltfile bugs), includes natural trigger terms developers would use, and provides an explicit 'Use when' clause. The description is concise, uses third-person voice, and occupies a distinct niche that minimizes conflict risk with other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: starting Tilt dev environment in tmux, monitoring bootstrap to healthy state, and fixing Tiltfile bugs. Also specifies constraints (without hard-coding or fallbacks), which adds further specificity. | 3 / 3 |
| Completeness | Clearly answers both 'what' (start Tilt dev environment in tmux, monitor bootstrap, fix Tiltfile bugs) and 'when' with an explicit 'Use when...' clause covering starting tilt, debugging Tiltfile errors, or bootstrapping a dev environment. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'tilt', 'Tiltfile', 'tmux', 'dev environment', 'bootstrap', 'debugging', 'Tiltfile errors'. These cover the key terms a developer would use when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche around Tilt/Tiltfile/tmux dev environment bootstrapping. The combination of Tilt, tmux, and Tiltfile debugging is very specific and unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill with a clear multi-step workflow, explicit validation loops, and strong guiding principles. Its main weaknesses are moderate verbosity in the Principles section, which could be more concise, and limited progressive disclosure, since no bundle files exist to offload detailed reference material. The executable code examples and error recovery patterns are particular strengths.
Suggestions
Condense the Principles section — consider a compact table format (anti-pattern | correct approach) instead of repeated 'Never' bullet points to save tokens.
Add bundle files for referenced concepts (e.g., a TROUBLESHOOTING.md for common Tilt errors, or a SILO.md for silo-specific workflows) to improve progressive disclosure.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient and assumes Claude's competence with Tilt/k8s concepts, but the Principles section is somewhat verbose with repeated 'Never' patterns that could be condensed into a table or shorter list. Some items like explaining when Tilt live-reloads are borderline unnecessary. | 2 / 3 |
| Actionability | Provides fully executable bash commands for every step: tmux session management, tilt status polling with jq, log retrieval. The code is copy-paste ready with proper variable interpolation and conditional logic. | 3 / 3 |
| Workflow Clarity | Clear 5-step sequential workflow with explicit validation checkpoints (Step 3 polling for convergence), a feedback loop in Step 4 (fix → live-reload → re-poll → verify), and a clear escalation path after 3 failed iterations. The report template in Step 5 provides a structured output format. | 3 / 3 |
| Progressive Disclosure | References the 'tmux skill' for patterns but has no bundle files to support deeper dives. The Principles section is inlined rather than separated, and there are no references to external docs for advanced topics like silo.toml configuration or gen-env scripts. For a skill of this length (~100 lines), the structure is reasonable but the principles could be a separate reference. | 2 / 3 |
| Total | | 10 / 12 Passed |
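
For readers who have not seen the skill itself, the polling-for-convergence checkpoint praised above can be sketched roughly as follows. This is not the skill's code (the review says the skill uses bash and jq); it is a Python illustration of the same idea. The function names, the set of status values treated as healthy, and the retry parameters are assumptions. The `runtimeStatus` and `updateStatus` fields are what `tilt get uiresources -o json` is expected to report, so check them against your Tilt version.

```python
import json
import subprocess
import time
from typing import Callable, List

# Assumption: these values mean "nothing left to wait for" on a resource.
HEALTHY = {"ok", "not_applicable"}


def unhealthy_resources(uiresources_json: str) -> List[str]:
    """Return names of resources whose runtime or update status is not healthy."""
    items = json.loads(uiresources_json).get("items", [])
    pending = []
    for item in items:
        status = item.get("status", {})
        runtime = status.get("runtimeStatus", "unknown")
        update = status.get("updateStatus", "unknown")
        if runtime not in HEALTHY or update not in HEALTHY:
            pending.append(item["metadata"]["name"])
    return pending


def fetch_from_tilt() -> str:
    """Ask a running Tilt instance for its resource list as JSON."""
    result = subprocess.run(
        ["tilt", "get", "uiresources", "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_for_convergence(fetch: Callable[[], str],
                         attempts: int = 3,
                         delay: float = 5.0) -> List[str]:
    """Poll up to `attempts` times; return the resources still unhealthy (empty if converged)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    pending: List[str] = []
    for attempt in range(attempts):
        pending = unhealthy_resources(fetch())
        if not pending:
            return []
        if attempt < attempts - 1:
            time.sleep(delay)  # give Tilt time to rebuild before re-polling
    return pending


if __name__ == "__main__":
    remaining = wait_for_convergence(fetch_from_tilt)
    print("healthy" if not remaining else "still unhealthy: " + ", ".join(remaining))
```

Passing the fetch function as a parameter keeps the status check testable without a running Tilt instance. A non-empty return value after the final attempt corresponds to the point where the workflow would escalate rather than keep retrying.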
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
aa009ea