Queries Tilt resource status, logs, and manages dev environments. Use when checking deployment health, investigating errors, reading logs, or working with Tiltfiles.
96
95%
Does it follow best practices?
Impact
100%
1.29x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies its domain (Tilt), provides explicit trigger guidance via a 'Use when...' clause, and uses distinctive terminology that minimizes conflict risk. The main weakness is that the capability listing could be more specific—enumerating concrete actions like 'restart resources, trigger builds, check build status' rather than the somewhat general 'queries resource status' and 'manages dev environments'.
Suggestions
Expand the capability list with more specific concrete actions, e.g., 'Queries Tilt resource status, reads build and runtime logs, restarts resources, and manages local dev environments.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Tilt) and some actions ('queries resource status, logs, manages dev environments'), but the actions are somewhat general rather than listing multiple specific concrete operations like 'restart resources, trigger builds, read build logs'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (queries Tilt resource status, logs, manages dev environments) and 'when' with an explicit 'Use when...' clause listing four trigger scenarios: checking deployment health, investigating errors, reading logs, working with Tiltfiles. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'deployment health', 'errors', 'logs', 'Tiltfiles', 'resource status', 'dev environments'. These cover the common ways a user would phrase requests related to Tilt. | 3 / 3 |
| Distinctiveness Conflict Risk | Tilt is a specific tool with a clear niche (local Kubernetes dev environments), and the description includes distinctive terms like 'Tiltfiles', 'Tilt resource status' that are unlikely to conflict with other skills. 'Logs' and 'deployment health' could overlap slightly with generic monitoring skills, but the Tilt context anchors it well. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill file that is concise, highly actionable, and well-structured. It provides immediately executable commands for the most common Tilt operations, includes important operational warnings (never restart for code changes), and appropriately delegates detailed content to referenced files. The diagnostic-first approach and complete jq patterns make this particularly effective for troubleshooting workflows.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Every section is lean and purposeful. No unnecessary explanations of what Tilt is or how it works conceptually; it jumps straight to actionable commands. The status values list and critical warning are both essential information that earn their tokens. | 3 / 3 |
| Actionability | All guidance is concrete with copy-paste ready bash commands, including complex jq pipelines for filtering resource status. The tmux session management script is fully executable, and specific flag usage (--since, --tail, --port) is demonstrated. | 3 / 3 |
| Workflow Clarity | The skill opens with 'First Action: Check for Errors' establishing a clear diagnostic workflow. The tmux script includes conditional checks (session exists, window exists) as validation. The 'Critical: Never Restart' section provides explicit decision criteria for when to restart vs. not, preventing a common destructive mistake. | 3 / 3 |
| Progressive Disclosure | The skill provides a well-structured overview with clear one-level-deep references to TILTFILE_API.md, CLI_REFERENCE.md, and external docs. Content is appropriately split: the main file covers the most common operations while pointing to detailed references for deeper needs. | 3 / 3 |
| Total | | 12 / 12 Passed |
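
The "check for errors first" workflow praised above can be illustrated with a small sketch. The skill's own jq pipelines are not reproduced on this page, so this is not the author's code: it is a minimal Python equivalent, assuming a JSON payload shaped like the output of `tilt get uiresources -o json` with `status.runtimeStatus` and `status.updateStatus` fields. The sample data, the resource names, and the `failing_resources` helper are all hypothetical.

```python
import json

# Hypothetical sample payload; the field names are an assumption about the
# shape of `tilt get uiresources -o json`, not copied from the skill.
SAMPLE = """
{"items": [
  {"metadata": {"name": "api"},
   "status": {"runtimeStatus": "ok", "updateStatus": "ok"}},
  {"metadata": {"name": "worker"},
   "status": {"runtimeStatus": "error", "updateStatus": "ok"}},
  {"metadata": {"name": "web"},
   "status": {"runtimeStatus": "pending", "updateStatus": "error"}}
]}
"""


def failing_resources(payload: str) -> list[str]:
    """Return names of resources whose build or runtime status is 'error'."""
    items = json.loads(payload).get("items", [])
    failing = []
    for item in items:
        status = item.get("status", {})
        # A resource counts as failing if either status field reports an error.
        if "error" in (status.get("runtimeStatus"), status.get("updateStatus")):
            failing.append(item["metadata"]["name"])
    return failing


if __name__ == "__main__":
    print(failing_resources(SAMPLE))
```

Running the filter before reading any logs narrows attention to the resources that actually need it, which is the point of a diagnostic-first workflow.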
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
e437c3c