Queries Tilt resource status, logs, and manages dev environments. Use when checking deployment health, investigating errors, reading logs, or working with Tiltfiles.
96
95%
Does it follow best practices?
Impact
100%
1.29x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies the Tilt-specific domain, provides an explicit 'Use when' clause with relevant trigger scenarios, and is distinctive enough to avoid conflicts with other skills. The main weakness is that the 'what' portion could be more specific about the concrete actions available (e.g., restart resources, trigger updates, read build/runtime logs).
Suggestions
Expand the capability list with more specific actions, e.g., 'Queries Tilt resource status, reads build and runtime logs, restarts resources, triggers updates, and manages dev environments.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Tilt) and some actions ('queries resource status, logs, manages dev environments'), but the actions are somewhat general rather than listing multiple specific concrete operations like 'restart resources, trigger builds, read build logs'. | 2 / 3 |
| Completeness | Clearly answers both what ('Queries Tilt resource status, logs, and manages dev environments') and when ('Use when checking deployment health, investigating errors, reading logs, or working with Tiltfiles') with an explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'deployment health', 'investigating errors', 'reading logs', 'Tiltfiles', 'resource status', and 'dev environments'. These cover the main scenarios a user working with Tilt would mention. | 3 / 3 |
| Distinctiveness Conflict Risk | Tilt is a specific tool with distinct terminology (Tiltfiles, resource status). The description is clearly distinguishable from generic logging or deployment skills due to the Tilt-specific references. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill file that is concise, highly actionable, and well-structured. It provides immediately useful commands with proper jq patterns for JSON processing, includes important operational guardrails (never restart for code changes), and appropriately delegates detailed content to referenced files. The tmux session management script is a particularly thoughtful inclusion for real-world usage.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Every section is lean and purposeful. No unnecessary explanations of what Tilt is or how it works conceptually; it jumps straight into actionable commands. The status values list is a useful reference, not padding. | 3 / 3 |
| Actionability | All guidance is concrete with copy-paste ready bash commands, including complex jq pipelines for filtering resource status. The tmux session management script is fully executable. Specific flag usage (--since, --tail, --port) is demonstrated. | 3 / 3 |
| Workflow Clarity | The skill opens with 'First Action: Check for Errors' establishing a clear starting workflow. The 'Critical: Never Restart for Code Changes' section provides explicit guardrails. The tmux script includes conditional checks (session exists, window exists) as validation. The wait-for-ready command serves as a verification checkpoint. | 3 / 3 |
| Progressive Disclosure | The skill provides a well-organized overview with clear one-level-deep references to TILTFILE_API.md, CLI_REFERENCE.md, and external docs. Content is appropriately split: the main file covers the most common operations while pointing to detailed references for deeper topics. | 3 / 3 |
| Total | | 12 / 12 Passed |
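The skill's own jq pipelines are not reproduced in this review, so the following is only a rough sketch of the kind of status filtering the Actionability row describes. It assumes JSON shaped like the output of `tilt get uiresources -o json`, with `updateStatus` and `runtimeStatus` fields under each item's `status`; the sample resource names and values are invented for illustration.

```python
import json

# Hypothetical sample shaped like `tilt get uiresources -o json` output.
# Resource names and status values here are made up for illustration.
SAMPLE = """
{
  "items": [
    {"metadata": {"name": "api"},
     "status": {"updateStatus": "ok", "runtimeStatus": "ok"}},
    {"metadata": {"name": "worker"},
     "status": {"updateStatus": "error", "runtimeStatus": "pending"}},
    {"metadata": {"name": "web"},
     "status": {"updateStatus": "ok", "runtimeStatus": "error"}}
  ]
}
"""


def failing_resources(payload: str) -> list[str]:
    """Return names of resources whose update or runtime status is 'error'."""
    doc = json.loads(payload)
    names = []
    for item in doc.get("items", []):
        status = item.get("status", {})
        # A resource counts as failing if either status field reports an error.
        if "error" in (status.get("updateStatus"), status.get("runtimeStatus")):
            names.append(item["metadata"]["name"])
    return names


print(failing_resources(SAMPLE))
```

In practice the payload would come from the Tilt CLI rather than an inline string; the inline sample just keeps the sketch runnable without a Tilt session.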
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
f772de4