Operate long-lived agent workloads with observability, security boundaries, and lifecycle management.
35%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
17%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is overly abstract and jargon-heavy, reading more like a product tagline than actionable skill guidance. It lacks concrete actions, natural trigger terms, and any explicit 'when to use' clause, making it difficult for Claude to reliably select this skill from a pool of alternatives.
Suggestions
Add a 'Use when...' clause with specific trigger scenarios, e.g., 'Use when the user asks to run a long-running background process, manage agent lifecycles, or monitor running workloads.'
Replace abstract nouns with concrete actions, e.g., 'Start, stop, and monitor long-running agent processes; set security sandboxes; view logs and metrics; manage process restarts and health checks.'
Include natural user-facing keywords like 'background task', 'daemon', 'process monitoring', 'agent logs', 'restart', 'sandbox' that users would actually use in requests.
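The suggestions above can be turned into a rough automated check. The following is a minimal sketch, not part of the review tooling: `lint_description` and `TRIGGER_TERMS` are hypothetical names, the term list is copied from the keywords suggested above, and the `revised` text is an illustrative combination of the suggested wording rather than an official rewrite.

```python
import re

# Hypothetical trigger terms, taken from the keywords suggested in this review.
TRIGGER_TERMS = (
    "background task",
    "daemon",
    "process monitoring",
    "agent logs",
    "restart",
    "sandbox",
)


def lint_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it contains."""
    text = description.lower()
    return {
        "has_use_when": re.search(r"\buse when\b", text) is not None,
        # Substring match, so "restarts" and "sandboxes" count as hits.
        "trigger_terms": [term for term in TRIGGER_TERMS if term in text],
    }


# The description as currently published.
original = (
    "Operate long-lived agent workloads with observability, "
    "security boundaries, and lifecycle management."
)

# Illustrative rewrite assembled from the suggestions above (an assumption).
revised = (
    "Start, stop, and monitor long-running agent processes; set security sandboxes; "
    "view agent logs and metrics; manage process restarts and health checks. "
    "Use when the user asks to run a long-running background process, "
    "manage agent lifecycles, or monitor running workloads."
)

if __name__ == "__main__":
    print(lint_description(original))
    print(lint_description(revised))
```

A check like this only approximates the rubric: it can confirm that a "Use when" clause and some natural keywords are present, but it cannot judge specificity or conflict risk.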
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names a domain (agent workloads) and some capabilities (observability, security boundaries, lifecycle management), but these are abstract concepts rather than concrete actions. No specific verbs like 'monitor logs', 'restart processes', or 'set resource limits' are provided. | 2 / 3 |
| Completeness | The description provides a vague 'what' but completely lacks a 'when' clause. There is no 'Use when...' or equivalent guidance for when Claude should select this skill, which per the rubric caps completeness at 2, and the weak 'what' brings it to 1. | 1 / 3 |
| Trigger Term Quality | The terms used ('observability', 'security boundaries', 'lifecycle management') are technical jargon that users are unlikely to naturally say. Missing common trigger terms like 'background process', 'daemon', 'long-running task', 'monitoring', or 'sandbox'. | 1 / 3 |
| Distinctiveness Conflict Risk | The phrase 'long-lived agent workloads' is somewhat distinctive and narrows the domain, but 'observability' and 'lifecycle management' are broad enough to overlap with monitoring, DevOps, or container management skills. | 2 / 3 |
| Total | | 6 / 12 Passed |
Implementation
22%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads as a high-level overview of enterprise agent operations concepts rather than actionable guidance. It lacks concrete commands, code examples, configuration snippets, or specific tool usage instructions. Most of the content describes general DevOps principles that Claude already knows, without adding novel, executable knowledge.
Suggestions
Replace abstract bullet points with concrete, executable examples—e.g., actual PM2 commands for lifecycle management, sample systemd unit files, or specific logging/metrics collection configurations.
Add validation checkpoints to the incident pattern workflow with specific commands (e.g., 'Run `curl -s http://localhost:3000/health | jq .status` to verify recovery before resuming rollout').
Include at least one complete, copy-paste-ready example for a common scenario like deploying an agent with health checks and a kill switch.
Either link to detailed reference files for each deployment integration (PM2, systemd, containers, CI/CD) or provide inline configuration examples for at least one.
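As an illustration of the kind of validation checkpoint these suggestions ask for, here is a minimal sketch of a health gate that could run before resuming a rollout. It assumes the `http://localhost:3000/health` endpoint from the `curl` example above returns JSON with a `status` field; the expected value `"ok"`, the retry counts, and all function names are hypothetical, not taken from the skill.

```python
import json
import time
import urllib.request
from typing import Callable

# Endpoint taken from the review's curl example; adjust for the real service.
HEALTH_URL = "http://localhost:3000/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 2.0) -> bytes:
    """Fetch the raw health response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def status_is_ok(body: bytes, expected: str = "ok") -> bool:
    """Return True if the body is a JSON object whose 'status' equals `expected` (assumed value)."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == expected


def wait_until_healthy(
    fetch: Callable[[], bytes] = fetch_health,
    attempts: int = 5,
    delay: float = 1.0,
) -> bool:
    """Poll the health check up to `attempts` times; True as soon as one check passes."""
    for attempt in range(attempts):
        try:
            if status_is_ok(fetch()):
                return True
        except OSError:
            # Connection refused, timeout, and similar: treat as not yet healthy.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    if wait_until_healthy():
        print("healthy: safe to resume rollout")
    else:
        print("unhealthy: keep rollout paused")
```

Passing `fetch` as a parameter keeps the gate testable without a running service, and returning a boolean lets a deployment script decide whether to continue or roll back.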
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is relatively lean and avoids excessive explanation, but much of it is high-level conceptual knowledge Claude already possesses (e.g., what metrics to track, what baseline controls are). These are generic ops principles, not novel instructions. | 2 / 3 |
| Actionability | The skill provides no concrete commands, code, configuration examples, or executable guidance. Everything is abstract and descriptive: bullet lists of concepts and vague numbered steps like 'isolate failing route' without specifying how. | 1 / 3 |
| Workflow Clarity | The incident pattern section lists steps but they are vague and lack validation checkpoints, specific commands, or feedback loops. 'Capture representative traces' and 'patch with smallest safe change' are not actionable sequences. No verification steps are included for the deployment/rollback workflow. | 1 / 3 |
| Progressive Disclosure | The content is organized into clear sections with reasonable headers, but the 'Deployment Integrations' section merely lists names without links or references to actual detailed files. There's no progressive disclosure structure pointing to deeper resources. | 2 / 3 |
| Total | | 6 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |