Service performance monitoring with RED metrics (Rate, Errors, Duration) and runtime-specific telemetry for Java, .NET, Node.js, Python, PHP, and Go. Use when analyzing service health, SLA compliance, or runtime issues. Trigger: "service response time", "error rate", "throughput", "SLA compliance", "service mesh overhead", "JVM GC", "Java heap", "Node.js event loop", ".NET CLR", "Python threads", "PHP OPcache", "Go goroutines", "service performance", "p95 latency", "request failures", "database response time by name". Do NOT use for explaining existing queries, product documentation questions, infrastructure metrics (use dt-obs-hosts), log analysis (use dt-obs-logs), or distributed tracing workflows (use dt-obs-tracing).
71
86%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that excels across all dimensions. It provides specific capabilities (RED metrics, runtime telemetry for six languages), comprehensive trigger terms that match natural user language, explicit 'Use when' guidance, and clear boundary definitions with named alternative skills to prevent conflicts. The description is thorough yet focused, making it easy for Claude to correctly select this skill from a large pool.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions and domains: RED metrics (Rate, Errors, Duration), runtime-specific telemetry for six named languages, SLA compliance analysis, and service health monitoring. Very concrete and detailed. | 3 / 3 |
| Completeness | Clearly answers both 'what' (service performance monitoring with RED metrics and runtime-specific telemetry) and 'when' (explicit 'Use when' clause plus detailed trigger terms and explicit 'Do NOT use for' exclusions that further clarify scope). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'service response time', 'error rate', 'throughput', 'p95 latency', 'JVM GC', 'Node.js event loop', etc. These are terms practitioners naturally use when diagnosing service performance issues. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with explicit boundary-setting via 'Do NOT use for' clauses that redirect to specific alternative skills (dt-obs-hosts, dt-obs-logs, dt-obs-tracing). The focus on service-level RED metrics and runtime-specific telemetry creates a clear niche. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill with strong actionability and excellent progressive disclosure across multiple reference files. It has two main weaknesses: the use-case listings are moderately verbose, and the workflow sections lack explicit validation checkpoints and error-recovery steps. The Agent Instructions section is a notable strength, providing clear defaults and decision-making guidance.
Suggestions
Add explicit validation steps to workflows — e.g., after running a query, check for empty results and suggest diagnostic steps (verify OneAgent is active, check timeframe, confirm service exists).
Trim the bullet-point 'Use Cases' lists under each capability section — the intent-mapping table in Agent Instructions already covers this mapping more efficiently.
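The first suggestion can be illustrated with a minimal sketch of a post-query validation step. This is not taken from the skill itself: `QueryCheck`, `check_query_result`, and the `records`, `service_name`, and `timeframe` parameters are hypothetical names, and the query result is assumed to arrive as a plain list of records.

```python
from dataclasses import dataclass, field


@dataclass
class QueryCheck:
    """Outcome of a post-query validation step (hypothetical type)."""
    ok: bool
    hints: list = field(default_factory=list)


def check_query_result(records, service_name, timeframe):
    """Return ok=True when records exist; otherwise list diagnostic steps.

    Assumption: `records` is the list of rows returned by a query.
    The hints mirror the diagnostic steps named in the suggestion.
    """
    if records:
        return QueryCheck(ok=True)
    return QueryCheck(
        ok=False,
        hints=[
            f"Verify that OneAgent is active for '{service_name}'.",
            f"Check that the timeframe '{timeframe}' covers a period with traffic.",
            f"Confirm that a service named '{service_name}' exists.",
        ],
    )


# Usage: an empty result yields diagnostic hints instead of a silent failure.
result = check_query_result([], "checkout", "last 2h")
for hint in result.hints:
    print(hint)
```

A workflow could call such a check after each query and surface the hints to the user before moving on to the next step.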
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-organized but includes some redundancy: the 'Use Cases' bullet lists under each capability section are somewhat verbose and overlap with the intent-mapping table in Agent Instructions. The 'When to Use This Skill' section largely repeats the description metadata. However, most content earns its place and the structure aids navigation. | 2 / 3 |
| Actionability | The skill provides concrete, executable DQL queries for each capability area, specific metric names, unit conversion guidance (microseconds to milliseconds), default parameter values with rationale, and a clear step-by-step example for threshold violation requests. The code examples are copy-paste ready with real metric identifiers. | 3 / 3 |
| Workflow Clarity | Workflows are listed but lack explicit validation checkpoints and feedback loops. The 'Service Health Check' and 'Runtime Troubleshooting' workflows are high-level outlines without concrete validation steps (e.g., what to do if no data returns, how to verify query correctness). The threshold violation workflow in Agent Instructions is better sequenced but still lacks error handling. | 2 / 3 |
| Progressive Disclosure | Excellent progressive disclosure structure: the main skill provides concise overviews with quick examples for each capability, then clearly references 7 separate files for detailed queries. References are one level deep, well-signaled with inline links, and organized by domain (core service metrics vs. runtime-specific). The References section at the bottom provides a clean navigation index. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
7cbe1ef