Investigate failing ThousandEyes synthetic tests with MCP tools. Use when a user wants ThousandEyes test triage, service-map or trace-ID correlation, distributed-tracing checks, correlation across Observability Platforms, or evidence-backed root-cause analysis with optional code fixes.
Overall Score: 81%
Does it follow best practices?

Impact: No eval scenarios have been run.
Status: Passed, no known issues.

Quality
Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly identifies its domain (ThousandEyes synthetic test investigation), lists specific capabilities, and includes an explicit 'Use when' clause with multiple concrete trigger scenarios. The description is concise, uses third-person voice, and provides enough specificity to distinguish it from general observability or monitoring skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: investigating failing synthetic tests, service-map correlation, trace-ID correlation, distributed-tracing checks, cross-platform correlation, root-cause analysis, and optional code fixes. | 3 / 3 |
| Completeness | Clearly answers both 'what' (investigate failing ThousandEyes synthetic tests with MCP tools) and 'when' (an explicit 'Use when' clause listing five specific trigger scenarios, including triage, correlation, tracing, and root-cause analysis). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'ThousandEyes', 'synthetic tests', 'failing', 'triage', 'service-map', 'trace-ID', 'distributed-tracing', 'root-cause analysis', 'code fixes', and 'Observability Platforms'. These cover the domain well with terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear niche: ThousandEyes-specific synthetic test investigation using MCP tools. The combination of ThousandEyes, synthetic tests, and observability-platform correlation makes it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, domain-specific skill with excellent workflow clarity and logical sequencing for a complex multi-step diagnostic process. Its main weaknesses are moderate verbosity (some redundancy between Required Behavior and Workflow sections) and a lack of concrete executable examples such as sample MCP tool call payloads or expected response structures. The progressive disclosure design is sound but unverifiable without bundle files.
Suggestions
Add at least one concrete example of an MCP tool call with sample input parameters and a snippet of expected output to improve actionability.
Deduplicate content between the 'Required Behavior' section and the 'Workflow' section—several rules (e.g., enumerate all Observability Platforms, check every platform) are stated in both places.
Consider moving the detailed Observability Platform correlation sub-steps (Step 5) into reference.md to reduce the main skill's token footprint while keeping the high-level flow inline.
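To make the first suggestion concrete, here is a minimal sketch of what an inline MCP tool-call example could look like. The payload shape and helper function are assumptions for illustration; only the tool name (`list_network_app_synthetics_tests`) and the filter parameters (`filter_dimension=TEST`, `filter_values=[testId]`) come from the skill under review.

```python
# Hypothetical sketch of an MCP tool call the skill could document inline.
# The payload follows the generic MCP tools/call convention; the helper
# function and test ID are illustrative, not taken from the skill itself.

def build_test_metrics_request(test_id: str) -> dict:
    """Build a tools/call payload scoped to one failing synthetic test."""
    return {
        "method": "tools/call",
        "params": {
            "name": "list_network_app_synthetics_tests",
            "arguments": {
                "filter_dimension": "TEST",
                "filter_values": [test_id],
            },
        },
    }

request = build_test_metrics_request("12345")
print(request["params"]["arguments"]["filter_values"])  # ['12345']
```

An agent would send `request` through its MCP client and then inspect the returned test status, error type, and agent locations before moving to correlation.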
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is generally well-structured and avoids explaining basic concepts, but it is somewhat verbose in places, particularly the Observability Platform correlation section (Step 5), which repeats similar instructions across multiple sub-points. Some rules in 'Required Behavior' duplicate what is already stated in the workflow steps. Could be tightened by roughly 20-30%. | 2 / 3 |
| Actionability | The skill provides specific MCP tool names (e.g., `list_network_app_synthetics_tests`, `get_service_map`) and clear parameter guidance (`filter_dimension=TEST`, `filter_values=[testId]`), which is good. However, there are no concrete executable code examples, no example MCP call payloads, and no sample output structures inline. The guidance is specific but not copy-paste ready. | 2 / 3 |
| Workflow Clarity | The workflow is clearly sequenced across 6 numbered phases with logical ordering, explicit conditional branches (e.g., distributed tracing enabled/disabled, service map available/unavailable), fallback paths, and validation checkpoints (e.g., 'validate trace ID format', 'ask user confirmation before edits', 'do not conclude until every platform checked'). The guardrails section adds safety constraints for destructive operations. | 3 / 3 |
| Progressive Disclosure | The skill references `reference.md` and `examples.md` with clear signals for when to load each, which is good progressive disclosure design. However, no bundle files were provided, so we cannot verify that these references exist or contain appropriate content. The main SKILL.md itself is fairly long, and some content (like the detailed Observability Platform correlation sub-steps) could potentially be offloaded to a reference file. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
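The 'do not conclude until every platform checked' checkpoint credited under Workflow Clarity amounts to an enumerate-then-verify guard. A minimal sketch, assuming hypothetical platform names and a stand-in query callable (neither is taken from the skill):

```python
# Illustrative sketch of the "check every platform before concluding" rule.
# The platform list and the query callable are hypothetical stand-ins.

PLATFORMS = ["platform_a", "platform_b", "platform_c"]

def correlate_across_platforms(trace_id, query):
    """Query every observability platform for a trace; never stop early."""
    evidence = {}
    for platform in PLATFORMS:
        evidence[platform] = query(platform, trace_id)  # may be None (no hit)
    missing = [p for p in PLATFORMS if p not in evidence]
    if missing:
        raise RuntimeError(f"cannot conclude: {missing} not yet checked")
    return evidence

findings = correlate_across_platforms("abc-123", lambda p, t: {"trace_id": t})
```

The point of the guard is that a negative result ("no hit on platform X") is still recorded as evidence, so the final root-cause conclusion can cite every platform checked.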
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Result: 11 / 11 checks passed. Skill-structure validation reported no warnings or errors.