Debug Azure production issues on Azure using AppLens, Azure Monitor, resource health, and safe triage. WHEN: debug production issues, troubleshoot app service, app service high CPU, app service deployment failure, troubleshoot container apps, troubleshoot functions, troubleshoot AKS, kubectl cannot connect, kube-system/CoreDNS failures, pod pending, crashloop, node not ready, upgrade failures, analyze logs, KQL, insights, image pull failures, cold start issues, health probe failures, resource health, root cause of errors, troubleshoot event hubs, troubleshoot service bus, messaging SDK error, AMQP connection failure, message lock lost, service bus dead letter.
79
73%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./plugin/skills/azure-diagnostics/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description with excellent trigger term coverage and clear 'when' guidance via an explicit WHEN clause. The main weakness is that the 'what it does' portion is somewhat thin — it says 'debug' and 'safe triage' but doesn't enumerate the concrete actions Claude would take (e.g., run KQL queries, check resource health, analyze pod logs). The extensive keyword list compensates well for skill selection purposes.
Suggestions
Expand the opening sentence to list more concrete actions beyond 'debug' and 'safe triage', e.g., 'Diagnoses Azure production issues by querying logs with KQL, checking resource health, analyzing pod states, and performing safe triage using AppLens and Azure Monitor.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Azure production issues) and mentions specific tools (AppLens, Azure Monitor, resource health), but the 'what it does' is limited to 'debug' and 'safe triage'. It doesn't list multiple concrete actions like 'analyze logs, query metrics, check resource health status, inspect pod states'. | 2 / 3 |
| Completeness | The description explicitly answers both 'what' (debug Azure production issues using specific tools) and 'when' (with an explicit 'WHEN:' clause listing extensive trigger scenarios). The 'WHEN' section is thorough and clearly delineated. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would actually say: 'high CPU', 'deployment failure', 'crashloop', 'pod pending', 'node not ready', 'cold start issues', 'dead letter', 'AMQP connection failure', 'KQL', 'kubectl cannot connect', etc. These are highly specific and match real user language. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive: the combination of Azure-specific services (App Service, Container Apps, Functions, AKS, Event Hubs, Service Bus) with specific tools (AppLens, Azure Monitor, KQL) and very specific failure modes (CrashLoop, CoreDNS failures, AMQP connection failure) makes this unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at progressive disclosure and routing—it serves well as a hub document directing to service-specific guides. However, the main content is weakened by generic diagnostic advice Claude already knows, placeholder-heavy examples that aren't truly executable, and a lack of validation/feedback loops in the troubleshooting workflow. The triggers section is redundant with the skill description.
Suggestions
Remove or drastically shorten the Triggers section and the generic Quick Diagnosis Flow—Claude already knows how to approach systematic debugging. Replace with a concise decision tree that maps specific symptoms to specific actions or references.
Make MCP tool examples concrete and executable: show a real-world example with actual resource IDs and expected output patterns, rather than pseudo-syntax with angle-bracket placeholders.
Add validation checkpoints to the diagnostic workflow: e.g., 'If resource health returns Degraded, skip to logs. If Available, check for application-level errors. After remediation, re-check health to confirm resolution.'
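The validation checkpoints suggested above could be written down as a small decision function. This is only a sketch of the branching described in the suggestion: the function name `next_step` and the action labels are hypothetical, and the fallback for any status other than Degraded or Available is an assumption, not something the review specifies.

```python
def next_step(health_status: str, remediation_applied: bool = False) -> str:
    """Return the next triage action for a resource health status.

    Action labels are illustrative placeholders, not names from the skill.
    """
    if remediation_applied:
        # After any remediation, re-check health to confirm resolution.
        return "recheck_health"

    status = health_status.strip().lower()
    if status == "degraded":
        # Platform reports a problem: skip straight to logs.
        return "inspect_logs"
    if status == "available":
        # Platform looks healthy: look for application-level errors.
        return "check_application_errors"

    # Assumption: any other status is escalated instead of guessed at.
    return "escalate"


if __name__ == "__main__":
    for status in ("Degraded", "Available"):
        print(status, "->", next_step(status))
    print("after fix ->", next_step("Available", remediation_applied=True))
```

Writing the checkpoints as explicit branches makes it visible which outcomes the workflow handles and which it leaves open.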
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The triggers section is overly verbose, essentially restating the YAML description. The 'Rules' and 'Quick Diagnosis Flow' sections are generic enough that Claude already knows them (e.g., 'Identify symptoms - What's failing?'). The routing section and service table are efficient, but the overall document could be tightened significantly. | 2 / 3 |
| Actionability | CLI commands and MCP tool invocations are provided but use placeholder values without concrete examples of real diagnostic scenarios. The MCP tool syntax is not standard executable code; it's a pseudo-format. The quick diagnosis flow is abstract ('What's failing?') rather than providing concrete decision trees or specific commands for each step. | 2 / 3 |
| Workflow Clarity | The quick diagnosis flow provides a sequence but lacks validation checkpoints and feedback loops. There's no guidance on what to do if resource health check fails vs. passes, no error recovery steps, and no explicit verification that a remediation worked. For production troubleshooting (a high-stakes operation), this is insufficient. | 2 / 3 |
| Progressive Disclosure | The skill is well-structured as an overview/router with a clear service-to-reference table, explicit routing rules, and one-level-deep references to specific troubleshooting guides. Navigation is easy with the table format and clearly signaled links to sub-documents. | 3 / 3 |
| Total | | 9 / 12 Passed |
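The Conciseness and Actionability rows both ask for a concrete mapping from symptoms to actions in place of the generic flow. One possible shape for such a mapping is a plain lookup table, sketched below. The phrases are taken from the skill's trigger list; the pairing of each phrase with a service, the `SYMPTOM_ROUTES` name, and the `route` helper are assumptions made for illustration and do not come from the skill itself.

```python
from typing import Optional

# Hypothetical routing table: trigger phrase -> service whose guide applies.
# Only phrases whose service is unambiguous in the description are included.
SYMPTOM_ROUTES = {
    "app service high cpu": "App Service",
    "app service deployment failure": "App Service",
    "kubectl cannot connect": "AKS",
    "pod pending": "AKS",
    "crashloop": "AKS",
    "node not ready": "AKS",
    "message lock lost": "Service Bus",
    "service bus dead letter": "Service Bus",
}


def route(symptom: str) -> Optional[str]:
    """Return the service for the first trigger phrase found in the symptom."""
    text = symptom.lower()
    for phrase, service in SYMPTOM_ROUTES.items():
        if phrase in text:
            return service
    # No known phrase matched; the caller falls back to manual triage.
    return None


if __name__ == "__main__":
    print(route("Pods stuck in CrashLoop after deploy"))
    print(route("Message lock lost while processing orders"))
```

A table like this is short enough to sit in the hub document, with each service name pointing at its existing reference guide.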
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
771a666