Troubleshoot and respond to Langfuse-related incidents and outages. Use when experiencing Langfuse outages, debugging production issues, or responding to LLM observability incidents. Trigger with phrases like "langfuse incident", "langfuse outage", "langfuse down", "langfuse production issue", "langfuse troubleshoot".
85
83%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured skill description with strong trigger terms, explicit 'Use when' and 'Trigger with' clauses, and a clearly distinctive niche around Langfuse incident response. The main weakness is that the specific capabilities could be more concrete—listing particular troubleshooting actions rather than the general verbs 'troubleshoot' and 'respond'.
Suggestions
Add more specific concrete actions such as 'check Langfuse service health, review error logs, diagnose trace ingestion failures, verify API connectivity' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Langfuse incidents/outages) and some actions (troubleshoot, respond, debug), but doesn't list multiple specific concrete actions like 'check service health', 'review error logs', 'restart services', or 'escalate to on-call'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (troubleshoot and respond to Langfuse-related incidents and outages) and 'when' (explicit 'Use when' clause with scenarios, plus explicit 'Trigger with phrases' listing specific trigger terms). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms including 'langfuse incident', 'langfuse outage', 'langfuse down', 'langfuse production issue', 'langfuse troubleshoot', plus broader terms like 'LLM observability incidents' and 'debugging production issues'. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to the specific product name 'Langfuse' and the narrow focus on incident response/outages for that particular tool. Very unlikely to conflict with other skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, actionable incident runbook with executable scripts, clear severity classification, and a well-sequenced workflow from triage to post-mortem. Its main weakness is that it's somewhat long for a single SKILL.md file — resolution procedures and the post-incident review template could be split out — and there's some redundancy between the symptom table and the error handling table at the end.
Suggestions
Remove or consolidate the 'Error Handling' table at the bottom since it largely duplicates the 'Determine Incident Type and Response' table in Step 2.
Consider splitting detailed resolution procedures (A, B, C) and the post-incident review template into separate referenced files to reduce the main skill's token footprint.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient with good use of tables for quick reference, but some redundancy exists: the Error Handling table at the end largely duplicates the symptom/action table in Step 2, and some inline comments explain things Claude would already know. | 2 / 3 |
| Actionability | Provides fully executable bash scripts for triage and verification, concrete TypeScript code for fallback mode and resolution procedures, and specific docker commands for self-hosted troubleshooting. Commands are copy-paste ready with proper error handling (set -euo pipefail). | 3 / 3 |
| Workflow Clarity | Clear 6-step sequential workflow from initial assessment through post-incident review, with explicit time targets (2 min triage), severity-based branching, validation at Step 5 with a feedback check on whether traces are flowing, and escalation criteria with time thresholds. | 3 / 3 |
| Progressive Disclosure | Content is well-structured with clear sections and tables, but the entire runbook is monolithic: the resolution procedures (A, B, C) and post-incident review template could be split into separate referenced files to keep the main skill leaner. External links are provided but only for Langfuse resources, not for supplementary skill content. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
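
The two warnings above can be reproduced locally with a small check before publishing. The sketch below is illustrative only: the review does not show the validator's actual spec, so the allow-lists `KNOWN_KEYS` and `KNOWN_TOOLS` and the function `check_frontmatter` are assumptions made for the example, not the validator's real rules.

```python
from typing import Any

# Hypothetical allow-lists; the real validator's spec is not shown in this review.
KNOWN_KEYS = {"name", "description", "allowed-tools", "metadata"}
KNOWN_TOOLS = {"Read", "Write", "Edit", "Bash", "Grep", "Glob"}


def check_frontmatter(frontmatter: dict[str, Any]) -> dict[str, str]:
    """Return a mapping of check name -> "Passed" or "Warning"."""
    results: dict[str, str] = {}

    # frontmatter_unknown_keys: flag any top-level key outside the allow-list.
    unknown = set(frontmatter) - KNOWN_KEYS
    results["frontmatter_unknown_keys"] = "Warning" if unknown else "Passed"

    # allowed_tools_field: accept either a list or a comma-separated string.
    raw = frontmatter.get("allowed-tools", "")
    if isinstance(raw, list):
        tools = [str(t).strip() for t in raw]
    else:
        tools = [t.strip() for t in str(raw).split(",") if t.strip()]

    # Drop any parenthesised argument pattern, e.g. "Bash(curl:*)" -> "Bash".
    names = {t.split("(", 1)[0] for t in tools}
    results["allowed_tools_field"] = "Warning" if names - KNOWN_TOOLS else "Passed"
    return results
```

Under these assumed lists, a frontmatter block with an extra top-level key such as `version` and an unrecognised tool name would produce the same two warnings as the table; moving the extra key under `metadata` and correcting the tool name would clear both.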
4dee593