
langfuse-observability

Set up comprehensive observability for Langfuse with metrics, dashboards, and alerts. Use when implementing monitoring for LLM operations, setting up dashboards, or configuring alerting for Langfuse integration health. Trigger with phrases like "langfuse monitoring", "langfuse metrics", "langfuse observability", "monitor langfuse", "langfuse alerts", "langfuse dashboard".

64

Quality

77%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/saas-packs/langfuse-pack/skills/langfuse-observability/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a highly actionable skill with executable, production-ready code covering the full observability stack from metrics instrumentation through alerting. Its main weaknesses are the lack of validation/verification steps between the setup stages and the monolithic structure that packs a lot of code inline. Some trimming of explanatory text around Langfuse's built-in features would improve token efficiency.

Suggestions

Add verification steps after key stages, e.g., 'curl localhost:3000/metrics | grep langfuse_traces' after Step 3 to confirm metrics are exposed, and a Prometheus query check after Step 4.

Consider splitting the traced LLM wrapper and Grafana dashboard JSON into separate bundle files, keeping only a concise summary and reference links in the main SKILL.md.

Trim the Langfuse Built-In Dashboards section: the bullet list describing UI features (Overview, Cost Dashboard, etc.) is information Claude can find in the docs and does not need to be restated in the skill.
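The verification steps proposed in the first suggestion could be sketched roughly as follows. This is a minimal illustration, not code from the skill: the helper names (`hasMetricSample`, `buildQueryUrl`, `hasSeries`) are hypothetical, the `langfuse_traces` prefix and the `localhost:3000/metrics` endpoint come from the suggestion itself, and the Prometheus base URL is assumed to be the default `http://localhost:9090`. The `/api/v1/query` path is the Prometheus HTTP API's instant-query endpoint.

```typescript
// Shape of the Prometheus instant-query response fields this sketch reads.
interface PromQueryResponse {
  status: "success" | "error";
  data?: { resultType: string; result: unknown[] };
  error?: string;
}

// Check after Step 3: true if the exposition text contains at least one
// sample line (not a "# HELP" / "# TYPE" comment) whose metric name starts
// with `prefix`. Stricter than a plain grep, which also matches comments.
function hasMetricSample(exposition: string, prefix: string): boolean {
  return exposition
    .split("\n")
    .map((line) => line.trim())
    .some(
      (line) =>
        line.length > 0 && !line.startsWith("#") && line.startsWith(prefix),
    );
}

// Check after Step 4: build the instant-query URL for a PromQL expression.
// `baseUrl` is an assumption (e.g. "http://localhost:9090").
function buildQueryUrl(baseUrl: string, promql: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/api/v1/query?query=${encodeURIComponent(promql)}`;
}

// True only when the query succeeded and returned at least one series,
// i.e. Prometheus has actually scraped the metric.
function hasSeries(body: PromQueryResponse): boolean {
  return body.status === "success" && (body.data?.result.length ?? 0) > 0;
}
```

In use, the body returned by the application's metrics endpoint would be passed to `hasMetricSample`, and the parsed JSON returned from the URL built by `buildQueryUrl` would be passed to `hasSeries`. Keeping the checks as pure functions lets them run in CI without a live Prometheus instance.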

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is fairly long with substantial code blocks. Some content is efficient (metrics reference table, alert rules), but the traced LLM wrapper is quite verbose and the Langfuse built-in dashboards section explains UI features Claude doesn't need detailed descriptions of. The Grafana dashboard JSON and Prometheus config are appropriately concise. | 2 / 3 |
| Actionability | The skill provides fully executable TypeScript code for metrics setup, a complete instrumented LLM wrapper, working Prometheus config, Grafana dashboard JSON, and alert rules with specific thresholds. All code is copy-paste ready with proper imports and realistic configurations. | 3 / 3 |
| Workflow Clarity | Steps are clearly numbered and sequenced (1-6), but there are no validation checkpoints. After setting up metrics, there's no step to verify metrics are being scraped, no way to confirm the dashboard is working, and no feedback loop for troubleshooting. For a multi-step infrastructure setup involving multiple systems, explicit verification steps are needed. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and a reference table, but it's a monolithic document with ~200 lines of code that could benefit from splitting (e.g., the traced LLM wrapper and Grafana dashboard JSON into separate files). External resource links are provided but no bundle files exist to offload detailed content. | 2 / 3 |
| Total | | 9 / 12 |

Passed

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured skill description with strong trigger terms and clear 'what/when' guidance. Its main weakness is that the capability descriptions could be more concrete, listing specific tools, integrations, or detailed actions rather than high-level categories like 'metrics, dashboards, and alerts'. Overall it would perform well in skill selection scenarios.

Suggestions

Add more specific concrete actions, e.g., 'configure Prometheus/Grafana dashboards for trace latency and token usage, set up alerting rules for error rates and latency thresholds' to improve specificity.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain (Langfuse observability) and mentions some actions (metrics, dashboards, alerts), but doesn't list specific concrete actions like 'create Grafana dashboards', 'configure Prometheus exporters', or 'set up PagerDuty alerts'. The actions remain at a category level rather than being truly concrete. | 2 / 3 |
| Completeness | Clearly answers both 'what' (set up comprehensive observability with metrics, dashboards, and alerts for Langfuse) and 'when' (implementing monitoring for LLM operations, setting up dashboards, configuring alerting for Langfuse integration health), with an explicit 'Use when' clause and trigger phrases. | 3 / 3 |
| Trigger Term Quality | Excellent trigger term coverage with explicit natural phrases: 'langfuse monitoring', 'langfuse metrics', 'langfuse observability', 'monitor langfuse', 'langfuse alerts', 'langfuse dashboard'. These are terms users would naturally say, and the variations cover multiple phrasings. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to the specific focus on Langfuse observability. The combination of 'Langfuse' + 'observability/monitoring/alerts' creates a clear niche that is unlikely to conflict with other skills, even in a large skill library. | 3 / 3 |
| Total | | 11 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed

Repository
jeremylongshore/claude-code-plugins-plus-skills
Reviewed
