Set up comprehensive observability for Langfuse with metrics, dashboards, and alerts. Use when implementing monitoring for LLM operations, setting up dashboards, or configuring alerting for Langfuse integration health. Trigger with phrases like "langfuse monitoring", "langfuse metrics", "langfuse observability", "monitor langfuse", "langfuse alerts", "langfuse dashboard".
Overall score: 80

- Quality: 77% — Does it follow best practices?
- Impact: Pending — No eval scenarios have been run
- Validation: Passed — No known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/saas-packs/langfuse-pack/skills/langfuse-observability/SKILL.md`

Quality
Discovery — 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured skill description with strong trigger terms, clear 'when' guidance, and excellent distinctiveness due to the Langfuse-specific focus. Its main weakness is that the capability descriptions could be more concrete—listing specific tools, integrations, or detailed actions rather than high-level categories like 'metrics, dashboards, and alerts'.
Suggestions
Add more specific concrete actions, e.g., 'configure Prometheus/Grafana dashboards for Langfuse trace latency, token usage, and error rates' instead of the generic 'metrics, dashboards, and alerts'.
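The suggestion above could be realized as a frontmatter sketch along these lines; the exact wording is illustrative, not the skill's actual metadata:

```yaml
# Illustrative only: a more concrete description of the kind the
# suggestion proposes. Wording is an assumption.
description: >-
  Configure Prometheus exporters, Grafana dashboards, and alerting rules for
  Langfuse trace latency, token usage, and error rates. Use when implementing
  monitoring for LLM operations or configuring alerting for Langfuse
  integration health.
```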
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Langfuse observability) and mentions some actions (metrics, dashboards, alerts), but doesn't list specific concrete actions like 'create Grafana dashboards', 'configure Prometheus exporters', or 'set up alerting rules for latency thresholds'. The actions remain somewhat high-level. | 2 / 3 |
| Completeness | Clearly answers both 'what' (set up comprehensive observability with metrics, dashboards, and alerts) and 'when' (implementing monitoring for LLM operations, setting up dashboards, configuring alerting for Langfuse integration health), with an explicit 'Use when' clause and a 'Trigger with phrases' section. | 3 / 3 |
| Trigger Term Quality | Excellent trigger term coverage with explicit natural phrases: 'langfuse monitoring', 'langfuse metrics', 'langfuse observability', 'monitor langfuse', 'langfuse alerts', 'langfuse dashboard'. These are terms users would naturally use, and the description also includes broader terms like 'LLM operations' and 'alerting'. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to the specific focus on Langfuse, which is a niche LLM observability platform. The repeated use of 'Langfuse' as a qualifier makes it very unlikely to conflict with generic monitoring or dashboard skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation — 64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides highly actionable, executable guidance for setting up Langfuse observability with Prometheus, Grafana, and alerting. Its main weaknesses are length (could benefit from splitting detailed configs into separate files) and the absence of validation/verification steps between stages of the setup workflow. The code quality and specificity are strong throughout.
Suggestions
Add verification steps after key stages (e.g., 'curl localhost:3000/metrics to confirm metrics are exposed', 'check Prometheus targets page to verify scraping', 'trigger a test alert to confirm alerting pipeline works').
Split the detailed Grafana dashboard JSON and alert rules into separate referenced files (e.g., GRAFANA_DASHBOARD.json, ALERT_RULES.yml) to keep the SKILL.md as a concise overview.
Trim the tracedLLM wrapper to show only the essential instrumentation pattern, with a note to adapt for specific use cases.
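The first suggestion's verification step could be sketched as a small TypeScript check against the metrics endpoint. The metric names and the port are assumptions for illustration, not names the skill necessarily exposes:

```typescript
// Sketch: confirm a Prometheus exposition payload contains the metrics the
// dashboards expect. EXPECTED_METRICS names are hypothetical.
const EXPECTED_METRICS = [
  "langfuse_trace_latency_seconds",
  "langfuse_token_usage_total",
];

// Extract metric names from Prometheus text exposition format.
// A sample line looks like: metric_name{label="x"} 1.23
function exposedMetricNames(body: string): Set<string> {
  const names = new Set<string>();
  for (const line of body.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const match = trimmed.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)/);
    if (match) names.add(match[1]);
  }
  return names;
}

// Return the expected metrics that are absent from the payload.
function missingMetrics(body: string): string[] {
  const names = exposedMetricNames(body);
  return EXPECTED_METRICS.filter((m) => !names.has(m));
}

// Usage (assumes the app exposes metrics on port 3000):
// const body = await (await fetch("http://localhost:3000/metrics")).text();
// const missing = missingMetrics(body);
// if (missing.length > 0) throw new Error(`Missing: ${missing.join(", ")}`);
```

The same pattern extends to the other checkpoints: query the Prometheus targets API to verify scraping, and fire a synthetic alert to confirm the alerting pipeline.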
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long with substantial code blocks. Some content is efficient (metrics reference table, alert rules), but the full instrumented LLM wrapper in Step 2 is quite verbose and could be trimmed. The error handling table adds value but some entries are somewhat obvious. | 2 / 3 |
| Actionability | Excellent actionability throughout — every step includes fully executable TypeScript code, complete YAML configs, and copy-paste ready Grafana dashboard JSON and Prometheus alert rules. The code is concrete and specific, not pseudocode. | 3 / 3 |
| Workflow Clarity | Steps are clearly numbered and sequenced (1-6), but there are no validation checkpoints or feedback loops. After setting up metrics, dashboards, and alerts, there's no step to verify metrics are being scraped, dashboards are rendering, or alerts are firing correctly. For an observability setup involving multiple systems, verification steps are important. | 2 / 3 |
| Progressive Disclosure | The skill includes external resource links at the bottom, but the body itself is a monolithic ~200-line document. The Grafana dashboard JSON, detailed wrapper code, and alert rules could be split into referenced files, keeping the SKILL.md as a concise overview with pointers. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation — 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 Passed |
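Both warnings are frontmatter issues. A fix might look like the following sketch, where the tool names and the `metadata` key contents are assumptions about this skill's frontmatter:

```yaml
---
name: langfuse-observability
description: Set up comprehensive observability for Langfuse with metrics, dashboards, and alerts.
# List only tool names the agent runtime recognizes:
allowed-tools: Read, Write, Bash
# Move unknown top-level keys under metadata rather than dropping them:
metadata:
  pack: saas-packs/langfuse-pack
---
```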