
langfuse-observability

LLM observability with Langfuse — query traces, generations, costs, metrics, and debug LLM pipelines via the REST API

79

Quality

73%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./langfuse-observability/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description effectively communicates specific capabilities within the Langfuse LLM observability domain and is highly distinctive. However, it lacks an explicit 'Use when...' clause and could benefit from additional natural trigger terms that users might say when needing this skill.

Suggestions

Add a 'Use when...' clause with trigger phrases like 'Use when debugging LLM applications, analyzing token costs, or investigating trace data in Langfuse'

Include additional natural keywords users might say such as 'monitoring', 'token usage', 'latency tracking', or 'LLM debugging'
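Taken together, the two suggestions could be folded into the skill's frontmatter roughly as follows. This is a sketch only: the field names follow the common SKILL.md frontmatter convention (`name`, `description`), and the added "Use when..." wording is illustrative, not the skill's actual text.

```yaml
# Hypothetical revision of the SKILL.md frontmatter, folding in both suggestions.
name: langfuse-observability
description: >
  LLM observability with Langfuse — query traces, generations, costs, metrics,
  and debug LLM pipelines via the REST API. Use when debugging LLM applications,
  monitoring LLM pipelines, analyzing token usage and costs, tracking latency,
  or investigating trace data in Langfuse.
```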

Dimension / Reasoning / Score

Specificity

Lists multiple specific concrete actions: 'query traces, generations, costs, metrics, and debug LLM pipelines'. These are distinct, actionable capabilities within the Langfuse domain.

3 / 3

Completeness

Clearly answers 'what' (query traces, generations, costs, metrics, debug LLM pipelines via REST API) but lacks an explicit 'Use when...' clause to indicate when Claude should select this skill.

2 / 3

Trigger Term Quality

Includes relevant technical terms like 'Langfuse', 'traces', 'generations', 'LLM pipelines', 'REST API', but missing common user variations like 'observability platform', 'token usage', 'latency', or 'monitoring'.

2 / 3

Distinctiveness Conflict Risk

Highly distinctive with 'Langfuse' as a specific product name and 'LLM observability' as a clear niche. Unlikely to conflict with other skills due to the specific tooling and domain focus.

3 / 3

Total: 10 / 12

Passed

Implementation

79%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, actionable skill with excellent executable examples covering the Langfuse API comprehensively. The main weaknesses are the lack of explicit debugging workflows (e.g., 'how to investigate a slow trace') and the monolithic structure that could benefit from splitting deployment/integration content into separate files.

Suggestions

Add a 'Debugging workflow' section showing how to sequence queries when investigating issues (e.g., find error -> get trace -> examine generations -> check scores)

Move Docker deployment and OpenRouter integration sections to separate reference files (e.g., DEPLOYMENT.md, INTEGRATIONS.md) with links from the main skill
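A sketch of the suggested debugging sequence, assuming Langfuse's public REST endpoints (`/api/public/traces`, `/api/public/observations`, `/api/public/scores`) and HTTP Basic auth with a public/secret key pair — check the Langfuse API reference for the exact paths and parameters on your version. The helpers below only construct the requests, so the ordering is visible without a live server:

```python
# Sketch of the suggested debugging sequence against the Langfuse REST API.
# Endpoint paths and parameter names are assumptions based on Langfuse's
# public API; adjust the host and credentials for your deployment.
import base64
import urllib.parse
import urllib.request

HOST = "https://cloud.langfuse.com"  # or your self-hosted instance

def _auth_header(public_key, secret_key):
    # Langfuse uses HTTP Basic auth: public key as user, secret key as password.
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Basic {token}"

def build_request(path, params=None):
    # Builds (but does not send) an authenticated GET request.
    url = f"{HOST}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": _auth_header("pk-...", "sk-...")}  # placeholder keys
    return urllib.request.Request(url, headers=headers)

# 1. Find the error: list recent traces and scan for failures.
step1 = build_request("/api/public/traces", {"limit": 10})
# 2. Get the suspect trace by id.
step2 = build_request("/api/public/traces/TRACE_ID")
# 3. Examine its generations (observations of type GENERATION).
step3 = build_request("/api/public/observations",
                      {"traceId": "TRACE_ID", "type": "GENERATION"})
# 4. Check the scores attached to that trace.
step4 = build_request("/api/public/scores", {"traceId": "TRACE_ID"})
```

Sending each request (e.g. with `urllib.request.urlopen`) and branching on the results would turn this into the explicit workflow the review asks for.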

Dimension / Reasoning / Score

Conciseness

The skill is lean and efficient, providing executable curl commands without explaining what Langfuse is beyond a single sentence. No unnecessary explanations of REST APIs, authentication, or JSON parsing.

3 / 3

Actionability

Every section provides copy-paste ready curl commands with jq filters. The examples are complete and executable, covering setup, queries, filtering, and common use cases with specific endpoints and parameters.

3 / 3

Workflow Clarity

While individual commands are clear, there's no explicit workflow for debugging LLM pipelines or investigating issues. The skill presents isolated queries without guidance on sequencing them for common debugging scenarios or validation steps.

2 / 3

Progressive Disclosure

Content is well-organized with clear sections, but it's a long monolithic file. The Docker deployment and OpenRouter integration sections could be separate files, and there's no reference to external documentation for advanced topics.

2 / 3

Total: 10 / 12

Passed
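To make the "curl commands with jq filters" point from the Actionability row concrete, here is the same kind of filter expressed in Python over a made-up payload. Field names such as `calculatedTotalCost` and `latency` follow the shape of Langfuse observation responses, but the data below is invented for illustration:

```python
# Illustrative post-processing of a Langfuse-style response payload.
# The sample data is made up; field names mirror the shape the skill's
# jq filters operate on (observations with model, cost, latency).
sample = {
    "data": [
        {"model": "gpt-4o", "calculatedTotalCost": 0.0123, "latency": 2.4},
        {"model": "gpt-4o-mini", "calculatedTotalCost": 0.0008, "latency": 0.9},
        {"model": "gpt-4o", "calculatedTotalCost": 0.0201, "latency": 5.1},
    ]
}

# Equivalent of a jq filter selecting slow generations and summing cost:
slow = [o for o in sample["data"] if o["latency"] > 2.0]
total_cost = round(sum(o["calculatedTotalCost"] for o in sample["data"]), 4)
print(len(slow), total_cost)  # 2 generations over 2s; total cost 0.0332
```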

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: ddnetters/homelab-agent-skills (Reviewed)

