
langfuse-core-workflow-a

Execute Langfuse primary workflow: Tracing LLM calls and spans. Use when implementing LLM tracing, building traced AI features, or adding observability to existing LLM applications. Trigger with phrases like "langfuse tracing", "trace LLM calls", "add langfuse to openai", "langfuse spans", "track llm requests".

64

Quality: 77% (Does it follow best practices?)

Impact: No eval scenarios have been run

Security (by Snyk): Passed, no known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/saas-packs/langfuse-pack/skills/langfuse-core-workflow-a/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, highly actionable skill with excellent executable code examples covering multiple integration patterns. Its main weaknesses are verbosity from including both v3 and v4 SDK versions inline plus multiple provider examples that could be split into separate files, and the lack of validation checkpoints to verify traces are actually appearing in the Langfuse dashboard. The error handling table is a nice touch but doesn't substitute for inline verification steps.

Suggestions

Add a validation checkpoint after Step 1 (e.g., 'Verify: Open Langfuse dashboard → Traces tab → confirm the trace appears with model, tokens, and latency before proceeding to manual tracing').

Move the v3 legacy RAG pipeline (Step 3) and the LangChain Python integration (Step 6) into separate reference files to reduce the main skill's token footprint.

Remove explanatory comments that state the obvious (e.g., '// Every call captures: model, input, output, tokens, latency, cost') to improve conciseness.
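The validation checkpoint suggested above could be sketched roughly as follows. This is a minimal illustration, not part of the reviewed skill: `wait_for_trace` and its `fetch_trace` parameter are hypothetical names, and `fetch_trace` stands in for whatever trace lookup the installed Langfuse SDK version provides. The helper polls for the trace and raises if it never appears, so a silent tracing failure stops the workflow before the next step.

```python
import time
from typing import Callable, Optional


def wait_for_trace(
    fetch_trace: Callable[[str], Optional[dict]],  # hypothetical lookup, injected by the caller
    trace_id: str,
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the trace is visible; raise TimeoutError if it never shows up."""
    for attempt in range(1, attempts + 1):
        trace = fetch_trace(trace_id)
        if trace is not None:
            return trace
        # Wait between attempts, but not after the final one.
        if attempt < attempts:
            sleep(delay_seconds)
    raise TimeoutError(
        f"trace {trace_id!r} not visible after {attempts} attempts"
    )
```

Injecting the lookup keeps the checkpoint independent of any one SDK version, which matters here because the reviewed skill covers both v3 and v4.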

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill provides substantial executable code examples which are valuable, but includes both v3 and v4 SDK versions inline (Step 2 and Step 3 cover the same RAG pipeline twice), and the Anthropic/LangChain sections add significant length. Some comments are unnecessary (e.g., 'Every call captures: model, input, output, tokens, latency, cost'). The v3 legacy code could be in a separate reference file. | 2 / 3 |
| Actionability | All code examples are fully executable TypeScript/Python with proper imports, concrete API calls, and realistic patterns. The examples cover multiple real scenarios (OpenAI wrapper, RAG pipeline, streaming, Anthropic, LangChain) with copy-paste ready code. | 3 / 3 |
| Workflow Clarity | Steps are clearly numbered and sequenced, but they read more like independent recipes than a connected workflow. There are no validation checkpoints (e.g., 'verify traces appear in Langfuse dashboard before proceeding') and no error recovery feedback loops despite tracing being an operation where silent failures are common. | 2 / 3 |
| Progressive Disclosure | The skill has a clear structure with sections and an error handling table, plus links to external resources. However, the v3 legacy code and the LangChain Python example could be split into separate reference files to keep the main skill leaner. No bundle files exist to offload this content, and the inline content is quite long (~200 lines of code). | 2 / 3 |
| Total | | 9 / 12 (Passed) |

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured description with excellent trigger terms and clear 'what/when' guidance. Its main weakness is that the capability description is somewhat thin—it mentions 'tracing LLM calls and spans' but doesn't enumerate more specific actions like configuring decorators, adding metadata, or integrating with specific LLM providers beyond OpenAI.

Suggestions

Expand the specificity of capabilities by listing more concrete actions, e.g., 'Configure trace decorators, add span metadata, set up Langfuse with OpenAI/Anthropic SDKs, create scored evaluations.'

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | It names the domain (Langfuse, LLM tracing) and mentions some actions ('Tracing LLM calls and spans'), but doesn't list multiple concrete actions like creating traces, adding span metadata, configuring decorators, or viewing dashboards. | 2 / 3 |
| Completeness | Clearly answers both 'what' (tracing LLM calls and spans via Langfuse) and 'when' (implementing LLM tracing, building traced AI features, adding observability), with explicit trigger phrases provided. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'langfuse tracing', 'trace LLM calls', 'add langfuse to openai', 'langfuse spans', 'track llm requests'. These cover natural variations a user would actually say when needing this skill. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to the specific tool name 'Langfuse' and the niche of LLM tracing/observability. Unlikely to conflict with other skills given the explicit product-specific trigger terms. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |

Repository
jeremylongshore/claude-code-plugins-plus-skills
Reviewed
