
langfuse-cost-tuning

Monitor and optimize LLM costs using Langfuse analytics and dashboards. Use when tracking LLM spending, identifying cost anomalies, or implementing cost controls for AI applications. Trigger with phrases like "langfuse costs", "LLM spending", "track AI costs", "langfuse token usage", "optimize LLM budget".

64

Quality: 77% (Does it follow best practices?)

Impact: No eval scenarios have been run

Security by Snyk: Passed, no known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/saas-packs/langfuse-pack/skills/langfuse-cost-tuning/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, highly actionable skill with executable TypeScript examples covering the full cost-monitoring workflow. Its main weaknesses are verbosity (some explanatory sections could be trimmed) and the lack of validation checkpoints between steps. The content would benefit from splitting the larger code blocks into referenced files and adding verification steps to confirm each stage works before proceeding.

Suggestions

Add validation checkpoints between steps, e.g., after Step 1 verify cost data appears in Langfuse before proceeding to Step 2's querying logic.

Trim the 'How Langfuse Tracks Costs' section to 1-2 sentences—Claude doesn't need the explanation of what observation types are or how auto-calculation works.

Consider splitting the model routing (Step 3) and budget alerts (Step 4) into separate referenced files to reduce the main skill's token footprint.
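
The first suggestion above (a validation checkpoint after Step 1) could take roughly the following shape. This is a minimal sketch, not code from the reviewed skill: the `ObservationSummary` type, its field names, and the `checkCostCaptured` function are hypothetical, and fetching the observations from Langfuse is left to the caller.

```typescript
// Hypothetical shape: only the fields this check needs. Real Langfuse
// observation objects carry more fields and may name these differently.
interface ObservationSummary {
  id: string;
  model: string | null;
  totalCost: number | null; // USD; null when no cost was calculated
}

interface CheckpointResult {
  ok: boolean;           // true when enough observations carry a cost
  withCost: number;      // count of observations with a positive cost
  missingCost: string[]; // ids of observations with no usable cost
}

// Passes only if at least `minWithCost` observations have a positive cost.
function checkCostCaptured(
  observations: ObservationSummary[],
  minWithCost = 1,
): CheckpointResult {
  const missingCost = observations
    .filter((o) => o.totalCost === null || !(o.totalCost > 0))
    .map((o) => o.id);
  const withCost = observations.length - missingCost.length;
  return { ok: withCost >= minWithCost, withCost, missingCost };
}
```

An agent would run a check like this on a handful of recent observations after Step 1 and stop, rather than continue to Step 2, when `ok` is false; `missingCost` then points at the observations to inspect.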

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill includes some unnecessary explanation (e.g., 'How Langfuse Tracks Costs' section explains concepts Claude likely knows, and the 'Understanding of LLM pricing models' prerequisite is filler). The code examples are substantial but justified given the complexity. The Dashboard Features section is somewhat redundant as it describes UI features without actionable guidance. | 2 / 3 |
| Actionability | All four steps provide fully executable TypeScript code with concrete examples: token capture, cost querying, model routing, and budget alerts are all copy-paste ready with realistic configurations and specific model names/pricing. | 3 / 3 |
| Workflow Clarity | The four steps are clearly sequenced and logically ordered, but there are no explicit validation checkpoints or feedback loops. For example, Step 1 doesn't verify that costs are actually being tracked before proceeding to Step 2, and there's no guidance on what to do if the Metrics API returns unexpected results. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and a helpful summary table, but it's quite long (~200 lines of code) and monolithic. The model routing and budget alert scripts could reasonably be split into separate referenced files. The Resources section at the end provides external links but no internal bundle references exist. | 2 / 3 |

Total: 9 / 12 (Passed)

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured skill description with strong trigger terms and clear 'what/when' guidance. Its main weakness is that the capability description could be more specific about concrete actions beyond 'monitor and optimize'. The explicit trigger phrase list and Langfuse-specific terminology make it highly distinctive and easy for Claude to match correctly.

Suggestions

Add more specific concrete actions such as 'create cost dashboards, set spending alerts, compare model pricing, analyze token usage trends' to improve specificity.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (LLM cost monitoring via Langfuse) and mentions some actions like 'monitor', 'optimize', 'tracking', 'identifying cost anomalies', 'implementing cost controls', but doesn't list multiple concrete specific actions (e.g., creating dashboards, setting alerts, generating reports, comparing model costs). | 2 / 3 |
| Completeness | Clearly answers both 'what' (monitor and optimize LLM costs using Langfuse analytics and dashboards) and 'when' (tracking LLM spending, identifying cost anomalies, implementing cost controls) with explicit trigger phrases provided. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms including 'langfuse costs', 'LLM spending', 'track AI costs', 'langfuse token usage', 'optimize LLM budget'; these are phrases users would naturally say. Also includes relevant keywords like 'cost anomalies', 'cost controls', and 'analytics'. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to the specific combination of Langfuse + LLM cost monitoring. The trigger terms like 'langfuse costs' and 'langfuse token usage' are very specific and unlikely to conflict with other skills. | 3 / 3 |

Total: 11 / 12 (Passed)

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |

Total: 9 / 11 (Passed)

Repository
jeremylongshore/claude-code-plugins-plus-skills
Reviewed
