Auto-generates an LLM usage monitoring page in a PM admin dashboard. Tokuin CLI-based token/cost/latency tracking + user ranking system + inactive user tracking + data-driven PM insights + Cmd+K global search + per-user drilldown navigation. Supports OpenAI/Anthropic/Gemini/OpenRouter.
67
Quality: 52% (Does it follow best practices?)
Impact: 99%, 1.86x average score across 3 eval scenarios
Critical: Do not install without reviewing
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./.agent-skills/llm-monitoring-dashboard/SKILL.md`

Quality
Discovery
50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description excels at specificity with detailed feature enumeration and has a clear distinctive niche for LLM monitoring dashboards. However, it critically lacks any 'Use when...' guidance, making it difficult for Claude to know when to select this skill. The technical jargon may also reduce discoverability from natural user queries.
Suggestions
Add a 'Use when...' clause with trigger terms like 'track API costs', 'monitor LLM usage', 'build usage dashboard', 'analyze token spending'
Include more natural user language variations such as 'API spending', 'usage analytics', 'cost monitoring' alongside the technical terms
Consider simplifying 'Tokuin CLI-based' to explain what it means or remove if not essential for skill selection
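As a sketch of what the first suggestion might look like in practice, here is an illustrative rewrite of the skill's frontmatter description with an explicit 'Use when...' clause. The wording and the frontmatter fields shown are assumptions for illustration, not the skill's actual file:

```yaml
---
name: llm-monitoring-dashboard
description: >
  Generates an LLM usage monitoring page in a PM admin dashboard:
  token/cost/latency tracking, user rankings, inactive-user tracking,
  and per-user drilldown. Supports OpenAI/Anthropic/Gemini/OpenRouter.
  Use when the user asks to track API costs, monitor LLM usage or
  spending, build a usage dashboard, or analyze token consumption.
---
```

The trailing 'Use when...' sentence carries the trigger terms ('track API costs', 'monitor LLM usage', 'usage dashboard') that the review identifies as missing.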
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'token/cost/latency tracking', 'user ranking system', 'inactive user tracking', 'data-driven PM insights', 'Cmd+K global search', 'per-user drilldown navigation'. Very detailed about what it builds. | 3 / 3 |
| Completeness | Describes WHAT it does comprehensively but completely lacks a 'Use when...' clause or any explicit trigger guidance. Per rubric guidelines, missing explicit trigger guidance should cap completeness at 2, and this has no 'when' component at all. | 1 / 3 |
| Trigger Term Quality | Includes some relevant terms ('LLM usage monitoring', 'token', 'cost', 'latency', 'OpenAI/Anthropic/Gemini/OpenRouter') but uses technical jargon ('Tokuin CLI-based', 'PM admin dashboard') that users may not naturally say. Missing common variations like 'API costs', 'usage dashboard', 'spending tracker'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Very specific niche: LLM usage monitoring dashboards with specific features (Tokuin CLI, PM admin context, multi-provider support). Unlikely to conflict with other skills due to its narrow, well-defined scope. | 3 / 3 |
| Total | | 9 / 12 (Passed) |
Implementation
55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill excels at actionability and workflow clarity, with comprehensive, executable code and a well-structured safety-first approach. However, it severely violates conciseness and progressive-disclosure principles: the document is excessively long, with inline code blocks that should live in external files, making it a poor fit for context-window efficiency despite its technical completeness.
Suggestions
Extract the HTML dashboard (index.html content) to a separate referenced file and provide only a brief description with a link in SKILL.md
Move the lengthy bash scripts (safety-guard.sh, collect-metrics.sh, etc.) to a /scripts directory and reference them rather than inlining the full content
Remove explanatory text that Claude already knows (e.g., what environment variables are, how cron works, what JSONL format is)
Create a separate REFERENCE.md for the full TypeScript/API route implementations and keep only essential snippets in the main skill
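A sketch of the progressive-disclosure layout these suggestions describe, as it might appear in a slimmed SKILL.md. The file names below come from the review (safety-guard.sh, collect-metrics.sh, REFERENCE.md); the directory layout and link wording are illustrative assumptions:

```markdown
## Dashboard UI

The full dashboard page lives in [references/index.html](references/index.html);
it renders the token/cost/latency tables and the Cmd+K search overlay.

## Collection scripts

Run the collectors from the scripts directory:

- [scripts/safety-guard.sh](scripts/safety-guard.sh) — preflight checks (Step 0)
- [scripts/collect-metrics.sh](scripts/collect-metrics.sh) — pulls usage data

## Full implementations

TypeScript API routes and data models are documented in [REFERENCE.md](REFERENCE.md).
```

This keeps SKILL.md to orientation and sequencing while the bulky content loads only when the agent follows a link.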
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~1500+ lines, with extensive inline code that could be referenced externally. It includes unnecessary explanations (e.g., explaining what environment variables are, what cron does) and massive HTML/CSS/JS blocks that bloat the document significantly. | 1 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready code throughout: complete bash scripts, Python modules, TypeScript files, and a full HTML dashboard. Every step has concrete, runnable commands with expected outputs documented. | 3 / 3 |
| Workflow Clarity | The workflow is exceptionally clear, with numbered steps, a mandatory safety gate (Step 0) that halts on failures, explicit validation checkpoints throughout, and clear sequencing from installation through deployment. Feedback loops are built into the safety-guard script. | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of text with no external file references for detailed content. The massive HTML dashboard (~400 lines), CSS, and JavaScript should be in separate referenced files. All content is inline rather than appropriately split across files. | 1 / 3 |
| Total | | 8 / 12 (Passed) |
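The workflow-clarity row credits a Step 0 safety gate that halts on failures. A minimal sketch of that halt-on-failure pattern is below; the function name and the specific checks are hypothetical, not taken from the skill's actual safety-guard.sh:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "Step 0" safety gate: run each preflight
# check in order and stop the whole workflow on the first failure.
set -u

run_safety_gate() {
  local checks=("$@")
  local c
  for c in "${checks[@]}"; do
    if ! eval "$c" >/dev/null 2>&1; then
      echo "SAFETY GATE FAILED: $c" >&2
      return 1
    fi
  done
  echo "safety gate passed"
}

# Example: require bash and a writable temp dir before collecting metrics.
run_safety_gate "command -v bash" "test -w /tmp"
```

Returning non-zero from the gate lets a calling script use `set -e` or an explicit `|| exit 1` to guarantee that later steps never run after a failed check.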
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure: 9 / 11 passed
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (1381 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 9 / 11 Passed |
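The metadata_version warning suggests a straightforward fix: add a version field under metadata in the frontmatter. The exact schema is inferred from the validator's field path ('metadata.version') and the version value is a placeholder:

```yaml
---
name: llm-monitoring-dashboard
metadata:
  version: 1.0.0
---
```

Versioning the skill also makes it possible to track which revision produced a given review score.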