You are an AI product engineer who has shipped LLM features to millions of users. You've debugged hallucinations at 3am, optimized prompts to reduce costs by 80%, and built safety systems that caught thousands of harmful outputs. You know that demos are easy and production is hard.
Overall score: 25 (7%)
Does it follow best practices?
Impact: Pending (no eval scenarios have been run)
Passed (no known issues)
Optimize this skill with Tessl: `npx tessl skill review --optimize ./skills/ai-product/SKILL.md`

Quality
Discovery — 0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is written as a first/second-person persona backstory rather than a functional skill description. It violates the third-person voice requirement ('You are an AI product engineer'), provides no concrete actions the skill performs, and completely lacks a 'Use when...' clause. It would be nearly impossible for Claude to reliably select this skill from a list of alternatives.
Suggestions
Rewrite in third person and list specific concrete actions (e.g., 'Reviews LLM prompts for cost efficiency, diagnoses hallucination issues, designs safety guardrails for model outputs').
Add an explicit 'Use when...' clause with natural trigger terms (e.g., 'Use when the user asks about prompt engineering, LLM hallucination debugging, AI safety systems, or production deployment of language models').
Remove the persona narrative and focus on what the skill does and when it should be selected, narrowing the scope to a distinct niche rather than covering all of AI product engineering.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, narrative language about past experience ('shipped LLM features', 'debugged hallucinations', 'optimized prompts') but never lists concrete actions the skill performs. It reads as a persona backstory rather than a capability description. | 1 / 3 |
| Completeness | The description fails to answer both 'what does this do' and 'when should Claude use it'. There is no 'Use when...' clause or equivalent trigger guidance, and the 'what' is buried in persona-style storytelling rather than stated explicitly. | 1 / 3 |
| Trigger Term Quality | While it mentions some domain terms like 'hallucinations', 'prompts', 'safety systems', and 'LLM', these are embedded in a narrative rather than presented as trigger terms. A user asking for help with prompt optimization or safety systems would not reliably match this description over others. | 1 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely broad, covering LLM features, hallucinations, prompt optimization, safety systems, and production engineering. It could conflict with numerous other skills and provides no clear niche or distinct trigger. | 1 / 3 |
| Total | | 4 / 12 (Passed) |
Implementation — 14%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads like a table of contents or outline rather than a functional skill document. It names important AI product engineering concepts (structured output, prompt versioning, validation) but provides zero executable code, no concrete examples, no workflows, and no references to deeper materials. The Sharp Edges table is particularly frustrating as it lists critical issues with solution columns that contain only truncated code comments rather than actual solutions.
Suggestions
Add complete, executable code examples for each pattern (e.g., a full JSON schema validation snippet, a streaming implementation, a prompt versioning example with test cases).
Fill in the Sharp Edges table solutions with actual code blocks—currently they are truncated comments like '# Always validate output:' with no following code.
Define at least one end-to-end workflow (e.g., 'Building a validated LLM feature') with numbered steps and explicit validation checkpoints.
Either link to separate detailed files for each pattern/anti-pattern or expand inline content to be substantive enough to act on.
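To make the first suggestion concrete, here is a minimal sketch of the kind of output-validation snippet the skill could include. All names (`REQUIRED_FIELDS`, `validate_llm_output`) and the schema itself are illustrative assumptions, not content from the reviewed skill; only the standard library is used:

```python
import json

# Hypothetical response schema: the keys we require and their expected types.
REQUIRED_FIELDS = {"summary": str, "confidence": float}

def validate_llm_output(raw: str) -> dict:
    """Parse raw model output and validate it; raise ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}") from exc
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"Missing required field: {field!r}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"Field {field!r} is not of type {expected_type.__name__}")
    return data

# Validation checkpoint: malformed output is rejected here instead of
# propagating downstream — the behavior the Sharp Edges table only hints at.
validated = validate_llm_output('{"summary": "ok", "confidence": 0.9}')
```

A real implementation would likely use a schema library and add a retry or fallback path, but even a snippet at this level of detail would make the 'Always validate output' advice actionable.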
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The persona description is repeated from the frontmatter, wasting tokens. The patterns and anti-patterns sections are terse but lack substance—they name concepts without providing enough detail to be useful, making them simultaneously too verbose (persona repetition) and too thin (pattern descriptions). | 2 / 3 |
| Actionability | The skill is almost entirely abstract. Patterns like 'Use function calling or JSON mode with schema validation' and 'Version prompts in code' provide no concrete code, commands, or executable examples. The Sharp Edges table references code comments ('# Always validate output:') but never shows actual code. Nothing is copy-paste ready. | 1 / 3 |
| Workflow Clarity | There are no sequenced steps, no workflows, and no validation checkpoints. The Sharp Edges table lists issues and hints at solutions with truncated code comments but never provides actual procedures. For a skill covering production AI systems with critical safety concerns, the absence of any workflow or feedback loop is a significant gap. | 1 / 3 |
| Progressive Disclosure | The content is a flat list of headings with shallow bullet points—no references to deeper materials, no links to examples or detailed guides. The structure exists (headings, table) but content under each heading is too thin to constitute meaningful organization, and there's no navigation to supplementary resources. | 1 / 3 |
| Total | | 5 / 12 (Passed) |
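The 'Version prompts in code' pattern flagged above could likewise be sketched in a few lines. The names here (`PROMPTS`, `render_prompt`) and the template contents are hypothetical stand-ins, not part of the reviewed skill:

```python
# Prompts stored in code, keyed by an explicit version, so that changes are
# diffable in review and each version can carry its own regression tests.
PROMPTS = {
    "summarize/v1": "Summarize the following text in one sentence:\n{text}",
    "summarize/v2": (
        "Summarize the following text in one sentence. "
        "Respond with JSON: {{\"summary\": \"...\"}}\n{text}"
    ),
}

def render_prompt(name: str, **kwargs: str) -> str:
    """Look up a versioned prompt template and fill in its variables."""
    return PROMPTS[name].format(**kwargs)

# A prompt-level test case: pinning the version makes regressions explicit.
rendered = render_prompt("summarize/v1", text="hello world")
assert rendered.endswith("hello world")
```

Pairing each version with a test case like the assertion above is what turns 'version prompts in code' from a named concept into a procedure an agent can follow.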
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 (Passed) |
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.