Use when designing prompts for LLMs, optimizing model performance, building evaluation frameworks, or implementing advanced prompting techniques like chain-of-thought, few-shot learning, or structured outputs.
Install with Tessl CLI
npx tessl i github:jeffallan/claude-skills --skill prompt-engineer69
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill

Agent success when using this skill
Discovery
64%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description has strong trigger term coverage with relevant prompt engineering terminology, but it is structured backwards: it contains only a 'Use when...' clause without first explaining what the skill actually does. The lack of concrete actions or outputs (e.g., 'generates optimized prompts', 'creates evaluation rubrics') makes it unclear what Claude will produce when using this skill.
Suggestions
Add a 'what it does' statement before the 'Use when...' clause, e.g., 'Generates optimized prompts, creates evaluation rubrics, and refines LLM instructions for better outputs.'
Replace vague verbs like 'designing' and 'building' with specific actions like 'writes', 'generates', 'analyzes', 'refactors'
Clarify the output format - does this skill produce prompt templates, evaluation criteria, or analysis of existing prompts?
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (prompts for LLMs) and lists some techniques (chain-of-thought, few-shot learning, structured outputs), but doesn't describe concrete actions - uses vague verbs like 'designing', 'optimizing', 'building' without specifying what outputs or transformations occur. | 2 / 3 |
| Completeness | Has a 'Use when...' clause which addresses the 'when' question, but the 'what does this do' is weak - it only describes scenarios/triggers without explaining what the skill actually produces or accomplishes. | 2 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'prompts', 'LLMs', 'model performance', 'evaluation', 'chain-of-thought', 'few-shot learning', 'structured outputs' - these are terms practitioners naturally use when seeking prompt engineering help. | 3 / 3 |
| Distinctiveness / Conflict Risk | The prompt engineering domain is fairly specific, but 'optimizing model performance' and 'building evaluation frameworks' could overlap with ML/AI skills more broadly. The specific techniques mentioned help distinguish it somewhat. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Implementation
57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides good structural organization and clear progressive disclosure to reference materials, but falls short on actionability by lacking concrete examples, prompt templates, or executable code. The workflow is present but missing validation checkpoints, and some sections (Role Definition, Knowledge Reference) add verbosity without proportional value.
Suggestions
Add 2-3 concrete prompt examples showing before/after optimization or demonstrating key patterns (e.g., a zero-shot vs few-shot comparison with actual prompts)
Include a specific validation checkpoint in the workflow, such as 'If accuracy < 80% on test set, identify failure patterns before iterating'
Remove or condense the 'Role Definition' section—Claude doesn't need to be told it's an expert prompt engineer
Replace the 'Knowledge Reference' keyword list with a brief note about which models/techniques are covered in the reference files
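The first two suggestions above can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the skill itself: the sentiment-classification prompts, the toy test set, the 80% threshold, and the `stub_predict` placeholder (which stands in for a real model call) are illustrative examples only.

```python
# Illustrative zero-shot vs. few-shot prompt pair, plus a simple validation
# checkpoint of the kind the suggestions describe. All names and data are
# hypothetical examples, not content from the reviewed skill.

ZERO_SHOT = "Classify the sentiment of this review as positive or negative:\n{text}"

FEW_SHOT = (
    "Classify the sentiment of each review as positive or negative.\n\n"
    "Review: The battery died after two days.\nSentiment: negative\n\n"
    "Review: Setup took thirty seconds and it just works.\nSentiment: positive\n\n"
    "Review: {text}\nSentiment:"
)

def evaluate(predict, test_set, threshold=0.8):
    """Score a predict(text) -> label callable against labeled examples.

    Returns (accuracy, failures); a non-empty failures list signals that
    failure patterns should be inspected before the next iteration.
    """
    hits = sum(predict(text) == label for text, label in test_set)
    accuracy = hits / len(test_set)
    if accuracy < threshold:
        failures = [(t, l) for t, l in test_set if predict(t) != l]
        return accuracy, failures
    return accuracy, []

# Stub standing in for an LLM call, so the sketch runs without an API key.
def stub_predict(text):
    return "negative" if "broke" in text else "positive"

test_set = [
    ("It broke in a week.", "negative"),
    ("Love the screen.", "positive"),
]
accuracy, failures = evaluate(stub_predict, test_set)
```

In a real skill, `ZERO_SHOT` and `FEW_SHOT` would be filled with `.format(text=...)` and sent to the model, and the checkpoint would gate whether to iterate on the prompt or stop.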
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary sections like 'Role Definition' that restates obvious context, and the 'Knowledge Reference' section at the end is a keyword dump that adds little value. The MUST DO/MUST NOT DO lists could be tighter. | 2 / 3 |
| Actionability | The skill provides structured guidance and clear constraints, but lacks concrete executable examples. No actual prompt templates, code snippets, or copy-paste ready content is provided - it describes what to do rather than showing how with specific examples. | 2 / 3 |
| Workflow Clarity | The 5-step core workflow is clearly sequenced, but lacks validation checkpoints or feedback loops. There's no explicit guidance on what to do when evaluation fails, how to measure 'quality metrics', or when to stop iterating. | 2 / 3 |
| Progressive Disclosure | Excellent use of a reference table pointing to specific files for detailed topics. Clear one-level-deep references with 'Load When' guidance. The main skill serves as a concise overview with well-signaled navigation to deeper content. | 3 / 3 |
| Total | | 9 / 12 (Passed) |
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 11 / 11 checks passed. No warnings or errors.
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.