Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, improving LLM outputs, or designing production prompt templates.
67
54%
Does it follow best practices?
Impact
85%
1.16x Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/prompt-engineering-patterns/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has good structural completeness with an explicit 'Use when...' clause and identifies the domain clearly. However, it lacks specificity in the concrete actions it covers (what specific prompt engineering techniques?) and uses somewhat broad language that could overlap with other LLM-related skills. The word 'Master' at the beginning is slightly promotional rather than descriptive.
Suggestions
Replace 'Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability' with specific concrete actions like 'Design chain-of-thought prompts, craft few-shot examples, structure system prompts, implement output formatting constraints'.
Expand trigger terms in the 'Use when' clause to include natural variations like 'system prompt', 'few-shot', 'chain of thought', 'prompt design', 'prompt writing', or 'AI instructions'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain ('prompt engineering') and mentions some actions ('optimizing prompts', 'improving LLM outputs', 'designing production prompt templates'), but these are fairly high-level and not concrete specific actions like 'chain-of-thought prompting, few-shot examples, structured output formatting'. | 2 / 3 |
| Completeness | The description clearly answers both 'what' ('Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production') and 'when' ('Use when optimizing prompts, improving LLM outputs, or designing production prompt templates') with an explicit 'Use when...' clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'prompt engineering', 'LLM', 'prompts', 'production prompt templates', but misses many natural variations users might say such as 'prompt design', 'system prompt', 'few-shot', 'chain of thought', 'prompt optimization', 'AI instructions', or 'prompt writing'. | 2 / 3 |
| Distinctiveness Conflict Risk | While 'prompt engineering' is a recognizable niche, terms like 'improving LLM outputs' and 'maximize LLM performance' are broad enough to potentially overlap with skills related to general LLM usage, AI coding assistance, or model evaluation. The description could be more distinctive by specifying unique techniques or artifacts. | 2 / 3 |
| Total | | 9 / 12 Passed |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is highly actionable with excellent, executable code examples covering a wide range of prompt engineering patterns. However, it is severely bloated—much of the content explains concepts Claude already knows (what CoT is, what few-shot learning is, generic best practices). The monolithic structure with no supporting bundle files makes it a poor fit for a SKILL.md that should be a concise overview with references to detailed materials.
Suggestions
Cut the 'Core Capabilities' bullet-point section entirely—it duplicates what the code patterns already demonstrate and explains concepts Claude already knows.
Remove or drastically shorten 'Best Practices', 'Common Pitfalls', and 'Success Metrics' sections—these are generic prompt engineering knowledge that Claude already possesses.
Split detailed patterns into separate bundle files (e.g., patterns/structured-output.md, patterns/few-shot.md) and keep SKILL.md as a concise overview with links.
Add an explicit iterative workflow with validation steps: e.g., 1) Start with simple prompt, 2) Test on 5+ diverse inputs, 3) Measure accuracy, 4) If <threshold, escalate to next pattern level, 5) Re-test.
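The iterative workflow suggested above could be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the level names in `PATTERN_LEVELS`, the `escalate` and `accuracy` helpers, the 0.9 threshold, and the stub predictor are all assumptions made for the example, and a real version would call an LLM where the stub returns a fixed label.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical escalation ladder, ordered from simplest to most elaborate.
PATTERN_LEVELS: List[str] = ["simple", "few_shot", "chain_of_thought"]

Case = Tuple[str, str]            # (input text, expected label)
Predictor = Callable[[str], str]  # maps input text to a predicted label


def accuracy(predict: Predictor, cases: Sequence[Case]) -> float:
    """Fraction of cases where the predictor returns the expected label."""
    hits = sum(1 for text, expected in cases if predict(text) == expected)
    return hits / len(cases)


def escalate(
    build_predictor: Callable[[str], Predictor],
    cases: Sequence[Case],
    threshold: float = 0.9,  # assumed target accuracy
    min_cases: int = 5,      # "test on 5+ diverse inputs"
) -> Tuple[str, List[Tuple[str, float]]]:
    """Try each pattern level in order and stop at the first one that
    reaches the threshold. Returns the last level tried plus the score
    history, so the caller can see whether the threshold was ever met."""
    if len(cases) < min_cases:
        raise ValueError(f"need at least {min_cases} test cases, got {len(cases)}")
    history: List[Tuple[str, float]] = []
    for level in PATTERN_LEVELS:
        score = accuracy(build_predictor(level), cases)  # test, then measure
        history.append((level, score))
        if score >= threshold:
            break  # good enough; no need to escalate further
    return history[-1][0], history


def stub_builder(level: str) -> Predictor:
    """Stand-in for real prompt templates: the 'simple' level always answers
    'positive'; higher levels also recognise the word 'bad'."""
    if level == "simple":
        return lambda text: "positive"
    return lambda text: "negative" if "bad" in text else "positive"


CASES: List[Case] = [
    ("good", "positive"),
    ("bad", "negative"),
    ("great", "positive"),
    ("bad service", "negative"),
    ("fine", "positive"),
]

chosen, history = escalate(stub_builder, CASES)
print(chosen, history)
```

Returning the full history rather than only the chosen level keeps the measure step visible, which is the validation checkpoint the suggestion asks for.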
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This is extremely verbose at ~400+ lines. It explains concepts Claude already knows well (what few-shot learning is, what chain-of-thought is, what system prompts are, basic prompt engineering principles). The 'Best Practices', 'Common Pitfalls', and 'Success Metrics' sections are generic knowledge that adds no novel value. Much of the content reads like a tutorial/textbook rather than a skill file. | 1 / 3 |
| Actionability | The skill provides numerous fully executable Python code examples with real libraries (anthropic, langchain, pydantic). Code is copy-paste ready with proper imports, type hints, and error handling. The patterns are concrete and specific. | 3 / 3 |
| Workflow Clarity | While individual patterns are clear, there's no overall workflow for when to apply which pattern or how to iterate through prompt optimization. The 'Progressive Disclosure' pattern (Pattern 4) shows a nice escalation sequence, and Pattern 5 has error recovery, but there are no explicit validation checkpoints for the overall prompt engineering process (e.g., test -> measure -> refine cycle with concrete steps). | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of content with no bundle files to offload detail into. All six patterns, integration patterns, performance optimization, best practices, pitfalls, and metrics are crammed into a single file. The 'Core Capabilities' section duplicates what the patterns already show. Content should be split into separate reference files for each pattern category. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
bbc5ade