
prompt-engineer

Writes, refactors, and evaluates prompts for LLMs — generating optimized prompt templates, structured output schemas, evaluation rubrics, and test suites. Use when designing prompts for new LLM applications, refactoring existing prompts for better accuracy or token efficiency, implementing chain-of-thought or few-shot learning, creating system prompts with personas and guardrails, building JSON/function-calling schemas, or developing prompt evaluation frameworks to measure and improve model performance.
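As a rough illustration of the capabilities the description names (few-shot examples and structured output schemas), a prompt template might be assembled as below. This is a minimal sketch for illustration only; the example texts, labels, and schema fields are assumptions, not taken from the skill itself.

```python
import json

# Hypothetical few-shot examples for a sentiment-classification prompt.
# These are illustrative assumptions, not part of the skill's content.
FEW_SHOT_EXAMPLES = [
    {"input": "The battery died after two hours.", "label": "negative"},
    {"input": "Setup took thirty seconds. Flawless.", "label": "positive"},
]

# A structured-output schema in JSON Schema style, of the kind used for
# function-calling / JSON-mode responses.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["label", "confidence"],
}

def build_prompt(text: str) -> str:
    """Assemble a few-shot prompt that asks for schema-conforming JSON."""
    shots = "\n".join(
        f"Input: {ex['input']}\n"
        f"Output: {json.dumps({'label': ex['label'], 'confidence': 1.0})}"
        for ex in FEW_SHOT_EXAMPLES
    )
    return (
        "Classify the sentiment of the input. "
        f"Respond with JSON matching this schema:\n{json.dumps(OUTPUT_SCHEMA)}\n\n"
        f"{shots}\n\nInput: {text}\nOutput:"
    )

prompt = build_prompt("Arrived broken.")
```

Embedding the schema in the prompt and demonstrating conforming outputs in the shots is one common way to steer a model toward parseable JSON; production setups would typically also validate the response against the schema.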

Overall score: 100

Quality: 100% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Security (by Snyk): Passed (no known issues)


Quality

Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This skill description excels across all dimensions. It uses proper third-person voice, lists comprehensive, specific capabilities, includes rich natural trigger terms that practitioners would use, and has an explicit "Use when..." clause with multiple concrete scenarios. The description is distinctive enough to avoid conflicts with other skills while remaining comprehensive about its scope.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific, concrete actions: "Writes, refactors, and evaluates prompts", "generating optimized prompt templates, structured output schemas, evaluation rubrics, and test suites". These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what ("Writes, refactors, and evaluates prompts for LLMs") and when, with an explicit "Use when..." clause covering six distinct trigger scenarios, including designing new prompts, refactoring, implementing specific techniques, and building evaluation frameworks. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: "prompts", "LLM", "chain-of-thought", "few-shot learning", "system prompts", "JSON", "function-calling", "token efficiency", "evaluation frameworks". These match how practitioners naturally discuss prompt engineering. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche focused specifically on prompt engineering for LLMs. Specific triggers like "chain-of-thought", "few-shot learning", "system prompts", and "function-calling schemas" clearly differentiate it from general coding or documentation skills. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation: 100%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is an exemplary skill file that demonstrates best practices across all dimensions. It provides actionable guidance with concrete examples, maintains token efficiency by avoiding unnecessary explanations, includes clear validation checkpoints in the workflow, and uses progressive disclosure effectively through a well-organized reference table. The MUST DO/MUST NOT DO constraints are specific and practical.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is lean and efficient, avoiding explanations of concepts Claude already knows. Each section serves a clear purpose, without padding or unnecessary context about what prompts or LLMs are. | 3 / 3 |
| Actionability | Provides concrete, copy-paste-ready prompt examples with clear before/after comparisons. The zero-shot vs. few-shot examples and the optimization example are fully executable templates with placeholder syntax. | 3 / 3 |
| Workflow Clarity | The five-step core workflow is clearly sequenced, with an explicit validation checkpoint at step 3 (accuracy below 80% triggers failure-pattern analysis). The iterate step explicitly states "one change at a time" for debugging discipline. | 3 / 3 |
| Progressive Disclosure | Excellent structure, with a clear reference table pointing to six separate files for detailed guidance, each with a "Load When" context. The main skill provides a complete overview while appropriately deferring deep dives to one-level-deep references. | 3 / 3 |
| Total | | 12 / 12 |

Passed
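The validation checkpoint described in the Workflow Clarity row (fall back to failure-pattern analysis when accuracy drops below 80%, then change one thing at a time) can be sketched as a simple evaluation loop. The test cases and the `score_case` grader below are hypothetical stand-ins for a real model call plus grading, assumed purely for illustration:

```python
# Sketch of an eval loop with an accuracy < 80% checkpoint.
# All data here is illustrative; a real grader would call the model
# with the prompt under test and compare its parsed output.
TEST_CASES = [
    {"input": "Great product", "expected": "positive"},
    {"input": "Terrible support", "expected": "negative"},
    {"input": "It works", "expected": "positive"},
    {"input": "Never again", "expected": "negative"},
    {"input": "Best purchase ever", "expected": "positive"},
]

def score_case(prompt_version: str, case: dict) -> bool:
    # Stand-in grader: pretend prompt v1 misses two cases that a
    # revised v2 gets right.
    baseline_misses = {"It works", "Never again"}
    if prompt_version == "v1" and case["input"] in baseline_misses:
        return False
    return True

def evaluate(prompt_version: str, threshold: float = 0.80):
    """Score all cases; below threshold, return failures for analysis."""
    results = [(c, score_case(prompt_version, c)) for c in TEST_CASES]
    accuracy = sum(ok for _, ok in results) / len(results)
    if accuracy < threshold:
        # Checkpoint: collect failing inputs for pattern analysis
        # before revising the prompt (one change at a time).
        return accuracy, [c["input"] for c, ok in results if not ok]
    return accuracy, []

acc_v1, failures_v1 = evaluate("v1")  # 3/5 = 0.6, below threshold
acc_v2, failures_v2 = evaluate("v2")  # 5/5 = 1.0, passes
```

Returning the concrete failing inputs, rather than just the score, is what makes the "analyze failure patterns, then make one change" discipline practical to follow.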

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation checks: 11 / 11 passed

Validation for skill structure: no warnings or errors.

Repository: jeffallan/claude-skills (Reviewed)

