tessl i github:jeffallan/claude-skills --skill prompt-engineer

Use when designing prompts for LLMs, optimizing model performance, building evaluation frameworks, or implementing advanced prompting techniques like chain-of-thought, few-shot learning, or structured outputs.
Validation
81%

| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 13 / 16 Passed | |
Implementation
42%

This skill has strong organizational structure and progressive disclosure but critically lacks actionability - it describes prompt engineering concepts without providing concrete examples, templates, or executable guidance. The content reads more like a topic overview than an actionable skill that Claude can execute. The workflow is present but missing the validation checkpoints needed for iterative prompt development.
Suggestions
- Add 2-3 concrete prompt templates showing actual zero-shot, few-shot, and chain-of-thought examples with real input/output pairs
- Include executable code for a basic evaluation framework (e.g., a Python snippet for running test cases and measuring consistency; a minimal sketch follows this list)
- Add specific validation checkpoints to the workflow, such as 'If accuracy < 80% on test set, analyze failure patterns before iterating'
- Replace the 'Knowledge Reference' keyword list with a brief note that Claude already knows these concepts, or remove it entirely
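
To make the second and third suggestions concrete, here is a minimal sketch of what such an evaluation harness could look like. It is illustrative only: the sentiment task, the `FEW_SHOT_TEMPLATE`, the `TestCase` and `evaluate` names, and the `run_model` callable are assumptions rather than anything defined by the reviewed skill, and the 80% threshold simply mirrors the checkpoint suggested above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical few-shot template for a toy sentiment task; the task and
# examples are placeholders, not taken from the skill being reviewed.
FEW_SHOT_TEMPLATE = """Classify the sentiment of the review as positive or negative.

Review: "The battery lasts all day and the screen is gorgeous."
Sentiment: positive

Review: "It stopped working after a week and support never replied."
Sentiment: negative

Review: "{review}"
Sentiment:"""


@dataclass
class TestCase:
    input_text: str
    expected: str


def evaluate(run_model: Callable[[str], str],
             cases: List[TestCase],
             runs_per_case: int = 3) -> Tuple[float, float]:
    """Run each test case several times to measure accuracy and self-consistency."""
    correct = 0
    consistent = 0
    for case in cases:
        prompt = FEW_SHOT_TEMPLATE.format(review=case.input_text)
        outputs = [run_model(prompt).strip().lower() for _ in range(runs_per_case)]
        majority = max(set(outputs), key=outputs.count)
        if majority == case.expected.lower():
            correct += 1
        if len(set(outputs)) == 1:
            consistent += 1
    accuracy = correct / len(cases)
    consistency = consistent / len(cases)
    # Validation checkpoint from the suggestions above: stop and analyze
    # failure patterns before iterating on the prompt.
    if accuracy < 0.80:
        print(f"Accuracy {accuracy:.0%} is below the 80% checkpoint; "
              "review failing cases before changing the prompt.")
    return accuracy, consistency


if __name__ == "__main__":
    # Stub model so the harness runs without an API; swap in a real LLM call.
    stub = lambda prompt: "positive"
    cases = [TestCase("Loved it from day one.", "positive"),
             TestCase("Total waste of money.", "negative")]
    print(evaluate(stub, cases))
```

In practice, `run_model` would wrap whatever LLM client the skill targets; the stub in the `__main__` block only exercises the harness.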
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is reasonably efficient but includes unnecessary sections, such as a 'Role Definition' that restates obvious context, and the 'Knowledge Reference' section is a keyword dump that adds little value. The MUST DO/MUST NOT lists could be tighter. | 2 / 3 |
| Actionability | The skill provides abstract guidance without concrete, executable examples: no actual prompt templates, no code for evaluation frameworks, no specific commands - just descriptions of what to do rather than how to do it. | 1 / 3 |
| Workflow Clarity | The 5-step core workflow provides a clear sequence but lacks validation checkpoints and feedback loops. There's no guidance on what to do when evaluation fails or how to systematically iterate - just 'refine based on failures' without specifics. | 2 / 3 |
| Progressive Disclosure | Excellent structure, with a clear overview and a well-organized reference table pointing to specific topic files. The one-level-deep references are clearly signaled with 'Load When' context, making navigation intuitive. | 3 / 3 |
| Total | | 8 / 12 |
Activation
65%

The description has strong trigger term coverage with relevant prompt engineering terminology, but lacks specificity about what concrete actions or outputs the skill provides. It reads more like a list of trigger conditions than a capability description, leaving it unclear what Claude will actually do when this skill is selected.
Suggestions
- Add concrete actions describing what the skill produces, e.g., 'Crafts system prompts, generates few-shot examples, structures output schemas, and debugs prompt failures.'
- Clarify the 'what' before the 'when' - lead with capabilities like 'Designs and refines prompts for LLMs, creates evaluation rubrics, and implements prompting patterns' before listing trigger scenarios.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (prompts for LLMs) and lists some techniques (chain-of-thought, few-shot learning, structured outputs), but doesn't describe concrete actions - it uses vague verbs like 'designing', 'optimizing', and 'building' without specifying what outputs or transformations occur. | 2 / 3 |
| Completeness | Has a 'Use when...' clause that addresses when to use it, but the 'what does this do' portion is weak - it only describes trigger scenarios rather than the concrete capabilities or outputs the skill provides. | 2 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'prompts', 'LLMs', 'model performance', 'evaluation', 'chain-of-thought', 'few-shot learning', and 'structured outputs' are all terms users naturally use when seeking prompt engineering help. | 3 / 3 |
| Distinctiveness / Conflict Risk | The prompt engineering focus is somewhat specific, but 'optimizing model performance' and 'building evaluation frameworks' could overlap with ML/AI skills, testing skills, or general coding-assistance skills. | 2 / 3 |
| Total | | 9 / 12 |