AI-first application patterns, LLM testing, prompt management
Score: 51 (39%)

Does it follow best practices?

Impact: Pending (no eval scenarios have been run)

Passed (no known issues)
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./skills/llm-patterns/SKILL.md`

Quality
Discovery
14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is essentially a comma-separated list of broad topic areas with no concrete actions, no explicit trigger guidance, and high overlap risk with other AI-related skills. It fails to communicate what specific tasks the skill performs or when Claude should select it, making it poorly suited for skill selection among many candidates.
Suggestions
- Add a 'Use when...' clause with explicit trigger scenarios, e.g., 'Use when the user asks about building AI-powered applications, writing evals for LLM outputs, or managing prompt templates and versions.'
- Replace abstract topic labels with concrete actions, e.g., 'Designs AI-first application architectures, creates evaluation suites for LLM outputs, versions and organizes prompt templates.'
- Include natural keyword variations users might say, such as 'evals', 'prompt engineering', 'prompt versioning', 'model evaluation', 'AI app architecture', '.prompt files'.
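Applied together, the suggestions above might yield a SKILL.md frontmatter description like the following. The exact wording and field names are an illustrative assumption, not output from the review:

```yaml
---
name: llm-patterns
description: >
  Designs AI-first application architectures, creates evaluation suites for
  LLM outputs, and versions and organizes prompt templates. Use when the user
  asks about building AI-powered applications, writing evals for LLM outputs,
  prompt engineering, prompt versioning, model evaluation, or .prompt files.
---
```

Note how the rewrite replaces topic labels with verbs ('designs', 'creates', 'versions'), adds an explicit 'Use when' clause, and folds in the keyword variations a user might actually type.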
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description lists broad topic areas ('AI-first application patterns', 'LLM testing', 'prompt management') but does not describe any concrete actions. These are abstract domain labels, not specific capabilities like 'generate test cases for LLM outputs' or 'version and organize prompt templates'. | 1 / 3 |
| Completeness | The description only loosely addresses 'what' (and even that is vague topic labels rather than actions). There is no 'when' clause or explicit trigger guidance whatsoever, which per the rubric caps completeness at 2 at best, and the weak 'what' brings it to 1. | 1 / 3 |
| Trigger Term Quality | Terms like 'LLM testing', 'prompt management', and 'AI-first' are somewhat relevant keywords a user might mention, but they miss many natural variations (e.g., 'evaluate model outputs', 'prompt engineering', 'prompt versioning', 'AI app development', 'evals'). Coverage is partial. | 2 / 3 |
| Distinctiveness / Conflict Risk | The terms are very broad and could overlap with many skills related to AI development, testing frameworks, or prompt engineering. 'AI-first application patterns' is especially generic and could conflict with any AI/ML-related skill. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation
64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with excellent executable code examples covering LLM client patterns, prompt management, testing strategies, and CI integration. Its main weaknesses are verbosity (some sections could be trimmed or split into referenced files) and the lack of explicit error handling workflows and validation checkpoints in the multi-step processes. The anti-patterns section is a nice touch but partially redundant with the demonstrated patterns.
Suggestions
- Add explicit error handling and retry workflows for LLM calls (try/catch around JSON.parse, schema validation failure recovery, timeout handling) to improve workflow clarity.
- Split detailed sections (testing patterns, CI config, cost tracking) into separate referenced files to improve progressive disclosure and reduce the monolithic nature of the document.
- Trim the 'Core Principle' section — Claude doesn't need to be told what LLMs vs traditional code are good at; focus on the project-specific conventions instead.
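The first suggestion can be sketched as a retry wrapper around an LLM call that guards JSON.parse and validates the parsed shape before returning. This is a hypothetical sketch, not code from the skill: `callModel`, the `Draft` shape, and the attempt count are all assumptions, and a real implementation might use Zod's safeParse in place of the hand-written type guard.

```typescript
type Draft = { title: string; summary: string };

// Stand-in for a real LLM client call; returns a raw string completion.
// (Hypothetical: the skill's actual client wrapper is not reproduced here.)
async function callModel(prompt: string): Promise<string> {
  return JSON.stringify({ title: "Hello", summary: "A greeting." });
}

// Minimal structural validator; a real skill might use a Zod schema here.
function isDraft(value: unknown): value is Draft {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Draft).title === "string" &&
    typeof (value as Draft).summary === "string"
  );
}

async function generateDraft(prompt: string, maxAttempts = 3): Promise<Draft> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(prompt);
    try {
      const parsed = JSON.parse(raw); // throws on malformed JSON
      if (isDraft(parsed)) return parsed; // schema check before returning
      lastError = new Error("schema validation failed");
    } catch (err) {
      lastError = err; // malformed JSON: fall through and retry
    }
  }
  throw lastError; // all attempts exhausted
}
```

The key point is that both failure modes (malformed JSON and well-formed JSON with the wrong shape) feed the same retry loop, so the caller only ever sees a valid `Draft` or a single final error.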
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly well-structured but includes some content Claude already knows (e.g., what to use LLMs vs code for is fairly obvious to Claude). The project structure section is verbose and could be trimmed. The anti-patterns list at the end is somewhat redundant given the patterns already shown. However, the code examples are dense and informative. | 2 / 3 |
| Actionability | The skill provides fully executable TypeScript code examples throughout — typed LLM wrapper, Zod schemas, prompt templates, test files at all three levels (mocks, fixtures, evals), GitHub Actions YAML, and cost tracking. All examples are copy-paste ready with realistic implementations. | 3 / 3 |
| Workflow Clarity | The skill presents a clear testing pyramid (unit → fixture → eval) and CI workflow, but lacks explicit validation checkpoints or error recovery steps. For example, the LLM client pattern shows JSON.parse without try/catch, and there's no explicit workflow for 'what to do when schema validation fails' or retry logic. The overall sequence of building an LLM-powered feature is implicit rather than explicitly sequenced. | 2 / 3 |
| Progressive Disclosure | The content is well-sectioned with clear headers, but it's a monolithic document (~200+ lines of code examples) that could benefit from splitting detailed patterns (testing, CI config, cost tracking) into separate reference files. The header mentions 'Load with: base.md + [language].md', suggesting a multi-file system exists, but no cross-references to other files are provided within the body. | 2 / 3 |
| Total | | 9 / 12 Passed |
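The 'unit with mocks' level of the testing pyramid described above can be sketched as follows. All names here are illustrative assumptions rather than code from the skill: the feature under test receives a fake LLM client through its interface, so the test is fast, deterministic, and free of API costs.

```typescript
// Hypothetical interface the feature depends on (injected, not imported).
interface LlmClient {
  complete(prompt: string): Promise<string>;
}

// Feature under test: summarize text via the injected client.
async function summarize(client: LlmClient, text: string): Promise<string> {
  const raw = await client.complete(`Summarize: ${text}`);
  return raw.trim();
}

// Mock client returning a canned response; the test asserts on behavior
// (trimming, prompt construction) rather than on model wording.
const mockClient: LlmClient = {
  complete: async () => "  A short summary.  ",
};

async function testSummarizeTrimsOutput(): Promise<void> {
  const result = await summarize(mockClient, "long article text");
  if (result !== "A short summary.") throw new Error("trim assertion failed");
}
```

Fixture and eval tests would sit above this in the pyramid, swapping the mock for recorded responses and live model calls respectively.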
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
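As the warning text itself suggests, the usual fix is to move any non-spec top-level keys under `metadata`. A hypothetical before/after sketch (the `owner` key is an invented example of an unknown key, not one found in this skill):

```yaml
# Before: unknown top-level key triggers frontmatter_unknown_keys
---
name: llm-patterns
description: ...
owner: platform-team
---

# After: the unknown key is nested under metadata
---
name: llm-patterns
description: ...
metadata:
  owner: platform-team
---
```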