llm-patterns

AI-first application patterns, LLM testing, prompt management

Quality

65%

Does it follow best practices?


Security

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./skills/llm-patterns/SKILL.md

Quality

Content

80%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is a dense, mostly executable pattern library that assumes Claude's competence and provides copy-paste-ready code for the core LLM-client, prompt, and testing concerns. Its weaknesses are the absence of explicit validation/feedback checkpoints in its testing workflow and a monolithic single-file layout with no progressive disclosure to reference files.

Suggestions

Add an explicit validation/recovery loop to the eval workflow (e.g., 'if accuracy < 0.9, inspect failures, update prompt version, re-run eval') to lift workflow clarity.

Define the undefined symbols in secondary examples (classifyTicketPromptV1/V2, the parsed variable in llmCallWithMetrics) or mark them as illustrative so every code block is copy-paste ready.

Move the larger reference material (full testing suite, GitHub Actions config, cost-tracking implementation) into files under references/ and link to them from SKILL.md for progressive disclosure.
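
The validation/recovery loop from the first suggestion could take roughly the following shape. This is a minimal sketch, not the skill's own code: `EvalCase`, `EvalReport`, `runEval`, and `evalWithRecovery` are hypothetical names introduced here, and the classifier is passed in as a plain async function so the loop can be exercised with stubs instead of a real LLM client. The 0.9 default threshold is taken from the suggestion's example.

```typescript
// Hypothetical types for illustration; none of these names come from the reviewed skill.
interface EvalCase { input: string; expected: string; }
interface EvalFailure extends EvalCase { actual: string; }
interface EvalReport { promptVersion: string; accuracy: number; failures: EvalFailure[]; }

// Stand-in for an LLM-backed classifier bound to one prompt version.
type Classifier = (input: string) => Promise<string>;

// Run every case through one prompt version and collect the misses.
async function runEval(
  promptVersion: string,
  classify: Classifier,
  cases: EvalCase[],
): Promise<EvalReport> {
  const failures: EvalFailure[] = [];
  for (const c of cases) {
    const actual = await classify(c.input);
    if (actual !== c.expected) failures.push({ ...c, actual });
  }
  // An empty case list is treated as a failing eval rather than a vacuous pass.
  const accuracy =
    cases.length === 0 ? 0 : (cases.length - failures.length) / cases.length;
  return { promptVersion, accuracy, failures };
}

// Try prompt versions in order and stop at the first one that meets the threshold.
async function evalWithRecovery(
  versions: Array<{ version: string; classify: Classifier }>,
  cases: EvalCase[],
  threshold = 0.9,
): Promise<{ passed: boolean; reports: EvalReport[] }> {
  const reports: EvalReport[] = [];
  for (const v of versions) {
    const report = await runEval(v.version, v.classify, cases);
    reports.push(report);
    if (report.accuracy >= threshold) return { passed: true, reports };
    // Checkpoint: surface the failures so the next prompt version can address them.
    for (const f of report.failures) {
      console.warn(`[${v.version}] "${f.input}": expected ${f.expected}, got ${f.actual}`);
    }
  }
  return { passed: false, reports };
}
```

In a CI job, a `passed: false` result would fail the build, and the logged failures give the starting point for the next prompt revision.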

Dimension	Reasoning	Score
Conciseness	The body is lean and almost entirely executable code patterns plus terse directives, with minimal explanation of concepts Claude already knows; the 'LLM for logic, code for plumbing' framing and the anti-patterns list are compact, matching 'lean and efficient; assumes Claude's competence'.	3 / 3
Actionability	Core patterns (llmCall wrapper, Zod schemas, prompt template, vi.mock unit tests, fixtures, eval tests, CI YAML) are complete and copy-paste ready; minor undefined references in two secondary examples (classifyTicketPromptV1/V2, the {...} metrics block) are small gaps that don't make the dominant content pseudocode.	3 / 3
Workflow Clarity	The testing approach is clearly sequenced (1. Unit Tests with Mocks, 2. Fixture Tests, 3. Evaluation Tests) but validation checkpoints and error-recovery feedback loops are implicit or missing, matching 'steps listed but validation gaps; checkpoints missing or implicit'.	2 / 3
Progressive Disclosure	The skill is a single ~320-line file with no bundle references; although it is well-sectioned with separators, a large amount of code (testing patterns, CI config, metrics) is inline that could be split into reference files, matching 'some structure but content that should be separate is inline'.	2 / 3
	Total	10 / 12 Passed

Description

50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description reads as a topic list for a specific AI/LLM niche rather than a capability statement with concrete verbs and explicit usage triggers. It is third-person and not vague enough to score 1, but lacks the concrete actions and 'Use when...' guidance needed for a 3.

Suggestions

Rewrite as concrete actions, e.g. 'Designs LLM-driven application logic: classification, extraction, generation, and prompt management with mock-based LLM testing.'

Add an explicit 'Use when...' trigger clause naming natural user phrases ('building LLM apps', 'classifying tickets', 'prompt versioning', 'testing LLM calls').

Drop abstract phrasing like 'AI-first application patterns' in favor of the concrete task verbs already used in the body.

Dimension	Reasoning	Score
Specificity	Names a domain and a couple of action areas ('LLM testing, prompt management') but stays abstract ('AI-first application patterns') rather than listing concrete actions like extract/classify, so it matches 'names domain and some actions, but not comprehensive' rather than the multi-action score 3.	2 / 3
Completeness	The description field states 'what' but contains no 'Use when...' or equivalent trigger guidance, so per the rubric a missing explicit trigger clause caps completeness at 2 even though a separate when-to-use field exists in the frontmatter.	2 / 3
Trigger Term Quality	'LLM' and 'prompt' are terms a user might say, but 'AI-first application patterns' is jargon and common variations are missing, matching 'some relevant keywords but missing common variations' rather than full coverage.	2 / 3
Distinctiveness Conflict Risk	'AI-first application patterns' carves a somewhat specific niche but could still overlap with general coding or other LLM skills and lacks distinct triggers, matching 'somewhat specific but could still overlap' rather than a clear distinct niche.	2 / 3
	Total	8 / 12 Passed

Validation

93%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	15 / 16 Passed

Repository: alinaqi/claude-bootstrap
Path: skills/llm-patterns/SKILL.md
Commit: bb6195f

Reviewed: 6 days ago
