llm-patterns

AI-first application patterns, LLM testing, prompt management

56

Quality

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Content

80%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is a dense, mostly executable pattern library that assumes Claude's competence and provides copy-paste-ready code for the core LLM-client, prompt, and testing concerns. Its weaknesses are the absence of explicit validation/feedback checkpoints in its testing workflow and a monolithic single-file layout with no progressive disclosure to reference files.

Suggestions

Add an explicit validation/recovery loop to the eval workflow (e.g., 'if accuracy < 0.9, inspect failures, update prompt version, re-run eval') to lift workflow clarity.

Define the undefined symbols in secondary examples (classifyTicketPromptV1/V2, the parsed variable in llmCallWithMetrics) or mark them as illustrative so every code block is copy-paste ready.

Move the larger reference material (full testing suite, GitHub Actions config, cost-tracking implementation) into files under references/ and link to them from SKILL.md for progressive disclosure.
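The validation/recovery loop from the first suggestion could be sketched roughly as follows. This is a minimal illustration, not the skill's own code: `EvalCase`, `EvalResult`, `Classifier`, `runEval`, and `evalWithRecovery` are hypothetical names, the 0.9 threshold is taken from the suggestion's example, and the classifier is synchronous only to keep the control flow readable (a real LLM call would be asynchronous).

```typescript
// Hypothetical types: none of these names come from the reviewed skill.
interface EvalCase {
  input: string;
  expected: string;
}

interface EvalResult {
  promptVersion: string;
  accuracy: number;
  failures: EvalCase[];
}

// Takes a prompt version and an input, returns a predicted label.
// Assumption: synchronous for illustration; a real LLM call would be async.
type Classifier = (promptVersion: string, input: string) => string;

// Run every case against one prompt version and collect the failures.
function runEval(
  promptVersion: string,
  cases: EvalCase[],
  classify: Classifier,
): EvalResult {
  const failures = cases.filter(
    (c) => classify(promptVersion, c.input) !== c.expected,
  );
  const accuracy =
    cases.length === 0 ? 0 : (cases.length - failures.length) / cases.length;
  return { promptVersion, accuracy, failures };
}

// Try prompt versions in order and stop at the first one that meets the
// threshold. Every attempt is returned so rejected versions can be inspected.
function evalWithRecovery(
  promptVersions: string[],
  cases: EvalCase[],
  classify: Classifier,
  threshold = 0.9,
  onBelowThreshold?: (result: EvalResult) => void,
): { passed: boolean; attempts: EvalResult[] } {
  const attempts: EvalResult[] = [];
  for (const version of promptVersions) {
    const result = runEval(version, cases, classify);
    attempts.push(result);
    if (result.accuracy >= threshold) {
      return { passed: true, attempts };
    }
    // Checkpoint: below threshold, so hand the failures to the caller
    // before moving on to the next prompt version.
    if (onBelowThreshold) {
      onBelowThreshold(result);
    }
  }
  return { passed: false, attempts };
}
```

In a test suite, the callback is where failing inputs would be logged or written to a fixture file, and a `passed: false` result would fail the CI job rather than silently shipping the last prompt version tried.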

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The body is lean and almost entirely executable code patterns plus terse directives, with minimal explanation of concepts Claude already knows; the 'LLM for logic, code for plumbing' framing and the anti-patterns list are compact, matching 'lean and efficient; assumes Claude's competence'. | 3 / 3 |
| Actionability | Core patterns (llmCall wrapper, Zod schemas, prompt template, vi.mock unit tests, fixtures, eval tests, CI YAML) are complete and copy-paste ready; minor undefined references in two secondary examples (classifyTicketPromptV1/V2, the {...} metrics block) are small gaps that don't make the dominant content pseudocode. | 3 / 3 |
| Workflow Clarity | The testing approach is clearly sequenced (1. Unit Tests with Mocks, 2. Fixture Tests, 3. Evaluation Tests) but validation checkpoints and error-recovery feedback loops are implicit or missing, matching 'steps listed but validation gaps; checkpoints missing or implicit'. | 2 / 3 |
| Progressive Disclosure | The skill is a single ~320-line file with no bundle references; although it is well-sectioned with separators, a large amount of code (testing patterns, CI config, metrics) is inline that could be split into reference files, matching 'some structure but content that should be separate is inline'. | 2 / 3 |
| Total | | 10 / 12 (Passed) |

Description

50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description reads as a topic list for a specific AI/LLM niche rather than a capability statement with concrete verbs and explicit usage triggers. It is third-person and not vague enough to score 1, but lacks the concrete actions and 'Use when...' guidance needed for a 3.

Suggestions

Rewrite as concrete actions, e.g. 'Designs LLM-driven application logic: classification, extraction, generation, and prompt management with mock-based LLM testing.'

Add an explicit 'Use when...' trigger clause naming natural user phrases ('building LLM apps', 'classifying tickets', 'prompt versioning', 'testing LLM calls').

Drop abstract phrasing like 'AI-first application patterns' in favor of the concrete task verbs already used in the body.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names a domain and a couple of action areas ('LLM testing, prompt management') but stays abstract ('AI-first application patterns') rather than listing concrete actions like extract/classify, so it matches 'names domain and some actions, but not comprehensive' rather than the multi-action score 3. | 2 / 3 |
| Completeness | The description field states 'what' but contains no 'Use when...' or equivalent trigger guidance, so per the rubric a missing explicit trigger clause caps completeness at 2 even though a separate when-to-use field exists in the frontmatter. | 2 / 3 |
| Trigger Term Quality | 'LLM' and 'prompt' are terms a user might say, but 'AI-first application patterns' is jargon and common variations are missing, matching 'some relevant keywords but missing common variations' rather than full coverage. | 2 / 3 |
| Distinctiveness Conflict Risk | 'AI-first application patterns' carves a somewhat specific niche but could still overlap with general coding or other LLM skills and lacks distinct triggers, matching 'somewhat specific but could still overlap' rather than a clear distinct niche. | 2 / 3 |
| Total | | 8 / 12 (Passed) |

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 15 / 16 (Passed) |

Repository
alinaqi/claude-bootstrap
Reviewed
