
llm-patterns

AI-first application patterns, LLM testing, prompt management


Quality

39%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/llm-patterns/SKILL.md

Quality

Discovery

14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is essentially a list of three broad topic labels with no concrete actions, no explicit trigger guidance, and no clear niche. It reads more like a set of tags than a functional skill description, making it very difficult for Claude to know when to select this skill over others.

Suggestions

Add concrete actions describing what the skill does, e.g., 'Guides building AI-first applications, writing and managing prompts, and setting up LLM evaluation pipelines.'

Add an explicit 'Use when...' clause with trigger terms, e.g., 'Use when the user asks about building apps with LLMs, testing AI outputs, writing evals, prompt engineering, or prompt versioning.'

Include common natural-language variations users might say, such as 'evals', 'prompt engineering', 'AI app architecture', 'LLM evaluation', 'prompt versioning', to improve trigger term coverage and distinctiveness.
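Taken together, the three suggestions above could be combined into a single rewritten description. The sketch below is illustrative only — the name is taken from the page, but the description text is an assumed example, not the skill's actual frontmatter:

```yaml
---
name: llm-patterns
description: >
  Guides building AI-first applications, writing and managing prompts, and
  setting up LLM evaluation pipelines. Use when the user asks about building
  apps with LLMs, testing AI outputs, writing evals, prompt engineering,
  LLM evaluation, or prompt versioning.
---
```

This folds concrete action verbs, an explicit "Use when..." clause, and common trigger-term variations into one description.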

Dimension | Reasoning | Score

Specificity

The description lists broad topic areas ('AI-first application patterns', 'LLM testing', 'prompt management') but does not describe any concrete actions. There are no verbs indicating what the skill actually does.

1 / 3

Completeness

The description weakly addresses 'what' (topic areas only, no actions) and completely omits 'when' — there is no 'Use when...' clause or any explicit trigger guidance.

1 / 3

Trigger Term Quality

Terms like 'LLM testing', 'prompt management', and 'AI-first' are somewhat relevant keywords a user might mention, but they lack common variations (e.g., 'prompt engineering', 'evaluations', 'evals', 'AI app development') and are fairly broad.

2 / 3

Distinctiveness Conflict Risk

The terms are very broad and could overlap with many AI/ML-related skills. 'AI-first application patterns' and 'prompt management' are generic enough to conflict with coding skills, prompt engineering skills, or general AI development skills.

1 / 3

Total: 5 / 12

Passed

Implementation

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent executable code examples covering LLM client patterns, prompt management, testing strategies, and CI integration. Its main weaknesses are moderate verbosity (some sections explain concepts Claude already knows), lack of explicit error handling/validation workflows for LLM operations, and a monolithic structure that would benefit from splitting into referenced sub-documents.

Suggestions

Add explicit error handling workflows: show what happens when JSON parsing fails, when schema validation rejects a response, and how to implement retry/fallback logic — this is critical for LLM operations.

Split the file into a concise overview SKILL.md with references to separate files for testing patterns (TESTING.md), CI configuration (CI.md), and the cost tracking module.

Trim the 'Core Principle' section — Claude already knows when to use LLMs vs traditional code. A single sentence like 'LLM for classification/extraction/generation; code for validation/routing/auth' suffices.
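The error-handling suggestion above can be made concrete. This is a hedged sketch, not the skill's actual code: the function names (`extractWithRetry`, `isExtraction`) and the plain type guard standing in for a Zod schema's `safeParse` are illustrative assumptions.

```typescript
// Sketch: retry/fallback around an LLM call that must return typed JSON.
// The type guard below stands in for a real schema validator (e.g. Zod).
type Extraction = { name: string; sentiment: "pos" | "neg" | "neutral" };

function isExtraction(v: unknown): v is Extraction {
  if (typeof v !== "object" || v === null) return false;
  const o = v as Record<string, unknown>;
  return (
    typeof o.name === "string" &&
    ["pos", "neg", "neutral"].includes(o.sentiment as string)
  );
}

async function extractWithRetry(
  callModel: () => Promise<string>,
  maxAttempts = 3,
  fallback: Extraction = { name: "unknown", sentiment: "neutral" },
): Promise<Extraction> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel();
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw); // failure mode 1: model returned invalid JSON
    } catch {
      continue; // retry on malformed JSON
    }
    if (isExtraction(parsed)) return parsed; // schema check passed
    // failure mode 2: valid JSON, wrong shape — loop retries
  }
  return fallback; // all attempts exhausted: degrade gracefully
}

// Demo with a flaky mock: first response is malformed, second is valid.
let calls = 0;
const flaky = async () =>
  calls++ === 0 ? "not json" : JSON.stringify({ name: "Ada", sentiment: "pos" });

extractWithRetry(flaky).then((r) => console.log(r.name));
```

The same loop covers both failure modes the suggestion names (parse errors and schema rejections), with the fallback value making the degradation path explicit.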

Dimension | Reasoning | Score

Conciseness

The skill is fairly well-structured but includes some unnecessary verbosity. The 'Core Principle' section explaining what LLMs vs code should handle is somewhat obvious to Claude. The project structure tree and some of the inline comments add bulk without proportional value. However, the code examples themselves are reasonably tight.

2 / 3

Actionability

The skill provides fully executable TypeScript code examples throughout — typed LLM wrapper, Zod schemas, prompt templates, three tiers of testing, GitHub Actions YAML, and cost tracking. All examples are copy-paste ready with realistic implementations.

3 / 3

Workflow Clarity

The testing section presents a clear three-tier strategy (mocks → fixtures → evals) with good sequencing, and the CI workflow shows when each runs. However, there are no explicit validation checkpoints or error recovery steps for the core LLM call workflow (e.g., what to do when schema parsing fails, retry logic, fallback behavior). The anti-patterns section mentions handling failures but doesn't show how.

2 / 3

Progressive Disclosure

The content is well-organized with clear section headers, but it's a monolithic ~250-line file with no references to external files. The project structure suggests files like prompts/, schemas, and services that could be separate reference documents. Testing patterns, CI config, and cost tracking could each be split out for better navigation.

2 / 3

Total: 9 / 12

Passed
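The first tier of the mocks → fixtures → evals strategy discussed above can be sketched as a deterministic unit test: business logic runs against a canned client, so no API calls are made. The interfaces here (`LLMClient`, `classifyTicket`) are illustrative assumptions, not the skill's actual API.

```typescript
// Sketch of the "mock" tier: fast, deterministic, zero-cost tests of the
// logic around the model call, with the model itself stubbed out.
interface LLMClient {
  complete(prompt: string): Promise<string>;
}

// Unit under test: routes a support ticket based on the model's label,
// coercing anything unexpected to a safe default.
async function classifyTicket(client: LLMClient, ticket: string): Promise<string> {
  const label = (await client.complete(`Classify: ${ticket}`)).trim().toLowerCase();
  return ["billing", "bug", "other"].includes(label) ? label : "other";
}

// Mock client: fixed response, no network, no tokens spent.
const mockClient: LLMClient = {
  complete: async () => "Billing",
};

classifyTicket(mockClient, "I was charged twice").then((label) => {
  console.assert(label === "billing", "mock-tier test failed");
});
```

Fixture and eval tiers would replace `mockClient` with recorded responses and live calls respectively, which is what makes the CI sequencing (mocks on every push, evals on a schedule) work.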

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria | Description | Result

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 10 / 11

Passed
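The `frontmatter_unknown_keys` warning above suggests nesting custom keys under `metadata`. A hedged sketch of what that move might look like — the custom keys shown (`version`, `maintainer`) are hypothetical examples, not the skill's actual frontmatter:

```yaml
---
name: llm-patterns
description: ...
# Custom keys moved out of the top level and under metadata,
# so the validator no longer flags them as unknown.
metadata:
  version: "1.0"
  maintainer: alinaqi
---
```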

Repository
alinaqi/claude-bootstrap
Reviewed

