cookbook-audit

Audit an Anthropic Cookbook notebook based on a rubric. Use whenever a notebook review or audit is requested.

Quality

56%

Does it follow best practices?

Impact

100%

1.96x

Average score across 3 eval scenarios

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/cookbook-audit/SKILL.md

Quality

Discovery

57%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is minimal but structurally complete, with both a 'what' and 'when' clause. Its main weaknesses are the lack of specific concrete actions (what does 'audit' actually entail?) and limited trigger term coverage. The description would benefit from listing specific audit actions and additional natural language variations users might use.

Suggestions

Add specific concrete actions the audit performs, e.g., 'Checks code correctness, validates markdown formatting, evaluates explanatory content, scores against quality rubric criteria'.

Expand trigger terms in the 'Use whenever...' clause to include natural variations like 'evaluate', 'grade', 'check quality', 'Jupyter notebook', 'cookbook quality check'.
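As a rough illustration of the trigger-term suggestion, the sketch below checks a description for the suggested terms using a case-insensitive substring match. The helper `missing_trigger_terms` and the revised description wording are hypothetical and only for illustration; they are not part of the skill or of any Tessl tooling.

```python
# Terms the review suggests adding to the description.
SUGGESTED_TERMS = [
    "evaluate",
    "grade",
    "check quality",
    "Jupyter notebook",
    "cookbook quality check",
]

# The description as quoted at the top of this review.
CURRENT_DESCRIPTION = (
    "Audit an Anthropic Cookbook notebook based on a rubric. "
    "Use whenever a notebook review or audit is requested."
)

# Hypothetical rewording, assumed here purely to exercise the check.
REVISED_DESCRIPTION = (
    "Audit an Anthropic Cookbook (Jupyter notebook) against a rubric: "
    "check code correctness, validate markdown formatting, evaluate "
    "explanatory content, and grade against quality criteria. "
    "Use whenever a notebook review, audit, or cookbook quality check "
    "is requested, or when asked to check quality."
)


def missing_trigger_terms(description, terms):
    """Return the terms that do not appear in the description (case-insensitive)."""
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


print(missing_trigger_terms(CURRENT_DESCRIPTION, SUGGESTED_TERMS))
print(missing_trigger_terms(REVISED_DESCRIPTION, SUGGESTED_TERMS))
```

A substring match is deliberately crude: it says nothing about how an agent actually selects skills, only whether the wording a user might type is present at all.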

Dimension	Reasoning	Score
Specificity	The description says 'audit an Anthropic Cookbook notebook based on a rubric' which names a domain but describes only one vague action ('audit'). It doesn't list specific concrete actions like checking code cells, validating outputs, scoring against criteria, etc.	1 / 3
Completeness	It answers both 'what' (audit an Anthropic Cookbook notebook based on a rubric) and 'when' (whenever a notebook review or audit is requested) with an explicit 'Use whenever...' clause.	3 / 3
Trigger Term Quality	Includes some relevant keywords like 'notebook', 'review', 'audit', and 'Anthropic Cookbook', but misses natural variations users might say such as 'evaluate', 'grade', 'check quality', 'cookbook notebook', or 'Jupyter notebook'.	2 / 3
Distinctiveness Conflict Risk	The mention of 'Anthropic Cookbook notebook' provides some specificity, but 'review or audit' is broad enough that it could overlap with general code review or document review skills. The rubric-based aspect is not well differentiated.	2 / 3
	Total	8 / 12 Passed

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides a well-structured and actionable audit workflow with clear steps, a concrete report template, and a comprehensive checklist. However, it severely undermines its own design by inlining extensive content that it explicitly tells Claude to read from style_guide.md, resulting in massive redundancy and poor token efficiency. The skill would be roughly 60-70% shorter and equally effective if it trusted its own reference architecture.

Suggestions

Remove all inlined style guide content (Sections: 'Content Philosophy', 'What Makes a Good Cookbook', 'What Cookbooks Are NOT', 'Style Guidelines', 'Structural Requirements', 'Common Anti-Patterns') and replace with brief one-line references to style_guide.md sections, since the workflow already instructs Claude to read that file first.

Consolidate the Quick Reference Checklist to only include items not covered in style_guide.md, or move the entire checklist to a separate file and reference it.

Remove the duplicated good/bad examples (❌/✅ patterns) that are explicitly noted as being in style_guide.md already — the instruction 'Refer to style_guide.md for detailed good/bad examples' followed by listing them anyway is contradictory.
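One way to act on the first suggestion is to scan the skill file for the section titles named above before deleting them. The sketch below assumes SKILL.md uses Markdown ATX headings (`#` through `######`), which this review does not confirm; `find_inlined_sections` and the sample text are hypothetical.

```python
import re

# Section titles the review names as inlined style guide content.
INLINED_SECTIONS = [
    "Content Philosophy",
    "What Makes a Good Cookbook",
    "What Cookbooks Are NOT",
    "Style Guidelines",
    "Structural Requirements",
    "Common Anti-Patterns",
]

# Assumption: headings look like "## Title" (ATX style).
HEADING_PATTERN = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def find_inlined_sections(skill_text, section_titles=INLINED_SECTIONS):
    """Return the listed section titles that appear as headings in skill_text."""
    headings = {m.group(1).lower() for m in HEADING_PATTERN.finditer(skill_text)}
    return [title for title in section_titles if title.lower() in headings]


# Hypothetical excerpt standing in for the real SKILL.md.
sample = """# Cookbook Audit
## Workflow
## Content Philosophy
## Common Anti-Patterns
"""

print(find_inlined_sections(sample))
```

Each title the function reports is a candidate for replacement with a one-line pointer to style_guide.md.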

Dimension	Reasoning	Score
Conciseness	The skill is extremely verbose at 250+ lines, with extensive inline content that duplicates what should be in the referenced style_guide.md. Entire sections like 'What Makes a Good Cookbook', 'What Cookbooks Are NOT', 'Style Guidelines', 'Structural Requirements', and 'Common Anti-Patterns' are fully inlined rather than deferred to the style guide. The skill tells Claude to 'read style_guide.md first' but then reproduces much of its content anyway, wasting significant token budget.	1 / 3
Actionability	The workflow provides concrete, executable steps including a specific command (`python3 validate_notebook.py <path>`), a detailed audit report template with exact formatting, and a comprehensive checklist with specific items to verify. The scoring dimensions (X/5 for each category, X/20 overall) are clearly defined and the report structure is copy-paste ready.	3 / 3
Workflow Clarity	The 8-step workflow is clearly sequenced with logical progression from reading the style guide through generating the report. It includes validation via the automated script (step 3), a review checkpoint (step 5), and explicit instructions to provide specific examples with line references (step 8). The checklist serves as an additional verification mechanism.	3 / 3
Progressive Disclosure	Despite repeatedly referencing style_guide.md as the canonical source, the skill inlines massive amounts of content that should live in that file — full structural requirements, anti-patterns with examples, style guidelines, content philosophy, and more. This creates a monolithic wall of text and defeats the purpose of having a separate style guide. No bundle files were provided, but even so, the skill should defer to style_guide.md rather than duplicating it.	1 / 3
	Total	8 / 12 Passed
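The arithmetic behind the two score tables can be checked with a short sketch. The dimension scores and percentages are copied from this page; the observation that the 56% Quality figure matches a plain average of the Discovery and Implementation percentages is an assumption, not documented scoring behavior. The raw totals (8 / 12, about 67%) differ from the reported 57% and 55%, so some weighting not shown here is presumably applied.

```python
MAX_PER_DIMENSION = 3

# Dimension scores as listed in the Discovery and Implementation tables.
discovery = {
    "Specificity": 1,
    "Completeness": 3,
    "Trigger Term Quality": 2,
    "Distinctiveness Conflict Risk": 2,
}
implementation = {
    "Conciseness": 1,
    "Actionability": 3,
    "Workflow Clarity": 3,
    "Progressive Disclosure": 1,
}


def tally(scores):
    """Return (points earned, points available) for one table."""
    return sum(scores.values()), MAX_PER_DIMENSION * len(scores)


discovery_total = tally(discovery)
implementation_total = tally(implementation)

# Unweighted ratio of the raw totals, for comparison with the reported percentages.
raw_percent = round(100 * discovery_total[0] / discovery_total[1])

# Reported section percentages from this page; averaging them is an assumption.
reported = {"Discovery": 57, "Implementation": 55}
assumed_quality = round(sum(reported.values()) / len(reported))

print(discovery_total, implementation_total, raw_percent, assumed_quality)
```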

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: anthropics/claude-cookbooks
Commit: 3c30b02

Reviewed: about 5 hours ago

