pdf-reading

Use this skill when you need to read, inspect, or extract content from PDF files — especially when file content is NOT in your context and you need to read it from disk. Covers content inventory, text extraction, page rasterization for visual inspection, embedded image/attachment/table/form-field extraction, and choosing the right reading strategy for different document types (text-heavy, scanned, slide-decks, forms, data-heavy). Do NOT use this skill for PDF creation, form filling, merging, splitting, watermarking, or encryption — use the pdf skill instead.

Quality

88%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines its scope (reading and extracting from PDFs), provides rich trigger terms, and explicitly delineates boundaries with a related skill. The inclusion of specific document types (scanned, slide-decks, forms, data-heavy) and the negative boundary ('Do NOT use this skill for...') make it highly effective for skill selection. The only minor note is the use of second-person 'you' in the opening, but since it's addressed to Claude (the agent selecting the skill) rather than the user, this is contextually appropriate.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: content inventory, text extraction, page rasterization for visual inspection, embedded image/attachment/table/form-field extraction, and choosing reading strategies for different document types (text-heavy, scanned, slide-decks, forms, data-heavy).	3 / 3
Completeness	Clearly answers both 'what' (content inventory, text extraction, page rasterization, embedded content extraction, reading strategy selection) and 'when' ('when you need to read, inspect, or extract content from PDF files — especially when file content is NOT in your context and you need to read it from disk'). Also explicitly states when NOT to use it, which further clarifies scope.	3 / 3
Trigger Term Quality	Includes strong natural trigger terms: 'read', 'inspect', 'extract content', 'PDF files', 'text extraction', 'scanned', 'forms', 'tables', 'images', 'attachments'. These cover many natural ways a user would phrase requests about reading PDFs.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive: explicitly differentiates itself from a sibling 'pdf skill' for creation/modification tasks. The focus on reading/extraction vs. creation/manipulation creates a clear boundary, and the 'Do NOT use this skill for...' clause directly addresses potential conflicts.	3 / 3
	Total	12 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill that provides comprehensive coverage of PDF reading operations with executable code and CLI commands throughout. Its main weakness is length — the body content is thorough but could benefit from moving some detailed sections (image extraction with PyMuPDF, rare media, font diagnostics) to REFERENCE.md to keep the main skill leaner. The decision-tree approach for choosing reading strategies and the token cost awareness section are particularly valuable additions.
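To make the decision-tree idea concrete, here is a minimal sketch of per-page routing between text extraction and rasterization. It is not taken from the skill: the `PageInventory` type, the `choose_strategy` function, and the 200-character threshold are all hypothetical, chosen only to illustrate the shape of such a rule.

```python
from dataclasses import dataclass


@dataclass
class PageInventory:
    """Hypothetical per-page summary produced by a content-inventory pass."""
    text_chars: int   # characters recovered by plain text extraction
    image_count: int  # embedded raster images found on the page


def choose_strategy(page: PageInventory, min_text_chars: int = 200) -> str:
    """Return 'text' or 'rasterize' for a single page.

    The threshold is an illustrative assumption, not a value from the skill.
    """
    if page.text_chars >= min_text_chars:
        # Enough extractable text: the cheaper text path is likely sufficient.
        return "text"
    if page.image_count > 0:
        # Little text but images present: possibly a scan or a slide,
        # so render the page and inspect it visually.
        return "rasterize"
    # Sparse page with no images: fall back to text extraction.
    return "text"


print(choose_strategy(PageInventory(text_chars=1500, image_count=0)))
print(choose_strategy(PageInventory(text_chars=0, image_count=1)))
```

A real router would probably also weigh token cost per rendered page, which the review singles out as a valuable part of the skill.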

Suggestions

Move detailed/edge-case sections (PyMuPDF image extraction, rare embedded media, font diagnostics) to REFERENCE.md and add brief pointers from the main skill to reduce body length by ~30%.

Trim explanatory context that Claude already knows (e.g., 'PDFs can contain embedded files — spreadsheets, data files, other documents' and the paragraph explaining two attachment mechanisms) to improve conciseness.

Dimension	Reasoning	Score
Conciseness	The skill is generally well-written but includes some unnecessary explanations that Claude would already know (e.g., explaining what PDF attachments are, what vector graphics are, what CMYK is). The 'Two attachment mechanisms exist' paragraph and some contextual explanations could be trimmed. However, most content is practical and earns its place.	2 / 3
Actionability	Excellent actionability throughout — every section provides fully executable code snippets and CLI commands that are copy-paste ready. The content covers multiple tools with concrete examples for each use case, including specific flags, output handling, and edge cases like the pdftoppm filename padding gotcha.	3 / 3
Workflow Clarity	The skill provides a clear diagnostic-first workflow (content inventory → choose strategy → extract), with explicit decision trees for when to rasterize vs. text-extract. The 'Choosing your reading strategy' section acts as a clear routing guide. Validation is addressed through the content inventory step and font diagnostics for troubleshooting garbled output.	3 / 3
Progressive Disclosure	The skill references REFERENCE.md for advanced features (pypdfium2, OCR fallback, encrypted PDFs) and correctly points to the pdf skill for non-reading operations. However, the main file is long (200+ lines of detailed content), and sections such as embedded image extraction with PyMuPDF, rare media content, and font diagnostics could move to the reference file. The quick reference table at the end is a nice touch, but the body would be leaner with more content offloaded.	2 / 3
	Total	10 / 12 Passed
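The Actionability row mentions a pdftoppm filename padding gotcha without spelling it out. The sketch below shows one way to predict the output filename, assuming pdftoppm zero-pads the page number to the digit count of the document's total page count (my understanding of Poppler's behavior, not something this review confirms). The helper name `pdftoppm_output_name` is hypothetical.

```python
def pdftoppm_output_name(prefix: str, page: int, total_pages: int, ext: str = "png") -> str:
    """Predict the file pdftoppm writes for a given page.

    Assumption: the page number is zero-padded to the number of digits
    in the document's total page count (e.g. 12 pages -> width 2).
    """
    width = len(str(total_pages))
    return f"{prefix}-{page:0{width}d}.{ext}"


print(pdftoppm_output_name("page", 3, 9))
print(pdftoppm_output_name("page", 3, 12))
print(pdftoppm_output_name("page", 3, 150))
```

If the padding rule is uncertain for a given Poppler version, globbing for `prefix-*.png` after rendering avoids depending on it.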

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: douglasvought/wiggle-skills
Commit: b27906e

Reviewed: 1 day ago
