
pdf-reading

Use this skill when you need to read, inspect, or extract content from PDF files — especially when file content is NOT in your context and you need to read it from disk. Covers content inventory, text extraction, page rasterization for visual inspection, embedded image/attachment/table/form-field extraction, and choosing the right reading strategy for different document types (text-heavy, scanned, slide-decks, forms, data-heavy). Do NOT use this skill for PDF creation, form filling, merging, splitting, watermarking, or encryption — use the pdf skill instead.

90

Quality

88%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines its scope (reading and extracting from PDFs), provides rich trigger terms, and explicitly delineates boundaries with a related skill. The inclusion of specific document types (scanned, slide-decks, forms, data-heavy) and the negative boundary ('Do NOT use this skill for...') make it highly effective for skill selection. The only minor note is the use of second-person 'you' in the opening, but since it's addressed to Claude (the agent selecting the skill) rather than the user, this is contextually appropriate.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: content inventory, text extraction, page rasterization for visual inspection, embedded image/attachment/table/form-field extraction, and choosing reading strategies for different document types (text-heavy, scanned, slide-decks, forms, data-heavy). | 3 / 3 |
| Completeness | Clearly answers both 'what' (content inventory, text extraction, page rasterization, embedded content extraction, reading strategy selection) and 'when' ('when you need to read, inspect, or extract content from PDF files', 'especially when file content is NOT in your context and you need to read it from disk'). Also explicitly states when NOT to use it, which further clarifies scope. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'read', 'inspect', 'extract content', 'PDF files', 'text extraction', 'scanned', 'forms', 'tables', 'images', 'attachments'. These cover many natural ways a user would phrase requests about reading PDFs. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive: explicitly differentiates itself from a sibling 'pdf skill' for creation/modification tasks. The focus on reading/extraction vs. creation/manipulation creates a clear boundary, and the 'Do NOT use this skill for...' clause directly addresses potential conflicts. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill that provides comprehensive coverage of PDF reading operations with executable code and CLI commands throughout. Its main weakness is length — the body content is thorough but could benefit from moving some detailed sections (image extraction with PyMuPDF, rare media, font diagnostics) to REFERENCE.md to keep the main skill leaner. The decision-tree approach for choosing reading strategies and the token cost awareness section are particularly valuable additions.
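The diagnostic-first routing praised above (content inventory, then strategy choice, then extraction) can be pictured with a small sketch. The skill's actual decision tree is not reproduced on this page, so everything here is an assumption made for illustration: the `Inventory` fields, the `choose_strategy` function, the strategy names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical summary produced by a content-inventory pass."""
    pages: int
    text_chars: int       # total characters in the extractable text layer
    images: int           # number of embedded raster images
    form_fields: int      # number of interactive form fields
    tables: int = 0       # tables detected, however the caller counts them


def choose_strategy(inv: Inventory) -> str:
    """Route a PDF to a reading strategy.

    The thresholds (50 and 500 characters per page) are illustrative
    assumptions, not values taken from the skill.
    """
    if inv.pages == 0:
        return "empty"
    chars_per_page = inv.text_chars / inv.pages
    if inv.form_fields > 0:
        return "extract-form-fields"      # forms
    if chars_per_page < 50:
        return "rasterize-and-ocr"        # almost no text layer: likely scanned
    if inv.tables > 0:
        return "extract-tables"           # data-heavy
    if chars_per_page < 500 and inv.images >= inv.pages:
        return "rasterize-pages"          # sparse text, many images: slide deck
    return "extract-text"                 # text-heavy default


if __name__ == "__main__":
    report = Inventory(pages=10, text_chars=30_000, images=0, form_fields=0)
    print(choose_strategy(report))
```

Running the inventory first keeps the expensive step (rasterizing pages for visual inspection) for documents where plain text extraction would lose information.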

Suggestions

Move detailed/edge-case sections (PyMuPDF image extraction, rare embedded media, font diagnostics) to REFERENCE.md and add brief pointers from the main skill to reduce body length by ~30%.

Trim explanatory context that Claude already knows (e.g., 'PDFs can contain embedded files — spreadsheets, data files, other documents' and the paragraph explaining two attachment mechanisms) to improve conciseness.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is generally well-written but includes some unnecessary explanations that Claude would already know (e.g., explaining what PDF attachments are, what vector graphics are, what CMYK is). The 'Two attachment mechanisms exist' paragraph and some contextual explanations could be trimmed. However, most content is practical and earns its place. | 2 / 3 |
| Actionability | Excellent actionability throughout: every section provides fully executable code snippets and CLI commands that are copy-paste ready. The content covers multiple tools with concrete examples for each use case, including specific flags, output handling, and edge cases like the pdftoppm filename padding gotcha. | 3 / 3 |
| Workflow Clarity | The skill provides a clear diagnostic-first workflow (content inventory → choose strategy → extract), with explicit decision trees for when to rasterize vs. text-extract. The 'Choosing your reading strategy' section acts as a clear routing guide. Validation is addressed through the content inventory step and font diagnostics for troubleshooting garbled output. | 3 / 3 |
| Progressive Disclosure | The skill references REFERENCE.md for advanced features (pypdfium2, OCR fallback, encrypted PDFs) and correctly points to the pdf skill for non-reading operations. However, the main file is quite long (~200+ lines of detailed content) and some sections like embedded images extraction with PyMuPDF, rare media content, and font diagnostics could be moved to the reference file. The quick reference table at the end is a nice touch but the body could be leaner with more content offloaded. | 2 / 3 |
| Total | | 10 / 12 |

Passed
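
The Actionability row mentions a pdftoppm filename padding gotcha without spelling it out. Assuming it refers to pdftoppm zero-padding the page number in output filenames to a width that varies with the document's page count (so page 1 may appear as `out-1.png`, `out-01.png`, or `out-001.png`), one defensive approach is to discover the files rather than guess their names. The helpers `page_number` and `page_images` below are hypothetical and not taken from the skill.

```python
import re
from pathlib import Path
from typing import List, Optional


def page_number(filename: str, prefix: str, ext: str = "png") -> Optional[int]:
    """Return the page number encoded in `prefix-<digits>.<ext>`, else None.

    Accepts any amount of zero padding, so the caller does not need to
    know how many digits the rasterizer used.
    """
    match = re.fullmatch(
        rf"{re.escape(prefix)}-(\d+)\.{re.escape(ext)}", filename
    )
    return int(match.group(1)) if match else None


def page_images(out_dir: str, prefix: str, ext: str = "png") -> List[Path]:
    """List rasterized page files in `out_dir`, ordered by page number."""
    numbered = []
    for path in Path(out_dir).iterdir():
        number = page_number(path.name, prefix, ext)
        if number is not None:
            numbered.append((number, path))
    # Sort numerically so that page 10 comes after page 2 even when unpadded.
    return [path for _, path in sorted(numbered, key=lambda item: item[0])]
```

Sorting on the parsed integer rather than the filename string avoids the related problem of `out-10.png` sorting before `out-2.png` when no padding is present.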

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure

No warnings or errors.

Repository
douglasvought/wiggle-skills
Reviewed
