extract-from-pdfs

This skill should be used when extracting structured data from scientific PDFs for systematic reviews, meta-analyses, or database creation. Use when working with collections of research papers that need to be converted into analyzable datasets with validation metrics.

90

Quality: 87% (Does it follow best practices?)

Impact: Pending (no eval scenarios have been run)

Security (by Snyk): Advisory (suggest reviewing before use)

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description with clear 'when' triggers and a well-defined niche in scientific literature data extraction. Its main weakness is that the 'what' portion is somewhat abstract—it describes the goal (extracting structured data, creating datasets) without listing the specific concrete actions or capabilities involved (e.g., parsing tables, extracting sample sizes, coding study characteristics). The trigger terms are strong and domain-appropriate.

Suggestions

Add specific concrete actions to improve specificity, e.g., 'Extracts study characteristics, sample sizes, effect sizes, and outcome measures from scientific PDFs; parses tables and figures; generates structured datasets with validation metrics.'

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain (scientific PDFs, systematic reviews) and mentions some actions (extracting structured data, converting to analyzable datasets), but doesn't list multiple specific concrete actions like parsing tables, extracting methodology sections, coding effect sizes, or identifying study characteristics. | 2 / 3 |
| Completeness | Clearly answers both what ('extracting structured data from scientific PDFs... converted into analyzable datasets with validation metrics') and when ('systematic reviews, meta-analyses, or database creation... working with collections of research papers'). Has explicit 'Use when' clause with trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'scientific PDFs', 'systematic reviews', 'meta-analyses', 'database creation', 'research papers', 'analyzable datasets', 'structured data', 'validation metrics'. These are terms researchers would naturally use when requesting this kind of work. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche combining scientific PDFs, systematic reviews, meta-analyses, and validation metrics. Unlikely to conflict with a general PDF extraction skill or a generic data processing skill due to the specific academic research context. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill that provides a complete executable pipeline with clear sequencing and validation checkpoints. Its main weakness is moderate verbosity—some sections duplicate information (Available Scripts repeats the pipeline), and explanations of basic statistical concepts (precision, recall, F1) are unnecessary for Claude. Overall it's a strong skill that effectively balances overview content with progressive disclosure to detailed references.

Suggestions

Remove the 'Available Scripts' section since it duplicates the pipeline commands already shown in 'Workflow Execution', or consolidate them.

Remove explanations of precision, recall, and F1 score—Claude already knows these concepts. Simply state that validation produces per-field and overall metrics.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is reasonably efficient but includes some sections that could be tightened, e.g., the 'When to Use This Skill' section, the 'Available Scripts' section that largely duplicates the pipeline commands already shown, and the 'Quality Assurance' section which explains basic concepts like precision/recall that Claude already knows. The cost optimization section adds value but is somewhat verbose. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste-ready bash commands for every pipeline step with specific flags and arguments. Schema customization, API configuration, and filtering customization all include concrete file paths and actionable instructions. The complete pipeline section is exemplary. | 3 / 3 |
| Workflow Clarity | The 6-step pipeline (plus 3 validation steps) is clearly sequenced with numbered steps, explicit commands, and a validation/repair feedback loop (Steps 4 and 7-9). The iterative improvement section explicitly describes a validate-fix-retry cycle. The workflow handles the destructive/batch nature of extraction with JSON repair and validation checkpoints. | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview with well-signaled one-level-deep references to setup_guide.md, workflow_guide.md, validation_guide.md, and api_reference.md. Asset files and example schemas are clearly listed. The main content stays at the right level of detail while pointing to comprehensive guides for deeper information. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: brunoasm/my_claude_skills
Reviewed
