pdf-brain-ingest

Ingest PDF/Markdown/TXT files into joelclaw's docs memory pipeline with Inngest durability, durable NAS artifacts, and OTEL verification. Use when adding docs, running batch reindex, reconciling coverage, or recovering stuck runs. Triggers on: 'ingest pdf', 'ingest markdown', 'docs add', 'pdf-brain ingest', 'backfill books', 'docs reconcile', 'reindex docs', 'batch reindex'.

Quality

82%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with concrete CLI commands for every workflow step and a clear pipeline architecture. Its main weaknesses are the lack of explicit validation checkpoints between pipeline stages (important for a batch/destructive operation) and the monolithic structure that inlines reference-level detail (embedding models, chunking strategy, event tables) that would be better served by separate files. Some sections include rationale and benchmarks that don't help Claude execute the task.

Suggestions

Add explicit validation checkpoints between pipeline stages (e.g., 'Verify {docId}.md exists and is non-empty before proceeding to CLASSIFY') and after batch operations (e.g., 'Run joelclaw docs reconcile to confirm all books processed').

Move reference-level content (Inngest event table, chunking strategy details, embedding model comparison, extraction details) into separate bundle files and link to them from the main SKILL.md.

Remove explanatory rationale that doesn't aid execution, such as benchmark scores ('0.90 accuracy'), arxiv findings ('45% higher precision'), and speed comparisons ('~150x faster than Typesense CPU auto-embed').
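
The checkpoint proposed in the first suggestion could look roughly like the sketch below. It is only an illustration: the `{docId}.md` naming comes from the suggestion itself, while the function names `artifact_ready` and `require_artifact`, and the idea of passing the artifact directory as an argument, are assumptions and are not taken from the skill.

```python
from pathlib import Path


def artifact_ready(artifact_dir: Path, doc_id: str) -> bool:
    """Return True if {doc_id}.md exists as a regular file and is non-empty."""
    artifact = artifact_dir / f"{doc_id}.md"
    return artifact.is_file() and artifact.stat().st_size > 0


def require_artifact(artifact_dir: Path, doc_id: str) -> Path:
    """Hypothetical gate: raise before the next stage if the artifact is unusable."""
    if not artifact_ready(artifact_dir, doc_id):
        raise FileNotFoundError(f"{doc_id}.md missing or empty in {artifact_dir}")
    return artifact_dir / f"{doc_id}.md"
```

A gate like this would run once per document between stages, so a failed extraction stops the run before any later stage touches the index.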

Dimension	Reasoning	Score
Conciseness	The skill is fairly detailed and mostly earns its tokens with concrete commands and architecture details, but includes some unnecessary explanation (e.g., embedding model comparisons, chunking strategy rationale like 'arxiv R100-0 finding: 45% higher precision', extraction benchmark scores) that Claude doesn't need to execute the workflow. The architecture diagram and artifact descriptions are useful but could be tighter.	2 / 3
Actionability	Excellent actionability throughout — every workflow step has concrete, copy-paste-ready CLI commands with flags and arguments. File paths, event names, and specific tool invocations are all explicit and executable.	3 / 3
Workflow Clarity	The workflow is clearly sequenced with numbered steps from preflight through recovery, and the recovery section addresses error handling. However, there are no explicit validation checkpoints between pipeline stages — for a multi-stage destructive/batch operation involving NAS artifacts and database upserts, the skill should include explicit 'verify before proceeding' steps (e.g., validate artifacts exist after Stage 1 before Stage 2, confirm chunk counts before indexing).	2 / 3
Progressive Disclosure	The content is well-structured with clear sections and headers, but it's a monolithic document (~150 lines) that inlines detailed reference material (chunking strategy, embedding model comparisons, Inngest event table, acquisition pipeline) that could be split into separate reference files. No bundle files are provided, so there's no progressive disclosure to external references despite the content length warranting it.	2 / 3
	Total	9 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly communicates specific capabilities, provides explicit trigger guidance with a comprehensive list of natural trigger terms, and occupies a distinct niche. The description is concise yet thorough, covering the what, when, and how-to-trigger aspects effectively. A minor concern is that terms like 'OTEL verification' and 'Inngest durability' are technical jargon that users are unlikely to type as triggers, but they do help distinguish this skill from others.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: ingesting PDF/Markdown/TXT files, batch reindex, reconciling coverage, recovering stuck runs. Also names specific technologies (Inngest, NAS, OTEL).	3 / 3
Completeness	Clearly answers both 'what' (ingest files into docs memory pipeline with durability and verification) and 'when' (explicit 'Use when...' clause plus a 'Triggers on:' list with specific phrases).	3 / 3
Trigger Term Quality	Provides excellent coverage of natural trigger terms including 'ingest pdf', 'ingest markdown', 'docs add', 'backfill books', 'docs reconcile', 'reindex docs', 'batch reindex'. These cover multiple variations users would naturally say.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive with a clear niche: it's specifically about joelclaw's docs memory pipeline with Inngest durability and OTEL verification. The specific trigger terms like 'pdf-brain ingest' and 'docs reconcile' are unlikely to conflict with generic document processing skills.	3 / 3
	Total	12 / 12 Passed

Validation

90%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	10 / 11 Passed
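
As a rough illustration of what a check like `frontmatter_unknown_keys` might do, the sketch below lists top-level frontmatter keys and reports any that fall outside an allowed set. The allowed set is passed in by the caller because the review does not say which keys the spec permits. The function names and the deliberately simple line-based parsing (no YAML library) are assumptions made for this example, not the validator's actual implementation.

```python
def frontmatter_keys(text: str) -> list[str]:
    """Collect top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Top-level keys start at column 0; skip nested values, list items, comments.
        if not line or line[0].isspace() or line.startswith(("-", "#")):
            continue
        if ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(text: str, allowed: set[str]) -> list[str]:
    """Return frontmatter keys that are not in the caller-supplied allowed set."""
    return [key for key in frontmatter_keys(text) if key not in allowed]
```

Running `unknown_keys` over a SKILL.md with whatever key set the spec actually allows would name the offending key or keys, which the warning above does not.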

Repository: joelhooks/joelclaw
Commit: 2ca3686

Reviewed: 10 days ago
