read-bin-docs

Straightforward text extraction from document files (text-based PDF only for now, no OCR or docx). Use when you just need to read/extract text from binary documents.

Doc Formats

Quick Start: Extract Text from PDF

Need to extract text from a PDF? Use this Python snippet:

from pypdf import PdfReader

reader = PdfReader("document.pdf")
# Join pages with a newline so text from adjacent pages does not run together
text = "\n".join(page.extract_text() for page in reader.pages)
print(text)

Or from the command line:

uvx --with pypdf python /path/to/extract_pdf_text.py document.pdf

PDF Text Extraction

Basic Usage

from pypdf import PdfReader

# Read all pages
reader = PdfReader("file.pdf")
for page in reader.pages:
    text = page.extract_text()
    print(text)

Extract Specific Pages

from pypdf import PdfReader

reader = PdfReader("file.pdf")
# Get the first five pages (pages 1-5); slice indices are 0-based
for page in reader.pages[0:5]:
    print(page.extract_text())

Using the Script

This skill includes scripts/extract_pdf_text.py for command-line extraction:

# Extract all pages to stdout
python extract_pdf_text.py document.pdf

# Extract to file
python extract_pdf_text.py document.pdf --output text.txt

# Extract specific pages
python extract_pdf_text.py document.pdf --pages 1-5
python extract_pdf_text.py document.pdf --pages 1,3,5
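
The --pages values above are 1-based, while reader.pages is indexed from 0. As a minimal sketch of how such a spec could be turned into indices, the helper below handles the two forms shown ("1-5" and "1,3,5"). The name parse_page_spec and its error handling are hypothetical and for illustration only; this is not necessarily how extract_pdf_text.py parses the option.

```python
def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "1-5" or "1,3,5" into 0-based page indices."""
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"page range {part!r} is outside 1-{page_count}")
        # Shift from 1-based page numbers to 0-based indices
        indices.extend(range(start - 1, end))
    return indices


print(parse_page_spec("1-5", page_count=10))    # [0, 1, 2, 3, 4]
print(parse_page_spec("1,3,5", page_count=10))  # [0, 2, 4]
```

The resulting indices can then be used directly, for example reader.pages[i].extract_text() for each i in the list.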

Requirements

  • pypdf, which uvx can supply on the fly: uvx --with pypdf python <script>
  • Works with most text-based PDFs
  • Scanned (image-only) PDFs yield no text, because no OCR is performed

Common Issues

"No text extracted": The PDF may be scanned (image-based) without OCR. OCR support requires additional tools.
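
One way to tell whether a PDF is affected is to check which pages come back empty. The sketch below is illustrative only: pages_without_text and report_scanned_pages are hypothetical helper names, not part of this skill or of pypdf, and an empty page is only a hint that it may be scanned (it could also simply be blank).

```python
def pages_without_text(page_texts):
    """Return 1-based numbers of pages whose extracted text is empty or whitespace."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]


def report_scanned_pages(pdf_path):
    """Print which pages of pdf_path produced no text (requires pypdf)."""
    # Imported here so pages_without_text stays usable without pypdf installed
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    empty = pages_without_text(page.extract_text() for page in reader.pages)
    if empty:
        print(f"No text on pages {empty}: possibly scanned, OCR would be needed")
    else:
        print("Every page produced some text")
```

Calling report_scanned_pages("document.pdf") before a full extraction gives an early warning that the output will be incomplete.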

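"Encoding errors": pypdf handles most encodings, but some PDFs embed fonts or character maps that it cannot decode cleanly. If the output is garbled or the spacing is wrong, try layout-aware extraction with page.extract_text(extraction_mode="layout"), which is available in recent pypdf releases.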
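
In pypdf, layout-aware extraction is requested with the extraction_mode="layout" keyword argument of extract_text. A minimal sketch of trying it with a fallback follows; extract_with_layout is a hypothetical helper name, and the fallback assumes that a pypdf version without layout mode rejects the keyword with a TypeError.

```python
def extract_with_layout(page):
    """Try pypdf's layout mode first, then fall back to plain extraction."""
    try:
        return page.extract_text(extraction_mode="layout")
    except TypeError:
        # Assumption: this pypdf version does not accept extraction_mode,
        # so use the default (plain) extraction instead
        return page.extract_text()
```

With a reader opened as in the earlier examples, this would be used as extract_with_layout(reader.pages[0]). Layout mode mainly affects spacing and column alignment, so it may not help when the PDF's character mapping itself is broken.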


Future: Support for DOCX, XLSX, and other formats coming soon.

Repository
YPares/agent-skills