Extract text from images using Tesseract OCR

Extract text from images using the Tesseract OCR engine.

python3 scripts/ocr.py <image_file> <output_file>

# Specify language (default: eng)
python3 scripts/ocr.py image.png text.txt --lang eng
# Chinese text
python3 scripts/ocr.py image.png text.txt --lang chi_sim
# Multiple languages
python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim
# With image preprocessing (improves accuracy)
python3 scripts/ocr.py image.png text.txt --preprocess
# JSON output with confidence scores
python3 scripts/ocr.py image.png output.json --format json

# OCR from remote image
python3 scripts/ocr_url.py <image_url> <output_file>
# With options
python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess

- `image_file` / `image_url` (required): Path to a local image, or an image URL
- `output_file` (required): Path to the output text or JSON file
- `--lang`: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng
- `--preprocess`: Apply image preprocessing (grayscale, thresholding) for better accuracy
- `--format`: Output format (text or json). Default: text

| Language | Code |
|---|---|
| English | eng |
| Chinese (Simplified) | chi_sim |
| Chinese (Traditional) | chi_tra |
| Japanese | jpn |
| Korean | kor |
| French | fra |
| German | deu |
| Spanish | spa |
| Russian | rus |
| Arabic | ara |
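
The codes in the table can be joined with `+` for multi-language recognition, as in `--lang eng+chi_sim`. A minimal sketch of checking such a value before passing it to Tesseract follows. `KNOWN_CODES` and `normalize_lang` are hypothetical names, not part of `scripts/ocr.py`, and the set only mirrors the table above; Tesseract itself accepts any language pack that is installed.

```python
# Hypothetical helper, not part of scripts/ocr.py.
# Codes copied from the table above; Tesseract accepts any installed pack.
KNOWN_CODES = {
    "eng", "chi_sim", "chi_tra", "jpn", "kor",
    "fra", "deu", "spa", "rus", "ara",
}


def normalize_lang(value: str) -> str:
    """Validate a --lang value such as 'eng+chi_sim' and return it cleaned up."""
    codes = [part.strip() for part in value.split("+") if part.strip()]
    if not codes:
        raise ValueError("no language code given")
    unknown = [code for code in codes if code not in KNOWN_CODES]
    if unknown:
        raise ValueError(f"unknown language code(s): {', '.join(unknown)}")
    # Drop duplicates while keeping the caller's order.
    return "+".join(dict.fromkeys(codes))


if __name__ == "__main__":
    print(normalize_lang("eng + chi_sim"))
```

Rejecting unknown codes early gives a clearer error than a failure inside the OCR engine, at the cost of having to extend the set whenever another language pack is installed.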

Supported image formats: PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP

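
The documented options and the format list above can be expressed as a small command-line front end. The sketch below is an assumption about how such a script might parse its arguments, not the actual contents of `scripts/ocr.py`; `SUPPORTED_EXTENSIONS`, `is_supported_image`, and `build_parser` are hypothetical names, and checking the file extension is a simplification (the real script may inspect the file contents instead).

```python
import argparse
from pathlib import Path

# Extensions taken from the format list above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def is_supported_image(path: str) -> bool:
    """Return True when the file extension is in the list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Extract text from an image")
    parser.add_argument("image_file", help="path to a local image")
    parser.add_argument("output_file", help="path to the output text/JSON file")
    parser.add_argument("--lang", default="eng", help="Tesseract language code")
    parser.add_argument("--preprocess", action="store_true",
                        help="apply grayscale and thresholding first")
    parser.add_argument("--format", choices=["text", "json"], default="text",
                        help="output format")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["image.png", "output.json", "--lang", "eng+chi_sim", "--format", "json"]
    )
    if not is_supported_image(args.image_file):
        raise SystemExit(f"unsupported image type: {args.image_file}")
    print(args.lang, args.preprocess, args.format)
```

Using `choices` for `--format` lets `argparse` reject anything other than `text` or `json` with a usage message, so the OCR step never runs with an invalid output format.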
# Python packages
pip install pytesseract Pillow
# Tesseract OCR engine
sudo apt-get install tesseract-ocr # Ubuntu/Debian
sudo yum install tesseract # CentOS/RHEL
brew install tesseract # macOS
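
Before running the scripts, it can help to confirm that all three prerequisites are reachable. The following is a small sketch using only the standard library; `check_dependencies` is a hypothetical helper and not something the skill ships.

```python
import importlib.util
import shutil


def check_dependencies() -> dict:
    """Report which of the prerequisites listed above can be found."""
    return {
        # The Tesseract engine is an external binary looked up on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        # pytesseract and Pillow are Python packages; Pillow imports as "PIL".
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

The check only establishes that the binary and packages exist. It does not verify that a given language pack such as `chi_sim` is installed, which is handled separately by the system package manager.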