Transcribe audio files to text using OpenAI Whisper
python3 scripts/transcribe.py <audio_file> <output_file>

# Specify model size (default: base)
python3 scripts/transcribe.py audio.mp3 transcript.txt --model medium
# Specify language (improves accuracy)
python3 scripts/transcribe.py audio.mp3 transcript.txt --language zh
# Include timestamps
python3 scripts/transcribe.py audio.mp3 transcript.txt --timestamps
# JSON output with metadata
python3 scripts/transcribe.py audio.mp3 output.json --format json

- `audio_file` (required): Path to input audio file
- `output_file` (required): Path to output text/JSON file
- `--model`: Whisper model size (tiny/base/small/medium/large, default: base)
- `--language`: Language code (e.g., en, zh, es, fr, auto for detection)
- `--timestamps`: Include word-level timestamps in output
- `--format`: Output format (text/json, default: text)

| Model | Parameters | Speed | Accuracy | Memory |
|---|---|---|---|---|
| tiny | 39M | ~32x | Good | ~1GB |
| base | 74M | ~16x | Better | ~1GB |
| small | 244M | ~6x | Great | ~2GB |
| medium | 769M | ~2x | Excellent | ~5GB |
| large | 1.5B | 1x | Best | ~10GB |
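
The parameters and the model table above could be wired together roughly as follows. This is a minimal sketch, not the actual contents of `scripts/transcribe.py`: the names `build_parser`, `largest_model_that_fits`, and `MODEL_MEMORY_GB` are hypothetical, and the `auto` default for `--language` is an assumption, since the documentation does not state one.

```python
import argparse
from typing import Optional

# Approximate memory needs in GB, copied from the model table above.
MODEL_MEMORY_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(
        description="Transcribe audio files to text using OpenAI Whisper"
    )
    parser.add_argument("audio_file", help="Path to input audio file")
    parser.add_argument("output_file", help="Path to output text/JSON file")
    parser.add_argument("--model", choices=list(MODEL_MEMORY_GB), default="base",
                        help="Whisper model size")
    # Assumption: the documentation does not state a default for --language.
    parser.add_argument("--language", default="auto",
                        help="Language code, or 'auto' for detection")
    parser.add_argument("--timestamps", action="store_true",
                        help="Include timestamps in output")
    parser.add_argument("--format", choices=["text", "json"], default="text",
                        help="Output format")
    return parser


def largest_model_that_fits(available_gb: float) -> Optional[str]:
    """Return the largest model whose listed memory need fits the budget."""
    fitting = [name for name, gb in MODEL_MEMORY_GB.items() if gb <= available_gb]
    # The dict is ordered from smallest to largest, so the last match is the largest.
    return fitting[-1] if fitting else None
```

For instance, `largest_model_that_fits(4)` selects `small`, because `medium` is listed at roughly 5GB. The memory figures are the approximate values from the table, so treat the result as a starting point rather than a guarantee.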
Supported formats: MP3, WAV, M4A, FLAC, OGG, AAC, WMA, and any other format FFmpeg can decode.
pip install openai-whisper
sudo apt-get install ffmpeg # Ubuntu/Debian
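
Once both dependencies are installed, transcription through the `openai-whisper` Python package might look like the sketch below. It is an illustration under stated assumptions, not the skill's actual script: `missing_dependencies`, `format_timestamp`, `render`, and `transcribe` are hypothetical helper names, and the timestamped output here uses the segment-level `start`/`end` values from the result rather than word-level timing.

```python
import importlib.util
import json
import shutil


def missing_dependencies() -> list:
    """Return the names of required dependencies that cannot be found."""
    missing = []
    # `pip install openai-whisper` provides the importable module `whisper`.
    if importlib.util.find_spec("whisper") is None:
        missing.append("openai-whisper")
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def render(result: dict, fmt: str = "text", timestamps: bool = False) -> str:
    """Turn a Whisper result dict into text or JSON (hypothetical output layout)."""
    if fmt == "json":
        return json.dumps(result, ensure_ascii=False, indent=2)
    if timestamps:
        return "\n".join(
            f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
            f"{s['text'].strip()}"
            for s in result["segments"]
        )
    return result["text"].strip()


def transcribe(audio_file: str, model_name: str = "base", language=None) -> dict:
    """Run Whisper on one file; language=None lets Whisper detect the language."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    return model.transcribe(audio_file, language=language)
```

Calling `missing_dependencies()` before `transcribe(...)` gives a clearer error than a failed import or a missing `ffmpeg` binary partway through a run.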