Transcribe audio to text using ElevenLabs Scribe v2. Use when converting audio/video to text, generating subtitles, transcribing meetings, or processing spoken content.
Overall score: 95
Does it follow best practices? 88%
Impact: 1.88x
Average score across 11 eval scenarios: 96%
Status: Passed; no known issues
Speaker diarization and keyterm prompting

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Enables diarization | 100% | 100% |
| Uses keyterms | 100% | 100% |
| Uses `scribe_v2` model | 0% | 100% |
| Groups output by speaker | 100% | 100% |
| Accesses `word.speaker_id` | 100% | 100% |
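The grouping the criteria above check for can be sketched in a few lines. This is a minimal illustration, assuming each word in the transcription response carries a `speaker_id` field; the sample data is illustrative, not real API output.

```python
# Sample words shaped like a diarized transcription response (assumed shape).
words = [
    {"text": "Hello", "speaker_id": "speaker_0"},
    {"text": "there", "speaker_id": "speaker_0"},
    {"text": "Hi", "speaker_id": "speaker_1"},
]

def group_by_speaker(words):
    """Collapse consecutive words from the same speaker into turns."""
    turns = []
    for word in words:
        if turns and turns[-1]["speaker_id"] == word["speaker_id"]:
            turns[-1]["text"] += " " + word["text"]
        else:
            turns.append({"speaker_id": word["speaker_id"], "text": word["text"]})
    return turns

turns = group_by_speaker(words)
```

Grouping by consecutive runs (rather than one bucket per speaker) preserves conversational order, which is what a readable transcript needs.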
Word timestamps and type classification

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Word-level timestamps | 100% | 100% |
| Uses `word.start` and `word.end` | 100% | 100% |
| Filters by word type | 100% | 100% |
| Enables audio event tagging | 100% | 100% |
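A sketch of the filtering and timing reads the criteria above describe, assuming each entry has `type`, `start`, and `end` fields and that non-speech entries (spacing, tagged audio events) appear alongside spoken words. The type values and sample data are assumptions for illustration.

```python
# Assumed response shape: word entries tagged with a type plus timestamps.
words = [
    {"text": "Hello", "type": "word", "start": 0.0, "end": 0.4},
    {"text": " ", "type": "spacing", "start": 0.4, "end": 0.5},
    {"text": "(laughter)", "type": "audio_event", "start": 0.5, "end": 1.2},
    {"text": "world", "type": "word", "start": 1.2, "end": 1.6},
]

# Keep only spoken words, dropping spacing and audio-event entries.
spoken = [w for w in words if w["type"] == "word"]
# Span of actual speech, from per-word start/end timestamps.
duration = spoken[-1]["end"] - spoken[0]["start"]
```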
React useScribe hook and token auth

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Uses `useScribe` hook | 0% | 100% |
| Token-based auth | 20% | 100% |
| Handles both transcript types | 60% | 100% |
| Microphone config | 100% | 100% |
Server-side real-time with manual commit and PCM format

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Manual commit strategy | 100% | 100% |
| PCM 16-bit format | 100% | 100% |
| Correct chunk size | 0% | 100% |
| Base64 encoding | 100% | 100% |
| Explicit commit call | 100% | 100% |
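The chunking and encoding steps above can be sketched independently of the API. This is a minimal helper, assuming raw 16-bit PCM bytes are split into fixed-size chunks and base64-encoded for a JSON websocket payload; the correct chunk size is not stated here, so it is left as a caller-supplied parameter.

```python
import base64

def pcm_chunks_b64(pcm: bytes, chunk_size: int):
    """Split raw 16-bit PCM bytes into fixed-size chunks and base64-encode
    each chunk. chunk_size is in bytes and should match whatever the
    realtime endpoint expects (an assumption; not specified here)."""
    for i in range(0, len(pcm), chunk_size):
        yield base64.b64encode(pcm[i:i + chunk_size]).decode("ascii")

# 16-bit samples are 2 bytes each, so 1600 samples -> 3200 bytes.
pcm = b"\x00\x01" * 1600
chunks = list(pcm_chunks_b64(pcm, 3200))
```

Base64 round-trips cleanly, so the server reconstructs the exact PCM bytes; an explicit commit message would then be sent once a segment's chunks are flushed.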
VAD configuration and transcript types

| Criterion | Baseline | With skill |
| --- | --- | --- |
| VAD enabled | 0% | 100% |
| VAD threshold tuning | 0% | 0% |
| Partial for display | 60% | 100% |
| Committed as source of truth | 60% | 100% |
| Uses `scribe_v2_realtime` | 0% | 100% |
Entity detection and language hint

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Entity detection enabled | 0% | 100% |
| Accesses entities in response | 100% | 100% |
| Language code provided | 100% | 100% |
| Uses `scribe_v2` model | 0% | 100% |
| Reads language probability | 100% | 100% |
| Reads detected language code | 100% | 100% |
| Correct JS package | 100% | 100% |
| Entity output included | 100% | 100% |
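Reading the fields the criteria above name can be sketched against a sample payload. The `entities` list, its entry shape, and the language fields are assumptions based on the criteria; the data below is illustrative, not real API output.

```python
# Assumed response shape for a transcription with entity detection enabled.
response = {
    "text": "Alice flew to Paris on Monday.",
    "language_code": "en",
    "language_probability": 0.98,
    "entities": [
        {"type": "person", "text": "Alice"},
        {"type": "location", "text": "Paris"},
    ],
}

# Bucket detected entities by type for downstream use.
by_type = {}
for entity in response.get("entities", []):
    by_type.setdefault(entity["type"], []).append(entity["text"])

# Detected language plus the model's confidence in that detection.
lang = (response["language_code"], response["language_probability"])
```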
Export formats and request cost tracking

| Criterion | Baseline | With skill |
| --- | --- | --- |
| `additional_formats` used | 0% | 100% |
| SRT format requested | 0% | 100% |
| `with_raw_response` used | 0% | 100% |
| request-id captured | 0% | 100% |
| `response.parse()` called | 0% | 100% |
| Uses `scribe_v2` model | 0% | 100% |
| Request ID in output | 50% | 100% |
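The SRT export the scenario above requests follows a simple textual format. As an illustration of what each cue in that export contains (not a replacement for requesting it from the API), a minimal SRT timestamp and cue builder:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """One numbered SRT cue: index, time range, then the caption text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

cue = srt_cue(1, 0.0, 1.25, "Hello world")
```

Note the SRT convention of a comma (not a dot) before the milliseconds field.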
Real-time status handling and session continuity

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Status checks both states | 0% | 40% |
| VAD commit strategy | 0% | 100% |
| `CommitStrategy` imported from `@elevenlabs/react` | 0% | 100% |
| `previous_text` field used | 0% | 53% |
| `previous_text` length constraint | 0% | 100% |
| Uses `scribe_v2_realtime` model | 0% | 100% |
| Token-based auth | 100% | 100% |
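The `previous_text` length constraint above amounts to passing only the tail of the running transcript as context for the next session. A minimal sketch of that trimming; the actual length limit is not stated here, so `max_chars` is a caller-chosen assumption, and the cut is made at a word boundary so the context stays readable.

```python
def previous_text_tail(transcript: str, max_chars: int) -> str:
    """Return the tail of the transcript, at most max_chars long,
    trimmed forward to a word boundary."""
    if len(transcript) <= max_chars:
        return transcript
    tail = transcript[-max_chars:]
    # Drop the first (possibly clipped) word fragment.
    _, _, rest = tail.partition(" ")
    return rest or tail

ctx = previous_text_tail("one two three four five", 10)
```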
Character-level timestamps and diarization tuning

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Character-level timestamps | 0% | 100% |
| Diarization enabled | 0% | 100% |
| `num_speakers` set to 3 | 0% | 100% |
| Uses `scribe_v2` model | 0% | 100% |
| `subtitle_data.json` written | 100% | 100% |
| Speaker ID in output | 100% | 100% |
| Correct Python package | 0% | 100% |
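The `subtitle_data.json` step above can be sketched as a flattening pass. The per-word `characters` list and its entry shape are assumptions based on the criteria; the sample data is illustrative, not real API output.

```python
import json

# Assumed shape: each word carries a speaker_id and per-character timing.
words = [
    {"text": "Hi", "speaker_id": "speaker_0", "characters": [
        {"text": "H", "start": 0.00, "end": 0.05},
        {"text": "i", "start": 0.05, "end": 0.10},
    ]},
]

# Flatten into the structure written out as subtitle_data.json.
subtitle_data = [
    {
        "speaker": w["speaker_id"],
        "word": w["text"],
        "chars": [[c["text"], c["start"], c["end"]] for c in w["characters"]],
    }
    for w in words
]
payload = json.dumps(subtitle_data)
```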
Multichannel audio and cloud storage URL transcription

| Criterion | Baseline | With skill |
| --- | --- | --- |
| Uses `cloud_storage_url` | 100% | 100% |
| Enables multichannel | 100% | 100% |
| Uses `scribe_v2` model | 100% | 100% |
| Correct Python package | 100% | 100% |
| Per-channel output | 100% | 86% |
| JSON file written | 100% | 100% |
| No local file download | 100% | 100% |
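The per-channel output above can be sketched as a grouping pass, assuming each word in a multichannel transcription carries a channel index (an assumption implied by the criteria; the sample data is illustrative).

```python
# Assumed shape: words tagged with the channel they were spoken on.
words = [
    {"text": "Hello", "channel_index": 0},
    {"text": "Hi", "channel_index": 1},
    {"text": "there", "channel_index": 0},
]

# Rebuild one transcript string per channel, preserving word order.
channels = {}
for w in words:
    channels.setdefault(w["channel_index"], []).append(w["text"])
per_channel = {ch: " ".join(ws) for ch, ws in channels.items()}
```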
Clean verbatim-free transcription and deterministic seeding

| Criterion | Baseline | With skill |
| --- | --- | --- |
| `no_verbatim` enabled | 0% | 100% |
| Seed parameter used | 0% | 100% |
| Uses `scribe_v2` model | 0% | 100% |
| `transcript.txt` written | 100% | 100% |
| `transcript_meta.json` written | 100% | 100% |
| Language fields in metadata | 70% | 100% |
| Correct Python package | 0% | 100% |
cab396c