Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live audio transcription. Covers `client.listen.v1.media.transcribeUrl` / `transcribeFile` (REST) plus `client.listen.v1.createConnection()` / `connect()` (WebSocket). Use `deepgram-js-audio-intelligence` for summarize/sentiment/topics/diarize overlays, `deepgram-js-conversational-stt` for Flux turn-taking on `/v2/listen`, and `deepgram-js-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen.v1", "nova-3", "live transcription", and "websocket transcription".
Basic transcription for prerecorded audio (REST) or live audio (WebSocket) via `/v1/listen`.

- Prerecorded (REST) (`client.listen.v1.media.transcribeUrl` / `transcribeFile`) — one-shot transcription of a finished URL or file. Good for batch jobs, caption generation, offline processing.
- Live (WebSocket) (`client.listen.v1.createConnection()` / `connect()`) — continuous streaming transcription. Good for live captions, microphone audio, telephony streams, browser or Node realtime apps.

Use a different skill when:

- You need summarize/sentiment/topics/diarize overlays on a `/v1/listen` call → `deepgram-js-audio-intelligence`.
- You need Flux turn-taking on `/v2/listen` → `deepgram-js-conversational-stt`.
- You are building a full-duplex voice assistant → `deepgram-js-voice-agent`.

Client setup:

```js
require("dotenv").config();
const { DeepgramClient } = require("@deepgram/sdk");
const deepgramClient = new DeepgramClient({
apiKey: process.env.DEEPGRAM_API_KEY,
});
```

Use the exported `DeepgramClient` from `src/CustomClient.ts`, not `DefaultDeepgramClient`. The wrapper adds the required `Token` auth prefix, session headers, and patched WebSocket behavior.
From `examples/04-transcription-prerecorded-url.ts`:

```js
const data = await deepgramClient.listen.v1.media.transcribeUrl({
url: "https://dpgr.am/spacewalk.wav",
model: "nova-3",
language: "en",
punctuate: true,
paragraphs: true,
utterances: true,
});
console.log(
"Transcription:",
data.results?.channels?.[0]?.alternatives?.[0]?.transcript,
);
```

From `examples/05-transcription-prerecorded-file.ts`:

```js
const { createReadStream } = require("fs");
const data = await deepgramClient.listen.v1.media.transcribeFile(
createReadStream("./examples/spacewalk.wav"),
{
model: "nova-3",
language: "en",
punctuate: true,
paragraphs: true,
utterances: true,
smart_format: true,
}
);
```

`transcribeFile(...)` accepts multiple upload shapes in this SDK: `fs.ReadStream`, `Buffer`, `ReadableStream`, `Blob`, `File`, `ArrayBuffer`, and `Uint8Array` (see `examples/23-file-upload-types.ts`).
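For instance, a `Buffer` is a drop-in swap for the stream above. A minimal sketch reusing the repo's sample file and options (not a verbatim repo example):

```js
// Sketch: upload a Buffer instead of a stream. readFileSync loads the whole
// file into memory, so prefer a ReadStream for large audio.
const { readFileSync } = require("fs");

const buffer = readFileSync("./examples/spacewalk.wav");
const data = await deepgramClient.listen.v1.media.transcribeFile(buffer, {
  model: "nova-3",
  smart_format: true,
});
console.log(data.results?.channels?.[0]?.alternatives?.[0]?.transcript);
```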
From `examples/07-transcription-live-websocket.ts`:

```js
const deepgramConnection = await deepgramClient.listen.v1.createConnection({
model: "nova-3",
language: "en",
punctuate: "true",
interim_results: "true",
});
deepgramConnection.on("message", (data) => {
if (data.type === "Results") {
console.log("Transcript:", data);
}
});
deepgramConnection.connect();
await deepgramConnection.waitForOpen();
// Swap this for a mic capture (e.g. `node-microphone` / `MediaRecorder`)
// in real apps; the repo examples use `createReadStream` over a sample WAV.
const { createReadStream } = require("node:fs");
const audioStream = createReadStream("samples/spacewalk.wav");
audioStream.on("data", (chunk) => {
deepgramConnection.sendMedia(chunk);
});
audioStream.on("end", () => {
deepgramConnection.sendFinalize({ type: "Finalize" });
});
```

The repo examples use the two-step socket flow: `createConnection()` → register handlers → `connect()` → `waitForOpen()`.
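Long-lived sockets usually also need a keep-alive during silence and a clean shutdown. A sketch continuing the example above, using the socket helpers listed below; the 8-second interval and the `is_final` field are assumptions to verify against the example files:

```js
// Sketch (assumptions flagged above): keep the socket alive across pauses,
// branch on the server message types, and close the stream at end of audio.
const keepAlive = setInterval(() => {
  deepgramConnection.sendKeepAlive({ type: "KeepAlive" });
}, 8000); // assumed interval; keep it under the server's idle timeout

deepgramConnection.on("message", (data) => {
  switch (data.type) {
    case "Results":
      // interim_results: "true" sends partials before the final transcript
      console.log(data.is_final ? "final" : "interim", data); // is_final: assumed field
      break;
    case "UtteranceEnd":
      console.log("utterance boundary");
      break;
    case "SpeechStarted":
      console.log("speech detected");
      break;
  }
});

audioStream.on("end", () => {
  clearInterval(keepAlive);
  deepgramConnection.sendCloseStream({ type: "CloseStream" });
});
```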
Key options and APIs:

- REST options: `model`, `language`, `punctuate`, `smart_format`, `paragraphs`, `utterances`, `multichannel`, `numerals`, `search`, `keyterm`, `keywords`, `encoding`, `sample_rate`, `callback` (async flow sketched after the example files below), `tag`.
- WebSocket options (`src/api/resources/listen/resources/v1/client/Client.ts`): `model` is required; common realtime flags include `language`, `interim_results`, `endpointing`, `utterance_end_ms`, `vad_events`, `encoding`, `sample_rate`, `multichannel`, `punctuate`, `smart_format`.
- Socket methods (`src/api/resources/listen/resources/v1/client/Socket.ts`): `sendMedia(...)`, `sendFinalize(...)`, `sendCloseStream(...)`, `sendKeepAlive(...)`.
- Server message types: `Results`, `Metadata`, `UtteranceEnd`, `SpeechStarted`.
- References: `reference.md` → Listen V1 Media for REST; WSS behavior lives in `src/CustomClient.ts` and `src/api/resources/listen/resources/v1/client/{Client,Socket}.ts`; Deepgram docs snapshot at `/llmstxt/developers_deepgram_llms_txt`.

Gotchas:

- Use `DeepgramClient`, not `DefaultDeepgramClient`. The custom wrapper adds Token auth, session IDs, browser WS auth protocols, and patched sockets.
- `createConnection()` does not open the socket; call `connect()` and usually `waitForOpen()`.
- `sendFinalize({ type: "Finalize" })` flushes the final partial.
- Send `sendKeepAlive({ type: "KeepAlive" })` on long pauses.
- `encoding` and `sample_rate` must match the bytes you send.
- `/v2/listen` is for Flux only. If you need turn-aware conversational STT, switch skills instead of forcing v1.

Example files:

- `examples/04-transcription-prerecorded-url.ts`
- `examples/05-transcription-prerecorded-file.ts`
- `examples/06-transcription-prerecorded-callback.ts`
- `examples/07-transcription-live-websocket.ts`
- `examples/08-transcription-captions.ts`
- `examples/23-file-upload-types.ts`
- `examples/27-deepgram-session-header.ts`
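The `callback` option above (see `examples/06-transcription-prerecorded-callback.ts`) turns a REST request into an async job. A minimal sketch, assuming Deepgram POSTs the finished results to the URL you supply; the webhook address is a placeholder:

```js
// Sketch of the async callback flow: the request returns quickly and the
// transcript is delivered to your webhook later.
const submitted = await deepgramClient.listen.v1.media.transcribeUrl({
  url: "https://dpgr.am/spacewalk.wav",
  model: "nova-3",
  callback: "https://example.com/deepgram-webhook", // placeholder endpoint
});
// Correlate the webhook delivery with this response (e.g. its request id).
console.log("Submitted:", submitted);
```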
For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```sh
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).