Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram Text-to-Speech v1 (`/v1/speak`) for audio synthesis. Covers one-shot REST via `client.speak.v1.audio.generate` and streaming WebSocket via `client.speak.v1.createConnection()` / `connect()`. Use `deepgram-js-voice-agent` when you need full-duplex STT + LLM + TTS instead of one-way synthesis. Triggers include "TTS", "text to speech", "speak", "aura", "streaming TTS", and "speak.v1".
Convert text to audio with one-shot REST generation or low-latency streaming synthesis via /v1/speak.
Two synthesis modes:

- **One-shot REST** (`client.speak.v1.audio.generate`) — render finished text into an audio response. Best for downloadable files, pre-generated prompts, and batch synthesis.
- **Streaming WebSocket** (`client.speak.v1.createConnection()` / `connect()`) — stream text in and receive audio out with lower latency. Best when an LLM is still producing tokens.

Use a different skill when:

- You need full-duplex STT + LLM + TTS instead of one-way synthesis: use `deepgram-js-voice-agent`.

Setup:

```javascript
require("dotenv").config();
const { DeepgramClient } = require("@deepgram/sdk");

const deepgramClient = new DeepgramClient({
  apiKey: process.env.DEEPGRAM_API_KEY,
});
```

The repo examples use `require("../dist/cjs/index.js")`, but application code should normally import from `@deepgram/sdk`.
From examples/10-text-to-speech-single.ts:

```javascript
const data = await deepgramClient.speak.v1.audio.generate({
  text: "Hello, this is a test of Deepgram's text-to-speech API.",
  model: "aura-2-thalia-en",
  encoding: "linear16",
  container: "wav",
});

console.log("Audio generated successfully", data);
```

`generate(...)` returns a `BinaryResponse`, not JSON. See `examples/25-binary-response.ts` for `.stream()`, `.arrayBuffer()`, `.blob()`, and `.bytes()` handling.
From examples/11-text-to-speech-streaming.ts:

```javascript
const deepgramConnection = await deepgramClient.speak.v1.createConnection({
  model: "aura-2-thalia-en",
  encoding: "linear16",
});

deepgramConnection.on("message", (data) => {
  if (typeof data === "string" || data instanceof ArrayBuffer || data instanceof Blob) {
    console.log("Audio received");
  } else if (data.type === "Flushed") {
    deepgramConnection.close();
  }
});

deepgramConnection.connect();
await deepgramConnection.waitForOpen();

deepgramConnection.sendText({ type: "Speak", text: "Hello from streaming TTS." });
deepgramConnection.sendFlush({ type: "Flush" });
```

API surface:

- Query parameters: `model`, `encoding`, `sample_rate`, `container`, `bit_rate`, `callback`, `callback_method`, `tag`, `mip_opt_out`.
- Binary response handling (`examples/25-binary-response.ts`): `response.stream()`, `response.arrayBuffer()`, `response.blob()`, `response.bytes()`, `response.bodyUsed`.
- Socket send methods (`src/api/resources/speak/resources/v1/client/Socket.ts`): `sendText(...)`, `sendFlush(...)`, `sendClear(...)`, `sendClose(...)`.
- Server message types: `Metadata`, `Flushed`, `Cleared`, `Warning`.

Unlike the Python SDK, this repo does not include a hand-written `TextBuilder` helper. If you want incremental token buffering before `sendText(...)`, build that helper in your application layer.
References:

- `reference.md` → Speak V1 Audio for REST; WSS behavior lives in `src/CustomClient.ts` and `src/api/resources/speak/resources/v1/client/{Client,Socket}.ts`.
- `/llmstxt/developers_deepgram_llms_txt`

Gotchas:

- `src/CustomClient.ts` patches binary WebSocket handling; the generated socket assumes JSON too aggressively.
- `createConnection()` is lazy. Register handlers, then call `connect()` and `waitForOpen()`.
- Flush after your text. Without `sendFlush({ type: "Flush" })`, trailing audio may not be emitted promptly.
- `sendText(...)` takes `{ type: "Speak", text }`, not a raw string.
- Incoming audio arrives as `string`, `ArrayBuffer`, or `Blob`.

Examples:

- `examples/10-text-to-speech-single.ts`
- `examples/11-text-to-speech-streaming.ts`
- `examples/25-binary-response.ts`

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```shell
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).