Use when writing or reviewing Java code in this repo that calls Deepgram Text-to-Speech v1 (`/v1/speak`) for audio synthesis. Covers one-shot REST via `client.speak().v1().audio().generate(...)` and streaming synthesis via `client.speak().v1().v1WebSocket()`. Use `deepgram-java-voice-agent` for full-duplex assistants instead of one-way synthesis. Triggers include "tts", "text to speech", "speak", "aura", "streaming tts", and "speak websocket".
Convert text to audio with REST, or stream audio back incrementally over WebSocket, via `/v1/speak`.
Two synthesis paths:
- REST (`audio().generate(...)`) — one-shot synthesis when you already have the full text.
- WebSocket (`v1WebSocket()`) — lower-latency synthesis while text arrives in chunks.

Use a different skill when:
- You need a full-duplex voice assistant rather than one-way synthesis: `deepgram-java-voice-agent`.

Client setup:

```java
import com.deepgram.DeepgramClient;
DeepgramClient client = DeepgramClient.builder()
.apiKey(System.getenv("DEEPGRAM_API_KEY"))
.build();
```

API key auth uses `Authorization: Token <apiKey>`.
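For reference, here is the same auth scheme on a raw HTTP call. This is a minimal sketch, not the supported path (use the SDK builder above): the `api.deepgram.com` host and the JSON `text` body shape are assumptions; only the `Authorization: Token <apiKey>` header comes from this skill.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

// Sketch of the Token auth scheme over plain HTTP. Host and body shape are
// assumptions; the SDK client handles all of this for you.
HttpClient http = HttpClient.newHttpClient();
HttpRequest rawRequest = HttpRequest.newBuilder()
        .uri(URI.create("https://api.deepgram.com/v1/speak"))
        .header("Authorization", "Token " + System.getenv("DEEPGRAM_API_KEY"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"text\":\"Hello\"}"))
        .build();
// The response body is the audio itself, so write it straight to a file.
HttpResponse<Path> rawResponse =
        http.send(rawRequest, HttpResponse.BodyHandlers.ofFile(Path.of("raw.mp3")));
```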
One-shot REST synthesis:

```java
import com.deepgram.resources.speak.v1.audio.requests.SpeakV1Request;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
SpeakV1Request request = SpeakV1Request.builder()
.text("Hello! This is a text-to-speech example using the Deepgram Java SDK.")
.build();
InputStream audioStream = client.speak().v1().audio().generate(request);
Files.copy(audioStream, Path.of("output.mp3"), StandardCopyOption.REPLACE_EXISTING);
audioStream.close();
```

REST returns an `InputStream`, not JSON.
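When the output format matters, set the documented request fields explicitly. A minimal sketch follows; the setter names mirror the documented request fields (`model`, `encoding`, `sampleRate`), but the exact setter signatures and the `aura-2-thalia-en` and `linear16` values are assumptions, not verified against this checkout:

```java
import com.deepgram.resources.speak.v1.audio.requests.SpeakV1Request;
import java.io.InputStream;

// Sketch: explicit model/encoding. Field names come from the request surface
// documented below; setter shapes and the literal values are assumptions.
SpeakV1Request pcmRequest = SpeakV1Request.builder()
        .text("Same text, explicit output format.")
        .model("aura-2-thalia-en")   // assumed Aura model name
        .encoding("linear16")        // assumed raw PCM encoding value
        .sampleRate(24000)
        .build();

// readAllBytes() buffers the whole clip in memory; fine for short utterances.
try (InputStream audio = client.speak().v1().audio().generate(pcmRequest)) {
    byte[] pcm = audio.readAllBytes();
    // hand `pcm` to an audio player or write it to disk
}
```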
Streaming synthesis over the WebSocket:

```java
import com.deepgram.resources.speak.v1.types.SpeakV1Close;
import com.deepgram.resources.speak.v1.types.SpeakV1CloseType;
import com.deepgram.resources.speak.v1.types.SpeakV1Flush;
import com.deepgram.resources.speak.v1.types.SpeakV1FlushType;
import com.deepgram.resources.speak.v1.types.SpeakV1Text;
import com.deepgram.resources.speak.v1.websocket.V1WebSocketClient;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;
Logger logger = Logger.getLogger("StreamingTts");
V1WebSocketClient wsClient = client.speak().v1().v1WebSocket();
OutputStream audioOutput = new FileOutputStream("output_streaming.wav");
// Log write failures rather than throwing from the WebSocket callback thread
// (matches examples/speak/StreamingTts.java).
wsClient.onSpeakV1Audio(audioData -> {
try {
audioOutput.write(audioData.toByteArray());
} catch (IOException e) {
logger.log(Level.SEVERE, "Failed to write streaming audio to output file.", e);
}
});
// Close the output stream when the server disconnects so we don't leak the file handle.
wsClient.onDisconnected(message -> {
try {
audioOutput.close();
} catch (IOException e) {
logger.log(Level.WARNING, "Failed to close streaming audio output file.", e);
}
});
wsClient.connect().get(10, TimeUnit.SECONDS);
wsClient.sendText(SpeakV1Text.builder().text("Hello, this is streaming text to speech.").build());
wsClient.sendFlush(SpeakV1Flush.builder().type(SpeakV1FlushType.FLUSH).build());
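// Optional: sendClear(...) is part of this checkout's send surface and drops
// audio the server has buffered (useful if the user interrupts mid-utterance).
// The SpeakV1Clear builder shape below is an assumption, guessed by symmetry
// with Flush/Close above, and is not verified:
// wsClient.sendClear(SpeakV1Clear.builder().type(SpeakV1ClearType.CLEAR).build());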
wsClient.sendClose(SpeakV1Close.builder().type(SpeakV1CloseType.CLOSE).build());
```

The async client mirrors the sync call path and returns futures:

```java
import com.deepgram.AsyncDeepgramClient;
import java.io.InputStream;
import java.util.concurrent.CompletableFuture;
AsyncDeepgramClient asyncClient = AsyncDeepgramClient.builder()
.apiKey(System.getenv("DEEPGRAM_API_KEY"))
.build();
CompletableFuture<InputStream> future = asyncClient.speak().v1().audio().generate(request);
// The future resolves to the same raw InputStream; close it after consuming.
try (InputStream audio = future.get()) {
    byte[] bytes = audio.readAllBytes(); // buffer in memory, or stream to disk
}
```

API surface in this checkout:
- REST request: `SpeakV1Request.builder().text(...)` plus optional `model`, `encoding`, `sampleRate`, `bitRate`, `container`, `callback`, `callbackMethod`, `mipOptOut`, `tag`.
- REST methods: `audio().generate(request)` and `audio().withRawResponse().generate(request)`.
- WebSocket connect options: `model`, `encoding`, `sampleRate`, `mipOptOut`.
- WebSocket send methods: `sendText(...)`, `sendFlush(...)`, `sendClear(...)`, `sendClose(...)`.
- WebSocket event hooks: `onSpeakV1Audio`, `onMetadata`, `onFlushed`, `onCleared`, `onWarning`, plus generic connection/error hooks.

Source of truth: `src/main/java/com/deepgram/resources/speak/v1/` and `examples/speak/`. `reference.md` is not present in this checkout; product docs are mirrored at `/llmstxt/developers_deepgram_llms_txt`.

Gotchas:
- REST returns a raw `InputStream`. Save or consume it; do not try to deserialize JSON.
- Send `Flush` before `Close` so the tail of the audio is not lost.
- WebSocket audio arrives as a `ByteString`. Convert to bytes before writing or playback.
- `container` and `bitRate` are REST request fields, not WebSocket connect options in this checkout.
- The server applies defaults when you only set `text`; pick an explicit model/encoding when output format matters.
- There is no `TextBuilder` helper in this repo. That Python helper does not exist here.
- The async client returns `CompletableFuture<InputStream>`. You still need to close the stream after the future resolves.

Runnable examples:
- `examples/speak/TextToSpeech.java`
- `examples/speak/StreamingTts.java`
- `examples/agent/ProviderCombinations.java` — shows Aura model selection inside Agent configs

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
```
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).