Use when writing or reviewing Java code in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live transcription. Covers `client.listen().v1().media().transcribeUrl` / `transcribeFile` (REST) and `client.listen().v1().v1WebSocket()` (WebSocket). Use `deepgram-java-audio-intelligence` for analytics overlays, `deepgram-java-conversational-stt` for Flux `/v2/listen`, and `deepgram-java-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen v1", "nova-3", "live transcription", and "websocket transcription".
Basic transcription for prerecorded audio over REST or live audio over WebSocket via /v1/listen.
- REST (media().transcribeUrl / transcribeFile): one-shot transcription of a complete URL or byte array.
- WebSocket (v1WebSocket()): live streaming transcription with interim/final results.

Use a different skill when:
- You need analytics overlays → deepgram-java-audio-intelligence.
- You need Flux /v2/listen → deepgram-java-conversational-stt.
- You need a full-duplex assistant → deepgram-java-voice-agent.

Gradle
implementation 'com.deepgram:deepgram-java-sdk:0.2.1'

Maven
<dependency>
<groupId>com.deepgram</groupId>
<artifactId>deepgram-java-sdk</artifactId>
<version>0.2.1</version>
</dependency>

import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
    .apiKey(System.getenv("DEEPGRAM_API_KEY"))
    .build();

Default API-key auth sends Authorization: Token <apiKey>. accessToken(...) switches to Bearer.
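To make the two schemes concrete, here is a minimal stdlib-only sketch of the header values described above. The `AuthHeaders` class and its method names are hypothetical illustrations, not part of the SDK, which builds these headers internally. The null check mirrors the fact that `System.getenv(...)` returns null when the variable is unset.

```java
/** Hypothetical helper (not SDK API) showing the two Authorization header shapes. */
final class AuthHeaders {
    private AuthHeaders() {}

    /** Header value for API-key auth: "Token <apiKey>". */
    static String forApiKey(String apiKey) {
        return "Token " + requireNonBlank(apiKey, "apiKey");
    }

    /** Header value for access-token auth: "Bearer <accessToken>". */
    static String forAccessToken(String accessToken) {
        return "Bearer " + requireNonBlank(accessToken, "accessToken");
    }

    /** Fails fast on a missing credential, e.g. an unset environment variable. */
    private static String requireNonBlank(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " must be set");
        }
        return value.trim();
    }
}
```

Failing fast here gives a clearer error than a rejected request from the server later on.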
import com.deepgram.resources.listen.v1.media.requests.ListenV1RequestUrl;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeRequestModel;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse;
import com.deepgram.types.ListenV1Response;

ListenV1RequestUrl request = ListenV1RequestUrl.builder()
    .url("https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav")
    .model(MediaTranscribeRequestModel.NOVA3)
    .smartFormat(true)
    .build();

MediaTranscribeResponse result = client.listen().v1().media().transcribeUrl(request);

result.visit(new MediaTranscribeResponse.Visitor<Void>() {
    @Override
    public Void visit(ListenV1Response response) {
        // Guard channels + alternatives against empty results (matches examples/listen/TranscribeUrl.java).
        String transcript = "";
        java.util.List<?> channels = response.getResults().getChannels();
        if (channels != null && !channels.isEmpty()) {
            java.util.List<?> alternatives = response.getResults()
                .getChannels().get(0)
                .getAlternatives().orElse(java.util.Collections.emptyList());
            if (!alternatives.isEmpty()) {
                transcript = response.getResults()
                    .getChannels().get(0)
                    .getAlternatives().orElse(java.util.Collections.emptyList())
                    .get(0)
                    .getTranscript().orElse("");
            }
        }
        System.out.println(transcript);
        return null;
    }

    @Override
    public Void visit(com.deepgram.types.ListenV1AcceptedResponse accepted) {
        System.out.println("Request accepted: " + accepted.getRequestId());
        return null;
    }
});

import com.deepgram.resources.listen.v1.media.requests.MediaTranscribeRequestOctetStream;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeRequestModel;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse;

// Reads the entire file into memory before upload.
byte[] audioBytes = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("audio.wav"));

MediaTranscribeRequestOctetStream request = MediaTranscribeRequestOctetStream.builder()
    .body(audioBytes)
    .model(MediaTranscribeRequestModel.NOVA3)
    .smartFormat(true)
    .build();

MediaTranscribeResponse result = client.listen().v1().media().transcribeFile(request);

transcribeFile(...) accepts either raw byte[] or a full MediaTranscribeRequestOctetStream request object.
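Because the file example above loads the recording with `Files.readAllBytes`, it can help to check the file size before reading. The sketch below uses only `java.nio`. The `AudioFiles` class, the `readIfSmall` name, and the caller-supplied limit are illustrative assumptions, not SDK API and not a Deepgram-imposed limit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not SDK API): refuses to load oversized audio into memory. */
final class AudioFiles {
    private AudioFiles() {}

    /** Returns the file contents, or throws if the file exceeds maxBytes. */
    static byte[] readIfSmall(Path path, long maxBytes) throws IOException {
        long size = Files.size(path);
        if (size > maxBytes) {
            throw new IOException("Audio file is " + size + " bytes; limit is " + maxBytes);
        }
        return Files.readAllBytes(path);
    }
}
```

The returned array can then be passed to the byte[] form of `transcribeFile(...)` or to the request builder's `body(...)`.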
import com.deepgram.resources.listen.v1.types.ListenV1CloseStream;
import com.deepgram.resources.listen.v1.types.ListenV1CloseStreamType;
import com.deepgram.resources.listen.v1.websocket.V1ConnectOptions;
import com.deepgram.resources.listen.v1.websocket.V1WebSocketClient;
import com.deepgram.types.ListenV1Model;
import java.util.concurrent.TimeUnit;

V1WebSocketClient wsClient = client.listen().v1().v1WebSocket();

wsClient.onResults(result -> {
    if (result.getChannel() != null
            && result.getChannel().getAlternatives() != null
            && !result.getChannel().getAlternatives().isEmpty()) {
        String transcript = result.getChannel().getAlternatives().get(0).getTranscript();
        boolean isFinal = result.getIsFinal().orElse(false);
        System.out.printf("%s %s%n", isFinal ? "[final]" : "[interim]", transcript);
    }
});

wsClient.connect(V1ConnectOptions.builder().model(ListenV1Model.NOVA3).build())
    .get(10, TimeUnit.SECONDS);

// send raw audio chunks here
// wsClient.sendMedia(okio.ByteString.of(audioChunk));

wsClient.sendCloseStream(ListenV1CloseStream.builder()
    .type(ListenV1CloseStreamType.CLOSE_STREAM)
    .build());

import com.deepgram.AsyncDeepgramClient;
import java.util.concurrent.CompletableFuture;

AsyncDeepgramClient asyncClient = AsyncDeepgramClient.builder()
    .apiKey(System.getenv("DEEPGRAM_API_KEY"))
    .build();

CompletableFuture<com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse> future =
    asyncClient.listen().v1().media().transcribeUrl(request);

The async REST clients return CompletableFuture<T>. WebSocket clients are already asynchronous and also return CompletableFuture<Void> from connect(...) and send methods.
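Since these calls hand back a `CompletableFuture`, callers usually want to bound the wait and decide what happens on failure. The following stdlib-only sketch shows one way to do that. `Futures.awaitOrDefault` is a hypothetical helper, not SDK API, and returning a fallback is just one policy; rethrowing or logging may suit your code better.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not SDK API) for bounded waits on async results. */
final class Futures {
    private Futures() {}

    /** Waits up to timeoutSeconds; returns fallback on timeout, failure, or interruption. */
    static <T> T awaitOrDefault(CompletableFuture<T> future, long timeoutSeconds, T fallback) {
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // Marks the future cancelled; the underlying work is not guaranteed to stop.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The async call completed exceptionally; e.getCause() holds the original error.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status for callers
            return fallback;
        }
    }
}
```

With the snippet above, this could be applied as `Futures.awaitOrDefault(future, 30, null)`, treating a null result as "no transcript available".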
- REST request builders: ListenV1RequestUrl.builder() and MediaTranscribeRequestOctetStream.builder()
- REST request options: model, language, encoding, smartFormat, punctuate, diarize, detectEntities, multichannel, numerals, paragraphs, utterances, keywords, keyterm, replace, search, mipOptOut, tag, callback
- REST methods: transcribeUrl(...), transcribeFile(byte[]), transcribeFile(MediaTranscribeRequestOctetStream)
- WebSocket connect options: model, encoding, sampleRate, endpointing, interimResults, vadEvents, utteranceEndMs, diarize, detectEntities, redact, keywords, keyterm, language
- WebSocket send methods: sendMedia(...), sendFinalize(...), sendKeepAlive(...), sendCloseStream(...)
- WebSocket event handlers: onResults, onMetadata, onUtteranceEnd, onSpeechStarted, plus generic onConnected, onDisconnected, onError, onMessage
- REST response: ListenV1Response or ListenV1AcceptedResponse, handled via MediaTranscribeResponse.Visitor

- In-repo reference: src/main/java/com/deepgram/resources/listen/v1/ plus examples under examples/listen/. This checkout does not include reference.md.
- /llmstxt/developers_deepgram_llms_txt

- API-key auth sends Token, not Bearer. Bearer only happens when you use accessToken(...).
- Handle both ListenV1Response and ListenV1AcceptedResponse with the visitor.
- transcribeFile(byte[]) reads the whole file into memory. Use the request builder only when you need extra params.
- Pass redact as a single String. Do not assume Python-style list support in this checkout.
- If you set encoding, the bytes must actually match it.
- End the stream with sendFinalize(...) or sendCloseStream(...); otherwise trailing audio can be lost.
- Register event handlers before connect(...). The examples do this consistently.
- V1WebSocketClient is async already. Wait on connect(...).get(...) before sending audio.

- examples/listen/TranscribeUrl.java
- examples/listen/FileUploadTypes.java
- examples/listen/AdvancedOptions.java
- examples/listen/LiveStreaming.java
- examples/listen/TranscribeCallback.java
- examples/listen/Captions.java

For cross-language Deepgram product knowledge (the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup), install the central skills:
npx skills add deepgram/skills

This SDK ships language-idiomatic code skills; deepgram/skills ships cross-language product knowledge (see api, docs, recipes, examples, starters, setup-mcp).