Use when writing or reviewing Java code in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live transcription. Covers `client.listen().v1().media().transcribeUrl` / `transcribeFile` (REST) and `client.listen().v1().v1WebSocket()` (WebSocket). Use `deepgram-java-audio-intelligence` for analytics overlays, `deepgram-java-conversational-stt` for Flux `/v2/listen`, and `deepgram-java-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen v1", "nova-3", "live transcription", and "websocket transcription".
Basic transcription for prerecorded audio over REST or live audio over WebSocket via `/v1/listen`.

- REST (`media().transcribeUrl` / `transcribeFile`) — one-shot transcription of a complete URL or byte array.
- WebSocket (`v1WebSocket()`) — live streaming transcription with interim/final results.

Use a different skill when:

- You need analytics overlays → `deepgram-java-audio-intelligence`.
- You target Flux `/v2/listen` → `deepgram-java-conversational-stt`.
- You are building full-duplex assistants → `deepgram-java-voice-agent`.

**Gradle**
```groovy
implementation 'com.deepgram:deepgram-java-sdk:0.2.1'
```

**Maven**

```xml
<dependency>
    <groupId>com.deepgram</groupId>
    <artifactId>deepgram-java-sdk</artifactId>
    <version>0.2.1</version>
</dependency>
```

**Client setup**

```java
import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();
```

Default API-key auth sends `Authorization: Token <apiKey>`; `accessToken(...)` switches to `Bearer`.
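The auth distinction above is easy to illustrate. A minimal sketch of the header value each mode produces — `AuthHeader` is a hypothetical helper for understanding only, not part of the SDK, which builds these headers internally:

```java
// Hypothetical helper mirroring the two Authorization header shapes.
final class AuthHeader {
    // apiKey(...) path: "Authorization: Token <apiKey>"
    static String forApiKey(String apiKey) {
        return "Token " + apiKey;
    }

    // accessToken(...) path: "Authorization: Bearer <accessToken>"
    static String forAccessToken(String accessToken) {
        return "Bearer " + accessToken;
    }
}
```

If a proxy or log shows `Bearer` while you configured `apiKey(...)`, something else is rewriting your headers.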
**Transcribe a URL (REST)**

```java
import com.deepgram.resources.listen.v1.media.requests.ListenV1RequestUrl;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeRequestModel;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse;
import com.deepgram.types.ListenV1Response;

ListenV1RequestUrl request = ListenV1RequestUrl.builder()
        .url("https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav")
        .model(MediaTranscribeRequestModel.NOVA3)
        .smartFormat(true)
        .build();

MediaTranscribeResponse result = client.listen().v1().media().transcribeUrl(request);

result.visit(new MediaTranscribeResponse.Visitor<Void>() {
    @Override
    public Void visit(ListenV1Response response) {
        // Guard channels + alternatives against empty results (matches examples/listen/TranscribeUrl.java).
        String transcript = "";
        java.util.List<?> channels = response.getResults().getChannels();
        if (channels != null && !channels.isEmpty()) {
            java.util.List<?> alternatives = response.getResults()
                    .getChannels().get(0)
                    .getAlternatives().orElse(java.util.Collections.emptyList());
            if (!alternatives.isEmpty()) {
                transcript = response.getResults()
                        .getChannels().get(0)
                        .getAlternatives().orElse(java.util.Collections.emptyList())
                        .get(0)
                        .getTranscript().orElse("");
            }
        }
        System.out.println(transcript);
        return null;
    }

    @Override
    public Void visit(com.deepgram.types.ListenV1AcceptedResponse accepted) {
        System.out.println("Request accepted: " + accepted.getRequestId());
        return null;
    }
});
```

**Transcribe a file (REST)**

```java
import com.deepgram.resources.listen.v1.media.requests.MediaTranscribeRequestOctetStream;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeRequestModel;

byte[] audioBytes = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("audio.wav"));

MediaTranscribeRequestOctetStream request = MediaTranscribeRequestOctetStream.builder()
        .body(audioBytes)
        .model(MediaTranscribeRequestModel.NOVA3)
        .smartFormat(true)
        .build();

MediaTranscribeResponse result = client.listen().v1().media().transcribeFile(request);
```

`transcribeFile(...)` accepts either a raw `byte[]` or a full `MediaTranscribeRequestOctetStream` request object.
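Because `transcribeFile(byte[])` buffers the entire file in memory, a size guard before reading can prevent surprise heap pressure on large recordings. A sketch under that assumption — `AudioLoader` is a hypothetical helper, not part of the SDK:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical guard: reject oversized audio before buffering it all in memory.
final class AudioLoader {
    static byte[] loadBounded(Path path, long maxBytes) throws IOException {
        long size = Files.size(path);
        if (size > maxBytes) {
            throw new IOException("audio file is " + size + " bytes; limit is " + maxBytes);
        }
        return Files.readAllBytes(path);
    }
}
```

For anything larger than your chosen bound, prefer hosting the audio and calling `transcribeUrl(...)` instead.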
**Live transcription (WebSocket)**

```java
import com.deepgram.resources.listen.v1.types.ListenV1CloseStream;
import com.deepgram.resources.listen.v1.types.ListenV1CloseStreamType;
import com.deepgram.resources.listen.v1.websocket.V1ConnectOptions;
import com.deepgram.resources.listen.v1.websocket.V1WebSocketClient;
import com.deepgram.types.ListenV1Model;
import java.util.concurrent.TimeUnit;

V1WebSocketClient wsClient = client.listen().v1().v1WebSocket();

// Register handlers before connecting so no early messages are missed.
wsClient.onResults(result -> {
    if (result.getChannel() != null
            && result.getChannel().getAlternatives() != null
            && !result.getChannel().getAlternatives().isEmpty()) {
        String transcript = result.getChannel().getAlternatives().get(0).getTranscript();
        boolean isFinal = result.getIsFinal().orElse(false);
        System.out.printf("%s %s%n", isFinal ? "[final]" : "[interim]", transcript);
    }
});

wsClient.connect(V1ConnectOptions.builder().model(ListenV1Model.NOVA3).build())
        .get(10, TimeUnit.SECONDS);

// send raw audio chunks here
// wsClient.sendMedia(okio.ByteString.of(audioChunk));

wsClient.sendCloseStream(ListenV1CloseStream.builder()
        .type(ListenV1CloseStreamType.CLOSE_STREAM)
        .build());
```

**Async client**

```java
import com.deepgram.AsyncDeepgramClient;
import java.util.concurrent.CompletableFuture;

AsyncDeepgramClient asyncClient = AsyncDeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();

CompletableFuture<com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse> future =
        asyncClient.listen().v1().media().transcribeUrl(request);
```

The async REST clients return `CompletableFuture<T>`. WebSocket clients are already asynchronous and also return `CompletableFuture<Void>` from `connect(...)` and send methods.
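The returned `CompletableFuture` composes with standard timeout handling from `java.util.concurrent`. A minimal sketch of bounding a pending call — the `String` future here is a stand-in for a real transcription future, and `AsyncPattern` is a hypothetical helper:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical pattern: bound an async call with orTimeout (Java 9+) and map
// any failure (including the timeout) to a fallback value instead of throwing.
final class AsyncPattern {
    static String awaitBounded(CompletableFuture<String> pending, long seconds) {
        return pending
                .orTimeout(seconds, TimeUnit.SECONDS)
                .exceptionally(err -> "request failed: " + err.getClass().getSimpleName())
                .join();
    }
}
```

In real use you would map the `MediaTranscribeResponse` future the same way, or chain `thenApply(...)` to extract the transcript before joining.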
**API surface**

- Request builders: `ListenV1RequestUrl.builder()` and `MediaTranscribeRequestOctetStream.builder()`.
- REST options: `model`, `language`, `encoding`, `smartFormat`, `punctuate`, `diarize`, `detectEntities`, `multichannel`, `numerals`, `paragraphs`, `utterances`, `keywords`, `keyterm`, `replace`, `search`, `mipOptOut`, `tag`, `callback`.
- REST methods: `transcribeUrl(...)`, `transcribeFile(byte[])`, `transcribeFile(MediaTranscribeRequestOctetStream)`.
- WebSocket connect options: `model`, `encoding`, `sampleRate`, `endpointing`, `interimResults`, `vadEvents`, `utteranceEndMs`, `diarize`, `detectEntities`, `redact`, `keywords`, `keyterm`, `language`.
- WebSocket send methods: `sendMedia(...)`, `sendFinalize(...)`, `sendKeepAlive(...)`, `sendCloseStream(...)`.
- WebSocket handlers: `onResults`, `onMetadata`, `onUtteranceEnd`, `onSpeechStarted`, plus generic `onConnected`, `onDisconnected`, `onError`, `onMessage`.
- REST responses: `ListenV1Response` or `ListenV1AcceptedResponse`, handled via `MediaTranscribeResponse.Visitor`.

**Where things live**

- `src/main/java/com/deepgram/resources/listen/v1/` plus examples under `examples/listen/`. This checkout does not include `reference.md`.
- `/llmstxt/developers_deepgram_llms_txt`

**Pitfalls**

- Default auth sends `Token`, not `Bearer`. `Bearer` only happens when you use `accessToken(...)`.
- Handle both `ListenV1Response` and `ListenV1AcceptedResponse` with the visitor.
- `transcribeFile(byte[])` reads the whole file into memory. Use the request builder only when you need extra params.
- Pass `redact` as a single `String`. Do not assume Python-style list support in this checkout.
- If you set `encoding`, the bytes must actually match it.
- Finish streams with `sendFinalize(...)` or `sendCloseStream(...)`; otherwise trailing audio can be lost.
- Register handlers before `connect(...)`. The examples do this consistently.
- `V1WebSocketClient` is async already. Wait on `connect(...).get(...)` before sending audio.

**Example files**

- `examples/listen/TranscribeUrl.java`
- `examples/listen/FileUploadTypes.java`
- `examples/listen/AdvancedOptions.java`
- `examples/listen/LiveStreaming.java`
- `examples/listen/TranscribeCallback.java`
- `examples/listen/Captions.java`

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
```shell
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).