Use when writing or reviewing Java code in this repo that builds an interactive voice agent over `agent.deepgram.com/v1/agent/converse`. Covers `client.agent().v1().v1WebSocket()`, `AgentV1Settings`, `sendSettings`, `sendMedia`, event handlers, provider configuration, and message injection. Use `deepgram-java-text-to-speech` for one-way synthesis or the STT skills for transcription-only flows. Triggers include "voice agent", "agent converse", "full duplex", "barge in", "function call", and "agent websocket".
Run a full-duplex voice agent over a single WebSocket: user audio in, agent events + audio out.
Use a different skill when:

- You only need transcription: `deepgram-java-speech-to-text` or `deepgram-java-conversational-stt`.
- You only need one-way synthesis: `deepgram-java-text-to-speech`.
- You are managing projects, keys, or usage: `deepgram-java-management-api`.

Create the client:

```java
import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();
```

The agent WebSocket uses the SDK's agent environment URL and the same auth headers.
```java
import com.deepgram.resources.agent.v1.types.AgentV1Settings;
import com.deepgram.resources.agent.v1.types.AgentV1SettingsAgent;
import com.deepgram.resources.agent.v1.types.AgentV1SettingsAgentThink;
import com.deepgram.resources.agent.v1.types.AgentV1SettingsAgentThinkOneItem;
import com.deepgram.resources.agent.v1.types.AgentV1SettingsAgentThinkOneItemProvider;
import com.deepgram.resources.agent.v1.types.AgentV1SettingsAudio;
import com.deepgram.resources.agent.v1.websocket.V1WebSocketClient;
import com.deepgram.types.OpenAiThinkProvider;
import java.util.List;
import java.util.Map;

V1WebSocketClient wsClient = client.agent().v1().v1WebSocket();

// Configure the agent as soon as the server says hello.
wsClient.onWelcome(welcome -> {
    OpenAiThinkProvider openAiProvider = OpenAiThinkProvider.of(Map.of("model", "gpt-4o-mini"));
    AgentV1Settings settings = AgentV1Settings.builder()
            .audio(AgentV1SettingsAudio.builder().build())
            .agent(AgentV1SettingsAgent.builder()
                    .think(AgentV1SettingsAgentThink.of(List.of(AgentV1SettingsAgentThinkOneItem.builder()
                            .provider(AgentV1SettingsAgentThinkOneItemProvider.of(openAiProvider))
                            .prompt("You are a helpful voice assistant. Keep your responses brief.")
                            .build())))
                    .greeting("Hello! How can I help you today?")
                    .build())
            .build();
    wsClient.sendSettings(settings);
});

// Log conversation events and incoming agent audio.
wsClient.onConversationText(text -> System.out.printf("[%s] %s%n", text.getRole(), text.getContent()));
wsClient.onAgentStartedSpeaking(event -> System.out.println(">> Agent started speaking"));
wsClient.onAgentV1Audio(audioData -> System.out.printf("Received %d bytes%n", audioData.size()));

wsClient.connect().get(10, java.util.concurrent.TimeUnit.SECONDS);
```

The repo also demonstrates message injection:
```java
wsClient.sendInjectUserMessage(com.deepgram.resources.agent.v1.types.AgentV1InjectUserMessage.builder()
        .content("What is the capital of France?")
        .build());

wsClient.sendInjectAgentMessage(com.deepgram.resources.agent.v1.types.AgentV1InjectAgentMessage.builder()
        .message("By the way, I can also help you with math and science questions!")
        .build());
```

Key surface:

- Entry point: `client.agent().v1().v1WebSocket()`
- Settings type: `AgentV1Settings`
- Send methods: `sendSettings`, `sendMedia`, `sendUpdatePrompt`, `sendUpdateSpeak`, `sendInjectUserMessage`, `sendInjectAgentMessage`, `sendFunctionCallResponse`, `sendKeepAlive`
- Event handlers: `onWelcome`, `onSettingsApplied`, `onConversationText`, `onUserStartedSpeaking`, `onAgentThinking`, `onFunctionCallRequest`, `onAgentStartedSpeaking`, `onAgentAudioDone`, `onAgentV1Audio`, `onInjectionRefused`, `onPromptUpdated`, `onSpeakUpdated`, `onErrorMessage`, `onWarning`
- Think model listing: `client.agent().v1().settings().think().models().list()`
- Source: `src/main/java/com/deepgram/resources/agent/v1/` and `examples/agent/`. No reference.md file is present.
- Docs: `/llmstxt/developers_deepgram_llms_txt`

Gotchas:

- The WebSocket URL comes from `environment().getAgentURL()`.
- Wait for `onWelcome(...)` and immediately call `sendSettings(...)`.
- Agent audio arrives as `ByteString`. Playback/output is your responsibility.
- `sendMedia(...)` is raw audio bytes. Match whatever audio settings you configured.
- `OpenAiThinkProvider.of(...)`, `AnthropicThinkProvider.of(...)`, and `GoogleThinkProvider.of(...)` package the provider into the think/listen/speak union the SDK expects. The underlying payload is still an `Object` (so provider-field mistakes won't be caught at compile time), but the wrappers keep routing correct and ensure you pick the right variant of the sealed union.
- Close with `disconnect()`; there is no separate close-message flow like Speak/Listen.

Runnable examples: `examples/agent/VoiceAgent.java`, `examples/agent/InjectMessage.java`, `examples/agent/ProviderCombinations.java`, `examples/agent/CustomProviders.java`.

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```shell
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
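Back on the agent socket: `sendMedia(...)` takes raw audio bytes that must match the audio settings you configured. Below is a minimal framing-and-pacing sketch, assuming linear16 mono at 16 kHz; the idea is to hand each frame to `wsClient::sendMedia` (the exact `sendMedia` parameter type is an assumption here, so check the SDK's actual overload):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class AudioChunker {
    // Split raw linear16 PCM into fixed-duration frames (2 bytes per sample, mono assumed).
    static List<byte[]> frames(byte[] pcm, int sampleRateHz, int frameMs) {
        int frameBytes = sampleRateHz * 2 * frameMs / 1000;
        List<byte[]> out = new ArrayList<>();
        for (int off = 0; off < pcm.length; off += frameBytes) {
            int len = Math.min(frameBytes, pcm.length - off);
            byte[] frame = new byte[len];
            System.arraycopy(pcm, off, frame, 0, len);
            out.add(frame);
        }
        return out;
    }

    // Feed frames at roughly real time; `send` would be wsClient::sendMedia in real code.
    static void stream(byte[] pcm, Consumer<byte[]> send) throws InterruptedException {
        for (byte[] frame : frames(pcm, 16_000, 20)) {
            send.accept(frame);
            Thread.sleep(20); // pace ~20 ms per frame
        }
    }

    public static void main(String[] args) {
        byte[] oneSecond = new byte[16_000 * 2]; // 1 s of silence at 16 kHz mono
        System.out.println(frames(oneSecond, 16_000, 20).size()); // prints 50
    }
}
```

Pacing matters mainly when replaying recorded audio; a live microphone already produces frames at real time.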
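`sendKeepAlive` appears in the send-method list but none of the examples above schedule it. A sketch of a daemon keep-alive loop, where you would pass `wsClient::sendKeepAlive` as the tick (the cadence is an assumption, not a documented requirement):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class KeepAliveLoop {
    // Fires `tick` at a fixed period on a daemon thread until shutdown() is called.
    // In real code, tick would be wsClient::sendKeepAlive.
    public static ScheduledExecutorService start(Runnable tick, long period, TimeUnit unit) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "agent-keepalive");
            t.setDaemon(true); // don't keep the JVM alive just for keep-alives
            return t;
        });
        ses.scheduleAtFixedRate(tick, period, period, unit);
        return ses;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger ticks = new AtomicInteger();
        ScheduledExecutorService ses = start(ticks::incrementAndGet, 10, TimeUnit.MILLISECONDS);
        Thread.sleep(120);
        ses.shutdown(); // stop the loop before wsClient.disconnect()
        System.out.println(ticks.get() >= 2); // prints true
    }
}
```

Shut the loop down before calling `disconnect()` so a late keep-alive never races the socket close.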