Name: deepgram-java-speech-to-text
Rating: 71.2 (1 review)
Author: deepgram

deepgram-java-speech-to-text

Use when writing or reviewing Java code in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live transcription. Covers `client.listen().v1().media().transcribeUrl` / `transcribeFile` (REST) and `client.listen().v1().v1WebSocket()` (WebSocket). Use `deepgram-java-audio-intelligence` for analytics overlays, `deepgram-java-conversational-stt` for Flux `/v2/listen`, and `deepgram-java-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen v1", "nova-3", "live transcription", and "websocket transcription".

Quality

86%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that is highly specific, well-scoped, and clearly distinguishable. It explicitly states what it does, when to use it, includes natural trigger terms, and proactively differentiates itself from related skills by naming them and their use cases. The only minor note is its density, but every element serves a clear purpose.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: writing/reviewing Java code, calling Deepgram STT v1 `/v1/listen`, prerecorded and live transcription, specific API methods (`transcribeUrl`, `transcribeFile`, `v1WebSocket`), and distinguishes REST vs WebSocket approaches.	3 / 3
Completeness	Clearly answers both 'what' (Java code for Deepgram STT v1 prerecorded/live transcription via REST and WebSocket) and 'when' (explicit 'Use when' clause at the start, plus explicit trigger terms listed at the end). Also clarifies when NOT to use it by pointing to sibling skills.	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms explicitly listed: 'transcribe', 'speech to text', 'STT', 'listen v1', 'nova-3', 'live transcription', 'websocket transcription'. These are terms users would naturally use when working with speech-to-text functionality.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive — scoped to a specific API version (`/v1/listen`), specific language (Java), and specific vendor (Deepgram). Explicitly delineates boundaries by naming three sibling skills for analytics, v2/Flux, and voice agents, minimizing overlap risk.	3 / 3
	Total	12 / 12 Passed

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent code examples covering REST and WebSocket transcription patterns. Its main weaknesses are moderate verbosity (the full visitor pattern example is lengthy and some sections could be tightened) and the lack of error handling/validation guidance for a workflow that involves network calls and streaming connections. The gotchas section is a strong addition that captures real SDK-specific pitfalls.

Suggestions

Add a brief error handling example (e.g., try/catch around connect or transcribeUrl) to improve workflow clarity for failure scenarios.

Trim the REST URL visitor example — consider showing just the happy path inline and noting the accepted-response branch in a comment, since the full visitor is verbose.
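As an illustration of the first suggestion, the following is a minimal sketch of a try/catch wrapper with bounded retries around a blocking transcription call. It does not use the Deepgram SDK: `callWithRetry` and `TransientException` are hypothetical names introduced here, and the lambda in `main` only simulates a request that fails twice before succeeding. The exception types the real `transcribeUrl` or WebSocket connect calls throw are not shown in this review, so the catch clause would need to be adapted to them.

```java
import java.util.concurrent.Callable;

public class TranscribeWithRetry {

    /** Hypothetical stand-in for a retryable failure; not an SDK type. */
    static class TransientException extends RuntimeException {
        TransientException(String message) {
            super(message);
        }
    }

    /**
     * Runs the call, retrying on TransientException up to maxAttempts times
     * with a linearly growing pause between attempts. Any other exception
     * propagates immediately.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (TransientException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // Reached only after every attempt failed with a retryable error.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request: fails on the first two calls, then returns text.
        // In real code the lambda body would wrap the SDK call instead.
        String transcript = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientException("simulated network failure");
            }
            return "hello world";
        }, 3, 10);
        System.out.println(transcript + " after " + calls[0] + " attempts");
    }
}
```

Keeping the retry policy in one helper means the REST and WebSocket quick starts could each show a single wrapped call rather than repeating try/catch blocks, which would also help with the conciseness score.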

Dimension	Reasoning	Score
Conciseness	The skill is mostly efficient with good code examples, but includes some unnecessary verbosity — e.g., the full visitor pattern in the REST URL example is quite long and could be trimmed, the 'Central product skills' section at the bottom adds marginal value, and some explanatory sentences (like 'Default API-key auth sends Authorization: Token <apiKey>') could be folded into the gotchas section where it already appears.	2 / 3
Actionability	All code examples are fully executable with correct imports, concrete builder patterns, and real method calls. The REST URL, REST file, WebSocket, and async examples are all copy-paste ready with specific class names and method signatures.	3 / 3
Workflow Clarity	The skill clearly separates REST vs WebSocket paths and provides good sequential guidance for WebSocket (register handlers → connect → send audio → close). However, there are no explicit validation checkpoints — no error handling examples, no guidance on what to do when transcription fails, and no verification steps for checking response quality or handling connection failures.	2 / 3
Progressive Disclosure	The skill is well-structured with clear sections progressing from authentication to quick starts to advanced details. It cleanly references other skills (audio-intelligence, conversational-stt, voice-agent), example files in the repo, and external API references with clear one-level-deep navigation. The 'API reference (layered)' section is a good example of progressive disclosure.	3 / 3
	Total	10 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: deepgram/deepgram-java-sdk
Commit: 6d7d7d5

Reviewed: 9 days ago
