
deepgram-java-speech-to-text

Use when writing or reviewing Java code in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live transcription. Covers `client.listen().v1().media().transcribeUrl` / `transcribeFile` (REST) and `client.listen().v1().v1WebSocket()` (WebSocket). Use `deepgram-java-audio-intelligence` for analytics overlays, `deepgram-java-conversational-stt` for Flux `/v2/listen`, and `deepgram-java-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen v1", "nova-3", "live transcription", and "websocket transcription".

71

Quality

86%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that is highly specific, well-scoped, and clearly distinguishable. It explicitly states what it does, when to use it, includes natural trigger terms, and proactively differentiates itself from related skills by naming them and their use cases. The only minor note is its density, but every element serves a clear purpose.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: writing/reviewing Java code, calling Deepgram STT v1 `/v1/listen`, prerecorded and live transcription, specific API methods (`transcribeUrl`, `transcribeFile`, `v1WebSocket`), and distinguishes REST vs WebSocket approaches. | 3 / 3 |
| Completeness | Clearly answers both 'what' (Java code for Deepgram STT v1 prerecorded/live transcription via REST and WebSocket) and 'when' (explicit 'Use when' clause at the start, plus explicit trigger terms listed at the end). Also clarifies when NOT to use it by pointing to sibling skills. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms explicitly listed: 'transcribe', 'speech to text', 'STT', 'listen v1', 'nova-3', 'live transcription', 'websocket transcription'. These are terms users would naturally use when working with speech-to-text functionality. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive: scoped to a specific API version (`/v1/listen`), specific language (Java), and specific vendor (Deepgram). Explicitly delineates boundaries by naming three sibling skills for analytics, v2/Flux, and voice agents, minimizing overlap risk. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent code examples covering REST and WebSocket transcription patterns. Its main weaknesses are moderate verbosity (the full visitor pattern example is lengthy and some sections could be tightened) and the lack of error handling/validation guidance for a workflow that involves network calls and streaming connections. The gotchas section is a strong addition that captures real SDK-specific pitfalls.

Suggestions

Add a brief error handling example (e.g., try/catch around connect or transcribeUrl) to improve workflow clarity for failure scenarios.

Trim the REST URL visitor example — consider showing just the happy path inline and noting the accepted-response branch in a comment, since the full visitor is verbose.
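
The error-handling suggestion above could take roughly the following shape. This is a minimal sketch, not code from the skill or the Deepgram SDK: `callWithRetry` is a hypothetical helper, and the lambda in `main` is only a stand-in for a real network call such as `transcribeUrl`, whose actual signature and exception types are not shown on this page.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class TranscribeWithRetry {

    /**
     * Hypothetical helper: runs the call and retries on failure,
     * waiting a little longer before each new attempt.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (InterruptedException e) {
                // Do not retry an interrupted call; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // linear backoff
                }
            }
        }
        throw last; // every attempt failed; surface the final error
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an SDK call (assumption): fails twice, then succeeds.
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated network failure");
            }
            return "transcript";
        }, 3, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real code the catch clause would likely be narrowed to whatever exception types the SDK documents, so that non-retryable failures such as bad credentials are not retried.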

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient with good code examples, but includes some unnecessary verbosity: the full visitor pattern in the REST URL example is quite long and could be trimmed, the 'Central product skills' section at the bottom adds marginal value, and some explanatory sentences (like 'Default API-key auth sends Authorization: Token <apiKey>') could be folded into the gotchas section where it already appears. | 2 / 3 |
| Actionability | All code examples are fully executable with correct imports, concrete builder patterns, and real method calls. The REST URL, REST file, WebSocket, and async examples are all copy-paste ready with specific class names and method signatures. | 3 / 3 |
| Workflow Clarity | The skill clearly separates REST vs WebSocket paths and provides good sequential guidance for WebSocket (register handlers → connect → send audio → close). However, there are no explicit validation checkpoints: no error handling examples, no guidance on what to do when transcription fails, and no verification steps for checking response quality or handling connection failures. | 2 / 3 |
| Progressive Disclosure | The skill is well-structured with clear sections progressing from authentication to quick starts to advanced details. It cleanly references other skills (audio-intelligence, conversational-stt, voice-agent), example files in the repo, and external API references with clear one-level-deep navigation. The 'API reference (layered)' section is a good example of progressive disclosure. | 3 / 3 |
| Total | | 10 / 12 (Passed) |
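
The WebSocket ordering noted under Workflow Clarity (register handlers, connect, send audio, close) can be made failure-safe with try-with-resources. The sketch below only illustrates that ordering: `FakeSession` and its methods are hypothetical stand-ins invented for this example and are not the Deepgram SDK's WebSocket types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LiveSessionOrder {

    /** Hypothetical stand-in for a streaming connection (not an SDK class). */
    static class FakeSession implements AutoCloseable {
        final List<String> log = new ArrayList<>();
        private Consumer<String> onTranscript = t -> {};
        private boolean connected;

        void onTranscript(Consumer<String> handler) {
            log.add("register");
            onTranscript = handler;
        }

        void connect() {
            log.add("connect");
            connected = true;
        }

        void sendAudio(byte[] chunk) {
            if (!connected) {
                throw new IllegalStateException("connect() must be called before sendAudio()");
            }
            log.add("send");
            onTranscript.accept("received " + chunk.length + " bytes");
        }

        @Override
        public void close() {
            log.add("close");
            connected = false;
        }
    }

    /** Runs the four steps in order and returns the recorded call sequence. */
    static List<String> run(List<byte[]> chunks) {
        FakeSession session = new FakeSession();
        List<String> transcripts = new ArrayList<>();
        // try-with-resources guarantees close() even if a send fails midway.
        try (FakeSession s = session) {
            s.onTranscript(transcripts::add); // 1. register handlers first
            s.connect();                      // 2. then open the connection
            for (byte[] chunk : chunks) {
                s.sendAudio(chunk);           // 3. stream audio
            }
        }                                     // 4. close happens here
        return session.log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(new byte[320], new byte[320])));
    }
}
```

Registering handlers before connecting avoids missing events that arrive immediately after the connection opens, which is the reason the ordering matters.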

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
deepgram/deepgram-java-sdk
Reviewed
