
deepgram-js-speech-to-text

Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live audio transcription. Covers `client.listen.v1.media.transcribeUrl` / `transcribeFile` (REST) plus `client.listen.v1.createConnection()` / `connect()` (WebSocket). Use `deepgram-js-audio-intelligence` for summarize/sentiment/topics/diarize overlays, `deepgram-js-conversational-stt` for Flux turn-taking on `/v2/listen`, and `deepgram-js-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen.v1", "nova-3", "live transcription", and "websocket transcription".
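For orientation, the prerecorded path described above ends in an HTTP POST to `/v1/listen`. The following is a minimal sketch of that request made directly over HTTP rather than through the SDK, so it does not show the signature of `transcribeUrl` itself. The helper name `transcribeUrlRaw` and the `FetchLike` type are hypothetical and exist only so the request can be stubbed. The sketch assumes the public `https://api.deepgram.com` host, `Token` authorization, and the usual `results.channels[].alternatives[].transcript` response shape.

```typescript
// Hypothetical minimal fetch shape, declared locally so the sketch can be stubbed in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Only the fields this sketch reads from the response (assumed shape).
interface ListenResponse {
  results?: { channels?: { alternatives?: { transcript?: string }[] }[] };
}

// Hypothetical helper (not part of the SDK): POST a remote audio URL to /v1/listen.
async function transcribeUrlRaw(
  apiKey: string,
  audioUrl: string,
  fetchImpl: FetchLike,
  model = "nova-3",
): Promise<string> {
  const endpoint = `https://api.deepgram.com/v1/listen?model=${encodeURIComponent(model)}`;
  const res = await fetchImpl(endpoint, {
    method: "POST",
    headers: {
      Authorization: `Token ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url: audioUrl }),
  });
  if (!res.ok) {
    throw new Error(`Deepgram request failed with HTTP ${res.status}`);
  }
  const data = (await res.json()) as ListenResponse;
  // Return the first alternative of the first channel, or an empty string if absent.
  return data.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
}
```

In a runtime that provides a global `fetch`, it could be passed as `fetchImpl`. In code that follows this skill, the SDK call is the preferred route, and the raw request is shown only to make the endpoint concrete.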

75

Quality

92%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that hits all the marks: it opens with an explicit 'Use when' clause, lists concrete API methods and endpoints, provides rich natural trigger terms, and proactively distinguishes itself from related skills by naming them and their domains. The only minor note is its density, but the information is all functional and non-redundant.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions and API methods: transcribeUrl, transcribeFile (REST), createConnection/connect (WebSocket), and clearly names the endpoint `/v1/listen` with both prerecorded and live audio transcription use cases. | 3 / 3 |
| Completeness | Clearly answers both 'what' (calls Deepgram STT v1 for prerecorded/live audio transcription via REST and WebSocket) and 'when' (explicit 'Use when' clause at the start, plus explicit trigger terms and boundary conditions distinguishing from related skills). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'transcribe', 'speech to text', 'STT', 'live transcription', 'websocket transcription', 'nova-3', and 'listen.v1'. These cover both casual and technical variations. | 3 / 3 |
| Distinctiveness Conflict Risk | Exceptionally distinctive: explicitly delineates boundaries by naming sibling skills for adjacent capabilities (audio intelligence, conversational STT v2, voice agent), making it very clear when NOT to use this skill and reducing conflict risk. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, well-structured skill that provides immediately actionable code examples for all primary use cases (REST URL, REST file, WebSocket streaming). The progressive disclosure is excellent with clear routing to related skills and external references. Minor verbosity in the introductory sections and use-case descriptions could be trimmed, but overall the content is efficient and comprehensive.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Generally efficient but includes some unnecessary explanation (e.g., 'Good for batch jobs, caption generation, offline processing' and 'Good for live captions, microphone audio, telephony streams' are somewhat redundant for Claude). The 'When to use this product' section and cross-references to other skills are useful but slightly verbose. The gotchas and parameter lists are well-condensed. | 2 / 3 |
| Actionability | Provides fully executable, copy-paste-ready code examples for all three main use cases (REST URL, REST file, WebSocket). Includes specific method names, event types, and concrete parameter values. The authentication setup is also complete and executable. | 3 / 3 |
| Workflow Clarity | The WebSocket workflow clearly sequences the two-stage flow (createConnection → register handlers → connect → waitForOpen → sendMedia → sendFinalize). The gotchas section serves as validation checkpoints, explicitly calling out common failure modes like missing Finalize, auth wrapper issues, and encoding mismatches. For a non-destructive API integration skill, this level of workflow guidance is thorough. | 3 / 3 |
| Progressive Disclosure | Well-structured with clear sections progressing from quick starts to parameters to API references. External references are one level deep and clearly signaled (in-repo reference.md, OpenAPI/AsyncAPI URLs, product docs, Context7 ID). Example files are listed for deeper exploration. The cross-skill routing at the top is a nice navigation aid. | 3 / 3 |
| Total | | 11 / 12 |

Passed
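The WebSocket ordering noted under Workflow Clarity (createConnection → register handlers → connect → waitForOpen → sendMedia → sendFinalize) can be sketched as follows. This is an illustration of the call order only: `LiveConnectionLike` and `streamChunks` are hypothetical names, and the method signatures are assumptions inferred from the method names in the review, not the SDK's actual types.

```typescript
// Hypothetical interface: method names come from the review text above;
// the parameter and return types are assumptions made for this sketch.
interface LiveConnectionLike {
  on(event: string, handler: (payload: unknown) => void): void;
  connect(): void;
  waitForOpen(): Promise<void>;
  sendMedia(chunk: Uint8Array): void;
  sendFinalize(): void;
}

// Hypothetical helper that drives a connection through the documented order.
async function streamChunks(
  createConnection: () => Promise<LiveConnectionLike>,
  chunks: readonly Uint8Array[],
  onMessage: (payload: unknown) => void,
): Promise<void> {
  // 1. Create the connection object (no socket traffic is assumed yet).
  const connection = await createConnection();
  // 2. Register handlers before connecting so early messages are not missed.
  connection.on("message", onMessage);
  // 3. Open the socket.
  connection.connect();
  // 4. Wait until the socket is open before sending audio.
  await connection.waitForOpen();
  // 5. Send each audio chunk in order.
  for (let i = 0; i < chunks.length; i++) {
    connection.sendMedia(chunks[i]);
  }
  // 6. Finalize so the server flushes any buffered audio into results.
  connection.sendFinalize();
}
```

Keeping the sequence in one function makes the "missing Finalize" failure mode mentioned in the gotchas harder to hit, since the finalize call always follows the last chunk.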

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
deepgram/deepgram-js-sdk
Reviewed
