deepgram-js-audio-intelligence

Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram audio analytics overlays on `/v1/listen` - summarize, topics, intents, sentiment, diarize, redact, detect_language, and entity detection. Same endpoint as plain STT, different params. Covers REST via `client.listen.v1.media.transcribeUrl` / `transcribeFile` and the WebSocket-supported subset on `client.listen.v1.createConnection()` / `connect()`. Use `deepgram-js-speech-to-text` for plain transcription and `deepgram-js-text-intelligence` for analytics on already-transcribed text. Triggers include "audio intelligence", "summarize audio", "diarize", "sentiment from audio", "redact PII", and "detect language audio".

Using Deepgram Audio Intelligence (JavaScript / TypeScript SDK)

Name: deepgram-js-audio-intelligence
Rating: 71.2 (1 review)
Author: deepgram

Analytics overlays applied to /v1/listen: summaries, topics, intents, sentiment, language detection, diarization, redaction, entities. Same client surface as STT; turn features on with parameters.

When to use this product

You have audio and want analytics returned alongside the transcript.
REST is the primary path; the WebSocket path supports only a subset of intelligence features.

Use a different skill when:

You just want transcript output → deepgram-js-speech-to-text.
You already have text and want analytics on that text → deepgram-js-text-intelligence.
You need Flux turn-taking → deepgram-js-conversational-stt.
You need a full interactive voice agent → deepgram-js-voice-agent.

Feature availability: REST vs WSS

Feature	REST	WSS
`diarize`	yes	yes
`redact`	yes	yes
`detect_entities`	yes	yes
`punctuate`, `smart_format`	yes	yes
`summarize`	yes	no (not exposed in current WSS connect args)
`topics`	yes	no
`intents`	yes	no
`sentiment`	yes	no
`detect_language`	yes	no
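
The table above can be turned into a small pre-flight check so that REST-only flags never reach a WebSocket connection. This is a minimal sketch: `findRestOnlyFlags` and `assertWssCompatible` are hypothetical helpers, not part of the SDK, and the flag list simply mirrors the table.

```javascript
// Flags the table above marks as REST-only (not in current WSS connect args).
const REST_ONLY_FLAGS = new Set([
  "summarize",
  "topics",
  "intents",
  "sentiment",
  "detect_language",
]);

// Hypothetical helper: list the REST-only flags present in an options object.
function findRestOnlyFlags(options) {
  return Object.keys(options).filter((key) => REST_ONLY_FLAGS.has(key));
}

// Hypothetical helper: throw early instead of sending unsupported flags to WSS.
function assertWssCompatible(options) {
  const offending = findRestOnlyFlags(options);
  if (offending.length > 0) {
    throw new Error(
      `REST-only flags passed to a WSS connection: ${offending.join(", ")}`
    );
  }
  return options;
}
```

One way to use it is to wrap the connection args, for example `createConnection(assertWssCompatible({ model: "nova-3", diarize: true }))`, so a misplaced `topics: true` fails loudly in your own code.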

Authentication

require("dotenv").config();

const { DeepgramClient } = require("@deepgram/sdk");

const deepgramClient = new DeepgramClient({
  apiKey: process.env.DEEPGRAM_API_KEY,
});
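
If `DEEPGRAM_API_KEY` is missing, the failure may only surface on the first request. A small guard can surface it at startup instead. This is a sketch under the assumption that the key is read from an environment-like object; `requireApiKey` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: read the API key from an env-like object
// and fail with a clear message when it is missing or blank.
function requireApiKey(env) {
  const key = (env.DEEPGRAM_API_KEY || "").trim();
  if (!key) {
    throw new Error(
      "DEEPGRAM_API_KEY is not set; add it to .env or the environment"
    );
  }
  return key;
}

// Possible usage with the client construction shown above:
// const deepgramClient = new DeepgramClient({ apiKey: requireApiKey(process.env) });
```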

Quick start — REST with analytics

From examples/22-transcription-advanced-options.ts:

const data = await deepgramClient.listen.v1.media.transcribeUrl({
  url: "https://dpgr.am/spacewalk.wav",
  model: "nova-3",
  language: "en",
  punctuate: true,
  paragraphs: true,
  utterances: true,
  smart_format: true,
  sentiment: true,
  topics: true,
  custom_topic: "custom_topic",
  custom_topic_mode: "extended",
  intents: true,
  custom_intent: "custom_intent",
  custom_intent_mode: "extended",
  detect_entities: true,
  detect_language: true,
  diarize: true,
  keyterm: ["keyword1", "keyword2"],
  redact: ["pci", "ssn"],
});
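
With that many overlays enabled, the response carries analytics in several places. The sketch below collects them into one flat object. The field paths (`results.channels[0].alternatives[0]`, `results.sentiments.average`, `results.topics.segments`, `results.intents.segments`, `entities`) are assumptions based on the documented `/v1/listen` response shape, and `extractAnalytics` is a hypothetical helper; check the generated response types in the SDK before relying on any of them.

```javascript
// Hypothetical helper: flatten the analytics parts of a /v1/listen response.
// Every field path here is an assumption; missing parts fall back to defaults.
function extractAnalytics(data) {
  const results = data?.results ?? {};
  const channel = results.channels?.[0];
  const alternative = channel?.alternatives?.[0];

  // Collect unique names (e.g. topic or intent) across all segments.
  const uniqueNames = (segments, listKey, nameKey) => [
    ...new Set(
      (segments ?? []).flatMap((segment) =>
        (segment[listKey] ?? []).map((item) => item[nameKey])
      )
    ),
  ];

  return {
    transcript: alternative?.transcript ?? "",
    detectedLanguage: channel?.detected_language ?? null,
    sentiment: results.sentiments?.average?.sentiment ?? null,
    topics: uniqueNames(results.topics?.segments, "topics", "topic"),
    intents: uniqueNames(results.intents?.segments, "intents", "intent"),
    entities: (alternative?.entities ?? []).map((entity) => ({
      label: entity.label,
      value: entity.value,
    })),
  };
}
```

Calling `extractAnalytics(data)` on the result of the request above would give one object to log or store; an overlay that was not requested simply yields `null` or an empty array.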

Quick start — WSS subset

Start from examples/07-transcription-live-websocket.ts and keep the same socket flow, but only use WSS-supported intelligence flags such as diarize, redact, and detect_entities in the connection args.

const deepgramConnection = await deepgramClient.listen.v1.createConnection({
  model: "nova-3",
  diarize: true,
  redact: "pci",
  detect_entities: true,
});
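
With `diarize: true`, each word in a result is expected to carry a speaker index. The sketch below groups consecutive words from the same speaker into turns. It assumes word objects shaped like `{ word, punctuated_word, speaker }`, which matches the documented transcript word shape but is not confirmed by this repo; `toSpeakerTurns` is a hypothetical helper.

```javascript
// Hypothetical helper: collapse diarized words into consecutive speaker turns.
// Assumes each word has `word`, an optional `punctuated_word`, and `speaker`.
function toSpeakerTurns(words) {
  const turns = [];
  for (const w of words) {
    const text = w.punctuated_word ?? w.word;
    const last = turns[turns.length - 1];
    if (last && last.speaker === w.speaker) {
      // Same speaker as the previous word: extend the current turn.
      last.text += " " + text;
    } else {
      // Speaker changed (or first word): start a new turn.
      turns.push({ speaker: w.speaker, text });
    }
  }
  return turns;
}
```

The same helper would apply to the words array of a REST response, since it only looks at the word objects themselves.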

Key parameters / API surface

Analytics flags: summarize, topics, intents, sentiment, detect_language, detect_entities, diarize, redact, custom_topic, custom_topic_mode, custom_intent, custom_intent_mode.
Standard STT flags still apply: model, language, encoding, sample_rate, punctuate, smart_format, utterances, paragraphs, multichannel.
Nova-3-specific biasing in repo examples uses keyterm, not keywords.
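
The keyterm-versus-keywords choice can be made in one place rather than at every call site. This is a minimal sketch: `biasingOptions` is a hypothetical helper, and the rule that non-Nova-3 models take `keywords` is an assumption drawn from the Deepgram API rather than from this repo.

```javascript
// Hypothetical helper: pick the biasing parameter that matches the model.
// Nova-3 models get `keyterm`; other models are assumed to use `keywords`.
function biasingOptions(model, terms) {
  if (terms.length === 0) {
    return {}; // nothing to bias, so add no parameter at all
  }
  return model.startsWith("nova-3") ? { keyterm: terms } : { keywords: terms };
}

// Possible usage: spread the result into the request options.
// const options = { model: "nova-3", ...biasingOptions("nova-3", ["Deepgram"]) };
```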

API reference (layered)

In-repo reference: reference.md → Listen V1 Media; WSS subset behavior lives in src/CustomClient.ts and src/api/resources/listen/resources/v1/client/{Client,Socket}.ts.
Canonical OpenAPI (REST): https://developers.deepgram.com/openapi.yaml
Canonical AsyncAPI (WSS): https://developers.deepgram.com/asyncapi.yaml
Context7: library ID /llmstxt/developers_deepgram_llms_txt
Product docs:

Gotchas

summarize on /v1/listen is versioned, not a plain boolean. The generated REST surface and examples point at "v2".
Most intelligence flags are REST-only. Current WSS connect args do not expose topics, intents, sentiment, summarize, or detect_language.
redact typing is looser in practice than in the generated alias. Examples pass arrays like ["pci", "ssn"], even though ListenV1Redact itself is just a string alias.
Use keyterm for Nova-3 biasing. examples/22-transcription-advanced-options.ts explicitly notes that keywords are not supported for Nova-3.
Model/feature support is decided by the Deepgram service, not the SDK. nova-3 is the safest choice when mixing many overlays.
Diarization quality depends on audio quality and duration. On short or noisy clips, speaker labels tend to be unstable.
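
The summarize and redact gotchas can be handled by normalizing options before a REST call. This is a sketch, not SDK behavior: `normalizeOverlays` is a hypothetical helper, it assumes `"v2"` is the intended summarize version as noted above, and it mirrors the array form of `redact` used in the repo example.

```javascript
// Hypothetical helper: normalize overlay options before a REST request.
function normalizeOverlays(options) {
  const out = { ...options }; // never mutate the caller's object

  // summarize is versioned: map a bare `true` to the "v2" string.
  if (out.summarize === true) {
    out.summarize = "v2";
  }

  // redact may arrive as one string or an array; use a de-duplicated array.
  if (typeof out.redact === "string") {
    out.redact = [out.redact];
  } else if (Array.isArray(out.redact)) {
    out.redact = [...new Set(out.redact)];
  }

  return out;
}
```

For example, `normalizeOverlays({ summarize: true, redact: "pci" })` would produce `summarize: "v2"` and `redact: ["pci"]`, leaving every other option untouched.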

Example files in this repo

examples/22-transcription-advanced-options.ts
examples/04-transcription-prerecorded-url.ts
examples/05-transcription-prerecorded-file.ts
examples/07-transcription-live-websocket.ts

Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

npx skills add deepgram/skills

This SDK ships language-idiomatic code skills; deepgram/skills ships cross-language product knowledge (see api, docs, recipes, examples, starters, setup-mcp).

Repository: deepgram/deepgram-js-sdk
Commit: bcffba7

Last updated: 11 days ago
Created: 11 days ago
