# Deepgram Python SDK

The official Python SDK for the Deepgram automated speech recognition platform, enabling developers to integrate AI-powered speech-to-text, text-to-speech, and audio intelligence capabilities into their applications. The SDK provides real-time streaming transcription over WebSocket connections, batch processing of prerecorded audio files, text-to-speech synthesis, conversational AI agents, text intelligence analysis, and full project management through Deepgram's platform APIs.
## Package Information

- **Package Name**: deepgram-sdk
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install deepgram-sdk`
- **Python Version**: 3.10+
## Core Imports

```python
from deepgram import DeepgramClient, DeepgramClientOptions
```

Common imports for specific functionality:

```python
# For speech-to-text
from deepgram import (
    ListenRESTClient, ListenWebSocketClient,
    ListenRESTOptions, ListenWebSocketOptions
)

# For text-to-speech
from deepgram import (
    SpeakRESTClient, SpeakWebSocketClient,
    SpeakRESTOptions, SpeakWSOptions
)

# For text analysis
from deepgram import AnalyzeClient, AnalyzeOptions

# For project management
from deepgram import ManageClient

# For conversational AI
from deepgram import AgentWebSocketClient
```
## Basic Usage

```python
from deepgram import DeepgramClient

# Initialize client with an explicit API key
client = DeepgramClient(api_key="your-api-key")

# Alternative: initialize from the environment
# (set the DEEPGRAM_API_KEY environment variable)
client = DeepgramClient()

# Speech-to-text with prerecorded audio
from deepgram import UrlSource, ListenRESTOptions
source = UrlSource(url="https://example.com/audio.wav")
options = ListenRESTOptions(model="nova-2", language="en-US")
response = client.listen.rest.transcribe_url(source, options)
print(response.results.channels[0].alternatives[0].transcript)

# Text-to-speech
from deepgram import TextSource, SpeakRESTOptions
source = TextSource(text="Hello, world!")
options = SpeakRESTOptions(model="aura-asteria-en")
# Save the synthesized audio to a file
client.speak.rest.save("output.wav", source, options)
```
## Architecture

The Deepgram SDK is organized around a main client (`DeepgramClient`) that provides access to different service routers:

- **Listen Router**: Speech-to-text capabilities (REST and WebSocket)
- **Speak Router**: Text-to-speech capabilities (REST and WebSocket)
- **Read Router**: Text analysis and intelligence
- **Manage Router**: Account, project, and usage management (sync and async variants)
- **Agent Router**: Conversational AI WebSocket connections
- **Auth Router**: Authentication token management (sync and async variants)
- **Self-Hosted Router**: On-premises deployment support (sync and async variants)

Each router provides both synchronous and asynchronous clients, with REST interfaces for batch processing and WebSocket interfaces for real-time streaming.
### Router Access Patterns

```python
# Synchronous access
client.listen.rest            # ListenRESTClient
client.listen.websocket       # ListenWebSocketClient
client.speak.rest             # SpeakRESTClient
client.speak.websocket        # SpeakWebSocketClient
client.read                   # ReadClient / AnalyzeClient
client.manage                 # ManageClient
client.auth.v("1")            # AuthRESTClient
client.selfhosted             # SelfHostedClient
client.agent                  # AgentWebSocketClient

# Asynchronous access
client.listen.asyncrest       # AsyncListenRESTClient
client.listen.asyncwebsocket  # AsyncListenWebSocketClient
client.speak.asyncrest        # AsyncSpeakRESTClient
client.speak.asyncwebsocket   # AsyncSpeakWebSocketClient
client.read                   # AsyncReadClient / AsyncAnalyzeClient
client.asyncmanage            # AsyncManageClient
client.asyncauth.v("1")       # AsyncAuthRESTClient
client.asyncselfhosted        # AsyncSelfHostedClient
client.agent                  # AsyncAgentWebSocketClient
```
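
A sketch of the asynchronous pattern, assuming the async REST client mirrors the synchronous `transcribe_url` signature and using a placeholder audio URL:

```python
import asyncio
from deepgram import DeepgramClient, ListenRESTOptions, UrlSource

async def main():
    client = DeepgramClient()  # reads DEEPGRAM_API_KEY from the environment
    source = UrlSource(url="https://example.com/audio.wav")  # placeholder URL
    options = ListenRESTOptions(model="nova-2", language="en-US")
    # Async counterpart of the synchronous client.listen.rest call
    response = await client.listen.asyncrest.transcribe_url(source, options)
    print(response.results.channels[0].alternatives[0].transcript)

asyncio.run(main())
```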
## Capabilities

### Speech-to-Text (Listen)

Comprehensive speech recognition capabilities supporting both batch transcription of prerecorded audio and real-time streaming transcription. Includes advanced features like speaker diarization, punctuation, profanity filtering, keyword detection, and multiple language support.

```python { .api }
# REST Client
class ListenRESTClient:
    def transcribe_url(self, source, options): ...
    def transcribe_file(self, source, options): ...

# WebSocket Client
class ListenWebSocketClient:
    def start(self, options): ...
    def send(self, data): ...
    def finish(self): ...
    def close(self): ...

# Options
class ListenRESTOptions:
    model: str
    language: str
    punctuate: bool
    diarize: bool
    # ... additional options

class ListenWebSocketOptions:
    model: str
    language: str
    encoding: str
    sample_rate: int
    # ... additional options
```

[Speech-to-Text](./speech-to-text.md)
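
A minimal live-streaming sketch built only from the methods listed above. It assumes a local file `audio.raw` containing 16 kHz, 16-bit mono PCM, and omits result handling, which is done through event callbacks registered on the WebSocket client:

```python
from deepgram import DeepgramClient, ListenWebSocketOptions

client = DeepgramClient()               # uses DEEPGRAM_API_KEY
ws = client.listen.websocket            # ListenWebSocketClient

options = ListenWebSocketOptions(
    model="nova-2",
    language="en-US",
    encoding="linear16",                # raw 16-bit little-endian PCM
    sample_rate=16000,
)

# Open the connection, stream audio in chunks, then signal end of stream
ws.start(options)
with open("audio.raw", "rb") as audio:  # assumed pre-captured PCM audio
    while chunk := audio.read(8192):
        ws.send(chunk)
ws.finish()
```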
### Text-to-Speech (Speak)

High-quality neural text-to-speech synthesis with multiple voice models and real-time streaming capabilities. Supports both REST API for generating complete audio files and WebSocket streaming for real-time audio generation.

```python { .api }
# REST Client
class SpeakRESTClient:
    def stream(self, source, options): ...
    def save(self, filename, source, options): ...

# WebSocket Client
class SpeakWebSocketClient:
    def start(self, options): ...
    def send(self, message): ...
    def close(self): ...

# Options
class SpeakRESTOptions:
    model: str
    encoding: str
    container: str
    sample_rate: int
    bit_rate: int

class SpeakWSOptions:
    model: str
    encoding: str
    sample_rate: int
```

[Text-to-Speech](./text-to-speech.md)
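
A sketch of the WebSocket text-to-speech path using only the methods listed above; how the generated audio is consumed (event callbacks or a bound `Speaker`) depends on the application and is omitted here:

```python
from deepgram import DeepgramClient, SpeakWSOptions

client = DeepgramClient()
ws = client.speak.websocket             # SpeakWebSocketClient

options = SpeakWSOptions(
    model="aura-asteria-en",
    encoding="linear16",
    sample_rate=24000,
)

# Open the connection, send the text to synthesize, then close
ws.start(options)
ws.send("Hello from the Deepgram Python SDK.")
ws.close()
```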
### Text Analysis (Read)

Advanced text intelligence capabilities including sentiment analysis, topic detection, intent recognition, and content summarization. Processes text content to extract insights and understanding.

```python { .api }
class AnalyzeClient:
    def analyze_url(self, source, options): ...
    def analyze_text(self, source, options): ...

class AnalyzeOptions:
    language: str
    topics: bool
    intents: bool
    sentiment: bool
    summarize: bool
```

[Text Analysis](./text-analysis.md)
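
A sketch of batch text analysis with the synchronous client, assuming `client.read` resolves to an `AnalyzeClient` as listed under Router Access Patterns and that `TextSource` wraps the raw text as shown in the Types section:

```python
from deepgram import DeepgramClient, AnalyzeOptions, TextSource

client = DeepgramClient()
source = TextSource(text="The quarterly results exceeded expectations.")
options = AnalyzeOptions(
    language="en",
    sentiment=True,   # run sentiment analysis
    topics=True,      # detect topics
    summarize=True,   # produce a summary
)

# analyze_url works the same way for text hosted at a URL
response = client.read.analyze_text(source, options)
print(response)
```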
### Project Management (Manage)

Complete account and project management functionality including API key management, usage tracking, team member management, and billing information access.

```python { .api }
class ManageClient:
    def get_projects(self): ...
    def get_project(self, project_id): ...
    def get_keys(self, project_id): ...
    def create_key(self, project_id, options): ...
    def get_usage(self, project_id, options): ...
    def get_balances(self, project_id): ...
    # ... additional management methods
```

[Project Management](./project-management.md)
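
A sketch of basic project inspection using the methods listed above; the response field names (`projects`, `project_id`, `name`) follow the Deepgram Manage API and are an assumption here:

```python
from deepgram import DeepgramClient

client = DeepgramClient()
manage = client.manage                # ManageClient

# List projects, then inspect keys and balances for the first one
projects = manage.get_projects()
first = projects.projects[0]          # assumed response shape
print(first.project_id, first.name)

print(manage.get_keys(first.project_id))
print(manage.get_balances(first.project_id))
```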
### Conversational AI (Agent)

Real-time conversational AI capabilities enabling voice-based interactions with intelligent agents. Supports function calling, dynamic prompt updates, and bidirectional audio streaming.

```python { .api }
class AgentWebSocketClient:
    def start(self, options): ...
    def send_settings(self, settings): ...
    def update_prompt(self, prompt): ...
    def inject_message(self, message): ...
    def close(self): ...

class SettingsOptions:
    agent: dict
    listen: dict
    speak: dict
    think: dict
```

[Conversational AI](./conversational-ai.md)
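
A heavily hedged sketch of wiring up an agent session with the methods listed above. The nested settings values are illustrative assumptions (the real schema is defined by the Agent API), and the audio transport (streaming microphone audio in, playing agent audio out) is omitted:

```python
from deepgram import DeepgramClient, SettingsOptions

client = DeepgramClient()
agent = client.agent                                  # AgentWebSocketClient

# Illustrative configuration; exact keys depend on the Agent API schema
settings = SettingsOptions()
settings.listen = {"model": "nova-2"}                 # speech-to-text config
settings.speak = {"model": "aura-asteria-en"}         # text-to-speech config
settings.think = {"provider": "open_ai", "model": "gpt-4o-mini"}  # LLM config
settings.agent = {"greeting": "Hello! How can I help you today?"}

agent.start(settings)                                 # open the agent session
agent.update_prompt("You are a concise, friendly voice assistant.")
agent.inject_message("Ask the caller what they need help with.")
agent.close()
```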
### Audio Utilities

Utility classes for audio input/output operations including microphone capture and speaker playback, with configurable audio parameters and error handling.

```python { .api }
class Microphone:
    def __init__(self, **kwargs): ...
    def start(self): ...
    def finish(self): ...

class Speaker:
    def __init__(self, **kwargs): ...
    def start(self): ...
    def finish(self): ...

# Constants
INPUT_CHANNELS: int = 1
INPUT_RATE: int = 16000
INPUT_CHUNK: int = 8192
OUTPUT_CHANNELS: int = 1
OUTPUT_RATE: int = 24000
OUTPUT_CHUNK: int = 8192
```

[Audio Utilities](./audio-utilities.md)
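
A sketch that pairs `Microphone` with live transcription, assuming the microphone forwards each captured chunk to a `push_callback` keyword argument (the constructor above only documents `**kwargs`, so this name is an assumption):

```python
import time
from deepgram import DeepgramClient, ListenWebSocketOptions, Microphone

client = DeepgramClient()
ws = client.listen.websocket

options = ListenWebSocketOptions(model="nova-2", encoding="linear16", sample_rate=16000)
ws.start(options)

# Assumed: each captured audio chunk is pushed straight into the WebSocket
mic = Microphone(push_callback=ws.send)
mic.start()

time.sleep(10)      # capture roughly ten seconds of audio

mic.finish()
ws.finish()
```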
264
265
### Authentication (Auth)
266
267
Token management and authentication capabilities for generating temporary JWT tokens from API keys, enabling secure access with configurable time-to-live settings.
268
269
```python { .api }
270
class AuthRESTClient:
271
def grant_token(self, ttl_seconds: int = None) -> GrantTokenResponse: ...
272
273
class AsyncAuthRESTClient:
274
async def grant_token(self, ttl_seconds: int = None) -> GrantTokenResponse: ...
275
276
class GrantTokenResponse:
277
access_token: str
278
expires_in: int
279
```
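
A sketch of minting a short-lived token and handing it to a second client, assuming (per the `DeepgramClient` constructor in the Types section) that the token is passed as `access_token`:

```python
from deepgram import DeepgramClient

# The long-lived API key is used only to mint a short-lived access token
admin = DeepgramClient(api_key="your-api-key")
grant = admin.auth.v("1").grant_token(ttl_seconds=300)  # five-minute token
print(grant.access_token, grant.expires_in)

# Hand the temporary token to a less-trusted component
scoped = DeepgramClient(access_token=grant.access_token)
```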
### Self-Hosted (OnPrem)

Support for on-premises and self-hosted Deepgram deployments with custom endpoint configuration and deployment management.

```python { .api }
class SelfHostedClient:
    def __init__(self, config: DeepgramClientOptions): ...

class AsyncSelfHostedClient:
    def __init__(self, config: DeepgramClientOptions): ...

# Backward compatibility aliases
class OnPremClient(SelfHostedClient): ...
class AsyncOnPremClient(AsyncSelfHostedClient): ...
```
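
A sketch of pointing the SDK at a self-hosted deployment, assuming the `url` field of `DeepgramClientOptions` (see the Types section) controls the API endpoint; the hostname is a placeholder:

```python
from deepgram import DeepgramClient, DeepgramClientOptions

# Route all requests to a self-hosted deployment instead of the hosted API
config = DeepgramClientOptions(
    url="https://deepgram.internal.example.com",  # placeholder on-prem endpoint
)
client = DeepgramClient(api_key="your-api-key", config=config)

# The routers now target the custom endpoint
print(client.selfhosted)    # SelfHostedClient bound to the configured URL
```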
## Types

```python { .api }
class DeepgramClient:
    def __init__(self, api_key: str = "", config: DeepgramClientOptions = None, access_token: str = ""): ...
    @property
    def listen(self): ...
    @property
    def speak(self): ...
    @property
    def read(self): ...
    @property
    def manage(self): ...
    @property
    def asyncmanage(self): ...
    @property
    def agent(self): ...
    @property
    def auth(self): ...
    @property
    def asyncauth(self): ...
    @property
    def selfhosted(self): ...
    @property
    def asyncselfhosted(self): ...

class DeepgramClientOptions:
    api_key: str
    access_token: str
    url: str
    verbose: int
    headers: dict
    options: dict

# Source types for different input methods
class TextSource:
    def __init__(self, text: str): ...

class BufferSource:
    def __init__(self, buffer: bytes): ...

class FileSource:
    def __init__(self, file: str): ...

class UrlSource:
    def __init__(self, url: str): ...

class StreamSource:
    def __init__(self, stream): ...

# Base response class
class BaseResponse:
    def __init__(self, **kwargs): ...
```
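
A sketch of building a client with tuned `DeepgramClientOptions`; the assumptions here are that `verbose` accepts a standard logging level and that `headers` is merged into every request:

```python
import logging
from deepgram import DeepgramClient, DeepgramClientOptions

# Client-wide configuration (logging verbosity and extra request headers)
config = DeepgramClientOptions(
    verbose=logging.DEBUG,                          # assumed: standard logging level
    headers={"X-Request-Source": "docs-example"},   # assumed: sent with every request
)
client = DeepgramClient(api_key="your-api-key", config=config)
```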
## Error Handling

```python { .api }
class DeepgramError(Exception):
    """Base exception for Deepgram SDK errors"""

class DeepgramApiError(DeepgramError):
    """API response errors"""

class DeepgramApiKeyError(DeepgramError):
    """Missing or invalid API key"""

class DeepgramTypeError(DeepgramError):
    """Type validation errors"""

class DeepgramMicrophoneError(Exception):
    """Microphone operation errors"""

class DeepgramSpeakerError(Exception):
    """Speaker operation errors"""
```
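
A sketch of handling the SDK-specific exceptions around a REST call, assuming the exception classes are importable from the top-level `deepgram` package like the other names in this document; the audio URL is a placeholder:

```python
from deepgram import (
    DeepgramApiError,
    DeepgramClient,
    DeepgramTypeError,
    ListenRESTOptions,
    UrlSource,
)

client = DeepgramClient()

try:
    source = UrlSource(url="https://example.com/audio.wav")  # placeholder URL
    options = ListenRESTOptions(model="nova-2")
    response = client.listen.rest.transcribe_url(source, options)
    print(response.results.channels[0].alternatives[0].transcript)
except DeepgramApiError as err:
    # The API rejected the request (bad key, malformed options, quota, ...)
    print(f"Deepgram API error: {err}")
except DeepgramTypeError as err:
    # Locally detected validation problem before the request was sent
    print(f"Invalid request options: {err}")
```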