tessl/pypi-deepgram-sdk

The official Python SDK for the Deepgram automated speech recognition platform.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/deepgram-sdk@4.8.x

To install, run `npx @tessl/cli install tessl/pypi-deepgram-sdk@4.8.0`.

# Deepgram Python SDK

The official Python SDK for the Deepgram automated speech recognition platform, enabling developers to integrate advanced AI-powered speech-to-text, text-to-speech, and audio intelligence capabilities into their applications. The SDK offers comprehensive functionality including real-time streaming transcription via WebSocket connections, batch processing of pre-recorded audio files, text-to-speech synthesis, conversational AI agents, text intelligence analysis, and complete project management through Deepgram's platform APIs.

## Package Information

- **Package Name**: deepgram-sdk
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install deepgram-sdk`
- **Python Version**: 3.10+

## Core Imports

```python
from deepgram import DeepgramClient, DeepgramClientOptions
```

Common imports for specific functionality:

```python
# For speech-to-text
from deepgram import (
    ListenRESTClient, ListenWebSocketClient,
    ListenRESTOptions, ListenWebSocketOptions
)

# For text-to-speech
from deepgram import (
    SpeakRESTClient, SpeakWebSocketClient,
    SpeakRESTOptions, SpeakWSOptions
)

# For text analysis
from deepgram import AnalyzeClient, AnalyzeOptions

# For project management
from deepgram import ManageClient

# For conversational AI
from deepgram import AgentWebSocketClient
```

## Basic Usage

```python
from deepgram import DeepgramClient, DeepgramClientOptions
import os

# Initialize client with API key
client = DeepgramClient(api_key="your-api-key")

# Alternative: Initialize with environment variables
# Set DEEPGRAM_API_KEY environment variable
client = DeepgramClient()

# Speech-to-text with prerecorded audio
from deepgram import UrlSource, ListenRESTOptions
source = UrlSource("https://example.com/audio.wav")
options = ListenRESTOptions(model="nova-2", language="en-US")
response = client.listen.rest.transcribe_url(source, options)
print(response.results.channels[0].alternatives[0].transcript)

# Text-to-speech
from deepgram import TextSource, SpeakRESTOptions
source = TextSource("Hello, world!")
options = SpeakRESTOptions(model="aura-asteria-en")
response = client.speak.rest.stream(source, options)
# Save audio to file
with open("output.wav", "wb") as f:
    f.write(response.content)
```
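
The Basic Usage snippet reads a transcript through the attribute path `response.results.channels[0].alternatives[0].transcript`. As a minimal sketch, the same path can be walked for every channel rather than only the first. `first_transcripts` is a hypothetical helper, not part of the SDK, and the `SimpleNamespace` object below is only a stand-in shaped like that attribute path so the example runs without the SDK or network access.

```python
from types import SimpleNamespace


def first_transcripts(response):
    """Return the first alternative's transcript for each channel.

    Hypothetical helper: it relies only on the attribute path used in the
    Basic Usage example (results.channels[i].alternatives[0].transcript).
    """
    transcripts = []
    for channel in response.results.channels:
        if channel.alternatives:  # skip channels that carry no alternatives
            transcripts.append(channel.alternatives[0].transcript)
    return transcripts


# Stand-in object shaped like the response above (not a real SDK type).
fake_response = SimpleNamespace(
    results=SimpleNamespace(
        channels=[
            SimpleNamespace(alternatives=[SimpleNamespace(transcript="hello world")]),
            SimpleNamespace(alternatives=[]),
        ]
    )
)

print(first_transcripts(fake_response))
```

Guarding against an empty `alternatives` list avoids an `IndexError` on channels without a result; whether real responses can contain such channels is an assumption here.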

## Architecture

The Deepgram SDK is organized around a main client (`DeepgramClient`) that provides access to different service routers:

- **Listen Router**: Speech-to-text capabilities (REST and WebSocket)
- **Speak Router**: Text-to-speech capabilities (REST and WebSocket)
- **Read Router**: Text analysis and intelligence
- **Manage Router**: Account, project, and usage management (sync and async variants)
- **Agent Router**: Conversational AI WebSocket connections
- **Auth Router**: Authentication token management (sync and async variants)
- **Self-hosted Router**: On-premises deployment support (sync and async variants)

Most routers expose both synchronous and asynchronous clients. The Listen and Speak routers additionally provide REST interfaces for batch processing and WebSocket interfaces for real-time streaming.

### Router Access Patterns

```python
# Synchronous access
client.listen.rest             # ListenRESTClient
client.listen.websocket        # ListenWebSocketClient
client.speak.rest              # SpeakRESTClient
client.speak.websocket         # SpeakWebSocketClient
client.read                    # ReadClient/AnalyzeClient
client.manage                  # ManageClient
client.auth.v("1")             # AuthRESTClient
client.selfhosted              # SelfHostedClient
client.agent                   # AgentWebSocketClient

# Asynchronous access
client.listen.asyncrest        # AsyncListenRESTClient
client.listen.asyncwebsocket   # AsyncListenWebSocketClient
client.speak.asyncrest         # AsyncSpeakRESTClient
client.speak.asyncwebsocket    # AsyncSpeakWebSocketClient
client.read                    # AsyncReadClient/AsyncAnalyzeClient
client.asyncmanage             # AsyncManageClient
client.asyncauth.v("1")        # AsyncAuthRESTClient
client.asyncselfhosted         # AsyncSelfHostedClient
client.agent                   # AsyncAgentWebSocketClient
```
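
The listing pairs each synchronous attribute path with an asynchronous counterpart, but the naming is not fully uniform (for example `manage` becomes `asyncmanage`, while `listen.rest` becomes `listen.asyncrest`). A small lookup table can make the pairing explicit. This is only a sketch: `ASYNC_COUNTERPART` and `resolve` are hypothetical names, the table copies just the plain attribute paths from the listing (the `auth.v("1")` entries involve a call and are left out), and a stub object replaces a real client so the code runs without the SDK.

```python
from functools import reduce
from types import SimpleNamespace

# Sync -> async attribute paths, copied from the listing above.
ASYNC_COUNTERPART = {
    "listen.rest": "listen.asyncrest",
    "listen.websocket": "listen.asyncwebsocket",
    "speak.rest": "speak.asyncrest",
    "speak.websocket": "speak.asyncwebsocket",
    "manage": "asyncmanage",
    "selfhosted": "asyncselfhosted",
}


def resolve(client, path):
    """Follow a dotted attribute path such as "listen.asyncrest" on a client."""
    return reduce(getattr, path.split("."), client)


# Stub client so the sketch runs without the SDK installed.
stub = SimpleNamespace(
    listen=SimpleNamespace(rest="sync-rest", asyncrest="async-rest"),
)

print(resolve(stub, ASYNC_COUNTERPART["listen.rest"]))
```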

## Capabilities

### Speech-to-Text (Listen)

Comprehensive speech recognition capabilities supporting both batch transcription of prerecorded audio and real-time streaming transcription. Includes advanced features like speaker diarization, punctuation, profanity filtering, keyword detection, and multiple language support.

```python { .api }
# REST Client
class ListenRESTClient:
    def transcribe_url(self, source, options): ...
    def transcribe_file(self, source, options): ...

# WebSocket Client
class ListenWebSocketClient:
    def start(self, options): ...
    def send(self, data): ...
    def finish(self): ...
    def close(self): ...

# Options
class ListenRESTOptions:
    model: str
    language: str
    punctuate: bool
    diarize: bool
    # ... additional options

class ListenWebSocketOptions:
    model: str
    language: str
    encoding: str
    sample_rate: int
    # ... additional options
```

[Speech-to-Text](./speech-to-text.md)
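
The WebSocket client above exposes a `start` / `send` / `finish` lifecycle, which implies that audio is delivered in pieces rather than as one payload. The following sketch shows one way to split a byte buffer into fixed-size chunks and hand each to a sender. `iter_chunks` and `send_audio` are hypothetical helpers, the 8192-byte default is an arbitrary choice, and `send` is any callable accepting bytes; with the SDK it would presumably be the `send` method of a started `ListenWebSocketClient`, which is an assumption rather than something verified here.

```python
from typing import Callable, Iterator


def iter_chunks(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `audio`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def send_audio(send: Callable[[bytes], object], audio: bytes, chunk_size: int = 8192) -> int:
    """Pass each chunk to `send` and return the number of chunks sent."""
    count = 0
    for chunk in iter_chunks(audio, chunk_size):
        send(chunk)
        count += 1
    return count


# Demonstration with a list standing in for the WebSocket connection.
sent = []
total = send_audio(sent.append, b"\x00" * 20000, chunk_size=8192)
print(total, [len(c) for c in sent])
```

Keeping the sender as a plain callable makes the chunking logic testable without opening a connection.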

### Text-to-Speech (Speak)

High-quality neural text-to-speech synthesis with multiple voice models and real-time streaming capabilities. Supports both REST API for generating complete audio files and WebSocket streaming for real-time audio generation.

```python { .api }
# REST Client
class SpeakRESTClient:
    def stream(self, source, options): ...
    def save(self, filename, source, options): ...

# WebSocket Client
class SpeakWebSocketClient:
    def start(self, options): ...
    def send(self, message): ...
    def close(self): ...

# Options
class SpeakRESTOptions:
    model: str
    encoding: str
    container: str
    sample_rate: int
    bit_rate: int

class SpeakWSOptions:
    model: str
    encoding: str
    sample_rate: int
```

[Text-to-Speech](./text-to-speech.md)
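
The `encoding`, `container`, and `sample_rate` options above determine what the returned audio bytes look like. When the audio arrives as headerless PCM, it has to be wrapped in a container before most players will open it. The sketch below does that with the standard-library `wave` module. It assumes 16-bit mono linear PCM at 24 kHz (the rate listed as `OUTPUT_RATE` under Audio Utilities); `pcm_to_wav` is a hypothetical helper, and the silent buffer stands in for real synthesized audio.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1,
               sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# One second of 16-bit mono silence at 24 kHz as stand-in audio.
silence = b"\x00\x00" * 24000
wav_bytes = pcm_to_wav(silence)

# Read the result back to confirm the header matches the parameters.
with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    print(check.getframerate(), check.getnframes())
```

If the service is asked for a containerized format in the first place, this step is unnecessary.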

### Text Analysis (Read)

Advanced text intelligence capabilities including sentiment analysis, topic detection, intent recognition, and content summarization. Processes text content to extract insights and understanding.

```python { .api }
class AnalyzeClient:
    def analyze_url(self, source, options): ...
    def analyze_text(self, source, options): ...

class AnalyzeOptions:
    language: str
    topics: bool
    intents: bool
    sentiment: bool
    summarize: bool
```

[Text Analysis](./text-analysis.md)

### Project Management (Manage)

Complete account and project management functionality including API key management, usage tracking, team member management, and billing information access.

```python { .api }
class ManageClient:
    def get_projects(self): ...
    def get_project(self, project_id): ...
    def get_keys(self, project_id): ...
    def create_key(self, project_id, options): ...
    def get_usage(self, project_id, options): ...
    def get_balances(self, project_id): ...
    # ... additional management methods
```

[Project Management](./project-management.md)

### Conversational AI (Agent)

Real-time conversational AI capabilities enabling voice-based interactions with intelligent agents. Supports function calling, dynamic prompt updates, and bidirectional audio streaming.

```python { .api }
class AgentWebSocketClient:
    def start(self, options): ...
    def send_settings(self, settings): ...
    def update_prompt(self, prompt): ...
    def inject_message(self, message): ...
    def close(self): ...

class SettingsOptions:
    agent: dict
    listen: dict
    speak: dict
    think: dict
```

[Conversational AI](./conversational-ai.md)

### Audio Utilities

Utility classes for audio input/output operations including microphone capture and speaker playback, with configurable audio parameters and error handling.

```python { .api }
class Microphone:
    def __init__(self, **kwargs): ...
    def start(self): ...
    def finish(self): ...

class Speaker:
    def __init__(self, **kwargs): ...
    def start(self): ...
    def finish(self): ...

# Constants
INPUT_CHANNELS: int = 1
INPUT_RATE: int = 16000
INPUT_CHUNK: int = 8192
OUTPUT_CHANNELS: int = 1
OUTPUT_RATE: int = 24000
OUTPUT_CHUNK: int = 8192
```

[Audio Utilities](./audio-utilities.md)
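
The constants above fix how much audio each chunk carries, which in turn bounds capture and playback latency. A short calculation makes this concrete. It assumes the chunk sizes are frame counts and that samples are 16-bit, neither of which is stated above; the constants are copied locally so the snippet runs without the SDK, and `chunk_seconds` and `chunk_bytes` are hypothetical helpers.

```python
# Values copied from the constants listed above.
INPUT_CHANNELS = 1
INPUT_RATE = 16000
INPUT_CHUNK = 8192
OUTPUT_CHANNELS = 1
OUTPUT_RATE = 24000
OUTPUT_CHUNK = 8192


def chunk_seconds(chunk_frames: int, rate_hz: int) -> float:
    """Duration of one chunk, assuming the chunk size counts frames."""
    return chunk_frames / rate_hz


def chunk_bytes(chunk_frames: int, channels: int, sample_width: int = 2) -> int:
    """Size of one chunk in bytes, assuming 16-bit (2-byte) samples by default."""
    return chunk_frames * channels * sample_width


print(f"input chunk:  {chunk_seconds(INPUT_CHUNK, INPUT_RATE):.3f} s, "
      f"{chunk_bytes(INPUT_CHUNK, INPUT_CHANNELS)} bytes")
print(f"output chunk: {chunk_seconds(OUTPUT_CHUNK, OUTPUT_RATE):.3f} s, "
      f"{chunk_bytes(OUTPUT_CHUNK, OUTPUT_CHANNELS)} bytes")
```

Under these assumptions a microphone chunk covers roughly half a second of audio, so a smaller chunk would be needed for lower-latency streaming.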

### Authentication (Auth)

Token management and authentication capabilities for generating temporary JWT tokens from API keys, enabling secure access with configurable time-to-live settings.

```python { .api }
class AuthRESTClient:
    def grant_token(self, ttl_seconds: int = None) -> GrantTokenResponse: ...

class AsyncAuthRESTClient:
    async def grant_token(self, ttl_seconds: int = None) -> GrantTokenResponse: ...

class GrantTokenResponse:
    access_token: str
    expires_in: int
```
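
Because a granted token carries an `expires_in` value, callers typically need to track when it was issued and request a new one shortly before it lapses. The sketch below shows that bookkeeping. `TokenGrant` is a local stand-in mirroring the two `GrantTokenResponse` fields, `CachedToken` is a hypothetical wrapper, and treating `expires_in` as seconds is an assumption based on the `ttl_seconds` parameter name.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenGrant:
    """Local stand-in mirroring the GrantTokenResponse fields listed above."""
    access_token: str
    expires_in: int  # lifetime in seconds (assumed unit)


class CachedToken:
    """Remembers when a token expires so callers know when to request a new one."""

    def __init__(self, grant, issued_at=None):
        self.access_token = grant.access_token
        issued = time.monotonic() if issued_at is None else issued_at
        self.expires_at = issued + grant.expires_in

    def needs_refresh(self, now=None, margin=5.0):
        """True once the current time is within `margin` seconds of expiry."""
        current = time.monotonic() if now is None else now
        return current >= self.expires_at - margin


# Explicit timestamps keep the demonstration deterministic.
token = CachedToken(TokenGrant(access_token="example-token", expires_in=30), issued_at=100.0)
print(token.needs_refresh(now=110.0))  # 20 seconds of lifetime left
print(token.needs_refresh(now=126.0))  # inside the 5-second safety margin
```

Using a monotonic clock avoids spurious expiry when the system clock is adjusted.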

### Self-Hosted (OnPrem)

Support for on-premises and self-hosted Deepgram deployments with custom endpoint configuration and deployment management.

```python { .api }
class SelfHostedClient:
    def __init__(self, config: DeepgramClientOptions): ...

class AsyncSelfHostedClient:
    def __init__(self, config: DeepgramClientOptions): ...

# Backward compatibility aliases
class OnPremClient(SelfHostedClient): ...
class AsyncOnPremClient(AsyncSelfHostedClient): ...
```

## Types

```python { .api }
class DeepgramClient:
    def __init__(self, api_key: str = "", config: DeepgramClientOptions = None, access_token: str = ""): ...
    @property
    def listen(self): ...
    @property
    def speak(self): ...
    @property
    def read(self): ...
    @property
    def manage(self): ...
    @property
    def asyncmanage(self): ...
    @property
    def agent(self): ...
    @property
    def auth(self): ...
    @property
    def asyncauth(self): ...
    @property
    def selfhosted(self): ...
    @property
    def asyncselfhosted(self): ...

class DeepgramClientOptions:
    api_key: str
    access_token: str
    url: str
    verbose: int
    headers: dict
    options: dict

# Source types for different input methods
class TextSource:
    def __init__(self, text: str): ...

class BufferSource:
    def __init__(self, buffer: bytes): ...

class FileSource:
    def __init__(self, file: str): ...

class UrlSource:
    def __init__(self, url: str): ...

class StreamSource:
    def __init__(self, stream): ...

# Base response class
class BaseResponse:
    def __init__(self, **kwargs): ...
```

## Error Handling

```python { .api }
class DeepgramError(Exception):
    """Base exception for Deepgram SDK errors"""

class DeepgramApiError(DeepgramError):
    """API response errors"""

class DeepgramApiKeyError(DeepgramError):
    """Missing or invalid API key"""

class DeepgramTypeError(DeepgramError):
    """Type validation errors"""

class DeepgramMicrophoneError(Exception):
    """Microphone operation errors"""

class DeepgramSpeakerError(Exception):
    """Speaker operation errors"""
```
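
One consequence of the hierarchy above is worth spelling out: `DeepgramMicrophoneError` and `DeepgramSpeakerError` derive from `Exception` directly, so an `except DeepgramError` clause does not catch them. The sketch below illustrates the handler ordering this suggests. The exception classes are redefined locally as stand-ins with the same inheritance so the snippet runs without the SDK (with the SDK installed they would be imported instead, and the exact import path is not shown here), and `describe` is a hypothetical helper.

```python
# Local stand-ins that mirror the inheritance shown above.
class DeepgramError(Exception):
    """Base exception for Deepgram SDK errors"""


class DeepgramApiKeyError(DeepgramError):
    """Missing or invalid API key"""


class DeepgramMicrophoneError(Exception):
    """Microphone operation errors"""


def describe(exc: Exception) -> str:
    """Map an exception to a coarse category, most specific handler first."""
    try:
        raise exc
    except DeepgramApiKeyError:
        return "check credentials"
    except DeepgramError:
        return "sdk error"
    except DeepgramMicrophoneError:
        # Not a DeepgramError subclass, so it needs its own handler.
        return "audio device error"


print(describe(DeepgramApiKeyError("missing key")))
print(describe(DeepgramError("generic failure")))
print(describe(DeepgramMicrophoneError("no input device")))
```

Listing the subclass handler before the base-class handler matters: with the order reversed, `DeepgramApiKeyError` would be absorbed by the generic branch.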