tessl/pypi-litellm

Library to easily interface with LLM API providers

Describes: `pypipkg:pypi/litellm@1.76.x`

To install, run:

```
npx @tessl/cli install tessl/pypi-litellm@1.76.0
```

# LiteLLM

A unified Python interface for calling 100+ LLM API providers, including OpenAI, Anthropic, Cohere, Replicate, and more. LiteLLM provides OpenAI-compatible API formats, intelligent routing, load balancing, fallbacks, and cost tracking across all supported providers.

## Package Information

- **Package Name**: litellm
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install litellm`

## Core Imports

```python
import litellm
from litellm import completion, embedding, Router
```

For async functions:

```python
from litellm import acompletion, aembedding
```

For specific components:

```python
from litellm import (
    completion, text_completion, embedding, transcription, speech,
    Router, token_counter, get_model_info, completion_cost
)
```

## Basic Usage

```python
import litellm
from litellm import completion

# OpenAI GPT-4
response = completion(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)

# Anthropic Claude
response = completion(
    model="claude-3-sonnet-20240229",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

# Cohere Command
response = completion(
    model="command-nightly",
    messages=[
        {"role": "user", "content": "Write a short poem"}
    ]
)

# With streaming
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

## Architecture

LiteLLM provides a **unified interface** that abstracts away provider-specific differences while maintaining full OpenAI API compatibility. Key architectural components:

- **Unified API**: Single function signatures work across all 100+ providers
- **Provider Translation**: Automatic translation between the OpenAI format and provider-specific formats
- **Router System**: Intelligent load balancing, fallbacks, and retry logic across multiple deployments
- **Cost & Usage Tracking**: Built-in token counting and cost calculation for all providers
- **Exception Handling**: Consistent error types across all providers, with detailed context
- **Configuration Management**: Provider-specific settings and authentication handling

The library serves as a drop-in replacement for OpenAI's client while adding enterprise features like routing, caching, and observability.

## Capabilities

### Core Completion APIs

Unified chat completion, text completion, and streaming interfaces that work across all supported LLM providers with OpenAI-compatible parameters.

```python { .api }
def completion(
    model: str,
    messages: List[Dict[str, Any]],
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: Optional[bool] = None,
    **kwargs
) -> Union[ModelResponse, Iterator[ModelResponseStream]]

def text_completion(
    model: str,
    prompt: str,
    max_tokens: Optional[int] = None,
    **kwargs
) -> Union[TextCompletionResponse, Iterator[TextCompletionResponse]]

async def acompletion(**kwargs) -> Union[ModelResponse, AsyncIterator[ModelResponseStream]]
```

[Core Completion API](./core-completion.md)

### Router & Load Balancing

Router class for intelligent load balancing, automatic fallbacks, and retry logic across multiple model deployments, with cost optimization and reliability features.

```python { .api }
class Router:
    def __init__(
        self,
        model_list: Optional[List[DeploymentTypedDict]] = None,
        routing_strategy: Literal["simple-shuffle", "least-busy", "usage-based-routing", "latency-based-routing", "cost-based-routing"] = "simple-shuffle",
        num_retries: Optional[int] = None,
        max_fallbacks: Optional[int] = None,
        **kwargs
    )

    def completion(self, **kwargs) -> Union[ModelResponse, Iterator[ModelResponseStream]]
    def health_check(self, model: Optional[str] = None) -> Dict[str, Any]
```

[Router & Load Balancing](./router.md)

### Embeddings & Other APIs

Embedding generation, image creation, audio transcription/synthesis, moderation, and other specialized API endpoints with unified interfaces.

```python { .api }
def embedding(
    model: str,
    input: Union[str, List[str], List[int], List[List[int]]],
    **kwargs
) -> EmbeddingResponse

def image_generation(
    prompt: str,
    model: Optional[str] = None,
    **kwargs
) -> ImageResponse

def transcription(model: str, file: Union[str, bytes, IO], **kwargs) -> TranscriptionResponse
def speech(model: str, input: str, voice: str, **kwargs) -> bytes
def moderation(input: Union[str, List[str]], **kwargs) -> ModerationCreateResponse
```

[Embeddings & Other APIs](./other-apis.md)

### Exception Handling

Comprehensive exception hierarchy with provider-specific error handling, context information, and retry logic for robust error management.

```python { .api }
class AuthenticationError(openai.AuthenticationError): ...
class RateLimitError(openai.RateLimitError): ...
class ContextWindowExceededError(BadRequestError): ...
class ContentPolicyViolationError(BadRequestError): ...
class BudgetExceededError(Exception): ...
```

[Exception Handling](./exceptions.md)

### Provider Configuration

Configuration classes and settings for 100+ LLM providers, including authentication, custom endpoints, and provider-specific parameters.

```python { .api }
class OpenAIConfig(BaseConfig):
    frequency_penalty: Optional[int] = None
    max_tokens: Optional[int] = None
    temperature: Optional[int] = None
    # ... all OpenAI parameters

class AnthropicConfig(BaseConfig):
    max_tokens: int
    temperature: Optional[float] = None
    top_k: Optional[int] = None
```

[Provider Configuration](./providers.md)

### Utilities & Helpers

Token counting, cost calculation, model information, capability detection, and validation utilities for comprehensive LLM management.

```python { .api }
def token_counter(model: str, text: Union[str, List[str]], **kwargs) -> int
def completion_cost(completion_response: Union[ModelResponse, EmbeddingResponse], **kwargs) -> float
def get_model_info(model: str, **kwargs) -> Dict[str, Any]
def supports_function_calling(model: str, **kwargs) -> bool
def validate_environment(model: str, **kwargs) -> Dict[str, str]
```

[Utilities & Helpers](./utilities.md)

## Response Types

```python { .api }
class ModelResponse(BaseLiteLLMOpenAIResponseObject):
    id: str
    choices: List[Choices]
    created: int
    model: Optional[str] = None
    usage: Optional[Usage] = None

class EmbeddingResponse(OpenAIObject):
    data: List[EmbeddingData]
    model: Optional[str]
    usage: Optional[Usage]

class Usage:
    prompt_tokens: int
    completion_tokens: Optional[int] = None
    total_tokens: int

class Choices:
    finish_reason: Optional[str] = None
    index: int = 0
    message: Optional[Message] = None

class Message:
    content: Optional[str] = None
    role: str
    tool_calls: Optional[List[ChatCompletionMessageToolCall]] = None
```
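Since every provider's reply is normalized into this one shape, downstream code can read responses uniformly. A small illustrative helper (the name and return layout are ours, not part of litellm):

```python
def summarize_response(response) -> dict:
    # Works for any provider because litellm normalizes the reply
    choice = response.choices[0]
    usage = response.usage
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "prompt_tokens": usage.prompt_tokens if usage else None,
        "total_tokens": usage.total_tokens if usage else None,
    }
```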

## Global Configuration

```python { .api }
# Authentication
litellm.api_key: Optional[str] = None
litellm.openai_key: Optional[str] = None
litellm.anthropic_key: Optional[str] = None

# Timeout & Retry Settings
litellm.request_timeout: float = 600
litellm.num_retries: Optional[int] = None
litellm.max_fallbacks: Optional[int] = None

# Debugging & Logging
litellm.set_verbose: bool = False
litellm.suppress_debug_info: bool = False

# Model Configuration
litellm.model_alias_map: Dict[str, str] = {}
litellm.drop_params: bool = False
litellm.modify_params: bool = False
```