# LiteLLM

A unified Python interface for calling 100+ LLM API providers including OpenAI, Anthropic, Cohere, Replicate, and more. LiteLLM provides OpenAI-compatible API formats, intelligent routing, load balancing, fallbacks, and cost tracking across all supported providers.

## Package Information

- **Package Name**: litellm
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install litellm`
## Core Imports

```python
import litellm
from litellm import completion, embedding, Router
```

For async functions:

```python
from litellm import acompletion, aembedding
```
For specific components:

```python
from litellm import (
    completion, text_completion, embedding, transcription, speech,
    Router, token_counter, get_model_info, completion_cost
)
```
## Basic Usage

```python
import litellm
from litellm import completion

# OpenAI GPT-4
response = completion(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)

# Anthropic Claude
response = completion(
    model="claude-3-sonnet-20240229",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

# Cohere Command
response = completion(
    model="command-nightly",
    messages=[
        {"role": "user", "content": "Write a short poem"}
    ]
)

# With streaming
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
## Architecture

LiteLLM provides a **unified interface** that abstracts away provider-specific differences while maintaining full OpenAI API compatibility. Key architectural components:

- **Unified API**: Single function signatures work across all 100+ providers
- **Provider Translation**: Automatic translation between OpenAI format and provider-specific formats
- **Router System**: Intelligent load balancing, fallbacks, and retry logic across multiple deployments
- **Cost & Usage Tracking**: Built-in token counting and cost calculation for all providers
- **Exception Handling**: Consistent error types across all providers with detailed context
- **Configuration Management**: Provider-specific settings and authentication handling

The library serves as a drop-in replacement for OpenAI's client while adding powerful enterprise features like routing, caching, and observability.
## Capabilities

### Core Completion APIs

Unified chat completion, text completion, and streaming interfaces that work across all supported LLM providers with OpenAI-compatible parameters.

```python { .api }
def completion(
    model: str,
    messages: List[Dict[str, Any]],
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: Optional[bool] = None,
    **kwargs
) -> Union[ModelResponse, Iterator[ModelResponseStream]]

def text_completion(
    model: str,
    prompt: str,
    max_tokens: Optional[int] = None,
    **kwargs
) -> Union[TextCompletionResponse, Iterator[TextCompletionResponse]]

async def acompletion(**kwargs) -> Union[ModelResponse, AsyncIterator[ModelResponseStream]]
```

[Core Completion API](./core-completion.md)
### Router & Load Balancing

Router class for intelligent load balancing, automatic fallbacks, and retry logic across multiple model deployments with cost optimization and reliability features.

```python { .api }
class Router:
    def __init__(
        self,
        model_list: Optional[List[DeploymentTypedDict]] = None,
        routing_strategy: Literal["simple-shuffle", "least-busy", "usage-based-routing", "latency-based-routing", "cost-based-routing"] = "simple-shuffle",
        num_retries: Optional[int] = None,
        max_fallbacks: Optional[int] = None,
        **kwargs
    )

    def completion(self, **kwargs) -> Union[ModelResponse, Iterator[ModelResponseStream]]
    def health_check(self, model: Optional[str] = None) -> Dict[str, Any]
```

[Router & Load Balancing](./router.md)
### Embeddings & Other APIs

Embedding generation, image creation, audio transcription/synthesis, moderation, and other specialized API endpoints with unified interfaces.

```python { .api }
def embedding(
    model: str,
    input: Union[str, List[str], List[int], List[List[int]]],
    **kwargs
) -> EmbeddingResponse

def image_generation(
    prompt: str,
    model: Optional[str] = None,
    **kwargs
) -> ImageResponse

def transcription(model: str, file: Union[str, bytes, IO], **kwargs) -> TranscriptionResponse
def speech(model: str, input: str, voice: str, **kwargs) -> bytes
def moderation(input: Union[str, List[str]], **kwargs) -> ModerationCreateResponse
```

[Embeddings & Other APIs](./other-apis.md)
162
163
### Exception Handling
164
165
Comprehensive exception hierarchy with provider-specific error handling, context information, and retry logic for robust error management.
166
167
```python { .api }
168
class AuthenticationError(openai.AuthenticationError): ...
169
class RateLimitError(openai.RateLimitError): ...
170
class ContextWindowExceededError(BadRequestError): ...
171
class ContentPolicyViolationError(BadRequestError): ...
172
class BudgetExceededError(Exception): ...
173
```
174
175
[Exception Handling](./exceptions.md)
### Provider Configuration

Configuration classes and settings for 100+ LLM providers including authentication, custom endpoints, and provider-specific parameters.

```python { .api }
class OpenAIConfig(BaseConfig):
    frequency_penalty: Optional[int] = None
    max_tokens: Optional[int] = None
    temperature: Optional[int] = None
    # ... all OpenAI parameters

class AnthropicConfig(BaseConfig):
    max_tokens: int
    temperature: Optional[float] = None
    top_k: Optional[int] = None
```

[Provider Configuration](./providers.md)
### Utilities & Helpers

Token counting, cost calculation, model information, capability detection, and validation utilities for comprehensive LLM management.

```python { .api }
def token_counter(model: str, text: Union[str, List[str]], **kwargs) -> int
def completion_cost(completion_response: Union[ModelResponse, EmbeddingResponse], **kwargs) -> float
def get_model_info(model: str, **kwargs) -> Dict[str, Any]
def supports_function_calling(model: str, **kwargs) -> bool
def validate_environment(model: str, **kwargs) -> Dict[str, str]
```

[Utilities & Helpers](./utilities.md)
## Response Types

```python { .api }
class ModelResponse(BaseLiteLLMOpenAIResponseObject):
    id: str
    choices: List[Choices]
    created: int
    model: Optional[str] = None
    usage: Optional[Usage] = None

class EmbeddingResponse(OpenAIObject):
    data: List[EmbeddingData]
    model: Optional[str]
    usage: Optional[Usage]

class Usage:
    prompt_tokens: int
    completion_tokens: Optional[int] = None
    total_tokens: int

class Choices:
    finish_reason: Optional[str] = None
    index: int = 0
    message: Optional[Message] = None

class Message:
    content: Optional[str] = None
    role: str
    tool_calls: Optional[List[ChatCompletionMessageToolCall]] = None
```
## Global Configuration

```python { .api }
# Authentication
litellm.api_key: Optional[str] = None
litellm.openai_key: Optional[str] = None
litellm.anthropic_key: Optional[str] = None

# Timeout & Retry Settings
litellm.request_timeout: float = 600
litellm.num_retries: Optional[int] = None
litellm.max_fallbacks: Optional[int] = None

# Debugging & Logging
litellm.set_verbose: bool = False
litellm.suppress_debug_info: bool = False

# Model Configuration
litellm.model_alias_map: Dict[str, str] = {}
litellm.drop_params: bool = False
litellm.modify_params: bool = False
```