# LiteLLM

A unified Python interface for calling 100+ LLM API providers including OpenAI, Anthropic, Cohere, Replicate, and more. LiteLLM provides OpenAI-compatible API formats, intelligent routing, load balancing, fallbacks, and cost tracking across all supported providers.

## Package Information

- **Package Name**: litellm
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install litellm`
## Core Imports

```python
import litellm
from litellm import completion, embedding, Router
```

For async functions:

```python
from litellm import acompletion, aembedding
```
For specific components:

```python
from litellm import (
    completion, text_completion, embedding, transcription, speech,
    Router, token_counter, get_model_info, completion_cost
)
```
## Basic Usage

```python
import litellm
from litellm import completion

# OpenAI GPT-4
response = completion(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)

# Anthropic Claude
response = completion(
    model="claude-3-sonnet-20240229",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

# Cohere Command
response = completion(
    model="command-nightly",
    messages=[
        {"role": "user", "content": "Write a short poem"}
    ]
)

# With streaming
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
## Architecture

LiteLLM provides a **unified interface** that abstracts away provider-specific differences while maintaining full OpenAI API compatibility. Key architectural components:

- **Unified API**: Single function signatures work across all 100+ providers
- **Provider Translation**: Automatic translation between OpenAI format and provider-specific formats
- **Router System**: Intelligent load balancing, fallbacks, and retry logic across multiple deployments
- **Cost & Usage Tracking**: Built-in token counting and cost calculation for all providers
- **Exception Handling**: Consistent error types across all providers with detailed context
- **Configuration Management**: Provider-specific settings and authentication handling

The library serves as a drop-in replacement for OpenAI's client while adding powerful enterprise features like routing, caching, and observability.
## Capabilities

### Core Completion APIs

Unified chat completion, text completion, and streaming interfaces that work across all supported LLM providers with OpenAI-compatible parameters.

```python { .api }
def completion(
    model: str,
    messages: List[Dict[str, Any]],
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: Optional[bool] = None,
    **kwargs
) -> Union[ModelResponse, Iterator[ModelResponseStream]]

def text_completion(
    model: str,
    prompt: str,
    max_tokens: Optional[int] = None,
    **kwargs
) -> Union[TextCompletionResponse, Iterator[TextCompletionResponse]]

async def acompletion(**kwargs) -> Union[ModelResponse, AsyncIterator[ModelResponseStream]]
```

[Core Completion API](./core-completion.md)
### Router & Load Balancing

Router class for intelligent load balancing, automatic fallbacks, and retry logic across multiple model deployments with cost optimization and reliability features.

```python { .api }
class Router:
    def __init__(
        self,
        model_list: Optional[List[DeploymentTypedDict]] = None,
        routing_strategy: Literal["simple-shuffle", "least-busy", "usage-based-routing", "latency-based-routing", "cost-based-routing"] = "simple-shuffle",
        num_retries: Optional[int] = None,
        max_fallbacks: Optional[int] = None,
        **kwargs
    )

    def completion(self, **kwargs) -> Union[ModelResponse, Iterator[ModelResponseStream]]
    def health_check(self, model: Optional[str] = None) -> Dict[str, Any]
```

[Router & Load Balancing](./router.md)
### Embeddings & Other APIs

Embedding generation, image creation, audio transcription/synthesis, moderation, and other specialized API endpoints with unified interfaces.

```python { .api }
def embedding(
    model: str,
    input: Union[str, List[str], List[int], List[List[int]]],
    **kwargs
) -> EmbeddingResponse

def image_generation(
    prompt: str,
    model: Optional[str] = None,
    **kwargs
) -> ImageResponse

def transcription(model: str, file: Union[str, bytes, IO], **kwargs) -> TranscriptionResponse
def speech(model: str, input: str, voice: str, **kwargs) -> bytes
def moderation(input: Union[str, List[str]], **kwargs) -> ModerationCreateResponse
```

[Embeddings & Other APIs](./other-apis.md)
162
163
### Exception Handling
164
165
Comprehensive exception hierarchy with provider-specific error handling, context information, and retry logic for robust error management.
166
167
```python { .api }
168
class AuthenticationError(openai.AuthenticationError): ...
169
class RateLimitError(openai.RateLimitError): ...
170
class ContextWindowExceededError(BadRequestError): ...
171
class ContentPolicyViolationError(BadRequestError): ...
172
class BudgetExceededError(Exception): ...
173
```
174
175
[Exception Handling](./exceptions.md)
### Provider Configuration

Configuration classes and settings for 100+ LLM providers including authentication, custom endpoints, and provider-specific parameters.

```python { .api }
class OpenAIConfig(BaseConfig):
    frequency_penalty: Optional[int] = None
    max_tokens: Optional[int] = None
    temperature: Optional[int] = None
    # ... all OpenAI parameters

class AnthropicConfig(BaseConfig):
    max_tokens: int
    temperature: Optional[float] = None
    top_k: Optional[int] = None
```

[Provider Configuration](./providers.md)
### Utilities & Helpers

Token counting, cost calculation, model information, capability detection, and validation utilities for comprehensive LLM management.

```python { .api }
def token_counter(model: str, text: Union[str, List[str]], **kwargs) -> int
def completion_cost(completion_response: Union[ModelResponse, EmbeddingResponse], **kwargs) -> float
def get_model_info(model: str, **kwargs) -> Dict[str, Any]
def supports_function_calling(model: str, **kwargs) -> bool
def validate_environment(model: str, **kwargs) -> Dict[str, str]
```

[Utilities & Helpers](./utilities.md)
## Response Types

```python { .api }
class ModelResponse(BaseLiteLLMOpenAIResponseObject):
    id: str
    choices: List[Choices]
    created: int
    model: Optional[str] = None
    usage: Optional[Usage] = None

class EmbeddingResponse(OpenAIObject):
    data: List[EmbeddingData]
    model: Optional[str]
    usage: Optional[Usage]

class Usage:
    prompt_tokens: int
    completion_tokens: Optional[int] = None
    total_tokens: int

class Choices:
    finish_reason: Optional[str] = None
    index: int = 0
    message: Optional[Message] = None

class Message:
    content: Optional[str] = None
    role: str
    tool_calls: Optional[List[ChatCompletionMessageToolCall]] = None
```
## Global Configuration

```python { .api }
# Authentication
litellm.api_key: Optional[str] = None
litellm.openai_key: Optional[str] = None
litellm.anthropic_key: Optional[str] = None

# Timeout & Retry Settings
litellm.request_timeout: float = 600
litellm.num_retries: Optional[int] = None
litellm.max_fallbacks: Optional[int] = None

# Debugging & Logging
litellm.set_verbose: bool = False
litellm.suppress_debug_info: bool = False

# Model Configuration
litellm.model_alias_map: Dict[str, str] = {}
litellm.drop_params: bool = False
litellm.modify_params: bool = False
```