# Client Management

This section covers client initialization, configuration, and authentication for both synchronous and asynchronous usage. The SDK supports API key detection from an environment variable, custom timeouts, retry policies, and HTTP client customization.

## Capabilities

### Synchronous Client

`Cerebras` is the primary client class for synchronous API interactions. It accepts the configuration options listed below and reads the API key from the `CEREBRAS_API_KEY` environment variable when none is passed.

```python { .api }
class Cerebras:
    def __init__(
        self,
        *,
        api_key: str | None = None,
        base_url: str | httpx.URL | None = None,
        timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
        max_retries: int = DEFAULT_MAX_RETRIES,
        default_headers: Mapping[str, str] | None = None,
        default_query: Mapping[str, object] | None = None,
        http_client: httpx.Client | None = None,
        _strict_response_validation: bool = False,
        warm_tcp_connection: bool = True,
    ) -> None:
        """
        Construct a new synchronous Cerebras client instance.

        This automatically infers the api_key argument from the CEREBRAS_API_KEY
        environment variable if it is not provided.

        Parameters:
        - api_key: API key for authentication (from CEREBRAS_API_KEY env if None)
        - base_url: Override the default base URL for the API
        - timeout: Request timeout configuration (float, Timeout object, None, or NOT_GIVEN)
        - max_retries: Maximum number of retries for failed requests
        - default_headers: Default headers to include with all requests
        - default_query: Default query parameters for all requests
        - http_client: Custom httpx.Client instance (DefaultHttpxClient if None)
        - _strict_response_validation: Enable strict API response validation
        - warm_tcp_connection: Enable TCP connection warming for reduced latency
        """

    # Resource properties
    chat: chat.ChatResource
    completions: completions.CompletionsResource
    models: models.ModelsResource

    # Response wrapper properties
    with_raw_response: CerebrasWithRawResponse
    with_streaming_response: CerebrasWithStreamedResponse

    # Client configuration
    api_key: str
```
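
The key lookup described in the constructor docstring can be pictured with a small standalone sketch. It does not import the SDK: `resolve_api_key` is a hypothetical helper, and raising `ValueError` when no key is found is an assumption made for illustration, not documented SDK behavior.

```python
import os
from typing import Optional

ENV_VAR = "CEREBRAS_API_KEY"  # environment variable named in the constructor docstring


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: return the explicit key, else the environment value."""
    if api_key is None:
        api_key = os.environ.get(ENV_VAR)
    if api_key is None:
        # Assumption: this sketch treats a missing key as an error.
        raise ValueError(f"Pass api_key or set the {ENV_VAR} environment variable")
    return api_key


os.environ[ENV_VAR] = "key-from-env"
print(resolve_api_key())                # falls back to the environment variable
print(resolve_api_key("explicit-key"))  # an explicit argument takes precedence
```

The point of the sketch is the precedence order: an explicit argument is used as given, and the environment variable is consulted only when the argument is `None`.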

### Asynchronous Client

`AsyncCerebras` provides the same functionality as the synchronous client, with async/await support for non-blocking operations.

```python { .api }
class AsyncCerebras:
    def __init__(
        self,
        *,
        api_key: str | None = None,
        base_url: str | httpx.URL | None = None,
        timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
        max_retries: int = DEFAULT_MAX_RETRIES,
        default_headers: Mapping[str, str] | None = None,
        default_query: Mapping[str, object] | None = None,
        http_client: httpx.AsyncClient | None = None,
        _strict_response_validation: bool = False,
        warm_tcp_connection: bool = True,
    ) -> None:
        """
        Construct a new asynchronous Cerebras client instance.

        This automatically infers the api_key argument from the CEREBRAS_API_KEY
        environment variable if it is not provided.

        Parameters:
        - api_key: API key for authentication (from CEREBRAS_API_KEY env if None)
        - base_url: Override the default base URL for the API
        - timeout: Request timeout configuration (float, Timeout object, None, or NOT_GIVEN)
        - max_retries: Maximum number of retries for failed requests
        - default_headers: Default headers to include with all requests
        - default_query: Default query parameters for all requests
        - http_client: Custom httpx.AsyncClient instance (DefaultAsyncHttpxClient if None)
        - _strict_response_validation: Enable strict API response validation
        - warm_tcp_connection: Enable TCP connection warming for reduced latency
        """

    # Resource properties
    chat: chat.AsyncChatResource
    completions: completions.AsyncCompletionsResource
    models: models.AsyncModelsResource

    # Response wrapper properties
    with_raw_response: AsyncCerebrasWithRawResponse
    with_streaming_response: AsyncCerebrasWithStreamedResponse

    # Client configuration
    api_key: str
```
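
To illustrate what non-blocking operation buys, here is a minimal sketch that runs several awaitable calls concurrently with `asyncio.gather`. It does not use the SDK: `fake_request` is a hypothetical stand-in for an awaited client call and only sleeps instead of touching the network.

```python
import asyncio
from typing import List


async def fake_request(prompt: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for an awaited SDK call; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    return f"reply to {prompt!r}"


async def run_all(prompts: List[str]) -> List[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_request(p) for p in prompts))


results = asyncio.run(run_all(["a", "b", "c"]))
print(results)
```

Because the three coroutines wait at the same time, the total wall-clock time is close to one delay rather than three. The same pattern would presumably apply to real async client calls, subject to the service's rate limits.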

### Client Aliases

`Client` and `AsyncClient` are convenience aliases for the main client classes.

```python { .api }
Client = Cerebras
AsyncClient = AsyncCerebras
```

### Response Wrapper Classes

These classes provide access to raw HTTP responses and streaming responses. They are useful when a caller needs response metadata, such as the status code and headers, rather than only the parsed result.

```python { .api }
class CerebrasWithRawResponse:
    """Wrapper providing access to raw HTTP responses."""

class AsyncCerebrasWithRawResponse:
    """Async wrapper providing access to raw HTTP responses."""

class CerebrasWithStreamedResponse:
    """Wrapper providing access to streaming responses."""

class AsyncCerebrasWithStreamedResponse:
    """Async wrapper providing access to streaming responses."""
```

## Usage Examples

### Basic Client Initialization

```python
from cerebras.cloud.sdk import Cerebras

# Using environment variable CEREBRAS_API_KEY
client = Cerebras()

# Explicit API key
client = Cerebras(api_key="your-api-key-here")
```

### Advanced Configuration

```python
from cerebras.cloud.sdk import Cerebras, Timeout

# Custom timeout configuration (seconds per phase)
timeout = Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)

# Custom headers and client configuration
client = Cerebras(
    api_key="your-api-key",
    timeout=timeout,
    max_retries=3,
    default_headers={"User-Agent": "MyApp/1.0"},
    warm_tcp_connection=True
)
```
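
The meaning of `max_retries` (a bound on retries after the first failed attempt) can be sketched without the SDK. The helper below is hypothetical: `call_with_retries`, the choice of `ConnectionError` as the retryable error, and the exponential backoff schedule are all assumptions for illustration, and the SDK's actual retry conditions and delays are not described in this document.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], max_retries: int = 2, base_delay: float = 0.01) -> T:
    """Hypothetical helper: call fn, retrying up to max_retries times on ConnectionError."""
    for attempt in range(max_retries + 1):  # one initial attempt plus max_retries retries
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff between attempts
    raise AssertionError("unreachable for max_retries >= 0")


calls = {"count": 0}


def flaky() -> str:
    """Fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(call_with_retries(flaky, max_retries=3), "after", calls["count"], "attempts")
```

With `max_retries=3` the call succeeds on the third attempt. With `max_retries=1` the same function would give up after two attempts and re-raise the error.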

### Async Client Usage

```python
import asyncio
from cerebras.cloud.sdk import AsyncCerebras

async def main():
    async with AsyncCerebras() as client:
        response = await client.chat.completions.create(
            model="llama3.1-70b",
            messages=[{"role": "user", "content": "Hello!"}]
        )
        print(response.choices[0].message.content)

asyncio.run(main())
```

### Custom HTTP Client

```python
import httpx
from cerebras.cloud.sdk import Cerebras

# Using custom httpx client with specific configuration
http_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=100),
    timeout=httpx.Timeout(30.0)
)

client = Cerebras(
    api_key="your-api-key",
    http_client=http_client
)
```

### Raw Response Access

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

# Access raw HTTP response
raw_response = client.with_raw_response.chat.completions.create(
    model="llama3.1-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(f"Status: {raw_response.status_code}")
print(f"Headers: {raw_response.headers}")
parsed_response = raw_response.parse()  # Get the ChatCompletion object
```

## Error Handling

All client operations can raise exceptions from the SDK's exception hierarchy. A common pattern is to catch the specific exceptions first and the general `APIError` last:

```python
from cerebras.cloud.sdk import Cerebras, APIError, RateLimitError, AuthenticationError

client = Cerebras()

try:
    response = client.chat.completions.create(
        model="llama3.1-70b",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except APIError as e:
    print(f"API error: {e}")
```
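
The order of the `except` clauses matters when the specific exceptions are subclasses of the general one. The standalone sketch below shows why, using stand-in classes that borrow the SDK's names. The subclass relationship is an assumption that the clause ordering above suggests, and none of these definitions are the SDK's own.

```python
class APIError(Exception):
    """Stand-in base class; borrows the SDK's name, not its definition."""


class AuthenticationError(APIError):
    """Stand-in for an authentication failure."""


class RateLimitError(APIError):
    """Stand-in for a rate-limit failure."""


def describe(exc: Exception) -> str:
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except AuthenticationError:
        return "Invalid API key"
    except RateLimitError:
        return "Rate limit exceeded"
    except APIError as e:
        # Catch-all for any other error in the hierarchy; must come last.
        return f"API error: {e}"


print(describe(RateLimitError("slow down")))  # Rate limit exceeded
print(describe(APIError("server error")))     # API error: server error
```

If the `except APIError` clause came first, it would also match the two subclasses and the more specific handlers would never run.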