# Client Management

Client initialization, configuration, and authentication for both synchronous and asynchronous usage patterns. The SDK provides flexible configuration options including environment variable support, custom timeouts, retry policies, and HTTP client customization.

## Capabilities

### Synchronous Client

The primary client class for synchronous API interactions. It accepts the configuration options listed in the signature and reads the API key from the `CEREBRAS_API_KEY` environment variable when none is passed explicitly.

```python { .api }
class Cerebras:
    def __init__(
        self,
        *,
        api_key: str | None = None,
        base_url: str | httpx.URL | None = None,
        timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
        max_retries: int = DEFAULT_MAX_RETRIES,
        default_headers: Mapping[str, str] | None = None,
        default_query: Mapping[str, object] | None = None,
        http_client: httpx.Client | None = None,
        _strict_response_validation: bool = False,
        warm_tcp_connection: bool = True,
    ) -> None:
        """
        Construct a new synchronous Cerebras client instance.

        This automatically infers the api_key argument from the CEREBRAS_API_KEY
        environment variable if it is not provided.

        Parameters:
        - api_key: API key for authentication (from CEREBRAS_API_KEY env if None)
        - base_url: Override the default base URL for the API
        - timeout: Request timeout configuration (float, Timeout object, or NOT_GIVEN)
        - max_retries: Maximum number of retries for failed requests
        - default_headers: Default headers to include with all requests
        - default_query: Default query parameters for all requests
        - http_client: Custom httpx.Client instance (DefaultHttpxClient if None)
        - _strict_response_validation: Enable strict API response validation
        - warm_tcp_connection: Enable TCP connection warming for reduced latency
        """

    # Resource properties
    chat: chat.ChatResource
    completions: completions.CompletionsResource
    models: models.ModelsResource

    # Response wrapper properties
    with_raw_response: CerebrasWithRawResponse
    with_streaming_response: CerebrasWithStreamedResponse

    # Client configuration
    api_key: str
```
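
The API key fallback described in the docstring can be pictured with a small stand-alone sketch. `resolve_api_key` is a hypothetical helper written for illustration and is not part of the SDK; raising `ValueError` when no key is found is also an assumption, and the SDK's actual behavior may differ.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer the explicit argument, else CEREBRAS_API_KEY."""
    if api_key is None:
        # Fall back to the environment variable named in the docstring.
        api_key = os.environ.get("CEREBRAS_API_KEY")
    if api_key is None:
        # Assumption: fail early instead of sending unauthenticated requests.
        raise ValueError(
            "Pass api_key or set the CEREBRAS_API_KEY environment variable"
        )
    return api_key
```

An explicit argument always wins over the environment in this sketch, which mirrors the "if it is not provided" wording of the docstring.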

### Asynchronous Client

The async client class providing identical functionality to the synchronous client but with async/await support for non-blocking operations.

```python { .api }
class AsyncCerebras:
    def __init__(
        self,
        *,
        api_key: str | None = None,
        base_url: str | httpx.URL | None = None,
        timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
        max_retries: int = DEFAULT_MAX_RETRIES,
        default_headers: Mapping[str, str] | None = None,
        default_query: Mapping[str, object] | None = None,
        http_client: httpx.AsyncClient | None = None,
        _strict_response_validation: bool = False,
        warm_tcp_connection: bool = True,
    ) -> None:
        """
        Construct a new asynchronous Cerebras client instance.

        This automatically infers the api_key argument from the CEREBRAS_API_KEY
        environment variable if it is not provided.

        Parameters:
        - api_key: API key for authentication (from CEREBRAS_API_KEY env if None)
        - base_url: Override the default base URL for the API
        - timeout: Request timeout configuration (float, Timeout object, or NOT_GIVEN)
        - max_retries: Maximum number of retries for failed requests
        - default_headers: Default headers to include with all requests
        - default_query: Default query parameters for all requests
        - http_client: Custom httpx.AsyncClient instance (DefaultAsyncHttpxClient if None)
        - _strict_response_validation: Enable strict API response validation
        - warm_tcp_connection: Enable TCP connection warming for reduced latency
        """

    # Resource properties
    chat: chat.AsyncChatResource
    completions: completions.AsyncCompletionsResource
    models: models.AsyncModelsResource

    # Response wrapper properties
    with_raw_response: AsyncCerebrasWithRawResponse
    with_streaming_response: AsyncCerebrasWithStreamedResponse

    # Client configuration
    api_key: str
```

### Client Aliases

Convenience aliases that expose the main client classes under shorter, generic names.

```python { .api }
Client = Cerebras
AsyncClient = AsyncCerebras
```

### Response Wrapper Classes

Classes that provide access to raw HTTP responses and streaming responses, useful for advanced use cases requiring direct access to response metadata.

```python { .api }
class CerebrasWithRawResponse:
    """Wrapper providing access to raw HTTP responses."""

class AsyncCerebrasWithRawResponse:
    """Async wrapper providing access to raw HTTP responses."""

class CerebrasWithStreamedResponse:
    """Wrapper providing access to streaming responses."""

class AsyncCerebrasWithStreamedResponse:
    """Async wrapper providing access to streaming responses."""
```

## Usage Examples

### Basic Client Initialization

```python
from cerebras.cloud.sdk import Cerebras

# Using environment variable CEREBRAS_API_KEY
client = Cerebras()

# Explicit API key
client = Cerebras(api_key="your-api-key-here")
```

### Advanced Configuration

```python
from cerebras.cloud.sdk import Cerebras, Timeout

# Custom timeout configuration (all values in seconds)
timeout = Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)

# Custom headers and client configuration
client = Cerebras(
    api_key="your-api-key",
    timeout=timeout,
    max_retries=3,
    default_headers={"User-Agent": "MyApp/1.0"},
    warm_tcp_connection=True,
)
```
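
This document does not describe the SDK's retry policy, so the following is only a generic sketch of what a bounded retry budget such as `max_retries=3` usually means: one initial attempt plus up to three retries, with a growing delay between them. `call_with_retries`, `base_delay`, and the choice to retry only on `ConnectionError` are all hypothetical and are not taken from the SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to max_retries times on ConnectionError."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable when max_retries >= 0")
```

The `sleep` parameter is injectable so that the backoff schedule can be checked in tests without actually waiting.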

### Async Client Usage

```python
import asyncio
from cerebras.cloud.sdk import AsyncCerebras

async def main():
    async with AsyncCerebras() as client:
        response = await client.chat.completions.create(
            model="llama3.1-70b",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

asyncio.run(main())
```
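
A common reason to choose the async client is to issue several requests concurrently. The sketch below shows that pattern with `asyncio.gather`. `fake_completion` is a hypothetical stand-in for an awaited call such as `client.chat.completions.create(...)`, used so the example runs without network access or an API key.

```python
import asyncio
from typing import List


async def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for an awaited API call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"echo: {prompt}"


async def run_batch(prompts: List[str]) -> List[str]:
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_completion(p) for p in prompts))


results = asyncio.run(run_batch(["Hello!", "How are you?"]))
print(results)
```

With a real client, the calls would typically share one `AsyncCerebras` instance inside a single `async with` block so that connections can be reused.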

### Custom HTTP Client

```python
import httpx
from cerebras.cloud.sdk import Cerebras

# Using custom httpx client with specific configuration
http_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=20, max_connections=100),
    timeout=httpx.Timeout(30.0),
)

client = Cerebras(
    api_key="your-api-key",
    http_client=http_client,
)
```

### Raw Response Access

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

# Access raw HTTP response
raw_response = client.with_raw_response.chat.completions.create(
    model="llama3.1-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(f"Status: {raw_response.status_code}")
print(f"Headers: {raw_response.headers}")
parsed_response = raw_response.parse()  # Get the ChatCompletion object
```

## Error Handling

All client operations can raise exceptions from the SDK's exception hierarchy. Common patterns:

```python
from cerebras.cloud.sdk import Cerebras, APIError, RateLimitError, AuthenticationError

client = Cerebras()

try:
    response = client.chat.completions.create(
        model="llama3.1-70b",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except APIError as e:
    print(f"API error: {e}")
```
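
The order of the `except` clauses matters when exceptions form a hierarchy: Python uses the first matching handler, so specific classes must come before their base class. The sketch below illustrates this with stand-in classes. It assumes, as the ordering in the example above suggests, that `AuthenticationError` and `RateLimitError` derive from `APIError`; the classes defined here are illustrative and are not the SDK's own.

```python
class APIError(Exception):
    """Stand-in base class (assumed root of the API error hierarchy)."""


class AuthenticationError(APIError):
    """Stand-in for an authentication failure."""


class RateLimitError(APIError):
    """Stand-in for a rate-limit failure."""


def describe(exc: Exception) -> str:
    """Map an exception to a message, most specific handlers first."""
    try:
        raise exc
    except AuthenticationError:
        return "Invalid API key"
    except RateLimitError:
        return "Rate limit exceeded"
    except APIError as e:
        # Catch-all for any other error in the hierarchy.
        return f"API error: {e}"


print(describe(RateLimitError("slow down")))
print(describe(APIError("server error")))
```

If the `APIError` handler were listed first, it would also catch the two subclasses and the specific messages would never be reached.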