# Chat Completions

High-performance chat completions with streaming support, function calling, tool usage, and advanced features like reasoning modes and search integration. The chat completions API provides both synchronous and asynchronous interfaces for generating conversational AI responses.

## Capabilities

### Create Chat Completion

Generate chat completions with comprehensive configuration options, supporting both streaming and non-streaming responses.

```python { .api }
def create(
    messages: Iterable[ChatCompletionMessageParam],
    model: str,
    exclude_domains: Optional[List[str]] = NOT_GIVEN,
    frequency_penalty: Optional[float] = NOT_GIVEN,
    function_call: Optional[FunctionCall] = NOT_GIVEN,
    functions: Optional[Iterable[Function]] = NOT_GIVEN,
    include_domains: Optional[List[str]] = NOT_GIVEN,
    include_reasoning: Optional[bool] = NOT_GIVEN,
    logit_bias: Optional[Dict[str, int]] = NOT_GIVEN,
    logprobs: Optional[bool] = NOT_GIVEN,
    max_completion_tokens: Optional[int] = NOT_GIVEN,
    max_tokens: Optional[int] = NOT_GIVEN,
    metadata: Optional[Dict[str, str]] = NOT_GIVEN,
    n: Optional[int] = NOT_GIVEN,
    parallel_tool_calls: Optional[bool] = NOT_GIVEN,
    presence_penalty: Optional[float] = NOT_GIVEN,
    reasoning_effort: Optional[Literal["none", "default", "low", "medium", "high"]] = NOT_GIVEN,
    reasoning_format: Optional[Literal["hidden", "raw", "parsed"]] = NOT_GIVEN,
    response_format: Optional[ResponseFormat] = NOT_GIVEN,
    search_settings: Optional[SearchSettings] = NOT_GIVEN,
    seed: Optional[int] = NOT_GIVEN,
    service_tier: Optional[Literal["auto", "on_demand", "flex", "performance"]] = NOT_GIVEN,
    stop: Union[Optional[str], List[str], None] = NOT_GIVEN,
    store: Optional[bool] = NOT_GIVEN,
    stream: Optional[bool] = NOT_GIVEN,
    temperature: Optional[float] = NOT_GIVEN,
    tool_choice: Optional[ChatCompletionToolChoiceOptionParam] = NOT_GIVEN,
    tools: Optional[Iterable[ChatCompletionToolParam]] = NOT_GIVEN,
    top_logprobs: Optional[int] = NOT_GIVEN,
    top_p: Optional[float] = NOT_GIVEN,
    user: Optional[str] = NOT_GIVEN,
    extra_headers: Headers | None = None,
    extra_query: Query | None = None,
    extra_body: Body | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN
) -> ChatCompletion | Stream[ChatCompletionChunk]:
    """
    Create a chat completion with the specified messages and configuration.

    Parameters:
    - messages: List of conversation messages with roles and content
    - model: Model identifier to use for the completion
    - exclude_domains: Domains to exclude from search results
    - frequency_penalty: Penalize tokens based on their frequency in the text so far
    - function_call: Control which function is called (deprecated, use tool_choice)
    - functions: List of functions the model may call (deprecated, use tools)
    - include_domains: Domains to include in search results
    - include_reasoning: Whether to include reasoning in the response
    - logit_bias: Modify the likelihood of specified tokens appearing
    - logprobs: Whether to return log probabilities
    - max_completion_tokens: Maximum number of completion tokens to generate
    - max_tokens: Maximum number of tokens to generate (deprecated, use max_completion_tokens)
    - metadata: Optional metadata to attach to the request
    - n: Number of completions to generate for each prompt
    - parallel_tool_calls: Whether to enable parallel function calling
    - presence_penalty: Penalize tokens based on whether they appear in the text so far
    - reasoning_effort: Level of reasoning effort for reasoning-capable models
    - reasoning_format: Format for reasoning output
    - response_format: Format specification for the response
    - search_settings: Configuration for search functionality
    - seed: Random seed for deterministic sampling
    - service_tier: Service quality tier
    - stop: Sequences where the API will stop generating further tokens
    - store: Whether to store the conversation for model training
    - stream: Whether to stream back partial progress
    - temperature: Sampling temperature between 0 and 2
    - tool_choice: Controls which tool is called by the model
    - tools: List of tools the model may call
    - top_logprobs: Number of most likely tokens to return at each position
    - top_p: Nucleus sampling parameter
    - user: Unique identifier representing your end-user

    Returns:
        ChatCompletion for non-streaming requests or Stream[ChatCompletionChunk] for streaming
    """
```
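
For reasoning-capable models, `include_reasoning`, `reasoning_effort`, and `reasoning_format` control whether and how the model's reasoning is returned. A minimal sketch, assuming you have access to a reasoning-capable model (the model ID below is an illustrative placeholder):

```python
from groq import Groq

client = Groq()

completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "How many primes are there below 30?"}],
    model="your-reasoning-model",  # illustrative placeholder, not a real model ID
    include_reasoning=True,
    reasoning_format="parsed",  # ask for reasoning separated from the final answer
)

print(completion.choices[0].message.content)
```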

### Async Create Chat Completion

Asynchronous version of chat completion creation with identical parameters and functionality.

```python { .api }
async def create(
    messages: Iterable[ChatCompletionMessageParam],
    model: str,
    **kwargs
) -> ChatCompletion | AsyncStream[ChatCompletionChunk]:
    """Async version of create() with identical parameters."""
```

## Usage Examples

### Basic Chat Completion

```python
from groq import Groq

client = Groq()

completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    model="llama3-8b-8192",
    max_tokens=100,
    temperature=0.7
)

print(completion.choices[0].message.content)
```

### Streaming Chat Completion

```python
from groq import Groq

client = Groq()

stream = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Write a short story about a robot."}
    ],
    model="llama3-8b-8192",
    stream=True,
    max_tokens=500
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```

### Function Calling with Tools

```python
from groq import Groq

client = Groq()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What's the weather like in New York?"}
    ],
    model="llama3-8b-8192",
    tools=tools,
    tool_choice="auto"
)

# Check if the model wants to call a function
if completion.choices[0].message.tool_calls:
    tool_call = completion.choices[0].message.tool_calls[0]
    print(f"Function to call: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
```

### Async Usage

```python
import asyncio
from groq import AsyncGroq

async def main():
    client = AsyncGroq()

    completion = await client.chat.completions.create(
        messages=[
            {"role": "user", "content": "Explain quantum computing briefly."}
        ],
        model="llama3-8b-8192",
        max_tokens=200
    )

    print(completion.choices[0].message.content)

asyncio.run(main())
```

## Types

### Message Types

```python { .api }
class ChatCompletionMessage:
    content: Optional[str]
    role: Literal["assistant", "system", "user", "tool", "function"]
    function_call: Optional[FunctionCall]
    tool_calls: Optional[List[ChatCompletionMessageToolCall]]

class ChatCompletionMessageParam:
    role: str
    content: Optional[str]

class ChatCompletionSystemMessageParam(ChatCompletionMessageParam):
    role: Literal["system"]
    content: str

class ChatCompletionUserMessageParam(ChatCompletionMessageParam):
    role: Literal["user"]
    content: Union[str, List[ChatCompletionContentPartParam]]

class ChatCompletionAssistantMessageParam(ChatCompletionMessageParam):
    role: Literal["assistant"]
    content: Optional[str]
    function_call: Optional[ChatCompletionMessageToolCallParam]
    tool_calls: Optional[List[ChatCompletionMessageToolCallParam]]

class ChatCompletionToolMessageParam(ChatCompletionMessageParam):
    role: Literal["tool"]
    content: str
    tool_call_id: str

class ChatCompletionFunctionMessageParam(ChatCompletionMessageParam):
    role: Literal["function"]
    content: str
    name: str
```
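
In practice these message shapes are passed as plain dictionaries, as in the usage examples above. A short sketch of a multi-turn conversation built from the param types:

```python
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Name one prime number."},
    {"role": "assistant", "content": "2"},
    {"role": "user", "content": "And one that is even?"},
]
```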

### Content Part Types

```python { .api }
class ChatCompletionContentPartTextParam:
    type: Literal["text"]
    text: str

class ChatCompletionContentPartImageParam:
    type: Literal["image_url"]
    image_url: Dict[str, str]

ChatCompletionContentPartParam = Union[
    ChatCompletionContentPartTextParam,
    ChatCompletionContentPartImageParam
]
```
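
A user message's content can be a plain string or a list of content parts mixing text and images; whether image input is actually accepted depends on the model. A sketch (the image URL is illustrative):

```python
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ],
}
```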

266

267

### Tool Types

268

269

```python { .api }
class ChatCompletionToolParam:
    type: Literal["function"]
    function: FunctionDefinition

class ChatCompletionMessageToolCall:
    id: str
    type: Literal["function"]
    function: Function

class ChatCompletionMessageToolCallParam:
    id: str
    type: Literal["function"]
    function: Function

class ChatCompletionToolChoiceOptionParam:
    type: Literal["function"]
    function: ChatCompletionNamedToolChoiceParam

class ChatCompletionNamedToolChoiceParam:
    name: str
```
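
Besides the string "auto", `tool_choice` accepts a named-tool object that forces the model to call a specific function. A sketch reusing the `get_weather` tool list defined in the function calling example above:

```python
completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    model="llama3-8b-8192",
    tools=tools,  # the get_weather tool list from the earlier example
    # Force a call to get_weather instead of letting the model decide.
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
```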

### Response Types

```python { .api }
class ChatCompletion:
    id: str
    choices: List[Choice]
    created: int
    model: str
    object: Literal["chat.completion"]
    usage: Optional[CompletionUsage]

class Choice:
    finish_reason: Optional[Literal["stop", "length", "tool_calls", "content_filter", "function_call"]]
    index: int
    logprobs: Optional[ChoiceLogprobs]
    message: ChatCompletionMessage

class ChatCompletionChunk:
    id: str
    choices: List[ChoiceDelta]
    created: int
    model: str
    object: Literal["chat.completion.chunk"]
    usage: Optional[CompletionUsage]

class ChoiceDelta:
    delta: Delta
    finish_reason: Optional[str]
    index: int
    logprobs: Optional[ChoiceLogprobs]

class Delta:
    content: Optional[str]
    function_call: Optional[ChoiceDeltaFunctionCall]
    role: Optional[Literal["system", "user", "assistant", "tool"]]
    tool_calls: Optional[List[ChoiceDeltaToolCall]]
```
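
The `finish_reason` on each choice records why generation stopped, which is worth checking before trusting a possibly truncated answer. A minimal sketch (`handle_tool_calls` is a hypothetical helper, not part of the SDK):

```python
choice = completion.choices[0]

if choice.finish_reason == "length":
    # Generation hit max_tokens / max_completion_tokens; the reply may be cut off.
    print("Warning: response was truncated")
elif choice.finish_reason == "tool_calls":
    # The model stopped to request one or more tool invocations.
    handle_tool_calls(choice.message.tool_calls)  # hypothetical helper
else:
    print(choice.message.content)
```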

### Usage and Token Information

```python { .api }
class CompletionUsage:
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int
```
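
Non-streaming responses report token counts in `usage`, which is useful for cost tracking; the field is Optional, so guard against None:

```python
if completion.usage is not None:
    print(f"Prompt tokens:     {completion.usage.prompt_tokens}")
    print(f"Completion tokens: {completion.usage.completion_tokens}")
    print(f"Total tokens:      {completion.usage.total_tokens}")
```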

338

339

### Function Definition Types

340

341

```python { .api }
class FunctionDefinition:
    name: str
    description: Optional[str]
    parameters: Optional[FunctionParameters]

class FunctionParameters:
    # JSON Schema object defining function parameters
    type: str
    properties: Optional[Dict[str, Any]]
    required: Optional[List[str]]
```

### Response Format Types

```python { .api }
class ResponseFormat:
    type: Literal["text", "json_object"]
```
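
Passing `response_format={"type": "json_object"}` asks the model to emit valid JSON; it is generally a good idea to also describe the expected shape in the prompt. A sketch:

```python
import json

completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
    model="llama3-8b-8192",
    response_format={"type": "json_object"},
)

data = json.loads(completion.choices[0].message.content)
print(data)
```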

### Search Settings Types

```python { .api }
class SearchSettings:
    # Configuration for search functionality
    max_results: Optional[int]
    domains: Optional[List[str]]
```
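
The search parameters (`search_settings`, `include_domains`, `exclude_domains`) only take effect on models with search integration, so treat this as a sketch with an illustrative model ID:

```python
completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "Summarize this week's AI news."}],
    model="your-search-enabled-model",  # illustrative placeholder
    include_domains=["arxiv.org"],  # restrict search results to these domains
)

print(completion.choices[0].message.content)
```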