# Chat Completions

Create conversational responses using OpenAI's language models with support for text, vision inputs, audio, function calling, structured outputs, and streaming. Chat completions are the primary interface for interacting with models like GPT-4 and GPT-3.5.

## Access Patterns

Chat completions are accessible via:

- `client.chat.completions` - Primary access pattern
- `client.beta.chat.completions` - Beta namespace alias (identical to `client.chat.completions`)

## Capabilities

### Create Chat Completion

Generate a response for a chat conversation with extensive configuration options.

```python { .api }
def create(
    self,
    *,
    messages: Iterable[ChatCompletionMessageParam],
    model: str | ChatModel,
    audio: ChatCompletionAudioParam | None | Omit = omit,
    frequency_penalty: float | None | Omit = omit,
    function_call: dict | str | Omit = omit,
    functions: Iterable[dict] | Omit = omit,
    logit_bias: dict[str, int] | None | Omit = omit,
    logprobs: bool | None | Omit = omit,
    max_completion_tokens: int | None | Omit = omit,
    max_tokens: int | None | Omit = omit,
    metadata: dict[str, str] | None | Omit = omit,
    modalities: list[Literal["text", "audio"]] | None | Omit = omit,
    n: int | None | Omit = omit,
    parallel_tool_calls: bool | Omit = omit,
    prediction: dict | None | Omit = omit,
    presence_penalty: float | None | Omit = omit,
    prompt_cache_key: str | Omit = omit,
    prompt_cache_retention: Literal["in-memory", "24h"] | None | Omit = omit,
    reasoning_effort: Literal["none", "minimal", "low", "medium", "high"] | None | Omit = omit,
    response_format: completion_create_params.ResponseFormat | Omit = omit,
    safety_identifier: str | Omit = omit,
    seed: int | None | Omit = omit,
    service_tier: Literal["auto", "default", "flex", "scale", "priority"] | None | Omit = omit,
    stop: str | list[str] | None | Omit = omit,
    store: bool | None | Omit = omit,
    stream: bool | Omit = omit,
    stream_options: dict | None | Omit = omit,
    temperature: float | None | Omit = omit,
    tool_choice: ChatCompletionToolChoiceOptionParam | Omit = omit,
    tools: Iterable[ChatCompletionToolUnionParam] | Omit = omit,
    top_logprobs: int | None | Omit = omit,
    top_p: float | None | Omit = omit,
    user: str | Omit = omit,
    verbosity: Literal["low", "medium", "high"] | None | Omit = omit,
    web_search_options: dict | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ChatCompletion | Stream[ChatCompletionChunk]:
    """
    Create a model response for the given chat conversation.

    Args:
        messages: List of messages comprising the conversation. Each message can be:
            - System message: {"role": "system", "content": "..."}
            - User message: {"role": "user", "content": "..."}
            - Assistant message: {"role": "assistant", "content": "..."}
            - Tool message: {"role": "tool", "content": "...", "tool_call_id": "..."}
            Supports text, images, and audio content.

        model: Model ID like "gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", or "o1".
            See https://platform.openai.com/docs/models for available models.

        audio: Parameters for audio output when modalities includes "audio".
            {"voice": "alloy|echo|fable|onyx|nova|shimmer", "format": "wav|mp3|flac|opus|pcm16"}

        frequency_penalty: Number between -2.0 and 2.0. Penalizes tokens based on their
            frequency in the text so far, reducing repetition. Default 0.

        function_call: (Deprecated, use tool_choice) Controls function calling.
            - "none": No function calls
            - "auto": Model decides
            - {"name": "function_name"}: Force specific function

        functions: (Deprecated, use tools) List of function definitions.

        logit_bias: Modify token probabilities. Maps token IDs to bias values from
            -100 to 100. Values near ±1 slightly adjust probability; ±100 bans/forces tokens.

        logprobs: If true, returns log probabilities of output tokens.

        max_completion_tokens: Maximum tokens for the completion, including reasoning tokens.
            Preferred over max_tokens for o-series models.

        max_tokens: (Deprecated for o-series) Maximum tokens in the completion.
            Use max_completion_tokens instead.

        metadata: Up to 16 key-value pairs for storing additional object information.
            Keys max 64 chars, values max 512 chars.

        modalities: Output types to generate. Options: ["text"], ["audio"], ["text", "audio"].
            Default ["text"]. Audio requires the gpt-4o-audio-preview model.

        n: Number of completion choices to generate. Default 1.
            Costs scale with the number of choices.

        parallel_tool_calls: Enable parallel function calling during tool use.
            Default true when tools are present.

        prediction: Static predicted output for regeneration tasks (e.g., file content).

        presence_penalty: Number between -2.0 and 2.0. Penalizes tokens based on their
            presence in the text so far, encouraging new topics. Default 0.

        prompt_cache_key: Cache identifier for optimizing similar requests.
            Replaces the user field for caching.

        prompt_cache_retention: Cache retention policy. "24h" enables extended caching
            up to 24 hours. Default "in-memory".

        reasoning_effort: Effort level for reasoning models (o-series).
            Options: "none", "minimal", "low", "medium", "high".
            - gpt-5.1: defaults to "none"
            - Other models: default "medium"
            - gpt-5-pro: only supports "high"

        response_format: Output format specification.
            - {"type": "text"}: Plain text (default)
            - {"type": "json_object"}: Valid JSON
            - {"type": "json_schema", "json_schema": {...}}: Structured Outputs

        safety_identifier: Stable user identifier (hashed) for policy violation detection.

        seed: For deterministic sampling (Beta). Same seed + parameters should return the
            same result, but this is not guaranteed. Check system_fingerprint for changes.

        service_tier: Processing type for serving. Options: "auto", "default", "flex",
            "scale", "priority". Affects latency and pricing.

        stop: Up to 4 sequences where generation stops. Can be a string or list of strings.

        store: If true, stores the completion for model distillation/evals.

        stream: If true, returns an SSE stream of ChatCompletionChunk objects.
            Returns Stream[ChatCompletionChunk] instead of ChatCompletion.

        stream_options: Streaming configuration. Accepts a dict with:
            - "include_usage": bool - If true, includes token usage in the final chunk
            - "include_obfuscation": bool - If true (default), adds random characters
              to an obfuscation field on streaming delta events to normalize payload
              sizes as mitigation for side-channel attacks. Set to false to optimize
              bandwidth.

        temperature: Sampling temperature between 0 and 2. Higher values (e.g., 0.8)
            make output more random, lower values (e.g., 0.2) more deterministic.
            Default 1. Alter this or top_p, not both.

        tool_choice: Controls tool/function calling.
            - "none": No tools called
            - "auto": Model decides (default when tools present)
            - "required": Model must call at least one tool
            - {"type": "function", "function": {"name": "..."}}: Force specific tool

        tools: List of tools/functions available to the model. Each tool:
            {
                "type": "function",
                "function": {
                    "name": "function_name",
                    "description": "What it does",
                    "parameters": {...}  # JSON Schema
                }
            }

        top_logprobs: Number of most likely tokens to return with logprobs (0-20).
            Requires logprobs=true.

        top_p: Nucleus sampling parameter between 0 and 1. The model considers tokens
            with top_p probability mass; e.g., 0.1 means only tokens in the top 10%.
            Default 1. Alter this or temperature, not both.

        user: (Deprecated for caching, use prompt_cache_key) Unique user identifier
            for abuse monitoring.

        verbosity: Output detail level for reasoning models. Options: "low", "medium", "high".

        web_search_options: Web search configuration (if available for the model).

        extra_headers: Additional HTTP headers for the request.

        extra_query: Additional query parameters for the request.

        extra_body: Additional JSON fields for the request body.

        timeout: Request timeout in seconds.

    Returns:
        ChatCompletion: If stream=False (default), returns the complete response.
        Stream[ChatCompletionChunk]: If stream=True, returns a streaming response.

    Raises:
        BadRequestError: Invalid parameters
        AuthenticationError: Invalid API key
        RateLimitError: Rate limit exceeded
        APIError: Other API errors
    """
```

Usage examples:

```python
from openai import OpenAI

client = OpenAI()

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)

# With multiple messages and temperature
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a creative writer."},
        {"role": "user", "content": "Write a haiku about coding."},
    ],
    temperature=0.8,
    max_tokens=100
)

# With vision (image input)
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"}
                }
            ]
        }
    ]
)

# With function calling / tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., San Francisco"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
    tool_choice="auto"
)

# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        print(f"Function: {tool_call.function.name}")
        print(f"Arguments: {tool_call.function.arguments}")

# Streaming response
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

# With reasoning effort (o-series models)
response = client.chat.completions.create(
    model="o1",
    messages=[
        {"role": "user", "content": "Solve this complex math problem: ..."}
    ],
    reasoning_effort="high"
)

# With structured output (JSON Schema)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "List 3 colors"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "colors_response",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "colors": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "required": ["colors"],
                "additionalProperties": False
            }
        }
    }
)
```
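
The tool-calling example above stops after printing the requested call; completing the round trip means executing the function yourself and sending the result back as a `"tool"` message tied to the call by `tool_call_id`. A minimal sketch of that dispatch step, using a hand-written stand-in for the API's tool call (hypothetical values) rather than a live response:

```python
import json

# Stand-in for response.choices[0].message.tool_calls[0] (hypothetical values)
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "Boston", "unit": "celsius"}',
    },
}

def get_weather(location: str, unit: str = "celsius") -> str:
    # A real implementation would query a weather service here
    return f"22 degrees {unit} in {location}"

dispatch_table = {"get_weather": get_weather}

# Arguments arrive as a JSON string and must be parsed before dispatch
args = json.loads(tool_call["function"]["arguments"])
result = dispatch_table[tool_call["function"]["name"]](**args)

# Append this to `messages` and call create() again for the final answer
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": result,
}
print(tool_message["content"])  # 22 degrees celsius in Boston
```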

### Parse with Structured Output

Create a chat completion with automatic Pydantic model parsing for structured outputs.

```python { .api }
def parse(
    self,
    *,
    messages: Iterable[ChatCompletionMessageParam],
    model: str | ChatModel,
    response_format: type[ResponseFormatT] | Omit = omit,
    audio: ChatCompletionAudioParam | None | Omit = omit,
    frequency_penalty: float | None | Omit = omit,
    function_call: dict | str | Omit = omit,
    functions: Iterable[dict] | Omit = omit,
    logit_bias: dict[str, int] | None | Omit = omit,
    logprobs: bool | None | Omit = omit,
    max_completion_tokens: int | None | Omit = omit,
    max_tokens: int | None | Omit = omit,
    metadata: dict[str, str] | None | Omit = omit,
    modalities: list[Literal["text", "audio"]] | None | Omit = omit,
    n: int | None | Omit = omit,
    parallel_tool_calls: bool | Omit = omit,
    prediction: dict | None | Omit = omit,
    presence_penalty: float | None | Omit = omit,
    prompt_cache_key: str | Omit = omit,
    prompt_cache_retention: Literal["in-memory", "24h"] | None | Omit = omit,
    reasoning_effort: Literal["none", "minimal", "low", "medium", "high"] | None | Omit = omit,
    safety_identifier: str | Omit = omit,
    seed: int | None | Omit = omit,
    service_tier: Literal["auto", "default", "flex", "scale", "priority"] | None | Omit = omit,
    stop: str | list[str] | None | Omit = omit,
    store: bool | None | Omit = omit,
    stream_options: dict | None | Omit = omit,
    temperature: float | None | Omit = omit,
    tool_choice: ChatCompletionToolChoiceOptionParam | Omit = omit,
    tools: Iterable[ChatCompletionToolUnionParam] | Omit = omit,
    top_logprobs: int | None | Omit = omit,
    top_p: float | None | Omit = omit,
    user: str | Omit = omit,
    verbosity: Literal["low", "medium", "high"] | None | Omit = omit,
    web_search_options: dict | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ParsedChatCompletion[ResponseFormatT]:
    """
    Create a chat completion with automatic Pydantic model parsing.

    Converts the Pydantic model to a JSON schema, sends it to the API, and parses
    the response back into the model. Also automatically parses function tool calls
    when using pydantic_function_tool() or strict mode.

    Args:
        messages: List of conversation messages.
        model: Model ID to use.
        response_format: Pydantic model class for structured output.
        (other parameters same as the create method)

    Returns:
        ParsedChatCompletion[ResponseFormatT]: Completion with parsed content.
            Access via completion.choices[0].message.parsed

    Raises:
        Same as the create method, plus validation errors for malformed responses.
    """
```

Usage example:

```python
from typing import Literal

from pydantic import BaseModel
from openai import OpenAI

# Define the response structure
class Step(BaseModel):
    explanation: str
    output: str

class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str

client = OpenAI()

# parse() returns a strongly-typed response
completion = client.chat.completions.parse(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "Solve: 8x + 31 = 2"}
    ],
    response_format=MathResponse
)

# Access parsed content with full type safety
message = completion.choices[0].message
if message.parsed:
    for step in message.parsed.steps:
        print(f"{step.explanation}: {step.output}")
    print(f"Answer: {message.parsed.final_answer}")

# With function tools using pydantic
from openai import pydantic_function_tool

class WeatherParams(BaseModel):
    location: str
    unit: Literal["celsius", "fahrenheit"] = "celsius"

tool = pydantic_function_tool(
    WeatherParams,
    name="get_weather",
    description="Get current weather"
)

completion = client.chat.completions.parse(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=[tool]
)

# Tool calls are automatically parsed
if completion.choices[0].message.tool_calls:
    for call in completion.choices[0].message.tool_calls:
        if call.type == "function":
            # call.function.parsed_arguments is a WeatherParams instance
            params = call.function.parsed_arguments
            print(f"Location: {params.location}, Unit: {params.unit}")
```
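
What `parse()` automates can be sketched by hand, assuming pydantic v2: derive a JSON schema from the model class (this is what gets sent as the response format), then validate the model's raw JSON reply back into the class:

```python
from pydantic import BaseModel

class Step(BaseModel):
    explanation: str
    output: str

class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str

# 1. parse() derives a JSON schema like this from the class
schema = MathResponse.model_json_schema()
print(sorted(schema["required"]))  # ['final_answer', 'steps']

# 2. ...and validates the model's raw JSON reply back into the class
raw = '{"steps": [{"explanation": "Subtract 31 from both sides", "output": "8x = -29"}], "final_answer": "x = -29/8"}'
parsed = MathResponse.model_validate_json(raw)
print(parsed.final_answer)  # x = -29/8
```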

### Stored Chat Completions

Store, retrieve, update, and delete chat completions for persistent conversation management.

```python { .api }
def retrieve(
    self,
    completion_id: str,
    *,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ChatCompletion:
    """
    Retrieve a previously stored chat completion by its ID.

    Args:
        completion_id: The ID of the stored chat completion to retrieve.

    Returns:
        ChatCompletion: The stored completion object.
    """

def update(
    self,
    completion_id: str,
    *,
    metadata: dict[str, str] | None,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ChatCompletion:
    """
    Update metadata for a stored chat completion.

    Args:
        completion_id: The ID of the stored completion to update.
        metadata: Updated metadata key-value pairs (max 16 pairs). Required parameter.

    Returns:
        ChatCompletion: The updated completion object.
    """

def list(
    self,
    *,
    after: str | Omit = omit,
    limit: int | Omit = omit,
    metadata: dict[str, str] | None | Omit = omit,
    model: str | Omit = omit,
    order: Literal["asc", "desc"] | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> SyncCursorPage[ChatCompletion]:
    """
    List stored chat completions with cursor-based pagination.

    Args:
        after: Cursor for pagination (ID of the object to start after).
        limit: Maximum number of completions to return (default 20, max 100).
        metadata: Filter by metadata key-value pairs. Only returns completions with matching metadata.
        model: Filter by model. Only returns completions generated with the specified model.
        order: Sort order: "asc" (oldest first) or "desc" (newest first).

    Returns:
        SyncCursorPage[ChatCompletion]: Paginated list of completions.
    """

def delete(
    self,
    completion_id: str,
    *,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ChatCompletionDeleted:
    """
    Delete a stored chat completion.

    Args:
        completion_id: The ID of the stored completion to delete.

    Returns:
        ChatCompletionDeleted: Confirmation of deletion with deleted=True.
    """
```

Access stored completion messages:

```python { .api }
# Via client.chat.completions.messages.list()
def list(
    self,
    completion_id: str,
    *,
    after: str | Omit = omit,
    limit: int | Omit = omit,
    order: Literal["asc", "desc"] | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> SyncCursorPage[ChatCompletionStoreMessage]:
    """
    List messages from a stored chat completion.

    Args:
        completion_id: The ID of the stored completion.
        after: Cursor for pagination.
        limit: Maximum number of messages to return.
        order: Sort order: "asc" or "desc".

    Returns:
        SyncCursorPage[ChatCompletionStoreMessage]: Paginated list of messages.
    """
```

#### Usage Example

```python
from openai import OpenAI

client = OpenAI()

# Create a chat completion with store=True
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me about Python"}],
    store=True,
    metadata={"user_id": "user123", "session": "abc"}
)

completion_id = response.id
print(f"Stored completion: {completion_id}")

# Retrieve the stored completion later
stored = client.chat.completions.retrieve(completion_id)
print(stored.choices[0].message.content)

# Update metadata
updated = client.chat.completions.update(
    completion_id,
    metadata={"user_id": "user123", "session": "abc", "reviewed": "true"}
)

# List all stored completions
page = client.chat.completions.list(limit=10, order="desc")
for completion in page.data:
    print(f"{completion.id}: {completion.created}")

# List messages from a specific completion
messages_page = client.chat.completions.messages.list(completion_id)
for message in messages_page.data:
    print(f"{message.role}: {message.content}")

# Delete when no longer needed
deleted = client.chat.completions.delete(completion_id)
print(f"Deleted: {deleted.deleted}")
```
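
The `list` call above fetches a single page; cursor pagination means passing the last ID of each page as `after` until a short page comes back. A stdlib sketch of that loop against a fake in-memory store (the SDK's `SyncCursorPage` can also iterate across pages for you):

```python
# Fake store of completion IDs standing in for the API's stored completions
STORE = [f"chatcmpl-{i:03d}" for i in range(45)]

def fake_list(after=None, limit=20):
    # Return up to `limit` IDs after the cursor, like the list endpoint
    start = STORE.index(after) + 1 if after else 0
    return STORE[start:start + limit]

collected = []
cursor = None
while True:
    page = fake_list(after=cursor, limit=20)
    collected.extend(page)
    if len(page) < 20:  # a short page means we've reached the end
        break
    cursor = page[-1]   # the last ID becomes the next `after` cursor

print(len(collected))  # 45
```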

### Stream Chat Completions

Wrapper over `create(stream=True)` that provides a more granular event API and automatic accumulation of deltas. Requires use within a context manager.

```python { .api }
def stream(
    self,
    *,
    messages: Iterable[ChatCompletionMessageParam],
    model: str | ChatModel,
    audio: ChatCompletionAudioParam | None | Omit = omit,
    frequency_penalty: float | None | Omit = omit,
    function_call: dict | str | Omit = omit,
    functions: Iterable[dict] | Omit = omit,
    logit_bias: dict[str, int] | None | Omit = omit,
    logprobs: bool | None | Omit = omit,
    max_completion_tokens: int | None | Omit = omit,
    max_tokens: int | None | Omit = omit,
    metadata: dict[str, str] | None | Omit = omit,
    modalities: list[Literal["text", "audio"]] | None | Omit = omit,
    n: int | None | Omit = omit,
    parallel_tool_calls: bool | Omit = omit,
    prediction: dict | None | Omit = omit,
    presence_penalty: float | None | Omit = omit,
    prompt_cache_key: str | Omit = omit,
    prompt_cache_retention: Literal["in-memory", "24h"] | None | Omit = omit,
    reasoning_effort: Literal["none", "minimal", "low", "medium", "high"] | None | Omit = omit,
    response_format: completion_create_params.ResponseFormat | Omit = omit,
    safety_identifier: str | Omit = omit,
    seed: int | None | Omit = omit,
    service_tier: Literal["auto", "default", "flex", "scale", "priority"] | None | Omit = omit,
    stop: str | list[str] | None | Omit = omit,
    store: bool | None | Omit = omit,
    stream_options: dict | None | Omit = omit,
    temperature: float | None | Omit = omit,
    tool_choice: ChatCompletionToolChoiceOptionParam | Omit = omit,
    tools: Iterable[ChatCompletionToolUnionParam] | Omit = omit,
    top_logprobs: int | None | Omit = omit,
    top_p: float | None | Omit = omit,
    user: str | Omit = omit,
    verbosity: Literal["low", "medium", "high"] | None | Omit = omit,
    web_search_options: dict | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> ChatCompletionStreamManager:
    """
    Streaming wrapper with a granular event API and automatic delta accumulation.

    Unlike create(stream=True), this method requires a context manager to prevent
    resource leaks. Yields detailed events including content.delta and content.done,
    and provides accumulated snapshots.

    Args:
        Same parameters as the create() method.

    Returns:
        ChatCompletionStreamManager: Context manager yielding stream events.
    """
```

**Usage Example:**

```python
from openai import OpenAI

client = OpenAI()

# Must be used within a context manager
with client.chat.completions.stream(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
) as stream:
    for event in stream:
        if event.type == "content.delta":
            print(event.delta, flush=True, end="")
        elif event.type == "content.done":
            print(f"\nFinal content: {event.content}")

    # Access the accumulated completion after streaming
    completion = stream.get_final_completion()
    print(f"Model: {completion.model}")
    print(f"Total tokens: {completion.usage.total_tokens}")
```
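
Under the hood, the stream manager builds its final snapshot by accumulating `content.delta` events; the accumulation itself is just concatenation over the non-empty deltas, sketched here with stand-in values:

```python
# Stand-ins for the delta payloads of successive content.delta events;
# None marks chunks (role, finish_reason) that carry no text
deltas = ["Once", " upon", " a", " time", None]

parts = []
for delta in deltas:
    if delta:  # skip non-content chunks
        parts.append(delta)

final_content = "".join(parts)
print(final_content)  # Once upon a time
```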

### Streaming with Helpers

Advanced streaming with context managers for easier handling.

```python
from openai import OpenAI

client = OpenAI()

# Using create(stream=True) as a context manager
with client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")

# Iterating the stream object directly
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Stream with usage information
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    # With include_usage, the final chunk carries usage and an empty
    # choices list, so guard before indexing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    if chunk.usage:
        print(f"\nTokens used: {chunk.usage.total_tokens}")
```

762

763

## Types

764

765

```python { .api }

766

from typing import Literal, TypeVar, Generic

767

from pydantic import BaseModel

768

from openai.types.chat import (

769

ChatCompletionToolUnionParam,

770

ChatCompletionToolChoiceOptionParam,

771

completion_create_params,

772

)

773

774

# Message types

775

ChatCompletionMessageParam = dict[str, Any] # Union of message types

776

777

class ChatCompletionSystemMessageParam(TypedDict):

778

role: Literal["system"]

779

content: str

780

name: NotRequired[str]

781

782

class ChatCompletionUserMessageParam(TypedDict):

783

role: Literal["user"]

784

content: str | list[dict] # Text or multimodal content

785

name: NotRequired[str]

786

787

class ChatCompletionAssistantMessageParam(TypedDict):

788

role: Literal["assistant"]

789

content: str | None

790

name: NotRequired[str]

791

tool_calls: NotRequired[list[dict]]

792

793

class ChatCompletionToolMessageParam(TypedDict):

794

role: Literal["tool"]

795

content: str

796

tool_call_id: str

797

798

# Response types

799

class ChatCompletion(BaseModel):

800

id: str

801

choices: list[Choice]

802

created: int

803

model: str

804

object: Literal["chat.completion"]

805

system_fingerprint: str | None

806

usage: CompletionUsage | None

807

808

class Choice(BaseModel):

809

finish_reason: Literal["stop", "length", "tool_calls", "content_filter", "function_call"]

810

index: int

811

logprobs: Logprobs | None

812

message: ChatCompletionMessage

813

814

class ChatCompletionMessage(BaseModel):

815

content: str | None

816

role: Literal["assistant"]

817

tool_calls: list[ChatCompletionMessageToolCall] | None

818

function_call: FunctionCall | None # Deprecated

819

audio: Audio | None # When modalities includes audio

class ChatCompletionStoreMessage(BaseModel):
    """Message from a stored chat completion."""
    content: str | None
    role: Literal["system", "user", "assistant", "tool"]
    tool_calls: list[ChatCompletionMessageToolCall] | None
    tool_call_id: str | None  # For tool messages

class ChatCompletionMessageToolCall(BaseModel):
    id: str
    function: Function
    type: Literal["function"]

class Function(BaseModel):
    arguments: str  # JSON string
    name: str

class CompletionUsage(BaseModel):
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int
    completion_tokens_details: CompletionTokensDetails | None

# Streaming types
class ChatCompletionChunk(BaseModel):
    id: str
    choices: list[ChunkChoice]
    created: int
    model: str
    object: Literal["chat.completion.chunk"]
    system_fingerprint: str | None
    usage: CompletionUsage | None  # Only in final chunk with include_usage

class ChunkChoice(BaseModel):
    delta: ChoiceDelta
    finish_reason: str | None
    index: int
    logprobs: Logprobs | None

class ChoiceDelta(BaseModel):
    content: str | None
    role: Literal["assistant"] | None
    tool_calls: list[ChoiceDeltaToolCall] | None

# Parsed completion types
ResponseFormatT = TypeVar("ResponseFormatT", bound=BaseModel)

class ParsedChatCompletion(Generic[ResponseFormatT], ChatCompletion):
    """ChatCompletion with parsed content."""
    choices: list[ParsedChoice[ResponseFormatT]]

class ParsedChoice(Generic[ResponseFormatT], Choice):
    message: ParsedChatCompletionMessage[ResponseFormatT]

class ParsedChatCompletionMessage(Generic[ResponseFormatT], ChatCompletionMessage):
    parsed: ResponseFormatT | None
    tool_calls: list[ParsedFunctionToolCall] | None

class ParsedFunctionToolCall(ChatCompletionMessageToolCall):
    function: ParsedFunction
    type: Literal["function"]

class ParsedFunction(Function):
    parsed_arguments: BaseModel | None

# Deletion type
class ChatCompletionDeleted(BaseModel):
    id: str
    deleted: bool
    object: Literal["chat.completion"]

# Tool/function definitions
class ChatCompletionToolParam(TypedDict):
    type: Literal["function"]
    function: FunctionDefinition

class FunctionDefinition(TypedDict):
    name: str
    description: NotRequired[str]
    parameters: dict  # JSON Schema
    strict: NotRequired[bool]  # Enable strict schema adherence

# Response format types
class ResponseFormatText(TypedDict):
    type: Literal["text"]

class ResponseFormatJSONObject(TypedDict):
    type: Literal["json_object"]

class ResponseFormatJSONSchema(TypedDict):
    type: Literal["json_schema"]
    json_schema: JSONSchema

class JSONSchema(TypedDict):
    name: str
    description: NotRequired[str]
    schema: dict  # JSON Schema object
    strict: NotRequired[bool]

# Audio types
class ChatCompletionAudioParam(TypedDict):
    voice: Literal["alloy", "echo", "fable", "onyx", "nova", "shimmer"]
    format: Literal["wav", "mp3", "flac", "opus", "pcm16"]

# Streaming options
class ChatCompletionStreamOptionsParam(TypedDict):
    include_usage: NotRequired[bool]

# Tool choice types
ChatCompletionToolChoiceOptionParam = (
    Literal["none", "auto", "required"] | dict
)

class ToolChoiceFunction(TypedDict):
    type: Literal["function"]
    function: FunctionChoice

class FunctionChoice(TypedDict):
    name: str

# Stream wrapper type
class Stream(Generic[T]):
    def __iter__(self) -> Iterator[T]: ...
    def __next__(self) -> T: ...
    def __enter__(self) -> Stream[T]: ...
    def __exit__(self, *args) -> None: ...
    def close(self) -> None: ...
```
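The `Stream` wrapper above implements both the iterator and context-manager protocols, so a streaming response can be consumed with a plain `for` loop and closed deterministically with `with`. A minimal stand-in illustrating that protocol (the `FakeStream` class below is hypothetical, not part of the library; it only mirrors the methods `Stream[T]` declares):

```python
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")

class FakeStream(Generic[T]):
    """Hypothetical stand-in exposing the same protocol as Stream[T]."""

    def __init__(self, items: list[T]) -> None:
        self._iter = iter(items)
        self.closed = False

    def __iter__(self) -> Iterator[T]:
        return self

    def __next__(self) -> T:
        return next(self._iter)

    def __enter__(self) -> "FakeStream[T]":
        return self

    def __exit__(self, *args) -> None:
        self.close()

    def close(self) -> None:
        self.closed = True

# Consuming inside `with` guarantees close() runs even if iteration fails:
with FakeStream(["Hel", "lo"]) as stream:
    text = "".join(stream)
```

Using `with` (or calling `close()` explicitly) matters for real streams because it releases the underlying HTTP connection as soon as you are done with the response.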

## Async Usage

All chat completion methods are available in async variants through `AsyncOpenAI`:

```python
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()

    # Async create - returns ChatCompletion or AsyncStream[ChatCompletionChunk]
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

    # Async streaming
    async for chunk in await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Tell me a story"}],
        stream=True
    ):
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

    # Async parse - returns ParsedChatCompletion with structured output
    from pydantic import BaseModel

    class CalendarEvent(BaseModel):
        name: str
        date: str
        participants: list[str]

    response = await client.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "Alice and Bob are meeting on Friday"}],
        response_format=CalendarEvent
    )
    event = response.choices[0].message.parsed

    # Other async methods: retrieve, update, list, delete, stream
    # All have the same signatures as sync versions

asyncio.run(main())
```

**Note**: `AsyncOpenAI` uses `AsyncStream[ChatCompletionChunk]` for streaming responses instead of `Stream[ChatCompletionChunk]`.

## Error Handling

```python
from openai import OpenAI, APIError, APIStatusError, RateLimitError

client = OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit hit: {e}")
    # Handle rate limiting (e.g., retry with backoff)
except APIStatusError as e:
    print(f"API error: {e.status_code} - {e.message}")
    # Handle other non-2xx API responses
except APIError as e:
    print(f"API error: {e}")
    # Handle connection errors, timeouts, and other API failures
```
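The "retry with backoff" comment above can be made concrete with a small generic helper. This sketch is plain standard library; the commented usage at the bottom assumes the `client` and `RateLimitError` from the example above:

```python
import random
import time

def retry_with_backoff(fn, retryable=(Exception,), max_retries=5, base_delay=1.0):
    """Call fn(), retrying on the given exceptions with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # Exhausted retries: re-raise the last exception
            # Wait base_delay * 2^attempt, plus proportional random jitter
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)

# Hypothetical usage with the client from the error-handling example:
# retry_with_backoff(
#     lambda: client.chat.completions.create(
#         model="gpt-4",
#         messages=[{"role": "user", "content": "Hello"}],
#     ),
#     retryable=(RateLimitError,),
# )
```

Note that the SDK already retries some failures internally (configurable via `max_retries` on the client), so an application-level helper like this is mainly useful for longer backoff windows than the built-in policy provides.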