# Chat Completions

Advanced conversational AI interface supporting text, image, and video inputs with streaming capabilities, comprehensive configuration options, and both synchronous and asynchronous operations.

## Capabilities

### Basic Chat Completion

Creates chat completions with conversational context and message history.
```python { .api }
def create(
    *,
    messages: List[Dict[str, Any]],
    model: str,
    max_tokens: Optional[int] = None,
    stop: Optional[List[str]] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    repetition_penalty: Optional[float] = None,
    presence_penalty: Optional[float] = None,
    frequency_penalty: Optional[float] = None,
    min_p: Optional[float] = None,
    logit_bias: Optional[Dict[str, float]] = None,
    seed: Optional[int] = None,
    stream: bool = False,
    logprobs: Optional[int] = None,
    echo: Optional[bool] = None,
    n: Optional[int] = None,
    safety_model: Optional[str] = None,
    response_format: Optional[Dict[str, Any]] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    tool_choice: Optional[Union[str, Dict[str, Union[str, Dict[str, str]]]]] = None,
    **kwargs
) -> Union[ChatCompletionResponse, Iterator[ChatCompletionChunk]]:
    """
    Create a chat completion with conversational messages.

    Args:
        messages: List of message objects with role and content (Dict[str, Any])
        model: Model identifier for chat completion
        max_tokens: Maximum number of tokens to generate in the response
        stop: List of stop sequences that end generation
        temperature: Sampling temperature (0.0 to 2.0)
        top_p: Nucleus sampling probability threshold
        top_k: Top-k sampling parameter
        repetition_penalty: Penalty for repeating tokens
        presence_penalty: Penalty for token presence (-2.0 to 2.0)
        frequency_penalty: Penalty for token frequency (-2.0 to 2.0)
        min_p: Minimum probability threshold for token consideration (0.0 to 1.0)
        logit_bias: Modify likelihood of specific tokens (-100 to 100)
        seed: Seed for reproducible generation
        stream: Enable streaming response chunks
        logprobs: Number of log probabilities to return per token
        echo: Include the prompt in the response along with logprobs
        n: Number of completion choices to generate
        safety_model: Safety moderation model to apply
        response_format: Output format specification
        tools: List of tool definitions for function calling
        tool_choice: Control tool selection behavior

    Returns:
        ChatCompletionResponse, or Iterator[ChatCompletionChunk] when streaming
    """
```

### Multi-Modal Chat

Supports messages with text, images, and video content in conversational context.

```python { .api }
def create(
    model: str,
    messages: List[Dict[str, Union[str, List[dict]]]],
    **kwargs
) -> ChatCompletionResponse:
    """
    Create multi-modal chat completions with images and video.

    Message content can be:
    - String for text-only messages
    - List of content objects for multi-modal messages

    Content object types:
    - {"type": "text", "text": str}
    - {"type": "image_url", "image_url": {"url": str}}
    - {"type": "video_url", "video_url": {"url": str}}
    """
```

### Streaming Chat

Real-time streaming of chat completion responses as they are generated.

```python { .api }
def create(
    model: str,
    messages: List[dict],
    stream: bool = True,
    **kwargs
) -> Iterator[ChatCompletionChunk]:
    """
    Stream chat completion chunks in real-time.

    Returns:
        Iterator yielding ChatCompletionChunk objects
    """
```

### Async Chat Completion

Asynchronous chat completion operations for concurrent processing.

```python { .api }
async def create(
    model: str,
    messages: List[dict],
    **kwargs
) -> ChatCompletionResponse:
    """
    Asynchronously create chat completions.

    Returns:
        ChatCompletionResponse with generated content
    """
```

## Usage Examples

### Simple Text Chat

```python
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=300,
    temperature=0.7
)

print(response.choices[0].message.content)
```
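
### Structured JSON Output

The `response_format` parameter can request structured output. A minimal sketch, assuming an OpenAI-style `{"type": "json_object", "schema": ...}` form; the schema and the `ask_for_json` helper are illustrative, not part of the SDK:

```python
# Illustrative JSON schema for the desired reply shape.
answer_schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "key_points": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary"],
}

def ask_for_json(client, model):
    # client is a together.Together instance; the reply content should
    # then parse as JSON conforming to answer_schema.
    import json
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize quantum computing."}],
        response_format={"type": "json_object", "schema": answer_schema},
        max_tokens=300,
    )
    return json.loads(response.choices[0].message.content)
```

Validating the parsed dict against the schema on the client side is still advisable, since generation is not guaranteed to satisfy every constraint.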

### Multi-Modal Chat with Image

```python
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/image.jpg"
                }
            }
        ]
    }],
    max_tokens=200
)

print(response.choices[0].message.content)
```

### Video Analysis

```python
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe what happens in this video."
            },
            {
                "type": "video_url",
                "video_url": {
                    "url": "https://example.com/video.mp4"
                }
            }
        ]
    }],
    max_tokens=500
)

print(response.choices[0].message.content)
```

### Streaming Chat

```python
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Write a short story about AI"}],
    stream=True,
    max_tokens=500
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
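
When the full text is needed after streaming, the chunks can be folded back into a single string. A small sketch over the ChatCompletionChunk shape documented under Types; any iterable of objects with matching attributes works:

```python
def collect_stream(chunks):
    # Concatenate the delta contents of ChatCompletionChunk objects;
    # role-only or empty deltas (content is None) are skipped.
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)

# Usage with the stream from the example above:
# full_text = collect_stream(stream)
```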

### Async Chat Processing

```python
import asyncio
from together import AsyncTogether

async def process_chats():
    client = AsyncTogether()

    messages_list = [
        [{"role": "user", "content": "Explain machine learning"}],
        [{"role": "user", "content": "What is deep learning?"}],
        [{"role": "user", "content": "How do neural networks work?"}]
    ]

    tasks = [
        client.chat.completions.create(
            model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
            messages=messages,
            max_tokens=200
        )
        for messages in messages_list
    ]

    responses = await asyncio.gather(*tasks)

    for i, response in enumerate(responses):
        print(f"Response {i+1}: {response.choices[0].message.content}")

asyncio.run(process_chats())
```

### Logprobs Analysis

```python
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "The capital of France is"}],
    logprobs=3,
    max_tokens=10
)

logprobs_data = response.choices[0].logprobs
for token, logprob in zip(logprobs_data.tokens, logprobs_data.token_logprobs):
    print(f"Token: '{token}', Log Probability: {logprob}")
```
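
### Function Calling

The `tools` and `tool_choice` parameters enable function calling. A sketch assuming the OpenAI-compatible tool schema; `get_current_weather` is a hypothetical function, and the exact `tool_calls` attribute on the returned message is an assumption, not part of the types documented below:

```python
# Hypothetical tool definition in the OpenAI-compatible schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def ask_with_tools(client, model):
    # client is a together.Together instance. If the model decides to
    # invoke the tool, the returned message is expected to carry the
    # requested call(s) rather than plain text content.
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        tools=[weather_tool],
        tool_choice="auto",
    )
    return response.choices[0].message
```

With `tool_choice="auto"` the model chooses whether to call a tool; passing a specific tool name instead can force the call.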

## Types

### Request Types

```python { .api }
class ChatCompletionRequest:
    model: str
    messages: List[dict]
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    repetition_penalty: Optional[float] = None
    stream: bool = False
    logprobs: Optional[int] = None
    echo: Optional[bool] = None
    n: Optional[int] = None
    presence_penalty: Optional[float] = None
    frequency_penalty: Optional[float] = None
    logit_bias: Optional[Dict[str, float]] = None
    stop: Optional[Union[str, List[str]]] = None
    safety_model: Optional[str] = None
```

### Response Types

```python { .api }
class ChatCompletionResponse:
    id: str
    object: str
    created: int
    model: str
    choices: List[ChatChoice]
    usage: Usage

class ChatChoice:
    index: int
    message: ChatMessage
    finish_reason: Optional[str]
    logprobs: Optional[Logprobs]

class ChatMessage:
    role: str
    content: Optional[str]

class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

class Logprobs:
    tokens: List[str]
    token_logprobs: List[Optional[float]]
    top_logprobs: Optional[List[Dict[str, float]]]
```

### Streaming Types

```python { .api }
class ChatCompletionChunk:
    id: str
    object: str
    created: int
    model: str
    choices: List[ChatChoiceDelta]

class ChatChoiceDelta:
    index: int
    delta: ChatDelta
    finish_reason: Optional[str]

class ChatDelta:
    role: Optional[str]
    content: Optional[str]
```

### Message Content Types

```python { .api }
class TextContent:
    type: Literal["text"]
    text: str

class ImageContent:
    type: Literal["image_url"]
    image_url: ImageUrl

class VideoContent:
    type: Literal["video_url"]
    video_url: VideoUrl

class ImageUrl:
    url: str
    detail: Optional[Literal["low", "high", "auto"]] = None

class VideoUrl:
    url: str
```
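
The content types above map directly onto plain dicts in a message. A small sketch; the `build_user_message` helper name is illustrative, not part of the SDK:

```python
def build_user_message(text, image_url=None, video_url=None):
    # Assemble a multi-modal user message from the content object types
    # above; with no media URLs, only the text content object is emitted.
    content = [{"type": "text", "text": text}]
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    if video_url:
        content.append({"type": "video_url", "video_url": {"url": video_url}})
    return {"role": "user", "content": content}
```

The resulting dict can be passed directly in the `messages` list of any of the examples above.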