# Files

Upload, manage, and process files for use with fine-tuning, agents, and other AI capabilities. The files API provides comprehensive file management including upload, download, metadata retrieval, and deletion.

## Capabilities

### File Upload

Upload files to the Mistral AI platform for use with various services. Maximum file size is 512 MB. The fine-tuning API only supports `.jsonl` files.

```python { .api }
def upload(
    file: Union[File, FileTypedDict],
    purpose: Optional[FilePurpose] = None,
    **kwargs
) -> UploadFileOut:
    """
    Upload a file that can be used across various endpoints.

    Parameters:
    - file: The File object to be uploaded
    - purpose: File purpose for filtering and organization

    Returns:
        UploadFileOut with file metadata and ID
    """
```

### File Listing

List uploaded files with optional filtering and pagination.

```python { .api }
def list(
    page: Optional[int] = 0,
    page_size: Optional[int] = 100,
    sample_type: Optional[List[SampleType]] = None,
    source: Optional[List[Source]] = None,
    search: Optional[str] = None,
    purpose: Optional[FilePurpose] = None,
    **kwargs
) -> ListFilesOut:
    """
    Returns a list of files that belong to the user's organization.

    Parameters:
    - page: Page number for pagination (default: 0)
    - page_size: Number of files per page (default: 100)
    - sample_type: Filter by sample types
    - source: Filter by source
    - search: Search query string for filtering files
    - purpose: Filter by file purpose

    Returns:
        ListFilesOut with file metadata list
    """
```

### File Retrieval

Get detailed information about a specific file.

```python { .api }
def retrieve(file_id: str, **kwargs) -> RetrieveFileOut:
    """
    Retrieve file metadata.

    Parameters:
    - file_id: Unique identifier of the file

    Returns:
        RetrieveFileOut with detailed file information
    """
```

### File Download

Download file content from the platform.

```python { .api }
def download(file_id: str, **kwargs) -> httpx.Response:
    """
    Download a file (returns raw binary data as httpx.Response).

    Parameters:
    - file_id: The ID of the file to download

    Returns:
        httpx.Response with binary file content
    """
```

### File Deletion

Delete files that are no longer needed.

```python { .api }
def delete(file_id: str, **kwargs) -> DeleteFileOut:
    """
    Delete a file.

    Parameters:
    - file_id: Unique identifier of the file to delete

    Returns:
        DeleteFileOut with deletion confirmation
    """
```

### Signed URLs

Generate signed URLs for secure file access.

```python { .api }
def get_signed_url(
    file_id: str,
    expiry: Optional[int] = 24,
    **kwargs
) -> FileSignedURL:
    """
    Get a signed URL for accessing the file.

    Parameters:
    - file_id: The ID of the file
    - expiry: Number of hours before the URL becomes invalid (default: 24)

    Returns:
        FileSignedURL with secure access URL
    """
```

## Usage Examples

### Upload Training Data

```python
from mistralai import Mistral

client = Mistral(api_key="your-api-key")

# Upload a JSONL file for fine-tuning
with open("training_data.jsonl", "rb") as f:
    upload_result = client.files.upload(
        # `file` carries both the stored file name and the binary content
        file={"file_name": "my_training_data.jsonl", "content": f},
        purpose="fine-tune",
    )

print(f"Uploaded file ID: {upload_result.id}")
print(f"Filename: {upload_result.filename}")
print(f"Size: {upload_result.bytes} bytes")
```

### Upload from File Path

```python
from pathlib import Path

# Upload a file from a local path: open it and pass its name and content
path = Path("/path/to/document.pdf")
with path.open("rb") as f:
    upload_result = client.files.upload(
        file={"file_name": path.name, "content": f},
        purpose="assistants",
    )

print(f"File uploaded: {upload_result.id}")
print(f"Purpose: {upload_result.purpose}")
print(f"Status: {upload_result.status}")
```

### List and Filter Files

```python
# List files (returns the first page, up to 100 files by default)
all_files = client.files.list()
print(f"Files on first page: {len(all_files.data)}")

# Filter by purpose
fine_tune_files = client.files.list(purpose="fine-tune")
print(f"Fine-tuning files: {len(fine_tune_files.data)}")

for file in fine_tune_files.data:
    print(f"- {file.filename}: {file.bytes} bytes")

# Paginated listing: first page, 10 files per page
recent_files = client.files.list(page=0, page_size=10)
for file in recent_files.data:
    print(f"File: {file.id} - {file.filename}")
```

### Download and Process Files

```python
# Get file information
file_id = "file-abc123"
file_info = client.files.retrieve(file_id=file_id)
print(f"File: {file_info.filename}")
print(f"Size: {file_info.bytes} bytes")
print(f"Created: {file_info.created_at}")

# Download file content (returned as an httpx.Response)
response = client.files.download(file_id=file_id)
file_bytes = response.read()  # read the raw bytes out of the response

# Save to local file
with open(f"downloaded_{file_info.filename}", "wb") as f:
    f.write(file_bytes)

print(f"Downloaded {len(file_bytes)} bytes")
```

### Secure File Access

```python
# Generate signed URL for secure access
signed_url = client.files.get_signed_url(
    file_id=file_id,
    expiry=1,  # hours before the URL becomes invalid
)

print(f"Signed URL: {signed_url.url}")
print(f"Expires: {signed_url.expires_at}")

# Use signed URL for direct access (external to SDK)
import requests

response = requests.get(signed_url.url)
if response.status_code == 200:
    content = response.content
    print(f"Retrieved {len(content)} bytes via signed URL")
```

### File Management

```python
import time

# Clean up old files: delete anything created more than 30 days ago
cutoff = time.time() - (30 * 24 * 3600)  # created_at is a Unix timestamp

files = client.files.list()
for file in files.data:
    if file.created_at < cutoff:
        result = client.files.delete(file_id=file.id)
        print(f"Deleted file: {file.filename} (deleted={result.deleted})")
```

## Types

### File Upload Types

```python { .api }
class UploadFileOut:
    id: str
    object: str
    bytes: int
    created_at: int
    filename: str
    purpose: str
    status: str
    status_details: Optional[str]

class FilePurpose:
    FINE_TUNE = "fine-tune"
    ASSISTANTS = "assistants"
    BATCH = "batch"
```

### File Listing Types

```python { .api }
class ListFilesOut:
    object: str
    data: List[File]

class File:
    id: str
    object: str
    bytes: int
    created_at: int
    filename: str
    purpose: str
    status: str
    status_details: Optional[str]
```

### File Operations

```python { .api }
class RetrieveFileOut:
    id: str
    object: str
    bytes: int
    created_at: int
    filename: str
    purpose: str
    status: str
    status_details: Optional[str]

class DeleteFileOut:
    id: str
    object: str
    deleted: bool

class FileSignedURL:
    url: str
    expires_at: int
```

### File Status

```python { .api }
class FileStatus:
    UPLOADED = "uploaded"
    PROCESSED = "processed"
    ERROR = "error"
    DELETING = "deleting"
```
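
These status values suggest a simple polling loop for code that needs to wait on a file. The sketch below assumes that `"processed"` and `"error"` are terminal states, which this document does not state. `wait_until_processed` and `get_status` are hypothetical names; `get_status` stands in for something like `lambda: client.files.retrieve(file_id=file_id).status`, and a canned sequence replaces the API here.

```python
import time
from typing import Callable

# Assumption: these two statuses mean no further change is expected.
TERMINAL_STATUSES = {"processed", "error"}


def wait_until_processed(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> str:
    """Poll get_status() until a terminal status appears or attempts run out."""
    status = get_status()
    for _ in range(max_attempts - 1):
        if status in TERMINAL_STATUSES:
            break
        time.sleep(interval_s)
        status = get_status()
    return status


# Stand-in for the API: reports "uploaded" twice, then "processed".
_statuses = iter(["uploaded", "uploaded", "processed"])
final_status = wait_until_processed(lambda: next(_statuses), interval_s=0.0)
print(final_status)
```

The helper returns the last status it saw, so callers still need to check whether it is `"processed"`, `"error"`, or a non-terminal value after a timeout.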

## File Management Best Practices

### Supported File Types

- **Text files**: `.txt`, `.jsonl`, `.json`, `.csv`
- **Documents**: `.pdf`, `.docx`, `.md`
- **Data files**: Various formats depending on purpose

### File Size Limits

- Uploads are limited to 512 MB per file; other limits may vary by purpose and plan
- Check current limits in the API documentation
- Consider splitting large datasets for better processing
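
Splitting a large dataset can be sketched as grouping lines into chunks that each stay under a byte budget, so that every JSONL record remains intact. This is a minimal illustration, not part of the SDK: `split_lines_by_size` is a hypothetical helper, and the 25-byte budget is only there to keep the example small.

```python
from typing import Iterable, Iterator, List


def split_lines_by_size(lines: Iterable[bytes], max_bytes: int) -> Iterator[List[bytes]]:
    """Group lines into chunks whose total size stays at or under max_bytes."""
    chunk: List[bytes] = []
    size = 0
    for line in lines:
        if len(line) > max_bytes:
            raise ValueError("a single line exceeds max_bytes")
        # Start a new chunk when adding this line would exceed the budget.
        if chunk and size + len(line) > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield chunk


records = [b'{"id": 1}\n', b'{"id": 2}\n', b'{"id": 3}\n']  # 10 bytes each
chunks = list(split_lines_by_size(records, max_bytes=25))
print([len(c) for c in chunks])
```

In practice each chunk would be written to its own `.jsonl` file and uploaded separately, with `max_bytes` set comfortably below the applicable limit.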

### Purpose Guidelines

- **fine-tune**: Training data in JSONL format for model fine-tuning
- **assistants**: Documents and files for agent knowledge bases
- **batch**: Input files for batch processing operations
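
Because fine-tuning data must be JSONL, it can help to check a file locally before uploading it. The sketch below only verifies that each non-blank line parses as a JSON object; it does not check the schema the fine-tuning API expects. `find_invalid_jsonl_lines` is a hypothetical helper, and the `messages` key in the sample is purely illustrative.

```python
import json
from typing import Iterable, List, Tuple


def find_invalid_jsonl_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, error) for each non-blank line that is not a JSON object."""
    problems: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
            continue
        # Assumption: every record should be a JSON object, not a list or scalar.
        if not isinstance(record, dict):
            problems.append((number, "expected a JSON object"))
    return problems


sample = ['{"messages": []}', "not json", "[1, 2]"]
problems = find_invalid_jsonl_lines(sample)
print(problems)
```

An open text file can be passed directly as `lines`, since iterating over it yields one line at a time.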

### Security Considerations

- Files are encrypted at rest and in transit
- Signed URLs provide time-limited secure access
- Regular cleanup of unused files is recommended
- Access is controlled through API key authentication