# Vector Stores

Core functionality for creating and managing vector stores with the OpenAI API. Vector stores enable semantic search and file management for AI assistants.

## Overview

Vector stores are collections of processed files that can be used with the `file_search` tool in assistants. They support semantic searching, batch file operations, and flexible chunking strategies for text extraction and embedding.

## Capabilities

### Create Vector Store

Creates a new vector store with optional initial files and configuration.

```typescript { .api }
function create(params: VectorStoreCreateParams): Promise<VectorStore>;

interface VectorStoreCreateParams {
  name?: string;
  description?: string;
  file_ids?: Array<string>;
  chunking_strategy?: FileChunkingStrategyParam;
  expires_after?: {
    anchor: 'last_active_at';
    days: number;
  };
  metadata?: Record<string, string> | null;
}
```

**Example:**

```typescript
import { OpenAI } from 'openai';

const client = new OpenAI();

// Create a basic vector store
const store = await client.vectorStores.create({
  name: 'Product Documentation',
  description: 'API and product docs',
});

// Create with initial files
const storeWithFiles = await client.vectorStores.create({
  name: 'Support Docs',
  file_ids: ['file-id-1', 'file-id-2'],
  chunking_strategy: {
    type: 'static',
    static: {
      max_chunk_size_tokens: 800,
      chunk_overlap_tokens: 400,
    },
  },
});

// Create with auto-expiration
const expiringStore = await client.vectorStores.create({
  name: 'Temporary Data',
  expires_after: {
    anchor: 'last_active_at',
    days: 30,
  },
});
```
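
The `expires_after` policy is anchored to `last_active_at`, so the expected expiry can be estimated locally. The sketch below assumes expiry is simply `last_active_at` plus `days` converted to seconds; that formula is an assumption rather than something this reference states, and the server-reported `expires_at` remains authoritative. `ExpiresAfter` and `estimateExpiry` are hypothetical names used only for illustration.

```typescript
// Hypothetical local mirror of the `expires_after` shape shown above.
interface ExpiresAfter {
  anchor: 'last_active_at';
  days: number;
}

const SECONDS_PER_DAY = 24 * 60 * 60;

// Estimates the expiry time as a Unix timestamp in seconds.
// Assumption: expiry = last_active_at + days * 86400.
// Returns null when the store has never been active.
function estimateExpiry(
  lastActiveAt: number | null,
  policy: ExpiresAfter,
): number | null {
  if (lastActiveAt === null) return null;
  return lastActiveAt + policy.days * SECONDS_PER_DAY;
}

const estimatedExpiry = estimateExpiry(1_700_000_000, {
  anchor: 'last_active_at',
  days: 30,
});
console.log(estimatedExpiry);
```
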
**Parameters:**
- `name` - Display name for the vector store
- `description` - Optional description
- `file_ids` - Array of file IDs to attach (must be pre-uploaded)
- `chunking_strategy` - Text chunking configuration (defaults to auto)
- `expires_after` - Automatic expiration policy
- `metadata` - Key-value pairs for custom data (max 16 entries)

**Status Values:**
- `completed` - Ready for use
- `in_progress` - Files are being processed
- `expired` - Vector store has expired

[Retrieve](#retrieve-vector-store) | [Update](#update-vector-store) | [List](#list-vector-stores) | [Delete](#delete-vector-store) | [Search](#search-vector-store)

---
### Retrieve Vector Store

Fetches details about a specific vector store.

```typescript { .api }
function retrieve(storeID: string): Promise<VectorStore>;
```

**Example:**

```typescript
const store = await client.vectorStores.retrieve('vs_abc123');

console.log(store.id);
console.log(store.status); // 'completed' | 'in_progress' | 'expired'
console.log(store.file_counts);
// {
//   total: 5,
//   completed: 4,
//   failed: 0,
//   in_progress: 1,
//   cancelled: 0
// }
console.log(store.usage_bytes);
console.log(store.created_at); // Unix timestamp
console.log(store.last_active_at); // Unix timestamp or null
```
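
The `file_counts` object printed above is enough to tell whether a store has finished ingesting its files. A minimal sketch, assuming a local `FileCounts` type that mirrors that shape; `summarizeProgress` and `StoreProgress` are hypothetical helpers, not part of the SDK.

```typescript
// Local mirror of the `file_counts` shape shown in the example above.
interface FileCounts {
  total: number;
  completed: number;
  failed: number;
  in_progress: number;
  cancelled: number;
}

interface StoreProgress {
  settled: boolean; // true when no file is still being processed
  percentComplete: number; // share of files that completed, 0-100
}

// Reduces file_counts to a progress summary.
// An empty store is reported as 100% because nothing is left to process.
function summarizeProgress(counts: FileCounts): StoreProgress {
  const percentComplete =
    counts.total === 0
      ? 100
      : Math.round((counts.completed / counts.total) * 100);
  return { settled: counts.in_progress === 0, percentComplete };
}

const progress = summarizeProgress({
  total: 5,
  completed: 4,
  failed: 0,
  in_progress: 1,
  cancelled: 0,
});
console.log(progress); // { settled: false, percentComplete: 80 }
```
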
**Returns:** `VectorStore { .api }` with current file processing status and metadata.

---
### Update Vector Store

Modifies an existing vector store's name, metadata, or expiration policy.

```typescript { .api }
function update(
  storeID: string,
  params: VectorStoreUpdateParams
): Promise<VectorStore>;

interface VectorStoreUpdateParams {
  name?: string | null;
  expires_after?: {
    anchor: 'last_active_at';
    days: number;
  } | null;
  metadata?: Record<string, string> | null;
}
```

**Example:**

```typescript
const updated = await client.vectorStores.update('vs_abc123', {
  name: 'Updated Store Name',
  metadata: {
    department: 'support',
    version: '2.0',
  },
});

// Update expiration policy
await client.vectorStores.update('vs_abc123', {
  expires_after: {
    anchor: 'last_active_at',
    days: 60,
  },
});

// Clear metadata
await client.vectorStores.update('vs_abc123', {
  metadata: null,
});
```
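
Metadata is limited to 16 entries, so it can help to check a metadata object before sending it. The sketch below is a hypothetical client-side check, not an SDK function. The 16-entry limit comes from this document; the key and value length limits (64 and 512 characters) are assumptions taken from the public API reference and should be verified against it.

```typescript
const MAX_ENTRIES = 16; // stated in the parameter list of this document
const MAX_KEY_LENGTH = 64; // assumption: limit from the public API reference
const MAX_VALUE_LENGTH = 512; // assumption: limit from the public API reference

// Returns a list of human-readable problems; an empty list means the
// metadata looks acceptable. `null` is allowed because it clears metadata.
function validateMetadata(metadata: Record<string, string> | null): string[] {
  if (metadata === null) return [];

  const problems: string[] = [];
  const entries = Object.entries(metadata);

  if (entries.length > MAX_ENTRIES) {
    problems.push(`too many entries: ${entries.length} > ${MAX_ENTRIES}`);
  }
  for (const [key, value] of entries) {
    if (key.length > MAX_KEY_LENGTH) {
      problems.push(`key longer than ${MAX_KEY_LENGTH} characters: ${key.slice(0, 20)}...`);
    }
    if (value.length > MAX_VALUE_LENGTH) {
      problems.push(`value longer than ${MAX_VALUE_LENGTH} characters for key: ${key}`);
    }
  }
  return problems;
}

console.log(validateMetadata({ department: 'support', version: '2.0' })); // []
```
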
---

### List Vector Stores

Retrieves a paginated list of vector stores with optional sorting.

```typescript { .api }
function list(
  params?: VectorStoreListParams
): Promise<VectorStoresPage>;

interface VectorStoreListParams {
  after?: string; // Cursor for pagination
  before?: string; // Cursor for pagination
  limit?: number; // Items per page (default 20)
  order?: 'asc' | 'desc'; // Sort by created_at
}
```

**Example:**

```typescript
// List the first page of stores
const page = await client.vectorStores.list();

for (const store of page.data) {
  console.log(`${store.name} (${store.id}): ${store.file_counts.completed}/${store.file_counts.total} files`);
}

// Paginate
if (page.hasNextPage()) {
  const nextPage = await page.getNextPage();
}

// Iterate all pages
for await (const page of (await client.vectorStores.list()).iterPages()) {
  for (const store of page.data) {
    console.log(store.name);
  }
}

// Sort by newest first
const newestStores = await client.vectorStores.list({
  order: 'desc',
  limit: 10,
});

// Iterate all stores across pages
for await (const store of await client.vectorStores.list()) {
  console.log(store.name);
}
```

---
### Delete Vector Store

Permanently removes a vector store (files remain in system).

```typescript { .api }
function delete(storeID: string): Promise<VectorStoreDeleted>;

interface VectorStoreDeleted {
  id: string;
  deleted: boolean;
  object: 'vector_store.deleted';
}
```

**Example:**

```typescript
const result = await client.vectorStores.delete('vs_abc123');
console.log(result.deleted); // true

// Files are preserved - delete separately if needed
await client.files.delete('file-id-1');
```

---
### Search Vector Store

Searches for relevant content chunks based on a query and optional filters.

```typescript { .api }
function search(
  storeID: string,
  params: VectorStoreSearchParams
): Promise<VectorStoreSearchResponsesPage>;

interface VectorStoreSearchParams {
  query: string | Array<string>;
  max_num_results?: number; // 1-50, default 10
  rewrite_query?: boolean; // Rewrite the natural-language query for vector search
  filters?: ComparisonFilter | CompoundFilter;
  ranking_options?: {
    ranker?: 'none' | 'auto' | 'default-2024-11-15';
    score_threshold?: number;
  };
}
```

**Example:**

```typescript
// Basic search
const results = await client.vectorStores.search('vs_abc123', {
  query: 'How to authenticate API requests?',
  max_num_results: 5,
});

for (const result of results.data) {
  console.log(`File: ${result.filename}`);
  console.log(`Similarity: ${result.score}`);
  for (const chunk of result.content) {
    console.log(chunk.text);
  }
}

// Search with query rewriting
const rewritten = await client.vectorStores.search('vs_abc123', {
  query: 'auth stuff',
  rewrite_query: true, // Rewrites to better semantic query
});

// Search with filters
const filtered = await client.vectorStores.search('vs_abc123', {
  query: 'pricing',
  filters: {
    key: 'department',
    type: 'eq',
    value: 'sales',
  },
});

// Search multiple queries
const multi = await client.vectorStores.search('vs_abc123', {
  query: ['pricing', 'cost', 'billing'],
  max_num_results: 10,
});

// Advanced ranking options
const ranked = await client.vectorStores.search('vs_abc123', {
  query: 'documentation',
  ranking_options: {
    ranker: 'default-2024-11-15',
    score_threshold: 0.5,
  },
});
```
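
The `filters` parameter accepts either a single comparison or a compound filter. The sketch below shows one way to compose them with small builder functions. The types are local mirrors written for illustration; only the `eq` operator appears in this document, so the other operator names and the `and`/`or` compound shape are assumptions based on the public API reference. The builder names (`eq`, `gte`, `and`, `or`) are hypothetical.

```typescript
type FilterValue = string | number | boolean;

// Local mirrors of the filter shapes (assumed, see the note above).
interface ComparisonFilter {
  key: string;
  type: 'eq' | 'ne' | 'gt' | 'gte' | 'lt' | 'lte';
  value: FilterValue;
}

interface CompoundFilter {
  type: 'and' | 'or';
  filters: Array<ComparisonFilter | CompoundFilter>;
}

type Filter = ComparisonFilter | CompoundFilter;

// Hypothetical builders that keep filter literals short and type-checked.
const eq = (key: string, value: FilterValue): ComparisonFilter => ({ key, type: 'eq', value });
const gte = (key: string, value: number): ComparisonFilter => ({ key, type: 'gte', value });
const and = (...filters: Filter[]): CompoundFilter => ({ type: 'and', filters });
const or = (...filters: Filter[]): CompoundFilter => ({ type: 'or', filters });

// Files attributed to sales or support, from version 2 onward.
const recentTeamDocs = and(
  or(eq('department', 'sales'), eq('department', 'support')),
  gte('version', 2),
);

console.log(JSON.stringify(recentTeamDocs, null, 2));
```
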
**Returns:** Search results with file references, content chunks, and similarity scores.

---
## Vector Store Files

File management for vector stores with chunking and polling support.

Access via: `client.vectorStores.files`

### Add File to Vector Store

Attaches an existing file to a vector store for processing.

```typescript { .api }
function create(
  storeID: string,
  params: FileCreateParams
): Promise<VectorStoreFile>;

interface FileCreateParams {
  file_id: string;
  attributes?: Record<string, string | number | boolean> | null;
  chunking_strategy?: FileChunkingStrategyParam;
}
```

**Example:**

```typescript
// Add file to vector store
const vsFile = await client.vectorStores.files.create('vs_abc123', {
  file_id: 'file-id-1',
});

console.log(vsFile.status); // 'in_progress', 'completed', 'failed', 'cancelled'
console.log(vsFile.usage_bytes);

// Add with custom chunking
const customChunked = await client.vectorStores.files.create('vs_abc123', {
  file_id: 'file-id-2',
  chunking_strategy: {
    type: 'static',
    static: {
      max_chunk_size_tokens: 1024,
      chunk_overlap_tokens: 200,
    },
  },
  attributes: {
    source: 'support-docs',
    version: '1.0',
  },
});

// Add with auto chunking (default)
await client.vectorStores.files.create('vs_abc123', {
  file_id: 'file-id-3',
  chunking_strategy: {
    type: 'auto',
  },
});
```

---
### Retrieve File Details

Gets metadata and status for a file in a vector store. The file ID is the first argument and the vector store ID is passed in the params object, matching the usage in the polling and error-handling examples in this document.

```typescript { .api }
function retrieve(
  fileID: string,
  params: FileRetrieveParams
): Promise<VectorStoreFile>;

interface FileRetrieveParams {
  vector_store_id: string;
}
```

**Example:**

```typescript
const file = await client.vectorStores.files.retrieve('file-id-1', {
  vector_store_id: 'vs_abc123',
});

console.log(file.status); // 'in_progress' | 'completed' | 'failed' | 'cancelled'
console.log(file.usage_bytes);
console.log(file.created_at);

if (file.status === 'failed') {
  console.log(file.last_error?.code); // 'server_error' | 'unsupported_file' | 'invalid_file'
  console.log(file.last_error?.message);
}
```
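
The `status` and `last_error.code` fields shown above can be mapped to a follow-up action. The sketch below is one possible policy, not SDK behavior: treating `server_error` as worth retrying and the other error codes as problems with the file itself is an assumption, and `nextStep`, `FileState`, and `NextStep` are hypothetical names.

```typescript
type FileStatus = 'in_progress' | 'completed' | 'failed' | 'cancelled';
type FileErrorCode = 'server_error' | 'unsupported_file' | 'invalid_file';

// Minimal local view of the fields used from a vector store file.
interface FileState {
  status: FileStatus;
  last_error?: { code: FileErrorCode; message: string } | null;
}

type NextStep = 'wait' | 'use' | 'retry' | 'fix_file' | 're_add';

// Maps a file's processing state to a suggested action.
function nextStep(file: FileState): NextStep {
  switch (file.status) {
    case 'in_progress':
      return 'wait'; // still processing, poll again later
    case 'completed':
      return 'use'; // ready for search
    case 'cancelled':
      return 're_add'; // processing was stopped, attach the file again
    case 'failed':
      // Assumption: server errors may be transient, other codes point at the file.
      return file.last_error?.code === 'server_error' ? 'retry' : 'fix_file';
  }
}

console.log(nextStep({ status: 'failed', last_error: { code: 'unsupported_file', message: 'bad type' } })); // fix_file
```
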
**File Statuses:**
- `in_progress` - Currently being processed
- `completed` - Ready for search and use
- `failed` - Processing error (check `last_error`)
- `cancelled` - Processing was cancelled

---
### Update File Metadata

Replaces the attributes attached to a file in a vector store. As with `retrieve`, the file ID comes first and the vector store ID is passed in the params object.

```typescript { .api }
function update(
  fileID: string,
  params: FileUpdateParams
): Promise<VectorStoreFile>;

interface FileUpdateParams {
  vector_store_id: string;
  attributes: Record<string, string | number | boolean> | null;
}
```

**Example:**

```typescript
// Update file attributes
const updated = await client.vectorStores.files.update('file-id-1', {
  vector_store_id: 'vs_abc123',
  attributes: {
    category: 'billing',
    reviewed: true,
    priority: 1,
  },
});

// Clear attributes
await client.vectorStores.files.update('file-id-1', {
  vector_store_id: 'vs_abc123',
  attributes: null,
});
```

---
### List Files in Vector Store

Retrieves a paginated list of files with status filtering and sorting.

```typescript { .api }
function list(
  storeID: string,
  params?: FileListParams
): Promise<VectorStoreFilesPage>;

interface FileListParams {
  after?: string; // Cursor
  before?: string; // Cursor
  limit?: number; // Items per page
  filter?: 'in_progress' | 'completed' | 'failed' | 'cancelled';
  order?: 'asc' | 'desc'; // Sort by created_at
}
```

**Example:**

```typescript
// List the first page of files
const page = await client.vectorStores.files.list('vs_abc123');

for (const file of page.data) {
  console.log(`${file.id}: ${file.status} (${file.usage_bytes} bytes)`);
}

// Filter by status
const completed = await client.vectorStores.files.list('vs_abc123', {
  filter: 'completed',
});

const failed = await client.vectorStores.files.list('vs_abc123', {
  filter: 'failed',
});

// Sort and paginate
const sorted = await client.vectorStores.files.list('vs_abc123', {
  order: 'desc',
  limit: 10,
});

// Iterate all files across pages
for await (const file of await client.vectorStores.files.list('vs_abc123')) {
  console.log(file.id);
}
```

---
### Remove File from Vector Store

Deletes a file association from the vector store (file remains in system). The method is named `delete`, consistent with `client.vectorStores.delete()` and `client.files.delete()` used elsewhere in this document.

```typescript { .api }
function delete(
  fileID: string,
  params: FileDeleteParams
): Promise<VectorStoreFileDeleted>;

interface FileDeleteParams {
  vector_store_id: string;
}
```

**Example:**

```typescript
const deleted = await client.vectorStores.files.delete('file-id-1', {
  vector_store_id: 'vs_abc123',
});

console.log(deleted.deleted); // true
```

Note: Use `client.files.delete()` to permanently delete the file itself.

---
### Get File Content

Retrieves the parsed text content of a file as a paginated list of chunks.

```typescript { .api }
function content(
  fileID: string,
  params: FileContentParams
): Promise<FileContentResponsesPage>;

interface FileContentParams {
  vector_store_id: string;
}

interface FileContentResponse {
  type?: string; // 'text'
  text?: string; // Content chunk
}
```

**Example:**

```typescript
// Get file contents as paginated chunks
const content = await client.vectorStores.files.content('file-id-1', {
  vector_store_id: 'vs_abc123',
});

for (const chunk of content.data) {
  console.log(chunk.text);
}

// Iterate all chunks
for await (const chunk of await client.vectorStores.files.content('file-id-1', {
  vector_store_id: 'vs_abc123',
})) {
  console.log(chunk.text);
}
```

---
## Helper Methods

### Upload File Helper

Uploads a raw file to the Files API and adds it to the vector store in one operation.

```typescript { .api }
async function upload(
  storeID: string,
  file: Uploadable,
  options?: RequestOptions
): Promise<VectorStoreFile>;

// Raw Buffers and strings are not accepted directly; wrap them with toFile()
type Uploadable = File | Response | FsReadStream | BunFile;
```

**Example:**

```typescript
import { toFile } from 'openai';
import fs from 'fs';

// From a file on disk (Node.js)
const file = await client.vectorStores.files.upload('vs_abc123',
  await toFile(fs.createReadStream('./docs.pdf'), 'docs.pdf', { type: 'application/pdf' })
);

// From Buffer
const buffer = Buffer.from('PDF content here');
const vsFile = await client.vectorStores.files.upload('vs_abc123',
  await toFile(buffer, 'document.pdf', { type: 'application/pdf' })
);

// From File object (browser)
const input = document.querySelector<HTMLInputElement>('input[type="file"]');
const selected = input?.files?.[0];
if (selected) {
  const browserFile = await client.vectorStores.files.upload('vs_abc123', selected);
}
```

Note: The file is processed asynchronously after upload. Use `poll()` to wait for completion.

---
### Upload and Poll Helper

Uploads a file and waits for processing to complete.

```typescript { .api }
async function uploadAndPoll(
  storeID: string,
  file: Uploadable,
  options?: RequestOptions & { pollIntervalMs?: number }
): Promise<VectorStoreFile>;
```

**Example:**

```typescript
import { toFile } from 'openai';

// Upload and wait for processing
const completed = await client.vectorStores.files.uploadAndPoll(
  'vs_abc123',
  await toFile(Buffer.from('content'), 'doc.txt'),
  { pollIntervalMs: 2000 } // Check every 2 seconds
);

console.log(completed.status); // 'completed' or 'failed'

if (completed.status === 'failed') {
  console.error(completed.last_error?.message);
}
```

---
### Create and Poll Helper

Attaches an already-uploaded file to a vector store and waits for processing to finish.

```typescript { .api }
async function createAndPoll(
  storeID: string,
  body: FileCreateParams,
  options?: RequestOptions & { pollIntervalMs?: number }
): Promise<VectorStoreFile>;
```

**Example:**

```typescript
// Attach pre-uploaded file and wait for processing
const completed = await client.vectorStores.files.createAndPoll(
  'vs_abc123',
  {
    file_id: 'file-abc123',
    chunking_strategy: {
      type: 'static',
      static: {
        max_chunk_size_tokens: 1024,
        chunk_overlap_tokens: 300,
      },
    },
  },
  { pollIntervalMs: 3000 }
);
```

---
### Poll File Status

Manually polls a file until processing reaches a terminal status (`completed`, `failed`, or `cancelled`).

```typescript { .api }
async function poll(
  storeID: string,
  fileID: string,
  options?: RequestOptions & { pollIntervalMs?: number }
): Promise<VectorStoreFile>;
```

**Example:**

```typescript
// Start processing
const file = await client.vectorStores.files.create('vs_abc123', {
  file_id: 'file-id-1',
});

// Poll manually
const completed = await client.vectorStores.files.poll('vs_abc123', file.id, {
  pollIntervalMs: 2000,
});

// Or in a loop with custom logic
while (true) {
  const status = await client.vectorStores.files.retrieve(file.id, {
    vector_store_id: 'vs_abc123',
  });

  if (status.status === 'completed') {
    console.log('Done!');
    break;
  } else if (status.status === 'failed') {
    console.error(status.last_error?.message);
    break;
  } else if (status.status === 'cancelled') {
    // 'cancelled' is also terminal; without this branch the loop never exits
    console.warn('Processing was cancelled');
    break;
  }

  await new Promise(resolve => setTimeout(resolve, 5000));
}
```
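
The custom loop above waits a fixed 5 seconds between checks. For long-running files, a capped exponential backoff is a common alternative that checks quickly at first and less often later. This is a sketch of the delay schedule only; `backoffDelayMs` and its default values are hypothetical and not part of the SDK, whose helpers take a fixed `pollIntervalMs`.

```typescript
// Returns the delay before the given retry attempt (0-based), doubling
// from `baseMs` each time and never exceeding `capMs`.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// First six delays with the defaults.
const schedule = Array.from({ length: 6 }, (_, attempt) => backoffDelayMs(attempt));
console.log(schedule); // [ 1000, 2000, 4000, 8000, 16000, 30000 ]

// Inside a polling loop, the fixed timeout could be replaced with:
//   await new Promise(resolve => setTimeout(resolve, backoffDelayMs(attempt++)));
```
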
---

## Batch File Operations

Process multiple files efficiently with batch operations.

Access via: `client.vectorStores.fileBatches`

### Create Batch

Creates a batch of file operations for a vector store.

```typescript { .api }
function create(
  storeID: string,
  params: FileBatchCreateParams
): Promise<VectorStoreFileBatch>;

interface FileBatchCreateParams {
  file_ids?: Array<string>;
  files?: Array<{
    file_id: string;
    attributes?: Record<string, string | number | boolean> | null;
    chunking_strategy?: FileChunkingStrategyParam;
  }>;
  attributes?: Record<string, string | number | boolean> | null;
  chunking_strategy?: FileChunkingStrategyParam;
}
```

**Example:**

```typescript
// Batch with file IDs (same settings for all)
const batch = await client.vectorStores.fileBatches.create('vs_abc123', {
  file_ids: ['file-1', 'file-2', 'file-3'],
  chunking_strategy: {
    type: 'static',
    static: {
      max_chunk_size_tokens: 800,
      chunk_overlap_tokens: 400,
    },
  },
});

// Batch with per-file configuration
const customBatch = await client.vectorStores.fileBatches.create('vs_abc123', {
  files: [
    {
      file_id: 'file-1',
      chunking_strategy: { type: 'auto' },
    },
    {
      file_id: 'file-2',
      chunking_strategy: {
        type: 'static',
        static: { max_chunk_size_tokens: 512, chunk_overlap_tokens: 200 },
      },
      attributes: { category: 'support' },
    },
  ],
});

console.log(batch.status); // 'in_progress' | 'completed' | 'failed' | 'cancelled'
console.log(batch.file_counts);
// { total: 3, completed: 0, failed: 0, in_progress: 3, cancelled: 0 }
```
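
When most files in a batch share the same attributes and only a few differ, the per-file `files` array can be generated rather than written by hand. A minimal sketch, assuming a local `BatchFileEntry` type that mirrors the entry shape above; `toBatchFiles` is a hypothetical helper, not an SDK function.

```typescript
type AttributeValue = string | number | boolean;
type Attributes = Record<string, AttributeValue>;

// Local mirror of one entry in the `files` array (chunking omitted for brevity).
interface BatchFileEntry {
  file_id: string;
  attributes: Attributes;
}

// Builds per-file entries by merging shared attributes with optional
// per-file overrides; override values win on key conflicts.
function toBatchFiles(
  fileIds: string[],
  shared: Attributes,
  overrides: Record<string, Attributes> = {},
): BatchFileEntry[] {
  return fileIds.map((file_id) => ({
    file_id,
    attributes: { ...shared, ...(overrides[file_id] ?? {}) },
  }));
}

const entries = toBatchFiles(
  ['file-1', 'file-2'],
  { category: 'support', reviewed: false },
  { 'file-2': { reviewed: true } },
);
console.log(entries);
// The result could then be passed as `{ files: entries }` when creating a batch.
```
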
---

### Retrieve Batch

Gets status and details of a batch operation.

```typescript { .api }
function retrieve(
  batchID: string,
  params: FileBatchRetrieveParams
): Promise<VectorStoreFileBatch>;

interface FileBatchRetrieveParams {
  vector_store_id: string;
}
```

**Example:**

```typescript
const batch = await client.vectorStores.fileBatches.retrieve('batch-id-1', {
  vector_store_id: 'vs_abc123',
});

console.log(batch.file_counts);
// { total: 5, completed: 3, failed: 0, in_progress: 2, cancelled: 0 }
```

---
### Cancel Batch

Stops a batch operation and cancels remaining file processing.

```typescript { .api }
function cancel(
  batchID: string,
  params: FileBatchCancelParams
): Promise<VectorStoreFileBatch>;

interface FileBatchCancelParams {
  vector_store_id: string;
}
```

**Example:**

```typescript
const cancelled = await client.vectorStores.fileBatches.cancel('batch-id-1', {
  vector_store_id: 'vs_abc123',
});

console.log(cancelled.status); // 'cancelled'
```

---
### List Files in Batch

Retrieves a paginated list of files processed in a batch.

```typescript { .api }
function listFiles(
  batchID: string,
  params: FileBatchListFilesParams
): Promise<VectorStoreFilesPage>;

interface FileBatchListFilesParams {
  vector_store_id: string;
  after?: string;
  before?: string;
  limit?: number;
  filter?: 'in_progress' | 'completed' | 'failed' | 'cancelled';
  order?: 'asc' | 'desc';
}
```

**Example:**

```typescript
// List all files in batch
const page = await client.vectorStores.fileBatches.listFiles('batch-id-1', {
  vector_store_id: 'vs_abc123',
});

// Filter by status
const failed = await client.vectorStores.fileBatches.listFiles('batch-id-1', {
  vector_store_id: 'vs_abc123',
  filter: 'failed',
});

for (const file of failed.data) {
  console.log(file.last_error?.message);
}
```

---
## Batch Helper Methods

### Create and Poll

Creates a batch and waits for all files to finish processing.

```typescript { .api }
async function createAndPoll(
  storeID: string,
  body: FileBatchCreateParams,
  options?: RequestOptions & { pollIntervalMs?: number }
): Promise<VectorStoreFileBatch>;
```

**Example:**

```typescript
// Create batch and wait for all files
const completed = await client.vectorStores.fileBatches.createAndPoll(
  'vs_abc123',
  {
    file_ids: ['file-1', 'file-2', 'file-3'],
  },
  { pollIntervalMs: 5000 }
);

console.log(completed.file_counts);
// { total: 3, completed: 3, failed: 0, in_progress: 0, cancelled: 0 }
```

---
905
906
### Upload and Poll
907
908
Uploads raw files and creates a batch, waiting for processing.
909
910
```typescript { .api }
911
async function uploadAndPoll(
912
storeID: string,
913
{ files: Uploadable[], fileIds?: string[] },
914
options?: RequestOptions & {
915
pollIntervalMs?: number;
916
maxConcurrency?: number;
917
}
918
): Promise<VectorStoreFileBatch>;
919
```
920
921
**Example:**
922
923
```typescript
924
import { toFile } from 'openai';
925
import fs from 'fs';
926
927
// Upload multiple files concurrently and create batch
928
const batch = await client.vectorStores.fileBatches.uploadAndPoll(
929
'vs_abc123',
930
{
931
files: [
932
await toFile(fs.createReadStream('./doc1.pdf'), 'doc1.pdf'),
933
await toFile(fs.createReadStream('./doc2.pdf'), 'doc2.pdf'),
934
await toFile(fs.createReadStream('./doc3.txt'), 'doc3.txt'),
935
],
936
fileIds: ['pre-uploaded-file-id'],
937
maxConcurrency: 3, // Upload 3 files at a time
938
},
939
{ pollIntervalMs: 5000 }
940
);
941
942
if (batch.file_counts.failed > 0) {
943
const failed = await client.vectorStores.fileBatches.listFiles(batch.id, {
944
vector_store_id: 'vs_abc123',
945
filter: 'failed',
946
});
947
}
948
```
949
950
---
### Poll Batch Status

Manually polls a batch until processing completes.

```typescript { .api }
async function poll(
  storeID: string,
  batchID: string,
  options?: RequestOptions & { pollIntervalMs?: number }
): Promise<VectorStoreFileBatch>;
```

**Example:**

```typescript
const batch = await client.vectorStores.fileBatches.create('vs_abc123', {
  file_ids: ['file-1', 'file-2'],
});

// Poll for completion
const completed = await client.vectorStores.fileBatches.poll(
  'vs_abc123',
  batch.id,
  { pollIntervalMs: 3000 }
);

console.log(`Complete: ${completed.file_counts.completed}/${completed.file_counts.total}`);
```

---
## Type Reference

### VectorStore { .api }

```typescript
interface VectorStore {
  id: string; // Unique identifier
  object: 'vector_store';
  created_at: number; // Unix timestamp
  status: 'expired' | 'in_progress' | 'completed';
  usage_bytes: number; // Total storage used
  last_active_at: number | null; // Unix timestamp or null
  name: string; // Display name
  metadata: Record<string, string> | null;

  file_counts: {
    total: number; // All files
    completed: number; // Fully processed
    failed: number; // Processing failed
    in_progress: number; // Being processed
    cancelled: number; // Cancelled
  };

  expires_after?: {
    anchor: 'last_active_at';
    days: number;
  };

  expires_at?: number | null; // Unix timestamp at which the store expires
}
```

---
### VectorStoreFile { .api }

```typescript
interface VectorStoreFile {
  id: string; // Unique identifier
  object: 'vector_store.file';
  vector_store_id: string; // Parent store ID
  status: 'in_progress' | 'completed' | 'failed' | 'cancelled';
  created_at: number; // Unix timestamp
  usage_bytes: number; // Storage used

  last_error?: {
    code: 'server_error' | 'unsupported_file' | 'invalid_file';
    message: string;
  } | null;

  attributes?: Record<string, string | number | boolean> | null;
  chunking_strategy?: FileChunkingStrategy;
}
```

---
### VectorStoreFileBatch { .api }

```typescript
interface VectorStoreFileBatch {
  id: string; // Unique identifier
  object: 'vector_store.files_batch';
  vector_store_id: string; // Parent store ID
  status: 'in_progress' | 'completed' | 'failed' | 'cancelled';
  created_at: number; // Unix timestamp

  file_counts: {
    total: number;
    completed: number;
    failed: number;
    in_progress: number;
    cancelled: number;
  };
}
```

---
### VectorStoreSearchResponse { .api }

```typescript
interface VectorStoreSearchResponse {
  file_id: string; // Source file ID
  filename: string; // File name
  score: number; // Similarity score

  content: Array<{
    type: 'text';
    text: string;
  }>;

  attributes?: Record<string, string | number | boolean> | null;
}
```
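
Search results are often flattened into a single context string before being handed to a model. The sketch below shows one way to do that from the fields above: it drops low-scoring results, orders the rest by score, and labels each block with its filename. `buildContext`, the `SearchResult` mirror type, and the default threshold of `0.5` are illustrative assumptions, not SDK features.

```typescript
// Local mirror of the fields used from a search result.
interface SearchResult {
  file_id: string;
  filename: string;
  score: number;
  content: Array<{ type: 'text'; text: string }>;
}

// Joins the text of sufficiently relevant results into one string,
// highest score first, with a filename header per result.
function buildContext(results: SearchResult[], minScore = 0.5): string {
  return results
    .filter((result) => result.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((result) => {
      const body = result.content.map((chunk) => chunk.text).join('\n');
      return `[${result.filename}]\n${body}`;
    })
    .join('\n\n');
}

const sampleResults: SearchResult[] = [
  { file_id: 'f1', filename: 'faq.pdf', score: 0.62, content: [{ type: 'text', text: 'Reset via settings.' }] },
  { file_id: 'f2', filename: 'api.pdf', score: 0.91, content: [{ type: 'text', text: 'Use the reset endpoint.' }] },
  { file_id: 'f3', filename: 'old.txt', score: 0.2, content: [{ type: 'text', text: 'Unrelated.' }] },
];

console.log(buildContext(sampleResults));
```
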
---

### Chunking Strategies

#### Auto Chunking { .api }

Default strategy; the service chooses the chunk size and overlap:

```typescript
interface AutoFileChunkingStrategyParam {
  type: 'auto';
}
```

Currently uses a `max_chunk_size_tokens` of `800` and a `chunk_overlap_tokens` of `400`.

---
#### Static Chunking { .api }

Customize chunk size and overlap:

```typescript
interface StaticFileChunkingStrategy {
  max_chunk_size_tokens: number; // 100-4096, default 800
  chunk_overlap_tokens: number; // Default 400, must be <= half of max
}

type StaticFileChunkingStrategyParam = {
  type: 'static';
  static: StaticFileChunkingStrategy;
};
```

**Example:**

```typescript
// `as const` keeps `type` as the literal 'static' so the objects
// can be passed directly as a `chunking_strategy` value

// Large chunks for long documents
const largeChunks = {
  type: 'static',
  static: {
    max_chunk_size_tokens: 2048,
    chunk_overlap_tokens: 512,
  },
} as const;

// Small chunks for precise retrieval
const smallChunks = {
  type: 'static',
  static: {
    max_chunk_size_tokens: 256,
    chunk_overlap_tokens: 64,
  },
} as const;
```
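
The limits in the interface comments above (100 to 4096 tokens per chunk, overlap at most half the chunk size) can be checked before a request is sent, and the resulting number of chunks can be estimated. A minimal sketch: `validateStaticChunking` and `estimateChunkCount` are hypothetical helpers, and the estimate assumes a plain sliding window that advances by `max - overlap` tokens, which may differ from how the service actually splits text.

```typescript
interface StaticChunking {
  max_chunk_size_tokens: number;
  chunk_overlap_tokens: number;
}

// Returns a list of violated constraints; empty means the settings look valid.
function validateStaticChunking(config: StaticChunking): string[] {
  const problems: string[] = [];
  const { max_chunk_size_tokens: max, chunk_overlap_tokens: overlap } = config;
  if (max < 100 || max > 4096) {
    problems.push('max_chunk_size_tokens must be between 100 and 4096');
  }
  if (overlap < 0 || overlap > max / 2) {
    problems.push('chunk_overlap_tokens must be between 0 and half of max_chunk_size_tokens');
  }
  return problems;
}

// Estimates how many chunks a document of `totalTokens` produces.
// Assumption: a sliding window with stride (max - overlap).
function estimateChunkCount(totalTokens: number, config: StaticChunking): number {
  const { max_chunk_size_tokens: max, chunk_overlap_tokens: overlap } = config;
  const stride = max - overlap;
  if (stride <= 0) throw new Error('overlap must be smaller than the chunk size');
  if (totalTokens <= 0) return 0;
  if (totalTokens <= max) return 1;
  return 1 + Math.ceil((totalTokens - max) / stride);
}

const defaults = { max_chunk_size_tokens: 800, chunk_overlap_tokens: 400 };
console.log(validateStaticChunking(defaults)); // []
console.log(estimateChunkCount(2000, defaults)); // 4
```
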
**Guidelines:**
- `max_chunk_size_tokens`: Controls the size of each retrievable unit. Larger chunks keep more context together; smaller chunks give more precise matches
- `chunk_overlap_tokens`: Prevents losing context at chunk boundaries
- Overlap should be 25-50% of max size (the API rejects values above 50%)
- 800 tokens ≈ 600 words for typical English text

---
1142
1143
#### Other/Unknown Chunking Strategy { .api }
1144
1145
This strategy type is returned for files indexed before the `chunking_strategy` concept was introduced. It indicates the chunking method is unknown.
1146
1147
```typescript
1148
interface OtherFileChunkingStrategyObject {
1149
/** Always 'other' */
1150
type: 'other';
1151
}
1152
```
1153
1154
This type appears in the `FileChunkingStrategy` union (response type) but not in `FileChunkingStrategyParam` (input type). You cannot create files with `type: 'other'` - it only appears in responses for legacy files.
1155
1156
```typescript
1157
type FileChunkingStrategy = StaticFileChunkingStrategyObject | OtherFileChunkingStrategyObject;
1158
```
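
Because `FileChunkingStrategy` is a union discriminated by `type`, code that reads a file's chunking settings should narrow on `type` before touching `static`. A minimal sketch with local mirrors of the two response types; `describeChunking` is a hypothetical helper.

```typescript
// Local mirrors of the response types described above.
interface StaticFileChunkingStrategyObject {
  type: 'static';
  static: {
    max_chunk_size_tokens: number;
    chunk_overlap_tokens: number;
  };
}

interface OtherFileChunkingStrategyObject {
  type: 'other';
}

type FileChunkingStrategy =
  | StaticFileChunkingStrategyObject
  | OtherFileChunkingStrategyObject;

// Produces a short description, handling legacy files and missing values.
function describeChunking(strategy?: FileChunkingStrategy): string {
  if (!strategy) return 'not reported';
  switch (strategy.type) {
    case 'static':
      // Narrowed: `strategy.static` is only accessible in this branch.
      return `static: ${strategy.static.max_chunk_size_tokens} tokens per chunk, ${strategy.static.chunk_overlap_tokens} overlap`;
    case 'other':
      return 'unknown (legacy file)';
  }
}

console.log(describeChunking({ type: 'other' })); // unknown (legacy file)
```
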
---

## Complete Example: Building a Vector Store

```typescript
import { OpenAI, toFile } from 'openai';
import fs from 'fs';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function buildSupportDocs() {
  // 1. Create vector store
  const store = await client.vectorStores.create({
    name: 'Support Documentation',
    description: 'Company support and FAQ documentation',
    metadata: {
      department: 'support',
      version: '1.0',
    },
  });

  console.log(`Created store: ${store.id}`);

  // 2. Upload files with batch operation
  const batch = await client.vectorStores.fileBatches.uploadAndPoll(
    store.id,
    {
      files: [
        await toFile(fs.createReadStream('./docs/faq.pdf'), 'faq.pdf'),
        await toFile(fs.createReadStream('./docs/api.pdf'), 'api.pdf'),
        await toFile(fs.createReadStream('./docs/billing.txt'), 'billing.txt'),
      ],
    },
    // maxConcurrency is a polling/upload option, not part of the file list
    { pollIntervalMs: 2000, maxConcurrency: 3 }
  );

  console.log(`Batch status: ${batch.status}`);
  console.log(`Files: ${batch.file_counts.completed}/${batch.file_counts.total} processed`);

  if (batch.file_counts.failed > 0) {
    const failed = await client.vectorStores.fileBatches.listFiles(batch.id, {
      vector_store_id: store.id,
      filter: 'failed',
    });

    for (const file of failed.data) {
      console.error(`Failed: ${file.id} - ${file.last_error?.message}`);
    }
  }

  // 3. Search the vector store
  const results = await client.vectorStores.search(store.id, {
    query: 'How do I reset my password?',
    max_num_results: 3,
  });

  console.log('Search results:');
  for (const result of results.data) {
    console.log(`- ${result.filename} (score: ${result.score})`);
    for (const chunk of result.content) {
      console.log(`  ${chunk.text.substring(0, 100)}...`);
    }
  }

  // 4. Use with assistant
  const assistant = await client.beta.assistants.create({
    name: 'Support Bot',
    model: 'gpt-4-turbo',
    tools: [{ type: 'file_search' }],
    tool_resources: {
      file_search: {
        vector_store_ids: [store.id],
      },
    },
  });

  console.log(`Created assistant: ${assistant.id}`);

  return { store, assistant, batch };
}

buildSupportDocs().catch(console.error);
```

---
## Error Handling

```typescript
import { APIError, NotFoundError, RateLimitError } from 'openai';

try {
  const file = await client.vectorStores.files.retrieve('file-id', {
    vector_store_id: 'vs-id',
  });
} catch (error) {
  if (error instanceof NotFoundError) {
    console.error('File not found');
  } else if (error instanceof RateLimitError) {
    console.error('Rate limited - retry in a moment');
  } else if (error instanceof APIError) {
    console.error(`API Error: ${error.status} ${error.message}`);
  }
}
```
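
When application code adds its own retry logic on top of the handling above, it helps to separate failures worth retrying from those that will fail again. The sketch below classifies by HTTP status. The chosen set (408, 409, 429, and 5xx, plus errors with no status, treated here as connection failures) is an assumption modeled on commonly retried statuses, and `isRetryableStatus` is a hypothetical helper, not an SDK export.

```typescript
// Decides whether a failed request is worth retrying, based on the
// HTTP status carried by the error (for example `APIError.status`).
function isRetryableStatus(status: number | undefined): boolean {
  // No status at all: treated here as a connection-level failure.
  if (status === undefined) return true;
  // Timeouts, conflicts, rate limits, and server-side errors.
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

for (const status of [404, 429, 500]) {
  console.log(`${status}: ${isRetryableStatus(status) ? 'retry' : 'give up'}`);
}
```
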
---

## Best Practices

1. **Batch Operations**: Use batch operations for multiple files to handle concurrency efficiently
2. **Polling**: Wait until files have left the `in_progress` status before searching; files still being processed are not yet searchable
3. **Chunking**: Adjust chunk size based on your content - smaller for precise retrieval, larger for context
4. **Metadata**: Use attributes to organize and filter files
5. **Error Handling**: Check `last_error` after processing completes
6. **Expiration**: Set expiration policies for temporary data stores
7. **Search Options**: Use `rewrite_query` to improve loosely worded queries and `filters` to narrow results by attribute