# Fine-Tuning

Create and manage fine-tuning jobs to adapt OpenAI models to your specific use case with your own training data. Fine-tuning supports supervised learning, Direct Preference Optimization (DPO), and reinforcement learning methods.

## Capabilities

### Fine-Tuning Job Management

Complete lifecycle management for fine-tuning jobs, from creation through monitoring to completion. Control job execution with pause, resume, and cancel operations.

```typescript { .api }
function create(params: FineTuningJobCreateParams): Promise<FineTuningJob>;
function retrieve(jobID: string): Promise<FineTuningJob>;
function list(params?: FineTuningJobListParams): Promise<FineTuningJobsPage>;
function cancel(jobID: string): Promise<FineTuningJob>;
function pause(jobID: string): Promise<FineTuningJob>;
function resume(jobID: string): Promise<FineTuningJob>;
```

**Available at:** `client.fineTuning.jobs`

### Job Monitoring and Events

Track job progress through detailed event logs with status updates, metrics, and error information. Events include training progress, validation results, and completion notifications.

```typescript { .api }
function listEvents(jobID: string, params?: JobEventListParams): Promise<FineTuningJobEventsPage>;
```

**Available at:** `client.fineTuning.jobs.listEvents()`

### Checkpoint Management

Access intermediate model checkpoints during fine-tuning to evaluate progress and use partially-trained models. Each checkpoint includes training metrics at specific steps.

```typescript { .api }
function list(jobID: string, params?: CheckpointListParams): Promise<FineTuningJobCheckpointsPage>;
```

**Available at:** `client.fineTuning.jobs.checkpoints.list()`

### Checkpoint Permissions

Manage sharing permissions for fine-tuned checkpoints, allowing you to grant or revoke access to specific checkpoints for other projects in your organization.

```typescript { .api }
// Create permission for a checkpoint
function create(
  fineTunedModelCheckpoint: string,
  body: PermissionCreateParams,
  options?: RequestOptions
): Promise<PermissionCreateResponsesPage>;

// Retrieve permission details
function retrieve(
  fineTunedModelCheckpoint: string,
  query?: PermissionRetrieveParams,
  options?: RequestOptions
): Promise<PermissionRetrieveResponse>;

// Delete/revoke permission
function delete(
  permissionID: string,
  params: PermissionDeleteParams,
  options?: RequestOptions
): Promise<PermissionDeleteResponse>;
```

**Available at:** `client.fineTuning.checkpoints.permissions`
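
A minimal usage sketch; it assumes `PermissionCreateParams` takes a `project_ids` list and `PermissionDeleteParams` identifies the checkpoint, and all IDs below are placeholders:

```typescript
const checkpoint = 'ft:gpt-4o-mini-2024-07-18:my-org:custom:abc123';

// Grant access to the checkpoint for specific projects (assumed param shape)
await client.fineTuning.checkpoints.permissions.create(checkpoint, {
  project_ids: ['proj_abc123'],
});

// Inspect existing permissions on the checkpoint
const permissions = await client.fineTuning.checkpoints.permissions.retrieve(checkpoint);
console.log(permissions);

// Revoke a permission by its ID
await client.fineTuning.checkpoints.permissions.delete('cp_perm_123', {
  fine_tuned_model_checkpoint: checkpoint,
});
```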

### Alpha Features - Grader Validation

Experimental grader tools for validating and testing graders before using them in fine-tuning jobs. These features are in alpha and subject to change.

```typescript { .api }
// Run a grader on test data
function run(body: GraderRunParams): Promise<GraderRunResponse>;

// Validate grader configuration
function validate(body: GraderValidateParams): Promise<GraderValidateResponse>;

interface GraderRunParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
  model_sample: string;
  item?: unknown;
}

interface GraderValidateParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
}
```

**Available at:** `client.fineTuning.alpha.graders`

**Note:** These are alpha/experimental features. The API may change in future versions.
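
A quick sketch of the validate-then-run flow; the `{{ ... }}` template variables and test values are illustrative assumptions:

```typescript
// A simple exact-match grader to test
const grader = {
  type: 'string_check',
  name: 'exact-match',
  input: '{{ sample.output_text }}', // assumed template variable
  operation: 'eq',
  reference: '{{ item.expected }}', // assumed template variable
} as const;

// Check that the configuration is well-formed
const validation = await client.fineTuning.alpha.graders.validate({ grader });
console.log('Validated grader:', validation);

// Score a sample model output against a test item
const result = await client.fineTuning.alpha.graders.run({
  grader,
  model_sample: 'Paris',
  item: { expected: 'Paris' },
});
console.log('Grader result:', result);
```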

---

## Core Types

### FineTuningJob { .api }

Represents a fine-tuning job that has been created through the API.

```typescript { .api }
interface FineTuningJob {
  id: string;
  created_at: number;
  finished_at: number | null;
  error: FineTuningJob.Error | null;
  fine_tuned_model: string | null;
  hyperparameters: FineTuningJob.Hyperparameters;
  model: string;
  object: 'fine_tuning.job';
  organization_id: string;
  result_files: Array<string>;
  seed: number;
  status: 'validating_files' | 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled';
  trained_tokens: number | null;
  training_file: string;
  validation_file: string | null;
  estimated_finish?: number | null;
  integrations?: Array<FineTuningJobIntegration> | null;
  metadata?: Record<string, string> | null;
  method?: FineTuningJob.Method;
}

namespace FineTuningJob {
  interface Error {
    code: string;
    message: string;
    param: string | null;
  }

  interface Hyperparameters {
    batch_size?: 'auto' | number | null;
    learning_rate_multiplier?: 'auto' | number;
    n_epochs?: 'auto' | number;
  }

  interface Method {
    type: 'supervised' | 'dpo' | 'reinforcement';
    dpo?: DpoMethod;
    reinforcement?: ReinforcementMethod;
    supervised?: SupervisedMethod;
  }
}
```

### FineTuningJobEvent { .api }

Event log entry for a fine-tuning job containing status updates and metrics.

```typescript { .api }
interface FineTuningJobEvent {
  id: string;
  created_at: number;
  level: 'info' | 'warn' | 'error';
  message: string;
  object: 'fine_tuning.job.event';
  data?: unknown;
  type?: 'message' | 'metrics';
}
```

### FineTuningJobCheckpoint { .api }

Represents an intermediate model checkpoint during a fine-tuning job, ready for evaluation or use.

```typescript { .api }
interface FineTuningJobCheckpoint {
  id: string;
  created_at: number;
  fine_tuned_model_checkpoint: string;
  fine_tuning_job_id: string;
  metrics: FineTuningJobCheckpoint.Metrics;
  object: 'fine_tuning.job.checkpoint';
  step_number: number;
}

namespace FineTuningJobCheckpoint {
  interface Metrics {
    full_valid_loss?: number;
    full_valid_mean_token_accuracy?: number;
    step?: number;
    train_loss?: number;
    train_mean_token_accuracy?: number;
    valid_loss?: number;
    valid_mean_token_accuracy?: number;
  }
}
```

### Training Method Types

#### SupervisedMethod { .api }

Standard supervised fine-tuning configuration for training on input-output pairs.

```typescript { .api }
interface SupervisedMethod {
  hyperparameters?: SupervisedHyperparameters;
}

interface SupervisedHyperparameters {
  batch_size?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### DpoMethod { .api }

Direct Preference Optimization configuration for training with preference pairs (preferred vs. dispreferred responses).

```typescript { .api }
interface DpoMethod {
  hyperparameters?: DpoHyperparameters;
}

interface DpoHyperparameters {
  batch_size?: 'auto' | number;
  beta?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### ReinforcementMethod { .api }

Reinforcement learning configuration for training with reward scoring.

```typescript { .api }
interface ReinforcementMethod {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | MultiGrader;
  hyperparameters?: ReinforcementHyperparameters;
}

interface ReinforcementHyperparameters {
  batch_size?: 'auto' | number;
  compute_multiplier?: 'auto' | number;
  eval_interval?: 'auto' | number;
  eval_samples?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
  reasoning_effort?: 'default' | 'low' | 'medium' | 'high';
}
```

### Grader Types

Graders are used in reinforcement learning fine-tuning to automatically score model outputs and provide rewards for training.

#### LabelModelGrader { .api }

Uses a language model to assign labels to evaluation items. Useful for classification-style evaluation where outputs should fall into specific categories.

```typescript { .api }
interface LabelModelGrader {
  input: Array<LabelModelGraderInput>;
  labels: string[];
  model: string;
  name: string;
  passing_labels: string[];
  type: 'label_model';
}

interface LabelModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface OutputText {
  text: string;
  type: 'output_text';
}

interface InputImage {
  image_url: string;
  type: 'input_image';
  detail?: string;
}
```

**Properties:**
- `input`: Array of message inputs to the grader model; can include template strings
- `labels`: Available labels to assign to each evaluation item
- `model`: The model to use for evaluation (must support structured outputs)
- `name`: Identifier for the grader
- `passing_labels`: Labels that indicate a passing result (must be a subset of `labels`)
- `type`: Always `'label_model'`
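
For example, a classification-style grader that passes only outputs the judge model labels as helpful (the model choice and template variable are illustrative):

```typescript
// Matches the LabelModelGrader shape above; all values are placeholders
const labelGrader = {
  type: 'label_model',
  name: 'helpfulness-check',
  model: 'gpt-4o-mini',
  labels: ['helpful', 'unhelpful'],
  passing_labels: ['helpful'],
  input: [
    { role: 'system', content: 'Label the response as helpful or unhelpful.' },
    { role: 'user', content: '{{ sample.output_text }}' }, // assumed template variable
  ],
} as const;
```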

#### StringCheckGrader { .api }

Performs string comparison operations between input and reference text.

```typescript { .api }
interface StringCheckGrader {
  input: string;
  name: string;
  operation: 'eq' | 'ne' | 'like' | 'ilike';
  reference: string;
  type: 'string_check';
}
```

**Properties:**
- `operation`: `'eq'` (equals), `'ne'` (not equals), `'like'` (SQL LIKE), `'ilike'` (case-insensitive LIKE)

#### TextSimilarityGrader { .api }

Grades text based on similarity metrics. Supports various metrics for comparing model output with reference text.

```typescript { .api }
interface TextSimilarityGrader {
  evaluation_metric: 'cosine' | 'fuzzy_match' | 'bleu' | 'gleu' | 'meteor' | 'rouge_1' | 'rouge_2' | 'rouge_3' | 'rouge_4' | 'rouge_5' | 'rouge_l';
  input: string;
  name: string;
  reference: string;
  type: 'text_similarity';
}
```
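
A minimal configuration sketch (the template variables are assumptions):

```typescript
// Fuzzy-match the model output against a reference answer
const similarityGrader = {
  type: 'text_similarity',
  name: 'answer-similarity',
  evaluation_metric: 'fuzzy_match',
  input: '{{ sample.output_text }}', // assumed template variable
  reference: '{{ item.reference_answer }}', // assumed template variable
} as const;
```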

#### PythonGrader { .api }

Executes custom Python code for evaluation. Provides maximum flexibility for complex grading logic.

```typescript { .api }
interface PythonGrader {
  name: string;
  source: string;
  type: 'python';
  image_tag?: string;
}
```

**Properties:**
- `source`: Python code to execute for grading
- `image_tag`: Optional Docker image tag for the Python environment
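
A sketch of a custom grader; the `grade(sample, item)` entry point shown in `source` is an assumption, so confirm the expected contract in the API reference:

```typescript
const pythonGrader = {
  type: 'python',
  name: 'keyword-grader',
  // Assumed contract: a grade function returning a float reward
  source: `
def grade(sample, item) -> float:
    # Reward outputs that mention the expected keyword
    return 1.0 if item["keyword"] in sample["output_text"] else 0.0
`.trim(),
} as const;
```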

#### ScoreModelGrader { .api }

Uses a language model to assign numerical scores to outputs. Useful for open-ended evaluation criteria.

```typescript { .api }
interface ScoreModelGrader {
  input: Array<ScoreModelGraderInput>;
  model: string;
  name: string;
  type: 'score_model';
  range?: [number, number];
  sampling_params?: SamplingParams;
}

interface ScoreModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface SamplingParams {
  max_completions_tokens?: number | null;
  reasoning_effort?: 'none' | 'minimal' | 'low' | 'medium' | 'high' | null;
  seed?: number | null;
  temperature?: number | null;
  top_p?: number | null;
}
```

**Properties:**
- `range`: Score range (defaults to `[0, 1]`)
- `sampling_params`: Optional parameters for controlling model behavior
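
For example, a 0-10 quality judge with deterministic sampling (the model and template variable are illustrative):

```typescript
const scoreGrader = {
  type: 'score_model',
  name: 'quality-score',
  model: 'gpt-4o-mini',
  range: [0, 10],
  input: [
    { role: 'system', content: 'Rate the quality of the response from 0 to 10.' },
    { role: 'user', content: '{{ sample.output_text }}' }, // assumed template variable
  ],
  sampling_params: { temperature: 0, seed: 42 },
} as const;
```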
373
374
#### MultiGrader { .api }
375
376
Combines multiple graders using a formula to produce a final score.
377
378
```typescript { .api }
379
interface MultiGrader {
380
calculate_output: string;
381
graders: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader;
382
name: string;
383
type: 'multi';
384
}
385
```
386
387
**Properties:**
388
- `calculate_output`: Formula to calculate final score from grader results
389
- `graders`: The individual graders to combine
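
A sketch combining an exact-match check with a fuzzy similarity score; the weighted-sum syntax in `calculate_output` is an assumption:

```typescript
const multiGrader = {
  type: 'multi',
  name: 'combined-score',
  // Keys name each sub-grader; calculate_output refers to them by name
  graders: {
    exact: {
      type: 'string_check',
      name: 'exact',
      input: '{{ sample.output_text }}',
      operation: 'eq',
      reference: '{{ item.answer }}',
    },
    similarity: {
      type: 'text_similarity',
      name: 'similarity',
      evaluation_metric: 'fuzzy_match',
      input: '{{ sample.output_text }}',
      reference: '{{ item.answer }}',
    },
  },
  calculate_output: '0.5 * exact + 0.5 * similarity',
} as const;
```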

### Pagination Types

```typescript { .api }
type FineTuningJobsPage = CursorPage<FineTuningJob>;
type FineTuningJobEventsPage = CursorPage<FineTuningJobEvent>;
type FineTuningJobCheckpointsPage = CursorPage<FineTuningJobCheckpoint>;
```

---

## Examples

### Creating a Fine-Tuning Job (Supervised)

Train a model using standard supervised learning with input-output pairs.

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Create a fine-tuning job
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-abc123', // JSONL file with training data
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 8,
        learning_rate_multiplier: 1.0,
        n_epochs: 3,
      },
    },
  },
  suffix: 'my-fine-tuned-model',
});

console.log(`Job created: ${job.id}`);
console.log(`Status: ${job.status}`);
console.log(`Model: ${job.model}`);
```

### Creating a DPO Fine-Tuning Job

Train using Direct Preference Optimization with preference pairs.

```typescript
const dpoJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-dpo-pairs-123', // JSONL with preference pairs
  method: {
    type: 'dpo',
    dpo: {
      hyperparameters: {
        batch_size: 16,
        beta: 0.1,
        learning_rate_multiplier: 0.5,
        n_epochs: 1,
      },
    },
  },
});

console.log(`DPO Job: ${dpoJob.id}`);
```

### Creating a Reinforcement Learning Fine-Tuning Job

Train using reinforcement learning with reward scoring.

```typescript
const rlJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-rl-data-123',
  method: {
    type: 'reinforcement',
    reinforcement: {
      grader: {
        type: 'string_check', // or 'text_similarity', 'python', 'score_model', 'multi'
        name: 'string-check-grader',
        input: '{{ sample.output_text }}',
        operation: 'eq',
        reference: 'expected_output',
      },
      hyperparameters: {
        batch_size: 'auto',
        n_epochs: 2,
        learning_rate_multiplier: 0.8,
        eval_interval: 100,
        eval_samples: 50,
      },
    },
  },
});

console.log(`RL Job: ${rlJob.id}`);
```

### Retrieving Job Details

Get complete information about a specific fine-tuning job.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';
const job = await client.fineTuning.jobs.retrieve(jobId);

console.log(`Job Status: ${job.status}`);
console.log(`Created: ${new Date(job.created_at * 1000).toISOString()}`);
console.log(`Fine-tuned Model: ${job.fine_tuned_model}`);
console.log(`Trained Tokens: ${job.trained_tokens}`);

if (job.error) {
  console.error(`Error: ${job.error.message}`);
}
```

### Monitoring Job Progress with Events

Track job execution through event logs including training metrics.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Iterate through all events
for await (const event of client.fineTuning.jobs.listEvents(jobId)) {
  console.log(`[${event.level}] ${event.message}`);

  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}

// List with pagination parameters
const eventPage = await client.fineTuning.jobs.listEvents(jobId, {
  limit: 10,
});

console.log(`Retrieved ${eventPage.data.length} events`);
```

### Working with Checkpoints

Access intermediate model checkpoints and their metrics.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Get all checkpoints for a job
for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {
  console.log(`Checkpoint: ${checkpoint.fine_tuned_model_checkpoint}`);
  console.log(`Step: ${checkpoint.step_number}`);
  console.log(`Training Loss: ${checkpoint.metrics.train_loss}`);
  console.log(`Validation Loss: ${checkpoint.metrics.valid_loss}`);
  console.log(`Token Accuracy: ${checkpoint.metrics.valid_mean_token_accuracy}`);
}

// List checkpoints with pagination
const checkpointPage = await client.fineTuning.jobs.checkpoints.list(jobId, {
  limit: 5,
});

// Missing validation losses sort last rather than winning the comparison
const bestCheckpoint = checkpointPage.data.reduce((best, current) => {
  return (current.metrics.valid_loss ?? Infinity) < (best.metrics.valid_loss ?? Infinity)
    ? current
    : best;
});

console.log(`Best checkpoint by validation loss: ${bestCheckpoint.id}`);
```

### Listing Fine-Tuning Jobs

Retrieve all fine-tuning jobs in your organization with filtering.

```typescript
// List all jobs
for await (const job of client.fineTuning.jobs.list()) {
  console.log(`${job.id}: ${job.status} (Model: ${job.model})`);
}

// List with filters
const jobsPage = await client.fineTuning.jobs.list({
  limit: 20,
});

const runningJobs = jobsPage.data.filter(j => j.status === 'running');
console.log(`Active jobs: ${runningJobs.length}`);

// Filter by metadata
const metadataFilteredJobs = await client.fineTuning.jobs.list({
  metadata: {
    'project': 'chatbot-v2',
  },
});
```

### Controlling Job Execution

Pause, resume, and cancel jobs as needed.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Pause a running job
const pausedJob = await client.fineTuning.jobs.pause(jobId);
console.log(`Job paused: ${pausedJob.status}`);

// Wait a bit...
await new Promise(resolve => setTimeout(resolve, 5000));

// Resume the job
const resumedJob = await client.fineTuning.jobs.resume(jobId);
console.log(`Job resumed: ${resumedJob.status}`); // status: 'running'

// Cancel a job (can cancel running, queued, or paused jobs)
const cancelledJob = await client.fineTuning.jobs.cancel(jobId);
console.log(`Job cancelled: ${cancelledJob.status}`); // status: 'cancelled'
```

---

## Training Data Format

Fine-tuning data must be formatted as JSONL (JSON Lines) files. Different formats are required depending on the training method and model type.

### Supervised Training - Chat Format

For chat models such as `gpt-4o-mini` and `gpt-3.5-turbo` with supervised learning:

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about biology"}, {"role": "assistant", "content": "The answer is..."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about physics"}, {"role": "assistant", "content": "The answer is..."}]}
```

### Supervised Training - Completions Format

For legacy completions models such as `babbage-002` and `davinci-002`:

```jsonl
{"prompt": "Write a poem about:", "completion": " nature and its beauty"}
{"prompt": "What is the capital of France?", "completion": " Paris"}
```

### DPO Training - Preference Format

For Direct Preference Optimization, each line pairs a preferred response with a non-preferred one:

```jsonl
{"input": {"messages": [{"role": "user", "content": "Question"}]}, "preferred_output": [{"role": "assistant", "content": "Better answer"}], "non_preferred_output": [{"role": "assistant", "content": "Worse answer"}]}
{"input": {"messages": [{"role": "user", "content": "Another question"}]}, "preferred_output": [{"role": "assistant", "content": "Preferred response"}], "non_preferred_output": [{"role": "assistant", "content": "Dispreferred response"}]}
```

### Reinforcement Learning Format

For RL training, provide prompts only; rewards are assigned by the grader:

```jsonl
{"messages": [{"role": "user", "content": "Write a story about adventure"}]}
{"messages": [{"role": "user", "content": "Explain quantum computing"}]}
```

### Data Preparation Best Practices

```typescript
import * as fs from 'fs';
import OpenAI from 'openai';

// Example: Convert CSV training data to JSONL format
// (naive comma split; use a proper CSV parser for quoted fields)
function csvToJsonl(csvFilePath: string): void {
  const lines = fs.readFileSync(csvFilePath, 'utf-8').split('\n');
  const [header, ...rows] = lines;
  const headers = header.split(',');

  const jsonlLines = rows
    .filter(row => row.trim())
    .map(row => {
      const values = row.split(',');
      const obj: any = {};
      headers.forEach((h, i) => {
        obj[h.trim()] = values[i]?.trim();
      });
      return JSON.stringify({
        messages: [
          { role: 'user', content: obj.prompt },
          { role: 'assistant', content: obj.completion },
        ],
      });
    })
    .join('\n');

  fs.writeFileSync('training_data.jsonl', jsonlLines);
}

// Validate JSONL format
function validateJsonl(jsonlPath: string): boolean {
  const lines = fs.readFileSync(jsonlPath, 'utf-8').split('\n');
  return lines
    .filter(line => line.trim())
    .every(line => {
      try {
        JSON.parse(line);
        return true;
      } catch {
        return false;
      }
    });
}

// Upload training file
async function uploadTrainingFile(client: OpenAI, jsonlPath: string): Promise<string> {
  const fileContent = fs.createReadStream(jsonlPath);
  const response = await client.files.create({
    file: fileContent,
    purpose: 'fine-tune',
  });
  return response.id;
}
```

---

## Hyperparameter Tuning

Fine-tuning outcomes depend heavily on hyperparameter selection. Here's a guide to tuning each parameter.

### Batch Size

Controls how many examples are processed before updating model weights.

- **Effect**: Larger batch sizes lead to more stable training but slower convergence
- **'auto' (recommended)**: OpenAI automatically selects a batch size based on your dataset
- **Typical Range**: 1-256
- **Tradeoff**: Larger batches = lower variance, less frequent updates
- **Guidance**: Start with 'auto', then experiment with 8, 16, 32 if needed

```typescript
// Conservative tuning with larger batch size
hyperparameters: {
  batch_size: 32, // More stable, slower
}

// Aggressive tuning with smaller batch size
hyperparameters: {
  batch_size: 8, // Faster convergence, more noise
}
```

### Learning Rate Multiplier

Scales the base learning rate for the fine-tuning process.

- **Effect**: Controls the magnitude of weight updates
- **Typical Range**: 0.02 to 2.0
- **'auto' (recommended)**: Automatically selected based on model
- **Guidance**:
  - Below 1.0: More conservative, less overfitting risk
  - 1.0: Default, balanced training
  - Above 1.0: More aggressive, faster convergence

```typescript
// Conservative fine-tuning (prefer stability)
hyperparameters: {
  learning_rate_multiplier: 0.5, // Half the default rate
}

// Aggressive fine-tuning (prefer speed)
hyperparameters: {
  learning_rate_multiplier: 2.0, // Double the default rate
}
```

### Number of Epochs

How many complete passes through the training data to perform.

- **Effect**: More epochs generally improve performance but risk overfitting
- **Typical Range**: 1-10
- **'auto'**: Automatically selected
- **Guidance**:
  - 1 epoch: Fast, may underfit
  - 3-4 epochs: Balanced (recommended)
  - 5+ epochs: Risk of overfitting on small datasets

```typescript
// Small dataset - few epochs to avoid overfitting
hyperparameters: {
  n_epochs: 1,
}

// Large dataset - more epochs for better convergence
hyperparameters: {
  n_epochs: 4,
}
```

### DPO-Specific: Beta Parameter

The beta value weights the penalty between the policy and the reference model: higher beta penalizes deviation from the reference model more strongly.

- **Effect**: Higher beta keeps updates conservative and close to the reference model
- **Typical Range**: 0.05 to 0.5
- **'auto' (recommended)**: Automatically tuned
- **Guidance**:
  - Low beta (0.05): More exploration, less constraint
  - High beta (0.3+): Conservative updates that stay near the reference model

```typescript
dpo: {
  hyperparameters: {
    beta: 0.1, // Moderate preference alignment
  },
}
```

### Hyperparameter Tuning Workflow

```typescript
import OpenAI from 'openai';

async function tuneFinetuningModel(
  client: OpenAI,
  trainingFile: string,
  validationFile: string,
): Promise<string> {
  const configurations = [
    {
      name: 'conservative',
      batch_size: 32,
      learning_rate_multiplier: 0.5,
      n_epochs: 2,
    },
    {
      name: 'balanced',
      batch_size: 16,
      learning_rate_multiplier: 1.0,
      n_epochs: 3,
    },
    {
      name: 'aggressive',
      batch_size: 8,
      learning_rate_multiplier: 2.0,
      n_epochs: 4,
    },
  ];

  const results: Array<{ config: string; jobId: string; metrics: any }> = [];

  for (const config of configurations) {
    console.log(`Starting ${config.name} configuration...`);

    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
      validation_file: validationFile,
      method: {
        type: 'supervised',
        supervised: {
          hyperparameters: {
            batch_size: config.batch_size,
            learning_rate_multiplier: config.learning_rate_multiplier,
            n_epochs: config.n_epochs,
          },
        },
      },
      suffix: `tune-${config.name}`,
      metadata: {
        'experiment': 'hyperparameter-tuning',
        'config': config.name,
      },
    });

    results.push({
      config: config.name,
      jobId: job.id,
      metrics: {
        batch_size: config.batch_size,
        learning_rate_multiplier: config.learning_rate_multiplier,
        n_epochs: config.n_epochs,
      },
    });
  }

  // Wait for jobs and compare results
  for (const result of results) {
    let job = await client.fineTuning.jobs.retrieve(result.jobId);

    // Poll until the job reaches a terminal state
    while (job.status !== 'succeeded' && job.status !== 'failed' && job.status !== 'cancelled') {
      await new Promise(resolve => setTimeout(resolve, 30000)); // Wait 30s
      job = await client.fineTuning.jobs.retrieve(result.jobId);
    }

    if (job.status === 'succeeded') {
      console.log(`${result.config} job succeeded: ${job.fine_tuned_model}`);
      console.log(`  Trained tokens: ${job.trained_tokens}`);
    }
  }

  // Return the first job ID; in practice, pick the config with the best validation metrics
  return results[0].jobId;
}
```

---

## Advanced Usage

### Using Metadata for Job Organization

Tag and filter jobs with custom metadata for better organization and tracking.

```typescript
// Create job with metadata
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  metadata: {
    'project': 'customer-support',
    'version': '1.0',
    'team': 'ai-products',
    'environment': 'production',
  },
});

// Later, filter jobs by metadata
const productionJobs = await client.fineTuning.jobs.list({
  metadata: {
    'environment': 'production',
  },
});

for await (const job of productionJobs) {
  console.log(`${job.id} - ${job.metadata?.project}`);
}
```

### Weights & Biases Integration

Monitor fine-tuning jobs in real time using Weights & Biases.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  integrations: [
    {
      type: 'wandb',
      wandb: {
        project: 'openai-fine-tuning',
        entity: 'my-team',
        name: 'gpt4-mini-v1',
        tags: ['production', 'customer-support'],
      },
    },
  ],
});

console.log(`Monitor at: https://wandb.ai/my-team/openai-fine-tuning`);
```

### Reproducible Training

Use seeds for reproducible fine-tuning results.

```typescript
const seed = 42;

// Job 1
const job1 = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  seed: seed,
  suffix: 'run1',
});

// Job 2 with the same seed should produce very similar results
const job2 = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  seed: seed,
  suffix: 'run2',
});

// With identical data and seed, the two runs should yield equivalent models
// (reproducibility is best-effort)
```

### Validation File Usage

Provide validation data to monitor generalization during training.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-train-123',
  validation_file: 'file-val-456', // Optional but recommended
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 16,
        n_epochs: 3,
        learning_rate_multiplier: 1.0,
      },
    },
  },
});

// Monitor metrics in events; metric events include validation
// stats when a validation file is provided
for await (const event of client.fineTuning.jobs.listEvents(job.id)) {
  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}
```

### Checkpoint-Based Model Selection

Use checkpoints to select the best intermediate model rather than the final one.

```typescript
async function findBestCheckpoint(
  client: OpenAI,
  jobId: string,
): Promise<string> {
  let bestCheckpoint: any = null;
  let bestValidationLoss = Infinity;

  for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {
    // Checkpoints without a validation loss sort last
    const validationLoss = checkpoint.metrics.valid_loss ?? Infinity;

    if (validationLoss < bestValidationLoss) {
      bestValidationLoss = validationLoss;
      bestCheckpoint = checkpoint;
    }
  }

  if (bestCheckpoint) {
    console.log(
      `Best checkpoint at step ${bestCheckpoint.step_number}: ${bestCheckpoint.fine_tuned_model_checkpoint}`,
    );
    return bestCheckpoint.fine_tuned_model_checkpoint;
  }

  throw new Error('No checkpoints found');
}

// Use the checkpoint model
const bestModel = await findBestCheckpoint(client, 'ftjob-123');
const completion = await client.chat.completions.create({
  model: bestModel,
  messages: [{ role: 'user', content: 'Hello' }],
});
```

### Long-Running Job Polling

Monitor job completion with exponential backoff polling.

```typescript
async function pollJobUntilComplete(
  client: OpenAI,
  jobId: string,
  maxWaitMs = 7200000, // 2 hours
): Promise<FineTuningJob> {
  const startTime = Date.now();
  let pollInterval = 5000; // Start at 5 seconds
  const maxPollInterval = 60000; // Cap at 60 seconds

  while (Date.now() - startTime < maxWaitMs) {
    const job = await client.fineTuning.jobs.retrieve(jobId);

    if (job.status === 'succeeded' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }

    console.log(`Job ${jobId} status: ${job.status}`);
    if (job.status === 'running' && job.estimated_finish) {
      const remaining = job.estimated_finish * 1000 - Date.now();
      console.log(`Estimated time remaining: ${Math.ceil(remaining / 1000)} seconds`);
    }

    await new Promise(resolve => setTimeout(resolve, pollInterval));

    // Exponential backoff
    pollInterval = Math.min(pollInterval * 1.5, maxPollInterval);
  }

  throw new Error(`Job ${jobId} did not complete within ${maxWaitMs}ms`);
}

// Usage
const completedJob = await pollJobUntilComplete(client, 'ftjob-123');
console.log(`Job completed with status: ${completedJob.status}`);
```

### Bulk Job Monitoring

Track multiple fine-tuning jobs simultaneously.

```typescript
async function monitorMultipleJobs(client: OpenAI, jobIds: string[]): Promise<void> {
  const statusMap = new Map<string, string>();
  jobIds.forEach(id => statusMap.set(id, 'unknown'));

  const updateStatus = async () => {
    for (const jobId of jobIds) {
      const job = await client.fineTuning.jobs.retrieve(jobId);
      statusMap.set(jobId, job.status);
    }
  };

  const allComplete = () =>
    Array.from(statusMap.values()).every(
      status =>
        status === 'succeeded' ||
        status === 'failed' ||
        status === 'cancelled',
    );

  while (!allComplete()) {
    await updateStatus();

    console.clear();
    console.log('Fine-Tuning Jobs Status:');
    for (const [id, status] of statusMap) {
      const symbol =
        status === 'succeeded'
          ? '✓'
          : status === 'failed'
            ? '✗'
            : status === 'running'
              ? '→'
              : '-';
      console.log(`${symbol} ${id}: ${status}`);
    }

    if (!allComplete()) {
      await new Promise(resolve => setTimeout(resolve, 30000)); // Check every 30s
    }
  }

  console.log('\nAll jobs completed!');
}

// Usage
await monitorMultipleJobs(client, [
  'ftjob-123',
  'ftjob-456',
  'ftjob-789',
]);
```

---

## Supported Models

Fine-tuning is available for the following models:

- `gpt-4o-mini` (Recommended for most use cases)
- `gpt-3.5-turbo`
- `davinci-002`
- `babbage-002`

Model availability and capabilities may change. Check the [OpenAI documentation](https://platform.openai.com/docs/guides/fine-tuning) for the most current list.

---

## Error Handling

```typescript
import OpenAI from 'openai';

async function createJobWithErrorHandling(
  client: OpenAI,
  trainingFile: string,
) {
  try {
    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
    });
    return job;
  } catch (error) {
    if (error instanceof OpenAI.BadRequestError) {
      console.error('Invalid request:', error.message);
      // Usually validation errors in training data format
    } else if (error instanceof OpenAI.NotFoundError) {
      console.error('Training file not found:', error.message);
    } else {
      throw error;
    }
  }
}

// Monitor job for errors
const jobId = 'ftjob-abc123';
const job = await client.fineTuning.jobs.retrieve(jobId);

if (job.status === 'failed' && job.error) {
  console.error(
    `Job failed: ${job.error.code} - ${job.error.message}`,
  );
  console.error(`Failed parameter: ${job.error.param}`);
}
```

---

## See Also

- [Chat Completions](./chat-completions.md) - Use fine-tuned models for chat
- [Files and Uploads](./files-uploads.md) - Upload training data
- [Embeddings](./embeddings.md) - Generate embeddings from your data