# Fine-Tuning

Create and manage fine-tuning jobs to adapt OpenAI models to your specific use case with your own training data. Fine-tuning supports supervised learning, Direct Preference Optimization (DPO), and reinforcement learning methods.

## Capabilities

### Fine-Tuning Job Management

Complete lifecycle management for fine-tuning jobs, from creation through monitoring to completion. Control job execution with pause, resume, and cancel operations.

```typescript { .api }
function create(params: FineTuningJobCreateParams): Promise<FineTuningJob>;
function retrieve(jobID: string): Promise<FineTuningJob>;
function list(params?: FineTuningJobListParams): Promise<FineTuningJobsPage>;
function cancel(jobID: string): Promise<FineTuningJob>;
function pause(jobID: string): Promise<FineTuningJob>;
function resume(jobID: string): Promise<FineTuningJob>;
```

**Available at:** `client.fineTuning.jobs`

### Job Monitoring and Events

Track job progress through detailed event logs with status updates, metrics, and error information. Events include training progress, validation results, and completion notifications.

```typescript { .api }
function listEvents(jobID: string, params?: JobEventListParams): Promise<FineTuningJobEventsPage>;
```

**Available at:** `client.fineTuning.jobs.listEvents()`

### Checkpoint Management

Access intermediate model checkpoints during fine-tuning to evaluate progress and use partially-trained models. Each checkpoint includes training metrics at specific steps.

```typescript { .api }
function list(jobID: string, params?: CheckpointListParams): Promise<FineTuningJobCheckpointsPage>;
```

**Available at:** `client.fineTuning.jobs.checkpoints.list()`

### Checkpoint Permissions

Manage sharing permissions for fine-tuned checkpoints, allowing you to grant or revoke access to specific checkpoints for other projects in your organization.

```typescript { .api }
// Create permission for a checkpoint
function create(
  fineTunedModelCheckpoint: string,
  body: PermissionCreateParams,
  options?: RequestOptions
): Promise<PermissionCreateResponsesPage>;

// Retrieve permission details
function retrieve(
  fineTunedModelCheckpoint: string,
  query?: PermissionRetrieveParams,
  options?: RequestOptions
): Promise<PermissionRetrieveResponse>;

// Delete/revoke permission
function delete(
  permissionID: string,
  params: PermissionDeleteParams,
  options?: RequestOptions
): Promise<PermissionDeleteResponse>;
```

**Available at:** `client.fineTuning.checkpoints.permissions`
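
A minimal usage sketch; it assumes `PermissionCreateParams` takes a `project_ids` list and `PermissionDeleteParams` identifies the checkpoint, and all IDs below are placeholders:

```typescript
const checkpoint = 'ft:gpt-4o-mini-2024-07-18:my-org:custom:abc123';

// Grant access to the checkpoint for specific projects (assumed param shape)
await client.fineTuning.checkpoints.permissions.create(checkpoint, {
  project_ids: ['proj_abc123'],
});

// Inspect existing permissions on the checkpoint
const permissions = await client.fineTuning.checkpoints.permissions.retrieve(checkpoint);
console.log(permissions);

// Revoke a permission by its ID
await client.fineTuning.checkpoints.permissions.delete('cp_perm_123', {
  fine_tuned_model_checkpoint: checkpoint,
});
```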

### Alpha Features - Grader Validation

Experimental grader tools for validating and testing graders before using them in fine-tuning jobs. These features are in alpha and subject to change.

```typescript { .api }
// Run a grader on test data
function run(body: GraderRunParams): Promise<GraderRunResponse>;

// Validate grader configuration
function validate(body: GraderValidateParams): Promise<GraderValidateResponse>;

interface GraderRunParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
  model_sample: string;
  item?: unknown;
}

interface GraderValidateParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
}
```

**Available at:** `client.fineTuning.alpha.graders`

**Note:** These are alpha/experimental features. The API may change in future versions.
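
A quick sketch of the validate-then-run flow; the `{{ ... }}` template variables and test values are illustrative assumptions:

```typescript
// A simple exact-match grader to test
const grader = {
  type: 'string_check',
  name: 'exact-match',
  input: '{{ sample.output_text }}', // assumed template variable
  operation: 'eq',
  reference: '{{ item.expected }}', // assumed template variable
} as const;

// Check that the configuration is well-formed
const validation = await client.fineTuning.alpha.graders.validate({ grader });
console.log('Validated grader:', validation);

// Score a sample model output against a test item
const result = await client.fineTuning.alpha.graders.run({
  grader,
  model_sample: 'Paris',
  item: { expected: 'Paris' },
});
console.log('Grader result:', result);
```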

---

## Core Types

### FineTuningJob { .api }

Represents a fine-tuning job that has been created through the API.

```typescript { .api }
interface FineTuningJob {
  id: string;
  created_at: number;
  finished_at: number | null;
  error: FineTuningJob.Error | null;
  fine_tuned_model: string | null;
  hyperparameters: FineTuningJob.Hyperparameters;
  model: string;
  object: 'fine_tuning.job';
  organization_id: string;
  result_files: Array<string>;
  seed: number;
  status: 'validating_files' | 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled';
  trained_tokens: number | null;
  training_file: string;
  validation_file: string | null;
  estimated_finish?: number | null;
  integrations?: Array<FineTuningJobIntegration> | null;
  metadata?: Record<string, string> | null;
  method?: FineTuningJob.Method;
}

namespace FineTuningJob {
  interface Error {
    code: string;
    message: string;
    param: string | null;
  }

  interface Hyperparameters {
    batch_size?: 'auto' | number | null;
    learning_rate_multiplier?: 'auto' | number;
    n_epochs?: 'auto' | number;
  }

  interface Method {
    type: 'supervised' | 'dpo' | 'reinforcement';
    dpo?: DpoMethod;
    reinforcement?: ReinforcementMethod;
    supervised?: SupervisedMethod;
  }
}
```

### FineTuningJobEvent { .api }

Event log entry for a fine-tuning job containing status updates and metrics.

```typescript { .api }
interface FineTuningJobEvent {
  id: string;
  created_at: number;
  level: 'info' | 'warn' | 'error';
  message: string;
  object: 'fine_tuning.job.event';
  data?: unknown;
  type?: 'message' | 'metrics';
}
```

### FineTuningJobCheckpoint { .api }

Represents an intermediate model checkpoint during a fine-tuning job, ready for evaluation or use.

```typescript { .api }
interface FineTuningJobCheckpoint {
  id: string;
  created_at: number;
  fine_tuned_model_checkpoint: string;
  fine_tuning_job_id: string;
  metrics: FineTuningJobCheckpoint.Metrics;
  object: 'fine_tuning.job.checkpoint';
  step_number: number;
}

namespace FineTuningJobCheckpoint {
  interface Metrics {
    full_valid_loss?: number;
    full_valid_mean_token_accuracy?: number;
    step?: number;
    train_loss?: number;
    train_mean_token_accuracy?: number;
    valid_loss?: number;
    valid_mean_token_accuracy?: number;
  }
}
```

### Training Method Types

#### SupervisedMethod { .api }

Standard supervised fine-tuning configuration for training on input-output pairs.

```typescript { .api }
interface SupervisedMethod {
  hyperparameters?: SupervisedHyperparameters;
}

interface SupervisedHyperparameters {
  batch_size?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### DpoMethod { .api }

Direct Preference Optimization configuration for training with preference pairs (preferred vs. dispreferred responses).

```typescript { .api }
interface DpoMethod {
  hyperparameters?: DpoHyperparameters;
}

interface DpoHyperparameters {
  batch_size?: 'auto' | number;
  beta?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### ReinforcementMethod { .api }

Reinforcement learning configuration for training with reward scoring.

```typescript { .api }
interface ReinforcementMethod {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | MultiGrader;
  hyperparameters?: ReinforcementHyperparameters;
}

interface ReinforcementHyperparameters {
  batch_size?: 'auto' | number;
  compute_multiplier?: 'auto' | number;
  eval_interval?: 'auto' | number;
  eval_samples?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
  reasoning_effort?: 'default' | 'low' | 'medium' | 'high';
}
```

### Grader Types

Graders are used in reinforcement learning fine-tuning to automatically score model outputs and provide rewards for training.

#### LabelModelGrader { .api }

Uses a language model to assign labels to evaluation items. Useful for classification-style evaluation where outputs should fall into specific categories.

```typescript { .api }
interface LabelModelGrader {
  input: Array<LabelModelGraderInput>;
  labels: string[];
  model: string;
  name: string;
  passing_labels: string[];
  type: 'label_model';
}

interface LabelModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface OutputText {
  text: string;
  type: 'output_text';
}

interface InputImage {
  image_url: string;
  type: 'input_image';
  detail?: string;
}
```

**Properties:**
- `input`: Array of message inputs to the grader model; can include template strings
- `labels`: Available labels to assign to each evaluation item
- `model`: The model to use for evaluation (must support structured outputs)
- `name`: Identifier for the grader
- `passing_labels`: Labels that indicate a passing result (must be a subset of `labels`)
- `type`: Always `'label_model'`
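
For example, a classification-style grader that passes only outputs the judge model labels as helpful (the model choice and template variable are illustrative):

```typescript
// Matches the LabelModelGrader shape above; all values are placeholders
const labelGrader = {
  type: 'label_model',
  name: 'helpfulness-check',
  model: 'gpt-4o-mini',
  labels: ['helpful', 'unhelpful'],
  passing_labels: ['helpful'],
  input: [
    { role: 'system', content: 'Label the response as helpful or unhelpful.' },
    { role: 'user', content: '{{ sample.output_text }}' }, // assumed template variable
  ],
} as const;
```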

#### StringCheckGrader { .api }

Performs string comparison operations between input and reference text.

```typescript { .api }
interface StringCheckGrader {
  input: string;
  name: string;
  operation: 'eq' | 'ne' | 'like' | 'ilike';
  reference: string;
  type: 'string_check';
}
```

**Properties:**
- `operation`: `'eq'` (equals), `'ne'` (not equals), `'like'` (SQL LIKE), `'ilike'` (case-insensitive LIKE)

#### TextSimilarityGrader { .api }

Grades text based on similarity metrics. Supports various metrics for comparing model output with reference text.

```typescript { .api }
interface TextSimilarityGrader {
  evaluation_metric: 'cosine' | 'fuzzy_match' | 'bleu' | 'gleu' | 'meteor' | 'rouge_1' | 'rouge_2' | 'rouge_3' | 'rouge_4' | 'rouge_5' | 'rouge_l';
  input: string;
  name: string;
  reference: string;
  type: 'text_similarity';
}
```
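
A minimal configuration sketch (the template variables are assumptions):

```typescript
// Fuzzy-match the model output against a reference answer
const similarityGrader = {
  type: 'text_similarity',
  name: 'answer-similarity',
  evaluation_metric: 'fuzzy_match',
  input: '{{ sample.output_text }}', // assumed template variable
  reference: '{{ item.reference_answer }}', // assumed template variable
} as const;
```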

#### PythonGrader { .api }

Executes custom Python code for evaluation. Provides maximum flexibility for complex grading logic.

```typescript { .api }
interface PythonGrader {
  name: string;
  source: string;
  type: 'python';
  image_tag?: string;
}
```

**Properties:**
- `source`: Python code to execute for grading
- `image_tag`: Optional Docker image tag for the Python environment
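
A sketch of a custom grader; the `grade(sample, item)` entry point shown in `source` is an assumption, so confirm the expected contract in the API reference:

```typescript
const pythonGrader = {
  type: 'python',
  name: 'keyword-grader',
  // Assumed contract: a grade function returning a float reward
  source: `
def grade(sample, item) -> float:
    # Reward outputs that mention the expected keyword
    return 1.0 if item["keyword"] in sample["output_text"] else 0.0
`.trim(),
} as const;
```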

#### ScoreModelGrader { .api }

Uses a language model to assign numerical scores to outputs. Useful for open-ended evaluation criteria.

```typescript { .api }
interface ScoreModelGrader {
  input: Array<ScoreModelGraderInput>;
  model: string;
  name: string;
  type: 'score_model';
  range?: [number, number];
  sampling_params?: SamplingParams;
}

interface ScoreModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface SamplingParams {
  max_completions_tokens?: number | null;
  reasoning_effort?: 'none' | 'minimal' | 'low' | 'medium' | 'high' | null;
  seed?: number | null;
  temperature?: number | null;
  top_p?: number | null;
}
```

**Properties:**
- `range`: Score range (defaults to `[0, 1]`)
- `sampling_params`: Optional parameters for controlling model behavior
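
For example, a 0-10 quality judge with deterministic sampling (the model and template variable are illustrative):

```typescript
const scoreGrader = {
  type: 'score_model',
  name: 'quality-score',
  model: 'gpt-4o-mini',
  range: [0, 10],
  input: [
    { role: 'system', content: 'Rate the quality of the response from 0 to 10.' },
    { role: 'user', content: '{{ sample.output_text }}' }, // assumed template variable
  ],
  sampling_params: { temperature: 0, seed: 42 },
} as const;
```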
373
374
#### MultiGrader { .api }
375
376
Combines multiple graders using a formula to produce a final score.
377
378
```typescript { .api }
379
interface MultiGrader {
380
calculate_output: string;
381
graders: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader;
382
name: string;
383
type: 'multi';
384
}
385
```
386
387
**Properties:**
388
- `calculate_output`: Formula to calculate final score from grader results
389
- `graders`: The individual graders to combine
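
A sketch combining an exact-match check with a fuzzy similarity score; the weighted-sum syntax in `calculate_output` is an assumption:

```typescript
const multiGrader = {
  type: 'multi',
  name: 'combined-score',
  // Keys name each sub-grader; calculate_output refers to them by name
  graders: {
    exact: {
      type: 'string_check',
      name: 'exact',
      input: '{{ sample.output_text }}',
      operation: 'eq',
      reference: '{{ item.answer }}',
    },
    similarity: {
      type: 'text_similarity',
      name: 'similarity',
      evaluation_metric: 'fuzzy_match',
      input: '{{ sample.output_text }}',
      reference: '{{ item.answer }}',
    },
  },
  calculate_output: '0.5 * exact + 0.5 * similarity',
} as const;
```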

### Pagination Types

```typescript { .api }
type FineTuningJobsPage = CursorPage<FineTuningJob>;
type FineTuningJobEventsPage = CursorPage<FineTuningJobEvent>;
type FineTuningJobCheckpointsPage = CursorPage<FineTuningJobCheckpoint>;
```

---

## Examples

### Creating a Fine-Tuning Job (Supervised)

Train a model using standard supervised learning with input-output pairs.

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Create a fine-tuning job
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-abc123', // JSONL file with training data
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 8,
        learning_rate_multiplier: 1.0,
        n_epochs: 3,
      },
    },
  },
  suffix: 'my-fine-tuned-model',
});

console.log(`Job created: ${job.id}`);
console.log(`Status: ${job.status}`);
console.log(`Model: ${job.model}`);
```

### Creating a DPO Fine-Tuning Job

Train using Direct Preference Optimization with preference pairs.

```typescript
const dpoJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-dpo-pairs-123', // JSONL with preference pairs
  method: {
    type: 'dpo',
    dpo: {
      hyperparameters: {
        batch_size: 16,
        beta: 0.1,
        learning_rate_multiplier: 0.5,
        n_epochs: 1,
      },
    },
  },
});

console.log(`DPO Job: ${dpoJob.id}`);
```

### Creating a Reinforcement Learning Fine-Tuning Job

Train using reinforcement learning with reward scoring.

```typescript
const rlJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-rl-data-123',
  method: {
    type: 'reinforcement',
    reinforcement: {
      grader: {
        type: 'string_check', // or 'text_similarity', 'python', 'score_model', 'multi'
        name: 'string-check-grader',
        input: '{{ sample.output_text }}',
        operation: 'eq',
        reference: 'expected_output',
      },
      hyperparameters: {
        batch_size: 'auto',
        n_epochs: 2,
        learning_rate_multiplier: 0.8,
        eval_interval: 100,
        eval_samples: 50,
      },
    },
  },
});

console.log(`RL Job: ${rlJob.id}`);
```

### Retrieving Job Details

Get complete information about a specific fine-tuning job.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';
const job = await client.fineTuning.jobs.retrieve(jobId);

console.log(`Job Status: ${job.status}`);
console.log(`Created: ${new Date(job.created_at * 1000).toISOString()}`);
console.log(`Fine-tuned Model: ${job.fine_tuned_model}`);
console.log(`Trained Tokens: ${job.trained_tokens}`);

if (job.error) {
  console.error(`Error: ${job.error.message}`);
}
```

### Monitoring Job Progress with Events

Track job execution through event logs including training metrics.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Iterate through all events
for await (const event of client.fineTuning.jobs.listEvents(jobId)) {
  console.log(`[${event.level}] ${event.message}`);

  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}

// List with pagination parameters
const eventPage = await client.fineTuning.jobs.listEvents(jobId, {
  limit: 10,
});

console.log(`Retrieved ${eventPage.data.length} events`);
```

### Working with Checkpoints

Access intermediate model checkpoints and their metrics.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Get all checkpoints for a job
for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {
  console.log(`Checkpoint: ${checkpoint.fine_tuned_model_checkpoint}`);
  console.log(`Step: ${checkpoint.step_number}`);
  console.log(`Training Loss: ${checkpoint.metrics.train_loss}`);
  console.log(`Validation Loss: ${checkpoint.metrics.valid_loss}`);
  console.log(`Token Accuracy: ${checkpoint.metrics.valid_mean_token_accuracy}`);
}

// List checkpoints with pagination
const checkpointPage = await client.fineTuning.jobs.checkpoints.list(jobId, {
  limit: 5,
});

// Missing validation losses sort last rather than winning the comparison
const bestCheckpoint = checkpointPage.data.reduce((best, current) => {
  return (current.metrics.valid_loss ?? Infinity) < (best.metrics.valid_loss ?? Infinity)
    ? current
    : best;
});

console.log(`Best checkpoint by validation loss: ${bestCheckpoint.id}`);
```

### Listing Fine-Tuning Jobs

Retrieve all fine-tuning jobs in your organization with filtering.

```typescript
// List all jobs
for await (const job of client.fineTuning.jobs.list()) {
  console.log(`${job.id}: ${job.status} (Model: ${job.model})`);
}

// List with filters
const jobsPage = await client.fineTuning.jobs.list({
  limit: 20,
});

const runningJobs = jobsPage.data.filter(j => j.status === 'running');
console.log(`Active jobs: ${runningJobs.length}`);

// Filter by metadata
const metadataFilteredJobs = await client.fineTuning.jobs.list({
  metadata: {
    'project': 'chatbot-v2',
  },
});
```

### Controlling Job Execution

Pause, resume, and cancel jobs as needed.

```typescript
const jobId = 'ftjob-AF1WoRqd3aJAHsqc9NY7iL8F';

// Pause a running job
const pausedJob = await client.fineTuning.jobs.pause(jobId);
console.log(`Job paused: ${pausedJob.status}`);

// Wait a bit...
await new Promise(resolve => setTimeout(resolve, 5000));

// Resume the job
const resumedJob = await client.fineTuning.jobs.resume(jobId);
console.log(`Job resumed: ${resumedJob.status}`); // status: 'running'

// Cancel a job (can cancel running, queued, or paused jobs)
const cancelledJob = await client.fineTuning.jobs.cancel(jobId);
console.log(`Job cancelled: ${cancelledJob.status}`); // status: 'cancelled'
```

---

## Training Data Format

Fine-tuning data must be formatted as JSONL (JSON Lines) files. Different formats are required depending on the training method and model type.

### Supervised Training - Chat Format

For chat models such as `gpt-4o-mini` and `gpt-3.5-turbo` with supervised learning:

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about biology"}, {"role": "assistant", "content": "The answer is..."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about physics"}, {"role": "assistant", "content": "The answer is..."}]}
```

### Supervised Training - Completions Format

For legacy completions models such as `babbage-002` and `davinci-002`:

```jsonl
{"prompt": "Write a poem about:", "completion": " nature and its beauty"}
{"prompt": "What is the capital of France?", "completion": " Paris"}
```

### DPO Training - Preference Format

For Direct Preference Optimization, each line pairs a preferred response with a non-preferred one:

```jsonl
{"input": {"messages": [{"role": "user", "content": "Question"}]}, "preferred_output": [{"role": "assistant", "content": "Better answer"}], "non_preferred_output": [{"role": "assistant", "content": "Worse answer"}]}
{"input": {"messages": [{"role": "user", "content": "Another question"}]}, "preferred_output": [{"role": "assistant", "content": "Preferred response"}], "non_preferred_output": [{"role": "assistant", "content": "Dispreferred response"}]}
```

### Reinforcement Learning Format

For RL training, provide prompts only; rewards are assigned by the grader:

```jsonl
{"messages": [{"role": "user", "content": "Write a story about adventure"}]}
{"messages": [{"role": "user", "content": "Explain quantum computing"}]}
```

### Data Preparation Best Practices

```typescript
import * as fs from 'fs';
import OpenAI from 'openai';

// Example: Convert CSV training data to JSONL format
// (naive comma split; use a proper CSV parser for quoted fields)
function csvToJsonl(csvFilePath: string): void {
  const lines = fs.readFileSync(csvFilePath, 'utf-8').split('\n');
  const [header, ...rows] = lines;
  const headers = header.split(',');

  const jsonlLines = rows
    .filter(row => row.trim())
    .map(row => {
      const values = row.split(',');
      const obj: any = {};
      headers.forEach((h, i) => {
        obj[h.trim()] = values[i]?.trim();
      });
      return JSON.stringify({
        messages: [
          { role: 'user', content: obj.prompt },
          { role: 'assistant', content: obj.completion },
        ],
      });
    })
    .join('\n');

  fs.writeFileSync('training_data.jsonl', jsonlLines);
}

// Validate JSONL format
function validateJsonl(jsonlPath: string): boolean {
  const lines = fs.readFileSync(jsonlPath, 'utf-8').split('\n');
  return lines
    .filter(line => line.trim())
    .every(line => {
      try {
        JSON.parse(line);
        return true;
      } catch {
        return false;
      }
    });
}

// Upload training file
async function uploadTrainingFile(client: OpenAI, jsonlPath: string): Promise<string> {
  const fileContent = fs.createReadStream(jsonlPath);
  const response = await client.files.create({
    file: fileContent,
    purpose: 'fine-tune',
  });
  return response.id;
}
```

---

## Hyperparameter Tuning

Fine-tuning outcomes depend heavily on hyperparameter selection. Here's a guide to tuning each parameter.

### Batch Size

Controls how many examples are processed before updating model weights.

- **Effect**: Larger batch sizes lead to more stable training but slower convergence
- **'auto' (recommended)**: OpenAI automatically selects a batch size based on your dataset
- **Typical Range**: 1-256
- **Tradeoff**: Larger batches = lower variance, less frequent updates
- **Guidance**: Start with 'auto', then experiment with 8, 16, 32 if needed

```typescript
// Conservative tuning with larger batch size
hyperparameters: {
  batch_size: 32, // More stable, slower
}

// Aggressive tuning with smaller batch size
hyperparameters: {
  batch_size: 8, // Faster convergence, more noise
}
```

### Learning Rate Multiplier

Scales the base learning rate for the fine-tuning process.

- **Effect**: Controls the magnitude of weight updates
- **Typical Range**: 0.02 to 2.0
- **'auto' (recommended)**: Automatically selected based on model
- **Guidance**:
  - Below 1.0: More conservative, less overfitting risk
  - 1.0: Default, balanced training
  - Above 1.0: More aggressive, faster convergence

```typescript
// Conservative fine-tuning (prefer stability)
hyperparameters: {
  learning_rate_multiplier: 0.5, // Half the default rate
}

// Aggressive fine-tuning (prefer speed)
hyperparameters: {
  learning_rate_multiplier: 2.0, // Double the default rate
}
```

### Number of Epochs

How many complete passes through the training data to perform.

- **Effect**: More epochs generally improve performance but risk overfitting
- **Typical Range**: 1-10
- **'auto'**: Automatically selected
- **Guidance**:
  - 1 epoch: Fast, may underfit
  - 3-4 epochs: Balanced (recommended)
  - 5+ epochs: Risk of overfitting on small datasets

```typescript
// Small dataset - few epochs to avoid overfitting
hyperparameters: {
  n_epochs: 1,
}

// Large dataset - more epochs for better convergence
hyperparameters: {
  n_epochs: 4,
}
```

### DPO-Specific: Beta Parameter

The beta value weights the penalty between the policy and the reference model: higher beta penalizes deviation from the reference model more strongly.

- **Effect**: Higher beta keeps updates conservative and close to the reference model
- **Typical Range**: 0.05 to 0.5
- **'auto' (recommended)**: Automatically tuned
- **Guidance**:
  - Low beta (0.05): More exploration, less constraint
  - High beta (0.3+): Conservative updates that stay near the reference model

```typescript
dpo: {
  hyperparameters: {
    beta: 0.1, // Moderate preference alignment
  },
}
```

### Hyperparameter Tuning Workflow

```typescript
import OpenAI from 'openai';

async function tuneFinetuningModel(
  client: OpenAI,
  trainingFile: string,
  validationFile: string,
): Promise<string> {
  const configurations = [
    {
      name: 'conservative',
      batch_size: 32,
      learning_rate_multiplier: 0.5,
      n_epochs: 2,
    },
    {
      name: 'balanced',
      batch_size: 16,
      learning_rate_multiplier: 1.0,
      n_epochs: 3,
    },
    {
      name: 'aggressive',
      batch_size: 8,
      learning_rate_multiplier: 2.0,
      n_epochs: 4,
    },
  ];

  const results: Array<{ config: string; jobId: string; metrics: any }> = [];

  for (const config of configurations) {
    console.log(`Starting ${config.name} configuration...`);

    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
      validation_file: validationFile,
      method: {
        type: 'supervised',
        supervised: {
          hyperparameters: {
            batch_size: config.batch_size,
            learning_rate_multiplier: config.learning_rate_multiplier,
            n_epochs: config.n_epochs,
          },
        },
      },
      suffix: `tune-${config.name}`,
      metadata: {
        'experiment': 'hyperparameter-tuning',
        'config': config.name,
      },
    });

    results.push({
      config: config.name,
      jobId: job.id,
      metrics: {
        batch_size: config.batch_size,
        learning_rate_multiplier: config.learning_rate_multiplier,
        n_epochs: config.n_epochs,
      },
    });
  }

  // Wait for jobs and compare results
  for (const result of results) {
    let job = await client.fineTuning.jobs.retrieve(result.jobId);

    // Poll until the job reaches a terminal state
    while (job.status !== 'succeeded' && job.status !== 'failed' && job.status !== 'cancelled') {
      await new Promise(resolve => setTimeout(resolve, 30000)); // Wait 30s
      job = await client.fineTuning.jobs.retrieve(result.jobId);
    }

    if (job.status === 'succeeded') {
      console.log(`${result.config} job succeeded: ${job.fine_tuned_model}`);
      console.log(`  Trained tokens: ${job.trained_tokens}`);
    }
  }

  // Return the first job ID; in practice, pick the config with the best validation metrics
  return results[0].jobId;
}
```

---

## Advanced Usage

### Using Metadata for Job Organization

Tag and filter jobs with custom metadata for better organization and tracking.

```typescript
// Create job with metadata
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  metadata: {
    'project': 'customer-support',
    'version': '1.0',
    'team': 'ai-products',
    'environment': 'production',
  },
});

// Later, filter jobs by metadata
const productionJobs = await client.fineTuning.jobs.list({
  metadata: {
    'environment': 'production',
  },
});

for await (const job of productionJobs) {
  console.log(`${job.id} - ${job.metadata?.project}`);
}
```

### Weights & Biases Integration

Monitor fine-tuning jobs in real time using Weights & Biases.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  integrations: [
    {
      type: 'wandb',
      wandb: {
        project: 'openai-fine-tuning',
        entity: 'my-team',
        name: 'gpt4-mini-v1',
        tags: ['production', 'customer-support'],
      },
    },
  ],
});

console.log(`Monitor at: https://wandb.ai/my-team/openai-fine-tuning`);
```

### Reproducible Training

Use seeds for reproducible fine-tuning results.

```typescript
const seed = 42;

// Job 1
const job1 = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  seed: seed,
  suffix: 'run1',
});

// Job 2 with the same seed should produce very similar results
const job2 = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  seed: seed,
  suffix: 'run2',
});

// With identical data and seed, the two runs should yield equivalent models
// (reproducibility is best-effort)
```

### Validation File Usage

Provide validation data to monitor generalization during training.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-train-123',
  validation_file: 'file-val-456', // Optional but recommended
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 16,
        n_epochs: 3,
        learning_rate_multiplier: 1.0,
      },
    },
  },
});

// Monitor metrics in events; metric events include validation
// stats when a validation file is provided
for await (const event of client.fineTuning.jobs.listEvents(job.id)) {
  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}
```

### Checkpoint-Based Model Selection

Use checkpoints to select the best intermediate model rather than the final one.

```typescript
async function findBestCheckpoint(
  client: OpenAI,
  jobId: string,
): Promise<string> {
  let bestCheckpoint: any = null;
  let bestValidationLoss = Infinity;

  for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {
    // Checkpoints without a validation loss sort last
    const validationLoss = checkpoint.metrics.valid_loss ?? Infinity;

    if (validationLoss < bestValidationLoss) {
      bestValidationLoss = validationLoss;
      bestCheckpoint = checkpoint;
    }
  }

  if (bestCheckpoint) {
    console.log(
      `Best checkpoint at step ${bestCheckpoint.step_number}: ${bestCheckpoint.fine_tuned_model_checkpoint}`,
    );
    return bestCheckpoint.fine_tuned_model_checkpoint;
  }

  throw new Error('No checkpoints found');
}

// Use the checkpoint model
const bestModel = await findBestCheckpoint(client, 'ftjob-123');
const completion = await client.chat.completions.create({
  model: bestModel,
  messages: [{ role: 'user', content: 'Hello' }],
});
```

### Long-Running Job Polling

Monitor job completion with exponential backoff polling.

```typescript
async function pollJobUntilComplete(
  client: OpenAI,
  jobId: string,
  maxWaitMs = 7200000, // 2 hours
): Promise<FineTuningJob> {
  const startTime = Date.now();
  let pollInterval = 5000; // Start at 5 seconds
  const maxPollInterval = 60000; // Cap at 60 seconds

  while (Date.now() - startTime < maxWaitMs) {
    const job = await client.fineTuning.jobs.retrieve(jobId);

    if (job.status === 'succeeded' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }

    console.log(`Job ${jobId} status: ${job.status}`);
    if (job.status === 'running' && job.estimated_finish) {
      const remaining = job.estimated_finish * 1000 - Date.now();
      console.log(`Estimated time remaining: ${Math.ceil(remaining / 1000)} seconds`);
    }

    await new Promise(resolve => setTimeout(resolve, pollInterval));

    // Exponential backoff
    pollInterval = Math.min(pollInterval * 1.5, maxPollInterval);
  }

  throw new Error(`Job ${jobId} did not complete within ${maxWaitMs}ms`);
}

// Usage
const completedJob = await pollJobUntilComplete(client, 'ftjob-123');
console.log(`Job completed with status: ${completedJob.status}`);
```

### Bulk Job Monitoring

Track multiple fine-tuning jobs simultaneously.

```typescript
async function monitorMultipleJobs(client: OpenAI, jobIds: string[]): Promise<void> {
  const statusMap = new Map<string, string>();
  jobIds.forEach(id => statusMap.set(id, 'unknown'));

  const updateStatus = async () => {
    for (const jobId of jobIds) {
      const job = await client.fineTuning.jobs.retrieve(jobId);
      statusMap.set(jobId, job.status);
    }
  };

  const allComplete = () =>
    Array.from(statusMap.values()).every(
      status =>
        status === 'succeeded' ||
        status === 'failed' ||
        status === 'cancelled',
    );

  while (!allComplete()) {
    await updateStatus();

    console.clear();
    console.log('Fine-Tuning Jobs Status:');
    for (const [id, status] of statusMap) {
      const symbol =
        status === 'succeeded'
          ? '✓'
          : status === 'failed'
            ? '✗'
            : status === 'running'
              ? '→'
              : '-';
      console.log(`${symbol} ${id}: ${status}`);
    }

    if (!allComplete()) {
      await new Promise(resolve => setTimeout(resolve, 30000)); // Check every 30s
    }
  }

  console.log('\nAll jobs completed!');
}

// Usage
await monitorMultipleJobs(client, [
  'ftjob-123',
  'ftjob-456',
  'ftjob-789',
]);
```

---

## Supported Models

Fine-tuning is available for the following models:

- `gpt-4o-mini` (Recommended for most use cases)
- `gpt-3.5-turbo`
- `davinci-002`
- `babbage-002`

Model availability and capabilities may change. Check the [OpenAI documentation](https://platform.openai.com/docs/guides/fine-tuning) for the most current list.

---

## Error Handling

```typescript
import OpenAI from 'openai';

async function createJobWithErrorHandling(
  client: OpenAI,
  trainingFile: string,
) {
  try {
    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
    });
    return job;
  } catch (error) {
    if (error instanceof OpenAI.BadRequestError) {
      console.error('Invalid request:', error.message);
      // Usually validation errors in training data format
    } else if (error instanceof OpenAI.NotFoundError) {
      console.error('Training file not found:', error.message);
    } else {
      throw error;
    }
  }
}

// Monitor job for errors
const jobId = 'ftjob-abc123';
const job = await client.fineTuning.jobs.retrieve(jobId);

if (job.status === 'failed' && job.error) {
  console.error(
    `Job failed: ${job.error.code} - ${job.error.message}`,
  );
  console.error(`Failed parameter: ${job.error.param}`);
}
```

---

## See Also

- [Chat Completions](./chat-completions.md) - Use fine-tuned models for chat
- [Files and Uploads](./files-uploads.md) - Upload training data
- [Embeddings](./embeddings.md) - Generate embeddings from your data