# Fine-Tuning

Create and manage fine-tuning jobs to adapt OpenAI models to your specific use case with your own training data. Fine-tuning supports supervised learning, Direct Preference Optimization (DPO), and reinforcement learning methods.

## Capabilities

### Fine-Tuning Job Management

Complete lifecycle management for fine-tuning jobs, from creation through monitoring to completion. Control job execution with pause, resume, and cancel operations.

```typescript { .api }
function create(params: FineTuningJobCreateParams): Promise<FineTuningJob>;
function retrieve(jobID: string): Promise<FineTuningJob>;
function list(params?: FineTuningJobListParams): Promise<FineTuningJobsPage>;
function cancel(jobID: string): Promise<FineTuningJob>;
function pause(jobID: string): Promise<FineTuningJob>;
function resume(jobID: string): Promise<FineTuningJob>;
```

**Available at:** `client.fineTuning.jobs`

### Job Monitoring and Events

Track job progress through detailed event logs with status updates, metrics, and error information. Events include training progress, validation results, and completion notifications.

```typescript { .api }
function listEvents(jobID: string, params?: JobEventListParams): Promise<FineTuningJobEventsPage>;
```

**Available at:** `client.fineTuning.jobs.listEvents()`

### Checkpoint Management

Access intermediate model checkpoints during fine-tuning to evaluate progress and use partially-trained models. Each checkpoint includes training metrics at specific steps.

```typescript { .api }
function list(jobID: string, params?: CheckpointListParams): Promise<FineTuningJobCheckpointsPage>;
```

**Available at:** `client.fineTuning.jobs.checkpoints.list()`

### Checkpoint Permissions

Manage sharing permissions for fine-tuned checkpoints, allowing you to grant or revoke access to specific checkpoints for other users or organizations.

```typescript { .api }
// Create permission for a checkpoint
function create(
  fineTunedModelCheckpoint: string,
  body: PermissionCreateParams,
  options?: RequestOptions
): Promise<PermissionCreateResponsesPage>;

// Retrieve permission details
function retrieve(
  fineTunedModelCheckpoint: string,
  query?: PermissionRetrieveParams,
  options?: RequestOptions
): Promise<PermissionRetrieveResponse>;

// Delete/revoke permission
function delete(
  permissionID: string,
  params: PermissionDeleteParams,
  options?: RequestOptions
): Promise<PermissionDeleteResponse>;
```

**Available at:** `client.fineTuning.checkpoints.permissions`
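
A minimal usage sketch. The exact field shapes of `PermissionCreateParams` and `PermissionDeleteParams` (`project_ids`, `fine_tuned_model_checkpoint`) are assumptions here; check the SDK types before relying on them:

```typescript
const checkpoint = 'ft:gpt-4o-mini:my-org::abc123:ckpt-step-100'; // illustrative checkpoint ID

// Grant access to specific projects (project_ids is an assumed field)
const created = await client.fineTuning.checkpoints.permissions.create(checkpoint, {
  project_ids: ['proj_123'],
});

// Inspect existing permissions
const perms = await client.fineTuning.checkpoints.permissions.retrieve(checkpoint);

// Revoke a permission by ID (fine_tuned_model_checkpoint is an assumed param)
await client.fineTuning.checkpoints.permissions.delete('perm_123', {
  fine_tuned_model_checkpoint: checkpoint,
});
```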

### Alpha Features - Grader Validation

Experimental grader tools for validating and testing graders before using them in fine-tuning jobs. These features are in alpha and subject to change.

```typescript { .api }
// Run a grader on test data
function run(body: GraderRunParams): Promise<GraderRunResponse>;

// Validate grader configuration
function validate(body: GraderValidateParams): Promise<GraderValidateResponse>;

interface GraderRunParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
  model_sample: string;
  item?: unknown;
}

interface GraderValidateParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader | MultiGrader;
}
```

**Available at:** `client.fineTuning.alpha.graders`

**Note:** These are alpha/experimental features. The API may change in future versions.
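
A quick sketch of validating and then dry-running a grader before attaching it to a reinforcement fine-tuning job (the field values are illustrative):

```typescript
const grader = {
  type: 'string_check' as const,
  name: 'exact-match',
  input: '{{ sample.output }}', // template syntax mirrors the RL example below
  operation: 'eq' as const,
  reference: 'expected_output',
};

// Check the grader configuration without running it
await client.fineTuning.alpha.graders.validate({ grader });

// Score a single sample to see how the grader behaves
const result = await client.fineTuning.alpha.graders.run({
  grader,
  model_sample: 'expected_output',
});
console.log(result);
```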

---

## Core Types

### FineTuningJob { .api }

Represents a fine-tuning job that has been created through the API.

```typescript { .api }
interface FineTuningJob {
  id: string;
  created_at: number;
  finished_at: number | null;
  error: FineTuningJob.Error | null;
  fine_tuned_model: string | null;
  hyperparameters: FineTuningJob.Hyperparameters;
  model: string;
  object: 'fine_tuning.job';
  organization_id: string;
  result_files: Array<string>;
  seed: number;
  status: 'validating_files' | 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled';
  trained_tokens: number | null;
  training_file: string;
  validation_file: string | null;
  estimated_finish?: number | null;
  integrations?: Array<FineTuningJobIntegration> | null;
  metadata?: Record<string, string> | null;
  method?: FineTuningJob.Method;
}

namespace FineTuningJob {
  interface Error {
    code: string;
    message: string;
    param: string | null;
  }

  interface Hyperparameters {
    batch_size?: 'auto' | number | null;
    learning_rate_multiplier?: 'auto' | number;
    n_epochs?: 'auto' | number;
  }

  interface Method {
    type: 'supervised' | 'dpo' | 'reinforcement';
    dpo?: DpoMethod;
    reinforcement?: ReinforcementMethod;
    supervised?: SupervisedMethod;
  }
}
```

### FineTuningJobEvent { .api }

Event log entry for a fine-tuning job containing status updates and metrics.

```typescript { .api }
interface FineTuningJobEvent {
  id: string;
  created_at: number;
  level: 'info' | 'warn' | 'error';
  message: string;
  object: 'fine_tuning.job.event';
  data?: unknown;
  type?: 'message' | 'metrics';
}
```

### FineTuningJobCheckpoint { .api }

Represents an intermediate model checkpoint during a fine-tuning job, ready for evaluation or use.

```typescript { .api }
interface FineTuningJobCheckpoint {
  id: string;
  created_at: number;
  fine_tuned_model_checkpoint: string;
  fine_tuning_job_id: string;
  metrics: FineTuningJobCheckpoint.Metrics;
  object: 'fine_tuning.job.checkpoint';
  step_number: number;
}

namespace FineTuningJobCheckpoint {
  interface Metrics {
    full_valid_loss?: number;
    full_valid_mean_token_accuracy?: number;
    step?: number;
    train_loss?: number;
    train_mean_token_accuracy?: number;
    valid_loss?: number;
    valid_mean_token_accuracy?: number;
  }
}
```

### Training Method Types

#### SupervisedMethod { .api }

Standard supervised fine-tuning configuration for training on input-output pairs.

```typescript { .api }
interface SupervisedMethod {
  hyperparameters?: SupervisedHyperparameters;
}

interface SupervisedHyperparameters {
  batch_size?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### DpoMethod { .api }

Direct Preference Optimization configuration for training with preference pairs (preferred vs. dispreferred responses).

```typescript { .api }
interface DpoMethod {
  hyperparameters?: DpoHyperparameters;
}

interface DpoHyperparameters {
  batch_size?: 'auto' | number;
  beta?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
}
```

#### ReinforcementMethod { .api }

Reinforcement learning configuration for training with reward scoring.

```typescript { .api }
interface ReinforcementMethod {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | MultiGrader;
  hyperparameters?: ReinforcementHyperparameters;
}

interface ReinforcementHyperparameters {
  batch_size?: 'auto' | number;
  compute_multiplier?: 'auto' | number;
  eval_interval?: 'auto' | number;
  eval_samples?: 'auto' | number;
  learning_rate_multiplier?: 'auto' | number;
  n_epochs?: 'auto' | number;
  reasoning_effort?: 'default' | 'low' | 'medium' | 'high';
}
```

### Grader Types

Graders are used in reinforcement learning fine-tuning to automatically score model outputs and provide rewards for training.

#### LabelModelGrader { .api }

Uses a language model to assign labels to evaluation items. Useful for classification-style evaluation where outputs should fall into specific categories.

```typescript { .api }
interface LabelModelGrader {
  input: Array<LabelModelGraderInput>;
  labels: string[];
  model: string;
  name: string;
  passing_labels: string[];
  type: 'label_model';
}

interface LabelModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface OutputText {
  text: string;
  type: 'output_text';
}

interface InputImage {
  image_url: string;
  type: 'input_image';
  detail?: string;
}
```

**Properties:**

- `input`: Array of message inputs to the grader model; may include template strings
- `labels`: Available labels to assign to each evaluation item
- `model`: The model to use for evaluation (must support structured outputs)
- `name`: Identifier for the grader
- `passing_labels`: Labels that indicate a passing result (must be a subset of `labels`)
- `type`: Always `'label_model'`
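
For example, a label-model grader that classifies answers as correct or incorrect might be configured like this (the template variable is an illustrative assumption, not confirmed syntax):

```typescript
const labelGrader = {
  type: 'label_model' as const,
  name: 'correctness-labeler',
  model: 'gpt-4o-mini',
  input: [
    { role: 'system' as const, content: 'Label the answer as correct or incorrect.' },
    { role: 'user' as const, content: 'Answer: {{ sample.output }}' }, // assumed template variable
  ],
  labels: ['correct', 'incorrect'],
  passing_labels: ['correct'], // must be a subset of labels
};
```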

#### StringCheckGrader { .api }

Performs string comparison operations between input and reference text.

```typescript { .api }
interface StringCheckGrader {
  input: string;
  name: string;
  operation: 'eq' | 'ne' | 'like' | 'ilike';
  reference: string;
  type: 'string_check';
}
```

**Properties:**

- `operation`: `'eq'` (equals), `'ne'` (not equals), `'like'` (SQL LIKE), `'ilike'` (case-insensitive LIKE)

#### TextSimilarityGrader { .api }

Grades text based on similarity metrics. Supports various metrics for comparing model output with reference text.

```typescript { .api }
interface TextSimilarityGrader {
  evaluation_metric: 'cosine' | 'fuzzy_match' | 'bleu' | 'gleu' | 'meteor' | 'rouge_1' | 'rouge_2' | 'rouge_3' | 'rouge_4' | 'rouge_5' | 'rouge_l';
  input: string;
  name: string;
  reference: string;
  type: 'text_similarity';
}
```
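
A sketch of a ROUGE-L grader configuration (the template syntax is an assumption):

```typescript
const similarityGrader = {
  type: 'text_similarity' as const,
  name: 'rouge-similarity',
  evaluation_metric: 'rouge_l' as const,
  input: '{{ sample.output }}', // model output to grade (assumed template)
  reference: '{{ item.reference_text }}', // ground-truth text (assumed template)
};
```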

#### PythonGrader { .api }

Executes custom Python code for evaluation. Provides maximum flexibility for complex grading logic.

```typescript { .api }
interface PythonGrader {
  name: string;
  source: string;
  type: 'python';
  image_tag?: string;
}
```

**Properties:**

- `source`: Python code to execute for grading
- `image_tag`: Optional Docker image tag for the Python environment
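
As an illustration, the `source` string might define a scoring function like the following (the `grade(sample, item)` entry point and field names are assumptions; consult the grader documentation for the exact contract):

```typescript
const pythonGrader = {
  type: 'python' as const,
  name: 'keyword-overlap',
  source: `
def grade(sample, item) -> float:
    # Fraction of expected keywords found in the model output
    output = sample["output_text"].lower()
    keywords = item["keywords"]
    hits = sum(1 for k in keywords if k.lower() in output)
    return hits / max(len(keywords), 1)
`,
};
```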

#### ScoreModelGrader { .api }

Uses a language model to assign numerical scores to outputs. Useful for open-ended evaluation criteria.

```typescript { .api }
interface ScoreModelGrader {
  input: Array<ScoreModelGraderInput>;
  model: string;
  name: string;
  type: 'score_model';
  range?: [number, number];
  sampling_params?: SamplingParams;
}

interface ScoreModelGraderInput {
  content: string | ResponseInputText | OutputText | InputImage | ResponseInputAudio | Array<unknown>;
  role: 'user' | 'assistant' | 'system' | 'developer';
  type?: 'message';
}

interface SamplingParams {
  max_completions_tokens?: number | null;
  reasoning_effort?: 'none' | 'minimal' | 'low' | 'medium' | 'high' | null;
  seed?: number | null;
  temperature?: number | null;
  top_p?: number | null;
}
```

**Properties:**

- `range`: Score range (defaults to `[0, 1]`)
- `sampling_params`: Optional parameters for controlling model behavior
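
A sketch of a score-model grader (prompt wording and template variables are illustrative):

```typescript
const scoreGrader = {
  type: 'score_model' as const,
  name: 'helpfulness-scorer',
  model: 'gpt-4o-mini',
  range: [0, 1] as [number, number],
  input: [
    { role: 'system' as const, content: 'Score the helpfulness of the answer from 0 to 1.' },
    { role: 'user' as const, content: '{{ sample.output }}' }, // assumed template variable
  ],
  sampling_params: { temperature: 0, seed: 42 },
};
```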

#### MultiGrader { .api }

Combines multiple graders using a formula to produce a final score.

```typescript { .api }
interface MultiGrader {
  calculate_output: string;
  graders: StringCheckGrader | TextSimilarityGrader | PythonGrader | ScoreModelGrader | LabelModelGrader;
  name: string;
  type: 'multi';
}
```

**Properties:**

- `calculate_output`: Formula to calculate the final score from grader results
- `graders`: The individual graders to combine

### Pagination Types

```typescript { .api }
type FineTuningJobsPage = CursorPage<FineTuningJob>;
type FineTuningJobEventsPage = CursorPage<FineTuningJobEvent>;
type FineTuningJobCheckpointsPage = CursorPage<FineTuningJobCheckpoint>;
```
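
Pages can be consumed with `for await` (used throughout the examples below) or stepped through manually; a sketch assuming the SDK's cursor-page helpers `hasNextPage()` and `getNextPage()`:

```typescript
let page = await client.fineTuning.jobs.list({ limit: 20 });
while (true) {
  for (const job of page.data) {
    console.log(job.id, job.status);
  }
  if (!page.hasNextPage()) break; // hasNextPage/getNextPage assumed from the SDK's CursorPage
  page = await page.getNextPage();
}
```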

---

## Examples

### Creating a Fine-Tuning Job (Supervised)

Train a model using standard supervised learning with input-output pairs.

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Create a fine-tuning job
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-abc123', // JSONL file with training data
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 8,
        learning_rate_multiplier: 1.0,
        n_epochs: 3,
      },
    },
  },
  suffix: 'my-fine-tuned-model',
});

console.log(`Job created: ${job.id}`);
console.log(`Status: ${job.status}`);
console.log(`Model: ${job.model}`);
```

### Creating a DPO Fine-Tuning Job

Train using Direct Preference Optimization with preference pairs.

```typescript
const dpoJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-dpo-pairs-123', // JSONL with preference pairs
  method: {
    type: 'dpo',
    dpo: {
      hyperparameters: {
        batch_size: 16,
        beta: 0.1,
        learning_rate_multiplier: 0.5,
        n_epochs: 1,
      },
    },
  },
});

console.log(`DPO Job: ${dpoJob.id}`);
```

### Creating a Reinforcement Learning Fine-Tuning Job

Train using reinforcement learning with reward scoring.

```typescript
const rlJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-rl-data-123',
  method: {
    type: 'reinforcement',
    reinforcement: {
      grader: {
        type: 'string_check', // or 'text_similarity', 'python', 'score_model', 'multi'
        name: 'string-check-grader',
        input: '{{ sample.output }}',
        operation: 'eq',
        reference: 'expected_output',
      },
      hyperparameters: {
        batch_size: 'auto',
        n_epochs: 2,
        learning_rate_multiplier: 0.8,
        eval_interval: 100,
        eval_samples: 50,
      },
    },
  },
});

console.log(`RL Job: ${rlJob.id}`);
```

### Retrieving Job Details

Get complete information about a specific fine-tuning job.

```typescript
const jobId = 'ft-AF1WoRqd3aJAHsqc9NY7iL8F';
const job = await client.fineTuning.jobs.retrieve(jobId);

console.log(`Job Status: ${job.status}`);
console.log(`Created: ${new Date(job.created_at * 1000).toISOString()}`);
console.log(`Fine-tuned Model: ${job.fine_tuned_model}`);
console.log(`Trained Tokens: ${job.trained_tokens}`);

if (job.error) {
  console.error(`Error: ${job.error.message}`);
}
```

### Monitoring Job Progress with Events

Track job execution through event logs including training metrics.

```typescript
const jobId = 'ft-AF1WoRqd3aJAHsqc9NY7iL8F';

// Iterate through all events
for await (const event of client.fineTuning.jobs.listEvents(jobId)) {
  console.log(`[${event.level}] ${event.message}`);

  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}

// List with pagination parameters
const eventPage = await client.fineTuning.jobs.listEvents(jobId, {
  limit: 10,
});

console.log(`Retrieved ${eventPage.data.length} events`);
```

### Working with Checkpoints

Access intermediate model checkpoints and their metrics.

```typescript
const jobId = 'ft-AF1WoRqd3aJAHsqc9NY7iL8F';

// Get all checkpoints for a job
for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {
  console.log(`Checkpoint: ${checkpoint.fine_tuned_model_checkpoint}`);
  console.log(`Step: ${checkpoint.step_number}`);
  console.log(`Training Loss: ${checkpoint.metrics.train_loss}`);
  console.log(`Validation Loss: ${checkpoint.metrics.valid_loss}`);
  console.log(`Token Accuracy: ${checkpoint.metrics.valid_mean_token_accuracy}`);
}

// List checkpoints with pagination
const checkpointPage = await client.fineTuning.jobs.checkpoints.list(jobId, {
  limit: 5,
});

// Treat a missing validation loss as Infinity so it is never chosen as "best"
const bestCheckpoint = checkpointPage.data.reduce((best, current) => {
  return (current.metrics.valid_loss ?? Infinity) < (best.metrics.valid_loss ?? Infinity)
    ? current
    : best;
});

console.log(`Best checkpoint by validation loss: ${bestCheckpoint.id}`);
```

### Listing Fine-Tuning Jobs

Retrieve all fine-tuning jobs in your organization with filtering.

```typescript
// List all jobs
for await (const job of client.fineTuning.jobs.list()) {
  console.log(`${job.id}: ${job.status} (Model: ${job.model})`);
}

// List with filters
const jobsPage = await client.fineTuning.jobs.list({
  limit: 20,
});

const runningJobs = jobsPage.data.filter(j => j.status === 'running');
console.log(`Active jobs: ${runningJobs.length}`);

// Filter by metadata
const metadataFilteredJobs = await client.fineTuning.jobs.list({
  metadata: {
    project: 'chatbot-v2',
  },
});
```

### Controlling Job Execution

Pause, resume, and cancel jobs as needed.

```typescript
const jobId = 'ft-AF1WoRqd3aJAHsqc9NY7iL8F';

// Pause a running job
const pausedJob = await client.fineTuning.jobs.pause(jobId);
console.log(`Job paused: ${pausedJob.status}`);

// Wait a bit...
await new Promise(resolve => setTimeout(resolve, 5000));

// Resume the job
const resumedJob = await client.fineTuning.jobs.resume(jobId);
console.log(`Job resumed: ${resumedJob.status}`);

// Cancel a job (running, queued, or paused jobs can be cancelled)
const cancelledJob = await client.fineTuning.jobs.cancel(jobId);
console.log(`Job cancelled: ${cancelledJob.status}`); // status: 'cancelled'
```

---

## Training Data Format

Fine-tuning data must be formatted as JSONL (JSON Lines) files. Different formats are required depending on the training method and model type.

### Supervised Training - Chat Format

For chat-based models like GPT-4 and GPT-3.5-turbo with supervised learning:

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about biology"}, {"role": "assistant", "content": "The answer is..."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Question about physics"}, {"role": "assistant", "content": "The answer is..."}]}
```

### Supervised Training - Completions Format

For base models such as `davinci-002` and `babbage-002` that use the prompt-completion format:

```jsonl
{"prompt": "Write a poem about:", "completion": " nature and its beauty"}
{"prompt": "What is the capital of France?", "completion": " Paris"}
```
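
A completions-format file is used the same way when creating the job; only the base model differs (sketch, with an illustrative file ID):

```typescript
const baseModelJob = await client.fineTuning.jobs.create({
  model: 'davinci-002',
  training_file: 'file-completions-123', // illustrative file ID
});
```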

### DPO Training - Preference Format

For Direct Preference Optimization with preference pairs:

```jsonl
{"messages": [{"role": "user", "content": "Question"}], "preferred": {"content": "Better answer"}, "dispreferred": {"content": "Worse answer"}}
{"messages": [{"role": "user", "content": "Another question"}], "preferred": {"content": "Preferred response"}, "dispreferred": {"content": "Dispreferred response"}}
```

### Reinforcement Learning Format

For RL training with prompts (rewards are assigned via grader):

```jsonl
{"messages": [{"role": "user", "content": "Write a story about adventure"}]}
{"messages": [{"role": "user", "content": "Explain quantum computing"}]}
```

### Data Preparation Best Practices

```typescript
import * as fs from 'fs';
import OpenAI from 'openai';

// Example: Convert CSV training data to JSONL format.
// Note: this naive split(',') does not handle quoted fields; use a CSV library for real data.
function csvToJsonl(csvFilePath: string): void {
  const lines = fs.readFileSync(csvFilePath, 'utf-8').split('\n');
  const [header, ...rows] = lines;
  const headers = header.split(',');

  const jsonlLines = rows
    .filter(row => row.trim())
    .map(row => {
      const values = row.split(',');
      const obj: any = {};
      headers.forEach((h, i) => {
        obj[h.trim()] = values[i]?.trim();
      });
      return JSON.stringify({
        messages: [
          { role: 'user', content: obj.prompt },
          { role: 'assistant', content: obj.completion },
        ],
      });
    })
    .join('\n');

  fs.writeFileSync('training_data.jsonl', jsonlLines);
}

// Validate that every non-empty line parses as JSON
function validateJsonl(jsonlPath: string): boolean {
  const lines = fs.readFileSync(jsonlPath, 'utf-8').split('\n');
  return lines
    .filter(line => line.trim())
    .every(line => {
      try {
        JSON.parse(line);
        return true;
      } catch {
        return false;
      }
    });
}

// Upload training file
async function uploadTrainingFile(client: OpenAI, jsonlPath: string): Promise<string> {
  const fileContent = fs.createReadStream(jsonlPath);
  const response = await client.files.create({
    file: fileContent,
    purpose: 'fine-tune',
  });
  return response.id;
}
```

---

## Hyperparameter Tuning

Fine-tuning outcomes depend heavily on hyperparameter selection. Here's a guide to tuning each parameter.
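
As a starting point, you can leave every hyperparameter on `'auto'` and only pin values once you have a reason to; a minimal sketch using the `create` call shown earlier:

```typescript
const baselineJob = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-abc123',
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 'auto',
        learning_rate_multiplier: 'auto',
        n_epochs: 'auto',
      },
    },
  },
});
```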

### Batch Size

Controls how many examples are processed before updating model weights.

- **Effect**: Larger batch sizes lead to more stable training but slower convergence
- **'auto' (recommended)**: OpenAI automatically selects a value based on the dataset
- **Typical Range**: 1-256
- **Tradeoff**: Larger batches = lower gradient variance, less frequent weight updates
- **Guidance**: Start with `'auto'`, then experiment with 8, 16, or 32 if needed

```typescript
// Conservative tuning with a larger batch size
hyperparameters: {
  batch_size: 32, // more stable, slower
}

// Aggressive tuning with a smaller batch size
hyperparameters: {
  batch_size: 8, // faster convergence, more noise
}
```

### Learning Rate Multiplier

Scales the base learning rate for the fine-tuning process.

- **Effect**: Controls the magnitude of weight updates
- **Typical Range**: 0.02 to 2.0
- **'auto' (recommended)**: Automatically selected based on the model
- **Guidance**:
  - `< 1.0`: More conservative, less overfitting risk
  - `1.0`: Default, balanced training
  - `> 1.0`: More aggressive, faster convergence

```typescript
// Conservative fine-tuning (prefer stability)
hyperparameters: {
  learning_rate_multiplier: 0.5, // half the default rate
}

// Aggressive fine-tuning (prefer speed)
hyperparameters: {
  learning_rate_multiplier: 2.0, // double the default rate
}
```

### Number of Epochs

How many complete passes through the training data to perform.

- **Effect**: More epochs generally improve performance but risk overfitting
- **Typical Range**: 1-10
- **'auto'**: Automatically selected
- **Guidance**:
  - 1 epoch: Fast, may underfit
  - 3-4 epochs: Balanced (recommended)
  - `> 5` epochs: Risk of overfitting on small datasets

```typescript
// Small dataset - few epochs to avoid overfitting
hyperparameters: {
  n_epochs: 1,
}

// Large dataset - more epochs for better convergence
hyperparameters: {
  n_epochs: 4,
}
```

### DPO-Specific: Beta Parameter

The beta value controls how strongly the policy model is penalized for deviating from the reference model.

- **Effect**: Higher beta enforces stronger adherence to preference pairs
- **Typical Range**: 0.05 to 0.5
- **'auto' (recommended)**: Automatically tuned
- **Guidance**:
  - Low beta (0.05): More exploration, less constraint
  - High beta (0.3+): Strict preference alignment

```typescript
dpo: {
  hyperparameters: {
    beta: 0.1, // moderate preference alignment
  },
}
```

### Hyperparameter Tuning Workflow

```typescript
async function tuneFinetuningModel(
  client: OpenAI,
  trainingFile: string,
  validationFile: string,
): Promise<string> {
  const configurations = [
    {
      name: 'conservative',
      batch_size: 32,
      learning_rate_multiplier: 0.5,
      n_epochs: 2,
    },
    {
      name: 'balanced',
      batch_size: 16,
      learning_rate_multiplier: 1.0,
      n_epochs: 3,
    },
    {
      name: 'aggressive',
      batch_size: 8,
      learning_rate_multiplier: 2.0,
      n_epochs: 4,
    },
  ];

  const results: Array<{ config: string; jobId: string; metrics: any }> = [];

  for (const config of configurations) {
    console.log(`Starting ${config.name} configuration...`);

    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
      validation_file: validationFile,
      method: {
        type: 'supervised',
        supervised: {
          hyperparameters: {
            batch_size: config.batch_size,
            learning_rate_multiplier: config.learning_rate_multiplier,
            n_epochs: config.n_epochs,
          },
        },
      },
      suffix: `tune-${config.name}`,
      metadata: {
        experiment: 'hyperparameter-tuning',
        config: config.name,
      },
    });

    results.push({
      config: config.name,
      jobId: job.id,
      metrics: {
        batch_size: config.batch_size,
        learning_rate_multiplier: config.learning_rate_multiplier,
        n_epochs: config.n_epochs,
      },
    });
  }

  // Wait for jobs and compare results
  for (const result of results) {
    let job = await client.fineTuning.jobs.retrieve(result.jobId);

    // Poll until the job reaches a terminal state
    // (this also covers 'validating_files', which every job passes through before 'queued')
    while (job.status !== 'succeeded' && job.status !== 'failed' && job.status !== 'cancelled') {
      await new Promise(resolve => setTimeout(resolve, 30000)); // wait 30s
      job = await client.fineTuning.jobs.retrieve(result.jobId);
    }

    if (job.status === 'succeeded') {
      console.log(`${result.config} job succeeded: ${job.fine_tuned_model}`);
      console.log(`  Trained tokens: ${job.trained_tokens}`);
    }
  }

  return results[0].jobId; // return the first job's ID; pick the winner from the logged metrics
}
```

892

893

---

894

895

## Advanced Usage

896

897

### Using Metadata for Job Organization

898

899

Tag and filter jobs with custom metadata for better organization and tracking.

900

901

```typescript

902

// Create job with metadata

903

const job = await client.fineTuning.jobs.create({

904

model: 'gpt-4o-mini',

905

training_file: 'file-123',

906

metadata: {

907

'project': 'customer-support',

908

'version': '1.0',

909

'team': 'ai-products',

910

'environment': 'production',

911

},

912

});

913

914

// Later, filter jobs by metadata

915

const productionJobs = await client.fineTuning.jobs.list({

916

metadata: {

917

'environment': 'production',

918

},

919

});

920

921

for await (const job of productionJobs) {

922

console.log(`${job.id} - ${job.metadata?.project}`);

923

}

924

```

925

926

### Weights and Biases Integration

927

928

Monitor fine-tuning jobs in real-time using Weights and Biases.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-123',
  integrations: [
    {
      type: 'wandb',
      wandb: {
        project: 'openai-fine-tuning',
        entity: 'my-team',
        name: 'gpt4-mini-v1',
        tags: ['production', 'customer-support'],
      },
    },
  ],
});

console.log(`Monitor at: https://wandb.ai/my-team/openai-fine-tuning`);
```

949

950

### Reproducible Training

951

952

Use seeds for reproducible fine-tuning results.

953

954

```typescript

955

const seed = 42;

956

957

// Job 1

958

const job1 = await client.fineTuning.jobs.create({

959

model: 'gpt-4o-mini',

960

training_file: 'file-123',

961

seed: seed,

962

suffix: 'run1',

963

});

964

965

// Job 2 with same seed produces identical results

966

const job2 = await client.fineTuning.jobs.create({

967

model: 'gpt-4o-mini',

968

training_file: 'file-123',

969

seed: seed,

970

suffix: 'run2',

971

});

972

973

// Both jobs should produce equivalent models

974

```

### Validation File Usage

Provide validation data to monitor generalization during training.

```typescript
const job = await client.fineTuning.jobs.create({
  model: 'gpt-4o-mini',
  training_file: 'file-train-123',
  validation_file: 'file-val-456', // Optional but recommended
  method: {
    type: 'supervised',
    supervised: {
      hyperparameters: {
        batch_size: 16,
        n_epochs: 3,
        learning_rate_multiplier: 1.0,
      },
    },
  },
});

// Monitor validation metrics in events
// (event.data is typed as unknown, so inspect its shape before relying on specific fields)
for await (const event of client.fineTuning.jobs.listEvents(job.id)) {
  if (event.type === 'metrics') {
    console.log('Metrics:', event.data);
  }
}
```

1004

1005

### Checkpoint-Based Model Selection

1006

1007

Use checkpoints to select the best intermediate model rather than the final one.

1008

1009

```typescript

1010

async function findBestCheckpoint(

1011

client: OpenAI,

1012

jobId: string,

1013

): Promise<string> {

1014

let bestCheckpoint: any = null;

1015

let bestValidationLoss = Infinity;

1016

1017

for await (const checkpoint of client.fineTuning.jobs.checkpoints.list(jobId)) {

1018

const validationLoss = checkpoint.metrics.valid_loss || Infinity;

1019

1020

if (validationLoss < bestValidationLoss) {

1021

bestValidationLoss = validationLoss;

1022

bestCheckpoint = checkpoint;

1023

}

1024

}

1025

1026

if (bestCheckpoint) {

1027

console.log(

1028

`Best checkpoint at step ${bestCheckpoint.step_number}: ${bestCheckpoint.fine_tuned_model_checkpoint}`,

1029

);

1030

return bestCheckpoint.fine_tuned_model_checkpoint;

1031

}

1032

1033

throw new Error('No checkpoints found');

1034

}

1035

1036

// Use the checkpoint model

1037

const bestModel = await findBestCheckpoint(client, 'ft-123');

1038

const completion = await client.chat.completions.create({

1039

model: bestModel,

1040

messages: [{ role: 'user', content: 'Hello' }],

1041

});

1042

```

1043

1044

### Long-Running Job Polling

1045

1046

Monitor job completion with exponential backoff polling.

1047

1048

```typescript

1049

async function pollJobUntilComplete(

1050

client: OpenAI,

1051

jobId: string,

1052

maxWaitMs = 7200000, // 2 hours

1053

): Promise<FineTuningJob> {

1054

const startTime = Date.now();

1055

let pollInterval = 5000; // Start at 5 seconds

1056

const maxPollInterval = 60000; // Cap at 60 seconds

1057

1058

while (Date.now() - startTime < maxWaitMs) {

1059

const job = await client.fineTuning.jobs.retrieve(jobId);

1060

1061

if (job.status === 'succeeded' || job.status === 'failed' || job.status === 'cancelled') {

1062

return job;

1063

}

1064

1065

console.log(`Job ${jobId} status: ${job.status}`);

1066

if (job.status === 'running' && job.estimated_finish) {

1067

const remaining = job.estimated_finish * 1000 - Date.now();

1068

console.log(`Estimated time remaining: ${Math.ceil(remaining / 1000)} seconds`);

1069

}

1070

1071

await new Promise(resolve => setTimeout(resolve, pollInterval));

1072

1073

// Exponential backoff

1074

pollInterval = Math.min(pollInterval * 1.5, maxPollInterval);

1075

}

1076

1077

throw new Error(`Job ${jobId} did not complete within ${maxWaitMs}ms`);

1078

}

1079

1080

// Usage

1081

const completedJob = await pollJobUntilComplete(client, 'ft-123');

1082

console.log(`Job completed with status: ${completedJob.status}`);

1083

```

1084

1085

### Bulk Job Monitoring

1086

1087

Track multiple fine-tuning jobs simultaneously.

1088

1089

```typescript

1090

async function monitorMultipleJobs(client: OpenAI, jobIds: string[]): Promise<void> {

1091

const statusMap = new Map<string, string>();

1092

jobIds.forEach(id => statusMap.set(id, 'unknown'));

1093

1094

const updateStatus = async () => {

1095

for (const jobId of jobIds) {

1096

const job = await client.fineTuning.jobs.retrieve(jobId);

1097

statusMap.set(jobId, job.status);

1098

}

1099

};

1100

1101

const allComplete = () =>

1102

Array.from(statusMap.values()).every(

1103

status =>

1104

status === 'succeeded' ||

1105

status === 'failed' ||

1106

status === 'cancelled',

1107

);

1108

1109

while (!allComplete()) {

1110

await updateStatus();

1111

1112

console.clear();

1113

console.log('Fine-Tuning Jobs Status:');

1114

for (const [id, status] of statusMap) {

1115

const symbol =

1116

status === 'succeeded'

1117

? '✓'

1118

: status === 'failed'

1119

? '✗'

1120

: status === 'running'

1121

? '→'

1122

: '-';

1123

console.log(`${symbol} ${id}: ${status}`);

1124

}

1125

1126

if (!allComplete()) {

1127

await new Promise(resolve => setTimeout(resolve, 30000)); // Check every 30s

1128

}

1129

}

1130

1131

console.log('\nAll jobs completed!');

1132

}

1133

1134

// Usage

1135

await monitorMultipleJobs(client, [

1136

'ft-123',

1137

'ft-456',

1138

'ft-789',

1139

]);

1140

```

1141

1142

---

1143

1144

## Supported Models

1145

1146

Fine-tuning is available for the following models:

1147

1148

- `gpt-4o-mini` (Recommended for most use cases)

1149

- `gpt-3.5-turbo`

1150

- `davinci-002`

1151

- `babbage-002`

1152

1153

Model availability and capabilities may change. Check the [OpenAI documentation](https://platform.openai.com/docs/guides/fine-tuning) for the most current list.

---

## Error Handling

```typescript
import OpenAI, { BadRequestError, NotFoundError } from 'openai';

async function createJobWithErrorHandling(
  client: OpenAI,
  trainingFile: string,
) {
  try {
    const job = await client.fineTuning.jobs.create({
      model: 'gpt-4o-mini',
      training_file: trainingFile,
    });
    return job;
  } catch (error) {
    if (error instanceof BadRequestError) {
      console.error('Invalid request:', error.message);
      // Usually validation errors in the training data format
    } else if (error instanceof NotFoundError) {
      console.error('Training file not found:', error.message);
    } else {
      throw error;
    }
  }
}

// Monitor a job for errors
const jobId = 'ft-AF1WoRqd3aJAHsqc9NY7iL8F';
const job = await client.fineTuning.jobs.retrieve(jobId);

if (job.status === 'failed' && job.error) {
  console.error(
    `Job failed: ${job.error.code} - ${job.error.message}`,
  );
  console.error(`Failed parameter: ${job.error.param}`);
}
```

1194

1195

---

1196

1197

## See Also

1198

1199

- [Chat Completions](./chat-completions.md) - Use fine-tuned models for chat

1200

- [Files and Uploads](./files-uploads.md) - Upload training data

1201

- [Embeddings](./embeddings.md) - Fine-tune embedding models

1202