# Pipelines

Pipelines provide a high-level, task-specific API for running machine learning models. The pipeline interface is the easiest way to use transformers.js for most ML tasks, automatically handling model loading, preprocessing, and postprocessing.

## Capabilities

### Main Pipeline Function

Creates a pipeline instance for a specific machine learning task with automatic model selection and preprocessing.

```javascript { .api }
/**
 * Create a pipeline for a specific ML task
 * @param task - The task identifier (see supported tasks below)
 * @param model - Optional model name/path (uses default if not specified)
 * @param options - Configuration options for the pipeline
 * @returns Promise that resolves to a Pipeline instance
 */
async function pipeline(
  task: string,
  model?: string,
  options?: PipelineOptions
): Promise<Pipeline>;

interface PipelineOptions {
  /** Whether to use quantized version of the model (default: true) */
  quantized?: boolean;
  /** Callback function to track model download progress */
  progress_callback?: (progress: any) => void;
  /** Custom model configuration */
  config?: any;
  /** Directory to cache downloaded models */
  cache_dir?: string;
  /** Only use local files, don't download from remote */
  local_files_only?: boolean;
  /** Model revision/branch to use (default: 'main') */
  revision?: string;
  /** Specific model file name to use */
  model_file_name?: string;
}
```
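The `progress_callback` option receives status events as model files download. The helper below is a hedged sketch for logging them: it assumes events shaped like `{ status, file, progress }` with `progress` as a percentage, which is worth verifying against the transformers.js version you use.

```javascript
// Format a download-progress event into a one-line log message.
// Assumes events shaped like { status, file, progress } — verify this
// shape against the transformers.js version you use.
function formatProgress(event) {
  if (event.status === "progress" && typeof event.progress === "number") {
    return `${event.file}: ${event.progress.toFixed(1)}%`;
  }
  return `${event.status}${event.file ? `: ${event.file}` : ""}`;
}

// const pipe = await pipeline("sentiment-analysis", undefined, {
//   progress_callback: (e) => console.log(formatProgress(e)),
// });
```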

**Usage Examples:**

```javascript
import { pipeline } from "@xenova/transformers";

// Basic usage with default model
const classifier = await pipeline("sentiment-analysis");
const result = await classifier("I love this library!");
// Output: [{ label: 'POSITIVE', score: 0.999 }]

// Custom model specification
const translator = await pipeline("translation", "Xenova/opus-mt-en-de");
const translation = await translator("Hello world");

// With custom options
const generator = await pipeline("text-generation", "Xenova/gpt2", {
  quantized: false,
  progress_callback: (progress) => console.log(progress),
});
```

### Text Processing Tasks

#### Text Classification

Classify text into predefined categories (sentiment analysis, topic classification, etc.).

```javascript { .api }
interface TextClassificationPipeline {
  (
    texts: string | string[],
    options?: {
      top_k?: number;
      function_to_apply?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"text-classification"`, `"sentiment-analysis"`

**Usage Example:**

```javascript
const classifier = await pipeline("sentiment-analysis");
const results = await classifier(["I love this!", "This is terrible"]);
// Results: [
//   { label: 'POSITIVE', score: 0.999 },
//   { label: 'NEGATIVE', score: 0.998 }
// ]
```

#### Token Classification

Classify individual tokens (Named Entity Recognition, Part-of-Speech tagging).

```javascript { .api }
interface TokenClassificationPipeline {
  (
    texts: string | string[],
    options?: {
      aggregation_strategy?: string;
      ignore_labels?: string[];
    }
  ): Promise<Array<{
    entity: string;
    score: number;
    index: number;
    word: string;
    start: number;
    end: number;
  }>>;
}
```

**Supported Task Names:** `"token-classification"`, `"ner"`
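Without aggregation, multi-token entities come back as separate `B-`/`I-` tagged items. A minimal sketch of merging consecutive tokens of the same entity type into spans, written against the output shape above (this is an illustrative helper, not part of the library API):

```javascript
// Merge consecutive B-/I- tagged tokens of the same type into entity spans,
// averaging their scores. Operates on { entity, word, start, end, score } items.
function groupEntities(tokens) {
  const groups = [];
  for (const t of tokens) {
    const type = t.entity.replace(/^[BI]-/, "");
    const last = groups[groups.length - 1];
    if (last && last.type === type && t.entity.startsWith("I-")) {
      last.word += ` ${t.word}`;
      last.end = t.end;
      last.scores.push(t.score);
    } else {
      groups.push({ type, word: t.word, start: t.start, end: t.end, scores: [t.score] });
    }
  }
  return groups.map(({ scores, ...g }) => ({
    ...g,
    score: scores.reduce((a, b) => a + b, 0) / scores.length,
  }));
}
```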

#### Question Answering

Extract answers from context text based on questions.

```javascript { .api }
interface QuestionAnsweringPipeline {
  (
    question: string,
    context: string,
    options?: {
      top_k?: number;
    }
  ): Promise<{
    answer: string;
    score: number;
    start: number;
    end: number;
  }>;
}
```

**Supported Task Names:** `"question-answering"`
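The `start` and `end` fields are character offsets into the context string, which makes it easy to highlight the extracted span. A small sketch using plain string slicing (no library API involved):

```javascript
// Wrap the answer span, given by character offsets into the context,
// in markers for display.
function highlightAnswer(context, start, end, marker = "**") {
  return context.slice(0, start) + marker + context.slice(start, end) + marker + context.slice(end);
}

// const { start, end } = await qa("Who wrote the book?", context);
// console.log(highlightAnswer(context, start, end));
```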

#### Fill Mask

Fill masked tokens in text.

```javascript { .api }
interface FillMaskPipeline {
  (
    texts: string | string[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    score: number;
    token: number;
    token_str: string;
    sequence: string;
  }>>;
}
```

**Supported Task Names:** `"fill-mask"`

#### Text Generation

Generate text continuations from input prompts.

```javascript { .api }
interface TextGenerationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
      top_k?: number;
      top_p?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}
```

**Supported Task Names:** `"text-generation"`
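With `do_sample: true`, the `top_p` option restricts sampling to the smallest set of tokens whose cumulative probability reaches `p` (nucleus sampling). The pipeline applies this internally; the sketch below merely illustrates the filtering step on a plain probability array:

```javascript
// Illustrative nucleus (top-p) filter: keep the smallest set of token
// indices whose cumulative probability reaches p, highest-probability first.
function topPFilter(probs, p) {
  const order = probs
    .map((prob, i) => [prob, i])
    .sort((a, b) => b[0] - a[0]);
  const kept = [];
  let cumulative = 0;
  for (const [prob, i] of order) {
    kept.push(i);
    cumulative += prob;
    if (cumulative >= p) break;
  }
  return kept.sort((a, b) => a - b);
}
```

Lower `p` keeps fewer candidate tokens, making generations more focused; `temperature` instead reshapes the distribution before sampling.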

#### Text-to-Text Generation

Generate text from text input (includes summarization, translation).

```javascript { .api }
interface Text2TextGenerationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}

interface SummarizationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      min_new_tokens?: number;
    }
  ): Promise<Array<{
    summary_text: string;
  }>>;
}

interface TranslationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
    }
  ): Promise<Array<{
    translation_text: string;
  }>>;
}
```

**Supported Task Names:** `"text2text-generation"`, `"summarization"`, `"translation"`

#### Zero-Shot Classification

Classify text without predefined training examples.

```javascript { .api }
interface ZeroShotClassificationPipeline {
  (
    texts: string | string[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
      multi_label?: boolean;
    }
  ): Promise<{
    sequence: string;
    labels: string[];
    scores: number[];
  }>;
}
```

**Supported Task Names:** `"zero-shot-classification"`
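`labels` and `scores` are parallel arrays, typically sorted by score in descending order, so the best label is at index 0. If you prefer label/score pairs, zipping them is a one-liner over the output shape above:

```javascript
// Zip the parallel labels/scores arrays of a zero-shot result
// into an array of { label, score } pairs.
function zipLabels(result) {
  return result.labels.map((label, i) => ({ label, score: result.scores[i] }));
}
```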

#### Feature Extraction

Extract embeddings from text for similarity tasks.

```javascript { .api }
interface FeatureExtractionPipeline {
  (
    texts: string | string[],
    options?: {
      pooling?: string;
      normalize?: boolean;
      quantize?: boolean;
      precision?: string;
    }
  ): Promise<Tensor>;
}
```

**Supported Task Names:** `"feature-extraction"`, `"embeddings"`
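A common use of feature extraction is semantic similarity: embed two texts (typically with `pooling: 'mean'` and `normalize: true`) and compare the vectors. The returned `Tensor` exposes its values via a `.data` array; the cosine helper below is plain arithmetic over two equal-length numeric arrays:

```javascript
// Cosine similarity between two equal-length numeric vectors
// (e.g. the .data arrays of two pooled embeddings). For vectors that
// are already normalized, this reduces to a dot product.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// const extractor = await pipeline("feature-extraction");
// const e1 = await extractor("I like cats", { pooling: "mean", normalize: true });
// const e2 = await extractor("I love felines", { pooling: "mean", normalize: true });
// console.log(cosineSimilarity(e1.data, e2.data));
```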

### Vision Processing Tasks

#### Image Classification

Classify images into predefined categories.

```javascript { .api }
interface ImageClassificationPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"image-classification"`

#### Object Detection

Detect and locate objects in images.

```javascript { .api }
interface ObjectDetectionPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      threshold?: number;
      percentage?: boolean;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    box: {
      xmin: number;
      ymin: number;
      xmax: number;
      ymax: number;
    };
  }>>;
}
```

**Supported Task Names:** `"object-detection"`
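With `percentage: true`, box coordinates come back as fractions of the image size rather than pixels; converting back to pixel coordinates is a simple scale. A sketch over the `box` shape above (the 0–1 fraction convention is an assumption to verify against your actual output):

```javascript
// Scale a fractional (0–1) bounding box back to pixel coordinates
// for a given image width and height.
function boxToPixels(box, width, height) {
  return {
    xmin: Math.round(box.xmin * width),
    ymin: Math.round(box.ymin * height),
    xmax: Math.round(box.xmax * width),
    ymax: Math.round(box.ymax * height),
  };
}
```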

#### Zero-Shot Object Detection

Detect objects in images using text descriptions.

```javascript { .api }
interface ZeroShotObjectDetectionPipeline {
  (
    images: ImageInput | ImageInput[],
    candidate_labels: string[],
    options?: {
      threshold?: number;
      percentage?: boolean;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    box: {
      xmin: number;
      ymin: number;
      xmax: number;
      ymax: number;
    };
  }>>;
}
```

**Supported Task Names:** `"zero-shot-object-detection"`

#### Image Segmentation

Segment objects and regions in images.

```javascript { .api }
interface ImageSegmentationPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      threshold?: number;
      mask_threshold?: number;
      overlap_mask_area_threshold?: number;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    mask: RawImage;
  }>>;
}
```

**Supported Task Names:** `"image-segmentation"`

#### Zero-Shot Image Classification

Classify images using text descriptions.

```javascript { .api }
interface ZeroShotImageClassificationPipeline {
  (
    images: ImageInput | ImageInput[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"zero-shot-image-classification"`

#### Image-to-Text

Generate text descriptions from images.

```javascript { .api }
interface ImageToTextPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}
```

**Supported Task Names:** `"image-to-text"`

#### Image-to-Image

Transform images (super-resolution, style transfer).

```javascript { .api }
interface ImageToImagePipeline {
  (
    images: ImageInput | ImageInput[]
  ): Promise<RawImage[]>;
}
```

**Supported Task Names:** `"image-to-image"`

#### Depth Estimation

Estimate depth maps from images.

```javascript { .api }
interface DepthEstimationPipeline {
  (
    images: ImageInput | ImageInput[]
  ): Promise<Array<{
    predicted_depth: Tensor;
    depth: RawImage;
  }>>;
}
```

**Supported Task Names:** `"depth-estimation"`

#### Image Feature Extraction

Extract embeddings from images.

```javascript { .api }
interface ImageFeatureExtractionPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      pool?: boolean;
      normalize?: boolean;
      quantize?: boolean;
      precision?: string;
    }
  ): Promise<Tensor>;
}
```

**Supported Task Names:** `"image-feature-extraction"`

### Audio Processing Tasks

#### Audio Classification

Classify audio content into categories.

```javascript { .api }
interface AudioClassificationPipeline {
  (
    audio: AudioInput | AudioInput[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"audio-classification"`

#### Zero-Shot Audio Classification

Classify audio using text descriptions.

```javascript { .api }
interface ZeroShotAudioClassificationPipeline {
  (
    audio: AudioInput | AudioInput[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"zero-shot-audio-classification"`

#### Automatic Speech Recognition

Convert speech to text.

```javascript { .api }
interface AutomaticSpeechRecognitionPipeline {
  (
    audio: AudioInput | AudioInput[],
    options?: {
      top_k?: number;
      hotwords?: string;
      language?: string;
      task?: string;
      return_timestamps?: boolean | string;
      chunk_length_s?: number;
      stride_length_s?: number;
    }
  ): Promise<{
    text: string;
    chunks?: Array<{
      text: string;
      timestamp: [number, number];
    }>;
  }>;
}
```

**Supported Task Names:** `"automatic-speech-recognition"`, `"asr"`
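With `return_timestamps: true`, each chunk carries a `[start, end]` timestamp in seconds, which maps directly onto subtitle formats. A minimal formatter from that chunk shape to SRT cues (illustrative helper, not part of the library):

```javascript
// Convert ASR chunks ({ text, timestamp: [start, end] } in seconds)
// into SRT subtitle cues.
function chunksToSrt(chunks) {
  const stamp = (seconds) => {
    const ms = Math.round(seconds * 1000);
    const pad = (n, w) => String(n).padStart(w, "0");
    return `${pad(Math.floor(ms / 3600000), 2)}:${pad(Math.floor(ms / 60000) % 60, 2)}:${pad(Math.floor(ms / 1000) % 60, 2)},${pad(ms % 1000, 3)}`;
  };
  return chunks
    .map((c, i) => `${i + 1}\n${stamp(c.timestamp[0])} --> ${stamp(c.timestamp[1])}\n${c.text.trim()}`)
    .join("\n\n");
}
```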

#### Text-to-Audio

Generate audio from text.

```javascript { .api }
interface TextToAudioPipeline {
  (
    texts: string | string[],
    options?: {
      speaker_embeddings?: Tensor;
    }
  ): Promise<{
    audio: Float32Array;
    sampling_rate: number;
  }>;
}
```

**Supported Task Names:** `"text-to-audio"`, `"text-to-speech"`
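The raw `Float32Array` samples can be wrapped in a standard 16-bit PCM WAV container for saving or playback. A minimal mono encoder using plain `DataView` arithmetic (a sketch of the standard WAV layout, not a library function):

```javascript
// Encode mono Float32 samples (values in [-1, 1]) as a 16-bit PCM WAV file.
function encodeWav(samples, samplingRate) {
  const buffer = new ArrayBuffer(44 + samples.length * 2);
  const view = new DataView(buffer);
  const writeString = (offset, s) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };
  writeString(0, "RIFF");
  view.setUint32(4, 36 + samples.length * 2, true); // RIFF chunk size
  writeString(8, "WAVE");
  writeString(12, "fmt ");
  view.setUint32(16, 16, true);                 // fmt chunk size
  view.setUint16(20, 1, true);                  // PCM format
  view.setUint16(22, 1, true);                  // mono
  view.setUint32(24, samplingRate, true);
  view.setUint32(28, samplingRate * 2, true);   // byte rate
  view.setUint16(32, 2, true);                  // block align
  view.setUint16(34, 16, true);                 // bits per sample
  writeString(36, "data");
  view.setUint32(40, samples.length * 2, true); // data chunk size
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buffer);
}

// const { audio, sampling_rate } = await tts("Hello!");
// fs.writeFileSync("out.wav", encodeWav(audio, sampling_rate));
```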

### Multimodal Tasks

#### Document Question Answering

Answer questions about document images.

```javascript { .api }
interface DocumentQuestionAnsweringPipeline {
  (
    image: ImageInput,
    question: string,
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    answer: string;
    score: number;
  }>>;
}
```

**Supported Task Names:** `"document-question-answering"`

## Types

```javascript { .api }
type ImageInput = string | RawImage | URL;
type AudioInput = string | URL | Float32Array | Float64Array;

interface Pipeline {
  (input: any, options?: any): Promise<any>;
  dispose(): Promise<void>;
}
```

## Supported Tasks Summary

| Task | Task Names | Input Type | Output Type |
|------|------------|-----------|-------------|
| Text Classification | `text-classification`, `sentiment-analysis` | Text | Labels + Scores |
| Token Classification | `token-classification`, `ner` | Text | Token Labels |
| Question Answering | `question-answering` | Question + Context | Answer + Score |
| Fill Mask | `fill-mask` | Masked Text | Token Predictions |
| Text Generation | `text-generation` | Text Prompt | Generated Text |
| Summarization | `summarization` | Text | Summary |
| Translation | `translation` | Text | Translated Text |
| Zero-Shot Classification | `zero-shot-classification` | Text + Labels | Classification |
| Feature Extraction | `feature-extraction`, `embeddings` | Text | Embeddings |
| Image Classification | `image-classification` | Image | Labels + Scores |
| Object Detection | `object-detection` | Image | Objects + Boxes |
| Zero-Shot Object Detection | `zero-shot-object-detection` | Image + Labels | Objects + Boxes |
| Image Segmentation | `image-segmentation` | Image | Segments + Masks |
| Zero-Shot Image Classification | `zero-shot-image-classification` | Image + Labels | Classification |
| Image-to-Text | `image-to-text` | Image | Generated Text |
| Image-to-Image | `image-to-image` | Image | Transformed Image |
| Depth Estimation | `depth-estimation` | Image | Depth Map |
| Image Feature Extraction | `image-feature-extraction` | Image | Embeddings |
| Audio Classification | `audio-classification` | Audio | Labels + Scores |
| Zero-Shot Audio Classification | `zero-shot-audio-classification` | Audio + Labels | Classification |
| Speech Recognition | `automatic-speech-recognition`, `asr` | Audio | Transcribed Text |
| Text-to-Audio | `text-to-audio`, `text-to-speech` | Text | Audio Waveform |
| Document QA | `document-question-answering` | Document + Question | Answer |