# Transformers.js

Transformers.js is a JavaScript/TypeScript library that brings state-of-the-art machine learning models to web browsers and Node.js environments without requiring server-side processing. It provides functionally equivalent APIs to Hugging Face's Python transformers library, enabling developers to run pretrained models for natural language processing, computer vision, audio processing, and multimodal tasks directly in client-side applications.

## Package Information

- **Package Name**: @xenova/transformers
- **Package Type**: npm
- **Language**: JavaScript/TypeScript
- **Installation**: `npm install @xenova/transformers`

## Core Imports

```javascript
import { pipeline, AutoTokenizer, AutoModel, AutoProcessor, env } from "@xenova/transformers";
```

For CommonJS environments:

```javascript
const { pipeline, AutoTokenizer, AutoModel, AutoProcessor, env } = require("@xenova/transformers");
```

## Basic Usage

```javascript
import { pipeline } from "@xenova/transformers";

// Create a text classification pipeline
const classifier = await pipeline("sentiment-analysis");

// Classify text
const result = await classifier("I love this library!");
// Output: [{ label: 'POSITIVE', score: 0.999 }]

// Create a text generation pipeline
const generator = await pipeline("text-generation", "Xenova/gpt2");

// Generate text
const output = await generator("The future of AI is", {
  max_new_tokens: 50,
  do_sample: true,
  temperature: 0.7,
});
```

## Architecture

Transformers.js is built around several key components:

- **Pipeline API**: High-level interface (`pipeline()`) providing task-specific implementations for common ML operations
- **Auto Classes**: Automatic model, tokenizer, and processor selection (`AutoModel`, `AutoTokenizer`, `AutoProcessor`, `AutoConfig`, etc.)
- **Processors**: Input preprocessing for images, audio, and multimodal data
- **Specific Classes**: Direct access to model implementations (BERT, GPT-2, T5, etc.)
- **Utilities**: Tensor operations, image/audio processing, and mathematical functions
- **Environment Configuration**: Global settings for model caching, backends, and runtime options

The library leverages ONNX Runtime for efficient model execution and supports both browser and Node.js environments.

## Capabilities

### Pipeline Interface

High-level API providing ready-to-use implementations for common machine learning tasks including text classification, text generation, translation, image classification, object detection, and audio processing.

```javascript { .api }
function pipeline(
  task: string,
  model?: string,
  options?: PipelineOptions
): Promise<Pipeline>;

interface PipelineOptions {
  quantized?: boolean;
  progress_callback?: (progress: any) => void;
  config?: any;
  cache_dir?: string;
  local_files_only?: boolean;
  revision?: string;
  model_file_name?: string;
}
```

[Pipelines](./pipelines.md)

### Models and Tokenizers

Auto classes for automatic model and tokenizer selection, plus direct access to specific model implementations for fine-grained control over model loading and inference.

```javascript { .api }
class AutoModel {
  static async from_pretrained(
    model_name_or_path: string,
    options?: ModelOptions
  ): Promise<PreTrainedModel>;
}

class AutoTokenizer {
  static async from_pretrained(
    model_name_or_path: string,
    options?: TokenizerOptions
  ): Promise<PreTrainedTokenizer>;
}

interface ModelOptions {
  quantized?: boolean;
  progress_callback?: (progress: any) => void;
  config?: any;
  cache_dir?: string;
  local_files_only?: boolean;
  revision?: string;
}

interface TokenizerOptions {
  quantized?: boolean;
  progress_callback?: (progress: any) => void;
  config?: any;
  cache_dir?: string;
  local_files_only?: boolean;
  revision?: string;
}
```

[Models and Tokenizers](./models-tokenizers.md)

### Processors

Input preprocessing classes for non-textual data including images, audio, and multimodal inputs. Processors handle format conversion, normalization, and feature extraction.

```javascript { .api }
class AutoProcessor {
  static async from_pretrained(
    model_name_or_path: string,
    options?: ProcessorOptions
  ): Promise<Processor>;
}

interface ProcessorOptions {
  quantized?: boolean;
  progress_callback?: (progress: any) => void;
  config?: any;
  cache_dir?: string;
  local_files_only?: boolean;
  revision?: string;
}
```

[Processors](./processors.md)

### Utilities

Comprehensive utility functions for tensor operations, image processing, audio processing, and mathematical computations that support the core ML functionality.

```javascript { .api }
class Tensor {
  constructor(type: string, data: any, dims: number[]);

  // Core tensor operations
  mean(dim?: number | number[]): Tensor;
  permute(dims: number[]): Tensor;
  squeeze(dim?: number): Tensor;
  unsqueeze(dim: number): Tensor;
}

class RawImage {
  static async read(input: string | URL | Buffer): Promise<RawImage>;
  static async fromURL(url: string | URL): Promise<RawImage>;

  resize(width: number, height: number): RawImage;
  crop(left: number, top: number, width: number, height: number): RawImage;
}
```

[Utilities](./utilities.md)

### Environment Configuration

Global environment settings for controlling model caching, backend selection, local file usage, and runtime behavior.

```javascript { .api }
const env: {
  /** Directory for caching downloaded models and tokenizers */
  cacheDir: string;

  /** Only use local files, disable remote downloads */
  localFilesOnly: boolean;

  /** Allow downloading models from remote sources */
  allowRemoteModels: boolean;

  /** Base URL for remote model downloads */
  remoteURL: string;

  /** Template for remote model paths */
  remotePathTemplate: string;

  /** Allow loading models from local filesystem */
  allowLocalModels: boolean;

  /** Base URL for local model loading */
  localURL: string;

  /** Template for local model paths */
  localPathTemplate: string;

  /** ONNX Runtime backend configuration */
  backends: {
    onnx: {
      wasm: {
        /** Path to ONNX Runtime WASM files */
        wasmPaths: string;

        /** Number of threads for WASM execution (default: 1) */
        numThreads: number;

        /** Enable SIMD optimizations (default: true) */
        simd: boolean;

        /** Use multithreaded WASM (default: false) */
        multiThread: boolean;
      };
    };
  };

  /** Current library version */
  readonly VERSION: string;

  /** Whether web cache is available (browser only) */
  readonly WEB_CACHE_AVAILABLE: boolean;

  /** Whether file system access is available */
  readonly FS_AVAILABLE: boolean;

  /** Whether running in local environment with file system */
  readonly RUNNING_LOCALLY: boolean;
};
```

**Configuration Examples:**

```javascript
import { env } from "@xenova/transformers";

// Disable remote model downloads
env.allowRemoteModels = false;

// Set custom cache directory
env.cacheDir = "/path/to/custom/cache";

// Use only local files
env.localFilesOnly = true;

// Configure WASM backend
env.backends.onnx.wasm.numThreads = 4;
env.backends.onnx.wasm.simd = true;

// Set custom remote URL for model downloads
env.remoteURL = "https://custom-model-server.com/";
env.remotePathTemplate = "models/{model}/";
```

The `env` object provides comprehensive configuration for the library's runtime behavior, including model storage locations, backend preferences, performance settings, and environment detection.

## Types

```javascript { .api }
interface Pipeline {
  (input: any, options?: any): Promise<any>;
  dispose(): Promise<void>;
}

interface PreTrainedModel {
  config: any;
  forward(model_inputs: any): Promise<any>;
  dispose(): Promise<void>;
}

interface PreTrainedTokenizer {
  encode(text: string, options?: any): any;
  decode(token_ids: number[] | Tensor, options?: any): string;
  batch_decode(sequences: number[][] | Tensor[], options?: any): string[];
  dispose(): Promise<void>;
}
```