
# Model Conversion

Convert models from popular frameworks (Transformers, Fairseq, OpenNMT, etc.) to CTranslate2 format for optimized inference. The converters support quantization, copying of auxiliary files, and framework-specific options for performance and compatibility.

## Capabilities

### Transformers Converter

Convert Hugging Face Transformers models to CTranslate2 format. Supports most popular model architectures, including BERT, GPT-2, T5, BART, and more.

```python { .api }
class TransformersConverter:
    def __init__(self, model_name_or_path: str, activation_scales: str = None,
                 copy_files: list = None, load_as_float16: bool = False,
                 revision: str = None, low_cpu_mem_usage: bool = False,
                 trust_remote_code: bool = False):
        """
        Initialize converter for Hugging Face Transformers models.

        Args:
            model_name_or_path (str): Model name on the Hub or a local path
            activation_scales (str): Path to activation scales for SmoothQuant
            copy_files (list): Additional files to copy to the output directory
            load_as_float16 (bool): Load model weights in float16
            revision (str): Model revision/branch to use
            low_cpu_mem_usage (bool): Enable low CPU memory loading
            trust_remote_code (bool): Allow custom code execution
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert the model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for the converted model
            vmap (str): Path to a vocabulary mapping file
            quantization (str): Quantization type ("int8", "int8_float16", "int16", "float16")
            force (bool): Overwrite the output directory if it exists

        Returns:
            str: Path to the converted model directory
        """

    def convert_from_args(self, args) -> str:
        """
        Convert the model using parsed command-line arguments.

        Args:
            args: Parsed arguments object

        Returns:
            str: Path to the converted model directory
        """

    @staticmethod
    def declare_arguments(parser):
        """
        Add converter-specific arguments to the argument parser.

        Args:
            parser: ArgumentParser instance to modify
        """
```

### Fairseq Converter

Convert Fairseq models to CTranslate2 format. Supports various Fairseq model architectures.

```python { .api }
class FairseqConverter:
    def __init__(self, model_path: str, data_dir: str = None):
        """
        Initialize converter for Fairseq models.

        Args:
            model_path (str): Path to the Fairseq model checkpoint
            data_dir (str): Path to the data directory with vocabularies
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert the Fairseq model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for the converted model
            vmap (str): Path to a vocabulary mapping file
            quantization (str): Quantization type
            force (bool): Overwrite the output directory if it exists

        Returns:
            str: Path to the converted model directory
        """
```

### OpenNMT Converters

Convert OpenNMT-py and OpenNMT-tf models to CTranslate2 format.

```python { .api }
class OpenNMTPyConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for OpenNMT-py models.

        Args:
            model_path (str): Path to the OpenNMT-py model file
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OpenNMT-py model to CTranslate2 format."""


class OpenNMTTFConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for OpenNMT-tf models.

        Args:
            model_path (str): Path to the OpenNMT-tf model checkpoint
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OpenNMT-tf model to CTranslate2 format."""
```

### Marian Converter

Convert Marian NMT models to CTranslate2 format.

```python { .api }
class MarianConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for Marian models.

        Args:
            model_path (str): Path to the Marian model directory
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the Marian model to CTranslate2 format."""
```

### OPUS-MT Converter

Convert OPUS-MT models to CTranslate2 format.

```python { .api }
class OpusMTConverter:
    def __init__(self, model_name: str):
        """
        Initialize converter for OPUS-MT models.

        Args:
            model_name (str): OPUS-MT model name from the Hugging Face Hub
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OPUS-MT model to CTranslate2 format."""
```

### OpenAI GPT-2 Converter

Convert OpenAI GPT-2 models to CTranslate2 format.

```python { .api }
class OpenAIGPT2Converter:
    def __init__(self, model_name: str = "124M"):
        """
        Initialize converter for OpenAI GPT-2 models.

        Args:
            model_name (str): GPT-2 model size ("124M", "355M", "774M", "1558M")
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the GPT-2 model to CTranslate2 format."""
```

### Base Converter Class

All converters inherit from this base class, which provides the common conversion interface.

```python { .api }
class Converter:
    """Abstract base class for model converters."""

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert the model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for the converted model
            vmap (str): Path to a vocabulary mapping file
            quantization (str): Quantization type
            force (bool): Overwrite the output directory if it exists

        Returns:
            str: Path to the converted model directory
        """

    def convert_from_args(self, args) -> str:
        """
        Convert the model using parsed command-line arguments.

        Args:
            args: Parsed arguments object with conversion parameters

        Returns:
            str: Path to the converted model directory
        """

    @staticmethod
    def declare_arguments(parser):
        """
        Add common converter arguments to the argument parser.

        Args:
            parser: ArgumentParser instance to modify
        """
```
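The pairing of `declare_arguments()` and `convert_from_args()` suggests a standard `argparse` flow: a script registers the converter's flags on a parser, parses the command line, and hands the resulting namespace back to the converter. The standalone sketch below illustrates that flow; the flag names mirror the `convert()` parameters documented above and are assumptions for illustration, not the package's exact argument definitions.

```python
import argparse

# Sketch of the shared converter CLI flow: register flags mirroring
# convert()'s parameters, then read them back from the parsed namespace
# (the role convert_from_args() plays in the real converters).
parser = argparse.ArgumentParser(description="Model conversion (sketch)")
parser.add_argument("--output_dir", required=True, help="Output directory")
parser.add_argument("--vmap", default=None, help="Vocabulary mapping file")
parser.add_argument("--quantization", default=None, help="Quantization type")
parser.add_argument("--force", action="store_true", help="Overwrite output")

args = parser.parse_args(["--output_dir", "ct2_model", "--quantization", "int8"])
print(args.output_dir, args.quantization, args.force)  # ct2_model int8 False
```

Subclasses extend this shared parser with their own flags (such as `--model` for the Transformers converter) via their `declare_arguments()` override.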

## Console Scripts

CTranslate2 provides command-line tools for model conversion:

```python { .api }
# Available console scripts (entry points):
# ct2-transformers-converter  - Convert Transformers models
# ct2-fairseq-converter       - Convert Fairseq models
# ct2-opennmt-py-converter    - Convert OpenNMT-py models
# ct2-opennmt-tf-converter    - Convert OpenNMT-tf models
# ct2-marian-converter        - Convert Marian models
# ct2-opus-mt-converter       - Convert OPUS-MT models
# ct2-openai-gpt2-converter   - Convert OpenAI GPT-2 models
```

## Conversion Utilities

Helper functions for model conversion and optimization.

```python { .api }
def fuse_linear(spec, layers: list):
    """
    Fuse multiple linear layers for optimization.

    Args:
        spec: Model specification object
        layers (list): List of linear layers to fuse
    """

def fuse_linear_prequant(spec, layers: list, axis: int):
    """
    Fuse pre-quantized linear layers.

    Args:
        spec: Model specification object
        layers (list): List of pre-quantized linear layers
        axis (int): Axis along which to fuse
    """

def permute_for_sliced_rotary(weight, num_heads: int, rotary_dim: int = None):
    """
    Permute weights for rotary position embeddings.

    Args:
        weight: Weight tensor to permute
        num_heads (int): Number of attention heads
        rotary_dim (int): Rotary embedding dimension

    Returns:
        Permuted weight tensor
    """

def smooth_activation(layer_norm, linear, activation_scales):
    """
    Apply the SmoothQuant activation smoothing technique.

    Args:
        layer_norm: Layer normalization module
        linear: Linear layer module
        activation_scales: Activation scaling factors
    """
```
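To see why `fuse_linear()` pays off, here is a minimal pure-Python sketch of the underlying idea (illustrative only, not CTranslate2's implementation): projections that read the same input, such as the Q/K/V attention projections, can be stacked into one weight matrix so that a single matrix product replaces several smaller ones.

```python
def matvec(w, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# Three 2x2 projections that all consume the same input vector x.
q = [[1.0, 2.0], [3.0, 4.0]]
k = [[5.0, 6.0], [7.0, 8.0]]
v = [[9.0, 1.0], [2.0, 3.0]]
x = [1.0, 1.0]

# Stacking the rows yields one 6x2 matrix: one multiply instead of three.
fused = q + k + v
assert matvec(fused, x) == matvec(q, x) + matvec(k, x) + matvec(v, x)
```

One large GEMM generally uses the hardware better than several small ones, which is why the converters fuse such layers when writing the CTranslate2 model.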

## Usage Examples

### Converting Transformers Models

```python
import ctranslate2

# Convert a Hugging Face model
converter = ctranslate2.converters.TransformersConverter("microsoft/DialoGPT-medium")
converter.convert("ct2_model", quantization="int8")

# Convert with additional options
converter = ctranslate2.converters.TransformersConverter(
    "t5-small",
    copy_files=["config.json", "tokenizer.json"],
    load_as_float16=True,
)
converter.convert("t5_ct2", quantization="int8_float16")

# Convert a local model
converter = ctranslate2.converters.TransformersConverter("/path/to/local/model")
converter.convert("output_dir", force=True)
```

### Converting Other Frameworks

```python
import ctranslate2

# Convert a Fairseq model
fairseq_converter = ctranslate2.converters.FairseqConverter(
    "checkpoint_best.pt",
    data_dir="data-bin/wmt14_en_de",
)
fairseq_converter.convert("fairseq_ct2")

# Convert an OpenNMT-py model
opennmt_converter = ctranslate2.converters.OpenNMTPyConverter("model.pt")
opennmt_converter.convert("opennmt_ct2")

# Convert an OPUS-MT model
opus_converter = ctranslate2.converters.OpusMTConverter("Helsinki-NLP/opus-mt-en-de")
opus_converter.convert("opus_ct2")
```

### Using Command Line Tools

```bash
# Convert a Transformers model
ct2-transformers-converter --model microsoft/DialoGPT-medium --output_dir ct2_model --quantization int8

# Convert with custom options
ct2-transformers-converter \
    --model t5-small \
    --output_dir t5_ct2 \
    --quantization int8_float16 \
    --copy_files config.json tokenizer.json \
    --load_as_float16

# Convert a Fairseq model
ct2-fairseq-converter \
    --model_path checkpoint_best.pt \
    --data_dir data-bin/wmt14_en_de \
    --output_dir fairseq_ct2 \
    --quantization int8
```

### Quantization Options

```python
import ctranslate2

# Available quantization types:
quantization_options = [
    "int8",          # 8-bit integer quantization
    "int8_float16",  # 8-bit weights, 16-bit activations
    "int16",         # 16-bit integer quantization
    "float16",       # 16-bit floating point
    "int8_float32",  # 8-bit weights, 32-bit activations
    "int4",          # 4-bit integer quantization (experimental)
]

# Example with different quantization levels
converter = ctranslate2.converters.TransformersConverter("gpt2")

# Fastest inference, smaller model
converter.convert("gpt2_int8", quantization="int8")

# Balanced speed/quality
converter.convert("gpt2_fp16", quantization="float16")

# Highest quality, larger model
converter.convert("gpt2_fp32")  # No quantization (default)
```
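In practice the choice usually follows the target device. A small helper (hypothetical, not part of CTranslate2) can encode a reasonable default from the options above:

```python
def pick_quantization(device: str) -> str:
    """Pick a reasonable default quantization type for a target device.

    This helper is illustrative only; always benchmark on your own
    hardware and models before settling on a type.
    """
    if device == "cpu":
        return "int8"           # int8 is typically fastest on CPU
    if device == "cuda":
        return "int8_float16"   # 8-bit weights with float16 compute on GPU
    return "float16"            # conservative fallback for other devices

print(pick_quantization("cpu"))   # int8
print(pick_quantization("cuda"))  # int8_float16
```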

## Types

```python { .api }
# Quantization types
class Quantization:
    CT2: str       # Standard CTranslate2 quantization
    AWQ_GEMM: str  # AWQ quantization with GEMM
    AWQ_GEMV: str  # AWQ quantization with GEMV
```
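Since these constants are plain strings, code can branch on them directly. The sketch below uses stand-in values to show the pattern; the actual string values are assumptions, not taken from the package:

```python
# Stand-in for the Quantization constants above. The string values here
# are assumptions for illustration only.
class Quantization:
    CT2 = "ct2"
    AWQ_GEMM = "awq_gemm"
    AWQ_GEMV = "awq_gemv"

def is_awq(quantization: str) -> bool:
    """Return True for either AWQ variant."""
    return quantization in (Quantization.AWQ_GEMM, Quantization.AWQ_GEMV)

print(is_awq(Quantization.AWQ_GEMM))  # True
print(is_awq(Quantization.CT2))       # False
```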