# Model Conversion

Convert models from popular frameworks (Transformers, Fairseq, OpenNMT, etc.) to the CTranslate2 format for optimized inference. The converters support quantization, copying auxiliary files (such as tokenizer data) into the output directory, and framework-specific loading options.

## Capabilities

### Transformers Converter

Convert Hugging Face Transformers models to CTranslate2 format. Supports most popular model architectures, including BERT, GPT-2, T5, and BART.

```python { .api }
class TransformersConverter:
    def __init__(self, model_name_or_path: str, activation_scales: str = None,
                 copy_files: list = None, load_as_float16: bool = False,
                 revision: str = None, low_cpu_mem_usage: bool = False,
                 trust_remote_code: bool = False):
        """
        Initialize converter for Hugging Face Transformers models.

        Args:
            model_name_or_path (str): Model name on the Hub or local path
            activation_scales (str): Path to activation scales for SmoothQuant
            copy_files (list): Additional files to copy to the output directory
            load_as_float16 (bool): Load model weights in float16
            revision (str): Model revision/branch to use
            low_cpu_mem_usage (bool): Enable low CPU memory loading
            trust_remote_code (bool): Allow custom code execution
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert the model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for the converted model
            vmap (str): Path to a vocabulary mapping file
            quantization (str): Quantization type (e.g., "int8", "int8_float16", "int16", "float16")
            force (bool): Overwrite the output directory if it exists

        Returns:
            str: Path to the converted model directory
        """

    def convert_from_args(self, args) -> str:
        """
        Convert the model using parsed command-line arguments.

        Args:
            args: Parsed arguments object

        Returns:
            str: Path to the converted model directory
        """

    @staticmethod
    def declare_arguments(parser):
        """
        Add converter-specific arguments to an argument parser.

        Args:
            parser: ArgumentParser instance to modify
        """
```

### Fairseq Converter

Convert Fairseq models to CTranslate2 format. Supports various Fairseq model architectures.

```python { .api }
class FairseqConverter:
    def __init__(self, model_path: str, data_dir: str = None):
        """
        Initialize converter for Fairseq models.

        Args:
            model_path (str): Path to Fairseq model checkpoint
            data_dir (str): Path to data directory with vocabularies
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert the Fairseq model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for converted model
            vmap (str): Path to vocabulary mapping file
            quantization (str): Quantization type
            force (bool): Overwrite output directory if it exists

        Returns:
            str: Path to the converted model directory
        """
```

### OpenNMT Converters

Convert OpenNMT-py and OpenNMT-tf models to CTranslate2 format.

```python { .api }
class OpenNMTPyConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for OpenNMT-py models.

        Args:
            model_path (str): Path to OpenNMT-py model file
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OpenNMT-py model to CTranslate2 format."""

class OpenNMTTFConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for OpenNMT-tf models.

        Args:
            model_path (str): Path to OpenNMT-tf model checkpoint
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OpenNMT-tf model to CTranslate2 format."""
```

### Marian Converter

Convert Marian NMT models to CTranslate2 format.

```python { .api }
class MarianConverter:
    def __init__(self, model_path: str):
        """
        Initialize converter for Marian models.

        Args:
            model_path (str): Path to Marian model directory
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the Marian model to CTranslate2 format."""
```

### OPUS-MT Converter

Convert OPUS-MT models to CTranslate2 format.

```python { .api }
class OpusMTConverter:
    def __init__(self, model_name: str):
        """
        Initialize converter for OPUS-MT models.

        Args:
            model_name (str): OPUS-MT model name from Hugging Face Hub
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the OPUS-MT model to CTranslate2 format."""
```

### OpenAI GPT-2 Converter

Convert OpenAI GPT-2 models to CTranslate2 format.

```python { .api }
class OpenAIGPT2Converter:
    def __init__(self, model_name: str = "124M"):
        """
        Initialize converter for OpenAI GPT-2 models.

        Args:
            model_name (str): GPT-2 model size ("124M", "355M", "774M", "1558M")
        """

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """Convert the GPT-2 model to CTranslate2 format."""
```

### Base Converter Class

All converters inherit from this base class providing common functionality.

```python { .api }
class Converter:
    """Abstract base class for model converters."""

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False) -> str:
        """
        Convert model to CTranslate2 format.

        Args:
            output_dir (str): Output directory for converted model
            vmap (str): Path to vocabulary mapping file
            quantization (str): Quantization type
            force (bool): Overwrite output directory if it exists

        Returns:
            str: Path to the converted model directory
        """

    def convert_from_args(self, args) -> str:
        """
        Convert model using parsed command-line arguments.

        Args:
            args: Parsed arguments object with conversion parameters

        Returns:
            str: Path to the converted model directory
        """

    @staticmethod
    def declare_arguments(parser):
        """
        Add common converter arguments to argument parser.

        Args:
            parser: ArgumentParser instance to modify
        """
```
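
The two CLI-facing methods are meant to be used together: `declare_arguments()` registers the common conversion options on a parser, and `convert_from_args()` reads them back. A minimal sketch of how a conversion script is typically wired, assuming the base-class contract documented above (the `--model` flag here is added by the script itself, not by `declare_arguments()`):

```python
import argparse

import ctranslate2

parser = argparse.ArgumentParser(description="Convert a model to CTranslate2.")
parser.add_argument("--model", required=True, help="Model name or local path.")

# Register the shared options (output directory, quantization, force, ...).
ctranslate2.converters.TransformersConverter.declare_arguments(parser)
args = parser.parse_args()

# Build the converter from the script-specific flag, then let the base class
# drive the conversion from the parsed arguments.
converter = ctranslate2.converters.TransformersConverter(args.model)
output_dir = converter.convert_from_args(args)
print("Converted model written to", output_dir)
```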

## Console Scripts

CTranslate2 provides command-line tools for model conversion:

```python { .api }
# Available console scripts (entry points):
# ct2-transformers-converter - Convert Transformers models
# ct2-fairseq-converter - Convert Fairseq models
# ct2-opennmt-py-converter - Convert OpenNMT-py models
# ct2-opennmt-tf-converter - Convert OpenNMT-tf models
# ct2-marian-converter - Convert Marian models
# ct2-opus-mt-converter - Convert OPUS-MT models
# ct2-openai-gpt2-converter - Convert OpenAI GPT-2 models
```
243
244
## Conversion Utilities
245
246
Helper functions for model conversion and optimization.
247
248
```python { .api }
249
def fuse_linear(spec, layers: list):
250
"""
251
Fuse multiple linear layers for optimization.
252
253
Args:
254
spec: Model specification object
255
layers (list): List of linear layers to fuse
256
"""
257
258
def fuse_linear_prequant(spec, layers: list, axis: int):
259
"""
260
Fuse pre-quantized linear layers.
261
262
Args:
263
spec: Model specification object
264
layers (list): List of pre-quantized linear layers
265
axis (int): Axis along which to fuse
266
"""
267
268
def permute_for_sliced_rotary(weight, num_heads: int, rotary_dim: int = None):
269
"""
270
Permute weights for rotary position embeddings.
271
272
Args:
273
weight: Weight tensor to permute
274
num_heads (int): Number of attention heads
275
rotary_dim (int): Rotary embedding dimension
276
277
Returns:
278
Permuted weight tensor
279
"""
280
281
def smooth_activation(layer_norm, linear, activation_scales):
282
"""
283
Apply SmoothQuant activation smoothing technique.
284
285
Args:
286
layer_norm: Layer normalization module
287
linear: Linear layer module
288
activation_scales: Activation scaling factors
289
"""
290
```
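
These helpers are used inside converters when mapping framework weights onto a CTranslate2 model spec. A hypothetical sketch of `fuse_linear`, assuming it concatenates the weights of the given `LinearSpec` layers into `spec` (the `make_linear` helper and the random weights are illustrative only; actual handling of biases and scales may differ):

```python
import numpy as np

from ctranslate2.converters import utils
from ctranslate2.specs import common_spec

hidden = 512

def make_linear(out_dim, in_dim):
    # Hypothetical helper: build a standalone LinearSpec with random weights.
    spec = common_spec.LinearSpec()
    spec.weight = np.random.rand(out_dim, in_dim).astype(np.float32)
    spec.bias = np.zeros(out_dim, dtype=np.float32)
    return spec

# Separate Q/K/V projections from the source model...
q, k, v = (make_linear(hidden, hidden) for _ in range(3))

# ...fused into a single projection so the converted model runs one GEMM
# instead of three.
fused = common_spec.LinearSpec()
utils.fuse_linear(fused, [q, k, v])
print(fused.weight.shape)  # expected: (3 * hidden, hidden)
```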

## Usage Examples

### Converting Transformers Models

```python
import ctranslate2

# Convert a Hugging Face model
converter = ctranslate2.converters.TransformersConverter("microsoft/DialoGPT-medium")
converter.convert("ct2_model", quantization="int8")

# Convert with additional options
converter = ctranslate2.converters.TransformersConverter(
    "t5-small",
    copy_files=["config.json", "tokenizer.json"],
    load_as_float16=True
)
converter.convert("t5_ct2", quantization="int8_float16")

# Convert local model
converter = ctranslate2.converters.TransformersConverter("/path/to/local/model")
converter.convert("output_dir", force=True)
```
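
A converted directory is then loaded with the regular CTranslate2 runtime classes. A minimal sketch for the DialoGPT conversion above (a GPT-2 style decoder, so it loads as a `Generator`; the tokenizer still comes from Transformers):

```python
import ctranslate2
import transformers

generator = ctranslate2.Generator("ct2_model")  # directory produced above
tokenizer = transformers.AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")

# Generation works on token strings, not token ids.
prompt = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello there!"))
results = generator.generate_batch([prompt], max_length=64)
print(tokenizer.decode(results[0].sequences_ids[0]))
```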

### Converting Other Frameworks

```python
import ctranslate2

# Convert Fairseq model
fairseq_converter = ctranslate2.converters.FairseqConverter(
    "checkpoint_best.pt",
    data_dir="data-bin/wmt14_en_de"
)
fairseq_converter.convert("fairseq_ct2")

# Convert OpenNMT-py model
opennmt_converter = ctranslate2.converters.OpenNMTPyConverter("model.pt")
opennmt_converter.convert("opennmt_ct2")

# Convert OPUS-MT model
opus_converter = ctranslate2.converters.OpusMTConverter("Helsinki-NLP/opus-mt-en-de")
opus_converter.convert("opus_ct2")
```
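
The converters without a dedicated example above follow the same pattern. A sketch based on the signatures documented earlier (all paths are placeholders):

```python
import ctranslate2

# Convert Marian model
marian_converter = ctranslate2.converters.MarianConverter("/path/to/marian/model")
marian_converter.convert("marian_ct2")

# Convert OpenNMT-tf checkpoint
opennmt_tf_converter = ctranslate2.converters.OpenNMTTFConverter("/path/to/checkpoint")
opennmt_tf_converter.convert("opennmt_tf_ct2")

# Convert an original OpenAI GPT-2 checkpoint, selected by size
gpt2_converter = ctranslate2.converters.OpenAIGPT2Converter("355M")
gpt2_converter.convert("gpt2_ct2")
```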

### Using Command Line Tools

```bash
# Convert Transformers model
ct2-transformers-converter --model microsoft/DialoGPT-medium --output_dir ct2_model --quantization int8

# Convert with custom options
ct2-transformers-converter \
    --model t5-small \
    --output_dir t5_ct2 \
    --quantization int8_float16 \
    --copy_files config.json tokenizer.json \
    --load_as_float16

# Convert Fairseq model
ct2-fairseq-converter \
    --model_path checkpoint_best.pt \
    --data_dir data-bin/wmt14_en_de \
    --output_dir fairseq_ct2 \
    --quantization int8
```

### Quantization Options

```python
import ctranslate2

# Quantization types accepted by convert():
quantization_options = [
    "int8",           # 8-bit integer weights
    "int8_float32",   # 8-bit weights, float32 compute
    "int8_float16",   # 8-bit weights, float16 compute
    "int8_bfloat16",  # 8-bit weights, bfloat16 compute
    "int16",          # 16-bit integer weights
    "float16",        # 16-bit floating point
    "bfloat16",       # 16-bit brain floating point
]

# Example with different quantization levels
converter = ctranslate2.converters.TransformersConverter("gpt2")

# Fastest inference, smallest model
converter.convert("gpt2_int8", quantization="int8")

# Balanced speed/quality
converter.convert("gpt2_fp16", quantization="float16")

# Highest quality, larger model
converter.convert("gpt2_fp32")  # No quantization (default)
```
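
The quantization picked at conversion time is not binding at load time: if the target device does not support the stored type, CTranslate2 converts the weights to the closest supported type when the model is loaded. The compute types the local hardware supports can be queried directly:

```python
import ctranslate2

# Compute types the local CPU can run, e.g. {"float32", "int16", "int8", ...}
print(ctranslate2.get_supported_compute_types("cpu"))

# Same query for the first GPU, if one is available
if ctranslate2.get_cuda_device_count() > 0:
    print(ctranslate2.get_supported_compute_types("cuda"))
```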

## Types

```python { .api }
# Quantization types
class Quantization:
    CT2: str       # Standard CTranslate2 quantization
    AWQ_GEMM: str  # AWQ quantization with GEMM
    AWQ_GEMV: str  # AWQ quantization with GEMV
```
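
The AWQ entries refer to models that were already quantized with AWQ before conversion; in that case the scheme is read from the checkpoint itself rather than requested through the `quantization` argument. A hedged sketch (the model name is a placeholder for any AWQ checkpoint on the Hub):

```python
import ctranslate2

# Hypothetical: convert a checkpoint pre-quantized with AWQ. No quantization
# argument is passed; the GEMM/GEMV scheme comes from the checkpoint config.
converter = ctranslate2.converters.TransformersConverter("org/model-awq")
converter.convert("awq_ct2_model")
```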