Tessl Tile for pypi/ctranslate2@4.6.0

# CTranslate2

CTranslate2 is a C++ and Python library for efficient inference with Transformer models. It supports encoder-decoder models (Transformer, BART, T5, Whisper), decoder-only models (GPT-2, Llama, Mistral), and encoder-only models (BERT, RoBERTa). Its custom runtime applies optimizations such as weight quantization, layer fusion, batch reordering, and memory management to speed up inference and reduce memory usage on both CPU and GPU.

## Package Information

- **Package Name**: ctranslate2
- **Package Type**: PyPI
- **Language**: Python (with C++ backend)
- **Installation**: `pip install ctranslate2`

## Core Imports

```python
import ctranslate2
```

Common usage patterns:

```python
from ctranslate2 import Translator, Generator, Encoder, contains_model
from ctranslate2.converters import TransformersConverter
```

## Basic Usage

```python
import ctranslate2

# Translation example (seq2seq models).
# Inputs are lists of tokens produced by the model's own tokenizer.
translator = ctranslate2.Translator("path/to/ct2_model", device="cpu")
results = translator.translate_batch([["Hello", "world"]])
print(results[0].hypotheses[0])  # Best hypothesis, as a list of tokens

# Generation example (language models)
generator = ctranslate2.Generator("path/to/ct2_model", device="cpu")
results = generator.generate_batch([["The", "quick", "brown"]])
print(results[0].sequences[0])  # Generated tokens, as a list

# Model conversion example (requires the transformers package)
converter = ctranslate2.converters.TransformersConverter("microsoft/DialoGPT-medium")
converter.convert("ct2_model_output")
```
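
Both `translate_batch` and `generate_batch` take lists of tokens rather than raw strings, so practical code usually wraps the call in a tokenize/detokenize step. The following is a minimal sketch, assuming whitespace tokenization purely for illustration (a real model needs the tokenizer it was trained with). `translate_texts` and its `tokenize`/`detokenize` parameters are hypothetical helper names, not part of CTranslate2:

```python
from typing import Callable, List


def translate_texts(
    translator,
    texts: List[str],
    tokenize: Callable[[str], List[str]] = str.split,
    detokenize: Callable[[List[str]], str] = " ".join,
) -> List[str]:
    """Tokenize raw strings, translate them as one batch, and detokenize.

    `translator` is any object exposing `translate_batch(list_of_token_lists)`
    whose results carry a `hypotheses` attribute (a list of token lists),
    such as `ctranslate2.Translator`.
    """
    # Assumption: whitespace splitting stands in for the model's real tokenizer.
    batch = [tokenize(text) for text in texts]
    results = translator.translate_batch(batch)
    # hypotheses[0] is the best hypothesis for each input.
    return [detokenize(result.hypotheses[0]) for result in results]
```

With a converted model, this could be called as `translate_texts(ctranslate2.Translator("path/to/ct2_model"), ["Hello world"])`, passing the model's actual tokenizer functions in place of the defaults.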

## Architecture

CTranslate2 follows a modular architecture:

- **Core Inference Classes**: `Translator`, `Generator`, `Encoder` for different model types
- **Model Converters**: Framework-specific converters for Transformers, Fairseq, OpenNMT, etc.
- **Model Specifications**: Programmatic model definition classes for building models from scratch
- **Specialized Models**: Domain-specific classes like `Whisper` for speech recognition
- **Storage and Configuration**: `StorageView` for efficient tensor operations, device management

## Capabilities

### Model Inference

Core inference functionality for running Transformer models with high performance. Supports translation, generation, and encoding tasks with batching, streaming, and asynchronous processing.

```python { .api }
class Translator:
    def __init__(self, model_path: str, device: str = "cpu",
                 device_index: int = 0, compute_type: str = "default",
                 inter_threads: int = 1, intra_threads: int = 0,
                 max_queued_batches: int = 0, flash_attention: bool = False,
                 tensor_parallel: bool = False, files: dict = None): ...

    def translate_batch(self, source: list, target_prefix: list = None, **kwargs) -> list: ...
    def score_batch(self, source: list, target: list, **kwargs) -> list: ...

class Generator:
    def __init__(self, model_path: str, device: str = "cpu",
                 device_index: int = 0, compute_type: str = "default",
                 inter_threads: int = 1, intra_threads: int = 0,
                 max_queued_batches: int = 0, flash_attention: bool = False,
                 tensor_parallel: bool = False, files: dict = None): ...

    def generate_batch(self, start_tokens: list, **kwargs) -> list: ...
    def score_batch(self, tokens: list, **kwargs) -> list: ...

class Encoder:
    def __init__(self, model_path: str, device: str = "cpu",
                 device_index: int = 0, compute_type: str = "default",
                 inter_threads: int = 1, intra_threads: int = 0,
                 max_queued_batches: int = 0, files: dict = None): ...

    def forward_batch(self, inputs: list, **kwargs) -> EncoderForwardOutput: ...
```

[Model Inference](./inference.md)
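
Decoding behaviour is controlled through keyword arguments to the batch methods. The sketch below passes a few `generate_batch` options (`max_length`, `sampling_topk`, `sampling_temperature`, `include_prompt_in_result`). `sample_continuations` is a hypothetical wrapper, and the default values are illustrative assumptions rather than recommendations:

```python
from typing import List


def sample_continuations(
    generator,
    prompts: List[List[str]],
    max_length: int = 32,
    top_k: int = 10,
    temperature: float = 0.8,
) -> List[List[str]]:
    """Sample one continuation per tokenized prompt.

    `generator` is any object with a `generate_batch` method that accepts
    these keyword arguments, such as `ctranslate2.Generator`.
    """
    results = generator.generate_batch(
        prompts,
        max_length=max_length,
        sampling_topk=top_k,  # sample from the k most likely tokens
        sampling_temperature=temperature,
        include_prompt_in_result=False,  # return only the newly generated tokens
    )
    # Each result holds one token list per returned sequence; take the first.
    return [result.sequences[0] for result in results]
```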

### Model Conversion

Convert models from popular frameworks (Transformers, Fairseq, OpenNMT, etc.) to CTranslate2 format for optimized inference. Supports quantization, file copying, and various framework-specific options.

```python { .api }
class TransformersConverter:
    def __init__(self, model_name_or_path: str, activation_scales: str = None,
                 copy_files: list = None, load_as_float16: bool = False,
                 revision: str = None, low_cpu_mem_usage: bool = False,
                 trust_remote_code: bool = False): ...

    def convert(self, output_dir: str, vmap: str = None,
                quantization: str = None, force: bool = False): ...

# Additional converters
class FairseqConverter: ...
class OpenNMTPyConverter: ...
class OpenNMTTFConverter: ...
class MarianConverter: ...
class OpusMTConverter: ...
class OpenAIGPT2Converter: ...
```

[Model Conversion](./converters.md)
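
Conversion is usually a one-time step, so it can be guarded with `contains_model` to avoid repeating it. A minimal sketch of that pattern follows. `convert_if_needed` and its `ct2` parameter are hypothetical (the parameter exists only so the function can be exercised without the real library), and `"int8"` is just one possible quantization choice:

```python
import importlib
from typing import Optional


def convert_if_needed(
    model_name: str,
    output_dir: str,
    quantization: Optional[str] = "int8",
    ct2=None,
) -> str:
    """Convert a Hugging Face model unless `output_dir` already holds one."""
    # `ct2` is a hypothetical injection point for testing; by default the
    # real ctranslate2 package is imported here.
    if ct2 is None:
        ct2 = importlib.import_module("ctranslate2")
    if ct2.contains_model(output_dir):
        return output_dir  # already converted, nothing to do
    converter = ct2.converters.TransformersConverter(model_name)
    converter.convert(output_dir, quantization=quantization)
    return output_dir
```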

### Model Specifications

Programmatically define and build Transformer model architectures from scratch. Supports various model types including sequence-to-sequence, decoder-only, and encoder-only models with extensive configuration options.

```python { .api }
class TransformerSpec:
    def __init__(self, encoder: TransformerEncoderSpec, decoder: TransformerDecoderSpec): ...
    @classmethod
    def from_config(cls, num_layers: int, num_heads: int, **kwargs): ...

    def save(self, output_dir: str): ...
    def validate(self): ...
    def optimize(self, quantization: str = None): ...

class TransformerDecoderModelSpec:
    def __init__(self, decoder: TransformerDecoderSpec): ...
    @classmethod
    def from_config(cls, num_layers: int, num_heads: int, **kwargs): ...

class TransformerEncoderModelSpec:
    def __init__(self, encoder: TransformerEncoderSpec, pooling_layer: bool = False): ...
```

[Model Specifications](./specifications.md)

### Specialized Models

Domain-specific model classes for speech recognition and audio processing tasks. Includes Whisper for speech-to-text and Wav2Vec2 for speech representation learning.

```python { .api }
class Whisper:
    def __init__(self, model_path: str, device: str = "cpu", **kwargs): ...
    def generate(self, features: list, prompts: list, **kwargs) -> list: ...
    def detect_language(self, features: list, **kwargs) -> list: ...

class Wav2Vec2:
    def __init__(self, model_path: str, device: str = "cpu", **kwargs): ...
    def encode(self, features: list, **kwargs) -> list: ...

class Wav2Vec2Bert:
    def __init__(self, model_path: str, device: str = "cpu", **kwargs): ...
    def encode(self, features: list, **kwargs) -> list: ...
```

[Specialized Models](./specialized.md)
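
For each item in the batch, `Whisper.detect_language` returns a list of (language token, probability) pairs, with tokens in Whisper's `<|en|>` form. The helper below is a small sketch for reducing that output to one language code per item. `top_languages` is a hypothetical name, not a library function:

```python
from typing import List, Tuple


def top_languages(
    detections: List[List[Tuple[str, float]]],
) -> List[Tuple[str, float]]:
    """Return the most probable (language code, probability) per batch item."""
    best = []
    for candidates in detections:
        # Pick the highest-probability pair without relying on input order.
        token, probability = max(candidates, key=lambda pair: pair[1])
        # Strip the special-token markers: "<|en|>" -> "en".
        best.append((token.strip("<|>"), probability))
    return best
```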

### Utilities and Configuration

Helper functions for model management, device configuration, logging, and tensor operations. Includes utilities for checking model compatibility and managing computational resources.

```python { .api }
def contains_model(path: str) -> bool: ...
def get_cuda_device_count() -> int: ...
def get_supported_compute_types(device: str, device_index: int = 0) -> set: ...
def set_random_seed(seed: int): ...
def get_log_level() -> int: ...  # a Python logging level, e.g. logging.WARNING
def set_log_level(level: int): ...

class StorageView:
    def __init__(self, array=None, dtype=None): ...
    def numpy(self): ...
    def copy(self): ...
    def to(self, dtype: str): ...

    @property
    def shape(self) -> tuple: ...
    @property
    def size(self) -> int: ...
    @property
    def dtype(self) -> str: ...
```

[Utilities](./utilities.md)
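
The device helpers can be combined to choose a device and compute type at startup. This is a sketch under assumptions: `choose_compute_type` and `pick_device_and_compute_type` are hypothetical helpers, and the preference order shown is an illustrative choice, not a library default:

```python
from typing import Iterable, Sequence, Tuple


def choose_compute_type(
    supported: Iterable[str],
    preference: Sequence[str] = ("int8_float16", "float16", "int8", "float32"),
) -> str:
    """Return the first preferred compute type that the device supports."""
    supported = set(supported)
    for compute_type in preference:
        if compute_type in supported:
            return compute_type
    # "default" lets CTranslate2 decide based on the converted model.
    return "default"


def pick_device_and_compute_type() -> Tuple[str, str]:
    """Prefer a CUDA device when one is visible, otherwise use the CPU."""
    import ctranslate2  # imported lazily so the helper above stays standalone

    device = "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
    supported = ctranslate2.get_supported_compute_types(device)
    return device, choose_compute_type(supported)
```

The returned pair can then be passed as the `device` and `compute_type` arguments when constructing a `Translator`, `Generator`, or `Encoder`.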

## Types

```python { .api }
# Result classes
class TranslationResult:
    hypotheses: list[list[str]]  # one token list per hypothesis
    scores: list[float]

class GenerationResult:
    sequences: list[list[str]]
    scores: list[float]

class ScoringResult:
    tokens: list[str]
    log_probs: list[float]  # one log probability per scored token

class GenerationStepResult:
    token: str
    token_id: int
    is_last: bool
    log_prob: float

class EncoderForwardOutput:
    last_hidden_state: StorageView
    pooler_output: StorageView

# Enumerations
class DataType:
    FLOAT32: str
    FLOAT16: str
    INT8: str
    INT16: str
    INT32: str

class Device:
    CPU: str
    CUDA: str
    AUTO: str

# Configuration classes
class ExecutionStats:
    num_tokens: int
    num_examples: int
    total_time_in_ms: float

class MpiInfo:
    rank: int
    size: int
```
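
`GenerationStepResult` objects arrive one token at a time when streaming, for example from `Generator.generate_tokens`. The sketch below collects such a stream into a token list and a summed log probability. `collect_stream` is a hypothetical helper, and it assumes only the `token`, `log_prob`, and `is_last` fields listed above:

```python
from typing import Iterable, List, Tuple


def collect_stream(steps: Iterable) -> Tuple[List[str], float]:
    """Accumulate streamed step results until the final step is seen."""
    tokens: List[str] = []
    total_log_prob = 0.0
    for step in steps:
        tokens.append(step.token)
        # log_prob may be None when log probabilities were not requested.
        if step.log_prob is not None:
            total_log_prob += step.log_prob
        if step.is_last:
            break
    return tokens, total_log_prob
```

With a loaded generator, this might be used as `tokens, score = collect_stream(generator.generate_tokens(prompt_tokens))`, where `prompt_tokens` is a tokenized prompt.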
