Tessl Tile for pypi/sentence-transformers@5.1.0

# Core Transformers

The `SentenceTransformer` class is the main interface for loading, using, and customizing bi-encoder models that map sentences and text to dense vector embeddings.

## SentenceTransformer Class

### Constructor

```python
SentenceTransformer(
    model_name_or_path: str | None = None,
    modules: Iterable[nn.Module] | None = None,
    device: str | None = None,
    prompts: dict[str, str] | None = None,
    default_prompt_name: str | None = None,
    similarity_fn_name: str | SimilarityFunction | None = None,
    cache_folder: str | None = None,
    trust_remote_code: bool = False,
    revision: str | None = None,
    local_files_only: bool = False,
    token: bool | str | None = None,
    use_auth_token: bool | str | None = None,
    truncate_dim: int | None = None,
    model_kwargs: dict[str, Any] | None = None,
    tokenizer_kwargs: dict[str, Any] | None = None,
    config_kwargs: dict[str, Any] | None = None,
    model_card_data: SentenceTransformerModelCardData | None = None,
    backend: Literal["torch", "onnx", "openvino"] = "torch"
)
```
`{ .api }`

Initialize a SentenceTransformer model.

**Parameters**:
- `model_name_or_path`: Model identifier from HuggingFace Hub or local path
- `modules`: Iterable of PyTorch modules to create custom model architecture
- `device`: Device to run the model on ('cpu', 'cuda', 'mps', 'npu', etc.)
- `prompts`: Dictionary mapping prompt names to prompt texts for different tasks
- `default_prompt_name`: Name of the prompt in `prompts` to use by default
- `similarity_fn_name`: Similarity function for embeddings comparison
- `cache_folder`: Custom cache directory for models
- `trust_remote_code`: Allow custom code execution from remote models
- `revision`: Specific model revision/branch to load
- `local_files_only`: Only use locally cached files
- `token`: HuggingFace authentication token
- `use_auth_token`: Deprecated argument, use `token` instead
- `truncate_dim`: Truncate embeddings to this dimension
- `model_kwargs`: Additional keyword arguments passed to the underlying Transformers model
- `tokenizer_kwargs`: Additional keyword arguments passed to the tokenizer
- `config_kwargs`: Additional keyword arguments passed to the model configuration
- `model_card_data`: Model card data object for generating model cards
- `backend`: Backend to use for inference ("torch", "onnx", "openvino")

### Core Encoding Methods

```python
def encode(
    sentences: str | list[str] | np.ndarray,
    prompt_name: str | None = None,
    prompt: str | None = None,
    batch_size: int = 32,
    show_progress_bar: bool | None = None,
    output_value: Literal["sentence_embedding", "token_embeddings"] | None = "sentence_embedding",
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    convert_to_numpy: bool = True,
    convert_to_tensor: bool = False,
    device: str | list[str | torch.device] | None = None,
    normalize_embeddings: bool = False,
    truncate_dim: int | None = None,
    pool: dict[Literal["input", "output", "processes"], Any] | None = None,
    chunk_size: int | None = None,
    **kwargs
) -> list[Tensor] | np.ndarray | Tensor | dict[str, Tensor] | list[dict[str, Tensor]]
```
`{ .api }`

Encode sentences into embeddings.

**Parameters**:
- `sentences`: Input text(s) to encode
- `prompt_name`: Name of the prompt to use for encoding
- `prompt`: The prompt to use for encoding
- `batch_size`: Batch size for processing
- `show_progress_bar`: Display progress bar during encoding
- `output_value`: Type of embeddings to return ('sentence_embedding', 'token_embeddings', or None for all)
- `precision`: Precision to use for embeddings ("float32", "int8", "uint8", "binary", "ubinary")
- `convert_to_numpy`: Return numpy arrays instead of tensors
- `convert_to_tensor`: Return PyTorch tensors
- `device`: Device(s) for computation (single device or list for multi-process)
- `normalize_embeddings`: L2 normalize the embeddings
- `truncate_dim`: Dimension to truncate sentence embeddings to
- `pool`: Multi-process pool for encoding
- `chunk_size`: Size of chunks for multi-process encoding
- `**kwargs`: Additional keyword arguments

**Returns**: Embeddings as numpy arrays, tensors, or lists
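The `binary` and `ubinary` values of `precision` shrink each embedding to one bit per dimension. The sketch below illustrates the idea in plain Python, assuming sign-based thresholding at zero and eight dimensions packed per byte; `pack_ubinary` and `pack_binary` are hypothetical helpers for illustration, and the library's own quantization code remains the reference.

```python
from __future__ import annotations


def pack_ubinary(embedding: list[float]) -> list[int]:
    """Threshold each value at 0 and pack 8 bits per byte, most significant bit first."""
    if len(embedding) % 8 != 0:
        raise ValueError("embedding length must be a multiple of 8")
    packed = []
    for start in range(0, len(embedding), 8):
        byte = 0
        for value in embedding[start:start + 8]:
            # Positive values become 1 bits; zero and negative values become 0 bits.
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return packed


def pack_binary(embedding: list[float]) -> list[int]:
    """Same bits as pack_ubinary, shifted into the signed 8-bit range."""
    return [byte - 128 for byte in pack_ubinary(embedding)]


vector = [0.3, -1.2, 0.0, 2.5, -0.1, 0.7, 0.9, -4.0]
print(pack_ubinary(vector))  # one unsigned byte covers eight dimensions
print(pack_binary(vector))   # the same byte expressed as a signed value
```

An embedding with 384 dimensions would occupy 48 bytes under this scheme, which is why binary precision is attractive for large indexes.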

```python
def encode_query(
    sentences: str | list[str] | np.ndarray,
    prompt_name: str | None = None,
    prompt: str | None = None,
    batch_size: int = 32,
    show_progress_bar: bool | None = None,
    output_value: Literal["sentence_embedding", "token_embeddings"] | None = "sentence_embedding",
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    convert_to_numpy: bool = True,
    convert_to_tensor: bool = False,
    device: str | list[str | torch.device] | None = None,
    normalize_embeddings: bool = False,
    truncate_dim: int | None = None,
    pool: dict[Literal["input", "output", "processes"], Any] | None = None,
    chunk_size: int | None = None,
    **kwargs
) -> list[Tensor] | np.ndarray | Tensor | dict[str, Tensor] | list[dict[str, Tensor]]
```
`{ .api }`

Encode queries for retrieval tasks with a query-specific prompt.

```python
def encode_document(
    sentences: str | list[str] | np.ndarray,
    prompt_name: str | None = None,
    prompt: str | None = None,
    batch_size: int = 32,
    show_progress_bar: bool | None = None,
    output_value: Literal["sentence_embedding", "token_embeddings"] | None = "sentence_embedding",
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    convert_to_numpy: bool = True,
    convert_to_tensor: bool = False,
    device: str | list[str | torch.device] | None = None,
    normalize_embeddings: bool = False,
    truncate_dim: int | None = None,
    pool: dict[Literal["input", "output", "processes"], Any] | None = None,
    chunk_size: int | None = None,
    **kwargs
) -> list[Tensor] | np.ndarray | Tensor | dict[str, Tensor] | list[dict[str, Tensor]]
```
`{ .api }`

Encode documents for retrieval tasks with a document-specific prompt.
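A prompt is a plain string placed in front of each input before tokenization, selected either directly through `prompt` or by name through `prompt_name`. The following sketch models that lookup under simplifying assumptions: `apply_prompt` is a hypothetical helper rather than a library function, and the prompt names and texts are illustrative values only.

```python
from __future__ import annotations


def apply_prompt(
    sentences: list[str],
    prompts: dict[str, str],
    prompt_name: str | None = None,
    prompt: str | None = None,
) -> list[str]:
    """Prepend a prompt to every sentence; an explicit `prompt` wins over `prompt_name`."""
    if prompt is None and prompt_name is not None:
        if prompt_name not in prompts:
            raise ValueError(f"Unknown prompt name: {prompt_name!r}")
        prompt = prompts[prompt_name]
    if prompt is None:
        # No prompt configured: the text is encoded unchanged.
        return list(sentences)
    return [prompt + sentence for sentence in sentences]


# Illustrative prompt table; real models define their own names and texts.
prompts = {"query": "query: ", "document": "passage: "}

queries = apply_prompt(["What is machine learning?"], prompts, prompt_name="query")
documents = apply_prompt(["Machine learning is a subset of AI"], prompts, prompt_name="document")
print(queries)
print(documents)
```

Because queries and documents receive different prefixes, the same model can embed them into positions suited to asymmetric retrieval.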

### Similarity Methods

```python
def similarity(
    embeddings1: Tensor | npt.NDArray[np.float32],
    embeddings2: Tensor | npt.NDArray[np.float32]
) -> Tensor
```
`{ .api }`

Compute similarity between two sets of embeddings using the model's similarity function. Every embedding in `embeddings1` is scored against every embedding in `embeddings2`, giving a matrix with one row per embedding in `embeddings1` and one column per embedding in `embeddings2`.

```python
def similarity_pairwise(
    embeddings1: Tensor | npt.NDArray[np.float32],
    embeddings2: Tensor | npt.NDArray[np.float32]
) -> Tensor
```
`{ .api }`

Compute the similarity between corresponding pairs, `embeddings1[i]` with `embeddings2[i]`. Both inputs must contain the same number of embeddings, and the result holds one score per pair.
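The difference between the two methods is easiest to see from output shapes. This is a minimal pure-Python sketch using cosine similarity; `similarity_matrix` and `similarity_pairs` are hypothetical stand-ins that mirror the all-pairs and element-wise behaviors, not the library's implementation.

```python
from __future__ import annotations

import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def similarity_matrix(set1: list[list[float]], set2: list[list[float]]) -> list[list[float]]:
    """All-pairs scores: one row per vector in set1, one column per vector in set2."""
    return [[cosine(a, b) for b in set2] for a in set1]


def similarity_pairs(set1: list[list[float]], set2: list[list[float]]) -> list[float]:
    """Element-wise scores: vector i of set1 against vector i of set2."""
    if len(set1) != len(set2):
        raise ValueError("both sets must contain the same number of vectors")
    return [cosine(a, b) for a, b in zip(set1, set2)]


first = [[1.0, 0.0], [0.0, 1.0]]
second = [[1.0, 0.0], [1.0, 1.0]]

matrix = similarity_matrix(first, second)  # 2 x 2 nested list
pairs = similarity_pairs(first, second)    # 2 scores
print(matrix)
print(pairs)
```

The pairwise scores are the diagonal of the full matrix, so the pairwise form avoids computing the off-diagonal entries when only aligned pairs matter.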

### Model Inspection Methods

```python
def get_sentence_embedding_dimension() -> int | None
```
`{ .api }`

Get the dimension of sentence embeddings.

```python
def get_max_seq_length() -> int | None
```
`{ .api }`

Get the maximum sequence length the model can handle.

```python
def tokenize(
    texts: list[str] | list[dict] | list[tuple[str, str]],
    **kwargs
) -> dict[str, Tensor]
```
`{ .api }`

Tokenize input texts using the model's tokenizer.

### Model Persistence

```python
def save(
    path: str,
    model_name: str | None = None,
    create_model_card: bool = True,
    train_datasets: list[str] | None = None,
    safe_serialization: bool = True
) -> None
```
`{ .api }`

Save the model to a local directory.

```python
def save_pretrained(
    save_directory: str,
    **kwargs
) -> None
```
`{ .api }`

Save model using HuggingFace format.

```python
def save_to_hub(
    repo_id: str,
    organization: str | None = None,
    token: str | None = None,
    private: bool | None = None,
    safe_serialization: bool = True,
    commit_message: str = "Add new SentenceTransformer model.",
    local_model_path: str | None = None,
    exist_ok: bool = False,
    replace_model_card: bool = False,
    train_datasets: list[str] | None = None
) -> str
```
`{ .api }`

Deprecated; use `push_to_hub` instead. Save and push model to HuggingFace Hub.

```python
def push_to_hub(
    repo_id: str,
    token: str | None = None,
    private: bool | None = None,
    safe_serialization: bool = True,
    commit_message: str | None = None,
    local_model_path: str | None = None,
    exist_ok: bool = False,
    replace_model_card: bool = False,
    train_datasets: list[str] | None = None,
    revision: str | None = None,
    create_pr: bool = False
) -> str
```
`{ .api }`

Push the model to HuggingFace Hub.

### Evaluation and Processing

```python
def evaluate(
    evaluator: SentenceEvaluator,
    output_path: str | None = None
) -> float | dict[str, float]
```
`{ .api }`

Evaluate the model using a provided evaluator.

```python
def forward(
    input: dict[str, torch.Tensor],
    **kwargs
) -> dict[str, torch.Tensor]
```
`{ .api }`

Forward pass through the model.

### Multi-Processing Support

```python
def start_multi_process_pool(
    target_devices: list[str] | None = None
) -> dict[Literal["input", "output", "processes"], Any]
```
`{ .api }`

Start a multi-process pool for parallel encoding.

```python
@staticmethod
def stop_multi_process_pool(pool: dict[Literal["input", "output", "processes"], Any]) -> None
```
`{ .api }`

Stop a multi-process pool.

```python
def encode_multi_process(
    sentences: list[str],
    pool: dict[Literal["input", "output", "processes"], Any],
    prompt_name: str | None = None,
    prompt: str | None = None,
    batch_size: int = 32,
    chunk_size: int | None = None,
    show_progress_bar: bool | None = None,
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    normalize_embeddings: bool = False,
    truncate_dim: int | None = None
) -> np.ndarray
```
`{ .api }`

Encode sentences across the worker processes of a pool created by `start_multi_process_pool`. The `pool` and `chunk_size` arguments of `encode` accept the same pool.
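Multi-process encoding rests on splitting the input into chunks that workers handle independently, then reassembling the results in input order. The sketch below shows that pattern with a thread pool from the standard library; `split_into_chunks` and the stand-in `fake_encode` are hypothetical, and nothing here describes how the library schedules work internally.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(sentences: list[str], chunk_size: int) -> list[list[str]]:
    """Cut the input into consecutive chunks of at most `chunk_size` sentences."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [sentences[i:i + chunk_size] for i in range(0, len(sentences), chunk_size)]


def fake_encode(chunk: list[str]) -> list[list[float]]:
    """Stand-in for a model call: a one-dimensional 'embedding' holding the text length."""
    return [[float(len(sentence))] for sentence in chunk]


sentences = [f"sentence {i}" for i in range(10)]
chunks = split_into_chunks(sentences, chunk_size=4)

# Executor.map returns results in submission order, so embeddings line up with inputs.
with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(fake_encode, chunks))

embeddings = [vector for chunk_result in results for vector in chunk_result]
print([len(chunk) for chunk in chunks])
print(len(embeddings))
```

Smaller chunks balance load better across workers, while larger chunks reduce per-chunk overhead; `chunk_size` is the knob for that trade-off.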

### Properties

```python
@property
def device() -> torch.device
```
`{ .api }`

Current device of the model.

```python
@property
def tokenizer() -> PreTrainedTokenizer
```
`{ .api }`

Access to the model's tokenizer.

```python
@property
def max_seq_length() -> int
```
`{ .api }`

Maximum sequence length supported by the model.

```python
@property
def similarity_fn_name() -> Literal["cosine", "dot", "euclidean", "manhattan"]
```
`{ .api }`

Name of the similarity function used by the model.

```python
@property
def transformers_model() -> PreTrainedModel | None
```
`{ .api }`

Access to the underlying transformer model.

## Usage Examples

### Basic Encoding

```python
from sentence_transformers import SentenceTransformer

# Load pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode single sentence
embedding = model.encode("Hello world")
print(f"Embedding shape: {embedding.shape}")

# Encode multiple sentences
sentences = [
    "The cat sits on the mat",
    "A feline rests on a rug",
    "Dogs are great pets"
]
embeddings = model.encode(sentences)
print(f"Embeddings shape: {embeddings.shape}")
```

### Similarity Computation

```python
# Compute similarity between two sentences
sentence1 = "The weather is nice today"
sentence2 = "Today has beautiful weather"

emb1 = model.encode(sentence1)
emb2 = model.encode(sentence2)

similarity = model.similarity(emb1, emb2)
print(f"Similarity: {similarity.item():.4f}")

# Similarities between all sentences
embeddings = model.encode([
    "Python is a programming language",
    "Java is used for software development",
    "I love pizza",
    "Pasta is delicious"
])

# similarity() scores every embedding against every other one (4 x 4 matrix)
similarities = model.similarity(embeddings, embeddings)
print(f"Similarity matrix shape: {similarities.shape}")

# similarity_pairwise() scores only aligned pairs: item i against item i
pair_scores = model.similarity_pairwise(embeddings[:2], embeddings[2:])
print(f"Pairwise scores shape: {pair_scores.shape}")
```

### Asymmetric Retrieval

```python
# For retrieval tasks where queries and documents use different prompts
queries = ["What is machine learning?", "How do neural networks work?"]
documents = [
    "Machine learning is a subset of artificial intelligence",
    "Neural networks are computational models inspired by biological neurons",
    "Pizza recipes vary by region and preference"
]

# Encode with task-specific methods
query_embeddings = model.encode_query(queries)
doc_embeddings = model.encode_document(documents)

# Compute retrieval similarities (one row per query, one column per document)
similarities = model.similarity(query_embeddings, doc_embeddings)
```

### Custom Model Creation

```python
from torch import nn

from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Transformer, Pooling, Dense

# Create custom model architecture
transformer = Transformer('distilbert-base-uncased', max_seq_length=256)
pooling = Pooling(transformer.get_word_embedding_dimension(), pooling_mode='mean')
# The activation is passed as a module instance, not as a string
dense = Dense(pooling.get_sentence_embedding_dimension(), 256, activation_function=nn.Tanh())

# Combine modules
model = SentenceTransformer(modules=[transformer, pooling, dense])

# Use the custom model
embeddings = model.encode(["Custom model example"])
```

### Performance Optimization

```python
# Multi-process encoding for large datasets.
# Worker processes are spawned, so in a script run this under `if __name__ == "__main__":`.
sentences = ["sentence " + str(i) for i in range(10000)]

# Start one worker process per target device
pool = model.start_multi_process_pool(['cuda:0', 'cuda:1'])

# Encode using multiple GPUs
embeddings = model.encode_multi_process(sentences, pool, batch_size=64)

# Clean up
model.stop_multi_process_pool(pool)

# L2-normalize embeddings so that dot product equals cosine similarity
embeddings = model.encode(sentences, normalize_embeddings=True)
```

### Model Persistence

```python
# Save model locally
model.save('./my-sentence-transformer')

# Upload to HuggingFace Hub (push_to_hub supersedes the deprecated save_to_hub)
model.push_to_hub('my-username/my-sentence-transformer')

# Load saved model
loaded_model = SentenceTransformer('./my-sentence-transformer')
```

## SimilarityFunction Enum

```python
from sentence_transformers import SimilarityFunction

class SimilarityFunction(Enum):
    COSINE = "cosine"
    DOT_PRODUCT = "dot"
    DOT = "dot"  # Alias for DOT_PRODUCT
    EUCLIDEAN = "euclidean"
    MANHATTAN = "manhattan"
```
`{ .api }`

Enumeration of available similarity functions for comparing embeddings.
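To make the four options concrete, here is a small pure-Python sketch of each score on a pair of vectors. It assumes the distance-based options are negated so that a higher value always means "more similar"; the function names and the `SCORERS` table are illustrative, not part of the library.

```python
from __future__ import annotations

import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the two vectors after length normalization."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a: list[float], b: list[float]) -> float:
    """Negative Euclidean distance, so identical vectors score highest (0)."""
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: list[float], b: list[float]) -> float:
    """Negative Manhattan (L1) distance."""
    return -sum(abs(x - y) for x, y in zip(a, b))


# Keys match the string values of the enum members.
SCORERS = {"cosine": cosine, "dot": dot, "euclidean": euclidean, "manhattan": manhattan}

a = [3.0, 4.0]
b = [4.0, 3.0]
for name, scorer in SCORERS.items():
    print(name, round(scorer(a, b), 4))
```

Cosine ignores vector length, while the other three change when embeddings are rescaled, which matters when choosing a function for unnormalized embeddings.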

### Usage with SentenceTransformer

```python
from sentence_transformers import SentenceTransformer, SimilarityFunction

# Set similarity function during initialization
model = SentenceTransformer(
    'all-MiniLM-L6-v2',
    similarity_fn_name=SimilarityFunction.COSINE
)

# Or use string names
model = SentenceTransformer(
    'all-MiniLM-L6-v2',
    similarity_fn_name='euclidean'
)
```

## Best Practices

1. **Batch Processing**: Tune `batch_size` to your hardware; larger batches raise throughput until memory becomes the limit
2. **Device Management**: Pass `device` explicitly for consistent behavior across machines
3. **Normalization**: Pass `normalize_embeddings=True` when scoring with cosine similarity; on unit-length vectors the dot product gives the same scores
4. **Model Selection**: Choose models appropriate for your task and domain
5. **Caching**: Downloaded models are cached on disk; set `cache_folder` to control the location so repeated loads avoid re-downloading
6. **Multi-Processing**: Use multi-process encoding for large datasets
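The normalization advice follows from a simple identity: once two vectors have unit length, their dot product equals their cosine similarity. The sketch below checks this numerically in plain Python; the helper names are illustrative and independent of the library.

```python
from __future__ import annotations

import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vector, vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]
unit_a, unit_b = l2_normalize(a), l2_normalize(b)

# Both expressions give the same score, up to floating-point rounding.
print(round(cosine(a, b), 6))
print(round(dot(unit_a, unit_b), 6))
```

Storing normalized embeddings therefore lets a vector index use plain dot products while still ranking by cosine similarity.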
