# Rerankers

Reranking models for scoring query-document pairs to improve retrieval accuracy. Rerankers take a query and a set of candidate documents and assign relevance scores to help identify the most relevant matches.

## Capabilities

### FlagReranker (Base Encoder Reranker)

Standard reranker for encoder-only models. Efficiently scores query-document pairs using a cross-encoder architecture, which jointly encodes query and document for high-accuracy relevance scoring.

```python { .api }
from typing import List, Optional, Union

class FlagReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        trust_remote_code: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs
    ):
        """
        Initialize encoder-only reranker.

        Args:
            model_name_or_path: Path to reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for (multi-GPU) inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            trust_remote_code: Allow custom model code execution
            cache_dir: Directory for model cache
            **kwargs: Additional model parameters
        """
```
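
The `normalize` flag controls what comes back from scoring: raw relevance logits by default, or, per the FlagEmbedding documentation, values mapped into (0, 1) with a sigmoid when `normalize=True`. A minimal sketch (the model ID assumes the Hugging Face `BAAI` namespace):

```python
from FlagEmbedding import FlagReranker

# normalize=True applies a sigmoid so scores land in (0, 1)
reranker = FlagReranker('BAAI/bge-reranker-base', use_fp16=True, normalize=True)
scores = reranker.compute_score([("what is machine learning?", "ML learns patterns from data")])
print(scores)  # one score per pair, in (0, 1)
```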

### FlagLLMReranker (Base LLM Reranker)

Reranker built on a large language model for sophisticated relevance assessment. Leverages LLM reasoning capabilities for nuanced query-document relevance scoring, at the cost of slower inference.

```python { .api }
from typing import List, Optional, Union

class FlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize LLM-based reranker.

        Args:
            model_name_or_path: Path to LLM reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for (multi-GPU) inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```

### LayerWiseFlagLLMReranker (Layer-wise LLM Reranker)

Specialized LLM reranker that uses layer-wise processing, allowing scores to be taken from selected intermediate layers for enhanced efficiency. Optimized for large-scale reranking tasks.

```python { .api }
from typing import List, Optional, Union

class LayerWiseFlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize layer-wise LLM reranker for efficient processing.

        Args:
            model_name_or_path: Path to layer-wise reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for (multi-GPU) inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```
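
With a layer-wise model such as bge-reranker-v2-minicpm-layerwise, upstream FlagEmbedding examples pass a `cutoff_layers` argument to `compute_score` to choose which layer(s) produce scores. A sketch following those examples (treat the exact argument name and value as version-dependent):

```python
from FlagEmbedding import LayerWiseFlagLLMReranker

reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', use_fp16=True)

# cutoff_layers selects the scoring layer(s); [28] follows the upstream example
scores = reranker.compute_score(
    [("what is panda?", "The giant panda is a bear native to China.")],
    cutoff_layers=[28]
)
```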

### LightWeightFlagLLMReranker (Lightweight LLM Reranker)

Optimized lightweight LLM reranker for resource-constrained environments. Provides good reranking performance with reduced computational requirements.

```python { .api }
from typing import List, Optional, Union

class LightWeightFlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize lightweight LLM reranker for efficient processing.

        Args:
            model_name_or_path: Path to lightweight reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for (multi-GPU) inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```
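
Upstream FlagEmbedding examples additionally pass compression controls to the lightweight model at scoring time. The argument names (`cutoff_layers`, `compress_ratio`, `compress_layer`) and values below mirror those examples and may vary by library version, so check the model card before relying on them:

```python
from FlagEmbedding import LightWeightFlagLLMReranker

reranker = LightWeightFlagLLMReranker('BAAI/bge-reranker-v2.5-gemma2-lightweight', use_fp16=True)

# Compression arguments follow the upstream README example; adjust per model card
scores = reranker.compute_score(
    [("what is panda?", "The giant panda is a bear native to China.")],
    cutoff_layers=[28], compress_ratio=2, compress_layer=[24, 40]
)
```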

## Usage Examples

### Basic Reranking

```python
from FlagEmbedding import FlagReranker

# Initialize reranker (model ID on the Hugging Face hub)
reranker = FlagReranker('BAAI/bge-reranker-base', use_fp16=True)

# Score query-document pairs
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of artificial intelligence",
    "Cooking recipes for Italian pasta dishes",
    "ML algorithms learn patterns from data",
    "Weather forecast for next week"
]

# Create query-document pairs
pairs = [(query, doc) for doc in documents]

# Get relevance scores
scores = reranker.compute_score(pairs)

# Sort documents by relevance
ranked_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)

for doc, score in ranked_docs:
    print(f"Score: {score:.4f} - {doc[:50]}...")
```
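
In practice, reranking is the second stage of a retrieve-then-rerank pipeline: a fast first-stage retriever produces candidates and the reranker reorders them. A minimal sketch of that pattern, where `first_stage_search` is a hypothetical placeholder for your own retriever:

```python
def rerank_results(reranker, query, first_stage_search, k=10):
    """Two-stage retrieval sketch: retrieve broadly, then rerank precisely.

    `first_stage_search` is a hypothetical callable returning candidate strings.
    """
    candidates = first_stage_search(query, top_n=100)  # cheap, recall-oriented
    pairs = [(query, doc) for doc in candidates]
    scores = reranker.compute_score(pairs)             # expensive, precision-oriented
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return ranked[:k]
```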

### Batch Processing with Custom Instructions

```python
from FlagEmbedding import FlagReranker

# Initialize with custom instructions
reranker = FlagReranker(
    'BAAI/bge-reranker-base',
    query_instruction_for_rerank="Query: ",
    passage_instruction_for_rerank="Passage: ",
    query_instruction_format="{}{}",
    passage_instruction_format="{}{}",
    use_fp16=True,
    batch_size=64
)

# Multiple queries
queries = [
    "Python programming tutorials",
    "Machine learning algorithms",
    "Data science techniques"
]

documents = [
    "Learn Python programming from scratch",
    "Advanced ML algorithms explained",
    "Data analysis with pandas and numpy",
    "Web development with Django",
    "Deep learning neural networks"
]

# Score all query-document combinations
all_pairs = [(q, d) for q in queries for d in documents]
scores = reranker.compute_score(all_pairs)

# Reshape scores for analysis
import numpy as np
score_matrix = np.array(scores).reshape(len(queries), len(documents))

for i, query in enumerate(queries):
    print(f"\nQuery: {query}")
    query_scores = score_matrix[i]
    ranked_indices = np.argsort(query_scores)[::-1]

    for j in ranked_indices[:3]:  # Top 3 documents
        print(f"  {query_scores[j]:.4f}: {documents[j]}")
```

### LLM Reranker Usage

```python
from FlagEmbedding import FlagLLMReranker

# Initialize LLM reranker for nuanced scoring
reranker = FlagLLMReranker(
    'BAAI/bge-reranker-v2-gemma',
    use_fp16=True,
    batch_size=32,   # Smaller batch for LLM
    max_length=1024  # Longer context for LLM
)

# Complex query requiring reasoning
query = "How can renewable energy help reduce climate change impacts?"

documents = [
    "Solar panels convert sunlight to electricity with zero emissions",
    "Climate change causes rising sea levels and extreme weather",
    "Wind turbines generate clean energy without carbon footprint",
    "Fossil fuels are the primary cause of greenhouse gas emissions",
    "Electric vehicles reduce transportation emissions significantly"
]

pairs = [(query, doc) for doc in documents]
scores = reranker.compute_score(pairs)

# LLM rerankers often provide more nuanced scoring
for doc, score in zip(documents, scores):
    print(f"{score:.4f}: {doc}")
```

### Multi-GPU Reranking

```python
import numpy as np

from FlagEmbedding import FlagReranker

# Use multiple GPUs for large-scale reranking
reranker = FlagReranker(
    'BAAI/bge-reranker-large',
    devices=['cuda:0', 'cuda:1', 'cuda:2'],
    batch_size=256,
    use_fp16=True
)

# Large-scale reranking scenario
query = "artificial intelligence applications"
large_document_set = [f"Document {i} about AI applications" for i in range(10000)]

# Create pairs (this can be memory intensive)
pairs = [(query, doc) for doc in large_document_set]

# Efficient batch processing across GPUs
scores = reranker.compute_score(pairs)

# Get top-k results
k = 100
top_indices = np.argsort(scores)[-k:][::-1]
top_documents = [large_document_set[i] for i in top_indices]
top_scores = [scores[i] for i in top_indices]
```
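
When the candidate set grows further, materializing every pair up front may not fit in memory. A minimal chunked-scoring sketch (the chunk size is an illustrative choice):

```python
def score_in_chunks(reranker, query, documents, chunk_size=2048):
    """Score documents chunk by chunk to bound peak memory for the pair list."""
    scores = []
    for start in range(0, len(documents), chunk_size):
        chunk = documents[start:start + chunk_size]
        scores.extend(reranker.compute_score([(query, doc) for doc in chunk]))
    return scores
```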

### Lightweight Reranker for Resource Constraints

```python
from FlagEmbedding import LightWeightFlagLLMReranker

# Use lightweight reranker for efficiency
reranker = LightWeightFlagLLMReranker(
    'BAAI/bge-reranker-v2.5-gemma2-lightweight',
    use_fp16=True,
    batch_size=128,
    normalize=True  # Normalize scores for consistency
)

# Efficient processing with good performance
query = "best practices for software development"
candidates = [
    "Code review processes improve software quality",
    "Unit testing prevents bugs in production",
    "Agile methodology enhances team collaboration",
    "Version control systems track code changes"
]

pairs = [(query, candidate) for candidate in candidates]
scores = reranker.compute_score(pairs)

# Normalized scores for easy interpretation
for candidate, score in zip(candidates, scores):
    print(f"Relevance: {score:.3f} - {candidate}")
```

### Layer-wise Processing

```python
from FlagEmbedding import LayerWiseFlagLLMReranker

# Layer-wise reranker for balanced performance-efficiency
reranker = LayerWiseFlagLLMReranker(
    'BAAI/bge-reranker-v2-minicpm-layerwise',
    use_fp16=True,
    batch_size=64
)

# Particularly effective for medium-scale tasks
query = "quantum computing applications"
documents = [
    "Quantum computers solve complex optimization problems",
    "Classical computers use binary logic gates",
    "Quantum algorithms leverage superposition and entanglement",
    "Cryptography applications of quantum computing",
    "Machine learning acceleration with quantum processors"
]

pairs = [(query, doc) for doc in documents]
scores = reranker.compute_score(pairs)

# Layer-wise processing often provides good relevance ranking
sorted_results = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
for doc, score in sorted_results:
    print(f"{score:.4f}: {doc}")
```

## Supported Models

### Encoder-Only Rerankers (use with FlagReranker)
- bge-reranker-base (standard cross-encoder)
- bge-reranker-large (larger cross-encoder)
- bge-reranker-v2-m3 (multilingual cross-encoder built on bge-m3)

### LLM Rerankers (use with FlagLLMReranker)
- bge-reranker-v2-gemma (Gemma-based reranker)

### Specialized LLM Rerankers
- bge-reranker-v2-minicpm-layerwise (layer-wise processing; use with LayerWiseFlagLLMReranker)
- bge-reranker-v2.5-gemma2-lightweight (lightweight variant; use with LightWeightFlagLLMReranker)
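
A quick-reference pairing of model IDs (assuming the Hugging Face `BAAI` namespace) with their wrapper classes. The map and the `load_reranker` helper are conveniences for illustration, not part of the FlagEmbedding API:

```python
from FlagEmbedding import (FlagReranker, FlagLLMReranker,
                           LayerWiseFlagLLMReranker, LightWeightFlagLLMReranker)

# Which wrapper class loads which model (illustrative convenience map)
RERANKER_CLASSES = {
    'BAAI/bge-reranker-base': FlagReranker,
    'BAAI/bge-reranker-large': FlagReranker,
    'BAAI/bge-reranker-v2-m3': FlagReranker,
    'BAAI/bge-reranker-v2-gemma': FlagLLMReranker,
    'BAAI/bge-reranker-v2-minicpm-layerwise': LayerWiseFlagLLMReranker,
    'BAAI/bge-reranker-v2.5-gemma2-lightweight': LightWeightFlagLLMReranker,
}

def load_reranker(model_id, **kwargs):
    """Instantiate the matching wrapper class for a known model ID."""
    return RERANKER_CLASSES[model_id](model_id, **kwargs)
```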

## Model Selection Guidelines

### FlagReranker (Encoder-Only)
- **Best for**: Fast, efficient reranking
- **Use when**: Need high throughput, have shorter documents
- **Pros**: Fast inference, lower memory usage
- **Cons**: Limited context understanding

### FlagLLMReranker (LLM-Based)
- **Best for**: Complex reasoning, nuanced relevance
- **Use when**: Need sophisticated understanding, longer contexts
- **Pros**: Better understanding, contextual reasoning
- **Cons**: Slower inference, higher memory usage

### LayerWiseFlagLLMReranker
- **Best for**: Balanced performance and efficiency
- **Use when**: Medium-scale tasks, need LLM benefits with efficiency
- **Pros**: Good balance of speed and understanding
- **Cons**: Model-specific implementation

### LightWeightFlagLLMReranker
- **Best for**: Resource-constrained environments
- **Use when**: Limited compute, need reasonable LLM performance
- **Pros**: Lower resource usage, still provides LLM benefits
- **Cons**: May sacrifice some accuracy for efficiency

## Types

```python { .api }
from typing import List, Tuple
import numpy as np

# Core reranking types
QueryDocumentPair = Tuple[str, str]
RelevanceScore = float
BatchPairs = List[QueryDocumentPair]
BatchScores = np.ndarray

# Instruction formatting
InstructionFormat = str  # Format string with {} placeholders
```
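
A short illustration of these aliases in a typed helper (a sketch; `reranker` may be any of the classes above, and the aliases are the ones defined in the block just shown):

```python
from typing import List

def score_pairs(reranker, pairs: BatchPairs) -> List[RelevanceScore]:
    """Type-annotated wrapper: one relevance score per query-document pair."""
    return list(reranker.compute_score(pairs))
```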