
# Rerankers


Reranking models for scoring query-document pairs to improve retrieval accuracy. Rerankers take a query and a set of candidate documents and assign relevance scores to help identify the most relevant matches.
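In practice this is a single call: build (query, document) pairs and ask the model for scores. A minimal sketch (the checkpoint name is one of the models listed below; fuller workflows follow in Usage Examples):

```python
from FlagEmbedding import FlagReranker

# Minimal sketch; see Usage Examples below for complete workflows.
reranker = FlagReranker('BAAI/bge-reranker-base', use_fp16=True)
scores = reranker.compute_score([
    ("what is a panda?", "The giant panda is a bear species endemic to China."),
    ("what is a panda?", "Paris is the capital of France."),
])
print(scores)  # higher score = more relevant
```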


## Capabilities


### FlagReranker (Base Encoder Reranker)


Standard reranker for encoder-only models. Efficiently scores query-document pairs using a cross-encoder architecture, in which query and document are encoded jointly for high-accuracy relevance scoring.

```python { .api }
from typing import List, Optional, Union

class FlagReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        trust_remote_code: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs
    ):
        """
        Initialize encoder-only reranker.

        Args:
            model_name_or_path: Path to reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            trust_remote_code: Allow custom model code execution
            cache_dir: Directory for model cache
            **kwargs: Additional model parameters
        """
```
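The `*_instruction_for_rerank` and `*_instruction_format` fields control how instructions are attached to inputs. A minimal sketch of the presumed composition (the default `"{}{}"` simply concatenates instruction and text; this helper is illustrative, not the library's internal code):

```python
from typing import Optional

# Illustrative only: shows how a "{}{}" format string presumably combines
# an instruction with the raw query/passage text.
def apply_instruction(text: str,
                      instruction: Optional[str],
                      instruction_format: str = "{}{}") -> str:
    if instruction is None:
        return text
    return instruction_format.format(instruction, text)

print(apply_instruction("What is machine learning?", "Query: "))
# -> "Query: What is machine learning?"
```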


### FlagLLMReranker (Base LLM Reranker)


Reranker built on large language models, leveraging their reasoning capabilities for nuanced query-document relevance scoring.

```python { .api }
from typing import List, Optional, Union

class FlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize LLM-based reranker.

        Args:
            model_name_or_path: Path to LLM reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```
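The `normalize` flag is commonly implemented as a sigmoid over the raw relevance logits, mapping them into (0, 1). A sketch under that assumption, shown only for intuition about what normalized scores look like:

```python
import math

# Assumption: normalize=True applies a sigmoid to the raw logit. This is an
# illustration of the mapping, not the library's internal code.
def sigmoid_normalize(raw_score: float) -> float:
    return 1.0 / (1.0 + math.exp(-raw_score))

print(f"{sigmoid_normalize(2.3):.3f}")   # ~0.909
print(f"{sigmoid_normalize(-1.5):.3f}")  # ~0.182
```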


### LayerWiseFlagLLMReranker (Layer-wise LLM Reranker)


Specialized LLM reranker that can produce relevance scores from intermediate layers, trading a small amount of accuracy for faster inference. Well suited to large-scale reranking tasks.

```python { .api }
from typing import List, Optional, Union

class LayerWiseFlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize layer-wise LLM reranker for efficient processing.

        Args:
            model_name_or_path: Path to layer-wise reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```
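Layer-wise checkpoints let the caller choose which layer produces the score. In the upstream BAAI examples this is passed as `cutoff_layers` to `compute_score`; treat the exact argument name and value as model-specific and confirm them on the model card:

```python
from FlagEmbedding import LayerWiseFlagLLMReranker

reranker = LayerWiseFlagLLMReranker(
    'BAAI/bge-reranker-v2-minicpm-layerwise',
    use_fp16=True
)

# cutoff_layers=[28] follows the upstream example for this checkpoint
# (assumption: verify the recommended layer on the model card).
score = reranker.compute_score(
    [("what is a panda?", "The giant panda is a bear species endemic to China.")],
    cutoff_layers=[28]
)
print(score)
```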


### LightWeightFlagLLMReranker (Lightweight LLM Reranker)


Lightweight LLM reranker optimized for resource-constrained environments, retaining strong reranking quality at reduced computational cost.

```python { .api }
from typing import List, Optional, Union

class LightWeightFlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        query_instruction_for_rerank: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        passage_instruction_for_rerank: Optional[str] = None,
        passage_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 128,
        query_max_length: Optional[int] = None,
        max_length: int = 512,
        normalize: bool = False,
        **kwargs
    ):
        """
        Initialize lightweight LLM reranker for efficient processing.

        Args:
            model_name_or_path: Path to lightweight reranker model
            use_fp16: Use half precision for inference
            query_instruction_for_rerank: Instruction prepended to queries
            query_instruction_format: Format string for query instructions
            passage_instruction_for_rerank: Instruction prepended to passages
            passage_instruction_format: Format string for passage instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for scoring
            query_max_length: Maximum query token length
            max_length: Maximum total sequence length
            normalize: Whether to normalize output scores
            **kwargs: Additional model parameters
        """
```
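The lightweight variant additionally supports compressing the token sequence during scoring. The upstream model card shows `compress_ratio` and `compress_layers` arguments alongside `cutoff_layers`; these names and values are checkpoint-specific assumptions to verify there:

```python
from FlagEmbedding import LightWeightFlagLLMReranker

reranker = LightWeightFlagLLMReranker(
    'BAAI/bge-reranker-v2.5-gemma2-lightweight',
    use_fp16=True
)

# Values follow the upstream example for this checkpoint (assumptions: verify
# on the model card): compress tokens 2x at layers 24 and 40, score at layer 28.
score = reranker.compute_score(
    [("what is a panda?", "The giant panda is a bear species endemic to China.")],
    cutoff_layers=[28],
    compress_ratio=2,
    compress_layers=[24, 40]
)
print(score)
```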


## Usage Examples


### Basic Reranking

```python
from FlagEmbedding import FlagReranker

# Initialize reranker
reranker = FlagReranker('BAAI/bge-reranker-base', use_fp16=True)

# Score query-document pairs
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of artificial intelligence",
    "Cooking recipes for Italian pasta dishes",
    "ML algorithms learn patterns from data",
    "Weather forecast for next week"
]

# Create query-document pairs
pairs = [(query, doc) for doc in documents]

# Get relevance scores
scores = reranker.compute_score(pairs)

# Sort documents by relevance
ranked_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
for doc, score in ranked_docs:
    print(f"Score: {score:.4f} - {doc[:50]}...")
```


### Batch Processing with Custom Instructions


```python

207

from FlagEmbedding import FlagReranker

208

209

# Initialize with custom instructions

210

reranker = FlagReranker(

211

'bge-reranker-base',

212

query_instruction_for_rerank="Query: ",

213

passage_instruction_for_rerank="Passage: ",

214

query_instruction_format="{}{}",

215

passage_instruction_format="{}{}",

216

use_fp16=True,

217

batch_size=64

218

)

219

220

# Multiple queries

221

queries = [

222

"Python programming tutorials",

223

"Machine learning algorithms",

224

"Data science techniques"

225

]

226

227

documents = [

228

"Learn Python programming from scratch",

229

"Advanced ML algorithms explained",

230

"Data analysis with pandas and numpy",

231

"Web development with Django",

232

"Deep learning neural networks"

233

]

234

235

# Score all query-document combinations

236

all_pairs = [(q, d) for q in queries for d in documents]

237

scores = reranker.compute_score(all_pairs)

238

239

# Reshape scores for analysis

240

import numpy as np

241

score_matrix = np.array(scores).reshape(len(queries), len(documents))

242

243

for i, query in enumerate(queries):

244

print(f"\\nQuery: {query}")

245

query_scores = score_matrix[i]

246

ranked_indices = np.argsort(query_scores)[::-1]

247

248

for j in ranked_indices[:3]: # Top 3 documents

249

print(f" {query_scores[j]:.4f}: {documents[j]}")

250

```


### LLM Reranker Usage


```python
from FlagEmbedding import FlagLLMReranker

# Initialize LLM reranker for nuanced scoring
reranker = FlagLLMReranker(
    'BAAI/bge-reranker-v2-gemma',
    use_fp16=True,
    batch_size=32,    # Smaller batch for LLM
    max_length=1024   # Longer context for LLM
)

# Complex query requiring reasoning
query = "How can renewable energy help reduce climate change impacts?"

documents = [
    "Solar panels convert sunlight to electricity with zero emissions",
    "Climate change causes rising sea levels and extreme weather",
    "Wind turbines generate clean energy without carbon footprint",
    "Fossil fuels are the primary cause of greenhouse gas emissions",
    "Electric vehicles reduce transportation emissions significantly"
]

pairs = [(query, doc) for doc in documents]
scores = reranker.compute_score(pairs)

# LLM rerankers often provide more nuanced scoring
for doc, score in zip(documents, scores):
    print(f"{score:.4f}: {doc}")
```


### Multi-GPU Reranking


```python
import numpy as np
from FlagEmbedding import FlagReranker

# Use multiple GPUs for large-scale reranking
reranker = FlagReranker(
    'BAAI/bge-reranker-large',
    devices=['cuda:0', 'cuda:1', 'cuda:2'],
    batch_size=256,
    use_fp16=True
)

# Large-scale reranking scenario
query = "artificial intelligence applications"
large_document_set = [f"Document {i} about AI applications" for i in range(10000)]

# Create pairs (this can be memory intensive)
pairs = [(query, doc) for doc in large_document_set]

# Batches are distributed across the configured GPUs
scores = reranker.compute_score(pairs)

# Get top-k results
k = 100
top_indices = np.argsort(scores)[-k:][::-1]
top_documents = [large_document_set[i] for i in top_indices]
top_scores = [scores[i] for i in top_indices]
```


### Lightweight Reranker for Resource Constraints


```python
from FlagEmbedding import LightWeightFlagLLMReranker

# Use lightweight reranker for efficiency
reranker = LightWeightFlagLLMReranker(
    'BAAI/bge-reranker-v2.5-gemma2-lightweight',
    use_fp16=True,
    batch_size=128,
    normalize=True  # Normalize scores for consistency
)

# Efficient processing with good performance
query = "best practices for software development"
candidates = [
    "Code review processes improve software quality",
    "Unit testing prevents bugs in production",
    "Agile methodology enhances team collaboration",
    "Version control systems track code changes"
]

pairs = [(query, candidate) for candidate in candidates]
scores = reranker.compute_score(pairs)

# Normalized scores for easy interpretation
for candidate, score in zip(candidates, scores):
    print(f"Relevance: {score:.3f} - {candidate}")
```


### Layer-wise Processing


```python
from FlagEmbedding import LayerWiseFlagLLMReranker

# Layer-wise reranker for balanced performance and efficiency
reranker = LayerWiseFlagLLMReranker(
    'BAAI/bge-reranker-v2-minicpm-layerwise',
    use_fp16=True,
    batch_size=64
)

# Particularly effective for medium-scale tasks
query = "quantum computing applications"
documents = [
    "Quantum computers solve complex optimization problems",
    "Classical computers use binary logic gates",
    "Quantum algorithms leverage superposition and entanglement",
    "Cryptography applications of quantum computing",
    "Machine learning acceleration with quantum processors"
]

pairs = [(query, doc) for doc in documents]
# Some layer-wise checkpoints expect a cutoff_layers argument here; see the
# note under the LayerWiseFlagLLMReranker API above.
scores = reranker.compute_score(pairs)

# Layer-wise processing often provides good relevance ranking
sorted_results = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
for doc, score in sorted_results:
    print(f"{score:.4f}: {doc}")
```

374

375

## Supported Models


### Encoder-Only Rerankers

- bge-reranker-base (standard cross-encoder)
- bge-reranker-large (larger cross-encoder)
- bge-reranker-v2-m3 (multilingual cross-encoder built on bge-m3; used with FlagReranker)

### LLM Rerankers

- bge-reranker-v2-gemma (Gemma-based reranker)

### Specialized LLM Rerankers

- bge-reranker-v2-minicpm-layerwise (layer-wise processing)
- bge-reranker-v2.5-gemma2-lightweight (lightweight variant)

## Model Selection Guidelines


### FlagReranker (Encoder-Only)

- **Best for**: Fast, efficient reranking
- **Use when**: You need high throughput and have shorter documents
- **Pros**: Fast inference, lower memory usage
- **Cons**: Limited context understanding

### FlagLLMReranker (LLM-Based)

- **Best for**: Complex reasoning, nuanced relevance
- **Use when**: You need sophisticated understanding and longer contexts
- **Pros**: Better understanding, contextual reasoning
- **Cons**: Slower inference, higher memory usage

### LayerWiseFlagLLMReranker

- **Best for**: Balancing performance and efficiency
- **Use when**: Medium-scale tasks that need LLM quality at lower cost
- **Pros**: Good balance of speed and understanding
- **Cons**: Model-specific implementation

### LightWeightFlagLLMReranker

- **Best for**: Resource-constrained environments
- **Use when**: Compute is limited but you still want LLM-grade reranking
- **Pros**: Lower resource usage, still provides LLM benefits
- **Cons**: May sacrifice some accuracy for efficiency
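These trade-offs can be summarized as a small dispatch helper. A sketch only: the class and checkpoint names come from this page, while the `quality` labels and the mapping are illustrative assumptions, not library API:

```python
from FlagEmbedding import (
    FlagReranker,
    FlagLLMReranker,
    LayerWiseFlagLLMReranker,
    LightWeightFlagLLMReranker,
)

def pick_reranker(quality: str, use_fp16: bool = True):
    """Illustrative heuristic; quality is one of 'fast', 'balanced', 'light', 'best'."""
    if quality == 'fast':          # high throughput, shorter documents
        return FlagReranker('BAAI/bge-reranker-base', use_fp16=use_fp16)
    if quality == 'balanced':      # LLM quality at lower cost
        return LayerWiseFlagLLMReranker(
            'BAAI/bge-reranker-v2-minicpm-layerwise', use_fp16=use_fp16)
    if quality == 'light':         # resource-constrained environments
        return LightWeightFlagLLMReranker(
            'BAAI/bge-reranker-v2.5-gemma2-lightweight', use_fp16=use_fp16)
    # default: full LLM reranker when quality matters more than latency
    return FlagLLMReranker('BAAI/bge-reranker-v2-gemma', use_fp16=use_fp16)
```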


## Types

```python { .api }
from typing import List, Tuple, Optional, Union
import numpy as np

# Core reranking types
QueryDocumentPair = Tuple[str, str]
RelevanceScore = float
BatchPairs = List[QueryDocumentPair]
BatchScores = np.ndarray

# Instruction formatting
InstructionFormat = str  # Format string with {} placeholders
```
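A short sketch showing the aliases in use; the `rerank` helper is hypothetical, not part of FlagEmbedding:

```python
from typing import List, Tuple

# Hypothetical convenience wrapper (not a library function) typed with the
# aliases above: BatchPairs in, ranked (document, RelevanceScore) pairs out.
def rerank(reranker, query: str, documents: List[str]) -> List[Tuple[str, float]]:
    pairs: List[Tuple[str, str]] = [(query, doc) for doc in documents]
    scores = reranker.compute_score(pairs)
    return sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
```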