# Encoder-Only Embedders

Embedders designed for encoder-only transformer models (BERT-like architectures). These models excel at understanding bidirectional context and are particularly effective for semantic similarity tasks and dense retrieval.

## Capabilities

### FlagModel (Base Encoder Embedder)

Standard embedder for encoder-only models using CLS token pooling by default. Supports all standard BERT-like architectures and provides a solid foundation for most embedding tasks.

```python { .api }
from typing import List, Optional, Union

class FlagModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "cls",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        query_instruction_for_retrieval: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 256,
        query_max_length: int = 512,
        passage_max_length: int = 512,
        convert_to_numpy: bool = True,
        trust_remote_code: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs
    ):
        """
        Initialize encoder-only embedder.

        Args:
            model_name_or_path: Path to model or HuggingFace model name
            pooling_method: Pooling strategy ("cls", "mean")
            normalize_embeddings: Whether to normalize output embeddings
            use_fp16: Use half precision for inference
            query_instruction_for_retrieval: Instruction prepended to queries
            query_instruction_format: Format string for instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for encoding
            query_max_length: Maximum query token length
            passage_max_length: Maximum passage token length
            convert_to_numpy: Convert outputs to numpy arrays
            trust_remote_code: Allow custom model code execution
            cache_dir: Directory for model cache
            **kwargs: Additional model parameters
        """
```
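
For intuition, the two pooling strategies reduce the encoder's token-level hidden states to a single vector per text. The sketch below is illustrative only, not the library's internal code; `last_hidden_state` and `attention_mask` are the standard transformer outputs assumed here.

```python
import torch

def pool(last_hidden_state: torch.Tensor,
         attention_mask: torch.Tensor,
         pooling_method: str = "cls") -> torch.Tensor:
    # last_hidden_state: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    if pooling_method == "cls":
        # Use the hidden state of the first ([CLS]) token
        return last_hidden_state[:, 0]
    if pooling_method == "mean":
        # Average token states, ignoring padding positions
        mask = attention_mask.unsqueeze(-1).float()
        summed = (last_hidden_state * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1e-9)
        return summed / counts
    raise ValueError(f"unknown pooling method: {pooling_method}")
```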

### BGEM3FlagModel (Specialized M3 Embedder)

Advanced embedder designed specifically for BGE-M3 models, with support for dense, sparse, and ColBERT representations. Provides unified multi-vector embeddings for comprehensive retrieval scenarios.

```python { .api }
from typing import Any, Dict, List, Optional, Union

import torch

class BGEM3FlagModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "cls",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        query_instruction_for_retrieval: Optional[str] = None,
        query_instruction_format: str = "{}{}",
        devices: Optional[Union[str, List[str]]] = None,
        batch_size: int = 256,
        query_max_length: int = 512,
        passage_max_length: int = 512,
        convert_to_numpy: bool = True,
        colbert_dim: int = -1,
        return_dense: bool = True,
        return_sparse: bool = False,
        return_colbert_vecs: bool = False,
        **kwargs
    ):
        """
        Initialize BGE-M3 specialized embedder.

        Args:
            model_name_or_path: Path to BGE-M3 model
            pooling_method: Pooling strategy ("cls", "mean")
            normalize_embeddings: Whether to normalize output embeddings
            use_fp16: Use half precision for inference
            query_instruction_for_retrieval: Instruction prepended to queries
            query_instruction_format: Format string for instructions
            devices: Device or list of devices for multi-GPU inference
            batch_size: Default batch size for encoding
            query_max_length: Maximum query token length
            passage_max_length: Maximum passage token length
            convert_to_numpy: Convert outputs to numpy arrays
            colbert_dim: ColBERT dimension (-1 for auto)
            return_dense: Include dense embeddings in output
            return_sparse: Include sparse embeddings in output
            return_colbert_vecs: Include ColBERT vectors in output
            **kwargs: Additional model parameters
        """

    def compute_score(
        self,
        q_reps: Dict[str, Any],
        p_reps: Dict[str, Any],
        weights: Optional[List[float]] = None
    ) -> float:
        """
        Compute similarity score between query and passage representations.

        Args:
            q_reps: Query representations (dense, sparse, colbert)
            p_reps: Passage representations (dense, sparse, colbert)
            weights: Weights for combining different representation types

        Returns:
            Combined similarity score
        """

    def compute_lexical_matching_score(
        self,
        lexical_weights_1: Dict[int, float],
        lexical_weights_2: Dict[int, float]
    ) -> float:
        """
        Compute lexical matching score between sparse representations.

        Args:
            lexical_weights_1: First sparse representation weights
            lexical_weights_2: Second sparse representation weights

        Returns:
            Lexical matching score
        """

    def colbert_score(
        self,
        q_reps: torch.Tensor,
        p_reps: torch.Tensor
    ) -> float:
        """
        Compute ColBERT similarity score.

        Args:
            q_reps: Query ColBERT vectors
            p_reps: Passage ColBERT vectors

        Returns:
            ColBERT similarity score
        """

    def convert_id_to_token(
        self,
        lexical_weights: Dict[int, float]
    ) -> List[Dict[str, Any]]:
        """
        Convert token IDs in sparse weights to actual tokens.

        Args:
            lexical_weights: Sparse weights with token IDs

        Returns:
            List of token-weight mappings
        """
```
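
To make the scoring methods concrete, the sketch below shows the math each one conceptually performs: lexical matching sums weight products over shared token IDs, ColBERT scoring uses late interaction (best passage-token match per query token, averaged), and the combined score is a weighted sum of the individual signals. This is an illustration, not the library's implementation; the equal default weights are an assumption.

```python
from typing import Dict, List, Optional

import torch

def lexical_match(w1: Dict[int, float], w2: Dict[int, float]) -> float:
    # Sum of weight products over token IDs present in both texts
    return sum(w * w2[t] for t, w in w1.items() if t in w2)

def maxsim(q_vecs: torch.Tensor, p_vecs: torch.Tensor) -> float:
    # Late interaction: max similarity per query token, averaged over query tokens
    sim = q_vecs @ p_vecs.T  # (q_tokens, p_tokens)
    return sim.max(dim=1).values.mean().item()

def combined_score(dense: float, sparse: float, colbert: float,
                   weights: Optional[List[float]] = None) -> float:
    # Weighted sum of the three signals; equal weights assumed if none given
    w = weights or [1 / 3, 1 / 3, 1 / 3]
    return w[0] * dense + w[1] * sparse + w[2] * colbert
```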

## Usage Examples

### Basic Encoder Embedder

```python
from FlagEmbedding import FlagModel

# Initialize with CLS pooling
embedder = FlagModel(
    'BAAI/bge-large-en-v1.5',
    pooling_method="cls",
    use_fp16=True
)

# Encode queries and documents
queries = ["What is deep learning?", "How do transformers work?"]
documents = ["Deep learning is a subset of ML", "Transformers use attention mechanisms"]

query_embeddings = embedder.encode_queries(queries)
doc_embeddings = embedder.encode_corpus(documents)

print(f"Query embeddings shape: {query_embeddings.shape}")
print(f"Document embeddings shape: {doc_embeddings.shape}")
```
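
Because embeddings are normalized by default, cosine similarity reduces to an inner product, so a full query-document score matrix is a single matrix product. Continuing from the arrays above:

```python
# Normalized embeddings: inner product equals cosine similarity
scores = query_embeddings @ doc_embeddings.T  # shape: (n_queries, n_docs)
print(scores)
```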

### Mean Pooling Strategy

```python
from FlagEmbedding import FlagModel

# Use mean pooling instead of CLS
embedder = FlagModel(
    'BAAI/bge-base-en-v1.5',
    pooling_method="mean",
    normalize_embeddings=True
)

texts = ["Example text for embedding"]
embeddings = embedder.encode(texts)
```

### BGE-M3 Multi-Vector Embeddings

```python
from FlagEmbedding import BGEM3FlagModel

# Initialize BGE-M3 with all representation types
embedder = BGEM3FlagModel(
    'BAAI/bge-m3',
    return_dense=True,
    return_sparse=True,
    return_colbert_vecs=True,
    use_fp16=True
)

# Encode with multiple representation types
query = ["machine learning applications"]
passage = ["ML is used in healthcare, finance, and technology"]

query_output = embedder.encode_queries(query)
passage_output = embedder.encode_corpus(passage)

# Access different representation types
if isinstance(query_output, dict):
    dense_query = query_output.get('dense_vecs')
    sparse_query = query_output.get('lexical_weights')
    colbert_query = query_output.get('colbert_vecs')
```
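
The three representation types have different shapes. The structure sketched below is an assumption based on the options enabled above, not a guaranteed contract:

```python
# dense_vecs: one fixed-size vector per input text (assumed shape)
print(dense_query.shape)       # e.g. (1, hidden_dim)
# lexical_weights: one {token_id: weight} dict per input text
print(sparse_query[0])
# colbert_vecs: one (num_tokens, colbert_dim) matrix per input text
print(colbert_query[0].shape)
```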

### M3 Similarity Scoring

```python
from FlagEmbedding import BGEM3FlagModel

embedder = BGEM3FlagModel(
    'BAAI/bge-m3',
    return_dense=True,
    return_sparse=True,
    return_colbert_vecs=True
)

# Get representations for scoring (dicts keyed by representation type)
query_reps = embedder.encode_queries(["machine learning"])
passage_reps = embedder.encode_corpus(["ML algorithms"])

# Compute combined similarity score for the first query/passage pair,
# passing per-item representation dicts as compute_score expects
q_rep = {k: v[0] for k, v in query_reps.items()}
p_rep = {k: v[0] for k, v in passage_reps.items()}
score = embedder.compute_score(q_rep, p_rep)
print(f"Combined similarity: {score}")

# Compute individual scores if needed
if 'lexical_weights' in query_reps:
    lexical_score = embedder.compute_lexical_matching_score(
        query_reps['lexical_weights'][0],
        passage_reps['lexical_weights'][0]
    )
    print(f"Lexical similarity: {lexical_score}")

if 'colbert_vecs' in query_reps:
    colbert_score = embedder.colbert_score(
        query_reps['colbert_vecs'][0],
        passage_reps['colbert_vecs'][0]
    )
    print(f"ColBERT similarity: {colbert_score}")
```

### Custom Instructions for Retrieval

```python
from FlagEmbedding import FlagModel

# Add custom instruction for retrieval tasks
embedder = FlagModel(
    'BAAI/bge-large-en-v1.5',
    query_instruction_for_retrieval="Represent this query for retrieving relevant documents: ",
    query_instruction_format="{}{}"
)

# Queries will be prepended with the instruction
queries = ["best practices for machine learning"]
embeddings = embedder.encode_queries(queries)
```
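
The `query_instruction_format` string controls how the instruction and the query text are joined; with the default `"{}{}"`, the instruction is simply prepended. Conceptually:

```python
instruction = "Represent this query for retrieving relevant documents: "
query = "best practices for machine learning"
final_input = "{}{}".format(instruction, query)
# -> "Represent this query for retrieving relevant documents: best practices for machine learning"
```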

### Multi-GPU Processing

```python
from FlagEmbedding import FlagModel

# Use multiple GPUs for large-scale processing
embedder = FlagModel(
    'BAAI/bge-large-en-v1.5',
    devices=['cuda:0', 'cuda:1', 'cuda:2'],
    batch_size=128
)

# Process a large corpus efficiently
large_corpus = [f"Document {i}" for i in range(50000)]
embeddings = embedder.encode_corpus(large_corpus)
```

## Supported Models

### BGE Models

- bge-large-en-v1.5, bge-base-en-v1.5, bge-small-en-v1.5
- bge-large-zh-v1.5, bge-base-zh-v1.5, bge-small-zh-v1.5
- bge-large-en, bge-base-en, bge-small-en
- bge-large-zh, bge-base-zh, bge-small-zh
- bge-m3 (requires BGEM3FlagModel)

### E5 Models

- e5-large-v2, e5-base-v2, e5-small-v2
- multilingual-e5-large, multilingual-e5-base, multilingual-e5-small
- e5-large, e5-base, e5-small

### GTE Models

- gte-multilingual-base, gte-large-en-v1.5, gte-base-en-v1.5
- gte-large, gte-base, gte-small
- gte-large-zh, gte-base-zh, gte-small-zh

## Types

```python { .api }
from typing import Any, Dict, List, Literal, Optional, Union

import numpy as np
import torch

# BGE-M3 specific types
M3Output = Dict[str, Union[torch.Tensor, np.ndarray, List[Dict[int, float]]]]
SparseWeights = Dict[int, float]
ColBERTVectors = torch.Tensor
DenseEmbedding = Union[torch.Tensor, np.ndarray]

# Pooling method types
PoolingMethod = Literal["cls", "mean"]
```