FlagEmbedding - BGE: One-Stop Retrieval Toolkit For Search and RAG
npx @tessl/cli install tessl/pypi-flagembedding@1.3.00
# FlagEmbedding

FlagEmbedding (BGE - BAAI General Embedding) is a Python library focused on retrieval-augmented language models and embedding technologies. The package provides state-of-the-art text embedding models, rerankers, and multimodal embedding capabilities for search and RAG applications. It includes tools for inference and fine-tuning of embedding models as well as evaluation frameworks, and it supports embedding tasks including text-to-text, text-to-image, and image-to-text retrieval.

## Package Information

- **Package Name**: FlagEmbedding
- **Language**: Python
- **Installation**: `pip install FlagEmbedding`

## Core Imports

```python
from FlagEmbedding import FlagAutoModel, FlagAutoReranker
```

For direct model access:

```python
from FlagEmbedding import FlagModel, BGEM3FlagModel, FlagReranker
from FlagEmbedding import FlagLLMModel, FlagICLModel, FlagLLMReranker
```

For model class enumeration:

```python
from FlagEmbedding import EmbedderModelClass, RerankerModelClass
```

## Basic Usage

```python
from FlagEmbedding import FlagAutoModel, FlagAutoReranker

# Initialize embedder with automatic model selection
embedder = FlagAutoModel.from_finetuned('BAAI/bge-large-en-v1.5', use_fp16=True)

# Encode queries and documents
queries = ["What is machine learning?", "How do neural networks work?"]
documents = [
    "Machine learning is a subset of artificial intelligence...",
    "Neural networks are computing systems inspired by biological neural networks..."
]

query_embeddings = embedder.encode_queries(queries)
doc_embeddings = embedder.encode_corpus(documents)

# Initialize reranker for scoring
reranker = FlagAutoReranker.from_finetuned('BAAI/bge-reranker-base')

# Score query-document pairs
pairs = [("What is machine learning?", "Machine learning is a subset of artificial intelligence...")]
scores = reranker.compute_score(pairs)

print(f"Relevance score: {scores[0]}")
```
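
Once queries and documents are encoded, retrieval usually reduces to comparing vectors. The sketch below ranks documents by inner product, which equals cosine similarity when the embeddings are unit length (as with `normalize_embeddings=True`). The toy vectors and the `rank_documents` helper are illustrative assumptions, not FlagEmbedding output or API.

```python
from typing import List, Sequence, Tuple


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (doc_index, score) pairs sorted by descending inner product."""
    scores = [
        sum(q * d for q, d in zip(query_vec, doc_vec))
        for doc_vec in doc_vecs
    ]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy unit-length vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_documents(query, docs)
print(ranking)
```

If the embedder returns NumPy arrays, the same score matrix can be computed in one step as `query_embeddings @ doc_embeddings.T`.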

## Architecture

FlagEmbedding is organized as a class hierarchy that supports several model types:

- **Abstract Base Classes**: `AbsEmbedder` and `AbsReranker` define the interface contracts for all embedding and reranking models
- **Auto Models**: Factory classes (`FlagAutoModel`, `FlagAutoReranker`) that automatically select the appropriate model implementation
- **Concrete Implementations**: Specialized classes for encoder-only models (BERT-like), decoder-only models (LLM-like), and hybrid approaches
- **Multi-device Support**: Built-in parallelization across multiple GPUs/devices for scalable inference

## Capabilities

### Auto Model Factory

Automatically selects and initializes the appropriate embedder or reranker class based on the model name, providing the simplest way to use FlagEmbedding with any supported model.

```python { .api }
class FlagAutoModel:
    @classmethod
    def from_finetuned(
        cls,
        model_name_or_path: str,
        model_class: Optional[str] = None,
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        query_instruction_for_retrieval: Optional[str] = None,
        devices: Optional[List[str]] = None,
        pooling_method: Optional[str] = None,
        trust_remote_code: Optional[bool] = None,
        **kwargs
    ) -> AbsEmbedder: ...

class FlagAutoReranker:
    @classmethod
    def from_finetuned(
        cls,
        model_name_or_path: str,
        model_class: Optional[str] = None,
        use_fp16: bool = False,
        trust_remote_code: Optional[bool] = None,
        **kwargs
    ) -> AbsReranker: ...
```

[Auto Models](./auto-models.md)
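
The selection step can be pictured as a lookup from the model name to a model class, with `model_class` acting as an explicit override. The following is a simplified sketch of that idea, not the library's implementation: the `_NAME_TO_CLASS` table and the `resolve_model_class` function are hypothetical, and the table is a small illustrative subset.

```python
import os
from typing import Dict, Optional

# Hypothetical mapping from model name to model class (illustrative subset).
_NAME_TO_CLASS: Dict[str, str] = {
    "bge-large-en-v1.5": "encoder-only-base",
    "bge-m3": "encoder-only-m3",
    "bge-multilingual-gemma2": "decoder-only-base",
    "bge-en-icl": "decoder-only-icl",
}


def resolve_model_class(model_name_or_path: str, model_class: Optional[str] = None) -> str:
    """Pick a model class: explicit override first, then lookup by name."""
    if model_class is not None:
        return model_class
    # Use the last path component so "BAAI/bge-m3" and "/models/bge-m3" both match.
    name = os.path.basename(model_name_or_path.rstrip("/"))
    try:
        return _NAME_TO_CLASS[name]
    except KeyError:
        raise ValueError(f"Unknown model {name!r}; pass model_class explicitly.") from None


print(resolve_model_class("BAAI/bge-m3"))
print(resolve_model_class("my-finetuned-model", model_class="encoder-only-base"))
```

Under this reading, a custom fine-tuned checkpoint whose name is not in the table would need `model_class` passed explicitly.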

### Encoder-Only Embedders

Embedders designed for encoder-only transformer models (BERT-like architectures), including specialized implementations for BGE-M3 models with dense, sparse, and ColBERT support.

```python { .api }
class FlagModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "cls",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        trust_remote_code: bool = False,
        **kwargs
    ): ...

class BGEM3FlagModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "cls",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        colbert_dim: int = -1,
        return_dense: bool = True,
        return_sparse: bool = False,
        return_colbert_vecs: bool = False,
        **kwargs
    ): ...
```

[Encoder-Only Embedders](./encoder-embedders.md)
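
Sparse output assigns a weight to each token of a text rather than producing a single dense vector. One common way to compare two texts from such weights is to sum the products of the weights of the tokens they share. The sketch below illustrates that idea with hand-written token weights; the `lexical_score` helper and the example values are assumptions, not output from a model.

```python
from typing import Dict


def lexical_score(query_weights: Dict[str, float], doc_weights: Dict[str, float]) -> float:
    """Sum the products of weights for tokens present in both texts."""
    shared = query_weights.keys() & doc_weights.keys()
    return sum(query_weights[token] * doc_weights[token] for token in shared)


# Hypothetical token weights for a query and a document.
q = {"machine": 0.5, "learning": 0.5}
d = {"machine": 0.4, "learning": 0.2, "subset": 0.1}

score = lexical_score(q, d)
print(round(score, 3))
```

Tokens that appear in only one of the two texts, such as `"subset"` here, contribute nothing to the score.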

### Decoder-Only Embedders

Embedders for decoder-only transformer models (LLM-like architectures), including support for in-context learning approaches.

```python { .api }
class FlagLLMModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "last_token",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        query_instruction_format: str = "Instruct: {}\nQuery: {}",
        **kwargs
    ): ...

class FlagICLModel(AbsEmbedder):
    def __init__(
        self,
        model_name_or_path: str,
        pooling_method: str = "last_token",
        normalize_embeddings: bool = True,
        use_fp16: bool = True,
        **kwargs
    ): ...
```

[Decoder-Only Embedders](./decoder-embedders.md)
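
The `query_instruction_format` template contains two `{}` placeholders. A reasonable reading, assumed here, is that the first receives the task instruction and the second receives the query text. The snippet applies a template of this shape with plain `str.format`; the instruction string is an illustrative example.

```python
# Template with two positional placeholders: instruction, then query.
query_instruction_format = "Instruct: {}\nQuery: {}"

instruction = "Given a question, retrieve passages that answer it."
query = "What is machine learning?"

prompt = query_instruction_format.format(instruction, query)
print(prompt)
```

The `\n` in the template is a real newline, so the instruction and the query end up on separate lines of the prompt.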

### Reranking Models

Reranking models for scoring query-document pairs, available in both encoder-only and decoder-only variants with specialized implementations for different use cases.

```python { .api }
class FlagReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        trust_remote_code: bool = False,
        **kwargs
    ): ...

class FlagLLMReranker(AbsReranker):
    def __init__(
        self,
        model_name_or_path: str,
        use_fp16: bool = False,
        **kwargs
    ): ...
```

[Rerankers](./rerankers.md)
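
A typical use of a reranker is to score each (query, candidate) pair and reorder the candidates by score. The sketch below shows that flow with a stand-in scoring function so that it runs without downloading a model. `rerank` and `toy_score` are hypothetical helpers; in practice the scores would come from a call such as `reranker.compute_score(pairs)`.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and sort by descending score."""
    pairs = [(query, doc) for doc in candidates]
    scores = score_fn(pairs)
    return sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)


def toy_score(pairs: List[Pair]) -> List[float]:
    """Stand-in scorer: counts the words shared by query and document."""
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]


results = rerank(
    "what is machine learning",
    ["neural networks are computing systems", "machine learning is a subset of ai"],
    toy_score,
)
print(results[0][0])
```

Passing the scorer as a function keeps the sorting logic independent of which reranker class produces the scores.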

### Base Classes and Interfaces

Abstract base classes that define the core interface contracts for embedders and rerankers, providing multi-device support and consistent API patterns.

```python { .api }
class AbsEmbedder:
    def encode_queries(
        self,
        queries: List[str],
        batch_size: Optional[int] = None,
        max_length: Optional[int] = None,
        convert_to_numpy: Optional[bool] = None,
        **kwargs
    ) -> Union[torch.Tensor, np.ndarray]: ...

class AbsReranker:
    def compute_score(
        self,
        sentence_pairs: List[Tuple[str, str]],
        **kwargs
    ) -> np.ndarray: ...
```

[Base Classes](./base-classes.md)

### Model Enumerations and Utilities

Enumerations for supported model classes and utility functions for discovering available models and their capabilities.

```python { .api }
class EmbedderModelClass(Enum):
    ENCODER_ONLY_BASE = "encoder-only-base"
    ENCODER_ONLY_M3 = "encoder-only-m3"
    DECODER_ONLY_BASE = "decoder-only-base"
    DECODER_ONLY_ICL = "decoder-only-icl"

class RerankerModelClass(Enum):
    ENCODER_ONLY_BASE = "encoder-only-base"
    DECODER_ONLY_BASE = "decoder-only-base"
    DECODER_ONLY_LAYERWISE = "decoder-only-layerwise"
    DECODER_ONLY_LIGHTWEIGHT = "decoder-only-lightweight"

def support_model_list() -> List[str]: ...
def support_native_bge_model_list() -> List[str]: ...
```

[Model Types and Utilities](./model-types.md)
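
Because the enumeration members carry string values, a string such as `"encoder-only-m3"` can be converted to a member with standard `Enum` lookup by value. The example redefines `EmbedderModelClass` locally, copying the values from the listing above, so that it runs without the package installed; with FlagEmbedding installed, the class would be imported from the package instead.

```python
from enum import Enum


# Local copy of the enumeration from the listing, for illustration only.
class EmbedderModelClass(Enum):
    ENCODER_ONLY_BASE = "encoder-only-base"
    ENCODER_ONLY_M3 = "encoder-only-m3"
    DECODER_ONLY_BASE = "decoder-only-base"
    DECODER_ONLY_ICL = "decoder-only-icl"


# Look up a member by its string value.
member = EmbedderModelClass("encoder-only-m3")
print(member.name)

# Collect every valid string value, e.g. to validate user input.
valid_values = [m.value for m in EmbedderModelClass]
print(valid_values)
```

Looking up a value that is not defined raises `ValueError`, which makes the enumeration a convenient validator for a user-supplied `model_class` string.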

## Supported Models

FlagEmbedding supports a comprehensive range of pre-trained models:

- **BGE Models**: bge-m3, bge-large-en-v1.5, bge-base-en-v1.5, bge-small-en-v1.5, bge-large-zh-v1.5, bge-multilingual-gemma2, bge-en-icl
- **E5 Models**: e5-mistral-7b-instruct, e5-large-v2, e5-base-v2, multilingual-e5-large-instruct, multilingual-e5-large
- **GTE Models**: gte-Qwen2-7B-instruct, gte-Qwen2-1.5B-instruct, gte-large-en-v1.5, gte-base-en-v1.5, gte-multilingual-base
- **Reranker Models**: bge-reranker-base, bge-reranker-large, bge-reranker-v2-m3, bge-reranker-v2-gemma, bge-reranker-v2-minicpm-layerwise

## Common Types

```python { .api }
from typing import List, Union, Optional, Dict, Any, Tuple
import torch
import numpy as np

# Core types used across the API
QueryType = Union[str, List[str]]
CorpusType = Union[str, List[str]]
EmbeddingOutput = Union[torch.Tensor, np.ndarray]
SentencePair = Tuple[str, str]
DeviceSpec = Union[str, List[str]]
```
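
Aliases such as `QueryType` accept either a single string or a list of strings, so code that consumes them usually normalizes the input to a list first. A minimal sketch, where `as_list` is a hypothetical helper and not part of FlagEmbedding:

```python
from typing import List, Union

QueryType = Union[str, List[str]]


def as_list(texts: QueryType) -> List[str]:
    """Wrap a single string in a list; copy an existing list."""
    if isinstance(texts, str):
        return [texts]
    return list(texts)


print(as_list("What is machine learning?"))
print(as_list(["first query", "second query"]))
```

The `isinstance` check comes first because a string is itself iterable, and calling `list` on it would split it into characters.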