# Vector Store

The FeastVectorStore class provides vector store functionality for RAG (Retrieval-Augmented Generation) applications and semantic search using Feast's feature store infrastructure. It enables efficient vector similarity search and document retrieval for AI applications.

## Capabilities

### Vector Store Initialization

Initialize a vector store instance with a Feast repository and RAG-enabled feature view.

```python { .api }
class FeastVectorStore:
    def __init__(self, repo_path: str, rag_view: FeatureView, features: List[str]):
        """
        Initialize the Feast vector store.

        Parameters:
        - repo_path: Path to the Feast repository
        - rag_view: Feature view configured for RAG operations
        - features: List of feature names to retrieve in queries
        """
```

### Vector Similarity Search

Query the vector store using vector embeddings or text queries for semantic similarity search.

```python { .api }
def query(
    self,
    query_vector: Optional[np.ndarray] = None,
    query_string: Optional[str] = None,
    top_k: int = 10
) -> OnlineResponse:
    """
    Query the Feast vector store for similar documents.

    Parameters:
    - query_vector: Vector embedding for similarity search
    - query_string: Text query for semantic search
    - top_k: Number of most similar results to return

    Returns:
    OnlineResponse containing the retrieved documents and features

    Note: Either query_vector or query_string must be provided
    """
```

### Vector Store Properties

Access the underlying Feast store and configuration.

```python { .api }
@property
def store(self) -> FeatureStore:
    """Access the underlying FeatureStore instance."""
```
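
For instance, the exposed store can be used for ordinary Feast operations alongside vector queries. A minimal sketch (it assumes the vector_store instance created in the usage examples below):

```python
# Use the underlying FeatureStore for regular registry operations
feast_store = vector_store.store

# List the feature views registered in the repository
for fv in feast_store.list_feature_views():
    print(fv.name)
```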

## Usage Examples

### Basic Vector Search Setup

```python
import numpy as np
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Array, Float32, String
from feast.vector_store import FeastVectorStore

# Define the entity that identifies each document
document = Entity(name="document_id", join_keys=["document_id"])

# Define a RAG-enabled data source with precomputed document embeddings
documents_source = FileSource(
    path="data/document_embeddings.parquet",
    timestamp_field="created_timestamp"
)

# Create feature view for document embeddings
document_embeddings_fv = FeatureView(
    name="document_embeddings",
    entities=[document],
    ttl=timedelta(days=365),
    schema=[
        Field(name="title", dtype=String),
        Field(name="content", dtype=String),
        # Vector field; vector_index marks it for similarity search
        # on vector-enabled online stores
        Field(
            name="embedding",
            dtype=Array(Float32),
            vector_index=True,
            vector_search_metric="COSINE"
        ),
        Field(name="category", dtype=String)
    ],
    source=documents_source
)

# Initialize vector store
vector_store = FeastVectorStore(
    repo_path="./feast_repo",
    rag_view=document_embeddings_fv,
    features=[
        "document_embeddings:title",
        "document_embeddings:content",
        "document_embeddings:embedding",
        "document_embeddings:category"
    ]
)
```
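
Before querying, the entity and feature view typically need to be registered with the Feast registry and the embeddings loaded into the online store. A minimal sketch using standard Feast calls (the materialization window is a placeholder):

```python
from datetime import datetime

# Register the definitions and load the embeddings into the online store
store = vector_store.store
store.apply([document, document_embeddings_fv])
store.materialize_incremental(end_date=datetime.utcnow())
```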

### Vector Similarity Search

```python
# Create query vector (e.g., from a text embedding model)
query_embedding = np.array([0.1, 0.2, 0.3, 0.4, 0.5])  # Example only; must match the stored embedding dimensionality

# Perform vector similarity search
results = vector_store.query(
    query_vector=query_embedding,
    top_k=5
)

# Access results
result_dict = results.to_dict()
print("Top 5 similar documents:")
for i in range(len(result_dict["document_id"])):
    print(f"Document: {result_dict['title'][i]}")
    print(f"Category: {result_dict['category'][i]}")
    print(f"Content: {result_dict['content'][i][:100]}...")
    print("---")
```
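
In practice the query vector comes from the same embedding model that produced the stored document embeddings. A sketch using sentence-transformers (an external library, not part of Feast; the model name is illustrative):

```python
from sentence_transformers import SentenceTransformer

# Must be the same model (and dimensionality) used to embed the documents
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
query_embedding = encoder.encode("feature stores for machine learning")

results = vector_store.query(query_vector=query_embedding, top_k=5)
```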

### Text-Based Semantic Search

```python
# Perform text-based semantic search (if supported by the vector store backend)
results = vector_store.query(
    query_string="machine learning algorithms",
    top_k=10
)

# Convert to DataFrame for analysis
df = results.to_df()
print(df[["title", "category", "content"]])
```

### RAG Pipeline Integration

```python
def rag_query(question: str, vector_store: FeastVectorStore, embedding_model, llm_model):
    """
    Complete RAG pipeline using FeastVectorStore.

    Args:
        question: User question
        vector_store: Configured FeastVectorStore instance
        embedding_model: Model to create embeddings
        llm_model: Language model for generation
    """
    # Generate embedding for the question
    question_embedding = embedding_model.encode(question)

    # Retrieve relevant documents
    context_results = vector_store.query(
        query_vector=question_embedding,
        top_k=5
    )

    # Format context from retrieved documents
    context_dict = context_results.to_dict()
    context_text = "\n\n".join([
        f"Title: {title}\nContent: {content}"
        for title, content in zip(context_dict["title"], context_dict["content"])
    ])

    # Generate answer using retrieved context
    prompt = f"""
    Context:
    {context_text}

    Question: {question}

    Answer based on the provided context:
    """

    answer = llm_model.generate(prompt)
    return answer, context_results

# Usage
question = "What are the benefits of feature stores?"
answer, sources = rag_query(question, vector_store, embedding_model, llm_model)
print(f"Answer: {answer}")
print(f"Sources: {len(sources.to_dict()['document_id'])} documents")
```
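
Note that embedding_model and llm_model above are placeholders: any embedding model exposing encode() and any client exposing generate() will work. One way to satisfy those assumptions (a sketch; sentence-transformers is an external library and StubLLM is a stand-in for a real LLM client):

```python
from sentence_transformers import SentenceTransformer

class StubLLM:
    """Stand-in for any LLM client that exposes generate(prompt) -> str."""
    def generate(self, prompt: str) -> str:
        # Replace with a call to your LLM provider of choice
        return "(generated answer based on the retrieved context)"

embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
llm_model = StubLLM()
```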

### Advanced Vector Store Configuration

```python
from feast import Entity, ValueType

# Define document entity
document_entity = Entity(
    name="document_id",
    value_type=ValueType.STRING,
    description="Unique document identifier"
)

# Create vector store with a more comprehensive feature selection
# (assumes author, published_date, and tags are also defined in the feature view schema)
vector_store = FeastVectorStore(
    repo_path="./feast_repo",
    rag_view=document_embeddings_fv,
    features=[
        "document_embeddings:title",
        "document_embeddings:content",
        "document_embeddings:embedding",
        "document_embeddings:category",
        "document_embeddings:author",
        "document_embeddings:published_date",
        "document_embeddings:tags"
    ]
)

# Batch vector search for multiple queries
query_vectors = [
    np.random.rand(384),  # Example embedding dimension; must match the stored embeddings
    np.random.rand(384),
    np.random.rand(384)
]

batch_results = []
for i, query_vec in enumerate(query_vectors):
    result = vector_store.query(
        query_vector=query_vec,
        top_k=3
    )
    batch_results.append(result)
    print(f"Query {i+1}: Found {len(result.to_dict()['document_id'])} results")
```

## Vector Store Backends

The FeastVectorStore works with various vector database backends supported by Feast's online stores:

- **PostgreSQL with pgvector**: Vector similarity search in PostgreSQL
- **Elasticsearch**: Text and vector search capabilities
- **Milvus**: Specialized vector database for high-performance similarity search
- **Other vector-enabled online stores**: As supported by Feast infrastructure

The specific vector search capabilities depend on the configured online store backend and its vector index configuration.