# Vector Store Testing

A specialized test suite for vector store implementations. It covers CRUD operations, similarity search, async support, and bulk operations, including document storage, retrieval, deletion, and metadata handling.

## Capabilities

### Vector Store Integration Tests

Integration tests for vector stores, with 30+ test methods covering vector store operations.

```python { .api }
from langchain_tests.integration_tests import VectorStoreIntegrationTests

class VectorStoreIntegrationTests(BaseStandardTests):
    """Integration tests for vector stores with comprehensive CRUD operations."""

    # Configuration properties
    @property
    def has_sync(self) -> bool:
        """Whether the vector store supports synchronous operations. Default: True."""

    @property
    def has_async(self) -> bool:
        """Whether the vector store supports asynchronous operations. Default: False."""

    # Utility methods
    def get_embeddings(self):
        """Returns deterministic fake embeddings for consistent testing."""

    # Basic state tests
    def test_vectorstore_is_empty(self) -> None:
        """Verify that the vector store starts empty."""

    def test_vectorstore_still_empty(self) -> None:
        """Verify that the vector store is properly cleaned up after tests."""

    # Document addition tests
    def test_add_documents(self) -> None:
        """Test adding documents to the vector store."""

    def test_add_documents_with_ids_is_idempotent(self) -> None:
        """Test that adding documents with same IDs is idempotent."""

    def test_add_documents_by_id_with_mutation(self) -> None:
        """Test adding documents with ID-based mutations."""

    # Document deletion tests
    def test_deleting_documents(self) -> None:
        """Test deleting individual documents from the vector store."""

    def test_deleting_bulk_documents(self) -> None:
        """Test bulk deletion of multiple documents."""

    def test_delete_missing_content(self) -> None:
        """Test deletion behavior when content doesn't exist."""

    # Document retrieval tests
    def test_get_by_ids(self) -> None:
        """Test retrieving documents by their IDs."""

    def test_get_by_ids_missing(self) -> None:
        """Test behavior when retrieving non-existent document IDs."""

    # Similarity search tests
    def test_similarity_search(self) -> None:
        """Test similarity search functionality."""

    def test_similarity_search_with_score(self) -> None:
        """Test similarity search with relevance scores."""

    def test_similarity_search_with_score_threshold(self) -> None:
        """Test similarity search with score threshold filtering."""

    def test_similarity_search_by_vector(self) -> None:
        """Test similarity search using vector embeddings directly."""

    def test_similarity_search_by_vector_with_score(self) -> None:
        """Test vector-based similarity search with scores."""

    # Metadata filtering tests
    def test_similarity_search_with_filter(self) -> None:
        """Test similarity search with metadata filtering."""

    def test_similarity_search_with_complex_filter(self) -> None:
        """Test similarity search with complex metadata filters."""

    # Async operation tests (if has_async=True)
    def test_aadd_documents(self) -> None:
        """Test asynchronous document addition."""

    def test_adelete_documents(self) -> None:
        """Test asynchronous document deletion."""

    def test_aget_by_ids(self) -> None:
        """Test asynchronous document retrieval by IDs."""

    def test_asimilarity_search(self) -> None:
        """Test asynchronous similarity search."""

    def test_asimilarity_search_with_score(self) -> None:
        """Test asynchronous similarity search with scores."""

    # Max marginal relevance tests
    def test_max_marginal_relevance_search(self) -> None:
        """Test max marginal relevance search for diverse results."""

    def test_max_marginal_relevance_search_by_vector(self) -> None:
        """Test MMR search using vector embeddings directly."""

    # Async MMR tests (if has_async=True)
    def test_amax_marginal_relevance_search(self) -> None:
        """Test asynchronous max marginal relevance search."""

    def test_amax_marginal_relevance_search_by_vector(self) -> None:
        """Test async MMR search using vector embeddings."""
```

#### Usage Example

```python
import pytest
from langchain_tests.integration_tests import VectorStoreIntegrationTests
from my_integration import MyVectorStore

class TestMyVectorStore(VectorStoreIntegrationTests):
    @pytest.fixture
    def vectorstore(self):
        # Create a fresh vector store instance for each test
        store = MyVectorStore(
            connection_url="postgresql://user:pass@localhost/testdb",
            collection_name="test_collection"
        )
        yield store
        # Cleanup after test
        store.delete_collection()

    @property
    def has_sync(self):
        return True  # Your vector store supports sync operations

    @property
    def has_async(self):
        return True  # Your vector store also supports async operations
```
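
To make the ID-based tests listed above more concrete, here is a minimal sketch of the upsert behavior that `test_add_documents_with_ids_is_idempotent` and `test_add_documents_by_id_with_mutation` are described as checking. `Doc` and `InMemoryStore` are hypothetical stand-ins written for this illustration; they are not part of `langchain_tests`, and a real vector store may behave differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class InMemoryStore:
    """Toy store keyed by ID; illustrative only, not a real vector store."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def add_documents(self, docs: List[Doc], ids: List[str]) -> List[str]:
        # Writing under an existing ID overwrites the entry (upsert).
        for doc_id, doc in zip(ids, docs):
            self._docs[doc_id] = doc
        return ids

    def get_by_ids(self, ids: List[str]) -> List[Doc]:
        # Missing IDs are skipped rather than raising.
        return [self._docs[i] for i in ids if i in self._docs]

    def __len__(self) -> int:
        return len(self._docs)


store = InMemoryStore()
docs = [Doc("foo"), Doc("bar")]

# Adding the same documents twice under the same IDs leaves two entries.
store.add_documents(docs, ids=["1", "2"])
store.add_documents(docs, ids=["1", "2"])
idempotent_count = len(store)

# Re-adding ID "1" with new content replaces the old document.
store.add_documents([Doc("foo, updated")], ids=["1"])
mutated = store.get_by_ids(["1"])[0].page_content
```

A store that appended a duplicate on the second `add_documents` call would end up with four entries here instead of two, which is the kind of regression an idempotency test is meant to catch.
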

## Test Embeddings Utility

The test class provides a deterministic embeddings utility, so the same input text always maps to the same vector:

```python { .api }
def get_embeddings(self):
    """
    Returns deterministic fake embeddings for consistent testing.

    Returns:
        FakeEmbeddings: Embeddings instance that generates consistent
            vectors for the same input text
    """
```

### FakeEmbeddings Implementation

```python { .api }
from typing import List

class FakeEmbeddings:
    """Deterministic embeddings for testing purposes."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Generate embeddings for a list of documents."""

    def embed_query(self, text: str) -> List[float]:
        """Generate embedding for a single query."""

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        """Async version of embed_documents."""

    async def aembed_query(self, text: str) -> List[float]:
        """Async version of embed_query."""
```

## Test Constants

```python { .api }
EMBEDDING_SIZE = 6  # Standard embedding dimension for vector store tests
```
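
As an illustration of what "deterministic" means for test embeddings, the sketch below derives a vector purely from the input text. It is one possible approach under stated assumptions, not the framework's actual implementation: `fake_embed` is a hypothetical helper, and seeding a random generator from a SHA-256 digest is a choice made for this example.

```python
import hashlib
import random
from typing import List

EMBEDDING_SIZE = 6  # mirrors the constant documented in this section


def fake_embed(text: str, size: int = EMBEDDING_SIZE) -> List[float]:
    """Return a pseudo-random vector that depends only on `text`."""
    # Derive a stable seed from the text. Python's built-in hash() is
    # salted per process, so a cryptographic digest is used instead.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


first = fake_embed("hello")
second = fake_embed("hello")
other = fake_embed("goodbye")
```

Because `first` and `second` are identical across runs and processes, assertions about search results do not depend on a live embedding model.
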

## Document Fixtures

The framework provides standard document fixtures for consistent testing:

```python { .api }
def get_test_documents():
    """
    Returns a list of test documents with metadata.

    Returns:
        List[Document]: Standard test documents with varied content and metadata
    """
```

### Document Structure

```python { .api }
from langchain_core.documents import Document

# Example test documents
documents = [
    Document(
        page_content="This is a test document about machine learning.",
        metadata={"category": "AI", "difficulty": "beginner"}
    ),
    Document(
        page_content="Advanced neural network architectures and training.",
        metadata={"category": "AI", "difficulty": "advanced"}
    ),
    Document(
        page_content="Introduction to vector databases and similarity search.",
        metadata={"category": "databases", "difficulty": "intermediate"}
    )
]
```

## Async Testing Patterns

For vector stores that support async operations, override `has_async` to return `True` so that the async tests run.

### Async Test Example

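```python
import pytest
from langchain_tests.integration_tests import VectorStoreIntegrationTests

# MyAsyncVectorStore is a placeholder for your own implementation.
class TestAsyncVectorStore(VectorStoreIntegrationTests):
    @property
    def has_async(self):
        return True

    # Async fixtures require the pytest-asyncio plugin. In its strict mode,
    # decorate with @pytest_asyncio.fixture instead of @pytest.fixture.
    @pytest.fixture
    async def vectorstore(self):
        store = await MyAsyncVectorStore.create(
            connection_string="async://localhost/testdb"
        )
        yield store
        # Release the connection after each test
        await store.close()
```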
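
The sketch below illustrates the `a`-prefixed naming convention that the async tests exercise, using a toy store whose async methods delegate to their sync counterparts. `ToyAsyncStore` and its methods are hypothetical names for this illustration; wrapping blocking calls with `asyncio.to_thread` (Python 3.9+) is one common pattern, not necessarily how any particular vector store implements its async API.

```python
import asyncio
from typing import Dict, List


class ToyAsyncStore:
    """Hypothetical store exposing sync methods and `a`-prefixed async twins."""

    def __init__(self) -> None:
        self._texts: Dict[str, str] = {}

    def add_texts(self, texts: List[str], ids: List[str]) -> List[str]:
        self._texts.update(zip(ids, texts))
        return ids

    def delete(self, ids: List[str]) -> None:
        for doc_id in ids:
            self._texts.pop(doc_id, None)

    async def aadd_texts(self, texts: List[str], ids: List[str]) -> List[str]:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.add_texts, texts, ids)

    async def adelete(self, ids: List[str]) -> None:
        await asyncio.to_thread(self.delete, ids)


async def exercise_store() -> List[str]:
    """Add two texts, delete one, and return the IDs that remain."""
    store = ToyAsyncStore()
    await store.aadd_texts(["foo", "bar"], ids=["1", "2"])
    await store.adelete(["1"])
    return sorted(store._texts)


remaining_ids = asyncio.run(exercise_store())
```
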

## Error Handling Tests

The framework validates proper error handling for common vector store failures:

```python
def test_add_documents_invalid_embedding_dimension(self):
    """Test handling of invalid embedding dimensions."""

def test_similarity_search_invalid_query(self):
    """Test handling of invalid query parameters."""

def test_delete_nonexistent_documents(self):
    """Test deletion of documents that don't exist."""
```
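
As a rough illustration of the first case above, a store might reject vectors whose length differs from its configured dimension. The helper names and the choice of `ValueError` below are assumptions made for this sketch; real vector stores raise their own exception types.

```python
from typing import List

EXPECTED_DIM = 6  # assumed to match the store's configured embedding size


def validate_embedding(vector: List[float], expected_dim: int = EXPECTED_DIM) -> None:
    """Raise ValueError if the vector does not have the expected length."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"Embedding has dimension {len(vector)}, expected {expected_dim}"
        )


def raises_value_error(vector: List[float]) -> bool:
    """Helper mirroring what `pytest.raises(ValueError)` would assert."""
    try:
        validate_embedding(vector)
    except ValueError:
        return True
    return False


ok_rejected = raises_value_error([0.1] * 6)   # correct dimension
bad_rejected = raises_value_error([0.1] * 3)  # wrong dimension
```

In a pytest suite, the same check would usually be written as `with pytest.raises(ValueError): ...` around the call that adds the malformed vector.
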

## Metadata Filtering

The suite tests metadata-based filtering with the filter shapes shown below.

### Filter Types

```python { .api }
# Simple equality filter
filter_dict = {"category": "AI"}

# Complex filter with multiple conditions
complex_filter = {
    "category": {"$in": ["AI", "databases"]},
    "difficulty": {"$ne": "beginner"}
}

# Range filter for numeric metadata
range_filter = {
    "score": {"$gte": 0.8, "$lte": 1.0}
}
```
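
To show how filters of these shapes could be interpreted, here is a minimal evaluator that checks a metadata dictionary against a filter. It assumes MongoDB-style operator semantics and an implicit AND across keys; `matches` and `OPERATORS` are hypothetical names, and actual vector stores differ in which operators they support.

```python
from typing import Any, Callable, Dict

# Operator names follow the MongoDB-style convention used in the filters above.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$in": lambda value, arg: value in arg,
    "$gte": lambda value, arg: value is not None and value >= arg,
    "$lte": lambda value, arg: value is not None and value <= arg,
}


def matches(metadata: Dict[str, Any], filter_dict: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies every condition (implicit AND)."""
    for key, condition in filter_dict.items():
        value = metadata.get(key)
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 0.8, "$lte": 1.0}
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # A bare value means equality.
            return False
    return True


beginner = {"category": "AI", "difficulty": "beginner", "score": 0.9}
advanced = {"category": "AI", "difficulty": "advanced", "score": 0.5}
complex_filter = {
    "category": {"$in": ["AI", "databases"]},
    "difficulty": {"$ne": "beginner"},
}
range_filter = {"score": {"$gte": 0.8, "$lte": 1.0}}
```

With these inputs, `beginner` passes the range filter but fails `complex_filter`, while `advanced` passes `complex_filter` but fails the range filter.
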

## Performance Considerations

When running the vector store tests, keep the following performance aspects in mind:

- **Bulk Operations**: Test performance with large document batches
- **Search Performance**: Benchmark similarity search with various k values
- **Memory Usage**: Monitor memory consumption during large operations
- **Connection Management**: Test connection pooling and cleanup

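
For the bulk-operations point above, one simple way to get a timing is to wrap a single batch insert with `time.perf_counter`. This is only a sketch: `time_bulk_add` is a hypothetical helper, and the dictionary-backed `add_texts` stands in for a real vector store client.

```python
import time
from typing import Callable, Dict, List


def time_bulk_add(
    add_fn: Callable[[List[str], List[str]], None], n_docs: int
) -> float:
    """Return wall-clock seconds taken to add `n_docs` texts in one call."""
    texts = [f"document {i}" for i in range(n_docs)]
    ids = [str(i) for i in range(n_docs)]
    start = time.perf_counter()
    add_fn(texts, ids)
    return time.perf_counter() - start


# Hypothetical in-memory backend standing in for a real vector store.
backend: Dict[str, str] = {}


def add_texts(texts: List[str], ids: List[str]) -> None:
    backend.update(zip(ids, texts))


elapsed = time_bulk_add(add_texts, n_docs=10_000)
```

Against a real store, repeating the measurement for several batch sizes gives a rough picture of how insert time scales.
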

## Cleanup and Isolation

Give each test its own collection and tear it down afterwards so that tests stay isolated:

```python
import uuid

import pytest

# MyVectorStore is a placeholder for your own implementation.
@pytest.fixture
def vectorstore(self):
    """Vector store fixture with proper cleanup."""
    # A unique collection name keeps concurrent test runs from colliding
    store = MyVectorStore(collection_name=f"test_{uuid.uuid4()}")
    yield store
    # Ensure complete cleanup
    store.delete_collection()
    store.close()
```

## Collection Management

For vector stores that support multiple collections:

```python { .api }
def test_create_collection(self) -> None:
    """Test collection creation."""

def test_delete_collection(self) -> None:
    """Test collection deletion."""

def test_list_collections(self) -> None:
    """Test listing available collections."""
```

## Indexing and Optimization

Tests for vector store optimization features:

```python { .api }
def test_create_index(self) -> None:
    """Test index creation for performance optimization."""

def test_optimize_collection(self) -> None:
    """Test collection optimization operations."""
```

Together, these tests check that a vector store implementation handles document storage, similarity search, metadata filtering, and async operations correctly while maintaining data consistency and acceptable performance.