Python SDK for Milvus vector database with comprehensive functionality for connecting to servers, managing collections, and performing vector operations.
npx @tessl/cli install tessl/pypi-pymilvus@2.6.00
# PyMilvus - Python SDK for Milvus Vector Database

PyMilvus is the official Python SDK for Milvus, a cloud-native vector database designed for scalable similarity search and AI applications. It provides comprehensive capabilities for vector and scalar data storage, similarity search, collection management, indexing, and user authentication.

## Package Information

**Installation:**
```bash
pip install pymilvus
```

**Import:**
```python
import pymilvus
from pymilvus import MilvusClient, Collection, DataType
```

**Version:** Available via `pymilvus.__version__`

## Core Imports

### Primary Client Interface
```python
from pymilvus import MilvusClient, AsyncMilvusClient

# Synchronous client for common operations
client = MilvusClient(uri="http://localhost:19530")

# Asynchronous client for high-concurrency applications
async_client = AsyncMilvusClient(uri="http://localhost:19530")
```

### ORM Classes for Advanced Usage
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType
from pymilvus import Index, Partition, Role
from pymilvus import Connections, connections

# Schema definition
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("metadata", DataType.JSON)
])

# Collection with ORM interface
collection = Collection("my_collection", schema)
```

### Search and Results
```python
from pymilvus import SearchResult, Hit, Hits
from pymilvus import AnnSearchRequest, RRFRanker, WeightedRanker

# Hybrid search with reranking
# `client` is a MilvusClient; `vectors1` is a placeholder for a list of query vectors
requests = [AnnSearchRequest(data=vectors1, anns_field="vector1", param={"metric_type": "L2"}, limit=100)]
results = client.hybrid_search("collection", requests, RRFRanker(), limit=10)
```

### Utility Functions
```python
from datetime import datetime

from pymilvus import utility
from pymilvus import create_user, delete_user, list_collections
from pymilvus import mkts_from_datetime, hybridts_to_datetime

# Direct utility access
utility.has_collection("my_collection")
mkts_from_datetime(datetime.now())
```

## Basic Usage

### Simple Collection Creation and Search
```python
from pymilvus import MilvusClient

# Initialize client
client = MilvusClient(uri="http://localhost:19530")

# Create collection with simple parameters
client.create_collection(
    collection_name="quick_setup",
    dimension=128,
    metric_type="COSINE"
)

# Insert data
data = [
    {"id": i, "vector": [0.1] * 128, "text": f"Document {i}"}
    for i in range(1000)
]
client.insert("quick_setup", data)

# Search
results = client.search(
    collection_name="quick_setup",
    data=[[0.1] * 128],  # Query vector
    limit=5,
    output_fields=["text"]
)
```
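
To make the `metric_type="COSINE"` and `limit=5` parameters above more concrete, here is a minimal brute-force sketch in plain Python of what a cosine-similarity top-k search computes. It does not use PyMilvus: `cosine_similarity` and `top_k` are hypothetical helper names introduced for illustration, and a real Milvus server would use an index rather than scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Return (score, id) pairs for the k rows most similar to the query, best first."""
    scored = [(cosine_similarity(query, row["vector"]), row["id"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Tiny 2-dimensional stand-in for the 128-dimensional rows used above
rows = [
    {"id": 0, "vector": [1.0, 0.0]},
    {"id": 1, "vector": [0.0, 1.0]},
    {"id": 2, "vector": [1.0, 1.0]},
]
hits = top_k([1.0, 0.1], rows, k=2)
print(hits)
```

Because cosine similarity ignores vector length, rows that point in the same direction as the query score highest regardless of magnitude.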

### Advanced Schema with Functions
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType, Function, FunctionType

# Define schema with BM25 function
fields = [
    FieldSchema("id", DataType.INT64, is_primary=True),
    # The BM25 input field needs an analyzer so the text can be tokenized
    FieldSchema("text", DataType.VARCHAR, max_length=1000, enable_analyzer=True),
    FieldSchema("dense_vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("sparse_vector", DataType.SPARSE_FLOAT_VECTOR),  # BM25 output
]

functions = [
    Function("bm25_function", FunctionType.BM25,
             input_field_names=["text"],
             output_field_names=["sparse_vector"])
]

schema = CollectionSchema(fields, functions=functions, description="Hybrid search collection")
collection = Collection("hybrid_collection", schema)
```
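
The BM25 function above asks the server to derive sparse vectors from the `text` field. As a rough illustration of the scoring idea only, the sketch below computes Okapi BM25 scores in plain Python. It is not how Milvus implements the function: the whitespace tokenizer, the `k1=1.5` and `b=0.75` defaults, and the `bm25_scores` name are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: longer documents need more matches to score equally
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "milvus is a vector database",
    "bm25 ranks documents by keyword overlap",
    "vector search finds similar embeddings",
]
scores = bm25_scores("vector database", docs)
print(scores)
```

Documents that share no terms with the query score zero, which is why the resulting vectors are sparse.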

## Architecture

PyMilvus provides two complementary API approaches:

### 1. MilvusClient - Simplified Interface
- **Purpose**: Streamlined operations for common use cases
- **Best for**: Quick prototyping, simple applications, beginners
- **Key features**: Auto-generated schemas, simplified method signatures, built-in defaults

```python
# Automatic schema creation
client.create_collection("simple", dimension=128)

# Direct operations
client.insert("simple", [{"id": 1, "vector": [0.1] * 128}])
results = client.search("simple", [[0.1] * 128], limit=5)
```

### 2. ORM Classes - Advanced Interface
- **Purpose**: Full control over collection lifecycle and configuration
- **Best for**: Production applications, complex schemas, fine-tuned operations
- **Key features**: Explicit schema definition, advanced indexing, partition management

```python
# Explicit schema control
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
], enable_dynamic_field=True)

collection = Collection("advanced", schema)
# Build parameters such as nlist go under "params"; the metric is set alongside the index type
collection.create_index(
    "vector",
    {"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 1024}},
)
```

Both interfaces can be used together and share the same underlying connection management.

## Capabilities

### Vector Operations
Comprehensive vector database operations with multiple data types and search capabilities.

```python { .api }
# Multi-vector hybrid search
from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker

client = MilvusClient()

# Define multiple search requests
req1 = AnnSearchRequest(data=dense_vectors, anns_field="dense_vec",
                        param={"metric_type": "L2"}, limit=100)
req2 = AnnSearchRequest(data=sparse_vectors, anns_field="sparse_vec",
                        param={"metric_type": "IP"}, limit=100)

# Hybrid search with RRF reranking
results = client.hybrid_search(
    collection_name="multi_vector_collection",
    reqs=[req1, req2],
    ranker=RRFRanker(k=60),
    limit=10,
    output_fields=["title", "content"]
)
```
**→ See [Search Operations](./search-operations.md) for complete search capabilities**
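
The `RRFRanker(k=60)` step above merges the ranked lists returned by each search request. A minimal sketch of reciprocal rank fusion in plain Python follows, assuming the common formulation in which each list contributes `1 / (k + rank)` with ranks starting at 1. The `rrf_fuse` name is hypothetical, and the server-side ranker may differ in details such as rank offset or tie-breaking.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, limit=10):
    """Fuse several best-first ID lists into one list of (id, score) pairs."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Items near the top of any list receive the largest contribution
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:limit]


# Hypothetical result IDs from a dense and a sparse request
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = rrf_fuse([dense_hits, sparse_hits], k=60, limit=3)
print([doc_id for doc_id, _ in fused])
```

Because only ranks are used, the fusion does not need the L2 and IP scores of the two requests to be on comparable scales.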

### Data Management
Efficient data insertion, updates, and deletion with batch operations and iterators.

```python { .api }
# Batch operations with upsert
from pymilvus import MilvusClient

client = MilvusClient()

# Upsert data (insert or update)
data = [
    {"id": 1, "vector": [0.1] * 128, "metadata": {"category": "A"}},
    {"id": 2, "vector": [0.2] * 128, "metadata": {"category": "B"}},
]
result = client.upsert("my_collection", data)

# Paginated query with iterator
iterator = client.query_iterator(
    collection_name="my_collection",
    filter="metadata['category'] == 'A'",
    output_fields=["id", "metadata"],
    batch_size=1000
)

# Fetch batches until the iterator is exhausted, then release it
while True:
    batch = iterator.next()
    if not batch:
        iterator.close()
        break
    process_batch(batch)  # placeholder for application logic
```
**→ See [Data Management](./data-management.md) for complete CRUD operations**
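
When the rows to insert or upsert do not fit comfortably in one request, they can be split into fixed-size batches first. The helper below is a small plain-Python sketch of that idea. `chunked` is a hypothetical name, and the batch size of 4 is arbitrary and chosen only to keep the example readable.

```python
from itertools import islice


def chunked(rows, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Ten small rows shaped like the upsert payload above (4 dimensions for brevity)
rows = [{"id": i, "vector": [0.1] * 4} for i in range(10)]
batches = list(chunked(rows, batch_size=4))
print([len(batch) for batch in batches])
```

Each batch could then be passed to a call such as `client.upsert("my_collection", batch)`.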

### Schema and Collections
Flexible schema definition with support for dynamic fields, functions, and partitioning.

```python { .api }
# Advanced schema with clustering and partitioning
from pymilvus import CollectionSchema, FieldSchema, DataType, Function, FunctionType

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("category", DataType.VARCHAR, max_length=100, is_partition_key=True),
    FieldSchema("timestamp", DataType.INT64, is_clustering_key=True),
    FieldSchema("content", DataType.VARCHAR, max_length=2000),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("sparse_embedding", DataType.SPARSE_FLOAT_VECTOR),
], enable_dynamic_field=True, description="Production collection with advanced features")

# Add text embedding function
functions = [
    Function("text_embed", FunctionType.TEXTEMBEDDING,
             input_field_names=["content"],
             output_field_names=["embedding"],
             params={"model_name": "sentence-transformers/all-MiniLM-L6-v2"})
]

# Attach each function to the schema
for function in functions:
    schema.add_function(function)
```
**→ See [ORM Collection](./orm-collection.md) for complete schema management**

### Index Management
Advanced indexing strategies for optimal search performance across different vector types.

```python { .api }
# Multi-index creation with performance tuning
from pymilvus import Collection

collection = Collection("optimized_collection")

# Vector index with custom parameters
collection.create_index(
    field_name="dense_vector",
    index_params={
        "index_type": "IVF_PQ",
        "metric_type": "L2",
        "params": {
            "nlist": 2048,
            "m": 16,
            "nbits": 8
        }
    }
)

# Scalar index for filtering
collection.create_index(
    field_name="category",
    index_params={"index_type": "Trie"}
)

# Load collection with custom replica and resource group
collection.load(replica_number=2, _resource_groups=["rg1", "rg2"])
```
**→ See [Index Management](./index-management.md) for complete indexing strategies**
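
To see what the `m` and `nbits` parameters of `IVF_PQ` trade off, the arithmetic below estimates the size of one product-quantized code against the raw vector. It assumes a 128-dimensional float32 field (the dimension used in earlier examples, since the index example does not state one) and ignores codebooks and other index overhead. The function names are hypothetical.

```python
def pq_code_bytes(dim, m, nbits):
    """Bytes needed to store one vector as m sub-quantizer codes of nbits each."""
    if dim % m != 0:
        raise ValueError("dim must be divisible by m")
    return (m * nbits + 7) // 8  # round up to whole bytes


def raw_vector_bytes(dim, bytes_per_component=4):
    """Bytes needed to store one uncompressed vector (float32 by default)."""
    return dim * bytes_per_component


dim, m, nbits = 128, 16, 8  # dim is an assumption; m and nbits come from the example
code = pq_code_bytes(dim, m, nbits)
raw = raw_vector_bytes(dim)
print(f"sub-vector length: {dim // m}")
print(f"PQ code: {code} bytes, raw float32: {raw} bytes, ratio: {raw // code}x")
```

A larger `m` keeps more detail per vector at the cost of a longer code, so recall and memory move in opposite directions.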

### User Management
Comprehensive authentication, authorization, and resource management.

```python { .api }
# Role-based access control
from pymilvus import MilvusClient

client = MilvusClient()

# Create role with specific privileges
client.create_role("data_analyst")
client.grant_privilege(
    role_name="data_analyst",
    object_type="Collection",
    privilege="Search",
    object_name="public_data"
)

# Create user and assign role
client.create_user("analyst1", "secure_password")
client.grant_role("analyst1", "data_analyst")

# Privilege group management
client.create_privilege_group("read_only_group")
client.add_privileges_to_group("read_only_group", ["Query", "Search"])
```
**→ See [User Management](./user-management.md) for complete access control**

### Utility Functions
Helper functions for timestamps, progress monitoring, and maintenance operations.

```python { .api }
# Timestamp utilities and progress monitoring
from pymilvus import utility, mkts_from_datetime, hybridts_to_datetime
from datetime import datetime

# Create travel timestamp for point-in-time queries
travel_time = mkts_from_datetime(datetime(2024, 1, 1, 12, 0, 0))

# Monitor operations (the returned value is already formatted as a percentage string)
progress = utility.loading_progress("my_collection")
print(f"Loading progress: {progress['loading_progress']}")

# Wait for operations to complete
utility.wait_for_loading_complete("my_collection", timeout=300)

# Resource group management
utility.create_resource_group("gpu_group", config={"requests": {"node_num": 2}})
utility.transfer_node("cpu_group", "gpu_group", 1)
```
**→ See [Utility Functions](./utility-functions.md) for complete utility reference**
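
The timestamp helpers above work with hybrid timestamps, which pack a physical time and a logical counter into one integer. The sketch below shows that packing in plain Python, assuming the layout of millisecond physical time in the upper bits and an 18-bit logical counter in the lower bits. `make_hybrid_ts` and `split_hybrid_ts` are hypothetical names, not PyMilvus functions, and the library's helpers may differ in time zone handling and offsets.

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # assumed width of the logical counter
LOGICAL_MASK = (1 << LOGICAL_BITS) - 1


def make_hybrid_ts(moment, logical=0):
    """Pack a timezone-aware datetime and a logical counter into one integer."""
    physical_ms = int(moment.timestamp() * 1000)
    return (physical_ms << LOGICAL_BITS) | (logical & LOGICAL_MASK)


def split_hybrid_ts(hybrid_ts):
    """Unpack a hybrid timestamp into (UTC datetime, logical counter)."""
    physical_ms = hybrid_ts >> LOGICAL_BITS
    logical = hybrid_ts & LOGICAL_MASK
    moment = datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)
    return moment, logical


moment = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
ts = make_hybrid_ts(moment, logical=5)
decoded, logical = split_hybrid_ts(ts)
print(ts, decoded.isoformat(), logical)
```

Keeping the physical time in the high bits means hybrid timestamps sort in time order, with the logical counter breaking ties inside the same millisecond.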

### Async Operations
Non-blocking operations for high-concurrency applications with full async/await support.

```python { .api }
# Concurrent operations with AsyncMilvusClient
from pymilvus import AsyncMilvusClient
import asyncio

async def concurrent_searches():
    client = AsyncMilvusClient()

    # Concurrent search operations
    tasks = []
    for i in range(10):
        task = client.search(
            collection_name="large_collection",
            data=[[0.1] * 128],
            limit=100,
            output_fields=["metadata"]
        )
        tasks.append(task)

    # Wait for all searches to complete
    results = await asyncio.gather(*tasks)
    await client.close()
    return results

# Run concurrent operations
results = asyncio.run(concurrent_searches())
```
**→ See [MilvusClient](./milvus-client.md) for complete async capabilities**
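
Launching every search at once, as above, is fine for ten requests, but larger fan-outs are often capped. The sketch below shows one way to bound concurrency with `asyncio.Semaphore`. It uses a stand-in coroutine, `fake_search`, instead of a real client so that it runs without a server. `bounded_gather` and the limit of 3 are assumptions for illustration.

```python
import asyncio


async def fake_search(query_id, delay=0.01):
    """Stand-in for an awaitable search call; returns a labelled empty result."""
    await asyncio.sleep(delay)
    return {"query_id": query_id, "hits": []}


async def bounded_gather(query_ids, max_in_flight=3):
    """Run one search per query ID, with at most max_in_flight running at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def run_one(query_id):
        async with semaphore:
            return await fake_search(query_id)

    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*(run_one(q) for q in query_ids))


results = asyncio.run(bounded_gather(range(10)))
print([result["query_id"] for result in results])
```

In a real application, `fake_search` would be replaced by the client's awaitable search call, with the semaphore limit tuned to what the server can handle.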

### Types and Enums
Comprehensive type system with enums for data types, index types, and configuration options.

```python { .api }
# Type system and enums
from pymilvus import DataType, IndexType, FunctionType, ConsistencyLevel

# Vector data types
vector_types = [
    DataType.FLOAT_VECTOR,        # Standard dense vectors
    DataType.BINARY_VECTOR,       # Binary vectors for efficiency
    DataType.FLOAT16_VECTOR,      # Half-precision vectors
    DataType.BFLOAT16_VECTOR,     # BFloat16 vectors
    DataType.SPARSE_FLOAT_VECTOR  # Sparse vectors for text search
]

# Index algorithms
index_types = [
    IndexType.FLAT,      # Exact search
    IndexType.IVF_FLAT,  # Inverted file
    IndexType.HNSW,      # Hierarchical navigable small world
    IndexType.IVF_PQ     # Product quantization
]

# Consistency levels
levels = [
    ConsistencyLevel.Strong,      # Strong consistency
    ConsistencyLevel.Eventually,  # Eventual consistency
    ConsistencyLevel.Bounded,     # Bounded staleness
    ConsistencyLevel.Session      # Session consistency
]
```
**→ See [Types and Enums](./types-enums.md) for complete type reference**
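
The dense vector types listed above differ mainly in how many bits each component occupies. The plain-Python sketch below estimates the raw storage per vector for each dense type. The type names are used as plain strings rather than the PyMilvus enum, `vector_bytes` is a hypothetical helper, and sparse vectors are left out because their size depends on the number of non-zero entries.

```python
BITS_PER_COMPONENT = {
    "FLOAT_VECTOR": 32,     # one float32 per dimension
    "FLOAT16_VECTOR": 16,   # one half-precision float per dimension
    "BFLOAT16_VECTOR": 16,  # one bfloat16 per dimension
    "BINARY_VECTOR": 1,     # one bit per dimension, packed 8 to a byte
}


def vector_bytes(type_name, dim):
    """Raw bytes needed to store one dense vector of the given type and dimension."""
    if type_name not in BITS_PER_COMPONENT:
        raise ValueError(f"unsupported vector type: {type_name}")
    total_bits = BITS_PER_COMPONENT[type_name] * dim
    if total_bits % 8 != 0:
        raise ValueError("dimension must pack into whole bytes")
    return total_bits // 8


sizes = {name: vector_bytes(name, 128) for name in BITS_PER_COMPONENT}
for name, size in sizes.items():
    print(f"{name}: {size} bytes per 128-dimensional vector")
```

These figures cover raw vector data only; index structures add their own overhead on top.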
396
397
## Sub-Documentation
398
399
- **[MilvusClient](./milvus-client.md)** - MilvusClient and AsyncMilvusClient APIs for simplified operations
400
- **[ORM Collection](./orm-collection.md)** - Collection, Schema, and Field classes for advanced control
401
- **[Search Operations](./search-operations.md)** - Search, query, and result handling with hybrid search
402
- **[Data Management](./data-management.md)** - Insert, upsert, delete operations and data iteration
403
- **[Index Management](./index-management.md)** - Index creation, optimization, and performance tuning
404
- **[User Management](./user-management.md)** - Authentication, roles, privileges, and resource groups
405
- **[Utility Functions](./utility-functions.md)** - Helper functions, timestamps, and maintenance operations
406
- **[Types and Enums](./types-enums.md)** - Data types, enums, constants, and type definitions
407
408
---
409
410
*This documentation covers all 136+ public API components in PyMilvus, enabling comprehensive vector database operations without accessing source code.*