Tessl Tile for pypi/pymilvus@2.6.0

# PyMilvus - Python SDK for Milvus Vector Database

PyMilvus is the official Python SDK for Milvus, a cloud-native vector database designed for scalable similarity search and AI applications. It provides comprehensive capabilities for vector and scalar data storage, similarity search, collection management, indexing, and user authentication.

## Package Information

**Installation:**
```bash
pip install pymilvus
```

**Import:**
```python
import pymilvus
from pymilvus import MilvusClient, Collection, DataType
```

**Version:** Available via `pymilvus.__version__`

## Core Imports

### Primary Client Interface
```python
from pymilvus import MilvusClient, AsyncMilvusClient

# Synchronous client for common operations
client = MilvusClient(uri="http://localhost:19530")

# Asynchronous client for high-concurrency applications
async_client = AsyncMilvusClient(uri="http://localhost:19530")
```

### ORM Classes for Advanced Usage
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType
from pymilvus import Index, Partition, Role
from pymilvus import Connections, connections

# The ORM classes use a named connection, so connect before creating a Collection
connections.connect(alias="default", uri="http://localhost:19530")

# Schema definition
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("metadata", DataType.JSON)
])

# Collection with ORM interface
collection = Collection("my_collection", schema)
```

### Search and Results
```python
from pymilvus import SearchResult, Hit, Hits
from pymilvus import AnnSearchRequest, RRFRanker, WeightedRanker

# Hybrid search with reranking
# `client` is a connected MilvusClient; `vectors1` is a list of query vectors
requests = [AnnSearchRequest(data=vectors1, anns_field="vector1", param={"metric_type": "L2"}, limit=100)]
results = client.hybrid_search("collection", requests, RRFRanker(), limit=10)
```

### Utility Functions
```python
from datetime import datetime

from pymilvus import utility
from pymilvus import create_user, delete_user, list_collections
from pymilvus import mkts_from_datetime, hybridts_to_datetime

# Direct utility access (server-side calls such as has_collection need an ORM connection)
utility.has_collection("my_collection")
mkts_from_datetime(datetime.now())
```

## Basic Usage

### Simple Collection Creation and Search
```python
from pymilvus import MilvusClient

# Initialize client
client = MilvusClient(uri="http://localhost:19530")

# Create collection with simple parameters
client.create_collection(
    collection_name="quick_setup",
    dimension=128,
    metric_type="COSINE"
)

# Insert data
data = [
    {"id": i, "vector": [0.1] * 128, "text": f"Document {i}"}
    for i in range(1000)
]
client.insert("quick_setup", data)

# Search
results = client.search(
    collection_name="quick_setup",
    data=[[0.1] * 128],  # Query vector
    limit=5,
    output_fields=["text"]
)
```
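
The `metric_type="COSINE"` setting above ranks hits by cosine similarity. To make that score concrete, here is a minimal pure-Python sketch of the quantity being compared; the `cosine_similarity` helper is hypothetical and is not how Milvus computes it internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

query = [0.1] * 128           # same shape as the query vector above
same_direction = [0.5] * 128  # different magnitude, same direction
print(cosine_similarity(query, same_direction))
```

Because the score depends only on direction, the 1000 identical vectors inserted above would all tie for the query shown; real embeddings give distinct scores.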

### Advanced Schema with Functions
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType, Function, FunctionType

# Define schema with BM25 function
fields = [
    FieldSchema("id", DataType.INT64, is_primary=True),
    # BM25 input text must be analyzed, so the analyzer is enabled on this field
    FieldSchema("text", DataType.VARCHAR, max_length=1000, enable_analyzer=True),
    FieldSchema("dense_vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("sparse_vector", DataType.SPARSE_FLOAT_VECTOR),  # BM25 output
]

functions = [
    Function("bm25_function", FunctionType.BM25,
            input_field_names=["text"],
            output_field_names=["sparse_vector"])
]

schema = CollectionSchema(fields, functions=functions, description="Hybrid search collection")
# Requires an ORM connection established with connections.connect(...)
collection = Collection("hybrid_collection", schema)
```

## Architecture

PyMilvus provides two complementary API approaches:

### 1. MilvusClient - Simplified Interface
- **Purpose**: Streamlined operations for common use cases
- **Best for**: Quick prototyping, simple applications, beginners
- **Key features**: Auto-generated schemas, simplified method signatures, built-in defaults

```python
# Automatic schema creation
client.create_collection("simple", dimension=128)

# Direct operations
client.insert("simple", [{"id": 1, "vector": [0.1] * 128}])
results = client.search("simple", [[0.1] * 128], limit=5)
```

### 2. ORM Classes - Advanced Interface
- **Purpose**: Full control over collection lifecycle and configuration
- **Best for**: Production applications, complex schemas, fine-tuned operations
- **Key features**: Explicit schema definition, advanced indexing, partition management

```python
# Explicit schema control
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
], enable_dynamic_field=True)

collection = Collection("advanced", schema)
# Index-specific settings such as nlist go under "params"; a metric type is also required
collection.create_index(
    "vector",
    {"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 1024}},
)
```

Both interfaces can be used together and share the same underlying connection management.

## Capabilities

### Vector Operations
Comprehensive vector database operations with multiple data types and search capabilities.

```python { .api }
# Multi-vector hybrid search
from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker

client = MilvusClient()

# Define multiple search requests
# (dense_vectors and sparse_vectors are placeholders for your query vectors)
req1 = AnnSearchRequest(data=dense_vectors, anns_field="dense_vec",
                       param={"metric_type": "L2"}, limit=100)
req2 = AnnSearchRequest(data=sparse_vectors, anns_field="sparse_vec",
                       param={"metric_type": "IP"}, limit=100)

# Hybrid search with RRF reranking
results = client.hybrid_search(
    collection_name="multi_vector_collection",
    reqs=[req1, req2],
    ranker=RRFRanker(k=60),
    limit=10,
    output_fields=["title", "content"]
)
```
**→ See [Search Operations](./search-operations.md) for complete search capabilities**
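
To show what reranking with `RRFRanker(k=60)` does conceptually, the following pure-Python sketch fuses two ranked ID lists with reciprocal rank fusion. The `rrf_fuse` helper and the sample IDs are hypothetical, and ranks are assumed to start at 1; the server-side implementation may differ in details such as tie-breaking.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60, limit=10):
    """Fuse several ranked ID lists with reciprocal rank fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1 (a convention assumed here).
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:limit]

dense_hits = [3, 1, 2]   # IDs as ranked by the dense request
sparse_hits = [1, 4, 3]  # IDs as ranked by the sparse request
fused = rrf_fuse([dense_hits, sparse_hits])
print([doc_id for doc_id, _ in fused])
```

An ID that appears near the top of both lists beats one that tops a single list, which is the behaviour hybrid search relies on. A larger `k` flattens the difference between adjacent ranks.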

### Data Management
Efficient data insertion, updates, and deletion with batch operations and iterators.

```python { .api }
# Batch operations with upsert
from pymilvus import MilvusClient

client = MilvusClient()

# Upsert data (insert or update)
data = [
    {"id": 1, "vector": [0.1] * 128, "metadata": {"category": "A"}},
    {"id": 2, "vector": [0.2] * 128, "metadata": {"category": "B"}},
]
result = client.upsert("my_collection", data)

# Paginated query with iterator (MilvusClient takes `filter`, not `expr`)
iterator = client.query_iterator(
    collection_name="my_collection",
    filter="metadata['category'] == 'A'",
    output_fields=["id", "metadata"],
    batch_size=1000
)

# Fetch pages until the iterator returns an empty batch, then release it
while True:
    batch = iterator.next()
    if not batch:
        iterator.close()
        break
    process_batch(batch)  # process_batch is a user-supplied function
```
**→ See [Data Management](./data-management.md) for complete CRUD operations**
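
For large datasets, rows are usually written in fixed-size batches rather than in one call. The sketch below shows one way to split any iterable of rows into batches; the `chunked` helper and the batch size of 1000 are illustrative assumptions, not part of PyMilvus.

```python
from itertools import islice

def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# A generator keeps memory flat: only one batch is materialized at a time.
rows = ({"id": i, "vector": [0.1] * 4} for i in range(2500))
batch_sizes = [len(batch) for batch in chunked(rows, 1000)]
print(batch_sizes)
```

Each yielded batch could then be passed to a call such as `client.upsert("my_collection", batch)`.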

### Schema and Collections
Flexible schema definition with support for dynamic fields, functions, and partitioning.

```python { .api }
# Advanced schema with clustering and partitioning
from pymilvus import CollectionSchema, FieldSchema, DataType, Function, FunctionType

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("category", DataType.VARCHAR, max_length=100, is_partition_key=True),
    FieldSchema("timestamp", DataType.INT64, is_clustering_key=True),
    FieldSchema("content", DataType.VARCHAR, max_length=2000),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("sparse_embedding", DataType.SPARSE_FLOAT_VECTOR),
], enable_dynamic_field=True, description="Production collection with advanced features")

# Add text embedding function
functions = [
    Function("text_embed", FunctionType.TEXTEMBEDDING,
            input_field_names=["content"],
            output_field_names=["embedding"],
            params={"model_name": "sentence-transformers/all-MiniLM-L6-v2"})
]

# Attach each function through add_function rather than assigning to schema.functions
for function in functions:
    schema.add_function(function)
```
**→ See [ORM Collection](./orm-collection.md) for complete schema management**

### Index Management
Advanced indexing strategies for optimal search performance across different vector types.

```python { .api }
# Multi-index creation with performance tuning
from pymilvus import Collection

collection = Collection("optimized_collection")

# Vector index with custom parameters
collection.create_index(
    field_name="dense_vector",
    index_params={
        "index_type": "IVF_PQ",
        "metric_type": "L2",
        "params": {
            "nlist": 2048,
            "m": 16,
            "nbits": 8
        }
    }
)

# Scalar index for filtering (the VARCHAR prefix-tree index type is spelled "Trie")
collection.create_index(
    field_name="category",
    index_params={"index_type": "Trie"}
)

# Load collection with custom replica and resource group
collection.load(replica_number=2, _resource_groups=["rg1", "rg2"])
```
**→ See [Index Management](./index-management.md) for complete indexing strategies**
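
The `m` and `nbits` values in the `IVF_PQ` example determine how strongly each vector is compressed: product quantization splits a vector into `m` sub-vectors and stores an `nbits`-bit code for each, so `m` has to divide the dimension evenly. The sketch below estimates the per-vector code size under that model. The `pq_code_size_bytes` helper and the dimension of 128 are assumptions for illustration, and the estimate ignores index overhead such as centroids and IDs.

```python
def pq_code_size_bytes(dim, m, nbits=8):
    """Bytes needed to store one product-quantized vector code."""
    if dim % m != 0:
        raise ValueError(f"dim ({dim}) must be divisible by m ({m})")
    # m codes of nbits each, rounded up to whole bytes.
    return (m * nbits + 7) // 8

dim = 128                     # assumed dimension of dense_vector
raw_bytes = dim * 4           # float32 storage per vector
code_bytes = pq_code_size_bytes(dim, m=16, nbits=8)
compression_ratio = raw_bytes / code_bytes
print(raw_bytes, code_bytes, compression_ratio)
```

Raising `m` or `nbits` makes the codes larger and generally improves recall, at the cost of memory.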

### User Management
Comprehensive authentication, authorization, and resource management.

```python { .api }
# Role-based access control
from pymilvus import MilvusClient

client = MilvusClient()

# Create role with specific privileges
client.create_role("data_analyst")
client.grant_privilege(
    role_name="data_analyst",
    object_type="Collection",
    privilege="Search",
    object_name="public_data"
)

# Create user and assign role
client.create_user("analyst1", "secure_password")
client.grant_role("analyst1", "data_analyst")

# Privilege group management
client.create_privilege_group("read_only_group")
client.add_privileges_to_group("read_only_group", ["Query", "Search"])
```
**→ See [User Management](./user-management.md) for complete access control**

### Utility Functions
Helper functions for timestamps, progress monitoring, and maintenance operations.

```python { .api }
# Timestamp utilities and progress monitoring
from pymilvus import utility, mkts_from_datetime, hybridts_to_datetime
from datetime import datetime

# Create travel timestamp for point-in-time queries
travel_time = mkts_from_datetime(datetime(2024, 1, 1, 12, 0, 0))

# Monitor operations
progress = utility.loading_progress("my_collection")
# The result is a dict such as {'loading_progress': '100%'}; the value already includes "%"
print(f"Loading progress: {progress['loading_progress']}")

# Wait for operations to complete
utility.wait_for_loading_complete("my_collection", timeout=300)

# Resource group management
utility.create_resource_group("gpu_group", config={"requests": {"node_num": 2}})
utility.transfer_node("cpu_group", "gpu_group", 1)
```
**→ See [Utility Functions](./utility-functions.md) for complete utility reference**
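
Helpers such as `mkts_from_datetime` and `hybridts_to_datetime` convert between wall-clock time and Milvus hybrid timestamps. As a rough illustration of that encoding, the sketch below assumes the commonly described layout of physical milliseconds in the high bits and an 18-bit logical counter in the low bits. The `make_hybrid_ts` and `split_hybrid_ts` functions are hypothetical stand-ins, not the PyMilvus implementations.

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # low bits reserved for the logical counter (assumed layout)

def make_hybrid_ts(dt, logical=0):
    """Pack a datetime and a logical counter into one integer timestamp."""
    if not 0 <= logical < (1 << LOGICAL_BITS):
        raise ValueError("logical counter out of range")
    physical_ms = int(dt.timestamp() * 1000)
    return (physical_ms << LOGICAL_BITS) | logical

def split_hybrid_ts(ts):
    """Unpack a hybrid timestamp into (UTC datetime, logical counter)."""
    physical_ms = ts >> LOGICAL_BITS
    logical = ts & ((1 << LOGICAL_BITS) - 1)
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc), logical

moment = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
ts = make_hybrid_ts(moment)
recovered, logical = split_hybrid_ts(ts)
print(ts, recovered.isoformat(), logical)
```

The sketch uses a timezone-aware datetime so the round trip does not depend on the local timezone of the machine running it.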

### Async Operations
Non-blocking operations for high-concurrency applications with full async/await support.

```python { .api }
# Concurrent operations with AsyncMilvusClient
from pymilvus import AsyncMilvusClient
import asyncio

async def concurrent_searches():
    client = AsyncMilvusClient()

    # Concurrent search operations
    tasks = []
    for i in range(10):
        task = client.search(
            collection_name="large_collection",
            data=[[0.1] * 128],
            limit=100,
            output_fields=["metadata"]
        )
        tasks.append(task)

    # Wait for all searches to complete
    results = await asyncio.gather(*tasks)
    await client.close()
    return results

# Run concurrent operations
results = asyncio.run(concurrent_searches())
```
**→ See [MilvusClient](./milvus-client.md) for complete async capabilities**
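
When many requests are issued at once, it can help to cap how many are in flight. The sketch below shows the general `asyncio.Semaphore` pattern with a stand-in coroutine, so it runs without a Milvus server. `fake_search` and `bounded_gather` are hypothetical names; in practice the factory would wrap a real awaitable call such as an async client's `search`.

```python
import asyncio

async def fake_search(i):
    """Stand-in for an awaitable search call; returns its own index."""
    await asyncio.sleep(0.01)
    return i

async def bounded_gather(coro_factories, max_concurrency):
    """Run the coroutines with at most `max_concurrency` active at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(factory):
        async with semaphore:
            return await factory()

    # gather returns results in the order the factories were given.
    return await asyncio.gather(*(run(f) for f in coro_factories))

async def main():
    # Factories delay coroutine creation until a semaphore slot is free.
    factories = [lambda i=i: fake_search(i) for i in range(10)]
    return await bounded_gather(factories, max_concurrency=3)

results = asyncio.run(main())
print(results)
```

Passing factories instead of already-created coroutines keeps pending work cheap and avoids "never awaited" warnings if the run is cancelled early.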

### Types and Enums
Comprehensive type system with enums for data types, index types, and configuration options.

```python { .api }
# Type system and enums
from pymilvus import DataType, IndexType, FunctionType, ConsistencyLevel

# Vector data types
vector_types = [
    DataType.FLOAT_VECTOR,      # Standard dense vectors
    DataType.BINARY_VECTOR,     # Binary vectors for efficiency
    DataType.FLOAT16_VECTOR,    # Half-precision vectors
    DataType.BFLOAT16_VECTOR,   # BFloat16 vectors
    DataType.SPARSE_FLOAT_VECTOR # Sparse vectors for text search
]

# Index algorithms
index_types = [
    IndexType.FLAT,       # Exact search
    IndexType.IVF_FLAT,   # Inverted file
    IndexType.HNSW,       # Hierarchical navigable small world
    IndexType.IVF_PQ      # Product quantization
]

# Consistency levels
levels = [
    ConsistencyLevel.Strong,      # Strong consistency
    ConsistencyLevel.Eventually,  # Eventual consistency
    ConsistencyLevel.Bounded,     # Bounded staleness
    ConsistencyLevel.Session      # Session consistency
]
```
**→ See [Types and Enums](./types-enums.md) for complete type reference**
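
The choice of vector data type mainly affects storage per vector. The sketch below estimates raw bytes per vector for the dense and binary types listed above, assuming 4 bytes per dimension for float32, 2 for the 16-bit formats, and 1 bit per dimension for binary vectors. The `vector_bytes` helper is hypothetical and takes type names as plain strings so it runs without PyMilvus. Sparse vectors are left out because their size depends on the number of non-zero entries.

```python
BYTES_PER_DIM = {
    "FLOAT_VECTOR": 4,     # float32
    "FLOAT16_VECTOR": 2,   # IEEE half precision
    "BFLOAT16_VECTOR": 2,  # bfloat16
}

def vector_bytes(type_name, dim):
    """Raw storage in bytes for one vector of the given type and dimension."""
    if type_name == "BINARY_VECTOR":
        # One bit per dimension, so the dimension must fill whole bytes.
        if dim % 8 != 0:
            raise ValueError("binary vector dim must be a multiple of 8")
        return dim // 8
    if type_name in BYTES_PER_DIM:
        return dim * BYTES_PER_DIM[type_name]
    raise ValueError(f"unsupported vector type: {type_name}")

for name in ("FLOAT_VECTOR", "FLOAT16_VECTOR", "BFLOAT16_VECTOR", "BINARY_VECTOR"):
    print(name, vector_bytes(name, 768))
```

Multiplying the result by the expected row count gives a first-order estimate of raw vector storage before any index is built.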

## Sub-Documentation

- **[MilvusClient](./milvus-client.md)** - MilvusClient and AsyncMilvusClient APIs for simplified operations
- **[ORM Collection](./orm-collection.md)** - Collection, Schema, and Field classes for advanced control
- **[Search Operations](./search-operations.md)** - Search, query, and result handling with hybrid search
- **[Data Management](./data-management.md)** - Insert, upsert, delete operations and data iteration
- **[Index Management](./index-management.md)** - Index creation, optimization, and performance tuning
- **[User Management](./user-management.md)** - Authentication, roles, privileges, and resource groups
- **[Utility Functions](./utility-functions.md)** - Helper functions, timestamps, and maintenance operations
- **[Types and Enums](./types-enums.md)** - Data types, enums, constants, and type definitions

---

*This documentation covers all 136+ public API components in PyMilvus, enabling comprehensive vector database operations without accessing source code.*