# PyMilvus - Python SDK for Milvus Vector Database

PyMilvus is the official Python SDK for Milvus, a cloud-native vector database designed for scalable similarity search and AI applications. It provides comprehensive capabilities for vector and scalar data storage, similarity search, collection management, indexing, and user authentication.

## Package Information

**Installation:**
```bash
pip install pymilvus
```

**Import:**
```python
import pymilvus
from pymilvus import MilvusClient, Collection, DataType
```

**Version:** Available via `pymilvus.__version__`
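
When importing the whole package just to read its version is undesirable, the installed distribution's metadata can be queried instead. The sketch below uses only the standard library. The helper name `installed_pymilvus_version` is hypothetical, and it assumes the distribution name is `pymilvus`, as in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_pymilvus_version():
    """Return the installed pymilvus version string, or None if it is not installed."""
    try:
        return version("pymilvus")
    except PackageNotFoundError:
        return None


print(installed_pymilvus_version())
```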

## Core Imports

### Primary Client Interface
```python
from pymilvus import MilvusClient, AsyncMilvusClient

# Synchronous client for common operations
client = MilvusClient(uri="http://localhost:19530")

# Asynchronous client for high-concurrency applications
async_client = AsyncMilvusClient(uri="http://localhost:19530")
```

### ORM Classes for Advanced Usage
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType
from pymilvus import Index, Partition, Role
from pymilvus import Connections, connections

# Schema definition
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("metadata", DataType.JSON)
])

# Collection with ORM interface
collection = Collection("my_collection", schema)
```

### Search and Results
```python
from pymilvus import SearchResult, Hit, Hits
from pymilvus import AnnSearchRequest, RRFRanker, WeightedRanker

# Hybrid search with reranking
# (`client` is a connected MilvusClient; `vectors1` is a list of query vectors)
requests = [
    AnnSearchRequest(data=vectors1, anns_field="vector1",
                     param={"metric_type": "L2"}, limit=100)
]
results = client.hybrid_search("collection", requests, RRFRanker(), limit=10)
```

### Utility Functions
```python
from datetime import datetime

from pymilvus import utility
from pymilvus import create_user, delete_user, list_collections
from pymilvus import mkts_from_datetime, hybridts_to_datetime

# Direct utility access
utility.has_collection("my_collection")
mkts_from_datetime(datetime.now())
```

## Basic Usage

### Simple Collection Creation and Search
```python
from pymilvus import MilvusClient

# Initialize client
client = MilvusClient(uri="http://localhost:19530")

# Create collection with simple parameters
client.create_collection(
    collection_name="quick_setup",
    dimension=128,
    metric_type="COSINE"
)

# Insert data
data = [
    {"id": i, "vector": [0.1] * 128, "text": f"Document {i}"}
    for i in range(1000)
]
client.insert("quick_setup", data)

# Search
results = client.search(
    collection_name="quick_setup",
    data=[[0.1] * 128],  # Query vector
    limit=5,
    output_fields=["text"]
)
```
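
The examples in this document pass `metric_type` values such as `"COSINE"`, `"L2"`, and `"IP"`. As a rough illustration of what these names refer to, the sketch below computes the textbook definitions in plain Python. It is not the Milvus implementation, and the server may report a monotonic variant of these scores; the function names are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner product (IP): larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity: inner product of the length-normalized vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return inner_product(a, b) / (norm_a * norm_b)


query = [3.0, 4.0]
same_direction = [6.0, 8.0]
orthogonal = [-4.0, 3.0]

print(l2_distance(query, same_direction))
print(inner_product(query, orthogonal))
print(cosine_similarity(query, same_direction))
```

Note that cosine similarity ignores vector length, which is why `query` and `same_direction` score as identical under `COSINE` but not under `L2`.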

### Advanced Schema with Functions
```python
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType, Function, FunctionType

# Define schema with BM25 function
fields = [
    FieldSchema("id", DataType.INT64, is_primary=True),
    # The BM25 input field needs an analyzer so the text can be tokenized
    FieldSchema("text", DataType.VARCHAR, max_length=1000, enable_analyzer=True),
    FieldSchema("dense_vector", DataType.FLOAT_VECTOR, dim=128),
    FieldSchema("sparse_vector", DataType.SPARSE_FLOAT_VECTOR),  # BM25 output
]

functions = [
    Function("bm25_function", FunctionType.BM25,
             input_field_names=["text"],
             output_field_names=["sparse_vector"])
]

schema = CollectionSchema(fields, functions=functions, description="Hybrid search collection")
collection = Collection("hybrid_collection", schema)
```

## Architecture

PyMilvus provides two complementary API approaches:

### 1. MilvusClient - Simplified Interface
- **Purpose**: Streamlined operations for common use cases
- **Best for**: Quick prototyping, simple applications, beginners
- **Key features**: Auto-generated schemas, simplified method signatures, built-in defaults

```python
# Automatic schema creation
client.create_collection("simple", dimension=128)

# Direct operations
client.insert("simple", [{"id": 1, "vector": [0.1] * 128}])
results = client.search("simple", [[0.1] * 128], limit=5)
```

### 2. ORM Classes - Advanced Interface
- **Purpose**: Full control over collection lifecycle and configuration
- **Best for**: Production applications, complex schemas, fine-tuned operations
- **Key features**: Explicit schema definition, advanced indexing, partition management

```python
# Explicit schema control
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=128),
], enable_dynamic_field=True)

collection = Collection("advanced", schema)
# Index-specific settings such as nlist belong under "params"
collection.create_index("vector", {
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {"nlist": 1024},
})
```

Both interfaces can be used together and share the same underlying connection management.

## Capabilities

### Vector Operations
Comprehensive vector database operations with multiple data types and search capabilities.

```python { .api }
# Multi-vector hybrid search
from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker

client = MilvusClient()

# Define multiple search requests
req1 = AnnSearchRequest(data=dense_vectors, anns_field="dense_vec",
                        param={"metric_type": "L2"}, limit=100)
req2 = AnnSearchRequest(data=sparse_vectors, anns_field="sparse_vec",
                        param={"metric_type": "IP"}, limit=100)

# Hybrid search with RRF reranking
results = client.hybrid_search(
    collection_name="multi_vector_collection",
    reqs=[req1, req2],
    ranker=RRFRanker(k=60),
    limit=10,
    output_fields=["title", "content"]
)
```
**→ See [Search Operations](./search-operations.md) for complete search capabilities**
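
To give a feel for what `RRFRanker(k=60)` does conceptually, here is a minimal pure-Python sketch of Reciprocal Rank Fusion: each result contributes `1 / (k + rank)` for every list it appears in, and the contributions are summed. This is the textbook formula, not the Milvus server implementation, whose rank offset and tie-breaking may differ. The function name `rrf_fuse` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, limit=10):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:limit]]


dense_hits = ["a", "b", "c"]   # ranking from a dense-vector request
sparse_hits = ["b", "d", "a"]  # ranking from a sparse-vector request

fused = rrf_fuse([dense_hits, sparse_hits], k=60, limit=3)
print(fused)  # → ['b', 'a', 'd']
```

Because only ranks are used, the fusion works even when the individual requests use different metrics (such as `L2` and `IP`) whose raw scores are not comparable.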

### Data Management
Efficient data insertion, updates, and deletion with batch operations and iterators.

```python { .api }
# Batch operations with upsert
from pymilvus import MilvusClient

client = MilvusClient()

# Upsert data (insert or update)
data = [
    {"id": 1, "vector": [0.1] * 128, "metadata": {"category": "A"}},
    {"id": 2, "vector": [0.2] * 128, "metadata": {"category": "B"}},
]
result = client.upsert("my_collection", data)

# Paginated query with iterator
iterator = client.query_iterator(
    collection_name="my_collection",
    filter="metadata['category'] == 'A'",
    output_fields=["id", "metadata"],
    batch_size=1000
)

# Pull batches until the iterator is exhausted, then release it
while True:
    batch = iterator.next()
    if not batch:
        iterator.close()
        break
    process_batch(batch)  # process_batch is a placeholder for your own handler
```
**→ See [Data Management](./data-management.md) for complete CRUD operations**
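
On the write side, large datasets are usually sent in chunks rather than in one request. The helper below is a minimal sketch of that idea using only the standard library; the name `batched_rows` and the batch size are assumptions for illustration, not part of PyMilvus.

```python
from itertools import islice


def batched_rows(rows, batch_size):
    """Yield successive lists of at most `batch_size` rows from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


rows = [{"id": i, "vector": [0.1] * 4} for i in range(10)]
sizes = [len(batch) for batch in batched_rows(rows, batch_size=4)]
print(sizes)  # → [4, 4, 2]
```

Each yielded batch could then be passed to a call such as `client.upsert("my_collection", batch)`.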

### Schema and Collections
Flexible schema definition with support for dynamic fields, functions, and partitioning.

```python { .api }
# Advanced schema with clustering and partitioning
from pymilvus import CollectionSchema, FieldSchema, DataType, Function, FunctionType

fields = [
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("category", DataType.VARCHAR, max_length=100, is_partition_key=True),
    FieldSchema("timestamp", DataType.INT64, is_clustering_key=True),
    FieldSchema("content", DataType.VARCHAR, max_length=2000),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("sparse_embedding", DataType.SPARSE_FLOAT_VECTOR),
]

# Add text embedding function
functions = [
    Function("text_embed", FunctionType.TEXTEMBEDDING,
             input_field_names=["content"],
             output_field_names=["embedding"],
             params={"model_name": "sentence-transformers/all-MiniLM-L6-v2"})
]

# Pass the functions to the constructor, as in the BM25 example above
schema = CollectionSchema(
    fields,
    functions=functions,
    enable_dynamic_field=True,
    description="Production collection with advanced features",
)
```
**→ See [ORM Collection](./orm-collection.md) for complete schema management**

### Index Management
Advanced indexing strategies for optimal search performance across different vector types.

```python { .api }
# Multi-index creation with performance tuning
from pymilvus import Collection

collection = Collection("optimized_collection")

# Vector index with custom parameters
collection.create_index(
    field_name="dense_vector",
    index_params={
        "index_type": "IVF_PQ",
        "metric_type": "L2",
        "params": {
            "nlist": 2048,
            "m": 16,
            "nbits": 8
        }
    }
)

# Scalar index for filtering
collection.create_index(
    field_name="category",
    index_params={"index_type": "TRIE"}
)

# Load collection with custom replica and resource group
collection.load(replica_number=2, _resource_groups=["rg1", "rg2"])
```
**→ See [Index Management](./index-management.md) for complete indexing strategies**
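
The `IVF_PQ` parameters above interact with the vector dimension: product quantization splits each vector into `m` equal sub-vectors, so the dimension must be divisible by `m`, and each sub-vector is encoded in `nbits` bits. The sketch below checks that constraint and estimates the compressed code size per vector. The helper name `ivf_pq_summary` is hypothetical, the dimension of 128 is an assumption (the index example does not state it), and the estimate ignores index overhead such as centroids and IDs.

```python
def ivf_pq_summary(dim, m, nbits=8):
    """Validate PQ parameters and estimate the code size of one vector."""
    if m <= 0 or dim % m != 0:
        raise ValueError(f"dim ({dim}) must be divisible by m ({m})")
    return {
        "sub_vector_dim": dim // m,          # dimensions covered by each sub-quantizer
        "code_bytes": (m * nbits + 7) // 8,  # compressed size per vector, rounded up
        "raw_bytes": dim * 4,                # float32 size for comparison
    }


summary = ivf_pq_summary(dim=128, m=16, nbits=8)
print(summary)  # → {'sub_vector_dim': 8, 'code_bytes': 16, 'raw_bytes': 512}
```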

### User Management
Comprehensive authentication, authorization, and resource management.

```python { .api }
# Role-based access control
from pymilvus import MilvusClient

client = MilvusClient()

# Create role with specific privileges
client.create_role("data_analyst")
client.grant_privilege(
    role_name="data_analyst",
    object_type="Collection",
    privilege="Search",
    object_name="public_data"
)

# Create user and assign role
client.create_user("analyst1", "secure_password")
client.grant_role("analyst1", "data_analyst")

# Privilege group management
client.create_privilege_group("read_only_group")
client.add_privileges_to_group("read_only_group", ["Query", "Search"])
```
**→ See [User Management](./user-management.md) for complete access control**

### Utility Functions
Helper functions for timestamps, progress monitoring, and maintenance operations.

```python { .api }
# Timestamp utilities and progress monitoring
from pymilvus import utility, mkts_from_datetime, hybridts_to_datetime
from datetime import datetime

# Create travel timestamp for point-in-time queries
travel_time = mkts_from_datetime(datetime(2024, 1, 1, 12, 0, 0))

# Monitor operations
progress = utility.loading_progress("my_collection")
# The value is already a percentage string such as "100%"
print(f"Loading progress: {progress['loading_progress']}")

# Wait for operations to complete
utility.wait_for_loading_complete("my_collection", timeout=300)

# Resource group management
utility.create_resource_group("gpu_group", config={"requests": {"node_num": 2}})
utility.transfer_node("cpu_group", "gpu_group", 1)
```
**→ See [Utility Functions](./utility-functions.md) for complete utility reference**
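
The timestamp helpers deal with hybrid timestamps, which pack a physical time and a logical counter into one integer. The sketch below illustrates that layout in plain Python, assuming the physical part is Unix time in milliseconds and the low 18 bits hold the logical counter. The function names are hypothetical, and the library's own helpers may treat time zones and rounding differently.

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # assumed width of the logical counter
LOGICAL_MASK = (1 << LOGICAL_BITS) - 1


def compose_hybrid_ts(moment, logical=0):
    """Pack a datetime (as Unix milliseconds) and a logical counter into one int."""
    physical_ms = int(moment.timestamp() * 1000)
    return (physical_ms << LOGICAL_BITS) | (logical & LOGICAL_MASK)


def decompose_hybrid_ts(hybrid_ts):
    """Split a hybrid timestamp back into a UTC datetime and the logical counter."""
    physical_ms = hybrid_ts >> LOGICAL_BITS
    logical = hybrid_ts & LOGICAL_MASK
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc), logical


moment = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
hybrid_ts = compose_hybrid_ts(moment, logical=5)
restored, logical = decompose_hybrid_ts(hybrid_ts)
print(restored.isoformat(), logical)  # → 2024-01-01T12:00:00+00:00 5
```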

### Async Operations
Non-blocking operations for high-concurrency applications with full async/await support.

```python { .api }
# Concurrent operations with AsyncMilvusClient
from pymilvus import AsyncMilvusClient
import asyncio

async def concurrent_searches():
    client = AsyncMilvusClient()

    # Concurrent search operations
    tasks = []
    for i in range(10):
        task = client.search(
            collection_name="large_collection",
            data=[[0.1] * 128],
            limit=100,
            output_fields=["metadata"]
        )
        tasks.append(task)

    # Wait for all searches to complete
    results = await asyncio.gather(*tasks)
    await client.close()
    return results

# Run concurrent operations
results = asyncio.run(concurrent_searches())
```
**→ See [MilvusClient](./milvus-client.md) for complete async capabilities**
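
With many queries, it is often useful to cap how many requests are in flight at once instead of launching all of them together. The sketch below shows one way to do that with `asyncio.Semaphore`. To stay runnable without a server, `fake_search` is a hypothetical stand-in for `await client.search(...)`, and `bounded_gather` is an illustrative helper, not a PyMilvus API.

```python
import asyncio


async def fake_search(query_id):
    """Stand-in for an awaited client.search(...) call; returns a dummy result."""
    await asyncio.sleep(0.01)
    return {"query": query_id, "hits": []}


async def bounded_gather(coro_factories, max_in_flight=3):
    """Run coroutine factories concurrently, at most `max_in_flight` at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def run(factory):
        async with semaphore:
            return await factory()

    # gather preserves the input order of the results
    return await asyncio.gather(*(run(factory) for factory in coro_factories))


async def main():
    factories = [lambda i=i: fake_search(i) for i in range(10)]
    return await bounded_gather(factories, max_in_flight=3)


results = asyncio.run(main())
print([r["query"] for r in results])  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Passing factories rather than coroutine objects means each search is only created once a slot is free.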

### Types and Enums
Comprehensive type system with enums for data types, index types, and configuration options.

```python { .api }
# Type system and enums
from pymilvus import DataType, IndexType, FunctionType, ConsistencyLevel

# Vector data types
vector_types = [
    DataType.FLOAT_VECTOR,         # Standard dense vectors
    DataType.BINARY_VECTOR,        # Binary vectors for efficiency
    DataType.FLOAT16_VECTOR,       # Half-precision vectors
    DataType.BFLOAT16_VECTOR,      # BFloat16 vectors
    DataType.SPARSE_FLOAT_VECTOR   # Sparse vectors for text search
]

# Index algorithms
index_types = [
    IndexType.FLAT,       # Exact search
    IndexType.IVF_FLAT,   # Inverted file
    IndexType.HNSW,       # Hierarchical navigable small world
    IndexType.IVF_PQ      # Product quantization
]

# Consistency levels
levels = [
    ConsistencyLevel.Strong,       # Strong consistency
    ConsistencyLevel.Eventually,   # Eventual consistency
    ConsistencyLevel.Bounded,      # Bounded staleness
    ConsistencyLevel.Session       # Session consistency
]
```
**→ See [Types and Enums](./types-enums.md) for complete type reference**

## Sub-Documentation

- **[MilvusClient](./milvus-client.md)** - MilvusClient and AsyncMilvusClient APIs for simplified operations
- **[ORM Collection](./orm-collection.md)** - Collection, Schema, and Field classes for advanced control
- **[Search Operations](./search-operations.md)** - Search, query, and result handling with hybrid search
- **[Data Management](./data-management.md)** - Insert, upsert, delete operations and data iteration
- **[Index Management](./index-management.md)** - Index creation, optimization, and performance tuning
- **[User Management](./user-management.md)** - Authentication, roles, privileges, and resource groups
- **[Utility Functions](./utility-functions.md)** - Helper functions, timestamps, and maintenance operations
- **[Types and Enums](./types-enums.md)** - Data types, enums, constants, and type definitions

---

*This documentation covers all 136+ public API components in PyMilvus, enabling comprehensive vector database operations without accessing source code.*