# PyMilvus Integration

The primary and recommended way to use milvus-lite is through the pymilvus client, which activates milvus-lite automatically when given a local file URI. This approach exposes the Milvus API surface for collections, vector operations, indexing, and querying, subject to the limitations listed at the end of this page.

## Capabilities

### Client Initialization

Create a `MilvusClient` instance that automatically uses milvus-lite for local database files.

```python { .api }
from pymilvus import MilvusClient

# Local file URI activates milvus-lite automatically
client = MilvusClient(uri="./database.db")

# Alternative: specify full path
client = MilvusClient(uri="/path/to/database.db")
```

**Usage Example:**

```python
from pymilvus import MilvusClient

# Initialize client - this starts milvus-lite internally
client = MilvusClient("./my_vector_db.db")

# Client is ready for all standard Milvus operations
collection_exists = client.has_collection("test_collection")
```
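
To make the activation rule concrete, here is a minimal sketch of a check for whether a URI would be treated as a local milvus-lite file. The helper `is_local_db_uri` is hypothetical and not part of pymilvus. The rule it encodes (no network scheme, plus a `.db` suffix) is an assumption that approximates pymilvus's behaviour rather than reproducing its code.

```python
from urllib.parse import urlparse

# Schemes assumed to indicate a remote Milvus server rather than a local file.
REMOTE_SCHEMES = {"http", "https", "tcp", "grpc", "unix"}


def is_local_db_uri(uri: str) -> bool:
    """Return True if `uri` looks like a local milvus-lite database file.

    Assumption: a local database is any URI without a remote scheme
    that ends in ".db".
    """
    scheme = urlparse(uri).scheme.lower()
    return scheme not in REMOTE_SCHEMES and uri.endswith(".db")
```

A check like this can guard configuration code, for example by failing fast when a deployment expects a remote server but receives a file path.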

### Collection Management

Full collection lifecycle management including creation, deletion, listing, and metadata operations.

```python { .api }
# Collection creation with schema
client.create_collection(
    collection_name: str,
    dimension: int,
    primary_field_name: str = "id",
    id_type: str = "int",
    vector_field_name: str = "vector",
    metric_type: str = "COSINE",
    auto_id: bool = False,
    timeout: Optional[float] = None,
    **kwargs
) -> None

# Collection existence check
client.has_collection(collection_name: str, timeout: Optional[float] = None) -> bool

# Collection deletion
client.drop_collection(collection_name: str, timeout: Optional[float] = None) -> None

# List all collections
client.list_collections(timeout: Optional[float] = None) -> List[str]

# Get collection schema and properties
client.describe_collection(collection_name: str, timeout: Optional[float] = None) -> Dict[str, Any]

# Get collection statistics (row count)
client.get_collection_stats(collection_name: str, timeout: Optional[float] = None) -> Dict[str, Any]
```

**Usage Example:**

```python
# Create collection with 384-dimensional vectors
client.create_collection(
    collection_name="embeddings",
    dimension=384,
    metric_type="COSINE",
    auto_id=True
)

# Check if collection exists
if client.has_collection("embeddings"):
    # get_collection_stats reports the row count;
    # describe_collection returns the schema instead
    stats = client.get_collection_stats("embeddings")
    print(f"Collection has {stats['row_count']} entities")
```
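
Because `create_collection` is usually called at application start-up, it is common to make it idempotent. The following is a minimal sketch under that assumption. `ensure_collection` is a hypothetical helper, not a pymilvus function, and it relies only on the `has_collection` and `create_collection` calls listed above.

```python
from typing import Any


def ensure_collection(
    client: Any, name: str, dimension: int, metric_type: str = "COSINE"
) -> bool:
    """Create collection `name` if it does not exist.

    `client` is expected to behave like a MilvusClient.
    Returns True if the collection was created, False if it already existed.
    """
    if client.has_collection(name):
        return False
    client.create_collection(
        collection_name=name,
        dimension=dimension,
        metric_type=metric_type,
    )
    return True
```

With a real client this would be called as `ensure_collection(client, "embeddings", 384)`. The return value lets the caller decide whether to seed initial data.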

### Data Operations

Insert, upsert, delete, and query operations for vector data with support for batch operations and metadata filtering.

```python { .api }
# Insert data
client.insert(
    collection_name: str,
    data: List[Dict[str, Any]],
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None
) -> Dict[str, Any]

# Upsert data (insert or update if exists)
client.upsert(
    collection_name: str,
    data: List[Dict[str, Any]],
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None
) -> Dict[str, Any]

# Delete data by primary key or filter expression
client.delete(
    collection_name: str,
    ids: Optional[Union[list, str, int]] = None,
    filter: Optional[str] = None,
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None
) -> Dict[str, Any]

# Query data by filter
client.query(
    collection_name: str,
    filter: str,
    output_fields: Optional[List[str]] = None,
    partition_names: Optional[List[str]] = None,
    timeout: Optional[float] = None
) -> List[Dict[str, Any]]
```

**Usage Example:**

```python
# Insert vector data with metadata.
# Vectors must match the collection dimension (384 for "embeddings");
# ids are omitted because that collection was created with auto_id=True.
data = [
    {"vector": [0.1] * 384, "category": "document", "title": "Sample Doc"},
    {"vector": [0.4] * 384, "category": "image", "title": "Sample Image"}
]

result = client.insert(collection_name="embeddings", data=data)
print(f"Inserted {result['insert_count']} entities")

# Query with filter
results = client.query(
    collection_name="embeddings",
    filter='category == "document"',
    output_fields=["id", "title", "category"]
)
```
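
For larger datasets it can help to split the rows into fixed-size batches rather than sending one very large `insert` call. The sketch below illustrates that idea. `chunked` and `insert_in_batches` are hypothetical helpers, the default batch size of 1000 is an arbitrary assumption, and the code relies only on `insert` returning a dict with an `insert_count` key, as in the example above.

```python
from typing import Any, Dict, Iterator, List


def chunked(rows: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_in_batches(
    client: Any,
    collection_name: str,
    rows: List[Dict[str, Any]],
    batch_size: int = 1000,
) -> int:
    """Insert `rows` batch by batch and return the total number inserted."""
    total = 0
    for batch in chunked(rows, batch_size):
        result = client.insert(collection_name=collection_name, data=batch)
        total += result["insert_count"]
    return total
```

Smaller batches keep peak memory lower and make it easier to retry a failed batch without resending everything.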

### Vector Search

High-performance vector similarity search with support for various distance metrics, filtering, and result limiting.

```python { .api }
# Vector similarity search
client.search(
    collection_name: str,
    data: List[List[float]],
    filter: Optional[str] = None,
    limit: int = 10,
    output_fields: Optional[List[str]] = None,
    search_params: Optional[Dict[str, Any]] = None,
    partition_names: Optional[List[str]] = None,
    timeout: Optional[float] = None
) -> List[List[Dict[str, Any]]]

# Hybrid search (multiple vector fields)
client.hybrid_search(
    collection_name: str,
    reqs: List[Dict[str, Any]],
    ranker: Dict[str, Any],
    limit: int = 10,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    timeout: Optional[float] = None
) -> List[List[Dict[str, Any]]]
```

**Usage Example:**

```python
# Single vector search
# The query embedding must have the same dimension as the collection (384 here)
query_vector = [0.15] * 384
results = client.search(
    collection_name="embeddings",
    data=[query_vector],
    filter='category == "document"',
    limit=5,
    output_fields=["id", "title", "category"]
)

# Process results: one list of hits per query vector
for hits in results:
    for hit in hits:
        print(f"ID: {hit['id']}, Score: {hit['distance']}, Title: {hit['entity']['title']}")
```
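
Search results are nested: one list of hits per query vector, with requested output fields under `entity`. When the results feed a table or a log line, a flat structure is often easier to work with. The following sketch shows one way to do that. `flatten_hits` is a hypothetical helper, and it assumes each hit supports the `hit["id"]`, `hit["distance"]` and `hit["entity"]` lookups used in the example above.

```python
from typing import Any, Dict, List, Sequence


def flatten_hits(
    results: Sequence[Sequence[Any]], fields: Sequence[str]
) -> List[Dict[str, Any]]:
    """Turn nested search results into one flat row per hit.

    Each row records which query produced the hit and its 1-based rank.
    Fields missing from a hit's entity are filled with None.
    """
    rows = []
    for query_index, hits in enumerate(results):
        for rank, hit in enumerate(hits, start=1):
            entity = hit["entity"]
            row = {
                "query": query_index,
                "rank": rank,
                "id": hit["id"],
                "distance": hit["distance"],
            }
            for field in fields:
                row[field] = entity.get(field)
            rows.append(row)
    return rows
```

Called as `flatten_hits(results, ["title", "category"])`, this returns a list of dicts that can be passed directly to a CSV writer or a dataframe constructor.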

### Index Management

Create and manage vector indexes for improved search performance, with support for different index types and parameters.

```python { .api }
# Create index on vector field
# (index_params is built with client.prepare_index_params())
client.create_index(
    collection_name: str,
    index_params: IndexParams,
    timeout: Optional[float] = None
) -> None

# Drop index
client.drop_index(
    collection_name: str,
    index_name: str,
    timeout: Optional[float] = None
) -> None

# List indexes
client.list_indexes(
    collection_name: str,
    timeout: Optional[float] = None
) -> List[str]

# Describe index
client.describe_index(
    collection_name: str,
    index_name: str,
    timeout: Optional[float] = None
) -> Dict[str, Any]
```

**Usage Example:**

```python
# Create IVF_FLAT index for better performance on larger datasets.
# If the field already has an index, release the collection and drop that index first.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_name="vector_index",
    index_type="IVF_FLAT",
    metric_type="COSINE",
    params={"nlist": 128}
)

client.create_index(
    collection_name="embeddings",
    index_params=index_params
)

# Check index information
index_info = client.describe_index(
    collection_name="embeddings",
    index_name="vector_index"
)
print(f"Index type: {index_info['index_type']}")
```
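
The `nlist` parameter of an IVF index sets the number of clusters the vectors are partitioned into, so a sensible value depends on the size of the collection. A commonly cited rule of thumb is roughly `4 * sqrt(n)` for `n` vectors. The sketch below encodes that heuristic as a starting point only. `suggest_nlist` is a hypothetical helper, and both the heuristic and the clamp range of 1 to 65536 are assumptions to verify against the Milvus version in use.

```python
import math


def suggest_nlist(num_vectors: int, minimum: int = 1, maximum: int = 65536) -> int:
    """Suggest an IVF `nlist` value using the 4 * sqrt(n) rule of thumb.

    The result is clamped to [minimum, maximum].
    """
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    estimate = round(4 * math.sqrt(num_vectors))
    return max(minimum, min(maximum, estimate))
```

The suggestion is only a first guess. Recall and latency should be measured on real queries before settling on a value.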

## Connection Management

```python { .api }
# Client automatically manages connection lifecycle
# No explicit connect/disconnect needed for milvus-lite

# Client will use file-based connection for local URIs
# Connection is established when the client is constructed
```
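
Although no explicit disconnect is required, a short-lived script may still want to release the database file at a predictable point. The following sketch wraps client creation in a context manager for that purpose. `open_client` is a hypothetical helper, and it assumes the client object exposes a `close()` method, as `MilvusClient` does.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def open_client(factory: Callable[[str], Any], uri: str) -> Iterator[Any]:
    """Create a client with `factory(uri)` and close it when the block exits.

    `factory` is typically the MilvusClient class itself.
    """
    client = factory(uri)
    try:
        yield client
    finally:
        # Runs on normal exit and when the block raises.
        client.close()
```

Typical usage would be `with open_client(MilvusClient, "./database.db") as client: ...`, which closes the client even if an operation inside the block fails.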

## Supported Features

- **Vector Types**: Dense vectors (float32), sparse vectors, binary vectors, bfloat16 vectors
- **Metadata**: JSON, integers, floats, strings, arrays
- **Filtering**: Rich expression language for metadata filtering
- **Indexing**: FLAT and IVF_FLAT index types (version dependent)
- **Batch Operations**: Efficient bulk insert, upsert, and delete operations
- **Multi-vector**: Multiple vector fields per collection
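
Filter expressions such as `category == "document"` are plain strings, so building them from user input requires careful quoting. The sketch below shows one way to assemble simple comparisons. `literal`, `compare` and `all_of` are hypothetical helpers, the field name `year` is illustrative only, and the supported operator set here is a deliberately small assumption rather than the full expression language.

```python
import json
from typing import Union

ALLOWED_OPERATORS = {"==", "!=", "<", "<=", ">", ">="}


def literal(value: Union[str, int, float]) -> str:
    """Render a Python value as a filter literal (quoted string or bare number)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(value, str):
        # json.dumps adds double quotes and escapes embedded quotes.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, (int, float)):
        return repr(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def compare(field: str, op: str, value: Union[str, int, float]) -> str:
    """Build a single comparison such as: category == "document"."""
    if op not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    return f"{field} {op} {literal(value)}"


def all_of(*clauses: str) -> str:
    """Join clauses with `and`, parenthesising each one."""
    return " and ".join(f"({clause})" for clause in clauses)
```

For example, `all_of(compare("category", "==", "document"), compare("year", ">=", 2020))` produces a string that can be passed as the `filter` argument of `query` or `search`.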

## Limitations

- No partition support (milvus-lite limitation)
- No user authentication/RBAC
- No collection aliases
- Limited to ~1 million vectors for optimal performance
- Fewer index types compared to full Milvus deployment