# ORM Collection, Schema, and Field Classes

The ORM (Object-Relational Mapping) classes provide advanced control over collection lifecycle, schema definition, and field configuration. These classes are ideal for production applications requiring fine-grained control over collection properties, indexing strategies, and data validation.

## Collection

The Collection class is the primary interface for advanced collection operations with explicit schema control.

### Constructor

```python { .api }
from pymilvus import Collection

def __init__(
    self,
    name: str,
    schema: Optional[CollectionSchema] = None,
    using: str = "default",
    shards_num: int = 1,
    consistency_level: str = "Bounded",
    properties: Optional[Dict[str, str]] = None,
    **kwargs
) -> None
```

**Parameters:**
- `name`: Collection name
- `schema`: CollectionSchema object defining structure
- `using`: Connection alias (default: "default")
- `shards_num`: Number of shards for data distribution
- `consistency_level`: "Strong", "Bounded", "Eventually", or "Session"
- `properties`: Custom collection properties
- `**kwargs`: Additional configuration options

**Examples:**
```python
# Create collection with existing schema
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("metadata", DataType.JSON)
])

collection = Collection(
    name="documents",
    schema=schema,
    shards_num=2,
    consistency_level="Strong"
)

# Load existing collection
existing = Collection("existing_collection")
```

### Class Method: construct_from_dataframe

```python { .api }
@classmethod
def construct_from_dataframe(
    cls,
    name: str,
    dataframe: pd.DataFrame,
    primary_field: str = "id",
    auto_id: bool = False,
    **kwargs
) -> Collection
```

**Parameters:**
- `dataframe`: pandas DataFrame with data
- `primary_field`: Column name for primary key
- `auto_id`: Enable auto-generated IDs

**Example:**
```python
import pandas as pd

# Create collection from DataFrame
df = pd.DataFrame({
    "id": [1, 2, 3],
    "vector": [[0.1]*128, [0.2]*128, [0.3]*128],
    "text": ["doc1", "doc2", "doc3"]
})

collection = Collection.construct_from_dataframe(
    "dataframe_collection",
    df,
    primary_field="id"
)
```

### Properties

```python { .api }
# Schema information
collection.schema: CollectionSchema        # Collection schema
collection.name: str                       # Collection name
collection.description: str                # Collection description

# Data statistics
collection.is_empty: bool                  # True if collection has no data
collection.num_entities: int               # Total entity count
collection.num_shards: int                 # Number of shards

# Field access
collection.primary_field: FieldSchema      # Primary key field schema
collection.aliases: List[str]              # List of collection aliases

# Related objects
collection.partitions: List[Partition]     # List of partition objects
collection.indexes: List[Index]            # List of index objects
```

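As a quick way to exercise these read-only properties together, the snippet below is a hedged sketch: `collection_status` is a hypothetical helper (not part of the pymilvus API), and `collection` is assumed to be an existing, connected Collection.

```python
# Hypothetical helper: gather the read-only properties above into one dict.
# `collection` is assumed to be a connected pymilvus Collection instance.
def collection_status(collection):
    return {
        "name": collection.name,
        "description": collection.description,
        "empty": collection.is_empty,
        "entities": collection.num_entities,
        "shards": collection.num_shards,
        "partitions": [p.name for p in collection.partitions],
    }
```

Note that `num_entities` triggers a server round-trip, so a helper like this is best used for occasional diagnostics rather than hot paths.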
### Memory Management

```python { .api }
def load(
    self,
    partition_names: Optional[List[str]] = None,
    replica_number: int = 1,
    timeout: Optional[float] = None,
    **kwargs
) -> None
```

**Parameters:**
- `partition_names`: Specific partitions to load (default: all)
- `replica_number`: Number of replicas for high availability
- `**kwargs`: Additional loading options like `_resource_groups`

```python { .api }
def release(
    self,
    timeout: Optional[float] = None
) -> None
```

**Examples:**
```python
# Load entire collection with multiple replicas
collection.load(replica_number=2)

# Load specific partitions
collection.load(partition_names=["2024", "2023"])

# Load with resource group assignment
collection.load(replica_number=2, _resource_groups=["gpu_group"])

# Release from memory
collection.release()
```

### Data Operations

```python { .api }
def insert(
    self,
    data: Union[List[List], List[Dict], pd.DataFrame],
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None,
    **kwargs
) -> MutationResult
```

```python { .api }
def upsert(
    self,
    data: Union[List[List], List[Dict], pd.DataFrame],
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None,
    **kwargs
) -> MutationResult
```

```python { .api }
def delete(
    self,
    expr: str,
    partition_name: Optional[str] = None,
    timeout: Optional[float] = None,
    **kwargs
) -> MutationResult
```

**Examples:**
```python
# Insert data as list of dictionaries
data = [
    {"id": 1, "embedding": [0.1] * 768, "category": "A"},
    {"id": 2, "embedding": [0.2] * 768, "category": "B"}
]
result = collection.insert(data)

# Insert into specific partition
collection.insert(data, partition_name="recent")

# Delete by expression
collection.delete("category == 'obsolete'")

# Upsert (insert or update)
collection.upsert(updated_data)
```

### Query and Search Operations

```python { .api }
def query(
    self,
    expr: str,
    output_fields: Optional[List[str]] = None,
    partition_names: Optional[List[str]] = None,
    limit: int = 16384,
    offset: int = 0,
    timeout: Optional[float] = None,
    **kwargs
) -> List[Dict[str, Any]]
```

```python { .api }
def search(
    self,
    data: Union[List[List[float]], List[Dict]],
    anns_field: str,
    param: Dict[str, Any],
    limit: int = 10,
    expr: Optional[str] = None,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    timeout: Optional[float] = None,
    round_decimal: int = -1,
    **kwargs
) -> SearchResult
```

```python { .api }
def hybrid_search(
    self,
    reqs: List[AnnSearchRequest],
    rerank: Union[RRFRanker, WeightedRanker],
    limit: int = 10,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    timeout: Optional[float] = None,
    **kwargs
) -> SearchResult
```

**Examples:**
```python
# Query with filtering
results = collection.query(
    expr="category in ['A', 'B'] and score > 0.5",
    output_fields=["id", "category", "metadata"],
    limit=100
)

# Vector search
search_results = collection.search(
    data=[[0.1] * 768],
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 16}},
    limit=10,
    expr="category == 'active'",
    output_fields=["id", "title"]
)

# Hybrid search with multiple vector fields
from pymilvus import AnnSearchRequest, RRFRanker

req1 = AnnSearchRequest(
    data=dense_vectors,
    anns_field="dense_embedding",
    param={"metric_type": "L2"},
    limit=100
)

req2 = AnnSearchRequest(
    data=sparse_vectors,
    anns_field="sparse_embedding",
    param={"metric_type": "IP"},
    limit=100
)

hybrid_results = collection.hybrid_search(
    reqs=[req1, req2],
    rerank=RRFRanker(k=60),
    limit=10
)
```

### Iterator Operations

```python { .api }
def query_iterator(
    self,
    batch_size: int = 1000,
    limit: Optional[int] = None,
    expr: Optional[str] = None,
    output_fields: Optional[List[str]] = None,
    partition_names: Optional[List[str]] = None,
    timeout: Optional[float] = None,
    **kwargs
) -> QueryIterator
```

```python { .api }
def search_iterator(
    self,
    data: Union[List[List[float]], List[Dict]],
    anns_field: str,
    param: Dict[str, Any],
    batch_size: int = 1000,
    limit: Optional[int] = None,
    expr: Optional[str] = None,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    **kwargs
) -> SearchIterator
```

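Iterators are drained with repeated `next()` calls until an empty batch is returned, and should always be closed to release the server-side cursor. The sketch below illustrates the typical loop; `drain_query_iterator` is a hypothetical helper, and `collection` is assumed to be a loaded pymilvus Collection.

```python
# Hypothetical helper: page through every entity matching `expr`.
# Assumes `collection` is a loaded pymilvus Collection.
def drain_query_iterator(collection, expr, batch_size=1000, output_fields=None):
    iterator = collection.query_iterator(
        batch_size=batch_size,
        expr=expr,
        output_fields=output_fields,
    )
    rows = []
    try:
        while True:
            batch = iterator.next()    # empty batch signals exhaustion
            if not batch:
                break
            rows.extend(batch)
    finally:
        iterator.close()               # always release the server-side cursor
    return rows
```

The same drain/close pattern applies to `search_iterator`, which yields result batches per query vector instead of plain dictionaries.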
### Partition Management

```python { .api }
def create_partition(
    self,
    partition_name: str,
    description: str = "",
    timeout: Optional[float] = None
) -> Partition
```

```python { .api }
def drop_partition(
    self,
    partition_name: str,
    timeout: Optional[float] = None
) -> None
```

```python { .api }
def has_partition(
    self,
    partition_name: str,
    timeout: Optional[float] = None
) -> bool
```

```python { .api }
def partition(
    self,
    partition_name: str
) -> Partition
```

**Examples:**
```python
# Create partition
partition = collection.create_partition("2024_q1", "Q1 2024 data")

# Access existing partition
existing_partition = collection.partition("2024_q1")

# Check partition existence
if collection.has_partition("old_data"):
    collection.drop_partition("old_data")

# List all partitions
for partition in collection.partitions:
    print(f"Partition: {partition.name}, Entities: {partition.num_entities}")
```

### Index Management

```python { .api }
def create_index(
    self,
    field_name: str,
    index_params: Dict[str, Any],
    timeout: Optional[float] = None,
    **kwargs
) -> None
```

```python { .api }
def drop_index(
    self,
    field_name: Optional[str] = None,
    index_name: Optional[str] = None,
    timeout: Optional[float] = None
) -> None
```

```python { .api }
def has_index(
    self,
    field_name: Optional[str] = None,
    index_name: Optional[str] = None,
    timeout: Optional[float] = None
) -> bool
```

```python { .api }
def index(
    self,
    field_name: Optional[str] = None,
    index_name: Optional[str] = None
) -> Index
```

**Examples:**
```python
# Create vector index
collection.create_index(
    "embedding",
    {
        "index_type": "IVF_PQ",
        "metric_type": "L2",
        "params": {
            "nlist": 2048,
            "m": 16,
            "nbits": 8
        }
    }
)

# Create scalar index
collection.create_index("category", {"index_type": "TRIE"})

# Access index information
if collection.has_index("embedding"):
    idx = collection.index("embedding")
    print(f"Index type: {idx.index_type}")
```

### Collection Management

```python { .api }
def flush(
    self,
    timeout: Optional[float] = None,
    **kwargs
) -> None
```

```python { .api }
def drop(
    self,
    timeout: Optional[float] = None
) -> None
```

```python { .api }
def compact(
    self,
    timeout: Optional[float] = None,
    **kwargs
) -> int
```

```python { .api }
def describe(
    self,
    timeout: Optional[float] = None
) -> Dict[str, Any]
```

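These management calls have no example above, so the sketch below strings them together into a post-ingest maintenance pass. It is a hedged illustration: `maintenance_pass` is a hypothetical helper, and `collection` is assumed to be an existing pymilvus Collection. Per the signatures above, `flush()` persists buffered writes, `compact()` returns a compaction job id, and `describe()` returns collection metadata.

```python
# Hypothetical helper: run after a large batch of inserts/deletes.
# Assumes `collection` is an existing pymilvus Collection.
def maintenance_pass(collection):
    collection.flush()                    # persist buffered writes to storage
    compaction_id = collection.compact()  # merge small/deleted segments
    info = collection.describe()          # collection metadata
    return compaction_id, info
```

Frequent flushing creates many small segments, so a pass like this is usually reserved for the end of a bulk-load job rather than after every insert.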
## CollectionSchema

Defines the structure and configuration of a collection, including fields, functions, and properties.

### Constructor

```python { .api }
from pymilvus import CollectionSchema

def __init__(
    self,
    fields: List[FieldSchema],
    description: str = "",
    functions: Optional[List[Function]] = None,
    **kwargs
) -> None
```

**Parameters:**
- `fields`: List of FieldSchema objects defining collection structure
- `description`: Human-readable description
- `functions`: List of Function objects for computed fields
- `**kwargs`: Schema configuration options

**Key Kwargs:**
- `auto_id`: Enable auto-generated primary keys (bool)
- `enable_dynamic_field`: Allow dynamic fields not in schema (bool)
- `primary_field`: Primary key field name (str)
- `partition_key_field`: Partition key field name (str)
- `clustering_key_field_name`: Clustering key field name (str)

**Examples:**
```python
from pymilvus import CollectionSchema, FieldSchema, DataType, Function, FunctionType

# Basic schema
basic_schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("text", DataType.VARCHAR, max_length=1000)
], description="Simple document collection")

# Advanced schema with all features
advanced_fields = [
    FieldSchema("doc_id", DataType.VARCHAR, max_length=100, is_primary=True),
    FieldSchema("category", DataType.VARCHAR, max_length=50, is_partition_key=True),
    FieldSchema("timestamp", DataType.INT64, is_clustering_key=True),
    FieldSchema("content", DataType.VARCHAR, max_length=5000),
    FieldSchema("dense_vector", DataType.FLOAT_VECTOR, dim=768),
    FieldSchema("sparse_vector", DataType.SPARSE_FLOAT_VECTOR),
    FieldSchema("metadata", DataType.JSON)
]

# BM25 function for sparse vectors
bm25_function = Function(
    name="bm25_sparse",
    function_type=FunctionType.BM25,
    input_field_names=["content"],
    output_field_names=["sparse_vector"],
    params={"language": "en"}
)

advanced_schema = CollectionSchema(
    fields=advanced_fields,
    functions=[bm25_function],
    description="Production document collection",
    enable_dynamic_field=True,
    partition_key_field="category",
    clustering_key_field_name="timestamp"
)
```

### Properties

```python { .api }
schema.fields: List[FieldSchema]                    # All field definitions
schema.description: str                             # Schema description
schema.functions: Optional[List[Function]]          # Computed field functions

# Special field access
schema.primary_field: Optional[FieldSchema]         # Primary key field
schema.partition_key_field: Optional[FieldSchema]   # Partition key field
schema.clustering_key_field: Optional[FieldSchema]  # Clustering key field

# Configuration flags
schema.enable_dynamic_field: bool                   # Dynamic fields enabled
schema.auto_id: bool                                # Auto ID generation enabled
```

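A common introspection task over `schema.fields` is locating the vector fields, for example to decide which fields need an index. The sketch below is hedged: `vector_field_names` is a hypothetical helper, and it relies only on the fact that `DataType` is an enum, so each field's `dtype` exposes a `.name` attribute.

```python
# Vector data types, identified by their DataType enum member names.
VECTOR_TYPE_NAMES = {
    "FLOAT_VECTOR", "BINARY_VECTOR", "SPARSE_FLOAT_VECTOR",
    "FLOAT16_VECTOR", "BFLOAT16_VECTOR",
}

# Hypothetical helper: names of all vector fields in a CollectionSchema.
def vector_field_names(schema):
    return [f.name for f in schema.fields if f.dtype.name in VECTOR_TYPE_NAMES]
```

Applied to the `advanced_schema` example above, this would pick out `dense_vector` and `sparse_vector` while skipping scalar and JSON fields.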
### Methods

```python { .api }
def add_field(
    self,
    field_name: str,
    datatype: DataType,
    **kwargs
) -> None
```

```python { .api }
def to_dict(self) -> Dict[str, Any]
```

**Example:**
```python
# Add field to existing schema
schema.add_field("score", DataType.DOUBLE, default_value=0.0)

# Convert to dictionary for inspection
schema_dict = schema.to_dict()
print(f"Fields: {len(schema_dict['fields'])}")
```

## FieldSchema

Defines individual field properties, including data type, constraints, and metadata.

### Constructor

```python { .api }
from pymilvus import FieldSchema, DataType

def __init__(
    self,
    name: str,
    dtype: DataType,
    description: str = "",
    **kwargs
) -> None
```

**Parameters:**
- `name`: Field name (must be unique within schema)
- `dtype`: DataType enum value
- `description`: Human-readable field description
- `**kwargs`: Field-specific configuration options

**Key Kwargs:**
- `is_primary`: Mark as primary key field (bool)
- `auto_id`: Enable auto-generated values for primary key (bool)
- `max_length`: Maximum length for VARCHAR fields (int)
- `dim`: Dimension for vector fields (int)
- `max_capacity`: Maximum capacity for ARRAY fields (int)
- `element_type`: Element data type for ARRAY fields (DataType)
- `is_partition_key`: Mark as partition key (bool)
- `is_clustering_key`: Mark as clustering key (bool)
- `nullable`: Allow null values (bool)
- `default_value`: Default field value
- `mmap_enabled`: Enable memory mapping for large fields (bool)
- `is_function_output`: Mark as function output field (bool)

### Field Type Examples

```python { .api }
# Primary key fields
id_field = FieldSchema("id", DataType.INT64, is_primary=True, auto_id=True)
uuid_field = FieldSchema("uuid", DataType.VARCHAR, max_length=36, is_primary=True)

# Vector fields
dense_vector = FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=768)
binary_vector = FieldSchema("hash", DataType.BINARY_VECTOR, dim=128)
sparse_vector = FieldSchema("sparse", DataType.SPARSE_FLOAT_VECTOR)

# Half-precision vectors for memory efficiency
fp16_vector = FieldSchema("fp16_embed", DataType.FLOAT16_VECTOR, dim=512)
bf16_vector = FieldSchema("bf16_embed", DataType.BFLOAT16_VECTOR, dim=512)

# Scalar fields
text_field = FieldSchema("title", DataType.VARCHAR, max_length=200)
json_field = FieldSchema("metadata", DataType.JSON)
bool_field = FieldSchema("active", DataType.BOOL, default_value=True)
int_field = FieldSchema("count", DataType.INT32, default_value=0)
float_field = FieldSchema("score", DataType.DOUBLE, nullable=True)

# Array fields
tag_array = FieldSchema(
    "tags",
    DataType.ARRAY,
    max_capacity=10,
    element_type=DataType.VARCHAR
)

# Special purpose fields
partition_key = FieldSchema(
    "category",
    DataType.VARCHAR,
    max_length=50,
    is_partition_key=True
)

clustering_key = FieldSchema(
    "timestamp",
    DataType.INT64,
    is_clustering_key=True
)

# Memory-mapped field for large data
large_field = FieldSchema(
    "large_data",
    DataType.VARCHAR,
    max_length=10000,
    mmap_enabled=True
)
```

### Properties

```python { .api }
field.name: str                          # Field name
field.dtype: DataType                    # Data type
field.description: str                   # Field description

# Special properties
field.is_primary: bool                   # Primary key flag
field.is_dynamic: bool                   # Dynamic field flag
field.auto_id: bool                      # Auto ID generation flag
field.nullable: bool                     # Nullable flag
field.is_partition_key: bool             # Partition key flag
field.is_clustering_key: bool            # Clustering key flag
field.is_function_output: bool           # Function output flag

# Type-specific properties
field.max_length: Optional[int]          # VARCHAR max length
field.dim: Optional[int]                 # Vector dimension
field.max_capacity: Optional[int]        # ARRAY max capacity
field.element_type: Optional[DataType]   # ARRAY element type
field.default_value: Any                 # Default field value
field.mmap_enabled: Optional[bool]       # Memory mapping enabled
```

## Function

Defines computed fields that are automatically generated from input fields using built-in functions.

### Constructor

```python { .api }
from pymilvus import Function, FunctionType

def __init__(
    self,
    name: str,
    function_type: FunctionType,
    input_field_names: Union[str, List[str]],
    output_field_names: Optional[Union[str, List[str]]] = None,
    description: str = "",
    params: Optional[Dict] = None
)
```

**Parameters:**
- `name`: Function name (must be unique)
- `function_type`: FunctionType enum (BM25, TEXTEMBEDDING, RERANK)
- `input_field_names`: Source field name(s)
- `output_field_names`: Target field name(s)
- `description`: Function description
- `params`: Function-specific parameters

### Function Types

```python { .api }
# BM25 sparse vector generation
bm25_func = Function(
    name="content_bm25",
    function_type=FunctionType.BM25,
    input_field_names=["content"],
    output_field_names=["bm25_sparse"],
    params={"language": "en"}
)

# Text embedding generation
embed_func = Function(
    name="title_embedding",
    function_type=FunctionType.TEXTEMBEDDING,
    input_field_names=["title", "description"],
    output_field_names=["text_embedding"],
    params={
        "model_name": "sentence-transformers/all-MiniLM-L6-v2",
        "model_config": {"device": "gpu"}
    }
)

# Reranking function
rerank_func = Function(
    name="relevance_rerank",
    function_type=FunctionType.RERANK,
    input_field_names=["query", "document"],
    output_field_names=["relevance_score"],
    params={"model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2"}
)
```

### Properties

```python { .api }
func.name: str                       # Function name
func.function_type: FunctionType     # Function type enum
func.input_field_names: List[str]    # Input field names
func.output_field_names: List[str]   # Output field names
func.description: str                # Function description
func.params: Optional[Dict]          # Function parameters
```

## Advanced Schema Patterns

### Multi-Vector Collection with Functions

```python { .api }
from pymilvus import CollectionSchema, FieldSchema, DataType, Function, FunctionType

# Define fields including function outputs
fields = [
    # Primary key
    FieldSchema("doc_id", DataType.VARCHAR, max_length=100, is_primary=True),

    # Partitioning and clustering
    FieldSchema("category", DataType.VARCHAR, max_length=50, is_partition_key=True),
    FieldSchema("created_at", DataType.INT64, is_clustering_key=True),

    # Input text fields
    FieldSchema("title", DataType.VARCHAR, max_length=500),
    FieldSchema("content", DataType.VARCHAR, max_length=10000),

    # Vector fields (function outputs)
    FieldSchema("title_embedding", DataType.FLOAT_VECTOR, dim=384, is_function_output=True),
    FieldSchema("content_sparse", DataType.SPARSE_FLOAT_VECTOR, is_function_output=True),

    # Metadata
    FieldSchema("metadata", DataType.JSON),
    FieldSchema("tags", DataType.ARRAY, max_capacity=20, element_type=DataType.VARCHAR)
]

# Define functions
functions = [
    Function(
        "title_embed",
        FunctionType.TEXTEMBEDDING,
        input_field_names=["title"],
        output_field_names=["title_embedding"],
        params={"model_name": "sentence-transformers/all-MiniLM-L6-v2"}
    ),
    Function(
        "content_bm25",
        FunctionType.BM25,
        input_field_names=["content"],
        output_field_names=["content_sparse"],
        params={"language": "en"}
    )
]

# Create comprehensive schema
schema = CollectionSchema(
    fields=fields,
    functions=functions,
    description="Multi-vector document collection with automatic embeddings",
    enable_dynamic_field=True,
    partition_key_field="category",
    clustering_key_field_name="created_at"
)

# Create collection
collection = Collection("documents", schema)
```

### Schema Validation and Best Practices

```python { .api }
# Validate schema before collection creation
def validate_schema(schema: CollectionSchema) -> bool:
    """Validate schema configuration"""

    # Check for primary key
    if not schema.primary_field:
        raise ValueError("Schema must have a primary key field")

    # Validate vector dimensions (sparse vectors carry no fixed dimension)
    dimensioned_types = [
        DataType.FLOAT_VECTOR, DataType.BINARY_VECTOR,
        DataType.FLOAT16_VECTOR, DataType.BFLOAT16_VECTOR
    ]
    for field in schema.fields:
        if field.dtype in dimensioned_types:
            dim = getattr(field, "dim", None)
            if not dim or dim <= 0:
                raise ValueError(f"Vector field {field.name} must have a valid dimension")

    # Validate that function input/output fields exist
    if schema.functions:
        field_names = {f.name for f in schema.fields}
        for func in schema.functions:
            for input_name in func.input_field_names:
                if input_name not in field_names:
                    raise ValueError(f"Function {func.name} input field {input_name} not found")
            for output_name in func.output_field_names:
                if output_name not in field_names:
                    raise ValueError(f"Function {func.name} output field {output_name} not found")

    return True

# Use validation
try:
    validate_schema(schema)
    collection = Collection("validated_collection", schema)
except ValueError as e:
    print(f"Schema validation failed: {e}")
```

The ORM classes provide comprehensive control over collection structure and behavior, enabling sophisticated data modeling patterns while maintaining type safety and validation.