A lightweight version of Milvus wrapped with Python for vector similarity search in AI applications
Data export and migration utilities for moving data between milvus-lite and other Milvus deployments. The command line interface provides tools for collection dumping, data format conversion, and bulk data operations.
The CLI tools are included with milvus-lite but require additional dependencies for data export functionality.
# Install milvus-lite with bulk writer dependencies
pip install -U "pymilvus[bulk_writer]"
# Verify CLI is available
milvus-lite --help

Export collection data from a milvus-lite database to JSON format files for migration to other Milvus deployments.
milvus-lite dump -d DB_FILE -c COLLECTION -p PATH
# Required arguments:
# -d, --db-file DB_FILE milvus lite database file path
# -c, --collection COLLECTION collection name to dump
# -p, --path PATH output directory for dump files
# Optional arguments:
# -h, --help show help message and exit

Usage Examples:
# Basic collection dump
milvus-lite dump -d ./my_vectors.db -c embeddings -p ./export_data
# Dump with full paths
milvus-lite dump --db-file /home/user/data/vectors.db \
--collection user_profiles \
--path /tmp/migration_data
# Export multiple collections (run command for each)
milvus-lite dump -d ./app.db -c collection1 -p ./exports/collection1
milvus-lite dump -d ./app.db -c collection2 -p ./exports/collection2

The dump command performs comprehensive data export with support for various vector types and metadata formats.
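Running the dump once per collection, as above, is easy to script. A minimal sketch (hedged: `build_dump_commands` is an illustrative helper, not part of milvus-lite; it assumes the `milvus-lite` CLI is on `PATH`):

```python
import subprocess
from pathlib import Path

def build_dump_commands(db_file: str, collections: list[str], out_root: str) -> list[list[str]]:
    """Build one `milvus-lite dump` command per collection."""
    commands = []
    for name in collections:
        out_dir = Path(out_root) / name  # one output directory per collection
        commands.append([
            "milvus-lite", "dump",
            "-d", db_file,
            "-c", name,
            "-p", str(out_dir),
        ])
    return commands

if __name__ == "__main__":
    for cmd in build_dump_commands("./app.db", ["collection1", "collection2"], "./exports"):
        print(" ".join(cmd))  # replace with subprocess.run(cmd, check=True) to execute
```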
Export Features:
Export Process:
Usage Example:
# Programmatic access to dump functionality
from milvus_lite.cmdline import dump_collection
try:
    dump_collection(
        db_file="./my_database.db",
        collection_name="embeddings",
        path="./export_directory"
    )
    print("Export completed successfully")
except RuntimeError as e:
    print(f"Export failed: {e}")

Automatic conversion of specialized vector types during export for compatibility with import tools.
import numpy as np

def bfloat16_to_float32(byte_data: bytes) -> np.ndarray:
    """
    Convert bfloat16 byte data to float32 numpy array.

    Parameters:
    - byte_data (bytes): Raw bfloat16 vector data

    Returns:
    - np.ndarray: Converted float32 array
    """
    # Illustrative implementation: a bfloat16 is the high 16 bits of a
    # float32, so zero-extend each value into the high half of a uint32
    # and reinterpret the bits.
    as_uint16 = np.frombuffer(byte_data, dtype=np.uint16)
    return (as_uint16.astype(np.uint32) << 16).view(np.float32)

def binary_to_int_list(packed_bytes: bytes) -> np.ndarray:
    """
    Convert packed binary vectors to integer list representation.

    Parameters:
    - packed_bytes (bytes): Packed binary vector data

    Returns:
    - np.ndarray: Unpacked binary vector as integer array
    """
    # Illustrative implementation: each byte packs 8 binary dimensions;
    # expand to one 0/1 value per dimension.
    return np.unpackbits(np.frombuffer(packed_bytes, dtype=np.uint8))

These conversion functions are automatically applied during the dump process to ensure exported data is compatible with bulk import tools.
Custom JSON encoder for handling Milvus-specific data types during export.
import json
import numpy as np

class MilvusEncoder(json.JSONEncoder):
    """
    JSON encoder for Milvus data types.

    Handles numpy arrays, float types, and other Milvus-specific
    data structures for proper JSON serialization.
    """
    def default(self, obj):
        """
        Convert Milvus objects to JSON-serializable format.

        Supports:
        - numpy.ndarray -> list
        - numpy.float32/float16 -> float
        - Other standard JSON types
        """
        # Illustrative dispatch for the conversions listed above.
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if isinstance(obj, (np.floating, np.integer)):
            return obj.item()
        return super().default(obj)

Complete workflow for migrating data from milvus-lite to other Milvus deployments.
Step 1: Export from Milvus Lite
# Export collection data
milvus-lite dump -d ./source.db -c my_collection -p ./migration_data
# This creates JSON files in ./migration_data/ directory

Step 2: Import to Target Milvus
For Zilliz Cloud (managed Milvus):
For Self-hosted Milvus:
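For the self-hosted case, a hedged sketch of client-side insertion (assumes a running Milvus and an existing target collection; `load_dump_rows` and the `{"rows": [...]}` file layout are illustrative, not a documented milvus-lite format):

```python
import json
from pathlib import Path

def load_dump_rows(dump_dir: str) -> list[dict]:
    """Collect row dicts from every JSON file in the dump directory.

    Accepts either a top-level list of rows or a {"rows": [...]} object
    per file (hypothetical layouts for illustration).
    """
    rows: list[dict] = []
    for json_file in sorted(Path(dump_dir).glob("*.json")):
        data = json.loads(json_file.read_text())
        rows.extend(data["rows"] if isinstance(data, dict) else data)
    return rows

def import_to_milvus(uri: str, collection: str, dump_dir: str) -> None:
    """Insert the dumped rows into a running Milvus instance (requires pymilvus)."""
    from pymilvus import MilvusClient

    client = MilvusClient(uri=uri)
    client.insert(collection_name=collection, data=load_dump_rows(dump_dir))
```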
Step 3: Verify Migration
# Verify data after migration
from pymilvus import MilvusClient
# Connect to target Milvus instance
target_client = MilvusClient(uri="http://target-milvus:19530")
# Check collection exists and has expected data
if target_client.has_collection("my_collection"):
    stats = target_client.get_collection_stats("my_collection")
    print(f"Migrated collection has {stats['row_count']} entities")

    # Sample some data to verify
    results = target_client.query(
        collection_name="my_collection",
        filter="",  # No filter, get any records
        limit=5,
        output_fields=["*"]
    )
    print(f"Sample migrated data: {results}")

The CLI tools provide comprehensive error handling and validation.
# Common errors and exceptions:
# - RuntimeError: Database file not found, collection doesn't exist
# - FileNotFoundError: Invalid export path or permissions
# - PermissionError: Insufficient file system permissions
# - ValueError: Invalid arguments or collection schema issues

Error Examples:
# Database file doesn't exist
$ milvus-lite dump -d ./missing.db -c test -p ./out
# RuntimeError: db_file: ./missing.db not exists
# Collection doesn't exist
$ milvus-lite dump -d ./valid.db -c missing_collection -p ./out
# RuntimeError: Collection: missing_collection not exists
# Invalid export path
$ milvus-lite dump -d ./valid.db -c test -p /invalid/path
# RuntimeError: dump path(/invalid/path)'s parent dir not exists

The dump command is optimized for large collections with configurable batch sizes and memory management.
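The batching-plus-streaming idea can be illustrated with a small standalone sketch (hedged: `iter_batches`, `stream_dump`, and the JSON-lines layout are illustrative, not the dump internals):

```python
import json
from itertools import islice
from typing import Iterable, Iterator

def iter_batches(rows: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield rows in fixed-size batches so only one batch is in memory at a time."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch

def stream_dump(rows: Iterable[dict], path: str, batch_size: int = 1000) -> int:
    """Write rows to `path` one batch at a time; returns the total row count."""
    written = 0
    with open(path, "w") as f:
        for batch in iter_batches(rows, batch_size):
            for row in batch:
                f.write(json.dumps(row) + "\n")
            written += len(batch)
    return written
```

A query iterator over the source collection plays the role of `rows` here, so the full collection never has to fit in memory.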
Performance Features:
Large Collection Handling:
# The dump process automatically handles large collections
# by using query iterators and batch processing
# Default configuration optimized for performance:
# - Segment size: 512MB
# - File type: JSON
# - Batch processing with progress tracking
# - Memory-efficient streaming

The exported JSON files are designed for compatibility with various Milvus import tools.
Supported Import Destinations:
Export Format:
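As an illustration of a bulk-import-friendly layout (hedged: the field names are hypothetical; Milvus's row-based bulk-import JSON wraps records in a top-level "rows" array):

```python
import json

# Hypothetical exported records: a primary key, a float vector, and scalar metadata.
export = {
    "rows": [
        {"id": 1, "vector": [0.12, 0.34, 0.56], "label": "cat"},
        {"id": 2, "vector": [0.78, 0.90, 0.11], "label": "dog"},
    ]
}

with open("sample_dump.json", "w") as f:
    json.dump(export, f, indent=2)
```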
Install with Tessl CLI
npx tessl i tessl/pypi-milvus-lite