Milvus embedding store integration for LangChain4j
Advanced configuration and optimization for MilvusEmbeddingStore.
Optimal for: Fast approximate search, 100K-10M vectors
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import java.util.Map;
MilvusEmbeddingStore.builder()
.indexType(IndexType.HNSW)
.metricType(MetricType.COSINE)
.extraParameters(Map.of(
"efConstruction", 200, // Build-time: 64-512, higher=better accuracy/slower build
"m", 16 // Connections: 4-64, higher=better accuracy/more memory
))
.build();
Parameter Guidelines:
efConstruction: Start with 200, increase for better accuracy
m: Start with 16, increase for better recall
Use Cases:
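To make the "more memory" side of the m trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the commonly quoted HNSW estimate of one float32 vector plus about 2*m four-byte neighbor links per vector at the base layer; upper graph layers and Milvus's own bookkeeping are ignored, so treat the numbers as a lower bound rather than what Milvus will actually report. The class and method names are hypothetical.

```java
final class HnswMemoryEstimate {

    // Rough size in bytes: raw float32 vectors plus base-layer neighbor links.
    // Assumption: 4-byte floats, 4-byte neighbor ids, 2*m links per vector.
    static long estimateBytes(long numVectors, int dimension, int m) {
        long vectorBytes = (long) dimension * 4L;
        long linkBytes = (long) m * 2L * 4L;
        return numVectors * (vectorBytes + linkBytes);
    }

    public static void main(String[] args) {
        long numVectors = 1_000_000L;
        int dimension = 384;
        for (int m : new int[] {16, 32, 64}) {
            double gib = estimateBytes(numVectors, dimension, m) / (1024.0 * 1024.0 * 1024.0);
            System.out.printf("m=%d -> ~%.2f GiB%n", m, gib);
        }
    }
}
```

Under these assumptions, doubling m adds only the link overhead, so for high-dimensional vectors the raw vector data usually dominates.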
Optimal for: Balanced performance, 100K-5M vectors
MilvusEmbeddingStore.builder()
.indexType(IndexType.IVF_FLAT)
.extraParameters(Map.of(
"nlist", 1024 // Clusters: 1-65536, rule: sqrt(N) to 4*sqrt(N)
))
.build();
Parameter Guidelines:
nlist: ~sqrt(num_vectors) to 4×sqrt(num_vectors)
Use Cases:
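The nlist rule of thumb above can be turned into a small helper. This is a minimal sketch, not part of the LangChain4j or Milvus API: the class name is hypothetical, and the result is clamped to the 1-65536 range quoted in the code comment above.

```java
final class NlistRule {

    static final int MIN_NLIST = 1;
    static final int MAX_NLIST = 65536;

    // Keep a candidate value inside the allowed nlist range.
    static int clamp(long value) {
        return (int) Math.max(MIN_NLIST, Math.min(MAX_NLIST, value));
    }

    // Lower end of the suggested range: sqrt(N).
    static int lower(long numVectors) {
        return clamp(Math.round(Math.sqrt((double) numVectors)));
    }

    // Upper end of the suggested range: 4 * sqrt(N).
    static int upper(long numVectors) {
        return clamp(Math.round(4.0 * Math.sqrt((double) numVectors)));
    }

    public static void main(String[] args) {
        long numVectors = 1_000_000L;
        System.out.println("nlist between " + lower(numVectors) + " and " + upper(numVectors));
    }
}
```

For one million vectors this gives a range of 1000 to 4000, so the 1024 used in the example sits at the low end.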
Optimal for: Large datasets, memory efficiency
MilvusEmbeddingStore.builder()
.indexType(IndexType.IVF_PQ)
.extraParameters(Map.of(
"m", 8, // Subquantizers: divides dimension, must divide evenly
"nlist", 2048 // Clusters
))
.build();
Parameter Guidelines:
m: dimension must be divisible by m (e.g., 384/8=48)
Smaller m = more compression = less accuracy
Use Cases:
5M vectors
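As a quick check of the divisibility rule for m, the sketch below lists the values that divide a given dimension evenly and estimates the resulting compression. It assumes 8-bit PQ codes (one byte per subquantizer) and float32 input vectors; the class and method names are hypothetical and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

final class PqSubquantizers {

    // All m in [1, maxM] that divide the dimension evenly.
    static List<Integer> validM(int dimension, int maxM) {
        List<Integer> result = new ArrayList<>();
        for (int m = 1; m <= Math.min(maxM, dimension); m++) {
            if (dimension % m == 0) {
                result.add(m);
            }
        }
        return result;
    }

    // Ratio of raw float32 size to PQ code size, assuming one byte per subquantizer.
    static double compressionRatio(int dimension, int m) {
        return (dimension * 4.0) / m;
    }

    public static void main(String[] args) {
        int dimension = 384;
        for (int m : validM(dimension, 64)) {
            System.out.printf("m=%d -> sub-vector size %d, ~%.0fx smaller%n",
                    m, dimension / m, compressionRatio(dimension, m));
        }
    }
}
```

With dimension 384 and m = 8, each code takes 8 bytes instead of 1536, which is where the memory savings (and the accuracy loss) come from.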
Optimal for: Very large datasets (> 10M vectors)
MilvusEmbeddingStore.builder()
.indexType(IndexType.DISKANN)
.build();
Use Cases:
import io.milvus.common.clientenum.ConsistencyLevelEnum;
.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)
Characteristics:
Visibility Delay:
.consistencyLevel(ConsistencyLevelEnum.BOUNDED)
Characteristics:
Use When:
.consistencyLevel(ConsistencyLevelEnum.SESSION)
Characteristics:
Use When:
.consistencyLevel(ConsistencyLevelEnum.STRONG)
Characteristics:
Use When:
Performance Impact:
import io.milvus.client.MilvusServiceClient;
import io.milvus.param.ConnectParam;
import io.milvus.param.clientconfiguration.ClientConfiguration;
// Create shared client
ConnectParam connectParam = ConnectParam.newBuilder()
.withHost("localhost")
.withPort(19530)
.withAuthorization("user", "password")
.build();
MilvusServiceClient sharedClient = new MilvusServiceClient(connectParam);
// Reuse across multiple stores
MilvusEmbeddingStore store1 = MilvusEmbeddingStore.builder()
.milvusClient(sharedClient)
.collectionName("collection1")
.dimension(384)
.build();
MilvusEmbeddingStore store2 = MilvusEmbeddingStore.builder()
.milvusClient(sharedClient)
.collectionName("collection2")
.dimension(768)
.build();
Benefits:
Customize schema for compatibility or preferences:
MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("custom_schema")
.dimension(384)
.idFieldName("document_id")
.textFieldName("content")
.metadataFieldName("properties")
.vectorFieldName("embedding_vector")
.build();
Use Cases:
Milvus supports multiple databases for tenant isolation:
// Tenant 1
MilvusEmbeddingStore tenant1Store = MilvusEmbeddingStore.builder()
.host("localhost")
.databaseName("tenant_1")
.collectionName("embeddings")
.dimension(384)
.build();
// Tenant 2
MilvusEmbeddingStore tenant2Store = MilvusEmbeddingStore.builder()
.host("localhost")
.databaseName("tenant_2")
.collectionName("embeddings")
.dimension(384)
.build();
Benefits:
MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("bulk_insert")
.dimension(384)
.autoFlushOnInsert(false) // Critical for bulk
.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)
.indexType(IndexType.IVF_FLAT) // Build after bulk insert
.build();
Pattern:
addAll()
MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("high_qps")
.dimension(384)
.indexType(IndexType.HNSW)
.metricType(MetricType.COSINE)
.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)
.retrieveEmbeddingsOnSearch(false) // Critical for QPS
.extraParameters(Map.of("efConstruction", 512, "m", 32))
.build();
Key Settings:
MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("realtime")
.dimension(384)
.indexType(IndexType.IVF_FLAT)
.consistencyLevel(ConsistencyLevelEnum.STRONG) // Critical
.autoFlushOnInsert(true) // For durability
.build();
Trade-offs:
MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("memory_efficient")
.dimension(384)
.indexType(IndexType.IVF_PQ)
.extraParameters(Map.of("m", 8, "nlist", 2048))
.retrieveEmbeddingsOnSearch(false)
.build();
.metricType(MetricType.COSINE)
Best For:
Score Range: 0 to 1 (1 = most similar)
.metricType(MetricType.L2)
Best For:
Score Range: 0 to ∞ (0 = most similar, lower = better)
.metricType(MetricType.IP)
Best For:
Score Range: -1 to 1 for normalized vectors (1 = most similar, higher = better); unbounded for unnormalized vectors
Performance: IP is ~10-20% faster than COSINE for normalized vectors
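To see how the three metrics relate, the following plain-Java sketch computes cosine similarity, Euclidean (L2) distance, and inner product for a pair of small vectors. It is illustrative only: the class and method names are hypothetical, nothing here calls Milvus, and how Milvus or LangChain4j rescales raw values into the reported score is not modeled.

```java
final class MetricComparison {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // Cosine similarity: inner product divided by the product of the norms.
    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (norm(a) * norm(b));
    }

    // Euclidean distance: 0 means identical, larger means further apart.
    static double l2(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Scale a vector to unit length.
    static double[] normalize(double[] a) {
        double n = norm(a);
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] / n;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {4.0, 3.0};
        System.out.println("cosine        = " + cosine(a, b));
        System.out.println("l2            = " + l2(a, b));
        System.out.println("ip (raw)      = " + dot(a, b));
        System.out.println("ip (unit len) = " + dot(normalize(a), normalize(b)));
    }
}
```

On unit-length vectors the inner product equals the cosine similarity, which is why IP can stand in for COSINE when embeddings are already normalized; on raw vectors the inner product also depends on magnitude and is not bounded by 1.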
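import dev.langchain4j.store.embedding.filter.Filter;
import static dev.langchain4j.store.embedding.filter.Filter.and;
import static dev.langchain4j.store.embedding.filter.Filter.or;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;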
// Efficient: Single equality
Filter efficientFilter = metadataKey("category").isEqualTo("tech");
// Less efficient: Complex nested conditions
Filter complexFilter = and(
or(
metadataKey("cat1").isEqualTo("a"),
metadataKey("cat2").isEqualTo("b")
),
metadataKey("year").isGreaterThan(2020)
);
Best Practices:
// Request only what you need
EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(10) // Not 100 if you only need 10
.build();
Impact: Each result requires deserialization and memory
// Use threshold to reduce result processing
EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(100)
.minScore(0.75) // Filter at query time
.build();
// Use Milvus SDK directly for advanced operations
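import io.milvus.client.MilvusServiceClient;
import io.milvus.grpc.GetCollectionStatisticsResponse;
import io.milvus.param.R;
import io.milvus.param.collection.GetCollectionStatisticsParam;
// Access through custom client pattern
MilvusServiceClient client = /* from store or create */;
GetCollectionStatisticsParam param = GetCollectionStatisticsParam.newBuilder()
.withCollectionName("my_collection")
.build();
// SDK calls return an R<T> wrapper; check the status before reading the data
R<GetCollectionStatisticsResponse> response = client.getCollectionStatistics(param);
if (response.getStatus() != R.Status.Success.getCode()) {
throw new IllegalStateException(response.getMessage());
}
GetCollectionStatisticsResponse stats = response.getData();
Configuration: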
Monitoring:
Resilience:
Security:
// Vector similarity with metadata filtering
Filter metadataFilter = metadataKey("department").isEqualTo("engineering");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(20)
.filter(metadataFilter) // Combines vector + scalar filtering
.build();
// Search multiple embeddings, aggregate results
List<Embedding> queryEmbeddings = List.of(embedding1, embedding2, embedding3);
Map<String, Double> aggregatedScores = new HashMap<>();
for (Embedding queryEmb : queryEmbeddings) {
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmb)
.maxResults(50)
.build();
for (EmbeddingMatch<TextSegment> match : store.search(request).matches()) {
aggregatedScores.merge(
match.embeddingId(),
match.score(),
(existing, newScore) -> Math.max(existing, newScore)
);
}
}
// Get top results from aggregated scores
List<String> topIds = aggregatedScores.entrySet().stream()
.sorted(Map.Entry.<String, Double>comparingByValue().reversed())
.limit(10)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
// Use Milvus SDK for advanced collection management
import io.milvus.param.collection.CreateAliasParam;
// Point alias to active collection version
CreateAliasParam param = CreateAliasParam.newBuilder()
.withCollectionName("embeddings_v2")
.withAlias("embeddings_active")
.build();
client.createAlias(param);
// Use alias in store
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.host("localhost")
.collectionName("embeddings_active") // Uses alias
.dimension(384)
.build();