Milvus embedding store integration for LangChain4j
Integration between LangChain4j and the Milvus vector database for storing and searching embedding vectors together with their text content and metadata.
Maven:
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-milvus</artifactId>
<version>1.11.0-beta19</version>
</dependency>

Gradle:
implementation 'dev.langchain4j:langchain4j-milvus:1.11.0-beta19'

Requirements: Milvus 2.4.20 or newer

Local Milvus:
import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.host("localhost")
.port(19530)
.collectionName("my_embeddings")
.dimension(384)
.build();
// Add embedding
String id = store.add(embedding, textSegment);
// Search
EmbeddingSearchResult<TextSegment> results = store.search(searchRequest);

Zilliz Cloud:
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.uri("https://xxx.api.gcp-us-west1.zillizcloud.com")
.token("your-api-key")
.collectionName("my_embeddings")
.dimension(384)
.build();

import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import io.milvus.common.clientenum.ConsistencyLevelEnum;

class MilvusEmbeddingStore implements EmbeddingStore<TextSegment> {
static Builder builder();
// Adding
String add(Embedding embedding);
void add(String id, Embedding embedding);
String add(Embedding embedding, TextSegment textSegment);
List<String> addAll(List<Embedding> embeddings);
void addAll(List<String> ids, List<Embedding> embeddings, List<TextSegment> textSegments);
// Searching
EmbeddingSearchResult<TextSegment> search(EmbeddingSearchRequest request);
// Removing
void removeAll(Collection<String> ids);
void removeAll(Filter filter);
void removeAll();
// Management
void dropCollection(String collectionName);
}

class MilvusEmbeddingStore.Builder {
// Connection - Local Milvus
Builder host(String host);
Builder port(Integer port);
// Connection - Zilliz Cloud
Builder uri(String uri);
Builder token(String token);
// Authentication
Builder username(String username);
Builder password(String password);
// Database
Builder databaseName(String databaseName);
Builder milvusClient(MilvusServiceClient milvusClient);
// Collection Configuration
Builder collectionName(String collectionName);
Builder dimension(Integer dimension);
Builder indexType(IndexType indexType);
Builder metricType(MetricType metricType);
Builder consistencyLevel(ConsistencyLevelEnum consistencyLevel);
Builder extraParameters(Map<String, Object> extraParameters);
// Field Names
Builder idFieldName(String idFieldName);
Builder textFieldName(String textFieldName);
Builder metadataFieldName(String metadataFieldName);
Builder vectorFieldName(String vectorFieldName);
// Behavior
Builder retrieveEmbeddingsOnSearch(Boolean retrieveEmbeddingsOnSearch);
Builder autoFlushOnInsert(Boolean autoFlushOnInsert);
MilvusEmbeddingStore build();
}

// Single with auto-generated ID
String id = store.add(embedding);
// Single with custom ID
store.add("custom-id", embedding);
// With text and metadata
TextSegment segment = TextSegment.from("text", metadata);
String segmentId = store.add(embedding, segment); // distinct name: `id` is already declared above
// Batch with auto-generated IDs
List<String> ids = store.addAll(embeddings);
// Batch with custom IDs and segments
store.addAll(ids, embeddings, segments);

import dev.langchain4j.store.embedding.filter.Filter;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;
// Basic search
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(10)
.minScore(0.7)
.build();
EmbeddingSearchResult<TextSegment> results = store.search(request);
// With metadata filter
Filter filter = metadataKey("category").isEqualTo("docs");
EmbeddingSearchRequest filteredRequest = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(10)
.filter(filter)
.build();
// Process results
for (EmbeddingMatch<TextSegment> match : results.matches()) {
double score = match.score();
String id = match.embeddingId();
TextSegment segment = match.embedded();
}

Details: operations/searching.md

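To make the roles of `maxResults` and `minScore` in the search request concrete, here is a minimal stand-alone sketch of the selection they describe: keep candidates whose score is at least `minScore`, order them best first, and return at most `maxResults`. It assumes that a higher score means a closer match. The class `SearchSemanticsSketch`, the record `Scored`, and the method `select` are hypothetical names used only for illustration; the actual store delegates the search to Milvus rather than filtering in application code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: mimics the effect of maxResults and minScore on a list
// of already-scored candidates. Not part of LangChain4j or Milvus.
final class SearchSemanticsSketch {

    // Hypothetical stand-in for a search hit: an ID plus its relevance score.
    record Scored(String id, double score) {}

    static List<Scored> select(List<Scored> candidates, int maxResults, double minScore) {
        return candidates.stream()
                .filter(c -> c.score() >= minScore)                            // drop weak matches
                .sorted(Comparator.comparingDouble(Scored::score).reversed())  // best match first
                .limit(maxResults)                                             // cap the result size
                .collect(Collectors.toList());
    }
}
```

With candidates scored 0.9, 0.65 and 0.8, `maxResults` of 10 and `minScore` of 0.7, the sketch keeps the 0.9 and 0.8 entries in that order.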
// By IDs
store.removeAll(Arrays.asList("id1", "id2", "id3"));
// By filter
Filter filter = metadataKey("status").isEqualTo("archived");
store.removeAll(filter);
// All
store.removeAll();

Details: operations/removing.md

// Drop collection (permanent deletion)
store.dropCollection("collection_name");

Details: operations/collection-management.md

Local Milvus:
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.host("localhost") // default: "localhost"
.port(19530) // default: 19530
.collectionName("my_coll")
.dimension(384)
.build();

Zilliz Cloud:
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.uri("https://xxx.api.gcp-us-west1.zillizcloud.com")
.token("your-api-key")
.collectionName("my_coll")
.dimension(384)
.build();

With Authentication:
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
.host("localhost")
.port(19530)
.username("admin")
.password("password")
.collectionName("my_coll")
.dimension(384)
.build();

import io.milvus.param.IndexType;
.indexType(IndexType.FLAT) // Exact search (default)
.indexType(IndexType.IVF_FLAT) // Balanced performance
.indexType(IndexType.IVF_PQ) // Memory efficient
.indexType(IndexType.HNSW) // High performance
.indexType(IndexType.DISKANN) // Very large datasets

import io.milvus.param.MetricType;
.metricType(MetricType.COSINE) // Cosine similarity (default)
.metricType(MetricType.L2) // Euclidean distance
.metricType(MetricType.IP) // Inner product

import io.milvus.common.clientenum.ConsistencyLevelEnum;
.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY) // default, best performance
.consistencyLevel(ConsistencyLevelEnum.BOUNDED) // bounded staleness
.consistencyLevel(ConsistencyLevelEnum.SESSION) // per-session consistency
.consistencyLevel(ConsistencyLevelEnum.STRONG) // immediate consistency

Full Configuration Reference: configuration.md

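The three metric types listed above (COSINE, L2, IP) rank vectors differently, so it can help to see what each one measures. The sketch below computes the textbook definitions in plain Java. It is not Milvus's implementation, and the values Milvus reports internally may be scaled or transformed differently; the class name `MetricSketch` and its methods are hypothetical.

```java
// Illustrative only: textbook definitions of the three similarity metrics.
final class MetricSketch {

    private MetricSketch() {}

    // Inner product (cf. MetricType.IP): larger means more similar.
    static double innerProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Euclidean distance (cf. MetricType.L2): smaller means more similar.
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity (cf. MetricType.COSINE): inner product divided by the
    // product of the vector lengths, so it lies in [-1, 1] and ignores magnitude.
    static double cosineSimilarity(float[] a, float[] b) {
        double norms = Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return innerProduct(a, b) / norms;
    }

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors differ in length: " + a.length + " vs " + b.length);
        }
    }
}
```

Because cosine similarity ignores vector length, COSINE and IP produce the same ranking when all vectors are normalized to unit length.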
If the collection does not exist when the store is built, it is created automatically. This requires the dimension parameter to be set.
All embeddings must match the collection's configured dimension. Set dimension based on your embedding model.
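Since every vector must match the collection's dimension, one option is to check a batch before inserting it, so that a mismatch is reported with a clear message. The following is a minimal sketch under that assumption; `DimensionCheck` and `requireDimension` are hypothetical helpers, not part of LangChain4j. With LangChain4j you would typically pass the arrays returned by `Embedding.vector()`.

```java
import java.util.List;

// Illustrative only: fail fast when a vector does not match the configured dimension.
final class DimensionCheck {

    private DimensionCheck() {}

    static void requireDimension(List<float[]> vectors, int expectedDimension) {
        for (int i = 0; i < vectors.size(); i++) {
            int actual = vectors.get(i).length;
            if (actual != expectedDimension) {
                throw new IllegalArgumentException(
                        "Vector " + i + " has dimension " + actual
                                + ", expected " + expectedDimension);
            }
        }
    }
}
```

For a store built with `.dimension(384)`, calling `DimensionCheck.requireDimension(vectors, 384)` before `addAll` would reject a batch that contains a vector from a different embedding model.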
add(embedding) → returns UUID
add(id, embedding) → specify your own ID

Store metadata with TextSegment for filtering during search:
Metadata metadata = Metadata.from(Map.of(
"category", "technical",
"year", 2024
));
TextSegment segment = TextSegment.from("text", metadata);
store.add(embedding, segment);
// Later, filter in search
Filter filter = metadataKey("category").isEqualTo("technical");

.autoFlushOnInsert(false) // default, better batch performance
.retrieveEmbeddingsOnSearch(false) // default, faster search

See: patterns.md
See: advanced.md