Milvus embedding store integration for LangChain4j
Methods for deleting embeddings from Milvus collections.
Requirements: Milvus 2.3.x or newer

void removeAll(Collection<String> ids);
Removes embeddings with the specified IDs.
Throws: IllegalArgumentException if ids is null or empty
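
Because the call rejects null or empty input, callers that collect IDs dynamically may want a guard in front of it. The following is a minimal sketch using only the JDK; `SafeRemoval` and `removeIfAny` are hypothetical names, not part of LangChain4j, and the `remover` argument stands in for the store's ID-based removal so the example runs without a Milvus server.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of LangChain4j.
final class SafeRemoval {

    // Calls the remover only when there is at least one ID.
    // Returns true if the remover was invoked, false if the call was skipped.
    static boolean removeIfAny(Collection<String> ids, Consumer<Collection<String>> remover) {
        if (ids == null || ids.isEmpty()) {
            return false; // nothing to delete, so skip the call that would throw
        }
        remover.accept(ids);
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the store's ID-based removeAll (an assumption for this sketch).
        Consumer<Collection<String>> remover = ids -> System.out.println("removing " + ids);

        System.out.println(removeIfAny(Arrays.asList("id1", "id2"), remover));
        System.out.println(removeIfAny(Collections.emptyList(), remover));
        System.out.println(removeIfAny(null, remover));
    }
}
```

With a real store, the lambda would presumably be replaced by a call such as `ids -> store.removeAll(ids)`.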
import java.util.Arrays;
import java.util.List;

List<String> idsToRemove = Arrays.asList("id1", "id2", "id3");
store.removeAll(idsToRemove);

void removeAll(Filter filter);
Removes all embeddings matching a metadata filter.
Throws: IllegalArgumentException if filter is null
Requires: Consistency level BOUNDED for complex filters
import dev.langchain4j.store.embedding.filter.Filter;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

Filter filter = metadataKey("status").isEqualTo("archived");
store.removeAll(filter);

void removeAll();
Removes all embeddings from the collection.
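store.removeAll();

// Remove every embedding whose "category" metadata equals "temporary"
Filter filter = metadataKey("category").isEqualTo("temporary");
store.removeAll(filter);

// Remove embeddings whose "year" metadata is below 2020
Filter oldFilter = metadataKey("year").isLessThan(2020);
store.removeAll(oldFilter);

// Combine conditions: both must match
import static dev.langchain4j.store.embedding.filter.Filter.and;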
Filter old = metadataKey("year").isLessThan(2021);
Filter draft = metadataKey("status").isEqualTo("draft");
store.removeAll(and(old, draft));

// Combine conditions: either may match
import static dev.langchain4j.store.embedding.filter.Filter.or;
Filter year2019 = metadataKey("year").isEqualTo(2019);
Filter year2020 = metadataKey("year").isEqualTo(2020);
store.removeAll(or(year2019, year2020));

List<String> processedIds = new ArrayList<>();
for (EmbeddingMatch<TextSegment> match : results.matches()) {
    processItem(match);
    processedIds.add(match.embeddingId());
}
store.removeAll(processedIds);

Set<String> idsToDelete = new HashSet<>();
idsToDelete.add("doc-123");
idsToDelete.add("doc-456");
store.removeAll(idsToDelete);

// Find and remove duplicates
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
        .queryEmbedding(referenceEmbedding)
        .maxResults(100)
        .minScore(0.99) // Very high similarity
        .build();
List<String> duplicateIds = store.search(request).matches().stream()
        .skip(1) // Keep first
        .map(EmbeddingMatch::embeddingId)
        .collect(Collectors.toList());
if (!duplicateIds.isEmpty()) {
    store.removeAll(duplicateIds);
}

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;
import java.time.LocalDate;
int cutoffYear = LocalDate.now().getYear() - 2;
Filter oldFilter = metadataKey("year").isLessThan(cutoffYear);
store.removeAll(oldFilter);

Filter sourceFilter = metadataKey("source").isEqualTo("deprecated_api_v1");
store.removeAll(sourceFilter);

import static dev.langchain4j.store.embedding.filter.Filter.and;
Filter lowConfidence = metadataKey("confidence").isLessThan(0.5);
Filter temporary = metadataKey("type").isEqualTo("temporary");
store.removeAll(and(lowConfidence, temporary));

List<String> expiredIds = externalSystem.getExpiredIds();
if (!expiredIds.isEmpty()) {
    store.removeAll(expiredIds);
}

store.removeAll(); // Clear all
// Rebuild
List<Embedding> newEmbeddings = generateNewEmbeddings();
store.addAll(newEmbeddings);

Problem: Deleted items are still visible after deletion
Cause: Consistency level below STRONG
Solution:
import io.milvus.common.clientenum.ConsistencyLevelEnum;

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
        .host("localhost")
        .collectionName("my_collection")
        .dimension(384)
        .consistencyLevel(ConsistencyLevelEnum.STRONG) // Required
        .build();
store.removeAll(idsToDelete);
// The deleted embeddings are now immediately invisible

Problem: Filter-based deletion fails
Cause: Requires BOUNDED consistency for complex filters
Solution:
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
        // connection and collection settings omitted here
        .consistencyLevel(ConsistencyLevelEnum.BOUNDED) // Required
        .build();
Filter filter = metadataKey("status").isEqualTo("archived");
store.removeAll(filter);

Warning: Filter-based deletions are NOT atomic. If the operation fails partway, some data may already have been deleted.
try {
    Filter complexFilter = buildComplexFilter();
    store.removeAll(complexFilter);
} catch (Exception e) {
    // Some entities may already have been deleted
    logPartialDeletionError(e);
}

Warning: Frequent deletions impact system performance.
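
One way to keep the number of delete calls down is to queue IDs and remove them in batches. The sketch below shows the idea with plain JDK types; `DeletionBuffer` is a hypothetical helper, not part of LangChain4j, the batch size is an arbitrary illustration value, and the `remover` argument stands in for the store's ID-based removal.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of LangChain4j.
final class DeletionBuffer {
    private final int batchSize;
    private final Consumer<Collection<String>> remover;
    private final List<String> pending = new ArrayList<>();

    DeletionBuffer(int batchSize, Consumer<Collection<String>> remover) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.remover = remover;
    }

    // Queues an ID; one removal call is issued once batchSize IDs have accumulated.
    void add(String id) {
        pending.add(id);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Issues one removal call for everything queued; does nothing when the queue is empty.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        remover.accept(new ArrayList<>(pending)); // pass a copy so clearing does not affect the callee
        pending.clear();
    }

    public static void main(String[] args) {
        // Stand-in for the store's ID-based removeAll (an assumption for this sketch).
        DeletionBuffer buffer = new DeletionBuffer(2, ids -> System.out.println("removing " + ids));
        buffer.add("doc-1");
        buffer.add("doc-2"); // reaches the batch size and triggers one call
        buffer.add("doc-3");
        buffer.flush();      // sends the remaining ID
    }
}
```

The explicit `flush()` at the end matters: without it, IDs below the batch threshold would stay queued and never be removed.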
Best Practices:
Warning: Entities deleted beyond Time Travel retention cannot be retrieved.
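
Since deleted entities may not be recoverable, some applications keep their own record of what was removed so the data can be re-embedded from its source if needed. The following is a rough sketch of that idea, not a feature of LangChain4j or Milvus: `DeletionLog` and `recordThenRemove` are hypothetical names, the log is a plain text file with one ID per line, and the `remover` argument stands in for the store's ID-based removal.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Collection;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of LangChain4j.
final class DeletionLog {

    // Appends the IDs to logFile (one per line) and only then calls the remover.
    // If writing the log fails, the exception propagates and nothing is removed.
    static void recordThenRemove(Collection<String> ids, Path logFile,
                                 Consumer<Collection<String>> remover) throws IOException {
        Files.write(logFile, ids, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        remover.accept(ids);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("deleted-ids", ".log");
        // Stand-in for the store's ID-based removeAll (an assumption for this sketch).
        recordThenRemove(Arrays.asList("doc-123", "doc-456"), log,
                ids -> System.out.println("removing " + ids));
        System.out.println(Files.readAllLines(log, StandardCharsets.UTF_8));
        Files.delete(log);
    }
}
```

Writing the log before removing is a deliberate ordering: a failed removal leaves a harmless extra log entry, whereas the reverse order could lose track of IDs that were already deleted.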
// Preferred: Direct ID removal
List<String> ids = Arrays.asList("id1", "id2", "id3");
store.removeAll(ids);
// Slower: Filter-based
Filter filter = metadataKey("status").isEqualTo("delete");
store.removeAll(filter);

// Preview what will be deleted
Filter filter = metadataKey("status").isEqualTo("archived");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
        .queryEmbedding(sampleEmbedding)
        .maxResults(1000)
        .filter(filter)
        .build();
// The count is capped at maxResults, so 1000 here means "1000 or more"
int count = store.search(request).matches().size();
System.out.println("Will delete " + count + " embeddings");
// Confirm then delete
if (confirmed) {
    store.removeAll(filter);
}

// For immediate visibility (slower)
.consistencyLevel(ConsistencyLevelEnum.STRONG)
// For eventual consistency (faster)
.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)
// For filter deletions (required)
.consistencyLevel(ConsistencyLevelEnum.BOUNDED)

List<String> idsToDelete = new ArrayList<>();
for (Document doc : documentsToDelete) {
    idsToDelete.add(doc.getId());
}
// Single batch deletion
if (!idsToDelete.isEmpty()) {
    store.removeAll(idsToDelete);
}

// Safe to call even if no matches
Filter filter = metadataKey("type").isEqualTo("nonexistent");
store.removeAll(filter); // No error if no matches

try {
    List<String> ids = Arrays.asList("id1", "id2", "id3");
    store.removeAll(ids);
    System.out.println("Successfully removed");
} catch (IllegalArgumentException e) {
    System.err.println("Invalid input: " + e.getMessage());
} catch (Exception e) {
    System.err.println("Failed to remove: " + e.getMessage());
}

ID-based removal:
Filter-based removal:
removeAll():
Consistency levels:
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-milvus