tessl/maven-dev-langchain4j--langchain4j-milvus

Milvus embedding store integration for LangChain4j

Troubleshooting

Common issues and solutions for MilvusEmbeddingStore.

Deletion Visibility Issues

Problem: Deleted embeddings still appear in search

Embeddings were deleted by ID but still appear in search results.

Cause: Consistency level below STRONG

Solution:

import io.milvus.common.clientenum.ConsistencyLevelEnum;

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection")
    .dimension(384)
    .consistencyLevel(ConsistencyLevelEnum.STRONG)  // Required for immediate visibility
    .build();

store.removeAll(idsToDelete);
// Now immediately invisible in search

Alternative: Keep a weaker consistency level and wait for the deletion to become visible (about 1 second with EVENTUALLY).
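If you take the waiting route, a fixed sleep can be replaced by polling with a timeout. The following is a minimal sketch using only the JDK; `VisibilityWait` and `waitUntil` are hypothetical names, and the condition you pass in (for example, a search that checks none of the deleted IDs come back) is left to the caller.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class VisibilityWait {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: real code would run a search and check that
        // none of the deleted IDs are returned.
        boolean visible = waitUntil(
                () -> System.currentTimeMillis() - start >= 50,
                Duration.ofSeconds(2),
                Duration.ofMillis(10));
        System.out.println("Deletion visible: " + visible);
    }
}
```

A bounded timeout keeps a test or batch job from hanging if the deletion never becomes visible.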

Problem: Filter-based deletion fails

Exception: Complex boolean expressions not supported

Cause: Consistency level not set to BOUNDED

Solution:

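import dev.langchain4j.store.embedding.filter.Filter;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()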
    .host("localhost")
    .collectionName("my_collection")
    .dimension(384)
    .consistencyLevel(ConsistencyLevelEnum.BOUNDED)  // Required
    .build();

Filter filter = metadataKey("status").isEqualTo("archived");
store.removeAll(filter);

Problem: Partial deletion after error

Deletion failed, but some data was deleted

Cause: Filter-based deletions are non-atomic

Solution:

// Use ID-based deletion for atomicity
List<String> idsToDelete = /* collect IDs first */;

try {
    store.removeAll(idsToDelete);  // More atomic than filter-based
} catch (Exception e) {
    // Handle error, log for reconciliation
    logDeletionFailure(idsToDelete, e);
}
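For large ID lists, one way to make a partial failure easier to reconcile is to delete in chunks and record which chunks failed. This is a sketch under assumptions: `ChunkedDeletion` and the chunk size are hypothetical, and the `deleter` callback stands in for a call such as `chunk -> store.removeAll(chunk)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedDeletion {

    /**
     * Passes the IDs to the deleter in chunks of at most chunkSize.
     * Returns the IDs of every chunk whose deletion threw, so they can be retried.
     */
    static List<String> deleteInChunks(List<String> ids, int chunkSize,
                                       Consumer<List<String>> deleter) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> failed = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            List<String> chunk = ids.subList(from, Math.min(from + chunkSize, ids.size()));
            try {
                deleter.accept(chunk);
            } catch (RuntimeException e) {
                failed.addAll(chunk);  // keep going; report failures at the end
            }
        }
        return failed;
    }
}
```

The returned list is the reconciliation input: retry it later or log it, instead of guessing which IDs were removed.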

Dimension Mismatch Errors

Problem: Dimension mismatch when adding

Exception: Embedding dimension 768 doesn't match collection dimension 384

Cause: Embedding dimension differs from collection

Solution:

// Verify embedding dimension matches collection
int collectionDimension = 384;  // Must match embedding model

Embedding embedding = model.embed(text).content();
if (embedding.dimension() != collectionDimension) {
    throw new IllegalArgumentException(
        "Embedding dimension " + embedding.dimension() + 
        " doesn't match collection dimension " + collectionDimension
    );
}

store.add(embedding, segment);
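When adding many embeddings at once, it can help to find every mismatched vector before sending anything. The sketch below works on raw `float[]` vectors so it stays independent of any library; `DimensionCheck` is a hypothetical helper, and with LangChain4j you would presumably pass the arrays returned by `Embedding.vector()`.

```java
import java.util.ArrayList;
import java.util.List;

final class DimensionCheck {

    /** Returns the positions of all vectors whose length differs from the expected dimension. */
    static List<Integer> mismatchedIndexes(List<float[]> vectors, int expectedDimension) {
        List<Integer> mismatched = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            if (vectors.get(i).length != expectedDimension) {
                mismatched.add(i);
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        List<float[]> vectors = List.of(new float[384], new float[768], new float[384]);
        System.out.println("Mismatched positions: " + mismatchedIndexes(vectors, 384));
    }
}
```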

Problem: Wrong dimension at collection creation

Exception: Cannot change dimension of existing collection

Cause: Collection exists with different dimension

Solutions:

Option 1: Drop and recreate

store.dropCollection("my_collection");

MilvusEmbeddingStore newStore = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection")
    .dimension(768)  // New dimension
    .build();

Option 2: Use different collection name

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection_v2")  // Different name
    .dimension(768)
    .build();

Connection Problems

Problem: Connection refused

Exception: Failed to connect to Milvus at localhost:19530

Checks:

  1. Verify Milvus is running
  2. Check host/port
  3. Verify network access

Solution:

// Verify connection parameters
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")  // Correct host?
    .port(19530)        // Correct port? Default is 19530
    .collectionName("my_collection")
    .dimension(384)
    .build();

Docker check:

# Verify Milvus is running
docker ps | grep milvus

# Check Milvus logs
docker logs milvus-standalone
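To separate network problems from configuration problems, a plain TCP probe from the application host can confirm whether the port is reachable at all. This is a JDK-only sketch with a hypothetical `PortProbe` class; a successful connect only shows that something is listening on the port, not that Milvus is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 19530 is the default port mentioned above.
        System.out.println("Port reachable: " + isReachable("localhost", 19530, 2000));
    }
}
```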

Problem: Authentication failed

Exception: Authentication failed

Solution:

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .port(19530)
    .username("admin")      // Correct username
    .password("password")   // Correct password
    .collectionName("my_collection")
    .dimension(384)
    .build();

Problem: Zilliz Cloud connection fails

Exception: Failed to connect to Zilliz Cloud

Solution:

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .uri("https://xxx.api.gcp-us-west1.zillizcloud.com")  // Full URI with https://
    .token("your-actual-api-key")  // Check token validity
    .collectionName("my_collection")
    .dimension(384)
    .build();

Checks:

  • Verify URI includes https://
  • Verify API token is valid
  • Check Zilliz Cloud dashboard for cluster status
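The first check in the list can be automated before the store is built. The sketch below is an assumption-laden helper (`EndpointUriCheck` is not part of any library) that reports the most common URI mistakes using `java.net.URI`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

final class EndpointUriCheck {

    /** Returns a description of the first problem found, or empty if the URI looks usable. */
    static Optional<String> problemWith(String uri) {
        if (uri == null || uri.isBlank()) {
            return Optional.of("URI is empty");
        }
        URI parsed;
        try {
            parsed = new URI(uri.trim());
        } catch (URISyntaxException e) {
            return Optional.of("URI is malformed: " + e.getReason());
        }
        if (parsed.getScheme() == null) {
            return Optional.of("URI has no scheme; expected https://");
        }
        if (!"https".equalsIgnoreCase(parsed.getScheme())) {
            return Optional.of("URI scheme is '" + parsed.getScheme() + "'; expected https");
        }
        if (parsed.getHost() == null) {
            return Optional.of("URI has no host");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(problemWith("xxx.api.gcp-us-west1.zillizcloud.com"));
        System.out.println(problemWith("https://xxx.api.gcp-us-west1.zillizcloud.com"));
    }
}
```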

Performance Issues

Problem: Slow insertion

Adding embeddings is very slow

Solutions:

1. Disable auto-flush:

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection")
    .dimension(384)
    .autoFlushOnInsert(false)  // Critical for performance
    .build();

2. Use batch operations:

// Slow: Individual adds
for (Embedding emb : embeddings) {
    store.add(emb);
}

// Fast: Batch add
store.addAll(embeddings);

3. Use eventual consistency:

.consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)

Problem: Slow search

Search queries are taking too long

Solutions:

1. Optimize index:

import io.milvus.param.IndexType;

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection")
    .dimension(384)
    .indexType(IndexType.HNSW)  // Faster than FLAT
    .extraParameters(Map.of("efConstruction", 200, "M", 16))  // Milvus names the HNSW parameter "M" (uppercase)
    .build();

2. Disable embedding retrieval:

.retrieveEmbeddingsOnSearch(false)  // Don't retrieve vectors

3. Reduce result count:

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)  // Only what you need
    .minScore(0.75)  // Filter early
    .build();

4. Use simpler filters:

// Fast: Simple equality
Filter fast = metadataKey("category").isEqualTo("tech");

// Slower: Complex nested filters
Filter slow = and(or(filter1, filter2), or(filter3, filter4));

Problem: High memory usage

Milvus consuming too much memory

Solutions:

1. Use memory-efficient index:

import io.milvus.param.IndexType;

.indexType(IndexType.IVF_PQ)
.extraParameters(Map.of("m", 8, "nlist", 2048))

2. Disable embedding retrieval:

.retrieveEmbeddingsOnSearch(false)

3. Consider disk-based index:

.indexType(IndexType.DISKANN)

Configuration Errors

Problem: Missing dimension parameter

Exception: dimension must be set for new collection

Solution:

MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .collectionName("my_collection")
    .dimension(384)  // REQUIRED for new collections
    .build();

Problem: Invalid index parameters

Exception: Invalid index parameter 'm'

Cause: Parameter doesn't match index type

Solution:

// HNSW parameters
.indexType(IndexType.HNSW)
.extraParameters(Map.of(
    "efConstruction", 200,  // Valid for HNSW
    "M", 16                 // Valid for HNSW (uppercase; lowercase "m" belongs to IVF_PQ)
))

// IVF parameters
.indexType(IndexType.IVF_FLAT)
.extraParameters(Map.of(
    "nlist", 1024  // Valid for IVF
))

// Don't mix parameters from different index types

Problem: Dimension not divisible by m for IVF_PQ

Exception: dimension must be divisible by m

Solution:

// If the dimension is 384, m must be a divisor of 384: 2, 3, 4, 6, 8, 12, etc.
.dimension(384)
.indexType(IndexType.IVF_PQ)
.extraParameters(Map.of(
    "m", 8,  // 384 / 8 = 48 ✓
    "nlist", 2048
))

// Invalid: m=7 because 384/7 is not an integer
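To pick a valid `m` for an arbitrary dimension, the divisors can simply be enumerated. This is a small arithmetic sketch; `PqSubvectors` is a hypothetical name and the code only encodes the divisibility rule stated above, not any other constraint Milvus may impose on `m`.

```java
import java.util.ArrayList;
import java.util.List;

final class PqSubvectors {

    /** Returns every m in ascending order for which dimension % m == 0. */
    static List<Integer> validM(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        List<Integer> divisors = new ArrayList<>();
        for (int m = 1; m <= dimension; m++) {
            if (dimension % m == 0) {
                divisors.add(m);
            }
        }
        return divisors;
    }

    public static void main(String[] args) {
        System.out.println("Valid m for 384: " + validM(384));
        System.out.println("Valid m for 768: " + validM(768));
    }
}
```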

Search Result Issues

Problem: Empty search results

Search returns no results but collection has data

Checks:

1. Verify data was added:

// Check if add succeeded
String id = store.add(embedding, segment);
System.out.println("Added with ID: " + id);

2. Check consistency level:

// May need STRONG for immediate visibility
.consistencyLevel(ConsistencyLevelEnum.STRONG)

3. Verify query embedding dimension:

Embedding queryEmbedding = model.embed(query).content();
System.out.println("Query dimension: " + queryEmbedding.dimension());
// Must match collection dimension

4. Check score threshold:

// Try without threshold first
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    // .minScore(0.7)  // Remove to test
    .build();

5. Verify filter doesn't exclude all:

// Test without filter
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    // .filter(filter)  // Remove to test
    .build();

Problem: Incorrect similarity scores

Scores don't match expected similarity

Cause: Each metric type uses a different score range and direction

Understanding scores:

COSINE:

  • Range: 0 to 1
  • 1 = identical
  • Higher = more similar

L2:

  • Range: 0 to ∞
  • 0 = identical
  • Lower = more similar

IP:

  • Range: -1 to 1 for normalized vectors (unbounded otherwise)
  • 1 = identical (for normalized)
  • Higher = more similar

Solution: Interpret based on metric type:

import io.milvus.param.MetricType;

// Check configured metric
MilvusEmbeddingStore store = MilvusEmbeddingStore.builder()
    .metricType(MetricType.COSINE)  // Default
    .build();

// Interpret scores accordingly
for (EmbeddingMatch<TextSegment> match : results.matches()) {
    double score = match.score();
    // For COSINE: score close to 1 is similar
    // For L2: score close to 0 is similar
}
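To build intuition for how the three metrics differ, they can be computed by hand for a pair of small vectors. The sketch below uses the textbook definitions only; `VectorMetrics` is a hypothetical class, and the values a store returns may be rescaled versions of these raw numbers.

```java
final class VectorMetrics {

    /** Inner product (IP): sum of element-wise products. */
    static double innerProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Cosine similarity: inner product divided by the product of the vector norms. */
    static double cosine(float[] a, float[] b) {
        return innerProduct(a, b)
                / (Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b)));
    }

    /** Euclidean (L2) distance: square root of the summed squared differences. */
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        float[] a = {2f, 0f};
        float[] b = {3f, 0f};
        // Same direction, different lengths: the three metrics disagree on "how similar".
        System.out.println("cosine = " + cosine(a, b));
        System.out.println("IP     = " + innerProduct(a, b));
        System.out.println("L2     = " + l2Distance(a, b));
    }
}
```

Vectors that point the same way but differ in length illustrate why IP is only comparable to cosine when the vectors are normalized.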

Data Integrity Issues

Problem: Cannot find previously added embedding

Added embedding but can't find it by ID

Solutions:

1. Save returned ID:

String id = store.add(embedding, segment);
System.out.println("Saved with ID: " + id);  // Log for debugging

// Later, use exact ID
store.removeAll(Arrays.asList(id));

2. Use custom IDs:

String customId = "doc-123";
store.add(customId, embedding);

// Later, remove with the exact ID
store.removeAll(Arrays.asList("doc-123"));

Problem: Duplicate entries

Same embedding added multiple times

Solution: Track IDs or check before adding:

Set<String> addedIds = new HashSet<>();

for (Document doc : documents) {
    // Set.add returns false when the ID was already seen, so duplicates are skipped
    if (addedIds.add(doc.getId())) {
        store.add(doc.getId(), embedding);  // embedding: the vector computed for this doc
    }
}
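When documents have no natural ID, one option is to derive the ID from the content so that identical text always maps to the same ID. The sketch below uses the JDK's name-based UUIDs; `ContentIds` is a hypothetical helper, and it still de-duplicates in memory because this guide does not say whether re-adding an existing ID overwrites or duplicates the entry.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

final class ContentIds {

    /** Derives a stable, name-based UUID string from the text content. */
    static String idFor(String text) {
        return UUID.nameUUIDFromBytes(text.getBytes(StandardCharsets.UTF_8)).toString();
    }

    /** Maps each distinct text to its ID, keeping the first occurrence and input order. */
    static Map<String, String> uniqueById(List<String> texts) {
        Map<String, String> unique = new LinkedHashMap<>();
        for (String text : texts) {
            unique.putIfAbsent(idFor(text), text);
        }
        return unique;
    }

    public static void main(String[] args) {
        Map<String, String> unique = uniqueById(List.of("alpha", "beta", "alpha"));
        unique.forEach((id, text) -> System.out.println(id + " -> " + text));
    }
}
```

Each entry of the resulting map can then be embedded once and added under its derived ID.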

Error Handling Patterns

Robust Operation Pattern

int maxRetries = 3;
for (int attempt = 1; attempt <= maxRetries; attempt++) {
    try {
        String id = store.add(embedding, segment);
        System.out.println("Success: " + id);
        break;
    } catch (Exception e) {
        if (attempt == maxRetries) {
            System.err.println("Failed after " + maxRetries + " attempts");
            throw e;
        }
        System.err.println("Attempt " + attempt + " failed, retrying...");
        Thread.sleep(1000 * attempt);  // Linear backoff: 1 s, then 2 s
    }
}
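A delay of `1000 * attempt` grows linearly. If a doubling delay with an upper bound is preferred, the retry logic can be factored into a reusable helper. This is a JDK-only sketch; `Retry`, its parameters, and the choice to retry on any `RuntimeException` are assumptions, and real code would likely retry only on errors known to be transient.

```java
import java.util.function.Supplier;

final class Retry {

    /** Delay before the next try: base, 2*base, 4*base, ... capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt - 1, 20);  // cap the shift to avoid overflow
        return Math.min(delay, maxMillis);
    }

    /** Runs the operation up to maxAttempts times, sleeping with exponential backoff between tries. */
    static <T> T withBackoff(Supplier<T> operation, int maxAttempts,
                             long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw last;  // every attempt failed; surface the final error
    }
}
```

A call might look like `Retry.withBackoff(() -> store.add(embedding, segment), 3, 1000, 8000)`, which would wait 1 s and then 2 s between the three attempts.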

Logging for Debugging

import java.util.logging.Logger;

Logger logger = Logger.getLogger("MilvusStore");

try {
    String id = store.add(embedding, segment);
    logger.info("Added embedding: " + id);
} catch (Exception e) {
    logger.severe("Failed to add embedding: " + e.getMessage());
    logger.severe("Embedding dimension: " + embedding.dimension());
    logger.severe("Collection: " + collectionName);
    throw e;
}

Common Warning Messages

"Collection not loaded"

Solution: The integration loads the collection automatically. If you use the Milvus SDK directly, load the collection yourself:

import io.milvus.param.collection.LoadCollectionParam;

LoadCollectionParam param = LoadCollectionParam.newBuilder()
    .withCollectionName("my_collection")
    .build();

client.loadCollection(param);

"Time travel retention period exceeded"

Meaning: Trying to query historical data beyond retention

Solution: Adjust Milvus retention settings or query more recent data

Getting Help

When reporting issues, include:

  1. Milvus version (required: 2.4.20+)
  2. Java version
  3. langchain4j-milvus version
  4. Configuration used (sanitized)
  5. Complete error message
  6. Minimal reproduction code

Useful diagnostics:

System.out.println("Embedding dimension: " + embedding.dimension());
System.out.println("Collection: " + collectionName);
System.out.println("Java version: " + System.getProperty("java.version"));
