
tessl/maven-dev-langchain4j--langchain4j

Build LLM-powered applications in Java with support for chatbots, agents, RAG, tools, and much more


docs/embedding-store.md

Embedding Store

In-memory implementation of embedding store for vector similarity search. Supports adding embeddings with associated objects, finding similar embeddings, and serialization to/from files.

Capabilities

InMemoryEmbeddingStore

In-memory implementation storing embeddings without persistence.

Thread Safety: InMemoryEmbeddingStore is thread-safe for concurrent access. All public methods are synchronized to prevent race conditions when multiple threads add, remove, or query embeddings simultaneously.

Common Pitfalls:

  • All data stored in RAM - embedding store is cleared when JVM terminates unless explicitly serialized
  • Setting minScore too high (e.g., 0.95) may return no results even for relevant content
  • Embeddings are stored as-is, without normalization - if your similarity pipeline assumes unit vectors, normalize them before adding
  • Large batch operations load entire dataset into memory at once
  • JSON serialization does not preserve generic type information - cast required when deserializing
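
The normalization pitfall above must be handled upstream of the store; a self-contained sketch of pre-normalizing a vector (the helper name is illustrative, not part of the library):

```java
// Normalize a vector to unit length before handing it to the store.
// The store keeps vectors exactly as given, so this must happen upstream.
public class Normalize {

    static float[] normalize(float[] v) {
        double norm = 0;
        for (float x : v) norm += (double) x * x;
        norm = Math.sqrt(norm);
        if (norm == 0) return v.clone(); // leave zero vectors untouched
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) out[i] = (float) (v[i] / norm);
        return out;
    }

    public static void main(String[] args) {
        float[] u = normalize(new float[]{3f, 4f});
        System.out.println(u[0] + " " + u[1]); // a 3-4-5 triangle scales to 0.6 0.8
    }
}
```

Many embedding models already emit unit vectors, in which case this step is a no-op; check your model's documentation before adding it.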

Edge Cases:

  • Empty store returns empty list for findRelevant() queries
  • Dimension mismatch between stored embeddings and query embedding causes undefined behavior
  • Adding duplicate IDs overwrites previous embedding silently
  • Removing non-existent IDs completes without error
  • Searching with maxResults=0 returns empty list

Performance Notes:

  • Search complexity: O(n) where n is total number of embeddings
  • Memory usage: ~6KB per embedding (1536 dimensions × 4 bytes per float) + associated object size
  • Add operation: O(1) for single add, O(m) for batch add where m is batch size
  • Remove operation: O(k) where k is number of IDs to remove
  • Serialization: O(n) time and space complexity
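
The O(n) search above can be pictured as a brute-force similarity scan; the following is an illustrative sketch of that approach, not the library's actual internals:

```java
import java.util.ArrayList;
import java.util.List;

// Brute-force cosine-similarity scan: score every stored vector against the
// query, sort descending, keep the top maxResults that clear minScore.
public class LinearScan {

    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<Integer> findRelevant(List<float[]> store, float[] query,
                                      int maxResults, double minScore) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < store.size(); i++) ids.add(i);
        // O(n log n) sort on top of the O(n) scoring pass
        ids.sort((x, y) -> Double.compare(cosine(store.get(y), query),
                                          cosine(store.get(x), query)));
        List<Integer> out = new ArrayList<>();
        for (int id : ids) {
            if (out.size() == maxResults) break;
            if (cosine(store.get(id), query) >= minScore) out.add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        List<float[]> store = List.of(new float[]{1, 0}, new float[]{0, 1}, new float[]{1, 1});
        System.out.println(findRelevant(store, new float[]{1, 0}, 2, 0.5)); // → [0, 2]
    }
}
```

Every query touches every vector, which is why persistent stores with ANN indexes (HNSW, IVF) become attractive past ~100K embeddings.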

Memory Considerations:

  • 10K embeddings (1536-dim): ~60MB + object overhead
  • 100K embeddings (1536-dim): ~600MB + object overhead
  • 1M embeddings (1536-dim): ~6GB + object overhead
  • Each associated TextSegment adds ~1-10KB depending on text length
  • JSON serialization creates temporary copy, doubling memory usage during save/load
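
The figures above follow from simple arithmetic; a sketch, where the 200-byte per-entry overhead is an assumed constant covering the ID string, map entry, and object headers:

```java
// Back-of-envelope heap estimate: a float[dim] costs 4 bytes per element,
// plus an assumed fixed per-entry overhead for the ID and map bookkeeping.
public class EmbeddingMemoryEstimate {

    static long estimateBytes(long count, int dimension, long overheadBytes) {
        return count * (dimension * 4L + overheadBytes);
    }

    public static void main(String[] args) {
        // 100K embeddings at 1536 dims: roughly 600 MB before associated objects
        long mb = estimateBytes(100_000, 1536, 200) / (1024 * 1024);
        System.out.println(mb + " MB");
    }
}
```

Associated TextSegments come on top of this, so budget heap accordingly before deciding the in-memory store fits your dataset.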

Exception Handling:

  • IOException during serialization/deserialization - file system errors, permissions, disk space
  • JsonProcessingException - malformed JSON during fromFile()
  • IllegalArgumentException - null embeddings, negative dimensions
  • OutOfMemoryError - store exceeds available heap space
package dev.langchain4j.store.embedding.inmemory;

/**
 * In-memory implementation of EmbeddingStore
 * Stores embeddings in memory without automatic persistence
 * Supports serialization to/from JSON files
 * Thread-safe for concurrent access
 */
public class InMemoryEmbeddingStore<Embedded> implements EmbeddingStore<Embedded> {
    /**
     * Default constructor
     * Creates empty embedding store
     */
    public InMemoryEmbeddingStore();

    /**
     * Load embedding store from file
     * @param file Path to JSON file
     * @return InMemoryEmbeddingStore instance
     */
    public static <Embedded> InMemoryEmbeddingStore<Embedded> fromFile(Path file);

    /**
     * Load embedding store from file path
     * @param filePath String path to JSON file
     * @return InMemoryEmbeddingStore instance
     */
    public static <Embedded> InMemoryEmbeddingStore<Embedded> fromFile(String filePath);

    /**
     * Serialize embedding store to file
     * @param file Path to JSON file
     */
    public void serializeToFile(Path file);

    /**
     * Serialize embedding store to file path
     * @param filePath String path to JSON file
     */
    public void serializeToFile(String filePath);

    /**
     * Add embedding without associated object
     * @param embedding Embedding to add
     * @return Generated unique ID
     */
    public String add(Embedding embedding);

    /**
     * Add embedding with specific ID
     * @param id ID to use
     * @param embedding Embedding to add
     */
    public void add(String id, Embedding embedding);

    /**
     * Add embedding with associated object
     * @param embedding Embedding to add
     * @param embedded Object to associate with embedding
     * @return Generated unique ID
     */
    public String add(Embedding embedding, Embedded embedded);

    /**
     * Add multiple embeddings
     * @param embeddings List of embeddings to add
     * @return List of generated IDs
     */
    public List<String> addAll(List<Embedding> embeddings);

    /**
     * Add multiple embeddings with IDs and objects
     * @param ids List of IDs
     * @param embeddings List of embeddings
     * @param embedded List of associated objects
     */
    public void addAll(List<String> ids, List<Embedding> embeddings, List<Embedded> embedded);

    /**
     * Find relevant embeddings by similarity
     * @param referenceEmbedding Reference embedding for similarity search
     * @param maxResults Maximum number of results to return
     * @param minScore Minimum similarity score (0.0 to 1.0)
     * @return List of embedding matches sorted by score (highest first)
     */
    public List<EmbeddingMatch<Embedded>> findRelevant(
        Embedding referenceEmbedding,
        int maxResults,
        double minScore
    );

    /**
     * Remove embeddings by IDs
     * @param ids Collection of IDs to remove
     */
    public void removeAll(Collection<String> ids);

    /**
     * Remove embeddings matching filter
     * @param filter Filter criteria
     */
    public void removeAll(Filter filter);

    /**
     * Remove all embeddings
     */
    public void removeAll();
}

JSON Codec

Interface for JSON serialization/deserialization of embedding stores.

Thread Safety: JacksonInMemoryEmbeddingStoreJsonCodec is thread-safe. ObjectMapper instances are thread-safe once configured, for both serialization and deserialization.

Common Pitfalls:

  • Generic type information lost during JSON serialization - manual casting required
  • Large stores create large JSON files - multi-GB files for 100K+ embeddings
  • No schema versioning - incompatible changes break deserialization
  • Custom Embedded types must be Jackson-serializable
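
The missing-schema-versioning pitfall can be mitigated by wrapping the serialized JSON in a small version envelope you check before deserializing. The envelope format below is an assumption for illustration, not part of the library:

```java
// Wrap store JSON in {"version":N,"store":...} and refuse to load
// envelopes written by an incompatible version.
public class VersionedEnvelope {
    static final int CURRENT_VERSION = 1;

    static String wrap(String storeJson) {
        return "{\"version\":" + CURRENT_VERSION + ",\"store\":" + storeJson + "}";
    }

    static String unwrap(String envelope) {
        // naive string parsing, fine for a sketch; use a real JSON parser in practice
        int v = Integer.parseInt(envelope.replaceAll(".*\"version\":(\\d+).*", "$1"));
        if (v != CURRENT_VERSION) {
            throw new IllegalStateException("Unsupported store version: " + v);
        }
        int start = envelope.indexOf("\"store\":") + "\"store\":".length();
        return envelope.substring(start, envelope.length() - 1);
    }

    public static void main(String[] args) {
        String env = wrap("{\"entries\":[]}");
        System.out.println(unwrap(env));
    }
}
```

The inner store JSON would then be handed to the codec's fromJson(); on a version mismatch you re-embed instead of loading stale data.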

Edge Cases:

  • Empty store serializes to minimal JSON structure
  • Null embedded objects serialize as JSON null
  • Special characters in string IDs require proper JSON escaping
  • Very large floating-point numbers may lose precision

Performance Notes:

  • Serialization speed: ~10MB/s typical (varies by CPU and disk)
  • Deserialization speed: ~5MB/s typical (slower due to object creation)
  • Memory spike during serialization: 2x store size
  • File size: typically 2-3x the in-memory size for 1536-dim embeddings, since floats are serialized as decimal text

Memory Considerations:

  • Serialization creates temporary string representation in memory
  • Peak memory usage: 2x store size during save, 3x during load
  • Large files may require increasing heap size: -Xmx4g or higher

Exception Handling:

  • IOException - file access errors, disk full, permissions
  • JsonParseException - corrupted or invalid JSON
  • JsonMappingException - incompatible JSON structure or types
  • OutOfMemoryError - insufficient heap for deserialization
package dev.langchain4j.store.embedding.inmemory;

/**
 * Interface for JSON codec for serializing/deserializing InMemoryEmbeddingStore
 */
public interface InMemoryEmbeddingStoreJsonCodec {
    /**
     * Serialize embedding store to JSON
     * @param store Embedding store to serialize
     * @return JSON string
     */
    String toJson(InMemoryEmbeddingStore<?> store);

    /**
     * Deserialize embedding store from JSON
     * @param json JSON string
     * @return InMemoryEmbeddingStore instance
     */
    InMemoryEmbeddingStore<?> fromJson(String json);
}

/**
 * Jackson-based implementation of InMemoryEmbeddingStoreJsonCodec
 */
public class JacksonInMemoryEmbeddingStoreJsonCodec implements InMemoryEmbeddingStoreJsonCodec {
    /**
     * Serialize to JSON using Jackson
     * @param store Embedding store
     * @return JSON string
     */
    public String toJson(InMemoryEmbeddingStore<?> store);

    /**
     * Deserialize from JSON using Jackson
     * @param json JSON string
     * @return InMemoryEmbeddingStore instance
     */
    public InMemoryEmbeddingStore<?> fromJson(String json);
}

SPI Factory

Thread Safety: Factory implementations should be thread-safe as they may be called concurrently by ServiceLoader.

Common Pitfalls:

  • Must provide META-INF/services file for SPI discovery
  • Multiple implementations on the classpath make codec selection dependent on discovery order
  • Factory must have public no-arg constructor
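
Registration is a plain ServiceLoader services file; a sketch, where com.example.MyCodecFactory is a hypothetical implementation class with a public no-arg constructor:

```
# File: META-INF/services/dev.langchain4j.spi.store.embedding.inmemory.InMemoryEmbeddingStoreJsonCodecFactory
com.example.MyCodecFactory
```

The file name must match the factory interface's fully qualified name exactly, and each non-comment line names one implementation class.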

Edge Cases:

  • If no SPI implementation is found, the store falls back to the default Jackson codec
  • When multiple SPI implementations are present, the first one discovered is used
  • ServiceLoader caching may prevent swapping implementations at runtime

Performance Notes:

  • SPI discovery happens once during class loading
  • Factory create() is typically called once per application lifecycle
  • Negligible performance overhead

Memory Considerations:

  • Factory instances retained in ServiceLoader cache
  • Minimal memory footprint (< 1KB per factory)

Exception Handling:

  • ServiceConfigurationError - malformed META-INF/services file
  • NoSuchElementException - no implementations available
  • ClassNotFoundException - referenced class not on classpath
package dev.langchain4j.spi.store.embedding.inmemory;

/**
 * SPI factory interface for creating JSON codecs for in-memory embedding store
 * Allows custom serialization implementations
 */
public interface InMemoryEmbeddingStoreJsonCodecFactory {
    /**
     * Create JSON codec instance
     * @return InMemoryEmbeddingStoreJsonCodec instance
     */
    InMemoryEmbeddingStoreJsonCodec create();
}

Usage Examples

Basic Usage

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;

// Create embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// Create some text segments
TextSegment segment1 = TextSegment.from("The quick brown fox jumps over the lazy dog");
TextSegment segment2 = TextSegment.from("Python is a programming language");
TextSegment segment3 = TextSegment.from("Java is also a programming language");

// Generate embeddings
Embedding embedding1 = embeddingModel.embed(segment1).content();
Embedding embedding2 = embeddingModel.embed(segment2).content();
Embedding embedding3 = embeddingModel.embed(segment3).content();

// Add to store
embeddingStore.add(embedding1, segment1);
embeddingStore.add(embedding2, segment2);
embeddingStore.add(embedding3, segment3);

// Search for similar embeddings
String query = "Tell me about programming languages";
Embedding queryEmbedding = embeddingModel.embed(query).content();

List<EmbeddingMatch<TextSegment>> matches = embeddingStore.findRelevant(
    queryEmbedding,
    2,      // max results
    0.7     // min score
);

for (EmbeddingMatch<TextSegment> match : matches) {
    System.out.println("Score: " + match.score());
    System.out.println("Text: " + match.embedded().text());
}

Batch Adding Embeddings

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.ArrayList;
import java.util.List;

InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

List<TextSegment> segments = List.of(
    TextSegment.from("First document"),
    TextSegment.from("Second document"),
    TextSegment.from("Third document")
);

// Embed all segments at once (more efficient)
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();

// Add all at once
List<String> ids = new ArrayList<>();
for (int i = 0; i < embeddings.size(); i++) {
    ids.add("doc-" + i);
}

embeddingStore.addAll(ids, embeddings, segments);

Persistence with Serialization

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.nio.file.Path;
import java.util.List;

// Create and populate store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
// ... add embeddings ...

// Save to file
embeddingStore.serializeToFile(Path.of("embeddings.json"));

// Later, load from file
InMemoryEmbeddingStore<TextSegment> loadedStore =
    InMemoryEmbeddingStore.fromFile(Path.of("embeddings.json"));

// Use loaded store
List<EmbeddingMatch<TextSegment>> matches = loadedStore.findRelevant(
    queryEmbedding,
    5,
    0.7
);

Building a Document Search System

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// 1. Load documents
List<Document> documents = FileSystemDocumentLoader.loadDocumentsRecursively(
    Path.of("/docs")
);

// 2. Split into segments
List<TextSegment> segments = new ArrayList<>();
DocumentSplitter splitter = DocumentSplitters.recursive(500, 50, tokenizer);
for (Document doc : documents) {
    segments.addAll(splitter.split(doc));
}

// 3. Create and populate embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
for (TextSegment segment : segments) {
    Embedding embedding = embeddingModel.embed(segment).content();
    embeddingStore.add(embedding, segment);
}

// 4. Save for later use
embeddingStore.serializeToFile("document-embeddings.json");

// 5. Search
String userQuery = "How do I configure the database?";
Embedding queryEmbedding = embeddingModel.embed(userQuery).content();
List<EmbeddingMatch<TextSegment>> results = embeddingStore.findRelevant(
    queryEmbedding,
    5,
    0.7
);

for (EmbeddingMatch<TextSegment> result : results) {
    System.out.println("Relevance: " + result.score());
    System.out.println("Content: " + result.embedded().text());
    System.out.println("Source: " + result.embedded().metadata("file_name"));
    System.out.println("---");
}

Integration with RAG

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

// Create and populate embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
// ... add embeddings ...

// Create content retriever
EmbeddingStoreContentRetriever contentRetriever =
    EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .maxResults(3)
        .minScore(0.7)
        .build();

// Create AI service with RAG
interface Assistant {
    String chat(String message);
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .contentRetriever(contentRetriever)
    .build();

// Queries will automatically retrieve relevant context
String answer = assistant.chat("How do I configure the database?");

Custom Object Storage

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;

// Store custom objects with embeddings
record Product(String id, String name, String description, double price) {}

InMemoryEmbeddingStore<Product> embeddingStore = new InMemoryEmbeddingStore<>();

Product product = new Product("p1", "Laptop", "High-performance laptop", 999.99);
String embeddingText = product.name() + " " + product.description();
Embedding embedding = embeddingModel.embed(embeddingText).content();

embeddingStore.add(embedding, product);

// Search for products
String query = "affordable computer";
Embedding queryEmbedding = embeddingModel.embed(query).content();
List<EmbeddingMatch<Product>> matches = embeddingStore.findRelevant(
    queryEmbedding,
    5,
    0.6
);

for (EmbeddingMatch<Product> match : matches) {
    Product p = match.embedded();
    System.out.println(p.name() + " - $" + p.price() + " (score: " + match.score() + ")");
}

Managing Embeddings

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;

InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// Add with generated IDs
String id1 = embeddingStore.add(embedding1, segment1);
String id2 = embeddingStore.add(embedding2, segment2);

// Add with a custom ID
embeddingStore.add("custom-id-3", embedding3);

// Remove specific embeddings
embeddingStore.removeAll(List.of(id1, id2));

// Remove all embeddings
embeddingStore.removeAll();

Using Metadata Filters

import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.filter.Filter;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;
import java.util.Map;

InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// Add segments with metadata
TextSegment segment1 = TextSegment.from(
    "Content 1",
    Metadata.from(Map.of("category", "technical", "language", "java"))
);
TextSegment segment2 = TextSegment.from(
    "Content 2",
    Metadata.from(Map.of("category", "marketing", "language", "english"))
);

embeddingStore.add(embeddingModel.embed(segment1).content(), segment1);
embeddingStore.add(embeddingModel.embed(segment2).content(), segment2);

// Remove by filter
Filter filter = metadataKey("category").isEqualTo("marketing");
embeddingStore.removeAll(filter);

Batch Operations with Progress Tracking

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

List<TextSegment> segments = loadLargeDataset(); // e.g., 50K segments
int batchSize = 100;
AtomicInteger processed = new AtomicInteger(0);

for (int i = 0; i < segments.size(); i += batchSize) {
    int end = Math.min(i + batchSize, segments.size());
    List<TextSegment> batch = segments.subList(i, end);

    // Embed batch
    List<Embedding> embeddings = embeddingModel.embedAll(batch).content();

    // Generate IDs
    List<String> ids = new ArrayList<>();
    for (int j = 0; j < batch.size(); j++) {
        ids.add("doc-" + (i + j));
    }

    // Add batch
    embeddingStore.addAll(ids, embeddings, batch);

    // Track progress
    int count = processed.addAndGet(batch.size());
    System.out.printf("Processed %d/%d (%.1f%%)%n",
        count, segments.size(), 100.0 * count / segments.size());
}

// Save after all batches
embeddingStore.serializeToFile("large-embeddings.json");

Incremental Updates Pattern

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;

// Load existing store or create new one
Path storePath = Path.of("embeddings.json");
InMemoryEmbeddingStore<TextSegment> embeddingStore;

if (Files.exists(storePath)) {
    embeddingStore = InMemoryEmbeddingStore.fromFile(storePath);
    System.out.println("Loaded existing store");
} else {
    embeddingStore = new InMemoryEmbeddingStore<>();
    System.out.println("Created new store");
}

// Track existing document IDs (in production, persist this separately)
Set<String> existingIds = loadExistingIds();

// Process new documents
List<Document> newDocuments = fetchNewDocuments();
for (Document doc : newDocuments) {
    String docId = doc.metadata("id");

    // Skip if already processed
    if (existingIds.contains(docId)) {
        continue;
    }

    // Add new document (addAll keeps the segment associated with the custom ID;
    // add(id, embedding) alone would store the embedding without the segment)
    TextSegment segment = TextSegment.from(doc.text(), doc.metadata());
    Embedding embedding = embeddingModel.embed(segment).content();
    embeddingStore.addAll(List.of(docId), List.of(embedding), List.of(segment));

    existingIds.add(docId);
}

// Save updated store
embeddingStore.serializeToFile(storePath);
saveExistingIds(existingIds);

Concurrent Read Pattern

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Load store once (thread-safe for reads)
InMemoryEmbeddingStore<TextSegment> embeddingStore =
    InMemoryEmbeddingStore.fromFile("embeddings.json");

// Create thread pool for concurrent queries
ExecutorService executor = Executors.newFixedThreadPool(10);

// Process multiple queries concurrently
List<String> queries = List.of(
    "database configuration",
    "authentication setup",
    "logging configuration"
);

List<CompletableFuture<List<EmbeddingMatch<TextSegment>>>> futures =
    queries.stream()
        .map(query -> CompletableFuture.supplyAsync(() -> {
            Embedding queryEmbedding = embeddingModel.embed(query).content();
            return embeddingStore.findRelevant(queryEmbedding, 5, 0.7);
        }, executor))
        .toList();

// Wait for all queries to complete
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

// Process results
for (int i = 0; i < queries.size(); i++) {
    System.out.println("Query: " + queries.get(i));
    List<EmbeddingMatch<TextSegment>> matches = futures.get(i).join();
    for (EmbeddingMatch<TextSegment> match : matches) {
        System.out.println("  - " + match.embedded().text() + " (" + match.score() + ")");
    }
}

executor.shutdown();

Testing Patterns

Unit Testing with Mock Embeddings

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.junit.jupiter.api.Test;
import java.util.List;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.within;

class EmbeddingStoreTest {

    @Test
    void shouldAddAndRetrieveEmbedding() {
        // Arrange
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        TextSegment segment = TextSegment.from("test content");
        Embedding embedding = createMockEmbedding(1.0f, 0.0f, 0.0f);

        // Act
        String id = store.add(embedding, segment);

        // Assert
        assertThat(id).isNotNull();
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(embedding, 1, 0.0);
        assertThat(matches).hasSize(1);
        assertThat(matches.get(0).embedded()).isEqualTo(segment);
        assertThat(matches.get(0).score()).isCloseTo(1.0, within(0.01));
    }

    @Test
    void shouldReturnEmptyListWhenNoMatches() {
        // Arrange
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        store.add(createMockEmbedding(1.0f, 0.0f, 0.0f), TextSegment.from("test"));

        // Act - query with very different embedding
        Embedding query = createMockEmbedding(0.0f, 1.0f, 0.0f);
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(query, 10, 0.9);

        // Assert
        assertThat(matches).isEmpty();
    }

    @Test
    void shouldHandleEmptyStore() {
        // Arrange
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

        // Act
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(createMockEmbedding(1.0f, 0.0f, 0.0f), 10, 0.0);

        // Assert
        assertThat(matches).isEmpty();
    }

    @Test
    void shouldRemoveEmbeddingsById() {
        // Arrange
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        String id1 = store.add(createMockEmbedding(1.0f, 0.0f, 0.0f),
            TextSegment.from("first"));
        String id2 = store.add(createMockEmbedding(0.0f, 1.0f, 0.0f),
            TextSegment.from("second"));

        // Act
        store.removeAll(List.of(id1));

        // Assert
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(createMockEmbedding(1.0f, 0.0f, 0.0f), 10, 0.0);
        assertThat(matches).hasSize(1);
        assertThat(matches.get(0).embeddingId()).isEqualTo(id2);
    }

    private Embedding createMockEmbedding(float... values) {
        return Embedding.from(values);
    }
}

Testing Serialization

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import java.nio.file.Path;
import java.util.List;
import static org.assertj.core.api.Assertions.assertThat;

class SerializationTest {

    @Test
    void shouldSerializeAndDeserialize(@TempDir Path tempDir) throws Exception {
        // Arrange
        Path storePath = tempDir.resolve("test-store.json");
        InMemoryEmbeddingStore<TextSegment> originalStore = new InMemoryEmbeddingStore<>();

        TextSegment segment1 = TextSegment.from("first document");
        TextSegment segment2 = TextSegment.from("second document");

        // addAll associates each custom ID with its embedding and segment;
        // there is no three-argument add(id, embedding, embedded) overload
        originalStore.addAll(
            List.of("id1", "id2"),
            List.of(createMockEmbedding(1.0f, 0.0f), createMockEmbedding(0.0f, 1.0f)),
            List.of(segment1, segment2)
        );

        // Act - serialize
        originalStore.serializeToFile(storePath);

        // Act - deserialize
        InMemoryEmbeddingStore<TextSegment> loadedStore =
            InMemoryEmbeddingStore.fromFile(storePath);

        // Assert
        List<EmbeddingMatch<TextSegment>> matches =
            loadedStore.findRelevant(createMockEmbedding(1.0f, 0.0f), 1, 0.0);

        assertThat(matches).hasSize(1);
        assertThat(matches.get(0).embeddingId()).isEqualTo("id1");
        assertThat(matches.get(0).embedded().text()).isEqualTo("first document");
    }

    @Test
    void shouldHandleEmptyStoreSerialization(@TempDir Path tempDir) throws Exception {
        // Arrange
        Path storePath = tempDir.resolve("empty-store.json");
        InMemoryEmbeddingStore<TextSegment> emptyStore = new InMemoryEmbeddingStore<>();

        // Act
        emptyStore.serializeToFile(storePath);
        InMemoryEmbeddingStore<TextSegment> loadedStore =
            InMemoryEmbeddingStore.fromFile(storePath);

        // Assert
        List<EmbeddingMatch<TextSegment>> matches =
            loadedStore.findRelevant(createMockEmbedding(1.0f, 0.0f), 10, 0.0);
        assertThat(matches).isEmpty();
    }

    private Embedding createMockEmbedding(float... values) {
        return Embedding.from(values);
    }
}

Integration Testing with Real Embeddings

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import java.util.List;
import static org.assertj.core.api.Assertions.assertThat;

class EmbeddingStoreIntegrationTest {

    private EmbeddingModel embeddingModel;
    private InMemoryEmbeddingStore<TextSegment> store;

    @BeforeEach
    void setUp() {
        embeddingModel = new AllMiniLmL6V2EmbeddingModel();
        store = new InMemoryEmbeddingStore<>();
    }

    @Test
    void shouldFindSemanticallySimilarDocuments() {
        // Arrange - add related documents
        List<String> documents = List.of(
            "Java is a programming language",
            "Python is used for data science",
            "The weather is sunny today"
        );

        for (String doc : documents) {
            TextSegment segment = TextSegment.from(doc);
            Embedding embedding = embeddingModel.embed(segment).content();
            store.add(embedding, segment);
        }

        // Act - search for programming-related content
        String query = "Tell me about programming languages";
        Embedding queryEmbedding = embeddingModel.embed(query).content();
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(queryEmbedding, 2, 0.5);

        // Assert - should find programming-related documents first
        assertThat(matches).hasSizeGreaterThanOrEqualTo(2);
        assertThat(matches.get(0).embedded().text())
            .containsAnyOf("Java", "Python", "programming");
    }
}

Performance Testing

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.Timeout;
import java.util.List;
import java.util.concurrent.TimeUnit;
import static org.assertj.core.api.Assertions.assertThat;

class PerformanceTest {

    @Test
    @Timeout(value = 5, unit = TimeUnit.SECONDS)
    void shouldHandleLargeDatasetEfficiently() {
        // Arrange
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        int documentCount = 10_000;

        // Act - add 10K embeddings
        for (int i = 0; i < documentCount; i++) {
            Embedding embedding = createMockEmbedding(
                (float) Math.random(),
                (float) Math.random()
            );
            TextSegment segment = TextSegment.from("Document " + i);
            store.add(embedding, segment);
        }

        // Act - search
        long startTime = System.nanoTime();
        List<EmbeddingMatch<TextSegment>> matches =
            store.findRelevant(createMockEmbedding(0.5f, 0.5f), 10, 0.0);
        long duration = System.nanoTime() - startTime;

        // Assert
        assertThat(matches).hasSize(10);
        assertThat(duration).isLessThan(TimeUnit.SECONDS.toNanos(1));
    }

    @Test
    void shouldMeasureMemoryUsage() {
        // Arrange
        Runtime runtime = Runtime.getRuntime();
        runtime.gc();
        long memoryBefore = runtime.totalMemory() - runtime.freeMemory();

        // Act - add 1000 embeddings (1536 dimensions)
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        for (int i = 0; i < 1000; i++) {
            float[] vector = new float[1536];
            for (int j = 0; j < vector.length; j++) {
                vector[j] = (float) Math.random();
            }
            Embedding embedding = Embedding.from(vector);
            TextSegment segment = TextSegment.from("Document " + i);
            store.add(embedding, segment);
        }

        runtime.gc();
        long memoryAfter = runtime.totalMemory() - runtime.freeMemory();
        long memoryUsed = memoryAfter - memoryBefore;

        // Assert - rough memory check (1000 embeddings * ~6KB = ~6MB)
        System.out.println("Memory used for 1000 embeddings: " +
            (memoryUsed / 1024 / 1024) + " MB");
        assertThat(memoryUsed).isLessThan(10 * 1024 * 1024); // < 10MB
    }

    private Embedding createMockEmbedding(float... values) {
        return Embedding.from(values);
    }
}

Migration to Persistent Stores

When to Migrate

Consider migrating from InMemoryEmbeddingStore to a persistent store when:

  • Scale: Dataset exceeds 100K embeddings or 1GB RAM
  • Persistence: Need automatic persistence without manual serialization
  • Distributed: Multiple application instances need to share embeddings
  • Production: Application requires high availability and durability
  • Performance: Need sub-linear search time (e.g., HNSW, IVF indexes)
  • Features: Require advanced filtering, metadata queries, or hybrid search

Migration to PostgreSQL with pgvector

import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;

// Before: InMemoryEmbeddingStore
InMemoryEmbeddingStore<TextSegment> oldStore = new InMemoryEmbeddingStore<>();

// After: PostgreSQL with pgvector
PgVectorEmbeddingStore newStore = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("vectordb")
    .user("postgres")
    .password("password")
    .table("embeddings")
    .dimension(1536)
    .createTable(true)
    .build();

// Migration script
InMemoryEmbeddingStore<TextSegment> sourceStore =
    InMemoryEmbeddingStore.fromFile("embeddings.json");

// Extracting entries from sourceStore requires reflection or custom tracking,
// so the simpler workaround is to re-embed the original documents:
List<TextSegment> segments = loadAllSegments();
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();

// Batch insert
List<String> ids = new ArrayList<>();
for (int i = 0; i < segments.size(); i++) {
    ids.add("doc-" + i);
}
newStore.addAll(ids, embeddings, segments);

Migration to Pinecone

import dev.langchain4j.store.embedding.pinecone.PineconeEmbeddingStore;

PineconeEmbeddingStore pineconeStore = PineconeEmbeddingStore.builder()
    .apiKey(System.getenv("PINECONE_API_KEY"))
    .environment("us-west1-gcp")
    .index("my-index")
    .namespace("production")
    .build();

// Migrate data
InMemoryEmbeddingStore<TextSegment> sourceStore =
    InMemoryEmbeddingStore.fromFile("embeddings.json");

// Re-embed and upload
List<TextSegment> segments = loadAllSegments();
int batchSize = 100;

for (int i = 0; i < segments.size(); i += batchSize) {
    int end = Math.min(i + batchSize, segments.size());
    List<TextSegment> batch = segments.subList(i, end);

    List<Embedding> embeddings = embeddingModel.embedAll(batch).content();
    List<String> ids = new ArrayList<>();
    for (int j = 0; j < batch.size(); j++) {
        ids.add("doc-" + (i + j));
    }

    pineconeStore.addAll(ids, embeddings, batch);
    System.out.printf("Migrated %d/%d%n", end, segments.size());
}

Migration to Weaviate

import dev.langchain4j.store.embedding.weaviate.WeaviateEmbeddingStore;

WeaviateEmbeddingStore weaviateStore = WeaviateEmbeddingStore.builder()
    .scheme("https")
    .host("my-cluster.weaviate.network")
    .apiKey(System.getenv("WEAVIATE_API_KEY"))
    .objectClass("Document")
    .consistencyLevel("QUORUM")
    .build();

// Migration with metadata preservation
List<TextSegment> segments = loadAllSegments();

for (TextSegment segment : segments) {
    Embedding embedding = embeddingModel.embed(segment).content();

    // Weaviate preserves metadata automatically
    weaviateStore.add(embedding, segment);
}

Hybrid Approach: InMemory Cache + Persistent Store

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
class CachedEmbeddingStore<T> implements EmbeddingStore<T> {

    private final EmbeddingStore<T> persistentStore;
    private final InMemoryEmbeddingStore<T> cache;
    private final Map<String, T> idToObject;
    private final int cacheSize;

    public CachedEmbeddingStore(EmbeddingStore<T> persistentStore, int cacheSize) {
        this.persistentStore = persistentStore;
        this.cache = new InMemoryEmbeddingStore<>();
        this.idToObject = new ConcurrentHashMap<>();
        this.cacheSize = cacheSize;
    }

    @Override
    public String add(Embedding embedding) {
        String id = persistentStore.add(embedding);

        // Add to cache
        if (idToObject.size() < cacheSize) {
            cache.add(id, embedding);
        }

        return id;
    }

    @Override
    public String add(Embedding embedding, T embedded) {
        String id = persistentStore.add(embedding, embedded);

        // Cache the vector together with its object so cached matches carry a payload
        if (idToObject.size() < cacheSize) {
            cache.addAll(List.of(id), List.of(embedding), List.of(embedded));
            idToObject.put(id, embedded);
        }

        return id;
    }

    @Override
    public List<EmbeddingMatch<T>> findRelevant(
        Embedding referenceEmbedding,
        int maxResults,
        double minScore
    ) {
        // Try the cache first; it holds only a subset, so an empty result falls through
        List<EmbeddingMatch<T>> cachedResults =
            cache.findRelevant(referenceEmbedding, maxResults, minScore);

        if (!cachedResults.isEmpty()) {
            return cachedResults;
        }

        // Fall back to persistent store
        return persistentStore.findRelevant(referenceEmbedding, maxResults, minScore);
    }

    // Implement other EmbeddingStore methods...
}

// Usage
EmbeddingStore<TextSegment> postgresStore = createPostgresStore();
CachedEmbeddingStore<TextSegment> cachedStore =
    new CachedEmbeddingStore<>(postgresStore, 10_000);

// Searches hit cache, writes go to persistent store
cachedStore.add(embedding, segment);
List<EmbeddingMatch<TextSegment>> results =
    cachedStore.findRelevant(queryEmbedding, 5, 0.7);

Comparison Table: InMemory vs Persistent Stores

| Feature | InMemory | PostgreSQL + pgvector | Pinecone | Weaviate |
| --- | --- | --- | --- | --- |
| Setup Complexity | None | Medium (install PG) | Low (cloud) | Low (cloud/docker) |
| Persistence | Manual | Automatic | Automatic | Automatic |
| Max Scale | ~100K vectors | 10M+ vectors | Billions | Billions |
| Search Algorithm | Brute-force O(n) | HNSW O(log n) | Proprietary | HNSW O(log n) |
| Distributed | No | Yes (with replication) | Yes | Yes |
| Metadata Filtering | Limited | SQL queries | Yes | GraphQL queries |
| Cost | Free | Infrastructure | $$$ (usage-based) | $ (self-host) or $$ (cloud) |
| Latency | < 1ms | 5-20ms | 10-50ms | 10-50ms |
| Best For | Dev/testing | Small-medium prod | Large-scale prod | Feature-rich apps |
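
The "Brute-force O(n)" row is worth seeing concretely. Below is a minimal, dependency-free sketch of a linear scan over cosine similarities (the class and method names are hypothetical; InMemoryEmbeddingStore's actual implementation differs in detail):

```java
import java.util.*;

// Linear-scan similarity search: score every stored vector against the
// query (O(n * d) work), then keep the top-k. This is why in-memory
// search latency grows linearly with the number of embeddings.
public class BruteForceSearch {

    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<String> topK(Map<String, float[]> store, float[] query, int k) {
        return store.entrySet().stream()
                .sorted((x, y) -> Double.compare(
                        cosine(y.getValue(), query), cosine(x.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, float[]> store = new LinkedHashMap<>();
        store.put("a", new float[] {1f, 0f});
        store.put("b", new float[] {0f, 1f});
        store.put("c", new float[] {0.9f, 0.1f});
        System.out.println(topK(store, new float[] {1f, 0f}, 2)); // most similar first
    }
}
```

HNSW-based stores avoid this full scan by navigating a graph index, which is what buys the O(log n) behavior listed in the table.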

Related APIs

Embedding Models

  • dev.langchain4j.model.embedding.EmbeddingModel - Generate embeddings from text
  • dev.langchain4j.model.embedding.AllMiniLmL6V2EmbeddingModel - Local embedding model
  • dev.langchain4j.model.openai.OpenAiEmbeddingModel - OpenAI embeddings (text-embedding-3-small)
  • dev.langchain4j.model.azure.AzureOpenAiEmbeddingModel - Azure OpenAI embeddings

Data Models

  • dev.langchain4j.data.embedding.Embedding - Vector representation of text
  • dev.langchain4j.data.segment.TextSegment - Text chunk with metadata
  • dev.langchain4j.store.embedding.EmbeddingMatch - Search result with score

Content Retrievers

  • dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever - RAG integration
  • dev.langchain4j.rag.content.retriever.ContentRetriever - Base interface

Alternative Embedding Stores

  • dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore - PostgreSQL persistence
  • dev.langchain4j.store.embedding.pinecone.PineconeEmbeddingStore - Pinecone cloud
  • dev.langchain4j.store.embedding.weaviate.WeaviateEmbeddingStore - Weaviate vector DB
  • dev.langchain4j.store.embedding.qdrant.QdrantEmbeddingStore - Qdrant vector DB
  • dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore - Milvus vector DB

Document Processing

  • dev.langchain4j.data.document.Document - Document representation
  • dev.langchain4j.data.document.loader.FileSystemDocumentLoader - Load files
  • dev.langchain4j.data.document.splitter.DocumentSplitter - Split documents

Filters

  • dev.langchain4j.store.embedding.filter.Filter - Metadata filtering
  • dev.langchain4j.store.embedding.filter.comparison.IsEqualTo - Equality filter
  • dev.langchain4j.store.embedding.filter.logical.And - Logical AND
  • dev.langchain4j.store.embedding.filter.logical.Or - Logical OR

Performance Considerations

The InMemoryEmbeddingStore is suitable for:

  • Small to medium datasets (up to ~100K embeddings)
  • Development and testing
  • Applications where fast startup is important
  • Scenarios where persistence can be handled via serialization
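
For the serialization option, the store can be snapshotted with serializeToFile() and restored with fromFile(), as in the migration snippets above. As a library-free illustration of the same snapshot idea, here is a plain-Java sketch (the VectorSnapshot class is hypothetical, not part of the library):

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// Snapshot-style persistence sketch: write id -> vector pairs to disk on
// shutdown and read them back on startup, mirroring what the real store's
// serializeToFile()/fromFile() provide.
public class VectorSnapshot {

    static void save(Path file, Map<String, float[]> vectors) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashMap<>(vectors)); // HashMap and float[] are Serializable
        }
    }

    @SuppressWarnings("unchecked")
    static Map<String, float[]> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Map<String, float[]>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("vectors", ".bin");
        save(file, Map.of("doc-0", new float[] {0.1f, 0.2f, 0.3f}));
        float[] restored = load(file).get("doc-0");
        System.out.println(Arrays.toString(restored)); // prints "[0.1, 0.2, 0.3]"
    }
}
```

The same caveat from the pitfalls above applies to any snapshot approach: data written between the last snapshot and a crash is lost.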

For production systems with large datasets or distributed architectures, consider using persistent embedding stores like:

  • PostgreSQL with pgvector
  • Pinecone
  • Weaviate
  • Qdrant
  • Milvus
  • Other vector databases

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j@1.11.0
