tessl/maven-dev-langchain4j--langchain4j-easy-rag

Zero-configuration RAG package that bundles document parsing, embedding, and splitting for easy Retrieval-Augmented Generation in Java applications

Storage Types API

Storage interfaces and implementations for embedding stores.

EmbeddingStore

Interface for storing and searching embeddings.

package dev.langchain4j.store.embedding;

public interface EmbeddingStore<Embedded> {
    // Add embeddings
    String add(Embedding embedding)
    void add(String id, Embedding embedding)
    String add(Embedding embedding, Embedded embedded)
    List<String> addAll(List<Embedding> embeddings)

    // Search
    EmbeddingSearchResult<Embedded> search(EmbeddingSearchRequest request)
}

Type Parameter:

  • Embedded - Type of object embedded (typically TextSegment)

Add Methods:

  • add(embedding) - Add embedding, returns generated ID
  • add(id, embedding) - Add with specific ID (no return value)
  • add(embedding, embedded) - Add embedding with associated object, returns ID
  • addAll(embeddings) - Add multiple embeddings, returns list of IDs

Search:

  • search(request) - Search for similar embeddings

Example:

EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

// Add embedding with object
TextSegment segment = TextSegment.from("Some text");
Embedding embedding = embeddingModel.embed(segment).content();
String id = store.add(embedding, segment);

// Search
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(5)
    .build();

EmbeddingSearchResult<TextSegment> result = store.search(request);

InMemoryEmbeddingStore

In-memory implementation of EmbeddingStore. Useful for development, testing, and small datasets.

package dev.langchain4j.store.embedding.inmemory;

public class InMemoryEmbeddingStore<Embedded> implements EmbeddingStore<Embedded> {
    // Constructors
    public InMemoryEmbeddingStore()
    public InMemoryEmbeddingStore(Collection<Entry<Embedded>> entries)

    // Add methods
    public String add(Embedding embedding)
    public void add(String id, Embedding embedding)
    public String add(Embedding embedding, Embedded embedded)
    public void add(String id, Embedding embedding, Embedded embedded)
    public List<String> addAll(List<Embedding> embeddings)
    public void addAll(List<String> ids, List<Embedding> embeddings, List<Embedded> embedded)

    // Remove methods
    public void removeAll(Collection<String> ids)
    public void removeAll(Filter filter)
    public void removeAll()

    // Search
    public EmbeddingSearchResult<Embedded> search(EmbeddingSearchRequest request)

    // Persistence
    public String serializeToJson()
    public void serializeToFile(Path filePath)
    public void serializeToFile(String filePath)
    public static InMemoryEmbeddingStore<TextSegment> fromJson(String json)
    public static InMemoryEmbeddingStore<TextSegment> fromFile(Path filePath)
    public static InMemoryEmbeddingStore<TextSegment> fromFile(String filePath)

    // Merge
    public static <Embedded> InMemoryEmbeddingStore<Embedded> merge(
        Collection<InMemoryEmbeddingStore<Embedded>> stores
    )
    public static <Embedded> InMemoryEmbeddingStore<Embedded> merge(
        InMemoryEmbeddingStore<Embedded> first,
        InMemoryEmbeddingStore<Embedded> second
    )

    // Utility
    public int size()
    public boolean isEmpty()
}

Constructors:

  • InMemoryEmbeddingStore() - Create empty store
  • InMemoryEmbeddingStore(entries) - Create from existing entries

Add Methods:

  • add(embedding) - Add embedding, returns generated ID
  • add(id, embedding) - Add with specific ID
  • add(embedding, embedded) - Add with associated object, returns ID
  • add(id, embedding, embedded) - Add with specific ID and object
  • addAll(embeddings) - Add multiple, returns IDs
  • addAll(ids, embeddings, embedded) - Add multiple with IDs and objects

Remove Methods:

  • removeAll(ids) - Remove by IDs
  • removeAll(filter) - Remove by metadata filter
  • removeAll() - Clear all embeddings

Search:

  • search(request) - Find similar embeddings

Persistence:

  • serializeToJson() - Export to JSON string
  • serializeToFile(path) - Save to file
  • fromJson(json) - Load from JSON string
  • fromFile(path) - Load from file

Merge:

  • merge(stores) - Merge multiple stores
  • merge(first, second) - Merge two stores

Utility:

  • size() - Number of embeddings
  • isEmpty() - Check if empty

Example:

// Create and populate
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

TextSegment segment = TextSegment.from("Example text");
Embedding embedding = embeddingModel.embed(segment).content();
store.add(embedding, segment);

// Persist to file
store.serializeToFile("embeddings.json");

// Load from file
InMemoryEmbeddingStore<TextSegment> loadedStore =
    InMemoryEmbeddingStore.fromFile("embeddings.json");

// Check size
System.out.println("Store contains " + loadedStore.size() + " embeddings");

// Clear
store.removeAll();

EmbeddingSearchRequest

Request parameters for embedding search.

package dev.langchain4j.store.embedding;

public class EmbeddingSearchRequest {
    // Constructor
    public EmbeddingSearchRequest(
        Embedding queryEmbedding,
        Integer maxResults,
        Double minScore,
        Filter filter
    )

    // Builder
    public static EmbeddingSearchRequestBuilder builder()

    // Getters
    public Embedding queryEmbedding()
    public int maxResults()
    public double minScore()
    public Filter filter()
}

Constructor:

  • EmbeddingSearchRequest(queryEmbedding, maxResults, minScore, filter) - Create a request with all four parameters

Builder:

  • builder() - Create builder for fluent configuration

Getters:

  • queryEmbedding() - The query embedding vector
  • maxResults() - Maximum results to return
  • minScore() - Minimum similarity score threshold
  • filter() - Metadata filter

Example:

import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

Embedding queryEmbedding = embeddingModel.embed("search query").content();

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.7)
    .filter(metadataKey("category").isEqualTo("technical"))
    .build();

EmbeddingSearchResult<TextSegment> result = store.search(request);
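
To make the roles of maxResults and minScore concrete, the following plain-Java sketch applies both limits to a list of already-scored candidates. It is an illustration only, not LangChain4j code: the class SearchLimitsSketch, the record Candidate, and the method select are hypothetical names, and treating minScore as an inclusive threshold is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; these names are hypothetical and not part of LangChain4j.
class SearchLimitsSketch {

    // A candidate that has already been scored against the query.
    record Candidate(String id, double score) {}

    // Keep candidates scoring at least minScore (assumed inclusive),
    // order them best-first, and return at most maxResults of them.
    static List<Candidate> select(List<Candidate> candidates, int maxResults, double minScore) {
        List<Candidate> hits = new ArrayList<>();
        for (Candidate c : candidates) {
            if (c.score() >= minScore) {
                hits.add(c);
            }
        }
        // Sort by descending score.
        hits.sort((x, y) -> Double.compare(y.score(), x.score()));
        return new ArrayList<>(hits.subList(0, Math.min(maxResults, hits.size())));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
            new Candidate("a", 0.9),
            new Candidate("b", 0.75),
            new Candidate("c", 0.6),
            new Candidate("d", 0.95));
        // Three candidates pass the 0.7 threshold; only the best two are returned.
        System.out.println(select(candidates, 2, 0.7));
    }
}
```

The threshold is applied before the cap, so a high minScore can return fewer than maxResults matches, or none at all.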

EmbeddingSearchRequestBuilder

Builder for EmbeddingSearchRequest.

package dev.langchain4j.store.embedding;

public interface EmbeddingSearchRequestBuilder {
    EmbeddingSearchRequestBuilder queryEmbedding(Embedding queryEmbedding)
    EmbeddingSearchRequestBuilder maxResults(Integer maxResults)
    EmbeddingSearchRequestBuilder minScore(Double minScore)
    EmbeddingSearchRequestBuilder filter(Filter filter)
    EmbeddingSearchRequest build()
}

Methods:

  • queryEmbedding(embedding) - Set query vector (required)
  • maxResults(max) - Set max results
  • minScore(min) - Set minimum score threshold
  • filter(filter) - Set metadata filter
  • build() - Build the request

EmbeddingSearchResult

Result of embedding search containing matches.

package dev.langchain4j.store.embedding;

public class EmbeddingSearchResult<Embedded> {
    // Constructor
    public EmbeddingSearchResult(List<EmbeddingMatch<Embedded>> matches)

    // Methods
    public List<EmbeddingMatch<Embedded>> matches()
}

Constructor:

  • EmbeddingSearchResult(matches) - Create with match list

Methods:

  • matches() - Get list of matches (sorted by score, highest first)

Example:

EmbeddingSearchResult<TextSegment> result = store.search(request);

for (EmbeddingMatch<TextSegment> match : result.matches()) {
    System.out.println("Score: " + match.score());
    System.out.println("Text: " + match.embedded().text());
}

EmbeddingMatch

Single match from embedding search.

package dev.langchain4j.store.embedding;

public class EmbeddingMatch<Embedded> {
    // Methods
    public double score()
    public String embeddingId()
    public Embedding embedding()
    public Embedded embedded()
}

Methods:

  • score() - Similarity score (0.0-1.0, higher is more similar)
  • embeddingId() - ID of the embedding in the store
  • embedding() - The embedding vector
  • embedded() - The associated object (e.g., TextSegment)

Example:

EmbeddingSearchResult<TextSegment> result = store.search(request);

for (EmbeddingMatch<TextSegment> match : result.matches()) {
    double score = match.score();
    String id = match.embeddingId();
    TextSegment segment = match.embedded();

    if (score > 0.8) {
        System.out.println("High confidence match:");
        System.out.println("  ID: " + id);
        System.out.println("  Score: " + score);
        System.out.println("  Text: " + segment.text());
    }
}
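
One way to read the 0.0-1.0 range of score() is as a cosine similarity rescaled from [-1, 1] to [0, 1]. The sketch below shows that mapping in plain Java. The formula is an assumption about how the score is derived, other stores may compute it differently, and the class and method names are hypothetical.

```java
// Illustrative only; class and method names are hypothetical, not LangChain4j API.
class RelevanceScoreSketch {

    // Cosine similarity of two equal-length, non-zero vectors; the result lies in [-1, 1].
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Assumed rescaling of cosine similarity from [-1, 1] to [0, 1].
    static double relevance(double cosineSimilarity) {
        return (cosineSimilarity + 1) / 2;
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        // Same direction, orthogonal, and opposite direction, in that order.
        System.out.println(relevance(cosine(query, new float[] {1f, 0f})));
        System.out.println(relevance(cosine(query, new float[] {0f, 1f})));
        System.out.println(relevance(cosine(query, new float[] {-1f, 0f})));
    }
}
```

Under this assumption, a threshold such as 0.8 corresponds to a cosine similarity of 0.6, which is worth keeping in mind when choosing minScore.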

Entry

Entry type for InMemoryEmbeddingStore initialization.

package dev.langchain4j.store.embedding.inmemory;

public class Entry<Embedded> {
    // Constructor
    public Entry(String id, Embedding embedding, Embedded embedded)

    // Methods
    public String id()
    public Embedding embedding()
    public Embedded embedded()
}

Constructor:

  • Entry(id, embedding, embedded) - Create entry with all fields

Methods:

  • id() - Get entry ID
  • embedding() - Get embedding vector
  • embedded() - Get associated object

Example:

import dev.langchain4j.store.embedding.inmemory.Entry;

// Create entries
List<Entry<TextSegment>> entries = new ArrayList<>();
entries.add(new Entry<>("id1", embedding1, segment1));
entries.add(new Entry<>("id2", embedding2, segment2));

// Initialize store with entries
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>(entries);

Filter

Interface for filtering embeddings by metadata.

package dev.langchain4j.store.embedding.filter;

public interface Filter {
    // Test if object matches filter
    boolean test(Object object)

    // Combine filters
    default Filter and(Filter filter)
    static Filter and(Filter left, Filter right)
    default Filter or(Filter filter)
    static Filter or(Filter left, Filter right)
    static Filter not(Filter expression)
}

Test:

  • test(object) - Check if object matches filter

Combinators:

  • and(filter) - Combine with AND logic
  • or(filter) - Combine with OR logic
  • not(filter) - Negate filter

Example:

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

// Single condition
Filter categoryFilter = metadataKey("category").isEqualTo("technical");

// Combined conditions
Filter complexFilter = metadataKey("category").isEqualTo("technical")
    .and(metadataKey("language").isEqualTo("en"))
    .and(metadataKey("version").isGreaterThan(2.0));

// Use in search
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .filter(complexFilter)
    .build();
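
The and, or, and not combinators compose the same way as predicates in the Java standard library. The following sketch models a metadata filter as a java.util.function.Predicate over a plain metadata map to show how a combined condition evaluates. It is an analogy, not the Filter implementation: the class FilterCombinatorSketch and the helper keyEquals are hypothetical, and the "status" key is invented for illustration.

```java
import java.util.Map;
import java.util.function.Predicate;

// Illustrative analogy only; these names are hypothetical and not part of LangChain4j.
class FilterCombinatorSketch {

    // Matches when the metadata value stored under key equals expected.
    static Predicate<Map<String, Object>> keyEquals(String key, Object expected) {
        return metadata -> expected.equals(metadata.get(key));
    }

    // category == "technical" AND language == "en" AND NOT status == "draft"
    static Predicate<Map<String, Object>> technicalEnglishNotDraft() {
        return keyEquals("category", "technical")
            .and(keyEquals("language", "en"))
            .and(Predicate.not(keyEquals("status", "draft")));
    }

    public static void main(String[] args) {
        Predicate<Map<String, Object>> filter = technicalEnglishNotDraft();
        Map<String, Object> published =
            Map.of("category", "technical", "language", "en", "status", "final");
        Map<String, Object> draft =
            Map.of("category", "technical", "language", "en", "status", "draft");
        System.out.println(filter.test(published));
        System.out.println(filter.test(draft));
    }
}
```

Each combinator returns a new predicate, so conditions can be built up step by step and reused across searches and removals.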

Usage Patterns

Basic In-Memory Store

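import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

// Store is ready to use with EmbeddingStoreIngestor
// (documents is a List<Document> loaded beforehand)
EmbeddingStoreIngestor.ingest(documents, store);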

Persistent Store

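// Save after ingestion. The variable must be typed as InMemoryEmbeddingStore,
// because serializeToFile is not declared on the EmbeddingStore interface.
store.serializeToFile("knowledge-base.json");

// Load later, for example after a restart
InMemoryEmbeddingStore<TextSegment> loadedStore =
    InMemoryEmbeddingStore.fromFile("knowledge-base.json");

// Use immediately
ContentRetriever retriever = EmbeddingStoreContentRetriever.from(loadedStore);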

Merging Stores

// Ingest different document sets
InMemoryEmbeddingStore<TextSegment> techDocs = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(technicalDocuments, techDocs);

InMemoryEmbeddingStore<TextSegment> userGuides = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(guideDocuments, userGuides);

// Merge into single store
InMemoryEmbeddingStore<TextSegment> allDocs =
    InMemoryEmbeddingStore.merge(techDocs, userGuides);

Manual Search

import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;

// Embed query
Embedding queryEmbedding = embeddingModel.embed("How to configure?").content();

// Create search request
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(5)
    .minScore(0.7)
    .build();

// Execute search
EmbeddingSearchResult<TextSegment> result = store.search(request);

// Process results
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    System.out.println(match.score() + ": " + match.embedded().text());
}

Filtered Search

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

// Search only in specific category
Filter filter = metadataKey("category").isEqualTo("api-docs");

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .filter(filter)
    .maxResults(5)
    .build();

EmbeddingSearchResult<TextSegment> result = store.search(request);

Removing Entries

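import java.util.List;
import dev.langchain4j.store.embedding.filter.Filter;
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

// Remove by IDs
List<String> idsToRemove = List.of("id1", "id2", "id3");
store.removeAll(idsToRemove);

// Remove by filter
Filter oldDocsFilter = metadataKey("version").isLessThan(2.0);
store.removeAll(oldDocsFilter);

// Clear everything
store.removeAll();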

Production Considerations

InMemoryEmbeddingStore Limitations:

  • All embeddings are held in memory
  • Contents are lost on application restart unless serialized to a file
  • Memory consumption grows with the number of embeddings
  • Runs on a single machine only (no distribution)

For production, consider:

  • Vector databases such as Pinecone, Weaviate, Qdrant, or Milvus
  • LangChain4j integration modules, named langchain4j-<database-name>
  • Persistent storage with indexing
  • Distributed or cloud-native deployment
  • Better search performance on large datasets

When InMemoryEmbeddingStore is sufficient:

  • The dataset is small to medium (< 100k embeddings)
  • The application is deployed on a single machine
  • Startup can absorb the time needed to reload the store from a file
  • Simplicity matters more than scalability
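
As a rough sizing aid for the limits above, the raw vector memory can be estimated as embeddings × dimension × 4 bytes, assuming each vector component is a 32-bit float. The sketch below works through that arithmetic. The 384-dimension figure is an illustrative assumption (check the dimension of the embedding model in use), and the estimate ignores segment text, metadata, and JVM object overhead, so real usage will be higher.

```java
// Illustrative estimate only; the class and method names are hypothetical.
class MemoryEstimateSketch {

    // Bytes needed for the raw float components of all vectors.
    static long vectorBytes(long embeddings, int dimension) {
        return embeddings * dimension * Float.BYTES;
    }

    public static void main(String[] args) {
        long embeddings = 100_000; // upper bound mentioned in the list above
        int dimension = 384;       // assumed embedding dimension
        long bytes = vectorBytes(embeddings, dimension);
        System.out.printf("%d bytes (about %.1f MiB)%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

With these assumptions, 100,000 embeddings need 100,000 × 384 × 4 = 153,600,000 bytes for the vectors alone, which fits comfortably in a typical JVM heap.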

Related APIs

  • Document Ingestion API - Populating embedding stores
  • Content Retrieval API - Searching embedding stores
  • Core Types - TextSegment, Embedding types
  • Quick Start - Usage examples

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-easy-rag@1.11.0