
tessl/maven-dev-langchain4j--langchain4j-chroma

LangChain4j integration for Chroma embedding store enabling storage, retrieval, and similarity search of vector embeddings with metadata filtering support for both API V1 and V2.


Search Types API

Types for embedding search requests and results.

EmbeddingSearchRequest

Request object for embedding search operations.

package dev.langchain4j.store.embedding;

public class EmbeddingSearchRequest

Builder Factory

public static EmbeddingSearchRequestBuilder builder();

Returns a builder for creating search requests.

Builder

public static class EmbeddingSearchRequestBuilder

query

public EmbeddingSearchRequestBuilder query(String query);

Sets optional query string for hybrid search.

Note: ChromaEmbeddingStore does not use this field.

Parameters:

  • query - query string

Returns: builder for chaining


queryEmbedding

public EmbeddingSearchRequestBuilder queryEmbedding(Embedding queryEmbedding);

Sets the embedding to search for (required).

Parameters:

  • queryEmbedding - embedding vector for similarity search

Required: Yes

Returns: builder for chaining


maxResults

public EmbeddingSearchRequestBuilder maxResults(Integer maxResults);

Sets maximum number of results to return.

Parameters:

  • maxResults - result limit

Default: 3

Returns: builder for chaining


minScore

public EmbeddingSearchRequestBuilder minScore(Double minScore);

Sets minimum similarity score threshold.

Parameters:

  • minScore - threshold (0.0 to 1.0)

Default: 0.0

Returns: builder for chaining


filter

public EmbeddingSearchRequestBuilder filter(Filter filter);

Sets metadata filter.

Parameters:

  • filter - metadata filter condition

Optional: Can be null

Returns: builder for chaining


build

public EmbeddingSearchRequest build();

Builds the search request.

Returns: configured search request

Throws: IllegalArgumentException if queryEmbedding is not set
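
The builder contract described above (required queryEmbedding, maxResults defaulting to 3, minScore defaulting to 0.0) can be illustrated with a small stand-in. This is a minimal sketch, not the LangChain4j source: `SearchRequestSketch` is a hypothetical class, and it uses a plain `float[]` in place of `Embedding` so that it runs without the library on the classpath.

```java
import java.util.Arrays;

/** Hypothetical stand-in (NOT the LangChain4j class): mirrors only the documented defaults and validation. */
final class SearchRequestSketch {
    final float[] queryEmbedding;
    final int maxResults;
    final double minScore;

    private SearchRequestSketch(Builder b) {
        // Documented contract: build() fails when queryEmbedding was never set.
        if (b.queryEmbedding == null) {
            throw new IllegalArgumentException("queryEmbedding must be set");
        }
        this.queryEmbedding = Arrays.copyOf(b.queryEmbedding, b.queryEmbedding.length);
        // Documented defaults: maxResults = 3, minScore = 0.0.
        this.maxResults = b.maxResults != null ? b.maxResults : 3;
        this.minScore = b.minScore != null ? b.minScore : 0.0;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private float[] queryEmbedding;
        private Integer maxResults;
        private Double minScore;

        Builder queryEmbedding(float[] v) { this.queryEmbedding = v; return this; }
        Builder maxResults(Integer n) { this.maxResults = n; return this; }
        Builder minScore(Double s) { this.minScore = s; return this; }

        // Each setter returns the builder for chaining; build() validates and applies defaults.
        SearchRequestSketch build() { return new SearchRequestSketch(this); }
    }
}
```

Deferring validation to build() lets callers set fields in any order and still get a single, predictable failure point when the required embedding is missing.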

Instance Methods

public String query();
public Embedding queryEmbedding();
public int maxResults();
public double minScore();
public Filter filter();

Getters for request parameters.

Usage Examples

Basic Search

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(10)
    .build();

With Score Threshold

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(20)
    .minScore(0.75)
    .build();

With Filter

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

Filter filter = metadataKey("category").isEqualTo("tech");

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(15)
    .minScore(0.7)
    .filter(filter)
    .build();

Complete Configuration

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.8)
    .filter(
        metadataKey("status").isEqualTo("published")
        .and(metadataKey("year").isGreaterThanOrEqualTo(2023))
    )
    .build();

EmbeddingSearchResult

Result container for embedding search.

package dev.langchain4j.store.embedding;

public class EmbeddingSearchResult<Embedded>

Methods

public List<EmbeddingMatch<Embedded>> matches();

Returns list of matching embeddings with scores.

Returns: list of matches (may be empty, never null)

Usage Examples

Basic Result Processing

EmbeddingSearchResult<TextSegment> result = store.search(request);

for (EmbeddingMatch<TextSegment> match : result.matches()) {
    double score = match.score();
    String id = match.embeddingId();
    TextSegment segment = match.embedded();
}

Check for Results

EmbeddingSearchResult<TextSegment> result = store.search(request);

if (result.matches().isEmpty()) {
    System.out.println("No matches found");
} else {
    System.out.println("Found " + result.matches().size() + " matches");
}

Extract Text Content

List<String> texts = result.matches().stream()
    .map(match -> match.embedded())
    .filter(Objects::nonNull)
    .map(TextSegment::text)
    .collect(Collectors.toList());

Filter by Score

List<EmbeddingMatch<TextSegment>> highScoreMatches =
    result.matches().stream()
        .filter(match -> match.score() > 0.85)
        .collect(Collectors.toList());

Complete Search Example

// 1. Create query embedding
Embedding queryEmbedding = embeddingModel.embed("search query").content();

// 2. Build search request
Filter filter = metadataKey("category").isEqualTo("documentation");

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.7)
    .filter(filter)
    .build();

// 3. Execute search
EmbeddingSearchResult<TextSegment> result = store.search(request);

// 4. Process results
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    System.out.println("Score: " + match.score());
    System.out.println("ID: " + match.embeddingId());

    TextSegment segment = match.embedded();
    if (segment != null) {
        System.out.println("Text: " + segment.text());

        Metadata meta = segment.metadata();
        if (meta != null && meta.containsKey("author")) {
            System.out.println("Author: " + meta.getString("author"));
        }
    }
}

Result Ordering

Results are ordered by similarity score in descending order, so whenever at least two matches are returned:

result.matches().get(0).score() >= result.matches().get(1).score()

The first match has the highest similarity score.
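
The ordering guarantee can be checked over a whole result list rather than just the first two entries. The sketch below is illustrative only: `OrderingCheck` and `isDescending` are hypothetical names, and the method takes plain scores (for example, collected from `match.score()`) so that it runs without the library.

```java
import java.util.List;

/** Hypothetical helper for verifying the documented descending order. */
final class OrderingCheck {
    /** True when each score is >= the one after it; empty and single-element lists pass. */
    static boolean isDescending(List<Double> scores) {
        for (int i = 1; i < scores.size(); i++) {
            if (scores.get(i - 1) < scores.get(i)) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this is mainly useful in tests; it also handles the empty and single-match cases, where indexing positions 0 and 1 directly would throw.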

Result Limits

The actual number of results may be less than maxResults because of:

  1. minScore threshold - matches below the threshold are excluded
  2. Available data - the collection has fewer than maxResults items
  3. Filter restrictions - the filter excludes documents that do not match

// Request 100 results
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(100)
    .minScore(0.95)  // Very high threshold
    .build();

EmbeddingSearchResult<TextSegment> result = store.search(request);
// May return < 100 results if few matches score >= 0.95
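
How minScore and maxResults combine to cap the result count can be simulated on plain scores. This is a sketch under stated assumptions, not the store's implementation: `ResultLimitSketch` and `select` are hypothetical names, and the candidate scores are made-up illustration values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical simulation of the documented limits; does not call Chroma. */
final class ResultLimitSketch {
    /** Keeps scores >= minScore, sorts them highest first, then truncates to maxResults. */
    static List<Double> select(List<Double> candidateScores, int maxResults, double minScore) {
        List<Double> kept = new ArrayList<>();
        for (double s : candidateScores) {
            if (s >= minScore) {
                kept.add(s);
            }
        }
        kept.sort(Comparator.reverseOrder());
        return kept.size() > maxResults
                ? new ArrayList<>(kept.subList(0, maxResults))
                : kept;
    }
}
```

With four candidates scoring 0.97, 0.5, 0.96 and 0.8, a request for 100 results at minScore 0.95 yields only the two scores at or above the threshold, which is why the result size should never be assumed to equal maxResults.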

Score Ranges

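ChromaEmbeddingStore returns scores in the range [0, 1]. The bands below are a rough guide; useful thresholds depend on the embedding model and the data: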

  • 1.0 - perfect match
  • 0.9-1.0 - very high similarity
  • 0.8-0.9 - high similarity
  • 0.7-0.8 - moderate similarity
  • 0.5-0.7 - low similarity
  • < 0.5 - minimal similarity
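
The bands above can be turned into a small lookup, shown here together with the usual mapping from cosine similarity to a [0, 1] score. This is a sketch with labeled assumptions: `ScoreBands` is a hypothetical class, the formula `(cosine + 1) / 2` is the one LangChain4j's `RelevanceScore.fromCosineSimilarity` uses, and treating each band's lower bound as inclusive is a choice made here because the listed ranges overlap at their edges.

```java
/** Hypothetical helper; band boundaries follow the list above, lower bound inclusive (an assumption). */
final class ScoreBands {
    /** Maps cosine similarity in [-1, 1] to a relevance score in [0, 1]. */
    static double fromCosineSimilarity(double cosine) {
        return (cosine + 1.0) / 2.0;
    }

    /** Returns the descriptive band for a score in [0, 1]. */
    static String band(double score) {
        if (score >= 1.0) return "perfect match";
        if (score >= 0.9) return "very high similarity";
        if (score >= 0.8) return "high similarity";
        if (score >= 0.7) return "moderate similarity";
        if (score >= 0.5) return "low similarity";
        return "minimal similarity";
    }
}
```

Under this mapping, two unrelated (orthogonal) vectors score 0.5 rather than 0.0, which is worth keeping in mind when choosing a minScore.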

Performance Considerations

maxResults Impact

// Faster - retrieve fewer results
.maxResults(5)

// Slower - retrieve many results
.maxResults(100)

minScore Impact

// Returns fewer matches; any speed-up depends on where the store applies the threshold
.minScore(0.8)

// No threshold - returns up to maxResults matches
.minScore(0.0)

Filter Impact

More selective filters can improve performance by narrowing the set of candidate documents:

// Selective - fewer documents to scan
.filter(metadataKey("id").isEqualTo("specific-id"))

// Broad - more documents to scan
.filter(metadataKey("year").isGreaterThan(2000))

Related APIs

  • Filter API - Metadata filtering
  • Core Types - Embedding, TextSegment, Metadata
  • ChromaEmbeddingStore - Main store class

Examples

See: Search Operations for detailed search examples.

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-chroma@1.11.0
