LangChain4j integration for the Chroma embedding store, enabling storage, retrieval, and similarity search of vector embeddings, with metadata filtering support for both API V1 and V2.
Types for embedding search requests and results.
Request object for embedding search operations.
package dev.langchain4j.store.embedding;
public class EmbeddingSearchRequest

public static EmbeddingSearchRequestBuilder builder();
Returns a builder for creating search requests.

public static class EmbeddingSearchRequestBuilder

public EmbeddingSearchRequestBuilder query(String query);
Sets optional query string for hybrid search.
Note: ChromaEmbeddingStore does not use this field.
Parameters:
query - query string
Returns: builder for chaining

public EmbeddingSearchRequestBuilder queryEmbedding(Embedding queryEmbedding);
Sets the embedding to search for (required).
Parameters:
queryEmbedding - embedding vector for similarity search
Required: Yes
Returns: builder for chaining

public EmbeddingSearchRequestBuilder maxResults(Integer maxResults);
Sets maximum number of results to return.
Parameters:
maxResults - result limit
Default: 3
Returns: builder for chaining

public EmbeddingSearchRequestBuilder minScore(Double minScore);
Sets minimum similarity score threshold.
Parameters:
minScore - threshold (0.0 to 1.0)
Default: 0.0
Returns: builder for chaining

public EmbeddingSearchRequestBuilder filter(Filter filter);
Sets metadata filter.
Parameters:
filter - metadata filter condition
Optional: Can be null
Returns: builder for chaining

public EmbeddingSearchRequest build();
Builds the search request.
Returns: configured search request
Throws: IllegalArgumentException if queryEmbedding is not set

public String query();
public Embedding queryEmbedding();
public int maxResults();
public double minScore();
public Filter filter();
Getters for request parameters.

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(10)
    .build();

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(20)
    .minScore(0.75)
    .build();

// Requires: import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;
Filter filter = metadataKey("category").isEqualTo("tech");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(15)
    .minScore(0.7)
    .filter(filter)
    .build();

// Combine two metadata conditions with and()
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.8)
    .filter(
        metadataKey("status").isEqualTo("published")
            .and(metadataKey("year").isGreaterThanOrEqualTo(2023))
    )
    .build();

Result container for embedding search.
package dev.langchain4j.store.embedding;
public class EmbeddingSearchResult<Embedded>

public List<EmbeddingMatch<Embedded>> matches();
Returns list of matching embeddings with scores.
Returns: list of matches (may be empty, never null)

EmbeddingSearchResult<TextSegment> result = store.search(request);
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    double score = match.score();
    String id = match.embeddingId();
    TextSegment segment = match.embedded();
}

EmbeddingSearchResult<TextSegment> result = store.search(request);
if (result.matches().isEmpty()) {
    System.out.println("No matches found");
} else {
    System.out.println("Found " + result.matches().size() + " matches");
}

List<String> texts = result.matches().stream()
    .map(match -> match.embedded())
    .filter(Objects::nonNull)
    .map(TextSegment::text)
    .collect(Collectors.toList());

List<EmbeddingMatch<TextSegment>> highScoreMatches =
    result.matches().stream()
        .filter(match -> match.score() > 0.85)
        .collect(Collectors.toList());

// 1. Create query embedding
Embedding queryEmbedding = embeddingModel.embed("search query").content();
// 2. Build search request
Filter filter = metadataKey("category").isEqualTo("documentation");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.7)
    .filter(filter)
    .build();
// 3. Execute search
EmbeddingSearchResult<TextSegment> result = store.search(request);
// 4. Process results
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    System.out.println("Score: " + match.score());
    System.out.println("ID: " + match.embeddingId());
    TextSegment segment = match.embedded();
    if (segment != null) {
        System.out.println("Text: " + segment.text());
        Metadata meta = segment.metadata();
        if (meta != null && meta.containsKey("author")) {
            System.out.println("Author: " + meta.getString("author"));
        }
    }
}

Results are ordered by similarity score in descending order:
result.matches().get(0).score() >= result.matches().get(1).score()
First match has highest similarity score.
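
The ordering guarantee, together with the maxResults and minScore settings, can be pictured with a small stand-alone model. The sketch below uses only the Java standard library; `RankingSketch`, `ScoredId`, and `rank` are hypothetical names invented for illustration and are not part of LangChain4j. It is not how the store is implemented, only a model of the observable behaviour: drop candidates below the threshold, sort highest score first, cap the list.

```java
import java.util.ArrayList;
import java.util.List;

public class RankingSketch {

    // Hypothetical stand-in for an embedding match: an ID plus a relevance score.
    static final class ScoredId {
        private final String id;
        private final double score;

        ScoredId(String id, double score) {
            this.id = id;
            this.score = score;
        }

        String id() { return id; }
        double score() { return score; }
    }

    // Keeps candidates scoring at least minScore, sorts them by score
    // (highest first), and returns at most maxResults of them.
    static List<ScoredId> rank(List<ScoredId> candidates, int maxResults, double minScore) {
        List<ScoredId> kept = new ArrayList<>();
        for (ScoredId candidate : candidates) {
            if (candidate.score() >= minScore) {
                kept.add(candidate);
            }
        }
        kept.sort((x, y) -> Double.compare(y.score(), x.score()));
        return new ArrayList<>(kept.subList(0, Math.min(maxResults, kept.size())));
    }

    public static void main(String[] args) {
        List<ScoredId> candidates = List.of(
                new ScoredId("a", 0.62),
                new ScoredId("b", 0.97),
                new ScoredId("c", 0.81));
        // "a" falls below the 0.7 threshold; "b" is listed before "c".
        for (ScoredId match : rank(candidates, 2, 0.7)) {
            System.out.println(match.id() + " " + match.score());
        }
    }
}
```

In this model, a strict threshold can leave the returned list shorter than the requested maximum.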

The actual number of results may be less than maxResults, for example when a high minScore excludes most candidates:
// Request 100 results
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(100)
    .minScore(0.95) // Very high threshold
    .build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
// May return < 100 results if few matches score >= 0.95

ChromaEmbeddingStore returns scores in range [0, 1].
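
One way a score can end up in [0, 1] is a linear mapping from cosine distance. The sketch below assumes the backend reports a cosine distance between 0 and 2 and that the score is computed as 1 - distance / 2. That mapping is an assumption made for illustration, and `ScoreMappingSketch` and `distanceToScore` are hypothetical names, not a documented LangChain4j API.

```java
public class ScoreMappingSketch {

    // Assumption: cosine distance lies in [0, 2] and is mapped linearly
    // onto a relevance score in [0, 1]. Illustrative only.
    static double distanceToScore(double cosineDistance) {
        return 1.0 - (cosineDistance / 2.0);
    }

    public static void main(String[] args) {
        System.out.println(distanceToScore(0.0)); // same direction: prints 1.0
        System.out.println(distanceToScore(1.0)); // orthogonal: prints 0.5
        System.out.println(distanceToScore(2.0)); // opposite direction: prints 0.0
    }
}
```

Under this assumption, a minScore of 0.75 corresponds to a cosine distance of at most 0.5.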

// Faster - retrieve fewer results
.maxResults(5)
// Slower - retrieve many results
.maxResults(100)

// May improve performance by early termination
.minScore(0.8)
// Retrieves all available matches
.minScore(0.0)

More selective filters improve performance:
// Selective - fewer documents to scan
.filter(metadataKey("id").isEqualTo("specific-id"))
// Broad - more documents to scan
.filter(metadataKey("year").isGreaterThan(2000))

See: Search Operations for detailed search examples.
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-chroma