LangChain4j integration for Chroma embedding store enabling storage, retrieval, and similarity search of vector embeddings with metadata filtering support for both API V1 and V2.
Types for embedding search requests and results.
Request object for embedding search operations.
package dev.langchain4j.store.embedding;
public class EmbeddingSearchRequest

public static EmbeddingSearchRequestBuilder builder();
Returns a builder for creating search requests.

public static class EmbeddingSearchRequestBuilder

public EmbeddingSearchRequestBuilder query(String query);
Sets an optional query string for hybrid search.
Note: ChromaEmbeddingStore does not use this field.
Parameters:
query - query string
Returns: builder for chaining

public EmbeddingSearchRequestBuilder queryEmbedding(Embedding queryEmbedding);
Sets the embedding to search for (required).
Parameters:
queryEmbedding - embedding vector for similarity search
Required: Yes
Returns: builder for chaining

public EmbeddingSearchRequestBuilder maxResults(Integer maxResults);
Sets the maximum number of results to return.
Parameters:
maxResults - result limit
Default: 3
Returns: builder for chaining

public EmbeddingSearchRequestBuilder minScore(Double minScore);
Sets the minimum similarity score threshold.
Parameters:
minScore - threshold (0.0 to 1.0)
Default: 0.0
Returns: builder for chaining

public EmbeddingSearchRequestBuilder filter(Filter filter);
Sets the metadata filter.
Parameters:
filter - metadata filter condition
Optional: Can be null
Returns: builder for chaining

public EmbeddingSearchRequest build();
Builds the search request.
Returns: configured search request
Throws: IllegalArgumentException if queryEmbedding is not set
public String query();
public Embedding queryEmbedding();
public int maxResults();
public double minScore();
public Filter filter();
Getters for request parameters.
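The builder contract described above (one required field, documented defaults, validation at build time) can be sketched with a small stand-in class. This is a minimal illustration, not LangChain4j source code: `SearchRequestSketch` and its fields are hypothetical names, and a plain `float[]` stands in for `Embedding`.

```java
// Simplified stand-in for the documented builder contract; not LangChain4j source code.
final class SearchRequestSketch {
    final float[] queryEmbedding;
    final int maxResults;
    final double minScore;

    private SearchRequestSketch(Builder b) {
        // Mirrors the documented behavior: a missing embedding is rejected at build time.
        if (b.queryEmbedding == null) {
            throw new IllegalArgumentException("queryEmbedding cannot be null");
        }
        this.queryEmbedding = b.queryEmbedding.clone();
        // Unset optional values fall back to the documented defaults (3 and 0.0).
        this.maxResults = b.maxResults == null ? 3 : b.maxResults;
        this.minScore = b.minScore == null ? 0.0 : b.minScore;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private float[] queryEmbedding;
        private Integer maxResults;
        private Double minScore;

        Builder queryEmbedding(float[] v) { this.queryEmbedding = v; return this; }
        Builder maxResults(Integer n) { this.maxResults = n; return this; }
        Builder minScore(Double s) { this.minScore = s; return this; }

        SearchRequestSketch build() { return new SearchRequestSketch(this); }
    }
}
```

In this sketch, calling `build()` without first calling `queryEmbedding(...)` fails fast, which is the same failure mode the `Throws` entry above documents for the real builder.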
// Static import needed for the filter examples below
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

// Basic request: embedding plus a result limit
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(10)
    .build();

// Request with a minimum score threshold
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(20)
    .minScore(0.75)
    .build();

// Request with a metadata filter
Filter filter = metadataKey("category").isEqualTo("tech");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding)
    .maxResults(15)
    .minScore(0.7)
    .filter(filter)
    .build();

// Request with a combined (AND) metadata filter
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.8)
    .filter(
        metadataKey("status").isEqualTo("published")
            .and(metadataKey("year").isGreaterThanOrEqualTo(2023))
    )
    .build();

Result container for embedding search.
package dev.langchain4j.store.embedding;
public class EmbeddingSearchResult<Embedded>

public List<EmbeddingMatch<Embedded>> matches();
Returns the list of matching embeddings with scores.
Returns: list of matches (may be empty, never null)
// Iterate over matches
EmbeddingSearchResult<TextSegment> result = store.search(request);
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    double score = match.score();
    String id = match.embeddingId();
    TextSegment segment = match.embedded();
}

// Check for an empty result
EmbeddingSearchResult<TextSegment> result = store.search(request);
if (result.matches().isEmpty()) {
    System.out.println("No matches found");
} else {
    System.out.println("Found " + result.matches().size() + " matches");
}

// Extract the text of each match, skipping matches without an embedded segment
List<String> texts = result.matches().stream()
    .map(match -> match.embedded())
    .filter(Objects::nonNull)
    .map(TextSegment::text)
    .collect(Collectors.toList());

// Keep only high-scoring matches
List<EmbeddingMatch<TextSegment>> highScoreMatches =
    result.matches().stream()
        .filter(match -> match.score() > 0.85)
        .collect(Collectors.toList());

// 1. Create query embedding
Embedding queryEmbedding = embeddingModel.embed("search query").content();

// 2. Build search request
Filter filter = metadataKey("category").isEqualTo("documentation");
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.7)
    .filter(filter)
    .build();

// 3. Execute search
EmbeddingSearchResult<TextSegment> result = store.search(request);

// 4. Process results
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    System.out.println("Score: " + match.score());
    System.out.println("ID: " + match.embeddingId());
    TextSegment segment = match.embedded();
    if (segment != null) {
        System.out.println("Text: " + segment.text());
        Metadata meta = segment.metadata();
        if (meta != null && meta.containsKey("author")) {
            System.out.println("Author: " + meta.getString("author"));
        }
    }
}

Results are ordered by similarity score in descending order:

result.matches().get(0).score() >= result.matches().get(1).score()

The first match has the highest similarity score.
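The ordering guarantee can be checked for a whole result list rather than just the first two entries. The helper below is a hypothetical utility written for illustration (the name `ScoreOrder` is not part of LangChain4j); it works on plain score values so it has no library dependency.

```java
import java.util.List;

final class ScoreOrder {
    /** Returns true when every score is greater than or equal to the one that follows it. */
    static boolean isDescending(List<Double> scores) {
        for (int i = 1; i < scores.size(); i++) {
            if (scores.get(i - 1) < scores.get(i)) {
                return false;
            }
        }
        // Empty and single-element lists are trivially ordered.
        return true;
    }
}
```

Against a real result, the scores could be collected with `result.matches().stream().map(EmbeddingMatch::score).collect(Collectors.toList())` and passed to this helper, for instance in a test.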
The actual number of results may be less than maxResults, for example when the minScore threshold or the metadata filter excludes candidates, or when the store holds fewer embeddings than requested:

// Request 100 results
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(100)
    .minScore(0.95) // Very high threshold
    .build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
// May return < 100 results if few matches score >= 0.95

ChromaEmbeddingStore returns scores in the range [0, 1].
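How maxResults and minScore jointly bound the result count can be modeled on plain score values. This is a simplified sketch under the assumption that a match is returned only if it scores at or above minScore and ranks within the top maxResults; `ResultCountSketch` is a hypothetical name, and the actual store may apply these steps in a different order internally.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ResultCountSketch {
    /** Keeps scores >= minScore, orders them best-first, then caps the list at maxResults. */
    static List<Double> select(List<Double> candidateScores, int maxResults, double minScore) {
        return candidateScores.stream()
                .filter(score -> score >= minScore)   // threshold removes weak candidates
                .sorted(Comparator.reverseOrder())    // highest score first
                .limit(maxResults)                    // never more than the requested limit
                .collect(Collectors.toList());
    }
}
```

With four candidates scoring 0.99, 0.96, 0.80, and 0.50, a limit of 100 and a threshold of 0.95 leaves only two results, which is the situation the example request above describes.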
// Faster - retrieve fewer results
.maxResults(5)
// Slower - retrieve many results
.maxResults(100)

// May improve performance by early termination
.minScore(0.8)
// Retrieves all available matches
.minScore(0.0)

More selective filters improve performance:

// Selective - fewer documents to scan
.filter(metadataKey("id").isEqualTo("specific-id"))
// Broad - more documents to scan
.filter(metadataKey("year").isGreaterThan(2000))

See Search Operations for detailed search examples.
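To make "selective" versus "broad" concrete, the sketch below counts how many documents pass each kind of filter over an in-memory list. It is an illustration only: `FilterSelectivitySketch` and its helpers are hypothetical, metadata is modeled as a plain `Map`, and nothing here reflects how Chroma evaluates filters internally.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class FilterSelectivitySketch {
    /** Matches documents whose metadata value for key equals the given value. */
    static Predicate<Map<String, Object>> isEqualTo(String key, Object value) {
        return meta -> value.equals(meta.get(key));
    }

    /** Matches documents whose integer metadata value for key exceeds the given bound. */
    static Predicate<Map<String, Object>> isGreaterThan(String key, int bound) {
        return meta -> meta.get(key) instanceof Integer && (Integer) meta.get(key) > bound;
    }

    /** Counts the documents that pass the filter; a smaller count means a more selective filter. */
    static long countMatching(List<Map<String, Object>> docs, Predicate<Map<String, Object>> filter) {
        return docs.stream().filter(filter).count();
    }
}
```

An equality filter on a unique key typically passes a single document, while a range filter such as `year > 2000` can pass most of the collection, leaving more candidates for the similarity search to rank.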
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-chroma@1.11.0