Java integration for Chroma vector database providing storage, retrieval, and similarity search of embeddings. Implements LangChain4j's EmbeddingStore interface with metadata filtering support for both Chroma API V1 (0.5.16+) and V2 (0.7.0+).
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-chroma</artifactId>
<version>1.11.0</version>
</dependency>
Minimal setup:
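import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;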
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
.baseUrl("http://localhost:8000")
.collectionName("my-documents")
.build();
// Add embedding
Embedding embedding = Embedding.from(new float[]{0.1f, 0.2f, 0.3f});
String id = store.add(embedding);
// Search (queryEmbedding is the Embedding of your query, produced by your embedding model)
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(10)
.build();
EmbeddingSearchResult<TextSegment> results = store.search(request);
// Main classes
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaApiVersion;
// Core types (langchain4j-core)
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.filter.Filter;
Distance Metric: Always uses cosine distance (hnsw:space = cosine)
Score Calculation: score = 1 - (distance / 2) where distance ∈ [0, 2]
Auto-creation: Collections are automatically created if they don't exist
API Versions: V1 (Chroma 0.5.16+) and V2 (Chroma 0.7.0+); V2 adds tenant and database settings
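The score formula above can be checked by hand with a few distances. The following is a minimal sketch in plain Java, not the library's own code; the class and method names are hypothetical and exist only for illustration.

```java
// Sketch: converting a cosine distance into a relevance score.
// ChromaScoreSketch and scoreFromCosineDistance are hypothetical names, not library API.
final class ChromaScoreSketch {

    // score = 1 - (distance / 2), for a cosine distance in [0, 2]
    static double scoreFromCosineDistance(double distance) {
        if (distance < 0.0 || distance > 2.0) {
            throw new IllegalArgumentException("cosine distance must be in [0, 2]: " + distance);
        }
        return 1.0 - (distance / 2.0);
    }

    public static void main(String[] args) {
        System.out.println(scoreFromCosineDistance(0.0)); // same direction
        System.out.println(scoreFromCosineDistance(1.0)); // orthogonal vectors
        System.out.println(scoreFromCosineDistance(2.0)); // opposite direction
    }
}
```

With this mapping, a distance of 0 gives a score of 1.0, a distance of 1 gives 0.5, and a distance of 2 gives 0.0, so scores always fall in [0, 1].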
// Single with auto-generated ID
String id = store.add(embedding);
// Single with specific ID
store.add("custom-id", embedding);
// With text segment and metadata
TextSegment segment = TextSegment.from(
"document text",
new Metadata().put("author", "John").put("year", 2024)
);
String segmentId = store.add(embedding, segment);
// Batch add (efficient)
List<String> ids = store.addAll(embeddings);
See: Add Operations | Metadata Guide
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(10)
.minScore(0.7)
.build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
for (EmbeddingMatch<TextSegment> match : result.matches()) {
double score = match.score();
String id = match.embeddingId();
TextSegment segment = match.embedded();
}
See: Search Operations | Filtering Guide
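To get a feel for what a `minScore` of 0.7 means, it can help to compute the score for a pair of vectors by hand. The sketch below uses plain Java and assumes that cosine distance is defined as 1 minus cosine similarity; the class and method names are hypothetical and not part of the library.

```java
// Sketch: relating cosine similarity, cosine distance, score, and minScore.
// Assumption: cosine distance = 1 - cosine similarity. All names here are hypothetical.
final class MinScoreSketch {

    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // score = 1 - (distance / 2), which equals (1 + similarity) / 2
    static double score(float[] a, float[] b) {
        double distance = 1.0 - cosineSimilarity(a, b);
        return 1.0 - (distance / 2.0);
    }

    // A match is kept when its score is at least minScore
    static boolean passes(double score, double minScore) {
        return score >= minScore;
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        float[] orthogonal = {0f, 1f};
        double s = score(query, orthogonal);
        System.out.println(s + " passes 0.7? " + passes(s, 0.7));
    }
}
```

Under that assumption, a `minScore` of 0.7 corresponds to a cosine similarity of at least 0.4, so orthogonal vectors (score 0.5) would be filtered out.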
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.*;
Filter filter = metadataKey("author").isEqualTo("John Doe")
.and(metadataKey("year").isGreaterThanOrEqualTo(2020));
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(5)
.filter(filter)
.build();
See: Filtering Guide | Filter API
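The combined filter above keeps an entry only when both conditions hold for its metadata. As a rough mental model (not the library's implementation, which sends the filter to Chroma), the same logic can be sketched with plain Java predicates over a metadata map; every name below is hypothetical.

```java
import java.util.Map;
import java.util.function.Predicate;

// Sketch: the AND semantics of the metadata filter, modelled with java.util.function.Predicate.
// This is an illustration only; the real filter is evaluated by the Chroma server.
final class FilterSemanticsSketch {

    static Predicate<Map<String, Object>> isEqualTo(String key, Object value) {
        return metadata -> value.equals(metadata.get(key));
    }

    static Predicate<Map<String, Object>> isGreaterThanOrEqualTo(String key, int value) {
        return metadata -> {
            Object actual = metadata.get(key);
            return actual instanceof Number && ((Number) actual).intValue() >= value;
        };
    }

    // author == "John Doe" AND year >= 2020
    static boolean matches(Map<String, Object> metadata) {
        return isEqualTo("author", "John Doe")
                .and(isGreaterThanOrEqualTo("year", 2020))
                .test(metadata);
    }

    public static void main(String[] args) {
        System.out.println(matches(Map.<String, Object>of("author", "John Doe", "year", 2021)));
        System.out.println(matches(Map.<String, Object>of("author", "John Doe", "year", 2019)));
    }
}
```

In this model, an entry that lacks the `year` key does not match, because the second condition cannot be satisfied.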
// Single ID
store.remove("id-to-remove");
// Multiple IDs
store.removeAll(Arrays.asList("id1", "id2", "id3"));
// By metadata filter
store.removeAll(metadataKey("status").isEqualTo("outdated"));
// All embeddings
store.removeAll();
See: Remove Operations
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
.baseUrl("http://localhost:8000")
.collectionName("my-documents")
.timeout(Duration.ofSeconds(10))
.logRequests(true)
.logResponses(true)
.build();
// API V2 with tenant and database
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
.apiVersion(ChromaApiVersion.V2)
.baseUrl("http://localhost:8000")
.tenantName("my-tenant") // Default: "default"
.databaseName("my-database") // Default: "default"
.collectionName("my-documents")
.timeout(Duration.ofSeconds(10))
.build();
See: Configuration Guide | Builder API
Use batch operations for multiple embeddings:
// Good: Single HTTP request
List<String> ids = store.addAll(embeddings);
// Bad: N HTTP requests
for (Embedding e : embeddings) { store.add(e); }
Reuse store instances; don't create a new one per operation.
Adjust timeouts for large operations:
.timeout(Duration.ofSeconds(30))
See: Performance Guide
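For very large lists, one option is to split the input into fixed-size chunks and call `addAll` once per chunk, so that no single request outgrows the configured timeout. The helper below is a plain-Java sketch; `partition` and the chunk size are assumptions for illustration, not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: splitting a large list into chunks before batch insertion.
// BatchingSketch.partition is a hypothetical helper; choose chunkSize to suit your data.
final class BatchingSketch {

    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive: " + chunkSize);
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(partition(List.of(1, 2, 3, 4, 5), 2));
    }
}
```

Each chunk could then be passed to `store.addAll(chunk)`, which still issues far fewer requests than adding embeddings one at a time.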
Migrating from V1 to V2:
// V1 → V2: Add apiVersion and optional tenant/database
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
.apiVersion(ChromaApiVersion.V2) // Add this
.baseUrl("http://localhost:8000")
.tenantName("my-tenant") // Optional
.databaseName("my-database") // Optional
.collectionName("my-collection")
.build();
See: Migration Guide
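When deciding whether to migrate, the minimum server versions from the overview (0.5.16 for V1, 0.7.0 for V2) can be turned into a simple check. The sketch below is a hypothetical helper in plain Java; it assumes purely numeric `major.minor.patch` version strings and says nothing about whether V1 remains available on newer servers.

```java
// Sketch: suggesting an API version from a Chroma server version string.
// ApiVersionSketch and its methods are hypothetical; thresholds come from the overview above.
final class ApiVersionSketch {

    // Compares dotted numeric versions component by component; missing components count as 0
    static int compareVersions(String left, String right) {
        String[] a = left.split("\\.");
        String[] b = right.split("\\.");
        int length = Math.max(a.length, b.length);
        for (int i = 0; i < length; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    static String suggestApiVersion(String chromaServerVersion) {
        if (compareVersions(chromaServerVersion, "0.7.0") >= 0) {
            return "V2";
        }
        if (compareVersions(chromaServerVersion, "0.5.16") >= 0) {
            return "V1";
        }
        return "UNSUPPORTED";
    }

    public static void main(String[] args) {
        System.out.println(suggestApiVersion("0.6.3"));
        System.out.println(suggestApiVersion("0.7.0"));
    }
}
```

The returned strings mirror the `ChromaApiVersion.V1` and `ChromaApiVersion.V2` constants used in the builder examples.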
Common issues:
See: Error Handling Guide
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-chroma