LangChain4j integration for Chroma embedding store enabling storage, retrieval, and similarity search of vector embeddings with metadata filtering support for both API V1 and V2.
Java integration for Chroma vector database providing storage, retrieval, and similarity search of embeddings. Implements LangChain4j's EmbeddingStore interface with metadata filtering support for both Chroma API V1 (0.5.16+) and V2 (0.7.0+).
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-chroma</artifactId>
    <version>1.11.0</version>
</dependency>
Minimal setup:
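import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("my-documents")
    .build();
// Add embedding
Embedding embedding = Embedding.from(new float[]{0.1f, 0.2f, 0.3f});
String id = store.add(embedding);
// Search (queryEmbedding is a placeholder here; normally an embedding model produces it)
Embedding queryEmbedding = Embedding.from(new float[]{0.1f, 0.2f, 0.3f});
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .build();
EmbeddingSearchResult<TextSegment> results = store.search(request);
// Main classes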
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaApiVersion;
// Core types (langchain4j-core)
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.filter.Filter;
Distance Metric: Always uses cosine distance (hnsw:space = cosine)
Score Calculation: score = 1 - (distance / 2) where distance ∈ [0, 2]
Auto-creation: Collections are automatically created if they don't exist
API Versions: V1 (Chroma 0.5.16+) and V2 (Chroma 0.7.0+)
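The score formula above can be checked by hand with a small sketch. The code below is plain Java and does not call LangChain4j or Chroma; it assumes cosine distance is defined as 1 minus cosine similarity, and the class and method names are hypothetical, chosen only for illustration.

```java
// Illustrative sketch in plain Java; it does not call LangChain4j or Chroma.
final class CosineScoreSketch {

    /** Cosine distance = 1 - cosine similarity; assumes non-zero vectors of equal length. */
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** score = 1 - (distance / 2), mapping a distance in [0, 2] to a score in [0, 1]. */
    static double score(double distance) {
        return 1.0 - distance / 2.0;
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        System.out.println(score(cosineDistance(query, new float[]{1f, 0f})));  // same direction: 1.0
        System.out.println(score(cosineDistance(query, new float[]{0f, 1f})));  // orthogonal: 0.5
        System.out.println(score(cosineDistance(query, new float[]{-1f, 0f}))); // opposite: 0.0
    }
}
```

Under this mapping, identical directions score 1.0, orthogonal vectors score 0.5, and opposite vectors score 0.0.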
// Single with auto-generated ID
String id = store.add(embedding);
// Single with specific ID
store.add("custom-id", embedding);
// With text segment and metadata
TextSegment segment = TextSegment.from(
    "document text",
    new Metadata().put("author", "John").put("year", 2024)
);
String segmentId = store.add(embedding, segment);
// Batch add (efficient)
List<String> ids = store.addAll(embeddings);
See: Add Operations | Metadata Guide
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(10)
    .minScore(0.7)
    .build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
for (EmbeddingMatch<TextSegment> match : result.matches()) {
    double score = match.score();
    String id = match.embeddingId();
    TextSegment segment = match.embedded();
}
See: Search Operations | Filtering Guide
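To get a feel for what a threshold such as minScore(0.7) means, the stated formula score = 1 - (distance / 2) can be inverted. The sketch below is plain Java with hypothetical names; it assumes cosine distance equals 1 minus cosine similarity and says nothing about where the store applies the threshold.

```java
// Illustrative sketch: inverts score = 1 - (distance / 2). Names are hypothetical.
final class MinScoreSketch {

    /** Largest cosine distance that still reaches the given minimum score. */
    static double maxDistanceFor(double minScore) {
        requireUnitInterval(minScore);
        return 2.0 * (1.0 - minScore);
    }

    /** Smallest cosine similarity that still reaches the given minimum score. */
    static double minCosineSimilarityFor(double minScore) {
        requireUnitInterval(minScore);
        return 2.0 * minScore - 1.0;
    }

    private static void requireUnitInterval(double score) {
        if (score < 0.0 || score > 1.0) {
            throw new IllegalArgumentException("Score must be in [0, 1]: " + score);
        }
    }

    public static void main(String[] args) {
        // For minScore = 0.7: distance of at most about 0.6, similarity of at least about 0.4.
        System.out.printf("max distance: %.2f%n", maxDistanceFor(0.7));
        System.out.printf("min similarity: %.2f%n", minCosineSimilarityFor(0.7));
    }
}
```

A score threshold of 0.7 therefore corresponds to a cosine similarity of roughly 0.4 or higher, which is looser than the number might suggest at first glance.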
import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.*;
Filter filter = metadataKey("author").isEqualTo("John Doe")
    .and(metadataKey("year").isGreaterThanOrEqualTo(2020));
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(5)
    .filter(filter)
    .build();
See: Filtering Guide | Filter API
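The intent of the filter above (author equals "John Doe" AND year is at least 2020) can be illustrated with an ordinary in-memory predicate. This is only a sketch of the matching logic in plain Java; it is not how the store evaluates filters, the names are hypothetical, and treating a missing or non-numeric key as a non-match is an assumption made for this example.

```java
import java.util.Map;
import java.util.function.Predicate;

// Illustrative sketch of the filter's intent; not the LangChain4j Filter API.
final class FilterSemanticsSketch {

    static final Predicate<Map<String, Object>> AUTHOR_IS_JOHN_DOE =
            metadata -> "John Doe".equals(metadata.get("author"));

    // Assumption for this sketch: a missing or non-numeric "year" does not match.
    static final Predicate<Map<String, Object>> YEAR_AT_LEAST_2020 =
            metadata -> metadata.get("year") instanceof Number
                    && ((Number) metadata.get("year")).intValue() >= 2020;

    /** Both conditions must hold, mirroring the .and(...) combination. */
    static boolean matches(Map<String, Object> metadata) {
        return AUTHOR_IS_JOHN_DOE.and(YEAR_AT_LEAST_2020).test(metadata);
    }

    public static void main(String[] args) {
        System.out.println(matches(Map.of("author", "John Doe", "year", 2021))); // true
        System.out.println(matches(Map.of("author", "John Doe", "year", 2019))); // false
        System.out.println(matches(Map.of("author", "Jane Roe", "year", 2021))); // false
    }
}
```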
// Single ID
store.remove("id-to-remove");
// Multiple IDs
store.removeAll(Arrays.asList("id1", "id2", "id3"));
// By metadata filter
store.removeAll(metadataKey("status").isEqualTo("outdated"));
// All embeddings
store.removeAll();
See: Remove Operations
// Basic configuration
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("my-documents")
    .timeout(Duration.ofSeconds(10))
    .logRequests(true)
    .logResponses(true)
    .build();
// API V2 with tenant and database
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("my-tenant") // Default: "default"
    .databaseName("my-database") // Default: "default"
    .collectionName("my-documents")
    .timeout(Duration.ofSeconds(10))
    .build();
See: Configuration Guide | Builder API
Use batch operations for multiple embeddings:
// Good: Single HTTP request
List<String> ids = store.addAll(embeddings);
// Bad: N HTTP requests
for (Embedding e : embeddings) { store.add(e); }
Reuse store instances; don't create a new one per operation.
Adjust timeouts for large operations:
.timeout(Duration.ofSeconds(30))
See: Performance Guide
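When a list is very large, one option besides raising the timeout is to split it into a few fixed-size batches and pass each batch to addAll. The helper below is a hypothetical sketch in plain Java; the batch size is an arbitrary illustration, and nothing here is part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: splits a list into consecutive batches. Names are hypothetical.
final class BatchingSketch {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        // Ten items in batches of four yield three batches of sizes 4, 4 and 2.
        for (List<Integer> batch : partition(items, 4)) {
            System.out.println(batch);
        }
    }
}
```

With a list of embeddings, each batch would then go through store.addAll(batch), which keeps the request count far below one request per embedding.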
Migrating from V1 to V2:
// V1 → V2: Add apiVersion and optional tenant/database
ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2) // Add this
    .baseUrl("http://localhost:8000")
    .tenantName("my-tenant") // Optional
    .databaseName("my-database") // Optional
    .collectionName("my-collection")
    .build();
See: Migration Guide
Common issues:
See: Error Handling Guide
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-chroma@1.11.0