
tessl/maven-dev-langchain4j--langchain4j-chroma

LangChain4j integration for Chroma embedding store enabling storage, retrieval, and similarity search of vector embeddings with metadata filtering support for both API V1 and V2.



Configuration Guide

Comprehensive guide to configuring ChromaEmbeddingStore.

Installation

Maven

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-chroma</artifactId>
    <version>1.11.0</version>
</dependency>

Gradle

implementation 'dev.langchain4j:langchain4j-chroma:1.11.0'

Basic Configuration

Minimal Setup (V1)

ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .build();

Defaults:

  • API version: V1
  • Collection: "default"
  • Timeout: 5 seconds
  • Logging: disabled

Minimal Setup (V2)

ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .build();

Defaults:

  • Tenant: "default"
  • Database: "default"
  • Collection: "default"
  • Timeout: 5 seconds

API Version Selection

V1 API (Default)

For Chroma 0.5.16+ with flat collection structure:

ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V1)  // Optional: V1 is the default
    .baseUrl("http://localhost:8000")
    .collectionName("my-documents")
    .build();

V2 API

For Chroma 0.7.0+ with hierarchical structure:

ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("my-tenant")
    .databaseName("my-database")
    .collectionName("my-documents")
    .build();

Configuration Parameters

baseUrl (Required)

// Local development
.baseUrl("http://localhost:8000")

// Remote server
.baseUrl("https://chroma.example.com")

// Custom port
.baseUrl("http://chroma-server:9000")

Required: yes. There is no default, and build() fails without it.
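Because baseUrl has no default, it can help to check the value before it reaches the builder. The following is a minimal sketch using only the JDK; `BaseUrlCheck` and `requireHttpUrl` are hypothetical names, not part of langchain4j-chroma, and the rule applied (absolute http or https URL with a host) is an assumption about what a usable base URL looks like.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of langchain4j-chroma.
final class BaseUrlCheck {

    private BaseUrlCheck() {}

    /** Returns the trimmed URL if it is an absolute http(s) URL with a host; throws otherwise. */
    static String requireHttpUrl(String value) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException("baseUrl must not be blank");
        }
        String trimmed = value.trim();
        URI uri;
        try {
            uri = new URI(trimmed);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("baseUrl is not a valid URI: " + trimmed, e);
        }
        String scheme = uri.getScheme();
        boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
        // "localhost:8000" parses with scheme "localhost" and no host, so it is rejected here.
        if (!http || uri.getHost() == null) {
            throw new IllegalArgumentException(
                "baseUrl must look like http(s)://host[:port], got: " + trimmed);
        }
        return trimmed;
    }
}
```

A caller could then write `.baseUrl(BaseUrlCheck.requireHttpUrl(rawUrl))`, so that a missing scheme is reported with a clear message instead of surfacing later as a connection error.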

collectionName

// Example names (set one per store)
.collectionName("documents")
.collectionName("user-embeddings")
.collectionName("knowledge-base")

Default: "default"

Naming:

  • Use descriptive names
  • Lowercase recommended
  • Hyphens or underscores for multi-word names
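The naming suggestions above can be applied mechanically when collection names come from user input or display titles. This sketch only implements the conventions listed in this guide (lowercase, hyphens between words); `CollectionNames` is a hypothetical helper, and it does not attempt to enforce any additional rules the Chroma server itself may apply to names.

```java
import java.util.Locale;

// Hypothetical helper, not part of langchain4j-chroma.
final class CollectionNames {

    private CollectionNames() {}

    /** Lowercases the name and joins whitespace-separated words with hyphens. */
    static String normalize(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalArgumentException("collection name must not be blank");
        }
        return raw.trim()
                .toLowerCase(Locale.ROOT)   // locale-independent lowercasing
                .replaceAll("\\s+", "-");   // runs of whitespace become a single hyphen
    }
}
```

For example, `CollectionNames.normalize("Knowledge Base")` yields `knowledge-base`, while names that already use underscores are left as they are.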

tenantName (V2 Only)

// Example names (set one per store)
.tenantName("production")
.tenantName("customer-123")
.tenantName("team-alpha")

Default: "default"

Use cases:

  • Multi-tenancy isolation
  • Customer segregation
  • Environment separation

databaseName (V2 Only)

// Example names (set one per store)
.databaseName("main")
.databaseName("analytics")
.databaseName("staging")

Default: "default"

Use cases:

  • Logical data separation
  • Application modules
  • Data lifecycle stages

timeout

import java.time.Duration;

// Short timeout for fast operations
.timeout(Duration.ofSeconds(5))   // Default

// Medium timeout
.timeout(Duration.ofSeconds(15))

// Long timeout for large batches
.timeout(Duration.ofSeconds(30))
// ...or longer still
.timeout(Duration.ofMinutes(1))

Default: 5 seconds

Guidelines:

  • Local server: 5-10 seconds
  • Remote server: 10-20 seconds
  • Large batches: 30-60 seconds
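One way to keep these guidelines in a single place is a small enum that maps each scenario to a timeout. The sketch below is hypothetical (`TimeoutProfile` is not a library type) and simply picks the upper bound of each range listed above; adjust the values to your own measurements.

```java
import java.time.Duration;

// Hypothetical enum; the constants mirror the upper bounds of the guideline ranges.
enum TimeoutProfile {
    LOCAL(Duration.ofSeconds(10)),
    REMOTE(Duration.ofSeconds(20)),
    LARGE_BATCH(Duration.ofSeconds(60));

    private final Duration timeout;

    TimeoutProfile(Duration timeout) {
        this.timeout = timeout;
    }

    /** Suggested timeout for this scenario. */
    Duration timeout() {
        return timeout;
    }
}
```

The result plugs directly into the builder, for example `.timeout(TimeoutProfile.REMOTE.timeout())`.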

logRequests

.logRequests(true)   // Enable request logging
.logRequests(false)  // Disable (default)

Default: false

Use for:

  • Debugging API issues
  • Understanding request format
  • Development and testing

logResponses

.logResponses(true)   // Enable response logging
.logResponses(false)  // Disable (default)

Default: false

Use for:

  • Debugging response issues
  • Understanding response format
  • Development and testing

Environment-Specific Configurations

Development

ChromaEmbeddingStore devStore = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("dev-test")
    .timeout(Duration.ofSeconds(60))  // Generous timeout
    .logRequests(true)                // Debug info
    .logResponses(true)
    .build();

Staging

ChromaEmbeddingStore stagingStore = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("https://chroma-staging.example.com")
    .tenantName("staging")
    .databaseName("main")
    .collectionName("embeddings")
    .timeout(Duration.ofSeconds(20))
    .logRequests(false)
    .logResponses(false)
    .build();

Production

ChromaEmbeddingStore prodStore = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("https://chroma.example.com")
    .tenantName("production")
    .databaseName("primary")
    .collectionName("embeddings")
    .timeout(Duration.ofSeconds(15))
    .logRequests(false)               // Disable in prod
    .logResponses(false)
    .build();

Configuration from Properties

Using Spring Configuration

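import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import java.time.Duration;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChromaConfig {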

    @Value("${chroma.base-url}")
    private String baseUrl;

    @Value("${chroma.collection-name}")
    private String collectionName;

    @Value("${chroma.timeout:10}")
    private int timeoutSeconds;

    @Bean
    public ChromaEmbeddingStore chromaEmbeddingStore() {
        return ChromaEmbeddingStore.builder()
            .baseUrl(baseUrl)
            .collectionName(collectionName)
            .timeout(Duration.ofSeconds(timeoutSeconds))
            .build();
    }
}

application.properties:

chroma.base-url=http://localhost:8000
chroma.collection-name=documents
chroma.timeout=15
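The same property keys can also be read without Spring. The sketch below uses `java.util.Properties` and is an illustration only: `ChromaProperties` is a hypothetical holder class, the fallback of 10 seconds mirrors the `@Value` default above, and falling back to the collection name "default" is an assumption taken from the store's documented default.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical settings holder, not part of langchain4j-chroma or Spring.
final class ChromaProperties {
    final String baseUrl;
    final String collectionName;
    final Duration timeout;

    private ChromaProperties(String baseUrl, String collectionName, Duration timeout) {
        this.baseUrl = baseUrl;
        this.collectionName = collectionName;
        this.timeout = timeout;
    }

    /** Reads chroma.base-url (required), chroma.collection-name and chroma.timeout (seconds). */
    static ChromaProperties from(Properties props) {
        String baseUrl = props.getProperty("chroma.base-url");
        if (baseUrl == null || baseUrl.isBlank()) {
            throw new IllegalStateException("chroma.base-url is required");
        }
        String collection = props.getProperty("chroma.collection-name", "default").trim();
        long seconds = Long.parseLong(props.getProperty("chroma.timeout", "10").trim());
        return new ChromaProperties(baseUrl.trim(), collection, Duration.ofSeconds(seconds));
    }
}
```

The `Properties` object can be filled with `Properties.load(...)` from a file reader, and the three fields then passed to `.baseUrl(...)`, `.collectionName(...)` and `.timeout(...)`.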

Using Environment Variables

import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import java.time.Duration;

public class ChromaConfig {
    public static ChromaEmbeddingStore createStore() {
        String baseUrl = System.getenv().getOrDefault(
            "CHROMA_BASE_URL",
            "http://localhost:8000"
        );

        String collectionName = System.getenv().getOrDefault(
            "CHROMA_COLLECTION",
            "default"
        );

        int timeout = Integer.parseInt(
            System.getenv().getOrDefault("CHROMA_TIMEOUT", "10")
        );

        return ChromaEmbeddingStore.builder()
            .baseUrl(baseUrl)
            .collectionName(collectionName)
            .timeout(Duration.ofSeconds(timeout))
            .build();
    }
}
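Note that `Integer.parseInt` in the example above throws a bare `NumberFormatException` if `CHROMA_TIMEOUT` holds something like `15s`. A slightly more defensive variant might look like the following sketch; `EnvTimeout` is a hypothetical helper, and it takes the environment as a `Map` so it can be exercised without changing real environment variables.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical helper, not part of langchain4j-chroma.
final class EnvTimeout {

    private EnvTimeout() {}

    /** Reads CHROMA_TIMEOUT (whole seconds); returns the fallback when the variable is unset. */
    static Duration timeoutFrom(Map<String, String> env, Duration fallback) {
        String raw = env.get("CHROMA_TIMEOUT");
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        long seconds;
        try {
            seconds = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "CHROMA_TIMEOUT must be a whole number of seconds, got: " + raw, e);
        }
        if (seconds <= 0) {
            throw new IllegalArgumentException("CHROMA_TIMEOUT must be positive, got: " + raw);
        }
        return Duration.ofSeconds(seconds);
    }
}
```

In application code this could be called as `EnvTimeout.timeoutFrom(System.getenv(), Duration.ofSeconds(10))`. Failing fast on a malformed value is a design choice here; silently using the fallback would hide a misconfiguration.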

Multi-Store Configuration

Multiple Collections

// User documents
ChromaEmbeddingStore userDocs = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("user-documents")
    .build();

// System documentation
ChromaEmbeddingStore sysDocs = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("system-docs")
    .build();

Multi-Tenant Setup (V2)

// Tenant 1
ChromaEmbeddingStore tenant1 = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("customer-1")
    .databaseName("production")
    .collectionName("embeddings")
    .build();

// Tenant 2
ChromaEmbeddingStore tenant2 = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("customer-2")
    .databaseName("production")
    .collectionName("embeddings")
    .build();
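When the set of tenants is not known up front, one option is to create each tenant's store on first use and cache it. The sketch below is generic over the store type so it stands alone; `TenantStoreRegistry` is a hypothetical class, and the factory function is where a `ChromaEmbeddingStore.builder()` call with `.tenantName(tenant)` would go.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry, not part of langchain4j-chroma.
final class TenantStoreRegistry<S> {

    private final Map<String, S> stores = new ConcurrentHashMap<>();
    private final Function<String, S> factory;

    /** @param factory builds a store for a given tenant name; called at most once per tenant */
    TenantStoreRegistry(Function<String, S> factory) {
        this.factory = Objects.requireNonNull(factory, "factory");
    }

    /** Returns the cached store for the tenant, creating it on first access. */
    S forTenant(String tenantName) {
        // computeIfAbsent on ConcurrentHashMap runs the factory atomically per key.
        return stores.computeIfAbsent(tenantName, factory);
    }
}
```

Repeated calls such as `registry.forTenant("customer-1")` then return the same instance, which keeps to the one-store-per-collection practice recommended in this guide.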

Multiple Databases (V2)

// Production database
ChromaEmbeddingStore prod = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("my-tenant")
    .databaseName("production")
    .collectionName("embeddings")
    .build();

// Analytics database
ChromaEmbeddingStore analytics = ChromaEmbeddingStore.builder()
    .apiVersion(ChromaApiVersion.V2)
    .baseUrl("http://localhost:8000")
    .tenantName("my-tenant")
    .databaseName("analytics")
    .collectionName("embeddings")
    .build();

Advanced Configuration

Connection Pooling

ChromaEmbeddingStore reuses HTTP connections automatically. Create one instance and reuse it:

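import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;

// GOOD - Create once, reuse many times
public class EmbeddingService {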
    private final ChromaEmbeddingStore store;

    public EmbeddingService() {
        this.store = ChromaEmbeddingStore.builder()
            .baseUrl("http://localhost:8000")
            .build();
    }

    public void addDocument(Embedding emb, TextSegment seg) {
        store.add(emb, seg);
    }

    public EmbeddingSearchResult<TextSegment> search(Embedding query) {
        return store.search(
            EmbeddingSearchRequest.builder()
                .queryEmbedding(query)
                .maxResults(10)
                .build()
        );
    }
}
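If building the store eagerly in a constructor is undesirable (for instance, when the Chroma server may not be up at application start), the same create-once behaviour can be had lazily. The following is a sketch of a thread-safe memoizing holder; `Lazy` is a hypothetical utility class, not something provided by langchain4j.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical utility, not part of langchain4j-chroma.
final class Lazy<T> implements Supplier<T> {

    private final Supplier<T> initializer;
    private volatile T value; // volatile makes double-checked locking safe

    Lazy(Supplier<T> initializer) {
        this.initializer = Objects.requireNonNull(initializer, "initializer");
    }

    /** Runs the initializer on the first call only; later calls return the cached value. */
    @Override
    public T get() {
        T result = value;
        if (result == null) {
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = Objects.requireNonNull(initializer.get(), "initializer returned null");
                    value = result;
                }
            }
        }
        return result;
    }
}
```

A service could hold a `Lazy<ChromaEmbeddingStore>` field whose initializer wraps the builder call, and use `store.get()` wherever the store is needed.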

Factory Pattern

public class ChromaStoreFactory {

    public static ChromaEmbeddingStore createStore(String environment) {
        return switch (environment.toLowerCase()) {
            case "dev" -> ChromaEmbeddingStore.builder()
                .baseUrl("http://localhost:8000")
                .collectionName("dev")
                .timeout(Duration.ofSeconds(60))
                .logRequests(true)
                .build();

            case "staging" -> ChromaEmbeddingStore.builder()
                .baseUrl("https://chroma-staging.example.com")
                .collectionName("staging")
                .timeout(Duration.ofSeconds(20))
                .build();

            case "prod" -> ChromaEmbeddingStore.builder()
                .baseUrl("https://chroma.example.com")
                .collectionName("production")
                .timeout(Duration.ofSeconds(15))
                .build();

            default -> throw new IllegalArgumentException(
                "Unknown environment: " + environment
            );
        };
    }
}

Configuration Validation

Validation on Build

try {
    ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
        // .baseUrl(...) // Missing - will throw
        .build();
} catch (IllegalArgumentException e) {
    System.err.println("Configuration error: " + e.getMessage());
}

Testing Configuration

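import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertNotNull;

// JUnit 5 is assumed here; adjust the imports for JUnit 4.
public class ChromaConfigTest {

    @Test
    public void testConnection() {
        // Requires a Chroma server listening on localhost:8000
        ChromaEmbeddingStore store = ChromaEmbeddingStore.builder()
            .baseUrl("http://localhost:8000")
            .collectionName("test")
            .build();

        // Round-trip a tiny embedding; any exception fails the test
        // instead of being caught and printed
        Embedding testEmb = Embedding.from(new float[]{1.0f, 0.0f, 0.0f});
        String id = store.add(testEmb);
        assertNotNull(id);
        store.remove(id);
    }
}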

Troubleshooting

Connection Refused

// Check baseUrl is correct
.baseUrl("http://localhost:8000")  // Not https, not missing port

// Ensure Chroma server is running:
// docker run -p 8000:8000 chromadb/chroma
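To tell a wrong baseUrl apart from a server that is simply not running, a plain TCP probe against the configured host and port can be useful before involving the store at all. This is a JDK-only sketch; `PortProbe` is a hypothetical helper, and a successful connect only shows that something is listening, not that it is Chroma.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;
import java.time.Duration;

// Hypothetical diagnostic helper, not part of langchain4j-chroma.
final class PortProbe {

    private PortProbe() {}

    /** Explicit port if present, otherwise 443 for https and 80 for anything else. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** Returns true if a TCP connection to the URL's host and port succeeds within the timeout. */
    static boolean isReachable(String baseUrl, Duration timeout) {
        URI uri = URI.create(baseUrl);
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("No host in baseUrl: " + baseUrl);
        }
        try (Socket socket = new Socket()) {
            // A connect timeout of 0 would mean "wait indefinitely", so pass a positive duration.
            socket.connect(new InetSocketAddress(uri.getHost(), effectivePort(uri)),
                    (int) timeout.toMillis());
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For example, `PortProbe.isReachable("http://localhost:8000", Duration.ofSeconds(2))` returning false suggests starting the container shown above or rechecking the port mapping.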

Timeout Errors

// Increase timeout for slow operations
.timeout(Duration.ofSeconds(30))

// Or check network connectivity

Collection Not Found

ChromaEmbeddingStore is expected to create a missing collection automatically. If the error persists:

  • Check Chroma server logs
  • Verify API version compatibility
  • Ensure Chroma server has write permissions

Best Practices

  1. Single instance per collection - Create once, reuse
  2. Use builders - Don't use deprecated constructors
  3. Externalize configuration - Use properties or env vars
  4. Set appropriate timeouts - Based on operation size
  5. Disable logging in production - Request and response logs cost throughput and can expose document content
  6. Use V2 for new projects - Tenants and databases give collections a clearer hierarchy
  7. Test configuration early - Catch connection issues immediately

Related

  • Builder API - Complete API reference
  • ChromaApiVersion - Version selection
  • Migration Guide - Upgrading to V2

External Resources

  • Chroma Installation Guide
  • Docker Compose Setup
