tessl/maven-dev-langchain4j--langchain4j-core

Core classes and interfaces of LangChain4j providing foundational abstractions for LLM interaction, RAG, embeddings, agents, and observability


docs/models/embedding-models.md

Embedding Models

Package: dev.langchain4j.model.embedding Thread-Safety: Implementation-dependent, typically thread-safe Primary Interfaces: EmbeddingModel, DimensionAwareEmbeddingModel

Embedding models convert text into dense vector representations (embeddings) for semantic similarity search, RAG systems, and clustering.

Core Interfaces

EmbeddingModel

package dev.langchain4j.model.embedding;

import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.output.Response;

import java.util.List;

/**
 * Embedding model interface for generating vector embeddings
 * Thread-Safety: Implementation-dependent, typically thread-safe
 * Performance: ALWAYS prefer embedAll() for multiple texts
 */
public interface EmbeddingModel {
    /**
     * Embed single text string
     * Use only for one-off operations
     * @param text Text to embed (non-null)
     * @return Response with embedding
     */
    Response<Embedding> embed(String text);

    /**
     * Embed single text segment with metadata
     * @param textSegment Text segment (non-null)
     * @return Response with embedding
     */
    Response<Embedding> embed(TextSegment textSegment);

    /**
     * Embed multiple text segments (RECOMMENDED)
     * 10-100x faster than individual embed() calls
     * @param textSegments List of text segments (non-null)
     * @return Response with list of embeddings (same order as input)
     */
    Response<List<Embedding>> embedAll(List<TextSegment> textSegments);
}

DimensionAwareEmbeddingModel

package dev.langchain4j.model.embedding;

/**
 * Embedding model that exposes dimensionality
 * Useful for configuring vector stores
 */
public interface DimensionAwareEmbeddingModel extends EmbeddingModel {
    /**
     * Get embedding dimension count
     * @return Number of dimensions in embeddings (e.g., 1536, 768)
     */
    int dimension();
}

Data Types

Embedding

package dev.langchain4j.data.embedding;

/**
 * Vector representation of text
 * Immutability: Immutable, thread-safe
 */
public class Embedding {
    private final float[] vector;

    /**
     * Get embedding vector
     * @return Float array of embedding values
     */
    public float[] vector() { /* ... */ }

    /**
     * Compute cosine similarity with another embedding
     * Both embeddings must come from the same model
     * @param other Other embedding (non-null)
     * @return Similarity score in [-1.0, 1.0] (higher = more similar)
     */
    public double cosineSimilarity(Embedding other) { /* ... */ }

    /**
     * Normalize embedding vector (L2 normalization)
     * Needed when a vector store compares by dot product;
     * cosine similarity already accounts for vector magnitude
     * @return Normalized embedding
     */
    public Embedding normalize() { /* ... */ }
}
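For reference, cosine similarity is the dot product of the two vectors divided by the product of their magnitudes. A plain-Java sketch of the math (an illustration, not the library's implementation):

```java
/** Sketch of the math behind Embedding.cosineSimilarity() (illustration only). */
public class CosineDemo {
    static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];      // dot product
            normA += a[i] * a[i];    // squared magnitude of a
            normB += b[i] * b[i];    // squared magnitude of b
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```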

TextSegment

package dev.langchain4j.data.segment;

import dev.langchain4j.data.document.Metadata;

/**
 * Text chunk with optional metadata
 * Immutability: Immutable, thread-safe
 */
public class TextSegment {
    private final String text;
    private final Metadata metadata;

    public static TextSegment from(String text) { /* ... */ }
    public static TextSegment from(String text, Metadata metadata) { /* ... */ }

    public String text() { /* ... */ }
    public Metadata metadata() { /* ... */ }
}

Usage Examples

Basic Embedding

import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.output.Response;

// Initialize from provider-specific module
EmbeddingModel model = /* OpenAiEmbeddingModel, etc. */;

// Single text embedding
Response<Embedding> response = model.embed("This is a test sentence");
Embedding embedding = response.content();
float[] vector = embedding.vector();

System.out.println("Dimensions: " + vector.length);  // e.g., 1536

Batch Embedding (RECOMMENDED)

import dev.langchain4j.data.segment.TextSegment;
import java.util.List;
import java.util.ArrayList;

// Create text segments
List<TextSegment> segments = new ArrayList<>();
segments.add(TextSegment.from("First document"));
segments.add(TextSegment.from("Second document"));
segments.add(TextSegment.from("Third document"));

// Batch embedding (10-100x faster than individual calls)
Response<List<Embedding>> response = model.embedAll(segments);
List<Embedding> embeddings = response.content();

System.out.println("Generated " + embeddings.size() + " embeddings");
for (int i = 0; i < embeddings.size(); i++) {
    Embedding emb = embeddings.get(i);
    System.out.println("Document " + i + ": " + emb.vector().length + " dimensions");
}

Computing Similarity

// Embed two texts
Embedding emb1 = model.embed("Machine learning is fascinating").content();
Embedding emb2 = model.embed("AI and ML are interesting topics").content();
Embedding emb3 = model.embed("I like pizza and pasta").content();

// Compute cosine similarity
double similarity12 = emb1.cosineSimilarity(emb2);
double similarity13 = emb1.cosineSimilarity(emb3);

System.out.println("Similarity 1-2: " + similarity12);  // High (~0.8-0.9)
System.out.println("Similarity 1-3: " + similarity13);  // Low (~0.3-0.5)

Checking Dimensions

import dev.langchain4j.model.embedding.DimensionAwareEmbeddingModel;

if (model instanceof DimensionAwareEmbeddingModel) {
    DimensionAwareEmbeddingModel dimModel = (DimensionAwareEmbeddingModel) model;
    int dimensions = dimModel.dimension();
    System.out.println("Embedding dimensions: " + dimensions);

    // Use dimensions to configure vector store
    configureVectorStore(dimensions);
}

Embedding with Metadata

import dev.langchain4j.data.document.Metadata;
import java.util.Map;

// Create segments with metadata
Metadata metadata1 = Metadata.from(Map.of(
    "source", "documentation",
    "category", "technical",
    "page", 1
));

TextSegment segment = TextSegment.from("Technical documentation text", metadata1);
Response<Embedding> response = model.embed(segment);

// Metadata is preserved with segment, not with embedding
// Store both in vector database for filtering

Performance Best Practices

1. ALWAYS Use Batch Operations

// ❌ BAD: Individual calls in loop (VERY SLOW)
List<Embedding> embeddings = new ArrayList<>();
for (String text : texts) {
    Embedding emb = model.embed(text).content();
    embeddings.add(emb);
}

// ✅ GOOD: Single batch call (10-100x FASTER)
List<TextSegment> segments = texts.stream()
    .map(TextSegment::from)
    .toList();
Response<List<Embedding>> response = model.embedAll(segments);
List<Embedding> embeddings = response.content();

2. Optimal Batch Sizes

// Most models handle 100-1000 items efficiently
int BATCH_SIZE = 100;

List<List<TextSegment>> batches = partition(allSegments, BATCH_SIZE);
List<Embedding> allEmbeddings = new ArrayList<>();

for (List<TextSegment> batch : batches) {
    Response<List<Embedding>> response = model.embedAll(batch);
    allEmbeddings.addAll(response.content());
}
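The partition() call above is not part of langchain4j-core; a minimal stdlib sketch of such a helper (name and shape are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: split a list into consecutive batches of at most batchSize items. */
public class Batching {
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // subList is a view; copy it if the source list may change later
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```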

3. Handle Rate Limits

import dev.langchain4j.exception.RateLimitException;

for (List<TextSegment> batch : batches) {
    int attempts = 0;
    while (true) {
        try {
            allEmbeddings.addAll(model.embedAll(batch).content());
            break;  // batch succeeded, move on
        } catch (RateLimitException e) {
            if (++attempts > 5) throw e;  // give up after 5 retries
            try {
                Thread.sleep(1000L << attempts);  // exponential backoff: 2s, 4s, 8s...
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(ie);
            }
        }
    }
}

4. Normalize for Cosine Similarity

// ✅ GOOD: Normalize before computing similarity
Embedding normalized1 = emb1.normalize();
Embedding normalized2 = emb2.normalize();
double similarity = normalized1.cosineSimilarity(normalized2);

// Most models produce already-normalized embeddings,
// but normalizing again is safe and ensures accuracy

Common Use Cases

1. Semantic Search

// Embed documents
List<TextSegment> documents = loadDocuments();
Response<List<Embedding>> docResponse = model.embedAll(documents);
List<Embedding> docEmbeddings = docResponse.content();

// Embed query
String query = "How to configure authentication?";
Embedding queryEmbedding = model.embed(query).content();

// Find most similar documents
List<ScoredDocument> results = new ArrayList<>();
for (int i = 0; i < docEmbeddings.size(); i++) {
    double score = queryEmbedding.cosineSimilarity(docEmbeddings.get(i));
    if (score > 0.7) {  // Threshold
        results.add(new ScoredDocument(documents.get(i), score));
    }
}

// Sort by score descending
results.sort((a, b) -> Double.compare(b.score(), a.score()));
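ScoredDocument above is not a langchain4j type; a minimal generic record for illustration (name and fields are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pairing of a document with its similarity score. */
record ScoredDocument<T>(T document, double score) {

    /** Sort highest-scoring documents first, as in the search example above. */
    static <T> void sortDescending(List<ScoredDocument<T>> results) {
        results.sort((a, b) -> Double.compare(b.score(), a.score()));
    }
}
```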

2. Clustering Documents

// Embed all documents
List<Embedding> embeddings = model.embedAll(documents).content();

// Compute similarity matrix
double[][] similarities = new double[embeddings.size()][embeddings.size()];
for (int i = 0; i < embeddings.size(); i++) {
    for (int j = i + 1; j < embeddings.size(); j++) {
        double sim = embeddings.get(i).cosineSimilarity(embeddings.get(j));
        similarities[i][j] = sim;
        similarities[j][i] = sim;
    }
}

// Apply clustering algorithm (e.g., k-means, hierarchical)
List<Cluster> clusters = clusterBySimilarity(similarities);
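clusterBySimilarity() above is a placeholder; one possible stand-in is greedy single-link clustering over the similarity matrix (a sketch under that assumption, not a langchain4j API):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class SimilarityClustering {
    /** Greedy single-link clustering: items i and j share a cluster when
     *  similarities[i][j] exceeds the threshold, directly or transitively. */
    static List<List<Integer>> clusterBySimilarity(double[][] similarities, double threshold) {
        int n = similarities.length;
        int[] assigned = new int[n];
        Arrays.fill(assigned, -1);
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (assigned[i] != -1) continue;        // already placed in a cluster
            List<Integer> members = new ArrayList<>();
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(i);
            assigned[i] = clusters.size();
            while (!stack.isEmpty()) {              // flood-fill over similar neighbors
                int cur = stack.pop();
                members.add(cur);
                for (int j = 0; j < n; j++) {
                    if (assigned[j] == -1 && similarities[cur][j] > threshold) {
                        assigned[j] = clusters.size();
                        stack.push(j);
                    }
                }
            }
            clusters.add(members);
        }
        return clusters;
    }
}
```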

3. Duplicate Detection

// Embed all texts
List<Embedding> embeddings = model.embedAll(segments).content();

// Find near-duplicates (high similarity)
double DUPLICATE_THRESHOLD = 0.95;
Set<Integer> duplicateIndices = new HashSet<>();

for (int i = 0; i < embeddings.size(); i++) {
    for (int j = i + 1; j < embeddings.size(); j++) {
        double sim = embeddings.get(i).cosineSimilarity(embeddings.get(j));
        if (sim > DUPLICATE_THRESHOLD) {
            duplicateIndices.add(j);  // Mark j as duplicate
        }
    }
}

Common Pitfalls

  • Using individual embed() in a loop: use embedAll(), 10-100x faster
  • Not normalizing vectors: call normalize() when the store compares by dot product
  • Mixing embeddings from different models: each model has different dimensions and semantics
  • Ignoring rate limits: implement retry with exponential backoff
  • Empty or null text segments: validate inputs before embedding
  • Wrong similarity metric: use cosine similarity, not raw Euclidean distance
  • Comparing raw vector arrays: use the cosineSimilarity() method

Model-Specific Considerations

OpenAI Embeddings

  • Dimensions: 1536 (text-embedding-ada-002), 3072 (text-embedding-3-large)
  • Max Input: ~8,000 tokens per text
  • Normalized: Yes, already normalized
  • Batch Size: Up to 2,048 texts per request

Vertex AI (Google)

  • Dimensions: 768 (textembedding-gecko)
  • Max Input: ~3,000 tokens
  • Normalized: Yes
  • Batch Size: Check quota limits

Azure OpenAI

  • Dimensions: Same as OpenAI
  • Rate Limits: Based on deployment configuration
  • Normalized: Yes

See Also

  • Embeddings and Vector Search - Embedding stores and search
  • Documents and Segments - Text segmentation
  • RAG System - Using embeddings for retrieval

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-core
