
tessl/maven-dev-langchain4j--langchain4j-hugging-face

LangChain4j integration library for Hugging Face inference capabilities including chat, language, and embedding models

Embedding Model API

Complete API reference for HuggingFaceEmbeddingModel.

Overview

Generates vector embeddings for text using Hugging Face embedding models. Extends DimensionAwareEmbeddingModel from langchain4j-core.

Package: dev.langchain4j.model.huggingface
Status: ✅ Active (not deprecated)
Type hierarchy: EmbeddingModel (interface), DimensionAwareEmbeddingModel (base class)

Class Signature

package dev.langchain4j.model.huggingface;

public class HuggingFaceEmbeddingModel
    extends dev.langchain4j.model.embedding.DimensionAwareEmbeddingModel {

    // Construction
    public static HuggingFaceEmbeddingModelBuilder builder();
    public static HuggingFaceEmbeddingModel withAccessToken(String accessToken);

    // Embedding operations
    public dev.langchain4j.model.output.Response<dev.langchain4j.data.embedding.Embedding>
        embed(String text);

    public dev.langchain4j.model.output.Response<dev.langchain4j.data.embedding.Embedding>
        embed(dev.langchain4j.data.segment.TextSegment textSegment);

    public dev.langchain4j.model.output.Response<java.util.List<dev.langchain4j.data.embedding.Embedding>>
        embedAll(java.util.List<dev.langchain4j.data.segment.TextSegment> textSegments);

    // Model information
    public int dimension();
    public String modelName();

    // Listeners
    public dev.langchain4j.model.embedding.EmbeddingModel
        addListener(dev.langchain4j.model.embedding.EmbeddingModelListener listener);

    public dev.langchain4j.model.embedding.EmbeddingModel
        addListeners(java.util.List<dev.langchain4j.model.embedding.EmbeddingModelListener> listeners);
}

Builder API

public static class HuggingFaceEmbeddingModelBuilder {

    public HuggingFaceEmbeddingModelBuilder();

    public HuggingFaceEmbeddingModelBuilder baseUrl(String baseUrl);
    public HuggingFaceEmbeddingModelBuilder accessToken(String accessToken);
    public HuggingFaceEmbeddingModelBuilder modelId(String modelId);
    public HuggingFaceEmbeddingModelBuilder waitForModel(Boolean waitForModel);
    public HuggingFaceEmbeddingModelBuilder timeout(java.time.Duration timeout);

    public HuggingFaceEmbeddingModel build();
    public String toString();
}

Construction

builder()

Creates a new builder for configuring the model.

public static HuggingFaceEmbeddingModelBuilder builder()

Returns: New builder instance

Example:

HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(System.getenv("HF_API_KEY"))
    .modelId("sentence-transformers/all-MiniLM-L6-v2")
    .build();

withAccessToken()

Creates a model from an access token alone, using the defaults for every other setting.

public static HuggingFaceEmbeddingModel withAccessToken(String accessToken)

Parameters:

  • accessToken - Hugging Face API access token

Returns: Configured model with default settings

Example:

HuggingFaceEmbeddingModel model =
    HuggingFaceEmbeddingModel.withAccessToken(System.getenv("HF_API_KEY"));

Embedding Operations

embed(String)

Generates an embedding for a single text string.

public dev.langchain4j.model.output.Response<dev.langchain4j.data.embedding.Embedding>
    embed(String text)

Parameters:

  • text - Text to embed

Returns: Response containing the embedding

Throws: RuntimeException on API errors

Example:

Response<Embedding> response = model.embed("Hello world");
Embedding embedding = response.content();
float[] vector = embedding.vector();
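
Once the raw float[] vector is available, a common next step is to compare two vectors. The following is a minimal sketch of cosine similarity in plain Java, assuming both vectors come from the same model and therefore have the same length. The class name CosineSketch is hypothetical and not part of the library.

```java
final class CosineSketch {

    // Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        // A zero vector has no direction; this sketch treats its similarity as 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] x = {1f, 0f};
        float[] y = {0f, 1f};
        System.out.println(cosine(x, x)); // 1.0 (identical direction)
        System.out.println(cosine(x, y)); // 0.0 (orthogonal)
    }
}
```

With real embeddings, the inputs would be the arrays returned by embedding.vector() for two different texts.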

embed(TextSegment)

Generates an embedding for a text segment.

public dev.langchain4j.model.output.Response<dev.langchain4j.data.embedding.Embedding>
    embed(dev.langchain4j.data.segment.TextSegment textSegment)

Parameters:

  • textSegment - Text segment to embed

Returns: Response containing the embedding

Throws: RuntimeException on API errors

Example:

TextSegment segment = TextSegment.from("Hello world");
Response<Embedding> response = model.embed(segment);

embedAll()

Generates embeddings for multiple text segments in a single API call.

public dev.langchain4j.model.output.Response<java.util.List<dev.langchain4j.data.embedding.Embedding>>
    embedAll(java.util.List<dev.langchain4j.data.segment.TextSegment> textSegments)

Parameters:

  • textSegments - List of text segments to embed

Returns: Response containing list of embeddings (same order as input)

Throws: RuntimeException on API errors

Performance: More efficient than repeated embed() calls, because the whole list is sent in one request instead of one request per text

Example:

List<TextSegment> segments = List.of(
    TextSegment.from("Text 1"),
    TextSegment.from("Text 2")
);
Response<List<Embedding>> response = model.embedAll(segments);
List<Embedding> embeddings = response.content();
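
For very large inputs it can be useful to split the list into smaller batches and call embedAll() once per batch, so that a single request stays within the configured timeout. The sketch below shows only the splitting step in plain Java. The class name BatchSketch and the batch size are assumptions chosen for illustration; this reference does not state any batch limit.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSketch {

    // Splits items into consecutive sublists of at most batchSize elements,
    // preserving the original order.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("a", "b", "c", "d", "e");
        // Each batch would be converted to TextSegments and passed to embedAll().
        System.out.println(partition(texts, 2)); // [[a, b], [c, d], [e]]
    }
}
```

Because embedAll() returns embeddings in input order, concatenating the per-batch results in batch order keeps them aligned with the original list.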

Model Information

dimension()

Returns the dimension of embeddings produced by this model.

public int dimension()

Returns: Embedding vector dimension (e.g., 384, 768)

Note: The dimension is determined by an embedding call the first time it is needed and cached afterwards, so the first call may incur an API request

Example:

int dim = model.dimension();
System.out.println("Model produces " + dim + "-dimensional embeddings");
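
The caching described in the note can be pictured as a memoized lookup. The sketch below illustrates that pattern only and is not the library's source: CachedDimension and its probe supplier are hypothetical names, with the supplier standing in for "embed a sample text and read the vector length".

```java
import java.util.function.IntSupplier;

final class CachedDimension {

    private final IntSupplier probe; // stand-in for an expensive embedding call
    private Integer cached;          // null until the first lookup

    CachedDimension(IntSupplier probe) {
        this.probe = probe;
    }

    // Runs the probe at most once and serves later calls from the cache.
    synchronized int dimension() {
        if (cached == null) {
            cached = probe.getAsInt();
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] probeCalls = {0};
        CachedDimension dim = new CachedDimension(() -> {
            probeCalls[0]++;
            return 384;
        });
        dim.dimension();
        dim.dimension();
        System.out.println(probeCalls[0]); // 1 (second call used the cache)
    }
}
```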

modelName()

Returns the name of the model.

public String modelName()

Returns: Model name, defaults to "unknown" if not specified

Example:

String name = model.modelName();

Listeners

addListener()

Adds a listener to monitor embedding operations.

public dev.langchain4j.model.embedding.EmbeddingModel
    addListener(dev.langchain4j.model.embedding.EmbeddingModelListener listener)

Parameters:

  • listener - Listener to monitor operations

Returns: This model instance (for chaining)

Example:

EmbeddingModelListener listener = new EmbeddingModelListener() {
    @Override
    public void onRequest(EmbeddingRequest request) {
        System.out.println("Embedding " + request.texts().size() + " texts");
    }

    @Override
    public void onResponse(EmbeddingResponse response) {
        System.out.println("Got " + response.embeddings().size() + " embeddings");
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
};

model.addListener(listener);

addListeners()

Adds multiple listeners at once.

public dev.langchain4j.model.embedding.EmbeddingModel
    addListeners(java.util.List<dev.langchain4j.model.embedding.EmbeddingModelListener> listeners)

Parameters:

  • listeners - List of listeners

Returns: This model instance (for chaining)

Constructors

Public constructors are available but builders are recommended.

Constructor (Standard)

public HuggingFaceEmbeddingModel(
    String accessToken,
    String modelId,
    Boolean waitForModel,
    java.time.Duration timeout
)

Constructor (Custom Base URL)

public HuggingFaceEmbeddingModel(
    String baseUrl,
    String accessToken,
    String modelId,
    Boolean waitForModel,
    java.time.Duration timeout
)

Recommendation: Use builder pattern instead of constructors.

Configuration Reference

See Configuration Guide for detailed configuration options.

Key Parameters:

  • accessToken (required) - Hugging Face API key
  • modelId (recommended) - Model identifier
  • baseUrl (optional) - Custom API endpoint
  • timeout (optional) - Request timeout (default: 15s)
  • waitForModel (optional) - Wait for the model to finish loading instead of failing immediately (default: true)

Recommended Models

| Model | Dimension | Speed | Quality | Use Case |
| --- | --- | --- | --- | --- |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast | Good | General purpose |
| sentence-transformers/all-mpnet-base-v2 | 768 | Medium | High | High quality |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 384 | Fast | Good | Multilingual |
| BAAI/bge-small-en-v1.5 | 384 | Fast | Good | English retrieval |
| BAAI/bge-base-en-v1.5 | 768 | Medium | High | English retrieval |
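
The dimension column also determines storage cost. As a rough worked example, assuming each component is stored as a 4-byte float and ignoring any index or metadata overhead, one million vectors need about 1.5 GB at 384 dimensions and about 3.1 GB at 768. The class name EmbeddingStorageSketch is hypothetical.

```java
final class EmbeddingStorageSketch {

    // Raw size in bytes of vectorCount embeddings stored as 4-byte floats.
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    static long rawBytes(long vectorCount, int dimension) {
        long components = Math.multiplyExact(vectorCount, (long) dimension);
        return Math.multiplyExact(components, (long) Float.BYTES);
    }

    public static void main(String[] args) {
        System.out.println(rawBytes(1_000_000L, 384)); // 1536000000
        System.out.println(rawBytes(1_000_000L, 768)); // 3072000000
    }
}
```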

Error Handling

All embedding methods throw RuntimeException for API errors.

Error Format: "status code: <code>; body: <body>"

Common Error Codes:

  • 401 - Invalid or missing access token
  • 404 - Model not found
  • 429 - Rate limiting
  • 503 - Model loading or unavailable
  • Timeout - Request exceeded timeout duration

Example:

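try {
    Response<Embedding> response = model.embed("text");
} catch (RuntimeException e) {
    // getMessage() can be null for exceptions that carry no message
    String msg = e.getMessage() == null ? "" : e.getMessage();
    // Match the documented "status code: <code>" prefix so that digits
    // in the response body are not mistaken for the status
    if (msg.contains("status code: 401")) {
        // Invalid or missing access token
    } else if (msg.contains("status code: 404")) {
        // Model not found
    } else if (msg.contains("status code: 429")) {
        // Rate limited
    } else if (msg.contains("status code: 503")) {
        // Model loading or unavailable
    } else {
        // Timeout or other failure: rethrow rather than swallow it
        throw e;
    }
}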

See Error Handling Guide for detailed error scenarios.
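
Given the documented error format, the status code can be extracted once and then compared as a number. The sketch below is one way to do that in plain Java, assuming the message really begins with "status code: <code>" as stated above. StatusCodeSketch and its methods are hypothetical helpers, and treating 429 and 503 as the retryable codes is an assumption of this sketch.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StatusCodeSketch {

    // Matches the "status code: <code>" part of the documented error format.
    private static final Pattern STATUS = Pattern.compile("status code: (\\d{3})");

    // Returns the HTTP status found in the message, or empty if there is none
    // (for example a timeout, or an exception without a message).
    static OptionalInt statusCode(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = STATUS.matcher(message);
        return matcher.find()
            ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
            : OptionalInt.empty();
    }

    // Assumption: rate limiting (429) and model loading (503) are transient.
    static boolean isRetryable(String message) {
        OptionalInt code = statusCode(message);
        return code.isPresent() && (code.getAsInt() == 429 || code.getAsInt() == 503);
    }

    public static void main(String[] args) {
        String msg = "status code: 404; body: request 401 not found";
        System.out.println(statusCode(msg).getAsInt()); // 404
        System.out.println(isRetryable(msg));           // false
    }
}
```

Parsing the code this way avoids false matches when the response body happens to contain another number such as 401.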

Type Reference

Embedding

package dev.langchain4j.data.embedding;

public class Embedding {
    public float[] vector();
    public int dimension();
    public static Embedding from(float[] vector);
}

TextSegment

package dev.langchain4j.data.segment;

public class TextSegment {
    public String text();
    public static TextSegment from(String text);
}

Response<T>

package dev.langchain4j.model.output;

public class Response<T> {
    public T content();
    public static <T> Response<T> from(T content);
}

Performance Considerations

  1. Batch Processing: Use embedAll() for multiple texts so they are sent in a single API call
  2. Dimension Caching: The first call determines the dimension; subsequent calls use the cached value
  3. Timeouts: Increase the timeout (default: 15s) for slow networks or large batches
  4. Model Loading: Keep waitForModel(true) to avoid 503 failures while the model is still loading
  5. Rate Limits: The Hugging Face API applies rate limits that depend on the token tier
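
Rate limits and model loading are usually transient, so callers often wrap embedding calls in a retry with exponential backoff. The following is a minimal sketch of that idea in plain Java and is not something the library is documented to provide. RetrySketch, withRetry, and all parameter names are hypothetical, and the caller decides which exceptions count as retryable.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

final class RetrySketch {

    // Calls the supplier up to maxAttempts times, doubling the delay after
    // each retryable failure. Non-retryable failures are rethrown at once.
    static <T> T withRetry(Supplier<T> call,
                           Predicate<RuntimeException> retryable,
                           int maxAttempts,
                           long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    // Restore the interrupt flag and give up with the last error.
                    Thread.currentThread().interrupt();
                    throw e;
                }
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("status code: 503; body: loading");
            }
            return "ok";
        }, e -> String.valueOf(e.getMessage()).contains("status code: 503"), 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```

With a real model, the supplier would be something like () -> model.embed("text"), and the predicate would check the exception message for "status code: 429" or "status code: 503".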

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-hugging-face@1.11.0
