
tessl/maven-dev-langchain4j--langchain4j-hugging-face

LangChain4j integration library for Hugging Face inference capabilities including chat, language, and embedding models


Error Handling Guide

A guide to common errors, their causes, and their fixes for the LangChain4j Hugging Face integration.

Error Format

All API errors are thrown as RuntimeException with a consistent message format:

status code: <HTTP_CODE>; body: <RESPONSE_BODY>

Example:

status code: 401; body: {"error": "Invalid token"}
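
Because the status code sits at a fixed position in this format, it can be parsed out rather than searched for as a bare substring. The following is a minimal sketch, assuming the message always begins with the prefix shown above; the class and method names (HfErrorMessages, statusCode) are hypothetical and not part of LangChain4j.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of LangChain4j): extracts the HTTP status code
// from a message of the form "status code: <HTTP_CODE>; body: <RESPONSE_BODY>".
final class HfErrorMessages {
    // Anchored at the start so digits inside the response body are ignored.
    private static final Pattern STATUS = Pattern.compile("^status code: (\\d{3});");

    private HfErrorMessages() {}

    static OptionalInt statusCode(Throwable error) {
        String message = error.getMessage();
        if (message == null) {
            return OptionalInt.empty(); // Exceptions may carry no message at all
        }
        Matcher m = STATUS.matcher(message);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }
}
```

An empty result means the exception did not come from an HTTP error response in this format (for example, a timeout), so callers can branch on that case separately.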

Common Error Codes

401 Unauthorized

Cause: Invalid or missing API access token

Message: status code: 401; body: {"error": "Invalid token"}

Solutions:

  1. Verify token is correct: https://huggingface.co/settings/tokens
  2. Check token hasn't expired
  3. Ensure token has correct permissions
  4. Verify environment variable is set: echo $HF_API_KEY

Example Fix:

// ❌ Wrong
.accessToken("invalid-token")

// ✅ Correct
.accessToken(System.getenv("HF_API_KEY"))

404 Not Found

Cause: Model does not exist or is not accessible

Message: status code: 404; body: {"error": "Model not found"}

Solutions:

  1. Check model ID spelling
  2. Verify model exists on Hugging Face Hub
  3. Ensure model is public or you have access
  4. Check model type matches usage (embedding vs. generation)

Example Fix:

// ❌ Wrong
.modelId("sentence-transformers/all-MiniLM-L6-v2-typo")

// ✅ Correct
.modelId("sentence-transformers/all-MiniLM-L6-v2")

429 Too Many Requests

Cause: Rate limiting exceeded

Message: status code: 429; body: {"error": "Rate limit exceeded"}

Solutions:

  1. Reduce request frequency
  2. Implement exponential backoff
  3. Upgrade Hugging Face account tier
  4. Use batch operations (embedAll() instead of multiple embed())
  5. Implement request caching

Example Fix:

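// Retry with exponential backoff (1s, 2s, ...) when rate limited.
// Assumes an enclosing method that returns Embedding and declares
// "throws InterruptedException" (required by Thread.sleep).
int maxRetries = 3;
int attempt = 0;
while (true) {
    try {
        return model.embed(text).content();
    } catch (RuntimeException e) {
        String msg = e.getMessage();
        boolean rateLimited = msg != null && msg.contains("status code: 429");
        if (!rateLimited || attempt >= maxRetries - 1) {
            throw e; // Not a rate limit, or retries exhausted
        }
        Thread.sleep((long) Math.pow(2, attempt) * 1000);
        attempt++;
    }
}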
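
Request caching (solution 5) can be sketched as a small memoizing wrapper. This is an illustration under stated assumptions, not a library feature: CachingEmbedder is a hypothetical name, the delegate stands in for a call such as text -> model.embed(text).content(), and the cache is unbounded, so a real deployment would likely need an eviction policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical wrapper (not part of LangChain4j): memoizes results per input
// text so repeated inputs do not trigger repeated API calls.
final class CachingEmbedder<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> delegate;

    CachingEmbedder(Function<String, V> delegate) {
        this.delegate = delegate;
    }

    V embed(String text) {
        // The delegate runs only on a cache miss. If it throws, nothing is
        // cached and the exception propagates to the caller.
        return cache.computeIfAbsent(text, delegate);
    }
}
```

Note that ConcurrentHashMap.computeIfAbsent can block other updates to the same map bin while the delegate runs, which may matter when the delegate is a slow network call.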

503 Service Unavailable

Cause: Model is loading or temporarily unavailable

Message: status code: 503; body: {"error": "Model is loading"}

Solutions:

  1. Set waitForModel(true) in configuration (default)
  2. Increase timeout duration
  3. Retry after a delay
  4. Use a different model that's already loaded

Example Fix:

// ✅ Ensure waiting enabled
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(apiKey)
    .waitForModel(true)  // Wait for model to load
    .timeout(Duration.ofSeconds(60))  // Longer timeout
    .build();

Timeout Exception

Cause: Request exceeded configured timeout

Message: Varies by HTTP client (contains "timeout" or "timed out")

Solutions:

  1. Increase timeout duration
  2. Check network connectivity
  3. Use smaller models or reduce batch size
  4. Try a different API endpoint (baseUrl)

Example Fix:

import java.time.Duration;

HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(apiKey)
    .timeout(Duration.ofSeconds(60))  // Increase from default 15s
    .build();

Error Handling Patterns

Basic Try-Catch

try {
    Embedding embedding = model.embed("text").content();
    // Process embedding
} catch (RuntimeException e) {
    System.err.println("Embedding failed: " + e.getMessage());
}

Specific Error Handling

try {
    Embedding embedding = model.embed("text").content();
} catch (RuntimeException e) {
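    // getMessage() may be null, so normalize before matching
    String msg = e.getMessage() == null ? "" : e.getMessage();
    String lower = msg.toLowerCase();

    // Match the "status code: NNN" prefix so digits in the body cannot cause false positives
    if (msg.contains("status code: 401")) {
        System.err.println("Authentication failed. Check API token.");
    } else if (msg.contains("status code: 404")) {
        System.err.println("Model not found. Check model ID.");
    } else if (msg.contains("status code: 429")) {
        System.err.println("Rate limited. Reduce request frequency.");
    } else if (msg.contains("status code: 503")) {
        System.err.println("Model loading. Retry after delay.");
    } else if (lower.contains("timeout") || lower.contains("timed out")) {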
        System.err.println("Request timeout. Increase timeout setting.");
    } else {
        System.err.println("Unknown error: " + msg);
    }
}

Retry with Exponential Backoff

public Embedding embedWithRetry(String text, int maxRetries) throws InterruptedException {
    int attempt = 0;
    long delay = 1000; // Start with 1 second

    while (attempt < maxRetries) {
        try {
            return model.embed(text).content();
        } catch (RuntimeException e) {
            attempt++;

            // Check if retryable error
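            // getMessage() may be null; lower-case once for the timeout checks
            String msg = e.getMessage() == null ? "" : e.getMessage().toLowerCase();
            boolean retryable = msg.contains("status code: 429") ||
                                msg.contains("status code: 503") ||
                                msg.contains("timeout") ||
                                msg.contains("timed out");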

            if (!retryable || attempt >= maxRetries) {
                throw e;
            }

            System.err.println("Attempt " + attempt + " failed, retrying in " + delay + "ms...");
            Thread.sleep(delay);
            delay *= 2; // Exponential backoff
        }
    }

    throw new RuntimeException("Max retries exceeded");
}
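
A variation on the method above adds a delay cap and random jitter, so that many clients hitting a rate limit at once do not all retry in lockstep. This is a sketch under stated assumptions: Retry and withBackoff are hypothetical names, the caller supplies the predicate that decides which errors are retryable, and "full jitter" (sleeping a random time between zero and the current ceiling) is a technique swapped in here, not something the integration provides.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not part of LangChain4j): retries an operation with
// capped exponential backoff and full jitter.
final class Retry {
    private Retry() {}

    static <T> T withBackoff(Supplier<T> operation,
                             Predicate<RuntimeException> retryable,
                             int maxAttempts,
                             long baseDelayMillis,
                             long maxDelayMillis) {
        if (maxAttempts < 1 || baseDelayMillis < 0 || maxDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts must be >= 1 and delays >= 0");
        }
        long ceiling = Math.min(baseDelayMillis, maxDelayMillis);
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on non-retryable errors.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
            }
            // Full jitter: sleep a random duration in [0, ceiling].
            long sleepMillis = ThreadLocalRandom.current().nextLong(ceiling + 1);
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // Preserve interrupt status
                throw new IllegalStateException("Interrupted while waiting to retry", ie);
            }
            ceiling = Math.min(maxDelayMillis, ceiling * 2); // Double, up to the cap
        }
    }
}
```

A call might look like Retry.withBackoff(() -> model.embed(text).content(), e -> isRetryable(e), 3, 1000, 8000), where isRetryable is whatever check the application uses for 429, 503, and timeout errors.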

Fallback to Alternative Model

public Embedding embedWithFallback(String text) {
    try {
        // Try primary model
        return primaryModel.embed(text).content();
    } catch (RuntimeException e) {
        System.err.println("Primary model failed, using fallback: " + e.getMessage());

        try {
            // Try fallback model
            return fallbackModel.embed(text).content();
        } catch (RuntimeException e2) {
            System.err.println("Fallback also failed: " + e2.getMessage());
            throw e2;
        }
    }
}

Circuit Breaker Pattern

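import java.util.function.Supplier;

// Minimal sketch: fields are unsynchronized, so use one instance per thread
// or add synchronization before sharing it.
class CircuitBreaker {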
    private int failureCount = 0;
    private final int threshold = 5;
    private long openUntil = 0;
    private final long resetTimeout = 60000; // 1 minute

    public <T> T execute(Supplier<T> operation) {
        // Check if circuit is open
        if (openUntil > System.currentTimeMillis()) {
            throw new RuntimeException("Circuit breaker is open");
        }

        try {
            T result = operation.get();
            // Success - reset failure count
            failureCount = 0;
            return result;
        } catch (RuntimeException e) {
            failureCount++;

            // Trip circuit breaker
            if (failureCount >= threshold) {
                openUntil = System.currentTimeMillis() + resetTimeout;
                System.err.println("Circuit breaker opened");
            }

            throw e;
        }
    }
}

// Usage
CircuitBreaker breaker = new CircuitBreaker();
try {
    Embedding emb = breaker.execute(() -> model.embed("text").content());
} catch (RuntimeException e) {
    System.err.println("Operation failed: " + e.getMessage());
}

Graceful Degradation

public Optional<Embedding> embedSafely(String text) {
    try {
        return Optional.of(model.embed(text).content());
    } catch (RuntimeException e) {
        System.err.println("Embedding failed, continuing without: " + e.getMessage());
        return Optional.empty();
    }
}

// Usage
Optional<Embedding> embedding = embedSafely("text");
embedding.ifPresentOrElse(
    emb -> processEmbedding(emb),
    () -> System.out.println("Skipping this item")
);

Configuration Issues

Missing Access Token

Error: IllegalArgumentException: accessToken is required

Solution:

// ❌ Missing
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .modelId("some-model")
    .build();

// ✅ Correct
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(System.getenv("HF_API_KEY"))
    .modelId("some-model")
    .build();

Invalid Timeout

Error: IllegalArgumentException: timeout must be positive

Solution:

// ❌ Wrong
.timeout(Duration.ofSeconds(0))
.timeout(Duration.ofSeconds(-10))

// ✅ Correct
.timeout(Duration.ofSeconds(30))

Invalid Base URL

Error: IllegalArgumentException: Invalid URL format

Solution:

// ❌ Wrong
.baseUrl("not-a-url")

// ✅ Correct
.baseUrl("https://custom-endpoint.example.com/")

Network Issues

Connection Refused

Cause: Cannot connect to API endpoint

Solutions:

  1. Check internet connectivity
  2. Verify firewall settings
  3. Check proxy configuration
  4. Try a different baseUrl

DNS Resolution Failed

Cause: Cannot resolve API hostname

Solutions:

  1. Check DNS settings
  2. Try alternative DNS servers
  3. Check /etc/hosts (Unix) or C:\Windows\System32\drivers\etc\hosts (Windows)
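
To tell a DNS problem apart from other network failures, name resolution can be checked directly from the JVM that runs the application. This is a minimal diagnostic sketch; DnsCheck and canResolve are hypothetical names, and the check only shows whether the name resolves, not whether the endpoint is reachable.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic (not part of LangChain4j): reports whether the JVM
// can resolve a hostname using the system resolver.
final class DnsCheck {
    private DnsCheck() {}

    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host); // Throws if the name cannot be resolved
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

For example, DnsCheck.canResolve("router.huggingface.co") checks the hostname used in the curl example in this guide.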

SSL/TLS Errors

Cause: Certificate validation issues

Solutions:

  1. Update Java JDK/JRE
  2. Update CA certificates
  3. Check system time is correct
  4. Use valid baseUrl (https://)

Best Practices

1. Always Use Environment Variables for Secrets

// ✅ Good
.accessToken(System.getenv("HF_API_KEY"))

// ❌ Bad
.accessToken("hf_xxxxxxxxxxxx")

2. Set Appropriate Timeouts

// For fast operations (embeddings)
.timeout(Duration.ofSeconds(15))

// For slow operations (large models)
.timeout(Duration.ofMinutes(2))

3. Enable waitForModel

// ✅ Recommended
.waitForModel(true)

// ❌ May fail with 503
.waitForModel(false)

4. Implement Retry Logic

// For production code
int maxRetries = 3;
Duration retryDelay = Duration.ofSeconds(2);

5. Log Errors Appropriately

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

private static final Logger logger = LoggerFactory.getLogger(MyClass.class);

try {
    Embedding emb = model.embed(text).content();
} catch (RuntimeException e) {
    logger.error("Embedding failed for text: {}", text, e);
    throw e;
}

6. Monitor and Alert

import java.util.concurrent.atomic.AtomicLong;

// Track error rates
private final AtomicLong errorCount = new AtomicLong(0);
private final AtomicLong requestCount = new AtomicLong(0);

try {
    requestCount.incrementAndGet();
    return model.embed(text).content();
} catch (RuntimeException e) {
    errorCount.incrementAndGet();
    double errorRate = (double) errorCount.get() / requestCount.get();
    if (errorRate > 0.1) { // 10% error rate
        alerting.sendAlert("High error rate: " + errorRate);
    }
    throw e;
}

Testing Error Scenarios

Mock for Testing

Use the SPI to inject a mock client for testing error scenarios:

// Test 401 error
@Test
public void testInvalidToken() {
    // Token value is deliberately invalid
    HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
        .accessToken("invalid-token")
        .build();
    assertThrows(RuntimeException.class, () -> {
        model.embed("test");
    });
}

// Test timeout
@Test
public void testTimeout() {
    HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
        .accessToken(apiKey)
        .timeout(Duration.ofMillis(1)) // Very short timeout
        .build();

    assertThrows(RuntimeException.class, () -> {
        model.embed("test");
    });
}
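
Error-handling code that accepts a Supplier (such as the circuit breaker above) can also be exercised without any network access by scripting the failures. The sketch below is an assumption-laden illustration rather than the SPI approach: ScriptedFailures is a hypothetical test double that throws exceptions in the documented message format and then returns a fixed result.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical test double (not part of LangChain4j): throws one
// RuntimeException per scripted status code, using the documented
// "status code: ...; body: ..." format, then returns a fixed result.
final class ScriptedFailures<T> implements Supplier<T> {
    private final Deque<Integer> pendingStatusCodes;
    private final T finalResult;
    private int calls = 0;

    ScriptedFailures(List<Integer> statusCodes, T finalResult) {
        this.pendingStatusCodes = new ArrayDeque<>(statusCodes);
        this.finalResult = finalResult;
    }

    @Override
    public T get() {
        calls++;
        Integer status = pendingStatusCodes.poll(); // null once the script is exhausted
        if (status != null) {
            throw new RuntimeException(
                    "status code: " + status + "; body: {\"error\": \"simulated\"}");
        }
        return finalResult;
    }

    int calls() {
        return calls;
    }
}
```

A test can pass an instance scripted with 429 and 503 to the retry or circuit breaker logic and then assert on calls() to confirm how many attempts were made.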

Debugging Tips

1. Enable HTTP Logging

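Add to your logging configuration (Spring Boot application.properties syntax shown; adapt the logger names to your logging framework):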

logging.level.okhttp3=DEBUG
logging.level.retrofit2=DEBUG

2. Verify Configuration

System.out.println("API Key set: " + (System.getenv("HF_API_KEY") != null));
System.out.println("Model ID: " + modelId);
System.out.println("Timeout: " + timeout);

3. Test with curl

curl -X POST https://router.huggingface.co/hf-inference/ \
  -H "Authorization: Bearer $HF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["Hello world"], "options": {"wait_for_model": true}}'

4. Check API Status

Visit: https://status.huggingface.co/

Related Documentation

  • Quick Start Guide - Getting started
  • Configuration Guide - Configuration options
  • Common Tasks - Usage examples
  • Embedding Model API - API reference

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-hugging-face@1.11.0
