tessl/maven-dev-langchain4j--langchain4j-hugging-face

LangChain4j integration library for Hugging Face inference capabilities including chat, language, and embedding models

Error Handling Guide

Name: tessl/maven-dev-langchain4j--langchain4j-hugging-face
Author: tessl

Comprehensive guide to errors, troubleshooting, and solutions for LangChain4j Hugging Face integration.

Error Format

All API errors are thrown as RuntimeException with a consistent message format:

status code: <HTTP_CODE>; body: <RESPONSE_BODY>

Example:

status code: 401; body: {"error": "Invalid token"}
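Because the status code is embedded in the exception message, callers that need it have to parse the text. The following is a minimal sketch of such a parser, assuming the message follows the format above exactly; `HfErrorMessage` and its methods are hypothetical names, not part of the library.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for messages shaped like "status code: <HTTP_CODE>; body: <RESPONSE_BODY>". */
final class HfErrorMessage {
    // Assumes a three-digit status code; DOTALL lets the body span several lines.
    private static final Pattern FORMAT =
            Pattern.compile("status code: (\\d{3}); body: (.*)", Pattern.DOTALL);

    private HfErrorMessage() {}

    /** Returns the HTTP status code, or empty if the message does not follow the format. */
    static OptionalInt statusCode(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher m = FORMAT.matcher(message);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    /** Returns the raw response body, or an empty string if the message does not follow the format. */
    static String body(String message) {
        if (message == null) {
            return "";
        }
        Matcher m = FORMAT.matcher(message);
        return m.find() ? m.group(2) : "";
    }
}
```

Parsing the code this way avoids false matches that a plain `contains("401")` check could produce when the digits happen to appear in the response body.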

Common Error Codes

401 Unauthorized

Cause: Invalid or missing API access token

Message: status code: 401; body: {"error": "Invalid token"}

Solutions:

Verify token is correct: https://huggingface.co/settings/tokens
Check token hasn't expired
Ensure token has correct permissions
Verify environment variable is set: echo $HF_API_KEY

Example Fix:

// ❌ Wrong
.accessToken("invalid-token")

// ✅ Correct
.accessToken(System.getenv("HF_API_KEY"))
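Note that `System.getenv("HF_API_KEY")` returns `null` when the variable is unset, which only surfaces later as a configuration or 401 error. One possible way to fail earlier with a clearer message is sketched below; `TokenCheck.requireToken` is a hypothetical helper, not a library method.

```java
/** Hypothetical helper that fails fast when the token is missing or blank. */
final class TokenCheck {
    private TokenCheck() {}

    /**
     * Returns the trimmed token, or throws if it is null or blank.
     * variableName is used only to build the error message.
     */
    static String requireToken(String value, String variableName) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Environment variable " + variableName + " is not set or is empty");
        }
        return value.trim();
    }
}
```

It could then be used as `.accessToken(TokenCheck.requireToken(System.getenv("HF_API_KEY"), "HF_API_KEY"))`.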

404 Not Found

Cause: Model does not exist or is not accessible

Message: status code: 404; body: {"error": "Model not found"}

Solutions:

Check model ID spelling
Verify model exists on Hugging Face Hub
Ensure model is public or you have access
Check model type matches usage (embedding vs. generation)

Example Fix:

// ❌ Wrong
.modelId("sentence-transformers/all-MiniLM-L6-v2-typo")

// ✅ Correct
.modelId("sentence-transformers/all-MiniLM-L6-v2")

429 Too Many Requests

Cause: Rate limiting exceeded

Message: status code: 429; body: {"error": "Rate limit exceeded"}

Solutions:

Reduce request frequency
Implement exponential backoff
Upgrade Hugging Face account tier
Use batch operations (embedAll() instead of multiple embed())
Implement request caching

Example Fix:

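// Retry with exponential backoff on 429 responses
Embedding embedWithBackoff(String text) throws InterruptedException {
    int maxRetries = 3;
    for (int attempt = 0; ; attempt++) {
        try {
            return model.embed(text).content();
        } catch (RuntimeException e) {
            String msg = e.getMessage();
            boolean rateLimited = msg != null && msg.contains("status code: 429");
            if (!rateLimited || attempt >= maxRetries - 1) {
                throw e; // Not a rate limit, or retries exhausted
            }
            // Wait 1s, 2s, 4s, ... before the next attempt
            Thread.sleep((1L << attempt) * 1000);
        }
    }
}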
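The "request caching" suggestion above can be sketched as a small in-memory wrapper. This is an illustration under stated assumptions: `CachingEmbedder` is a hypothetical class, the cache is unbounded, and the embedding call is passed in as a plain `Function` so the sketch does not depend on the library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical wrapper that remembers the result for each input text. */
final class CachingEmbedder<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> delegate;

    CachingEmbedder(Function<String, V> delegate) {
        this.delegate = delegate;
    }

    /** Calls the delegate only when the text has not been seen before. */
    V embed(String text) {
        V cached = cache.get(text);
        if (cached != null) {
            return cached;
        }
        // The remote call happens outside any map lock, so two threads may
        // occasionally compute the same entry; the first stored value wins.
        V computed = delegate.apply(text);
        if (computed != null) {
            cache.putIfAbsent(text, computed);
        }
        return computed;
    }

    /** Number of cached entries. */
    int size() {
        return cache.size();
    }
}
```

A wrapper like this might be created as `new CachingEmbedder<>(t -> model.embed(t).content())`. Failed calls are not cached, so an exception from the delegate propagates and the next call retries. A production cache would also need a size limit or eviction policy.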

503 Service Unavailable

Cause: Model is loading or temporarily unavailable

Message: status code: 503; body: {"error": "Model is loading"}

Solutions:

Set waitForModel(true) in configuration (default)
Increase timeout duration
Retry after a delay
Use a different model that's already loaded

Example Fix:

// ✅ Ensure waiting enabled
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(apiKey)
    .waitForModel(true)  // Wait for model to load
    .timeout(Duration.ofSeconds(60))  // Longer timeout
    .build();

Timeout Exception

Cause: Request exceeded configured timeout

Message: Varies by HTTP client (contains "timeout" or "timed out")

Solutions:

Increase timeout duration
Check network connectivity
Use smaller models or reduce batch size
Try different API endpoint (baseUrl)

Example Fix:

import java.time.Duration;

HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(apiKey)
    .timeout(Duration.ofSeconds(60))  // Increase from default 15s
    .build();

Error Handling Patterns

Basic Try-Catch

try {
    Embedding embedding = model.embed("text").content();
    // Process embedding
} catch (RuntimeException e) {
    System.err.println("Embedding failed: " + e.getMessage());
}

Specific Error Handling

try {
    Embedding embedding = model.embed("text").content();
} catch (RuntimeException e) {
    // getMessage() can be null for some exceptions
    String msg = e.getMessage() == null ? "" : e.getMessage();
    String lower = msg.toLowerCase();

    // Match the "status code: <HTTP_CODE>" prefix so digits in the body cannot cause a false match
    if (msg.contains("status code: 401")) {
        System.err.println("Authentication failed. Check API token.");
    } else if (msg.contains("status code: 404")) {
        System.err.println("Model not found. Check model ID.");
    } else if (msg.contains("status code: 429")) {
        System.err.println("Rate limited. Reduce request frequency.");
    } else if (msg.contains("status code: 503")) {
        System.err.println("Model loading. Retry after delay.");
    } else if (lower.contains("timeout") || lower.contains("timed out")) {
        System.err.println("Request timeout. Increase timeout setting.");
    } else {
        System.err.println("Unknown error: " + msg);
    }
}

Retry with Exponential Backoff

public Embedding embedWithRetry(String text, int maxRetries) throws InterruptedException {
    int attempt = 0;
    long delay = 1000; // Start with 1 second

    while (attempt < maxRetries) {
        try {
            return model.embed(text).content();
        } catch (RuntimeException e) {
            attempt++;

            // Check if retryable error (getMessage() can be null)
            String msg = e.getMessage() == null ? "" : e.getMessage();
            String lower = msg.toLowerCase();
            boolean retryable = msg.contains("status code: 429") ||
                                msg.contains("status code: 503") ||
                                lower.contains("timeout") ||
                                lower.contains("timed out");

            if (!retryable || attempt >= maxRetries) {
                throw e;
            }

            System.err.println("Attempt " + attempt + " failed, retrying in " + delay + "ms...");
            Thread.sleep(delay);
            delay *= 2; // Exponential backoff
        }
    }

    // Reached only when maxRetries is zero or negative
    throw new RuntimeException("Max retries exceeded");
}
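The doubling delay above grows without bound, and clients that fail at the same moment will all retry at the same moment. A common refinement is to cap the delay and add random jitter. The sketch below shows only the delay calculation; `Backoff` and its method names are hypothetical, and the "full jitter" strategy is a substitution for the plain doubling used above.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper that computes capped, jittered exponential backoff delays. */
final class Backoff {
    private Backoff() {}

    /** Delay for a zero-based attempt: baseMillis * 2^attempt, capped at maxDelayMillis. */
    static long cappedDelayMillis(int attempt, long baseMillis, long maxDelayMillis) {
        if (baseMillis < 0 || maxDelayMillis < 0) {
            throw new IllegalArgumentException("delays must not be negative");
        }
        // Clamp the exponent so the product stays within long range for
        // base delays in the range of seconds or minutes.
        int shift = Math.min(Math.max(attempt, 0), 30);
        long delay = baseMillis * (1L << shift);
        return Math.min(delay, maxDelayMillis);
    }

    /** "Full jitter": a uniformly random delay between 0 and the capped delay, inclusive. */
    static long jitteredDelayMillis(int attempt, long baseMillis, long maxDelayMillis) {
        long cap = cappedDelayMillis(attempt, baseMillis, maxDelayMillis);
        return ThreadLocalRandom.current().nextLong(cap + 1);
    }
}
```

In a retry loop, `Thread.sleep(Backoff.jitteredDelayMillis(attempt, 1000, 30_000))` would replace the fixed doubling. Here `attempt` counts from zero, and the 30-second cap is an arbitrary example value.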

Fallback to Alternative Model

public Embedding embedWithFallback(String text) {
    try {
        // Try primary model
        return primaryModel.embed(text).content();
    } catch (RuntimeException e) {
        System.err.println("Primary model failed, using fallback: " + e.getMessage());

        try {
            // Try fallback model
            return fallbackModel.embed(text).content();
        } catch (RuntimeException e2) {
            System.err.println("Fallback also failed: " + e2.getMessage());
            throw e2;
        }
    }
}

Circuit Breaker Pattern

import java.util.function.Supplier;

// Minimal circuit breaker. Not thread-safe: synchronize calls to execute()
// if one breaker is shared across threads.
class CircuitBreaker {
    private int failureCount = 0;
    private final int threshold = 5;
    private long openUntil = 0;
    private final long resetTimeout = 60000; // 1 minute

    public <T> T execute(Supplier<T> operation) {
        // Check if circuit is open
        if (openUntil > System.currentTimeMillis()) {
            throw new RuntimeException("Circuit breaker is open");
        }

        try {
            T result = operation.get();
            // Success - reset failure count
            failureCount = 0;
            return result;
        } catch (RuntimeException e) {
            failureCount++;

            // Trip circuit breaker. After the reset timeout the count is still at
            // the threshold, so a single further failure reopens the circuit.
            if (failureCount >= threshold) {
                openUntil = System.currentTimeMillis() + resetTimeout;
                System.err.println("Circuit breaker opened");
            }

            throw e;
        }
    }
}

// Usage
CircuitBreaker breaker = new CircuitBreaker();
try {
    Embedding emb = breaker.execute(() -> model.embed("text").content());
} catch (RuntimeException e) {
    System.err.println("Operation failed: " + e.getMessage());
}

Graceful Degradation

public Optional<Embedding> embedSafely(String text) {
    try {
        return Optional.of(model.embed(text).content());
    } catch (RuntimeException e) {
        System.err.println("Embedding failed, continuing without: " + e.getMessage());
        return Optional.empty();
    }
}

// Usage
Optional<Embedding> embedding = embedSafely("text");
embedding.ifPresentOrElse(
    emb -> processEmbedding(emb),
    () -> System.out.println("Skipping this item")
);
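The same idea extends to batches: process every item, keep the successes, and record which inputs failed so they can be retried or reported later. The following is a minimal sketch with hypothetical names (`BatchResult`, `process`); the per-item operation is passed in as a `Function`, so the sketch does not depend on the library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical container for the outcome of a batch in which some items may fail. */
final class BatchResult<V> {
    /** Results keyed by input text, in processing order. */
    final Map<String, V> succeeded = new LinkedHashMap<>();
    /** Inputs whose operation threw a RuntimeException. */
    final List<String> failed = new ArrayList<>();

    /** Applies the operation to each text and keeps going when one of them fails. */
    static <V> BatchResult<V> process(List<String> texts, Function<String, V> operation) {
        BatchResult<V> result = new BatchResult<>();
        for (String text : texts) {
            try {
                result.succeeded.put(text, operation.apply(text));
            } catch (RuntimeException e) {
                result.failed.add(text);
            }
        }
        return result;
    }
}
```

For example, `BatchResult.process(texts, t -> model.embed(t).content())` would embed what it can and list the rest in `failed`. Duplicate input texts share one entry in `succeeded`.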

Configuration Issues

Missing Access Token

Error: IllegalArgumentException: accessToken is required

Solution:

// ❌ Missing
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .modelId("some-model")
    .build();

// ✅ Correct
HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
    .accessToken(System.getenv("HF_API_KEY"))
    .modelId("some-model")
    .build();

Invalid Timeout

Error: IllegalArgumentException: timeout must be positive

Solution:

// ❌ Wrong
.timeout(Duration.ofSeconds(0))
.timeout(Duration.ofSeconds(-10))

// ✅ Correct
.timeout(Duration.ofSeconds(30))

Invalid Base URL

Error: IllegalArgumentException: Invalid URL format

Solution:

// ❌ Wrong
.baseUrl("not-a-url")

// ✅ Correct
.baseUrl("https://custom-endpoint.example.com/")
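When the base URL comes from configuration, it can be checked before it reaches the builder. The sketch below uses only `java.net.URI`; `BaseUrlCheck.isUsableBaseUrl` is a hypothetical helper, and the rule it applies (an http or https scheme plus a host) is an assumption about what counts as a usable endpoint, not the library's own validation logic.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical pre-check for a configured base URL. */
final class BaseUrlCheck {
    private BaseUrlCheck() {}

    /** Returns true if the value parses as an absolute http(s) URL with a host. */
    static boolean isUsableBaseUrl(String value) {
        if (value == null || value.trim().isEmpty()) {
            return false;
        }
        try {
            URI uri = new URI(value.trim());
            String scheme = uri.getScheme();
            boolean httpScheme = "https".equalsIgnoreCase(scheme) || "http".equalsIgnoreCase(scheme);
            return httpScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

A string such as `"not-a-url"` parses as a relative URI with no scheme, so the check rejects it.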

Network Issues

Connection Refused

Cause: Cannot connect to API endpoint

Solutions:

Check internet connectivity
Verify firewall settings
Check proxy configuration
Try different baseUrl

DNS Resolution Failed

Cause: Cannot resolve API hostname

Solutions:

Check DNS settings
Try alternative DNS servers
Check /etc/hosts (Unix) or C:\Windows\System32\drivers\etc\hosts (Windows)

SSL/TLS Errors

Cause: Certificate validation issues

Solutions:

Update Java JDK/JRE
Update CA certificates
Check system time is correct
Use valid baseUrl (https://)
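To tell these network failures apart in code, one option is to walk the exception's cause chain and look for the standard JDK exception types. This is a sketch that assumes the underlying `IOException` is preserved as a cause of the thrown exception, which may not hold for every HTTP client; `NetworkFailure` and `Kind` are hypothetical names.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import javax.net.ssl.SSLException;

/** Hypothetical classifier that maps an exception's cause chain to a network failure kind. */
final class NetworkFailure {
    enum Kind { DNS, CONNECTION, TLS, TIMEOUT, OTHER }

    private NetworkFailure() {}

    /** Inspects the throwable and its causes, returning the first recognized kind. */
    static Kind classify(Throwable error) {
        Throwable current = error;
        // Bound the walk to guard against cyclic cause chains.
        for (int depth = 0; current != null && depth < 20; depth++) {
            if (current instanceof UnknownHostException) return Kind.DNS;
            if (current instanceof ConnectException) return Kind.CONNECTION;
            if (current instanceof SSLException) return Kind.TLS;
            if (current instanceof SocketTimeoutException) return Kind.TIMEOUT;
            current = current.getCause();
        }
        return Kind.OTHER;
    }
}
```

The result can be used to pick the matching checklist above: `DNS` for resolution failures, `CONNECTION` for refused connections, and `TLS` for certificate problems.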

Best Practices

1. Always Use Environment Variables for Secrets

// ✅ Good
.accessToken(System.getenv("HF_API_KEY"))

// ❌ Bad
.accessToken("hf_xxxxxxxxxxxx")

2. Set Appropriate Timeouts

// For fast operations (embeddings)
.timeout(Duration.ofSeconds(15))

// For slow operations (large models)
.timeout(Duration.ofMinutes(2))

3. Enable waitForModel

// ✅ Recommended
.waitForModel(true)

// ❌ May fail with 503
.waitForModel(false)

4. Implement Retry Logic

// For production code
int maxRetries = 3;
Duration retryDelay = Duration.ofSeconds(2);

5. Log Errors Appropriately

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

private static final Logger logger = LoggerFactory.getLogger(MyClass.class);

try {
    Embedding emb = model.embed(text).content();
} catch (RuntimeException e) {
    // Passing the exception as the last argument logs its stack trace
    logger.error("Embedding failed for text: {}", text, e);
    throw e;
}

6. Monitor and Alert

import java.util.concurrent.atomic.AtomicLong;

// Track error rates
private final AtomicLong errorCount = new AtomicLong(0);
private final AtomicLong requestCount = new AtomicLong(0);

public Embedding embedWithMonitoring(String text) {
    long requests = requestCount.incrementAndGet();
    try {
        return model.embed(text).content();
    } catch (RuntimeException e) {
        long errors = errorCount.incrementAndGet();
        double errorRate = (double) errors / requests;
        if (errorRate > 0.1) { // 10% error rate
            // alerting is a placeholder for your own alerting client
            alerting.sendAlert("High error rate: " + errorRate);
        }
        throw e;
    }
}

Testing Error Scenarios

Mock for Testing

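The tests below call the live API, so they need network access. To test error scenarios offline, use the SPI to inject a mock client.

// Test 401 error
@Test
public void testInvalidToken() {
    HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
        .accessToken("invalid-token")
        .build();

    assertThrows(RuntimeException.class, () -> {
        model.embed("test");
    });
}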

// Test timeout
@Test
public void testTimeout() {
    HuggingFaceEmbeddingModel model = HuggingFaceEmbeddingModel.builder()
        .accessToken(apiKey)
        .timeout(Duration.ofMillis(1)) // Very short timeout
        .build();

    assertThrows(RuntimeException.class, () -> {
        model.embed("test");
    });
}
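Retry and fallback logic can also be exercised without any network by feeding it a stub that fails a fixed number of times before succeeding. The sketch below is one such test double; `FlakyCall` is a hypothetical class, and the 503 message in the usage note simply reuses the error format described earlier in this guide.

```java
import java.util.function.Supplier;

/** Hypothetical test double: throws for the first N calls, then returns a fixed value. */
final class FlakyCall<T> implements Supplier<T> {
    private final int failuresBeforeSuccess;
    private final String failureMessage;
    private final T result;
    private int calls = 0;

    FlakyCall(int failuresBeforeSuccess, String failureMessage, T result) {
        this.failuresBeforeSuccess = failuresBeforeSuccess;
        this.failureMessage = failureMessage;
        this.result = result;
    }

    @Override
    public T get() {
        calls++;
        if (calls <= failuresBeforeSuccess) {
            throw new RuntimeException(failureMessage);
        }
        return result;
    }

    /** Number of times get() has been invoked so far. */
    int calls() {
        return calls;
    }
}
```

A test might construct `new FlakyCall<>(2, "status code: 503; body: {\"error\": \"Model is loading\"}", expected)`, pass it to a retry helper that accepts a `Supplier`, and then assert that `calls()` equals 3.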

Debugging Tips

1. Enable HTTP Logging

Add to your logging configuration (Spring Boot application.properties syntax shown):

logging.level.okhttp3=DEBUG
logging.level.retrofit2=DEBUG

2. Verify Configuration

System.out.println("API Key set: " + (System.getenv("HF_API_KEY") != null));
System.out.println("Model ID: " + modelId);
System.out.println("Timeout: " + timeout);

3. Test with curl

curl -X POST https://router.huggingface.co/hf-inference/ \
  -H "Authorization: Bearer $HF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["Hello world"], "options": {"wait_for_model": true}}'

4. Check API Status

Visit: https://status.huggingface.co/
