tessl/maven-org-springframework-ai--spring-ai-autoconfigure-retry

Spring Boot auto-configuration for AI retry capabilities with exponential backoff and intelligent HTTP error handling

Exception Handling

Name: tessl/maven-org-springframework-ai--spring-ai-autoconfigure-retry
Author: tessl

Exception types for classifying AI errors as transient or non-transient to control retry behavior. These exception types are part of the spring-ai-retry module and are used by the auto-configuration to determine whether errors should be retried.

Capabilities

TransientAiException

Exception for transient AI errors where a retry of the same operation might succeed without any intervention.

// Package: org.springframework.ai.retry
/**
 * Root of the hierarchy of Model access exceptions that are considered transient
 * A previously failed operation might succeed when retried
 *
 * Thrown for:
 * - Server errors (5xx)
 * - Network errors (timeouts, connection failures)
 * - Rate limits (429 when configured)
 * - Temporary service unavailability (503)
 * - Gateway errors (502, 504)
 *
 * Causes RetryTemplate to retry the operation
 * Extends: java.lang.RuntimeException (unchecked exception)
 * Thread-safe: Adds no mutable state of its own (inherits Throwable's behavior)
 * Serializable: Yes (extends RuntimeException which is Serializable)
 *
 * @since 0.8.1
 */
public class TransientAiException extends RuntimeException {

    /**
     * Constructs a new TransientAiException with the specified message
     * 
     * Usage:
     * throw new TransientAiException("HTTP 503 - Service temporarily unavailable");
     * 
     * @param message Error message describing the transient failure (can be null)
     */
    public TransientAiException(String message);

    /**
     * Constructs a new TransientAiException with message and cause
     * 
     * Usage:
     * throw new TransientAiException("Connection timeout", networkException);
     * 
     * Preserves full stack trace of underlying cause
     * Use this constructor to maintain exception chain
     * 
     * @param message Error message describing the transient failure (can be null)
     * @param cause The underlying cause of the exception (can be null)
     */
    public TransientAiException(String message, Throwable cause);
}

When to throw TransientAiException:

Server errors (5xx)
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
Rate limiting
- 429 Too Many Requests
- Custom rate limit responses
Network errors
- Connection timeout (SocketTimeoutException, or ConnectException on some platforms)
- Read timeout (SocketTimeoutException)
- Connection refused (ConnectException)
- Network unreachable (NoRouteToHostException)
- DNS resolution failures (UnknownHostException)
Temporary unavailability
- Service maintenance windows
- Temporary capacity issues
- Load balancer health check failures
- Circuit breaker open states

Behavior with RetryTemplate:

Triggers retry if attempts remain
Uses configured backoff strategy
Propagated to caller if all retries exhausted
Logged by retry listener on each attempt

NonTransientAiException

Exception for non-transient AI errors where a retry of the same operation will fail unless the cause is corrected.

// Package: org.springframework.ai.retry
/**
 * Root of the hierarchy of Model access exceptions that are considered non-transient
 * A retry of the same operation would fail unless the cause is corrected
 *
 * Thrown for:
 * - Authentication errors (401)
 * - Authorization errors (403)
 * - Bad request errors (400)
 * - Not found errors (404)
 * - Client errors (4xx when configured)
 * - Explicitly configured non-transient codes
 * - Invalid configuration (malformed URLs, missing parameters)
 *
 * Causes RetryTemplate to fail immediately without retry
 * Extends: java.lang.RuntimeException (unchecked exception)
 * Thread-safe: Adds no mutable state of its own (inherits Throwable's behavior)
 * Serializable: Yes (extends RuntimeException which is Serializable)
 *
 * @since 0.8.1
 */
public class NonTransientAiException extends RuntimeException {

    /**
     * Constructs a new NonTransientAiException with the specified message
     * 
     * Usage:
     * throw new NonTransientAiException("HTTP 401 - Invalid API key");
     * 
     * @param message Error message describing the non-transient failure (can be null)
     */
    public NonTransientAiException(String message);

    /**
     * Constructs a new NonTransientAiException with message and cause
     * 
     * Usage:
     * throw new NonTransientAiException("Invalid API key", authException);
     * 
     * Preserves full stack trace of underlying cause
     * Use this constructor to maintain exception chain
     * 
     * @param message Error message describing the non-transient failure (can be null)
     * @param cause The underlying cause of the exception (can be null)
     */
    public NonTransientAiException(String message, Throwable cause);
}

When to throw NonTransientAiException:

Authentication errors
- 401 Unauthorized
- Invalid API keys
- Expired tokens
- Missing authentication headers
Authorization errors
- 403 Forbidden
- Insufficient permissions
- Account disabled/suspended
- Resource access denied
Client errors
- 400 Bad Request
- 404 Not Found
- 405 Method Not Allowed
- 406 Not Acceptable
- 415 Unsupported Media Type
- 422 Unprocessable Entity
Configuration errors
- Invalid endpoint URL
- Malformed request body
- Missing required parameters
- Invalid parameter values
- Schema validation failures

Behavior with RetryTemplate:

Causes immediate failure without retry
Propagated to caller immediately
No retry is logged because none is attempted (a registered RetryListener may still receive the error callback for the single failed attempt)
Bypasses backoff strategy

Usage Examples

Throwing Exceptions in Custom Code

Throw these exceptions to control retry behavior in your own code:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.net.SocketTimeoutException;

public class CustomAiClient {

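    /**
     * Example of throwing appropriate exceptions based on error type.
     * InvalidApiKeyException, RateLimitException and ServiceUnavailableException stand for
     * application-defined unchecked exceptions that performApiCall may throw.
     */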
    public String callApi(String apiKey, String request) {
        // Validate input - configuration error, won't resolve on retry
        if (apiKey == null || apiKey.isEmpty()) {
            throw new NonTransientAiException("API key is required");
        }
        
        if (request == null || request.isEmpty() || request.length() > 4096) {
            throw new NonTransientAiException("Invalid request: must be 1-4096 characters");
        }
        
        try {
            return performApiCall(apiKey, request);
        } catch (SocketTimeoutException e) {
            // Network timeout - transient, retry might succeed
            throw new TransientAiException("Network timeout communicating with AI service", e);
        } catch (IOException e) {
            // General I/O error - could be transient
            throw new TransientAiException("I/O error communicating with AI service", e);
        } catch (InvalidApiKeyException e) {
            // Auth error - non-transient, retry won't help
            throw new NonTransientAiException("Invalid API key: " + e.getMessage(), e);
        } catch (RateLimitException e) {
            // Rate limit - transient, retry after backoff should succeed
            throw new TransientAiException(
                "Rate limit exceeded: " + e.getRetryAfterSeconds() + "s", e);
        } catch (ServiceUnavailableException e) {
            // Service down - transient, might recover
            throw new TransientAiException("AI service temporarily unavailable", e);
        }
    }

    private String performApiCall(String apiKey, String request) throws IOException {
        // Implementation
        return "result";
    }
}

Catching and Handling Exceptions

Handle these exceptions in application code:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AiService {

    private static final Logger log = LoggerFactory.getLogger(AiService.class);
    private final RetryTemplate retryTemplate;
    private final AiClient aiClient;

    public AiService(RetryTemplate retryTemplate, AiClient aiClient) {
        this.retryTemplate = retryTemplate;
        this.aiClient = aiClient;
    }

    /**
     * Calls AI service with retry and graceful error handling
     * Returns either the completion or a fallback message
     */
    public String getCompletion(String prompt) {
        try {
            // RetryTemplate handles TransientAiException automatically
            return retryTemplate.execute(context -> {
                log.debug("Calling AI service, attempt {}", context.getRetryCount() + 1);
                return aiClient.complete(prompt);
            });
        } catch (TransientAiException e) {
            // All retries exhausted - still transient error
            log.error("AI service temporarily unavailable after all retries: {}",
                      e.getMessage());
            return "Service temporarily unavailable. Please try again later.";
        } catch (NonTransientAiException e) {
            // Permanent failure - immediate failure, no retries
            log.error("AI service error that cannot be resolved by retry: {}", 
                      e.getMessage());
            
            // Check specific error types (getMessage() may be null, so normalize it first)
            String message = String.valueOf(e.getMessage());
            if (message.contains("401") || message.contains("Invalid API key")) {
                return "Service configuration error. Please contact support.";
            } else if (message.contains("400") || message.contains("Bad Request")) {
                return "Invalid request format. Please check your input.";
            } else {
                return "Service error. Please check your configuration.";
            }
        }
    }

    /**
     * Calls AI service with context-aware error handling
     * Distinguishes between different failure modes
     */
    public CompletionResult getCompletionWithDetails(String prompt) {
        try {
            String completion = retryTemplate.execute(context -> 
                aiClient.complete(prompt)
            );
            return CompletionResult.success(completion);
        } catch (TransientAiException e) {
            // Transient failure after all retries
            return CompletionResult.failure(
                FailureReason.TEMPORARY_UNAVAILABLE,
                "Service temporarily unavailable: " + e.getMessage(),
                true  // retryable
            );
        } catch (NonTransientAiException e) {
            // Determine specific failure reason (getMessage() may be null, so normalize it first)
            String message = String.valueOf(e.getMessage());
            if (message.contains("401") || message.contains("403")) {
                return CompletionResult.failure(
                    FailureReason.AUTHENTICATION_ERROR,
                    e.getMessage(),
                    false  // not retryable
                );
            } else if (message.contains("400")) {
                return CompletionResult.failure(
                    FailureReason.INVALID_REQUEST,
                    e.getMessage(),
                    false  // not retryable
                );
            } else {
                return CompletionResult.failure(
                    FailureReason.CONFIGURATION_ERROR,
                    e.getMessage(),
                    false  // not retryable
                );
            }
        }
    }
}
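
The handlers above branch on substrings such as "400", which can also match unrelated text in a message (for example "4001 tokens"). A minimal sketch of a stricter check follows, assuming messages begin with the status code as in the examples on this page ("HTTP 401 - ..."). StatusCodes and fromMessage are hypothetical names for illustration and are not part of Spring AI.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Spring AI. */
final class StatusCodes {

    // An optional "HTTP " prefix, then exactly three digits at the start of the message.
    private static final Pattern LEADING_STATUS =
            Pattern.compile("^(?:HTTP\\s+)?(\\d{3})(?!\\d)");

    private StatusCodes() {
    }

    /** Returns the leading status code of the message, or empty if there is none. */
    static OptionalInt fromMessage(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = LEADING_STATUS.matcher(message.trim());
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }
}
```

A catch block could then switch on StatusCodes.fromMessage(e.getMessage()) instead of chaining contains() calls. Carrying the status code in a field of a custom exception subclass is usually more robust than parsing messages at all.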

Using with ResponseErrorHandler

The auto-configured ResponseErrorHandler automatically throws these exceptions:

import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.support.RetryTemplate;

@Configuration
public class RestConfig {

    /**
     * Configure RestTemplate with auto-configured error handler
     * Error handler throws TransientAiException or NonTransientAiException
     * based on HTTP status code
     */
    @Bean
    public RestTemplate restTemplate(ResponseErrorHandler errorHandler) {
        RestTemplate template = new RestTemplate();
        // ErrorHandler will be invoked for 4xx and 5xx responses
        // Before response is returned to caller
        template.setErrorHandler(errorHandler);
        return template;
    }

    /**
     * Example service using RestTemplate with retry
     * (AiService here stands for an application-defined interface with a callAi method)
     */
    @Bean
    public AiService aiService(RestTemplate restTemplate, RetryTemplate retryTemplate) {
        return new AiService() {
            public String callAi(String prompt) {
                return retryTemplate.execute(context -> {
                    // RestTemplate calls API
                    // If error response (4xx/5xx):
                    //   1. ResponseErrorHandler.hasError() returns true
                    //   2. ResponseErrorHandler.handleError() is invoked
                    //   3. Throws TransientAiException or NonTransientAiException
                    //   4. RetryTemplate catches and handles based on exception type
                    return restTemplate.postForObject(
                        "https://api.example.com/complete",
                        prompt,
                        String.class
                    );
                });
            }
        };
    }
}

Flow when RestTemplate encounters HTTP error:

HTTP 5xx → ResponseErrorHandler throws TransientAiException → RetryTemplate retries
HTTP 4xx → ResponseErrorHandler throws NonTransientAiException → RetryTemplate fails immediately
HTTP 429 (if in onHttpCodes) → ResponseErrorHandler throws TransientAiException → RetryTemplate retries
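
The flow above can be sketched as a small decision function. This is an illustration of the ordering implied by the list, not the auto-configuration's source. HttpErrorClassifier and Kind are hypothetical names, and the excludeOnHttpCodes and onClientErrors parameters are assumptions modeled on the "explicitly configured non-transient codes" and "4xx when configured" notes earlier on this page.

```java
import java.util.Set;

/** Hypothetical stand-in for the handler's decision; not Spring AI code. */
final class HttpErrorClassifier {

    enum Kind { TRANSIENT, NON_TRANSIENT }

    private final Set<Integer> onHttpCodes;        // codes that are always retried
    private final Set<Integer> excludeOnHttpCodes; // codes that are never retried (assumed name)
    private final boolean onClientErrors;          // if true, 4xx responses are retried (assumed name)

    HttpErrorClassifier(Set<Integer> onHttpCodes,
                        Set<Integer> excludeOnHttpCodes,
                        boolean onClientErrors) {
        this.onHttpCodes = onHttpCodes;
        this.excludeOnHttpCodes = excludeOnHttpCodes;
        this.onClientErrors = onClientErrors;
    }

    /** Classifies an error status code (4xx or 5xx). */
    Kind classify(int status) {
        if (onHttpCodes.contains(status)) {
            return Kind.TRANSIENT;              // explicitly retryable, e.g. 429
        }
        if (!onClientErrors && status >= 400 && status < 500) {
            return Kind.NON_TRANSIENT;          // client errors fail fast
        }
        if (excludeOnHttpCodes.contains(status)) {
            return Kind.NON_TRANSIENT;          // explicitly excluded from retry
        }
        return Kind.TRANSIENT;                  // everything else, e.g. 5xx
    }
}
```

With onHttpCodes containing 429 and the other settings left empty or false, 503 and 429 classify as transient while 401 classifies as non-transient, matching the three lines of the flow.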

Error Classification Patterns

Transient Errors (Should Retry)

Errors that are temporary and might resolve on retry:

1. Server Errors (5xx)

// 500 Internal Server Error
throw new TransientAiException("HTTP 500 - Internal server error");

// 502 Bad Gateway
throw new TransientAiException("HTTP 502 - Bad gateway");

// 503 Service Unavailable
throw new TransientAiException("HTTP 503 - Service temporarily unavailable");

// 504 Gateway Timeout
throw new TransientAiException("HTTP 504 - Gateway timeout");

Retry rationale:

Server might be recovering from crash
Load balancer might find healthy instance
Temporary capacity issue might resolve
Database connection might be re-established

2. Rate Limiting

// 429 Too Many Requests
throw new TransientAiException("HTTP 429 - Rate limit exceeded. Retry after 60s");

// With retry-after information
throw new TransientAiException(String.format(
    "HTTP 429 - Rate limit exceeded. Retry after %d seconds", 
    retryAfterSeconds
));

Retry rationale:

Rate limit window will expire
Exponential backoff spaces out attempts, giving the rate limit window time to expire
Request itself is valid, just timing is wrong
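
Rate-limited responses often carry a Retry-After header, which HTTP allows in two forms: a number of seconds or an HTTP date. The sketch below parses both so the value can be included in the exception message. RetryAfter and parse are hypothetical names for illustration, and passing the current time as a parameter is a design choice made here to keep the function testable.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Spring AI. */
final class RetryAfter {

    private RetryAfter() {
    }

    /** Parses a Retry-After value (delta-seconds or HTTP date) relative to 'now'. */
    static Optional<Duration> parse(String headerValue, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return Optional.empty();
        }
        String value = headerValue.trim();
        try {
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Optional.of(Duration.ofSeconds(seconds)) : Optional.empty();
        } catch (NumberFormatException notANumber) {
            // Not delta-seconds; try the HTTP date form below.
        }
        try {
            Instant retryAt = ZonedDateTime
                    .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            Duration wait = Duration.between(now, retryAt);
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait);
        } catch (DateTimeParseException notADate) {
            return Optional.empty();
        }
    }
}
```

A caller would typically pass Instant.now() and fall back to a default wait when the result is empty.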

3. Network Errors

import java.net.SocketTimeoutException;
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.net.NoRouteToHostException;

// Connection timeout
try {
    makeConnection();
} catch (SocketTimeoutException e) {
    throw new TransientAiException("Connection timeout to AI service", e);
}

// Connection refused
try {
    makeConnection();
} catch (ConnectException e) {
    throw new TransientAiException("Connection refused: " + e.getMessage(), e);
}

// DNS resolution failure
try {
    resolveHost();
} catch (UnknownHostException e) {
    throw new TransientAiException("DNS resolution failed: " + e.getMessage(), e);
}

// Network unreachable
try {
    makeConnection();
} catch (NoRouteToHostException e) {
    throw new TransientAiException("Network unreachable: " + e.getMessage(), e);
}

Retry rationale:

Network might be temporarily congested
DNS might resolve correctly on next attempt
Network route might be re-established
Firewall issue might be transient
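
Network exceptions often arrive wrapped inside another exception, so a catch clause on the outer type can miss them. The sketch below walks the cause chain looking for the four java.net exception types listed above. NetworkFailures and isNetworkFailure are hypothetical names for illustration, and the depth limit of 20 is an arbitrary guard against cyclic cause chains.

```java
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

/** Hypothetical helper for illustration; not part of Spring AI. */
final class NetworkFailures {

    private static final int MAX_DEPTH = 20; // guard against cyclic cause chains

    private NetworkFailures() {
    }

    /** Returns true if the error or any of its causes is a known network exception. */
    static boolean isNetworkFailure(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            if (t instanceof SocketTimeoutException
                    || t instanceof ConnectException
                    || t instanceof UnknownHostException
                    || t instanceof NoRouteToHostException) {
                return true;
            }
        }
        return false;
    }
}
```

A client could call this in a general catch block and wrap the error in TransientAiException when it returns true.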

4. Temporary Unavailability

// Service maintenance
throw new TransientAiException(
    "Service maintenance in progress. Expected completion: 15:00 UTC"
);

// Capacity issues
throw new TransientAiException(
    "Service at capacity. Request queued."
);

// Circuit breaker open
throw new TransientAiException(
    "Circuit breaker open due to high error rate. Retry after cooldown."
);

// Load balancer health check failure
throw new TransientAiException(
    "No healthy instances available. Retry after health check interval."
);

Retry rationale:

Maintenance window will complete
Capacity might be available after queue processing
Circuit breaker will eventually close
Health checks might pass on next attempt

Non-Transient Errors (Should Not Retry)

Errors that are permanent and won't resolve without intervention:

1. Authentication Errors

// 401 Unauthorized
throw new NonTransientAiException("HTTP 401 - Invalid API key");

// Invalid credentials
throw new NonTransientAiException(
    "Authentication failed: API key not found in system"
);

// Expired token
throw new NonTransientAiException(
    "Authentication failed: Token expired. Please refresh token."
);

// Missing auth header
throw new NonTransientAiException(
    "Authentication required: Authorization header missing"
);

No retry rationale:

API key won't become valid on retry
Token won't unexpire automatically
Missing header indicates code bug, not transient issue

2. Authorization Errors

// 403 Forbidden
throw new NonTransientAiException("HTTP 403 - Insufficient permissions");

// Account disabled
throw new NonTransientAiException(
    "Account disabled. Please contact support."
);

// Resource access denied
throw new NonTransientAiException(
    "Access denied: User does not have permission to access model 'gpt-4'"
);

// Quota exceeded (permanent)
throw new NonTransientAiException(
    "Monthly quota exceeded. Upgrade plan or wait for reset."
);

No retry rationale:

Permissions won't change without admin action
Account status requires manual intervention
Quota limit requires payment or time passage (end of billing period)

3. Client Errors

// 400 Bad Request
throw new NonTransientAiException(
    "HTTP 400 - Invalid request format: JSON parsing error"
);

// 404 Not Found
throw new NonTransientAiException(
    "HTTP 404 - Endpoint not found: /v1/completios (typo in URL)"
);

// 405 Method Not Allowed
throw new NonTransientAiException(
    "HTTP 405 - Method GET not allowed for this endpoint. Use POST."
);

// 422 Unprocessable Entity
throw new NonTransientAiException(
    "HTTP 422 - Validation failed: prompt length exceeds maximum (4096 chars)"
);

// 415 Unsupported Media Type
throw new NonTransientAiException(
    "HTTP 415 - Content-Type must be application/json, received text/plain"
);

No retry rationale:

Request format won't become valid on retry
Endpoint URL won't change without code change
HTTP method won't become allowed without code change
Validation rules won't change on retry
Content-Type won't become supported on retry

4. Configuration Errors

import java.net.MalformedURLException;
import java.net.URL;

// Invalid URL
try {
    new URL(endpoint);
} catch (MalformedURLException e) {
    throw new NonTransientAiException("Invalid endpoint URL: " + endpoint, e);
}

// Missing required parameter
if (apiKey == null) {
    throw new NonTransientAiException("Configuration error: API key not configured");
}

// Invalid parameter value
if (temperature < 0 || temperature > 2) {
    throw new NonTransientAiException(
        "Invalid temperature value: " + temperature + ". Must be between 0 and 2."
    );
}

// Schema validation failure
throw new NonTransientAiException(
    "Request schema validation failed: field 'model' is required"
);

No retry rationale:

Configuration won't fix itself on retry
Parameter validation rules won't change on retry
Code bug requires code fix, not retry

Integration with RetryTemplate

The auto-configured RetryTemplate is configured to retry on TransientAiException:

// Pseudocode for RetryTemplate configuration
RetryTemplate.builder()
    .maxAttempts(10)
    .retryOn(TransientAiException.class)      // Retry these
    .retryOn(ResourceAccessException.class)   // Spring's network error
    .retryOn(WebClientRequestException.class) // WebFlux network error (if available)
    .exponentialBackoff(initialInterval, multiplier, maxInterval)
    .build();

Exceptions NOT in the retry list (including NonTransientAiException) cause immediate failure.
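
To make the retry-list behavior concrete without Spring on the classpath, here is a minimal sketch of a retry loop in plain Java. It is not how RetryTemplate is implemented. MiniRetry, TransientFailure and PermanentFailure are hypothetical stand-ins (the latter two for TransientAiException and NonTransientAiException), and backoff is omitted.

```java
import java.util.function.Supplier;

/** Hypothetical miniature of the retry decision; not Spring Retry code. */
final class MiniRetry {

    /** Stand-in for TransientAiException so the sketch compiles without Spring AI. */
    static class TransientFailure extends RuntimeException {
        TransientFailure(String message) {
            super(message);
        }
    }

    /** Stand-in for NonTransientAiException. */
    static class PermanentFailure extends RuntimeException {
        PermanentFailure(String message) {
            super(message);
        }
    }

    private MiniRetry() {
    }

    /** Runs the operation, retrying only on TransientFailure, up to maxAttempts calls. */
    static <T> T execute(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (TransientFailure e) {
                last = e; // in the retry list: try again if attempts remain
            }
            // Anything else, including PermanentFailure, is not caught and propagates at once.
        }
        throw last; // attempts exhausted: surface the last transient failure
    }
}
```

An operation that throws TransientFailure twice and then succeeds is called three times. One that throws PermanentFailure is called exactly once.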

Retry Behavior Flow

1. Operation throws exception
   ↓
2. Is exception in the retry list (TransientAiException, ResourceAccessException, or WebClientRequestException when available)?
   ↓ Yes → Go to step 3
   ↓ No → Fail immediately (propagate to caller)
   ↓
3. Have we reached max attempts?
   ↓ Yes → Propagate the last exception to caller
   ↓ No → Go to step 4
   ↓
4. Calculate backoff delay using exponential formula
   ↓
5. Wait for backoff delay
   ↓
6. Increment retry counter
   ↓
7. Log retry attempt (via RetryListener)
   ↓
8. Retry operation (go back to step 1)
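
Step 4 can be made concrete with a small worked example. The sketch below assumes the common capped form of exponential backoff, where each delay is the previous one times the multiplier, limited to a maximum. BackoffSchedule and delaysMillis are hypothetical names, and the sample values (2s initial interval, multiplier 5, 3 minute cap) are assumptions chosen to match the 2s and 10s waits in the example output on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Spring Retry. */
final class BackoffSchedule {

    private BackoffSchedule() {
    }

    /** Returns the wait before each retry: min(initial * multiplier^n, max). */
    static List<Long> delaysMillis(long initialMillis, double multiplier,
                                   long maxMillis, int retries) {
        List<Long> delays = new ArrayList<>();
        double next = initialMillis;
        for (int i = 0; i < retries; i++) {
            delays.add((long) Math.min(next, (double) maxMillis));
            // Cap before the next round so the value cannot grow without bound.
            next = Math.min(next * multiplier, (double) maxMillis);
        }
        return delays;
    }
}
```

Calling delaysMillis(2000, 5.0, 180000, 5) yields 2000, 10000, 50000, 180000, 180000: the fourth delay would be 250000 ms but is capped at the maximum.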

Detailed Flow Example

import org.springframework.retry.support.RetryTemplate;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;

public class RetryFlowExample {

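    private final RetryTemplate retryTemplate;
    private int callCount = 0;

    public RetryFlowExample(RetryTemplate retryTemplate) {
        this.retryTemplate = retryTemplate;
    }

    public String demonstrateRetryFlow() {
        callCount = 0;  // start each demonstration from a clean counter
        try {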
            return retryTemplate.execute(context -> {
                callCount++;
                System.out.println("Attempt " + callCount);
                
                if (callCount == 1) {
                    // First call: transient error (will retry)
                    throw new TransientAiException("HTTP 503 - Service unavailable");
                } else if (callCount == 2) {
                    // Second call: transient error (will retry)
                    throw new TransientAiException("HTTP 429 - Rate limit exceeded");
                } else if (callCount == 3) {
                    // Third call: success
                    return "Success after 2 retries";
                }
                
                return "Should not reach here";
            });
        } catch (TransientAiException e) {
            // Only reached if all retries exhausted
            System.out.println("All retries failed: " + e.getMessage());
            return "Failure";
        }
    }

    public String demonstrateNonTransientFlow() {
        callCount = 0;  // reset so the count printed below reflects this call only
        try {
            return retryTemplate.execute(context -> {
                callCount++;
                System.out.println("Attempt " + callCount);
                
                // Non-transient error: fails immediately, no retry
                throw new NonTransientAiException("HTTP 401 - Invalid API key");
            });
        } catch (NonTransientAiException e) {
            // Reached immediately after first attempt
            System.out.println("Immediate failure: " + e.getMessage());
            System.out.println("Call count: " + callCount);  // Will be 1
            return "Configuration error";
        }
    }
}

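Output for demonstrateRetryFlow() (bracketed waits and the Result line are annotations, not printed output; the waits assume an initial interval of 2s and a multiplier of 5):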

Attempt 1
[Wait 2s]
Attempt 2
[Wait 10s]
Attempt 3
Result: "Success after 2 retries"

Output for demonstrateNonTransientFlow():

Attempt 1
Immediate failure: HTTP 401 - Invalid API key
Call count: 1
Result: "Configuration error"

Custom Exception Handling

Creating Domain-Specific Exceptions

Extend the base exception types for domain-specific errors:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.util.List;

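/**
 * Thrown when AI model quota is exceeded (treated as transient here - resolves once the quota resets)
 * Extends TransientAiException on the assumption that the quota resets within the retry window;
 * for a monthly quota that will not reset soon, extend NonTransientAiException instead
 */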
public class QuotaExceededException extends TransientAiException {
    
    private final int quotaLimit;
    private final int quotaUsed;
    private final long resetTimeMillis;
    
    public QuotaExceededException(String message, int quotaLimit, int quotaUsed, long resetTimeMillis) {
        super(message);
        this.quotaLimit = quotaLimit;
        this.quotaUsed = quotaUsed;
        this.resetTimeMillis = resetTimeMillis;
    }
    
    public int getQuotaLimit() { return quotaLimit; }
    public int getQuotaUsed() { return quotaUsed; }
    public long getResetTimeMillis() { return resetTimeMillis; }
    
    @Override
    public String getMessage() {
        return String.format(
            "Quota exceeded: %d/%d used. Resets at %tF %tT",
            quotaUsed, quotaLimit, resetTimeMillis, resetTimeMillis
        );
    }
}

/**
 * Thrown when API key is invalid (non-transient - needs configuration fix)
 * Extends NonTransientAiException because API key won't become valid on retry
 */
public class InvalidApiKeyException extends NonTransientAiException {
    
    private final String apiKeyPrefix;  // For debugging (first 4 chars)
    
    public InvalidApiKeyException(String message, String apiKeyPrefix) {
        super(message);
        this.apiKeyPrefix = apiKeyPrefix;
    }
    
    public String getApiKeyPrefix() { return apiKeyPrefix; }
    
    @Override
    public String getMessage() {
        return super.getMessage() + " (key prefix: " + apiKeyPrefix + "...)";
    }
}

/**
 * Thrown when model is not available (transient - might be deployed later)
 * Extends TransientAiException because model might become available
 */
public class ModelNotAvailableException extends TransientAiException {
    
    private final String modelName;
    private final List<String> availableModels;
    
    public ModelNotAvailableException(String modelName, List<String> availableModels) {
        super(String.format("Model '%s' not available", modelName));
        this.modelName = modelName;
        this.availableModels = availableModels;
    }
    
    public String getModelName() { return modelName; }
    public List<String> getAvailableModels() { return availableModels; }
}

/**
 * Thrown when request format is invalid (non-transient - needs code fix)
 * Extends NonTransientAiException because format won't become valid on retry
 */
public class InvalidRequestFormatException extends NonTransientAiException {
    
    private final String fieldName;
    private final String expectedFormat;
    private final String actualValue;
    
    public InvalidRequestFormatException(
            String fieldName, 
            String expectedFormat, 
            String actualValue) {
        super(String.format(
            "Invalid format for field '%s': expected %s, got %s",
            fieldName, expectedFormat, actualValue
        ));
        this.fieldName = fieldName;
        this.expectedFormat = expectedFormat;
        this.actualValue = actualValue;
    }
    
    public String getFieldName() { return fieldName; }
    public String getExpectedFormat() { return expectedFormat; }
    public String getActualValue() { return actualValue; }
}

Usage of domain-specific exceptions:

public class AiClient {

    public String complete(String prompt, String model) {
        // Check quota
        if (isQuotaExceeded()) {
            throw new QuotaExceededException(
                "Monthly quota exceeded",
                1000,  // quota limit
                1023,  // quota used
                getQuotaResetTime()
            );
        }
        
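        // Validate API key (apiKey is assumed to be a field of this client)
        if (!isValidApiKey(apiKey)) {
            // Up to the first 4 chars for debugging; tolerates null or short keys
            String prefix = (apiKey == null)
                ? ""
                : apiKey.substring(0, Math.min(4, apiKey.length()));
            throw new InvalidApiKeyException("Invalid API key", prefix);
        }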
        
        // Check model availability
        if (!isModelAvailable(model)) {
            throw new ModelNotAvailableException(
                model,
                getAvailableModels()
            );
        }
        
        // Validate prompt format
        if (prompt.length() > 4096) {
            throw new InvalidRequestFormatException(
                "prompt",
                "string (max 4096 chars)",
                "string (" + prompt.length() + " chars)"
            );
        }
        
        return performCompletion(prompt, model);
    }
}

Custom Error Handler

Create a custom error handler that throws these exceptions:

import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

/**
 * Custom ResponseErrorHandler with detailed error classification
 */
public class CustomAiErrorHandler implements ResponseErrorHandler {

    @Override
    public boolean hasError(ClientHttpResponse response) throws IOException {
        HttpStatusCode statusCode = response.getStatusCode();
        // Treat 404 as non-error (resource might not exist yet)
        if (statusCode.value() == 404) {
            return false;
        }
        return statusCode.isError();
    }

    @Override
    public void handleError(ClientHttpResponse response) throws IOException {
        int statusCode = response.getStatusCode().value();
        String errorBody = readErrorBody(response);
        String errorMessage = "HTTP " + statusCode + " - " + errorBody;

        // Detailed classification based on status code
        switch (statusCode) {
            // Authentication errors (non-transient)
            case 401:
                if (errorBody.contains("invalid_api_key")) {
                    throw new InvalidApiKeyException(
                        errorMessage,
                        extractApiKeyPrefix(errorBody)
                    );
                }
                throw new NonTransientAiException(errorMessage);

            // Authorization errors (non-transient)
            case 403:
                if (errorBody.contains("quota_exceeded")) {
                    // Quota exceeded - transient if it resets
                    throw new QuotaExceededException(
                        errorMessage,
                        extractQuotaLimit(errorBody),
                        extractQuotaUsed(errorBody),
                        extractResetTime(errorBody)
                    );
                }
                throw new NonTransientAiException(errorMessage);

            // Bad request (non-transient)
            case 400:
                throw new InvalidRequestFormatException(
                    extractFieldName(errorBody),
                    extractExpectedFormat(errorBody),
                    extractActualValue(errorBody)
                );

            // Rate limit (transient)
            case 429:
                int retryAfter = extractRetryAfter(response, errorBody);
                throw new TransientAiException(
                    errorMessage + " - Retry after " + retryAfter + "s"
                );

            // Server errors (transient)
            case 500:
            case 502:
            case 503:
            case 504:
                throw new TransientAiException(errorMessage);

            // Default classification
            default:
                if (statusCode >= 400 && statusCode < 500) {
                    throw new NonTransientAiException(errorMessage);
                } else {
                    throw new TransientAiException(errorMessage);
                }
        }
    }

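    /**
     * Read error response body, capped at 4096 characters
     */
    private String readErrorBody(ClientHttpResponse response) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(response.getBody(), StandardCharsets.UTF_8))) {

            // Cap the body at 4096 chars to prevent memory issues.
            // A single read() may return fewer chars than are available,
            // so keep reading until the buffer is full or the stream ends.
            char[] buffer = new char[4096];
            int total = 0;
            int charsRead;
            while (total < buffer.length
                    && (charsRead = reader.read(buffer, total, buffer.length - total)) != -1) {
                total += charsRead;
            }

            if (total == 0) {
                return "No response body available";
            }

            return new String(buffer, 0, total);
        }
    }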

    /**
     * Extract retry-after value from response
     */
    private int extractRetryAfter(ClientHttpResponse response, String errorBody) {
        // Check Retry-After header
        String retryAfterHeader = response.getHeaders().getFirst("Retry-After");
        if (retryAfterHeader != null) {
            try {
                return Integer.parseInt(retryAfterHeader);
            } catch (NumberFormatException e) {
                // Not an integer (Retry-After may also be an HTTP-date); fall through
            }
        }
        
        // Parse from error body (example: {"retry_after": 60})
        // Simplified - use JSON parser in production
        int retryAfter = 60;  // default
        if (errorBody.contains("retry_after")) {
            // Extract value (simplified)
            retryAfter = 60;
        }
        
        return retryAfter;
    }

    // Helper methods for extracting information from error body
    private String extractApiKeyPrefix(String errorBody) {
        // Simplified - use JSON parser in production
        return "sk-...";
    }

    private int extractQuotaLimit(String errorBody) {
        // Simplified - use JSON parser in production
        return 1000;
    }

    private int extractQuotaUsed(String errorBody) {
        // Simplified - use JSON parser in production
        return 1023;
    }

    private long extractResetTime(String errorBody) {
        // Simplified - use JSON parser in production
        return System.currentTimeMillis() + (24 * 60 * 60 * 1000);  // +24 hours
    }

    private String extractFieldName(String errorBody) {
        // Simplified - use JSON parser in production
        return "prompt";
    }

    private String extractExpectedFormat(String errorBody) {
        // Simplified - use JSON parser in production
        return "string (max 4096 chars)";
    }

    private String extractActualValue(String errorBody) {
        // Simplified - use JSON parser in production
        return "string (5000 chars)";
    }
}
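
The extractRetryAfter helper above is deliberately simplified. As a rough sketch of what fuller parsing could look like, the class below handles both standard forms of the Retry-After header (delta-seconds and HTTP-date) and then falls back to a "retry_after" field in the body. RetryAfterParser, its method names, and the "retry_after" field name are hypothetical and not part of Spring AI; a real implementation would likely use a JSON parser rather than a regular expression.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spring AI.
final class RetryAfterParser {

    // Assumed body shape: {"retry_after": 60}. The field name is an assumption.
    private static final Pattern BODY_FIELD =
            Pattern.compile("\"retry_after\"\\s*:\\s*(\\d{1,9})");

    static final long DEFAULT_SECONDS = 60;

    private RetryAfterParser() {
    }

    /** Returns the suggested delay in seconds (never negative). */
    static long parse(String headerValue, String errorBody, Instant now) {
        if (headerValue != null) {
            String value = headerValue.trim();
            try {
                // Form 1: delta-seconds, e.g. "120"
                return Math.max(0, Long.parseLong(value));
            } catch (NumberFormatException ignored) {
                // Not a number; try the HTTP-date form next
            }
            try {
                // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
                Instant retryAt = ZonedDateTime
                        .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                        .toInstant();
                return Math.max(0, Duration.between(now, retryAt).getSeconds());
            } catch (DateTimeParseException ignored) {
                // Unparseable header; fall back to the body
            }
        }
        if (errorBody != null) {
            Matcher matcher = BODY_FIELD.matcher(errorBody);
            if (matcher.find()) {
                return Long.parseLong(matcher.group(1));
            }
        }
        return DEFAULT_SECONDS;
    }
}
```

Passing the current time in as a parameter keeps the HTTP-date branch deterministic in tests.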

Exception Message Format

The auto-configured ResponseErrorHandler creates error messages in this format:

HTTP {status_code} - {response_body}

Examples:

HTTP 429 - Rate limit exceeded. Please retry after 60 seconds.
HTTP 401 - Invalid authentication credentials.
HTTP 503 - Service temporarily unavailable due to maintenance.
HTTP 500 - Internal server error: NullPointerException at ServiceImpl.java:42
HTTP 400 - Invalid request: field 'prompt' is required
HTTP 404 - Model 'gpt-5' not found
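
If calling code needs the status code back out of such a message, a small helper can build and parse the format described above. This is a sketch only: AiErrorMessages and its methods are hypothetical names, and parsing exception messages is brittle, so prefer a dedicated field on a custom exception where one is available.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spring AI.
final class AiErrorMessages {

    // "HTTP {status_code} - {response_body}"; DOTALL lets the body span lines
    private static final Pattern FORMAT =
            Pattern.compile("^HTTP (\\d{3}) - (.*)$", Pattern.DOTALL);

    private AiErrorMessages() {
    }

    /** Builds a message in the documented format. */
    static String format(int statusCode, String responseBody) {
        return "HTTP " + statusCode + " - " + responseBody;
    }

    /** Extracts the status code, or returns empty if the message has another shape. */
    static Optional<Integer> statusCodeOf(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher matcher = FORMAT.matcher(message);
        return matcher.matches()
                ? Optional.of(Integer.parseInt(matcher.group(1)))
                : Optional.empty();
    }
}
```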

Include context in your exception messages to aid debugging:

// Good: Includes context
throw new TransientAiException(
    "HTTP 503 - AI model service unavailable. " +
    "Retry attempt " + retryCount + " of " + maxRetries
);

// Good: Includes original cause
throw new NonTransientAiException(
    "Failed to authenticate with AI service: " + originalException.getMessage(),
    originalException
);

// Good: Includes relevant parameters
throw new NonTransientAiException(
    "Invalid temperature parameter: " + temperature + ". Must be between 0 and 2."
);

// Less helpful: Vague message
throw new TransientAiException("Error occurred");

// Less helpful: Missing context
throw new NonTransientAiException("Invalid parameter");

Best Practices

1. Classify Correctly

Use TransientAiException only for truly temporary errors that might resolve on retry:

// GOOD: Transient - network might recover
throw new TransientAiException("Connection timeout", timeoutException);

// GOOD: Transient - service might come back up
throw new TransientAiException("HTTP 503 - Service unavailable");

// BAD: Non-transient classified as transient - wastes retries
throw new TransientAiException("HTTP 401 - Invalid API key");
// SHOULD BE:
throw new NonTransientAiException("HTTP 401 - Invalid API key");

// BAD: Transient classified as non-transient - misses recovery opportunity
throw new NonTransientAiException("HTTP 503 - Service unavailable");
// SHOULD BE:
throw new TransientAiException("HTTP 503 - Service unavailable");
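
The status-only part of this classification can be captured in one small function. The sketch below mirrors the custom handler example earlier in this document (429 and 5xx are transient, other 4xx are not). StatusClassifier is a hypothetical name, and the rule is an assumption for illustration: the auto-configured handler treats 429 as transient only when configured to, and body-dependent cases such as a 403 quota error need more than the status code.

```java
// Hypothetical helper for illustration; not part of Spring AI.
final class StatusClassifier {

    private StatusClassifier() {
    }

    /**
     * Returns true when a retry of the same request might succeed.
     * Assumed rule: rate limits (429) and server errors (5xx) are transient,
     * every other status is not.
     */
    static boolean isTransient(int statusCode) {
        if (statusCode == 429) {
            return true; // rate limit: retry after backing off
        }
        return statusCode >= 500 && statusCode < 600; // server-side failures
    }
}
```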

2. Include Context

Add relevant details to exception messages:

// GOOD: Includes status code, response body, and retry info
throw new TransientAiException(String.format(
    "HTTP %d - %s (attempt %d/%d)",
    statusCode, responseBody, attemptNumber, maxAttempts
));

// GOOD: Includes parameter name and valid range
throw new NonTransientAiException(String.format(
    "Invalid parameter '%s': value %s is outside valid range [%s, %s]",
    paramName, actualValue, minValue, maxValue
));

// BAD: No context
throw new TransientAiException("Error");

3. Preserve Causes

Pass the original exception as the cause for full stack traces:

// GOOD: Preserves full exception chain
try {
    performOperation();
} catch (IOException e) {
    throw new TransientAiException("Network error: " + e.getMessage(), e);
}

// BAD: Loses original stack trace
try {
    performOperation();
} catch (IOException e) {
    throw new TransientAiException("Network error");  // No cause
}
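
A preserved cause chain can also be inspected programmatically, for example to log or branch on the underlying failure. The following is a minimal sketch using only java.lang.Throwable; Causes and rootCause are hypothetical names.

```java
// Hypothetical helper for illustration; not part of Spring AI.
final class Causes {

    private Causes() {
    }

    /** Walks getCause() links and returns the innermost exception. */
    static Throwable rootCause(Throwable throwable) {
        Throwable current = throwable;
        // Stop at the end of the chain; the identity check guards against cycles of length one
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

When the wrapper is created without a cause (the BAD example above), rootCause returns the wrapper itself and the original IOException is unrecoverable.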

4. Document Exceptions

Document which exceptions your methods throw:

/**
 * Calls AI completion API
 * 
 * @param prompt The prompt text
 * @return Completion result
 * @throws TransientAiException If a temporary error occurs (network, server error, rate limit)
 * @throws NonTransientAiException If a permanent error occurs (auth, validation, configuration)
 */
public String complete(String prompt) {
    // Implementation
}

5. Handle Exhausted Retries

Catch TransientAiException once retries are exhausted so the failure can be handled gracefully:

try {
    return retryTemplate.execute(context -> callApi());
} catch (TransientAiException e) {
    // All retries exhausted - provide fallback
    log.error("Service unavailable after {} attempts", maxAttempts, e);
    return fallbackResponse();
}
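
To make the control flow explicit, here is a plain-Java sketch of the same idea without Spring Retry on the classpath. TransientFailure and PermanentFailure are stand-ins for TransientAiException and NonTransientAiException, and RetrySketch.callWithFallback is a hypothetical helper. It omits backoff entirely, which a real RetryTemplate would apply between attempts.

```java
import java.util.function.Supplier;

// Illustrative sketch only; this is not how RetryTemplate is implemented.
final class RetrySketch {

    // Stand-in for TransientAiException
    static class TransientFailure extends RuntimeException {
        TransientFailure(String message) {
            super(message);
        }
    }

    // Stand-in for NonTransientAiException
    static class PermanentFailure extends RuntimeException {
        PermanentFailure(String message) {
            super(message);
        }
    }

    private RetrySketch() {
    }

    /**
     * Runs the operation up to maxAttempts times. Transient failures are
     * retried; once attempts are exhausted the fallback is used. Any other
     * exception, including PermanentFailure, propagates immediately.
     */
    static <T> T callWithFallback(Supplier<T> operation, int maxAttempts,
                                  Supplier<T> fallback) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (TransientFailure e) {
                // Retryable: loop again unless this was the last attempt
            }
        }
        return fallback.get();
    }
}
```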

6. Don't Overuse Transient

When in doubt, use NonTransientAiException to avoid wasteful retries:

// If uncertain whether error is transient:
// Better to fail fast than waste time on futile retries
if (unknownErrorCondition) {
    throw new NonTransientAiException("Unknown error condition");
    // User can manually retry if they think it's transient
}

7. Log Appropriately

Log TransientAiException at WARN level, NonTransientAiException at ERROR level:

try {
    return callApi();
} catch (TransientAiException e) {
    // Transient - might recover
    log.warn("Transient error (will retry): {}", e.getMessage());
    throw e;
} catch (NonTransientAiException e) {
    // Non-transient - requires intervention
    log.error("Non-transient error (won't retry): {}", e.getMessage(), e);
    throw e;
}

Testing Exception Behavior

Testing Transient Errors

Use RetryUtils.SHORT_RETRY_TEMPLATE for tests:

import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.TransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class TransientErrorRetryTest {

    @Test
    void testTransientErrorSucceedsAfterRetries() {
        AtomicInteger attempts = new AtomicInteger(0);

        String result = RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
            int attemptNumber = attempts.incrementAndGet();
            
            // Fail first 2 attempts
            if (attemptNumber < 3) {
                throw new TransientAiException("Transient failure " + attemptNumber);
            }
            
            // Succeed on 3rd attempt
            return "success";
        });

        assertThat(result).isEqualTo("success");
        assertThat(attempts.get()).isEqualTo(3);  // Succeeded on 3rd attempt
    }

    @Test
    void testTransientErrorExhaustsRetries() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                attempts.incrementAndGet();
                throw new TransientAiException("Always fails");
            });
        }).isInstanceOf(TransientAiException.class)
          .hasMessageContaining("Always fails");

        // SHORT_RETRY_TEMPLATE has maxAttempts=10
        assertThat(attempts.get()).isEqualTo(10);
    }
}

Testing Non-Transient Errors

import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class NonTransientErrorTest {

    @Test
    void testNonTransientErrorFailsImmediately() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                attempts.incrementAndGet();
                throw new NonTransientAiException("Permanent failure");
            });
        }).isInstanceOf(NonTransientAiException.class)
          .hasMessageContaining("Permanent failure");

        // Should fail immediately without retry
        assertThat(attempts.get()).isEqualTo(1);
    }

    @Test
    void testMixedExceptions() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                int attemptNumber = attempts.incrementAndGet();
                
                if (attemptNumber == 1) {
                    // First: transient (will retry)
                    throw new TransientAiException("Transient error");
                } else {
                    // Second: non-transient (will fail immediately)
                    throw new NonTransientAiException("Permanent error");
                }
            });
        }).isInstanceOf(NonTransientAiException.class);

        // Transient retry + non-transient failure = 2 attempts
        assertThat(attempts.get()).isEqualTo(2);
    }
}

Mocking Error Responses

import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatus;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import org.mockito.Mockito;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class ErrorHandlerTest {

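    // Assumes the handler is configured to treat 429 as transient
    // (TransientAiException covers 429 only when configured)
    @Test
    void testRateLimitThrowsTransientException() throws IOException {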
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.TOO_MANY_REQUESTS,
            "Rate limit exceeded"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(TransientAiException.class)
            .hasMessageContaining("429")
            .hasMessageContaining("Rate limit exceeded");
    }

    @Test
    void testUnauthorizedThrowsNonTransientException() throws IOException {
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.UNAUTHORIZED,
            "Invalid API key"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(NonTransientAiException.class)
            .hasMessageContaining("401")
            .hasMessageContaining("Invalid API key");
    }

    @Test
    void testServerErrorThrowsTransientException() throws IOException {
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.INTERNAL_SERVER_ERROR,
            "Internal server error"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(TransientAiException.class)
            .hasMessageContaining("500");
    }

    private ClientHttpResponse createMockResponse(HttpStatus status, String body) 
            throws IOException {
        ClientHttpResponse response = Mockito.mock(ClientHttpResponse.class);
        Mockito.when(response.getStatusCode()).thenReturn(HttpStatusCode.valueOf(status.value()));
        Mockito.when(response.getBody()).thenReturn(
            new ByteArrayInputStream(body.getBytes())
        );
        return response;
    }

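    private ResponseErrorHandler createAutoConfiguredErrorHandler() {
        // Placeholder: in real tests, inject the auto-configured
        // ResponseErrorHandler bean from the Spring context and return it here
        throw new UnsupportedOperationException(
            "Inject the ResponseErrorHandler bean from the Spring context");
    }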
}

Exception Hierarchy

java.lang.Object
└── java.lang.Throwable
    └── java.lang.Exception
        └── java.lang.RuntimeException
            ├── org.springframework.ai.retry.TransientAiException
            │   ├── QuotaExceededException (custom)
            │   ├── ModelNotAvailableException (custom)
            │   └── (other custom transient exceptions)
            └── org.springframework.ai.retry.NonTransientAiException
                ├── InvalidApiKeyException (custom)
                ├── InvalidRequestFormatException (custom)
                └── (other custom non-transient exceptions)

Both exception types extend RuntimeException, making them unchecked exceptions that don't require explicit throws declarations in method signatures.

Benefits of unchecked exceptions:

Cleaner method signatures
Optional handling (caller can choose to catch or propagate)
Consistent with Spring framework exception patterns
Reduces boilerplate code

Serializable: Both exceptions are serializable (RuntimeException inherits Serializable from Throwable), allowing them to:

Be transmitted across network boundaries (RMI, distributed systems)
Be stored in session (if needed)
Keep their message, cause, and stack trace when deserialized
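
A quick way to check that an exception survives serialization is a round trip through Java's built-in object streams. The sketch below uses only the JDK; ExceptionRoundTrip is a hypothetical helper. It would apply equally to TransientAiException, NonTransientAiException, or a custom subclass, provided any extra fields on the subclass are themselves serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper for illustration; not part of Spring AI.
final class ExceptionRoundTrip {

    private ExceptionRoundTrip() {
    }

    /** Serializes the throwable to bytes and reads it back as a new instance. */
    static <T extends Throwable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            // Surfaces non-serializable fields or missing classes
            throw new IllegalStateException("Exception is not serializable", e);
        }
    }
}
```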

tessl i tessl/maven-org-springframework-ai--spring-ai-autoconfigure-retry@1.1.1
