tessl/maven-org-springframework-ai--spring-ai-autoconfigure-retry

Spring Boot auto-configuration for AI retry capabilities with exponential backoff and intelligent HTTP error handling

Exception Handling

These exception types classify AI errors as transient or non-transient, which controls retry behavior. They are part of the spring-ai-retry module, and the auto-configuration uses them to decide whether a failed call is retried.

Capabilities

TransientAiException

Exception for transient AI errors where a retry of the same operation might succeed without any intervention.

// Package: org.springframework.ai.retry
/**
 * Root of the hierarchy of Model access exceptions that are considered transient
 * A previously failed operation might succeed when retried
 *
 * Thrown for:
 * - Server errors (5xx)
 * - Network errors (timeouts, connection failures)
 * - Rate limits (429 when configured)
 * - Temporary service unavailability (503)
 * - Gateway errors (502, 504)
 *
 * Causes RetryTemplate to retry the operation
 * Extends: java.lang.RuntimeException (unchecked exception)
 * Thread-safe: Yes (immutable after construction)
 * Serializable: Yes (extends RuntimeException which is Serializable)
 *
 * @since 0.8.1
 */
public class TransientAiException extends RuntimeException {

    /**
     * Constructs a new TransientAiException with the specified message
     * 
     * Usage:
     * throw new TransientAiException("HTTP 503 - Service temporarily unavailable");
     * 
     * @param message Error message describing the transient failure (can be null)
     */
    public TransientAiException(String message);

    /**
     * Constructs a new TransientAiException with message and cause
     * 
     * Usage:
     * throw new TransientAiException("Connection timeout", networkException);
     * 
     * Preserves full stack trace of underlying cause
     * Use this constructor to maintain exception chain
     * 
     * @param message Error message describing the transient failure (can be null)
     * @param cause The underlying cause of the exception (can be null)
     */
    public TransientAiException(String message, Throwable cause);
}

When to throw TransientAiException:

  1. Server errors (5xx)

    • 500 Internal Server Error
    • 502 Bad Gateway
    • 503 Service Unavailable
    • 504 Gateway Timeout
  2. Rate limiting

    • 429 Too Many Requests
    • Custom rate limit responses
  3. Network errors

    • Connection timeout (ConnectException)
    • Socket timeout (SocketTimeoutException)
    • Connection refused (ConnectException)
    • Network unreachable (NoRouteToHostException)
    • DNS resolution failures (UnknownHostException)
  4. Temporary unavailability

    • Service maintenance windows
    • Temporary capacity issues
    • Load balancer health check failures
    • Circuit breaker open states

Behavior with RetryTemplate:

  • Triggers retry if attempts remain
  • Uses configured backoff strategy
  • Propagated to caller if all retries exhausted
  • Logged by retry listener on each attempt

NonTransientAiException

Exception for non-transient AI errors where a retry of the same operation will fail unless the cause is corrected.

// Package: org.springframework.ai.retry
/**
 * Root of the hierarchy of Model access exceptions that are considered non-transient
 * A retry of the same operation would fail unless the cause is corrected
 *
 * Thrown for:
 * - Authentication errors (401)
 * - Authorization errors (403)
 * - Bad request errors (400)
 * - Not found errors (404)
 * - Client errors (4xx when configured)
 * - Explicitly configured non-transient codes
 * - Invalid configuration (malformed URLs, missing parameters)
 *
 * Causes RetryTemplate to fail immediately without retry
 * Extends: java.lang.RuntimeException (unchecked exception)
 * Thread-safe: Yes (immutable after construction)
 * Serializable: Yes (extends RuntimeException which is Serializable)
 *
 * @since 0.8.1
 */
public class NonTransientAiException extends RuntimeException {

    /**
     * Constructs a new NonTransientAiException with the specified message
     * 
     * Usage:
     * throw new NonTransientAiException("HTTP 401 - Invalid API key");
     * 
     * @param message Error message describing the non-transient failure (can be null)
     */
    public NonTransientAiException(String message);

    /**
     * Constructs a new NonTransientAiException with message and cause
     * 
     * Usage:
     * throw new NonTransientAiException("Invalid API key", authException);
     * 
     * Preserves full stack trace of underlying cause
     * Use this constructor to maintain exception chain
     * 
     * @param message Error message describing the non-transient failure (can be null)
     * @param cause The underlying cause of the exception (can be null)
     */
    public NonTransientAiException(String message, Throwable cause);
}

When to throw NonTransientAiException:

  1. Authentication errors

    • 401 Unauthorized
    • Invalid API keys
    • Expired tokens
    • Missing authentication headers
  2. Authorization errors

    • 403 Forbidden
    • Insufficient permissions
    • Account disabled/suspended
    • Resource access denied
  3. Client errors

    • 400 Bad Request
    • 404 Not Found
    • 405 Method Not Allowed
    • 406 Not Acceptable
    • 415 Unsupported Media Type
    • 422 Unprocessable Entity
  4. Configuration errors

    • Invalid endpoint URL
    • Malformed request body
    • Missing required parameters
    • Invalid parameter values
    • Schema validation failures

Behavior with RetryTemplate:

  • Causes immediate failure without retry
  • Propagated to caller immediately
  • Not logged by retry listener (no retry attempted)
  • Bypasses backoff strategy
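
The two behaviors described for RetryTemplate (retry on transient errors, immediate failure on everything else) can be pictured with a plain-Java loop. This is a sketch only: it is not how RetryTemplate is implemented, the nested exception classes are stand-ins for the Spring AI types so the block compiles without dependencies, `RetrySketch.execute` is a hypothetical helper, and backoff and listeners are omitted.

```java
import java.util.function.Supplier;

class RetrySketch {

    // Stand-ins for the Spring AI exception types (assumption: same names, no extra behavior)
    static class TransientAiException extends RuntimeException {
        TransientAiException(String message) { super(message); }
    }

    static class NonTransientAiException extends RuntimeException {
        NonTransientAiException(String message) { super(message); }
    }

    /** Calls the operation up to maxAttempts times, retrying only on TransientAiException. */
    static <T> T execute(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientAiException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (TransientAiException e) {
                // Retryable: remember the failure and loop again if attempts remain
                last = e;
            }
            // Any other exception, including NonTransientAiException, is not caught
            // here and therefore reaches the caller after a single attempt
        }
        // All attempts failed with a transient error: propagate the last one
        throw last;
    }
}
```

An operation that fails twice with `TransientAiException` and then succeeds is called three times in total, while one that throws `NonTransientAiException` is called exactly once.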

Usage Examples

Throwing Exceptions in Custom Code

Throw these exceptions to control retry behavior in your own code:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.net.SocketTimeoutException;

public class CustomAiClient {

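    /**
     * Example of throwing appropriate exceptions based on error type.
     * InvalidApiKeyException, RateLimitException and ServiceUnavailableException
     * stand in for unchecked exceptions raised by your own client code.
     */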
    public String callApi(String apiKey, String request) {
        // Validate input - configuration error, won't resolve on retry
        if (apiKey == null || apiKey.isEmpty()) {
            throw new NonTransientAiException("API key is required");
        }
        
        if (request == null || request.isEmpty() || request.length() > 4096) {
            throw new NonTransientAiException("Invalid request: must be 1-4096 characters");
        }
        
        try {
            return performApiCall(apiKey, request);
        } catch (SocketTimeoutException e) {
            // Network timeout - transient, retry might succeed
            throw new TransientAiException("Network timeout communicating with AI service", e);
        } catch (IOException e) {
            // General I/O error - could be transient
            throw new TransientAiException("I/O error communicating with AI service", e);
        } catch (InvalidApiKeyException e) {
            // Auth error - non-transient, retry won't help
            throw new NonTransientAiException("Invalid API key: " + e.getMessage(), e);
        } catch (RateLimitException e) {
            // Rate limit - transient, retry after backoff should succeed
            throw new TransientAiException(
                "Rate limit exceeded: " + e.getRetryAfterSeconds() + "s", e);
        } catch (ServiceUnavailableException e) {
            // Service down - transient, might recover
            throw new TransientAiException("AI service temporarily unavailable", e);
        }
    }

    private String performApiCall(String apiKey, String request) throws IOException {
        // Implementation
        return "result";
    }
}

Catching and Handling Exceptions

Handle these exceptions in application code:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AiService {

    private static final Logger log = LoggerFactory.getLogger(AiService.class);
    private final RetryTemplate retryTemplate;
    private final AiClient aiClient;

    public AiService(RetryTemplate retryTemplate, AiClient aiClient) {
        this.retryTemplate = retryTemplate;
        this.aiClient = aiClient;
    }

    /**
     * Calls AI service with retry and graceful error handling
     * Returns either the completion or a fallback message
     */
    public String getCompletion(String prompt) {
        try {
            // RetryTemplate handles TransientAiException automatically
            return retryTemplate.execute(context -> {
                log.debug("Calling AI service, attempt {}", context.getRetryCount() + 1);
                return aiClient.complete(prompt);
            });
        } catch (TransientAiException e) {
            // All retries exhausted - still a transient error
            log.error("AI service temporarily unavailable after all retries were exhausted: {}",
                      e.getMessage());
            return "Service temporarily unavailable. Please try again later.";
        } catch (NonTransientAiException e) {
            // Permanent failure - immediate failure, no retries
            log.error("AI service error that cannot be resolved by retry: {}", 
                      e.getMessage());
            
            // Check specific error types; the message may be null, so normalize it first
            String message = String.valueOf(e.getMessage());
            if (message.contains("401") || message.contains("Invalid API key")) {
                return "Service configuration error. Please contact support.";
            } else if (message.contains("400") || message.contains("Bad Request")) {
                return "Invalid request format. Please check your input.";
            } else {
                return "Service error. Please check your configuration.";
            }
        }
    }

    /**
     * Calls AI service with context-aware error handling
     * Distinguishes between different failure modes
     */
    public CompletionResult getCompletionWithDetails(String prompt) {
        try {
            String completion = retryTemplate.execute(context -> 
                aiClient.complete(prompt)
            );
            return CompletionResult.success(completion);
        } catch (TransientAiException e) {
            // Transient failure after all retries
            return CompletionResult.failure(
                FailureReason.TEMPORARY_UNAVAILABLE,
                "Service temporarily unavailable: " + e.getMessage(),
                true  // retryable
            );
        } catch (NonTransientAiException e) {
            // Determine specific failure reason; the message may be null, so normalize it first
            String message = String.valueOf(e.getMessage());
            if (message.contains("401") || message.contains("403")) {
                return CompletionResult.failure(
                    FailureReason.AUTHENTICATION_ERROR,
                    e.getMessage(),
                    false  // not retryable
                );
            } else if (message.contains("400")) {
                return CompletionResult.failure(
                    FailureReason.INVALID_REQUEST,
                    e.getMessage(),
                    false  // not retryable
                );
            } else {
                return CompletionResult.failure(
                    FailureReason.CONFIGURATION_ERROR,
                    e.getMessage(),
                    false  // not retryable
                );
            }
        }
    }
}

Using with ResponseErrorHandler

The auto-configured ResponseErrorHandler automatically throws these exceptions:

import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.support.RetryTemplate;

@Configuration
public class RestConfig {

    /**
     * Configure RestTemplate with auto-configured error handler
     * Error handler throws TransientAiException or NonTransientAiException
     * based on HTTP status code
     */
    @Bean
    public RestTemplate restTemplate(ResponseErrorHandler errorHandler) {
        RestTemplate template = new RestTemplate();
        // ErrorHandler will be invoked for 4xx and 5xx responses
        // Before response is returned to caller
        template.setErrorHandler(errorHandler);
        return template;
    }

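    /**
     * Example service using RestTemplate with retry.
     * AiService is assumed here to be an interface declaring callAi(String),
     * not the AiService class from the previous example.
     */
    @Bean
    public AiService aiService(RestTemplate restTemplate, RetryTemplate retryTemplate) {
        return new AiService() {
            @Override
            public String callAi(String prompt) {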
                return retryTemplate.execute(context -> {
                    // RestTemplate calls API
                    // If error response (4xx/5xx):
                    //   1. ResponseErrorHandler.hasError() returns true
                    //   2. ResponseErrorHandler.handleError() is invoked
                    //   3. Throws TransientAiException or NonTransientAiException
                    //   4. RetryTemplate catches and handles based on exception type
                    return restTemplate.postForObject(
                        "https://api.example.com/complete",
                        prompt,
                        String.class
                    );
                });
            }
        };
    }
}

Flow when RestTemplate encounters HTTP error:

  1. HTTP 5xx → ResponseErrorHandler throws TransientAiException → RetryTemplate retries
  2. HTTP 4xx → ResponseErrorHandler throws NonTransientAiException → RetryTemplate fails immediately
  3. HTTP 429 (if in onHttpCodes) → ResponseErrorHandler throws TransientAiException → RetryTemplate retries
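
The three rules above can be written as a small pure function. The sketch below mirrors only the rules as listed here; it is not the auto-configured handler's implementation, and `HttpErrorClassifier`, `ErrorKind` and `classify` are hypothetical names. The `onHttpCodes` parameter plays the role of the configured status codes that are always retried.

```java
import java.util.Set;

class HttpErrorClassifier {

    enum ErrorKind { TRANSIENT, NON_TRANSIENT }

    /**
     * Classifies an HTTP error status.
     * Codes listed in onHttpCodes are always transient; otherwise 4xx is
     * non-transient and 5xx is transient.
     */
    static ErrorKind classify(int status, Set<Integer> onHttpCodes) {
        if (status < 400 || status > 599) {
            throw new IllegalArgumentException("Not an HTTP error status: " + status);
        }
        if (onHttpCodes.contains(status)) {
            return ErrorKind.TRANSIENT;      // explicitly configured, e.g. 429
        }
        return status < 500
                ? ErrorKind.NON_TRANSIENT    // client error: retrying will not help
                : ErrorKind.TRANSIENT;       // server error: retrying might help
    }
}
```

With an empty `onHttpCodes` set, 429 falls under the 4xx rule and is treated as non-transient; adding 429 to the set moves it to the transient side.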

Error Classification Patterns

Transient Errors (Should Retry)

Errors that are temporary and might resolve on retry:

1. Server Errors (5xx)

// 500 Internal Server Error
throw new TransientAiException("HTTP 500 - Internal server error");

// 502 Bad Gateway
throw new TransientAiException("HTTP 502 - Bad gateway");

// 503 Service Unavailable
throw new TransientAiException("HTTP 503 - Service temporarily unavailable");

// 504 Gateway Timeout
throw new TransientAiException("HTTP 504 - Gateway timeout");

Retry rationale:

  • Server might be recovering from crash
  • Load balancer might find healthy instance
  • Temporary capacity issue might resolve
  • Database connection might be re-established

2. Rate Limiting

// 429 Too Many Requests
throw new TransientAiException("HTTP 429 - Rate limit exceeded. Retry after 60s");

// With retry-after information
throw new TransientAiException(String.format(
    "HTTP 429 - Rate limit exceeded. Retry after %d seconds", 
    retryAfterSeconds
));

Retry rationale:

  • Rate limit window will expire
  • Exponential backoff provides adequate wait time
  • Request itself is valid, just timing is wrong
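
The 429 examples above embed a retry-after value in the exception message. When that value comes from the standard Retry-After response header, it can be either a number of seconds or an HTTP-date. The following is a minimal sketch of parsing both forms with the JDK only; `RetryAfterParser` and `parse` are hypothetical names, and passing `now` explicitly is a design choice made here to keep the function testable.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

class RetryAfterParser {

    /**
     * Parses a Retry-After header value into a wait duration.
     * Returns empty when the value is missing or cannot be parsed.
     */
    static Optional<Duration> parse(String headerValue, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return Optional.empty();
        }
        String value = headerValue.trim();

        // Form 1: a non-negative number of seconds, e.g. "120"
        try {
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Optional.of(Duration.ofSeconds(seconds)) : Optional.empty();
        } catch (NumberFormatException ignored) {
            // Not a number; try the HTTP-date form next
        }

        // Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        try {
            Instant retryAt = ZonedDateTime
                    .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            Duration wait = Duration.between(now, retryAt);
            // A date in the past means the caller may retry right away
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait);
        } catch (DateTimeParseException ignored) {
            return Optional.empty();
        }
    }
}
```

A caller could format the returned duration into the `TransientAiException` message, falling back to a default wait when the result is empty.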

3. Network Errors

import java.net.SocketTimeoutException;
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.net.NoRouteToHostException;

// Connection timeout
try {
    makeConnection();
} catch (SocketTimeoutException e) {
    throw new TransientAiException("Connection timeout to AI service", e);
}

// Connection refused
try {
    makeConnection();
} catch (ConnectException e) {
    throw new TransientAiException("Connection refused: " + e.getMessage(), e);
}

// DNS resolution failure
try {
    resolveHost();
} catch (UnknownHostException e) {
    throw new TransientAiException("DNS resolution failed: " + e.getMessage(), e);
}

// Network unreachable
try {
    makeConnection();
} catch (NoRouteToHostException e) {
    throw new TransientAiException("Network unreachable: " + e.getMessage(), e);
}

Retry rationale:

  • Network might be temporarily congested
  • DNS might resolve correctly on next attempt
  • Network route might be re-established
  • Firewall issue might be transient

4. Temporary Unavailability

// Service maintenance
throw new TransientAiException(
    "Service maintenance in progress. Expected completion: 15:00 UTC"
);

// Capacity issues
throw new TransientAiException(
    "Service at capacity. Request queued."
);

// Circuit breaker open
throw new TransientAiException(
    "Circuit breaker open due to high error rate. Retry after cooldown."
);

// Load balancer health check failure
throw new TransientAiException(
    "No healthy instances available. Retry after health check interval."
);

Retry rationale:

  • Maintenance window will complete
  • Capacity might be available after queue processing
  • Circuit breaker will eventually close
  • Health checks might pass on next attempt

Non-Transient Errors (Should Not Retry)

Errors that are permanent and won't resolve without intervention:

1. Authentication Errors

// 401 Unauthorized
throw new NonTransientAiException("HTTP 401 - Invalid API key");

// Invalid credentials
throw new NonTransientAiException(
    "Authentication failed: API key not found in system"
);

// Expired token
throw new NonTransientAiException(
    "Authentication failed: Token expired. Please refresh token."
);

// Missing auth header
throw new NonTransientAiException(
    "Authentication required: Authorization header missing"
);

No retry rationale:

  • API key won't become valid on retry
  • Token won't unexpire automatically
  • Missing header indicates code bug, not transient issue

2. Authorization Errors

// 403 Forbidden
throw new NonTransientAiException("HTTP 403 - Insufficient permissions");

// Account disabled
throw new NonTransientAiException(
    "Account disabled. Please contact support."
);

// Resource access denied
throw new NonTransientAiException(
    "Access denied: User does not have permission to access model 'gpt-4'"
);

// Quota exceeded (permanent)
throw new NonTransientAiException(
    "Monthly quota exceeded. Upgrade plan or wait for reset."
);

No retry rationale:

  • Permissions won't change without admin action
  • Account status requires manual intervention
  • Quota limit requires payment or time passage (end of billing period)

3. Client Errors

// 400 Bad Request
throw new NonTransientAiException(
    "HTTP 400 - Invalid request format: JSON parsing error"
);

// 404 Not Found
throw new NonTransientAiException(
    "HTTP 404 - Endpoint not found: /v1/completios (typo in URL)"
);

// 405 Method Not Allowed
throw new NonTransientAiException(
    "HTTP 405 - Method GET not allowed for this endpoint. Use POST."
);

// 422 Unprocessable Entity
throw new NonTransientAiException(
    "HTTP 422 - Validation failed: prompt length exceeds maximum (4096 chars)"
);

// 415 Unsupported Media Type
throw new NonTransientAiException(
    "HTTP 415 - Content-Type must be application/json, received text/plain"
);

No retry rationale:

  • Request format won't become valid on retry
  • Endpoint URL won't change without code change
  • HTTP method won't become allowed without code change
  • Validation rules won't change on retry
  • Content-Type won't become supported on retry

4. Configuration Errors

// Invalid URL
try {
    new URL(endpoint);
} catch (MalformedURLException e) {
    throw new NonTransientAiException("Invalid endpoint URL: " + endpoint, e);
}

// Missing required parameter
if (apiKey == null) {
    throw new NonTransientAiException("Configuration error: API key not configured");
}

// Invalid parameter value
if (temperature < 0 || temperature > 2) {
    throw new NonTransientAiException(
        "Invalid temperature value: " + temperature + ". Must be between 0 and 2."
    );
}

// Schema validation failure
throw new NonTransientAiException(
    "Request schema validation failed: field 'model' is required"
);

No retry rationale:

  • Configuration won't fix itself on retry
  • Parameter validation rules won't change on retry
  • Code bug requires code fix, not retry

Integration with RetryTemplate

The auto-configured RetryTemplate is configured to retry on TransientAiException:

// Pseudocode for RetryTemplate configuration
RetryTemplate.builder()
    .maxAttempts(10)
    .retryOn(TransientAiException.class)      // Retry these
    .retryOn(ResourceAccessException.class)   // Spring's network error
    .retryOn(WebClientRequestException.class) // WebFlux network error (if available)
    .exponentialBackoff(initialInterval, multiplier, maxInterval)
    .build();

Exceptions NOT in the retry list (including NonTransientAiException) cause immediate failure.

Retry Behavior Flow

1. Operation throws exception
   ↓
2. Is exception TransientAiException, ResourceAccessException, or WebClientRequestException?
   ↓ Yes → Go to step 3
   ↓ No → Fail immediately (propagate to caller)
   ↓
3. Have we reached max attempts?
   ↓ Yes → Propagate TransientAiException to caller
   ↓ No → Go to step 4
   ↓
4. Calculate backoff delay using exponential formula
   ↓
5. Wait for backoff delay
   ↓
6. Increment retry counter
   ↓
7. Log retry attempt (via RetryListener)
   ↓
8. Retry operation (go back to step 1)
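
Step 4 refers to an exponential formula. A common form, assumed here rather than taken from the library source, is delay(n) = min(initialInterval × multiplier^(n-1), maxInterval) for the n-th retry. The sketch below computes that schedule; `BackoffSchedule` and `delayMillis` are hypothetical names.

```java
class BackoffSchedule {

    /**
     * Delay before the given retry (1 = first retry), capped at maxIntervalMillis.
     * Assumed formula: initialInterval * multiplier^(retryNumber - 1).
     */
    static long delayMillis(int retryNumber,
                            long initialIntervalMillis,
                            double multiplier,
                            long maxIntervalMillis) {
        if (retryNumber < 1) {
            throw new IllegalArgumentException("retryNumber must be at least 1");
        }
        double uncapped = initialIntervalMillis * Math.pow(multiplier, retryNumber - 1);
        // Cap the delay so it never grows beyond the configured maximum
        return (long) Math.min(uncapped, (double) maxIntervalMillis);
    }
}
```

With an initial interval of 2000 ms, a multiplier of 5 and a cap of 180000 ms (values chosen for illustration), the first four delays are 2000, 10000, 50000 and 180000 ms; the fourth would be 250000 ms without the cap.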

Detailed Flow Example

import org.springframework.retry.support.RetryTemplate;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;

public class RetryFlowExample {

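    private final RetryTemplate retryTemplate;
    private int callCount = 0;

    public RetryFlowExample(RetryTemplate retryTemplate) {
        this.retryTemplate = retryTemplate;
    }

    public String demonstrateRetryFlow() {
        callCount = 0;  // Reset so each demonstration counts from 1
        try {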
            return retryTemplate.execute(context -> {
                callCount++;
                System.out.println("Attempt " + callCount);
                
                if (callCount == 1) {
                    // First call: transient error (will retry)
                    throw new TransientAiException("HTTP 503 - Service unavailable");
                } else if (callCount == 2) {
                    // Second call: transient error (will retry)
                    throw new TransientAiException("HTTP 429 - Rate limit exceeded");
                } else if (callCount == 3) {
                    // Third call: success
                    return "Success after 2 retries";
                }
                
                return "Should not reach here";
            });
        } catch (TransientAiException e) {
            // Only reached if all retries exhausted
            System.out.println("All retries failed: " + e.getMessage());
            return "Failure";
        }
    }

    public String demonstrateNonTransientFlow() {
        callCount = 0;  // Reset so each demonstration counts from 1
        try {
            return retryTemplate.execute(context -> {
                callCount++;
                System.out.println("Attempt " + callCount);
                
                // Non-transient error: fails immediately, no retry
                throw new NonTransientAiException("HTTP 401 - Invalid API key");
            });
        } catch (NonTransientAiException e) {
            // Reached immediately after first attempt
            System.out.println("Immediate failure: " + e.getMessage());
            System.out.println("Call count: " + callCount);  // Will be 1
            return "Configuration error";
        }
    }
}

Output for demonstrateRetryFlow(), assuming an initial backoff interval of 2s and a multiplier of 5:

Attempt 1
[Wait 2s]
Attempt 2
[Wait 10s]
Attempt 3
Result: "Success after 2 retries"

Output for demonstrateNonTransientFlow():

Attempt 1
Immediate failure: HTTP 401 - Invalid API key
Call count: 1
Result: "Configuration error"

Custom Exception Handling

Creating Domain-Specific Exceptions

Extend the base exception types for domain-specific errors:

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.util.List;

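/**
 * Thrown when a short-lived AI model quota is exceeded (transient - resets after some time).
 * Extends TransientAiException on the assumption that the quota resets within the retry window.
 * For a quota that resets only at the end of a billing period, extend NonTransientAiException
 * instead, as in the "Quota exceeded (permanent)" example above.
 */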
public class QuotaExceededException extends TransientAiException {
    
    private final int quotaLimit;
    private final int quotaUsed;
    private final long resetTimeMillis;
    
    public QuotaExceededException(String message, int quotaLimit, int quotaUsed, long resetTimeMillis) {
        super(message);
        this.quotaLimit = quotaLimit;
        this.quotaUsed = quotaUsed;
        this.resetTimeMillis = resetTimeMillis;
    }
    
    public int getQuotaLimit() { return quotaLimit; }
    public int getQuotaUsed() { return quotaUsed; }
    public long getResetTimeMillis() { return resetTimeMillis; }
    
    @Override
    public String getMessage() {
        return String.format(
            "Quota exceeded: %d/%d used. Resets at %tF %tT",
            quotaUsed, quotaLimit, resetTimeMillis, resetTimeMillis
        );
    }
}

/**
 * Thrown when API key is invalid (non-transient - needs configuration fix)
 * Extends NonTransientAiException because API key won't become valid on retry
 */
public class InvalidApiKeyException extends NonTransientAiException {
    
    private final String apiKeyPrefix;  // For debugging (first 4 chars)
    
    public InvalidApiKeyException(String message, String apiKeyPrefix) {
        super(message);
        this.apiKeyPrefix = apiKeyPrefix;
    }
    
    public String getApiKeyPrefix() { return apiKeyPrefix; }
    
    @Override
    public String getMessage() {
        return super.getMessage() + " (key prefix: " + apiKeyPrefix + "...)";
    }
}

/**
 * Thrown when model is not available (transient - might be deployed later)
 * Extends TransientAiException because model might become available
 */
public class ModelNotAvailableException extends TransientAiException {
    
    private final String modelName;
    private final List<String> availableModels;
    
    public ModelNotAvailableException(String modelName, List<String> availableModels) {
        super(String.format("Model '%s' not available", modelName));
        this.modelName = modelName;
        this.availableModels = availableModels;
    }
    
    public String getModelName() { return modelName; }
    public List<String> getAvailableModels() { return availableModels; }
}

/**
 * Thrown when request format is invalid (non-transient - needs code fix)
 * Extends NonTransientAiException because format won't become valid on retry
 */
public class InvalidRequestFormatException extends NonTransientAiException {
    
    private final String fieldName;
    private final String expectedFormat;
    private final String actualValue;
    
    public InvalidRequestFormatException(
            String fieldName, 
            String expectedFormat, 
            String actualValue) {
        super(String.format(
            "Invalid format for field '%s': expected %s, got %s",
            fieldName, expectedFormat, actualValue
        ));
        this.fieldName = fieldName;
        this.expectedFormat = expectedFormat;
        this.actualValue = actualValue;
    }
    
    public String getFieldName() { return fieldName; }
    public String getExpectedFormat() { return expectedFormat; }
    public String getActualValue() { return actualValue; }
}

Usage of domain-specific exceptions:

public class AiClient {

    public String complete(String prompt, String model) {
        // Check quota
        if (isQuotaExceeded()) {
            throw new QuotaExceededException(
                "Monthly quota exceeded",
                1000,  // quota limit
                1023,  // quota used
                getQuotaResetTime()
            );
        }
        
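        // Validate API key (apiKey is assumed to be a field of this client)
        if (!isValidApiKey(apiKey)) {
            String prefix = (apiKey == null)
                ? ""
                : apiKey.substring(0, Math.min(4, apiKey.length()));
            throw new InvalidApiKeyException(
                "Invalid API key",
                prefix  // At most the first 4 chars, for debugging
            );
        }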
        
        // Check model availability
        if (!isModelAvailable(model)) {
            throw new ModelNotAvailableException(
                model,
                getAvailableModels()
            );
        }
        
        // Validate prompt format
        if (prompt.length() > 4096) {
            throw new InvalidRequestFormatException(
                "prompt",
                "string (max 4096 chars)",
                "string (" + prompt.length() + " chars)"
            );
        }
        
        return performCompletion(prompt, model);
    }
}
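
The parent class chosen for each domain-specific exception is what decides whether it is retried, because a type-based retry rule also matches subclasses. A minimal sketch of that idea follows, using stand-in classes so the block compiles without Spring on the classpath; `isRetryable` is a hypothetical helper, not part of Spring Retry or Spring AI.

```java
class ClassificationSketch {

    // Stand-ins for the Spring AI base types (assumption: same names, no extra behavior)
    static class TransientAiException extends RuntimeException {
        TransientAiException(String message) { super(message); }
    }

    static class NonTransientAiException extends RuntimeException {
        NonTransientAiException(String message) { super(message); }
    }

    // Simplified versions of two of the domain-specific exceptions shown above
    static class ModelNotAvailableException extends TransientAiException {
        ModelNotAvailableException(String modelName) {
            super("Model '" + modelName + "' not available");
        }
    }

    static class InvalidApiKeyException extends NonTransientAiException {
        InvalidApiKeyException(String message) { super(message); }
    }

    /** True for TransientAiException and any of its subclasses, false otherwise. */
    static boolean isRetryable(Throwable error) {
        return error instanceof TransientAiException;
    }
}
```

Here `ModelNotAvailableException` is retryable and `InvalidApiKeyException` is not, purely because of which base type each one extends.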

Custom Error Handler

Create a custom error handler that throws these exceptions:

import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

/**
 * Custom ResponseErrorHandler with detailed error classification
 */
public class CustomAiErrorHandler implements ResponseErrorHandler {

    @Override
    public boolean hasError(ClientHttpResponse response) throws IOException {
        HttpStatusCode statusCode = response.getStatusCode();
        // Treat 404 as non-error (resource might not exist yet)
        if (statusCode.value() == 404) {
            return false;
        }
        return statusCode.isError();
    }

    @Override
    public void handleError(ClientHttpResponse response) throws IOException {
        int statusCode = response.getStatusCode().value();
        String errorBody = readErrorBody(response);
        String errorMessage = "HTTP " + statusCode + " - " + errorBody;

        // Detailed classification based on status code
        switch (statusCode) {
            // Authentication errors (non-transient)
            case 401:
                if (errorBody.contains("invalid_api_key")) {
                    throw new InvalidApiKeyException(
                        errorMessage,
                        extractApiKeyPrefix(errorBody)
                    );
                }
                throw new NonTransientAiException(errorMessage);

            // Authorization errors (non-transient)
            case 403:
                if (errorBody.contains("quota_exceeded")) {
                    // Quota exceeded - transient if it resets
                    throw new QuotaExceededException(
                        errorMessage,
                        extractQuotaLimit(errorBody),
                        extractQuotaUsed(errorBody),
                        extractResetTime(errorBody)
                    );
                }
                throw new NonTransientAiException(errorMessage);

            // Bad request (non-transient)
            case 400:
                throw new InvalidRequestFormatException(
                    extractFieldName(errorBody),
                    extractExpectedFormat(errorBody),
                    extractActualValue(errorBody)
                );

            // Rate limit (transient)
            case 429:
                int retryAfter = extractRetryAfter(response, errorBody);
                throw new TransientAiException(
                    errorMessage + " - Retry after " + retryAfter + "s"
                );

            // Server errors (transient)
            case 500:
            case 502:
            case 503:
            case 504:
                throw new TransientAiException(errorMessage);

            // Default classification
            default:
                if (statusCode >= 400 && statusCode < 500) {
                    throw new NonTransientAiException(errorMessage);
                } else {
                    throw new TransientAiException(errorMessage);
                }
        }
    }

    /**
     * Read error response body with size limit
     */
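    private String readErrorBody(ClientHttpResponse response) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(response.getBody(), StandardCharsets.UTF_8))) {

            // Read at most 4096 characters to prevent memory issues.
            // A single read() may return fewer characters than are available,
            // so keep reading until the buffer is full or the stream ends.
            char[] buffer = new char[4096];
            int total = 0;
            int charsRead;
            while (total < buffer.length
                    && (charsRead = reader.read(buffer, total, buffer.length - total)) != -1) {
                total += charsRead;
            }

            if (total == 0) {
                return "No response body available";
            }

            return new String(buffer, 0, total);
        }
    }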

    /**
     * Extract retry-after value from response
     */
    private int extractRetryAfter(ClientHttpResponse response, String errorBody) {
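        // Check the Retry-After header (integer seconds form only)
        String retryAfterHeader = response.getHeaders().getFirst("Retry-After");
        if (retryAfterHeader != null) {
            try {
                return Integer.parseInt(retryAfterHeader.trim());
            } catch (NumberFormatException e) {
                // Not an integer (the header may also carry an HTTP-date);
                // fall through to the body and the default below
            }
        }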
        
        // Parse from error body (example: {"retry_after": 60})
        // Simplified - use JSON parser in production
        int retryAfter = 60;  // default
        if (errorBody.contains("retry_after")) {
            // Extract value (simplified)
            retryAfter = 60;
        }
        
        return retryAfter;
    }

    // Helper methods for extracting information from error body
    private String extractApiKeyPrefix(String errorBody) {
        // Simplified - use JSON parser in production
        return "sk-...";
    }

    private int extractQuotaLimit(String errorBody) {
        // Simplified - use JSON parser in production
        return 1000;
    }

    private int extractQuotaUsed(String errorBody) {
        // Simplified - use JSON parser in production
        return 1023;
    }

    private long extractResetTime(String errorBody) {
        // Simplified - use JSON parser in production
        return System.currentTimeMillis() + (24 * 60 * 60 * 1000);  // +24 hours
    }

    private String extractFieldName(String errorBody) {
        // Simplified - use JSON parser in production
        return "prompt";
    }

    private String extractExpectedFormat(String errorBody) {
        // Simplified - use JSON parser in production
        return "string (max 4096 chars)";
    }

    private String extractActualValue(String errorBody) {
        // Simplified - use JSON parser in production
        return "string (5000 chars)";
    }
}
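
The `extractRetryAfter` helper above only understands the integer form of `Retry-After`. HTTP also allows the header to carry an HTTP-date. The following is a minimal JDK-only sketch of parsing both forms, under the assumption that a fixed default is acceptable when the header is missing or malformed. The class name `RetryAfterParser` and its `parseSeconds` method are hypothetical and not part of Spring AI.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper (not part of Spring AI): parses a Retry-After header value.
public final class RetryAfterParser {

    private RetryAfterParser() {
    }

    /**
     * Returns the delay in seconds, or defaultSeconds when the header is
     * missing or cannot be parsed.
     */
    public static long parseSeconds(String headerValue, Instant now, long defaultSeconds) {
        if (headerValue == null || headerValue.isBlank()) {
            return defaultSeconds;
        }
        String value = headerValue.trim();
        try {
            // Form 1: delay in seconds, e.g. "120"
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? seconds : defaultSeconds;
        } catch (NumberFormatException notANumber) {
            // Fall through and try the HTTP-date form
        }
        try {
            // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            ZonedDateTime retryAt = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME);
            long seconds = Duration.between(now, retryAt.toInstant()).getSeconds();
            return Math.max(seconds, 0);  // a date in the past means "retry now"
        } catch (DateTimeParseException notADate) {
            return defaultSeconds;
        }
    }
}
```

Passing `now` as a parameter keeps the date branch deterministic in tests; production code would typically pass `Instant.now()`.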

Exception Message Format

The auto-configured ResponseErrorHandler creates error messages in this format:

HTTP {status_code} - {response_body}

Examples:

HTTP 429 - Rate limit exceeded. Please retry after 60 seconds.
HTTP 401 - Invalid authentication credentials.
HTTP 503 - Service temporarily unavailable due to maintenance.
HTTP 500 - Internal server error: NullPointerException at ServiceImpl.java:42
HTTP 400 - Invalid request: field 'prompt' is required
HTTP 404 - Model 'gpt-5' not found
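
As an illustration of the format above, the sketch below builds a message from a status code and body and reads the status code back out of such a message. The `ErrorMessages` class and both method names are hypothetical and not part of Spring AI. Substituting "No response body available" for an empty body is an assumption borrowed from the `readErrorBody` example.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Spring AI) illustrating the message format above.
public final class ErrorMessages {

    // Matches the "HTTP {status_code} - " prefix and captures the three-digit code
    private static final Pattern STATUS_PREFIX = Pattern.compile("^HTTP (\\d{3}) - ");

    private ErrorMessages() {
    }

    /** Builds a message in the "HTTP {status_code} - {response_body}" format. */
    public static String format(int statusCode, String responseBody) {
        String body = (responseBody == null || responseBody.isEmpty())
                ? "No response body available"  // assumed placeholder for an empty body
                : responseBody;
        return "HTTP " + statusCode + " - " + body;
    }

    /** Extracts the status code from a message in that format, if present. */
    public static OptionalInt statusCodeOf(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = STATUS_PREFIX.matcher(message);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }
}
```

Parsing the code back out of the message can be handy in log analysis or tests, although carrying the status code as a dedicated field on a custom exception is usually more robust than string matching.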

Include context in your exception messages to aid debugging:

// Good: Includes context
throw new TransientAiException(
    "HTTP 503 - AI model service unavailable. " +
    "Retry attempt " + retryCount + " of " + maxRetries
);

// Good: Includes original cause
throw new NonTransientAiException(
    "Failed to authenticate with AI service: " + originalException.getMessage(),
    originalException
);

// Good: Includes relevant parameters
throw new NonTransientAiException(
    "Invalid temperature parameter: " + temperature + ". Must be between 0 and 2."
);

// Less helpful: Vague message
throw new TransientAiException("Error occurred");

// Less helpful: Missing context
throw new NonTransientAiException("Invalid parameter");

Best Practices

1. Classify Correctly

Use TransientAiException only for truly temporary errors that might resolve on retry:

// GOOD: Transient - network might recover
throw new TransientAiException("Connection timeout", timeoutException);

// GOOD: Transient - service might come back up
throw new TransientAiException("HTTP 503 - Service unavailable");

// BAD: Non-transient classified as transient - wastes retries
throw new TransientAiException("HTTP 401 - Invalid API key");
// SHOULD BE:
throw new NonTransientAiException("HTTP 401 - Invalid API key");

// BAD: Transient classified as non-transient - misses recovery opportunity
throw new NonTransientAiException("HTTP 503 - Service unavailable");
// SHOULD BE:
throw new TransientAiException("HTTP 503 - Service unavailable");
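
One way to keep this classification in a single place is a small pure function over the status code. The sketch below mirrors the rules used in this document's examples (429 and non-4xx codes are transient, other 4xx codes are non-transient) and ignores the response body. `StatusClassifier` and `Kind` are hypothetical names, not part of Spring AI.

```java
// Hypothetical sketch (not part of Spring AI): classifies an HTTP error status
// code using the same rules as the examples in this document.
public final class StatusClassifier {

    public enum Kind { TRANSIENT, NON_TRANSIENT }

    private StatusClassifier() {
    }

    public static Kind classify(int statusCode) {
        if (statusCode == 429) {
            // Rate limit: the same request may succeed after waiting
            return Kind.TRANSIENT;
        }
        if (statusCode >= 400 && statusCode < 500) {
            // Other client errors (400, 401, 403, 404, ...) need a change to the request
            return Kind.NON_TRANSIENT;
        }
        // Server errors (500, 502, 503, 504, ...) may resolve on their own
        return Kind.TRANSIENT;
    }
}
```

A caller would then throw `TransientAiException` for `Kind.TRANSIENT` and `NonTransientAiException` for `Kind.NON_TRANSIENT`, which keeps the decision testable without any HTTP machinery.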

2. Include Context

Add relevant details to exception messages:

// GOOD: Includes status code, response body, and retry info
throw new TransientAiException(String.format(
    "HTTP %d - %s (attempt %d/%d)",
    statusCode, responseBody, attemptNumber, maxAttempts
));

// GOOD: Includes parameter name and valid range
throw new NonTransientAiException(String.format(
    "Invalid parameter '%s': value %s is outside valid range [%s, %s]",
    paramName, actualValue, minValue, maxValue
));

// BAD: No context
throw new TransientAiException("Error");

3. Preserve Causes

Pass the original exception as the cause for full stack traces:

// GOOD: Preserves full exception chain
try {
    performOperation();
} catch (IOException e) {
    throw new TransientAiException("Network error: " + e.getMessage(), e);
}

// BAD: Loses original stack trace
try {
    performOperation();
} catch (IOException e) {
    throw new TransientAiException("Network error");  // No cause
}

4. Document Exceptions

Document what exceptions your methods throw:

/**
 * Calls AI completion API
 * 
 * @param prompt The prompt text
 * @return Completion result
 * @throws TransientAiException If a temporary error occurs (network, server error, rate limit)
 * @throws NonTransientAiException If a permanent error occurs (auth, validation, configuration)
 */
public String complete(String prompt) {
    // Implementation
}

5. Handle Exhausted Retries

Catch TransientAiException once retries are exhausted so the failure can be handled gracefully:

try {
    return retryTemplate.execute(context -> callApi());
} catch (TransientAiException e) {
    // All retries exhausted - provide fallback
    log.error("Service unavailable after {} attempts", maxAttempts, e);
    return fallbackResponse();
}
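
To make the control flow explicit, here is a JDK-only sketch of the same pattern: retry on a transient failure, then fall back once attempts run out. It is a stand-in for illustration, not how `RetryTemplate` is implemented. `RetryWithFallback` and `DemoTransientException` are hypothetical names, and backoff between attempts is deliberately omitted.

```java
import java.util.function.Supplier;

// Hypothetical, JDK-only sketch; a stand-in for RetryTemplate plus the catch block above.
public final class RetryWithFallback {

    /** Stand-in for TransientAiException so the sketch compiles without Spring AI. */
    public static class DemoTransientException extends RuntimeException {
        public DemoTransientException(String message) {
            super(message);
        }
    }

    private RetryWithFallback() {
    }

    /**
     * Calls the operation up to maxAttempts times. Transient failures are retried;
     * once attempts are exhausted the fallback supplies the result. Any other
     * exception propagates immediately without a retry.
     */
    public static <T> T call(Supplier<T> operation, int maxAttempts, Supplier<T> fallback) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (DemoTransientException e) {
                // A real implementation would back off (sleep) here before retrying
            }
        }
        return fallback.get();
    }
}
```

Because only the stand-in transient type is caught, a non-transient failure leaves the loop on the first attempt, matching the fail-fast behavior described for `NonTransientAiException`.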

6. Don't Overuse Transient

When in doubt, use NonTransientAiException to avoid wasteful retries:

// If uncertain whether error is transient:
// Better to fail fast than waste time on futile retries
if (unknownErrorCondition) {
    throw new NonTransientAiException("Unknown error condition");
    // User can manually retry if they think it's transient
}

7. Log Appropriately

Log TransientAiException at WARN level, NonTransientAiException at ERROR level:

try {
    return callApi();
} catch (TransientAiException e) {
    // Transient - might recover
    log.warn("Transient error (will retry): {}", e.getMessage());
    throw e;
} catch (NonTransientAiException e) {
    // Non-transient - requires intervention
    log.error("Non-transient error (won't retry): {}", e.getMessage(), e);
    throw e;
}

Testing Exception Behavior

Testing Transient Errors

Use RetryUtils.SHORT_RETRY_TEMPLATE for tests:

import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.TransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class TransientErrorRetryTest {

    @Test
    void testTransientErrorSucceedsAfterRetries() {
        AtomicInteger attempts = new AtomicInteger(0);

        String result = RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
            int attemptNumber = attempts.incrementAndGet();
            
            // Fail first 2 attempts
            if (attemptNumber < 3) {
                throw new TransientAiException("Transient failure " + attemptNumber);
            }
            
            // Succeed on 3rd attempt
            return "success";
        });

        assertThat(result).isEqualTo("success");
        assertThat(attempts.get()).isEqualTo(3);  // Succeeded on 3rd attempt
    }

    @Test
    void testTransientErrorExhaustsRetries() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                attempts.incrementAndGet();
                throw new TransientAiException("Always fails");
            });
        }).isInstanceOf(TransientAiException.class)
          .hasMessageContaining("Always fails");

        // SHORT_RETRY_TEMPLATE has maxAttempts=10
        assertThat(attempts.get()).isEqualTo(10);
    }
}

Testing Non-Transient Errors

import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class NonTransientErrorTest {

    @Test
    void testNonTransientErrorFailsImmediately() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                attempts.incrementAndGet();
                throw new NonTransientAiException("Permanent failure");
            });
        }).isInstanceOf(NonTransientAiException.class)
          .hasMessageContaining("Permanent failure");

        // Should fail immediately without retry
        assertThat(attempts.get()).isEqualTo(1);
    }

    @Test
    void testMixedExceptions() {
        AtomicInteger attempts = new AtomicInteger(0);

        assertThatThrownBy(() -> {
            RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
                int attemptNumber = attempts.incrementAndGet();
                
                if (attemptNumber == 1) {
                    // First: transient (will retry)
                    throw new TransientAiException("Transient error");
                } else {
                    // Second: non-transient (will fail immediately)
                    throw new NonTransientAiException("Permanent error");
                }
            });
        }).isInstanceOf(NonTransientAiException.class);

        // Transient retry + non-transient failure = 2 attempts
        assertThat(attempts.get()).isEqualTo(2);
    }
}

Mocking Error Responses

import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatus;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import org.mockito.Mockito;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

class ErrorHandlerTest {

    @Test
    void testRateLimitThrowsTransientException() throws IOException {
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.TOO_MANY_REQUESTS,
            "Rate limit exceeded"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(TransientAiException.class)
            .hasMessageContaining("429")
            .hasMessageContaining("Rate limit exceeded");
    }

    @Test
    void testUnauthorizedThrowsNonTransientException() throws IOException {
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.UNAUTHORIZED,
            "Invalid API key"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(NonTransientAiException.class)
            .hasMessageContaining("401")
            .hasMessageContaining("Invalid API key");
    }

    @Test
    void testServerErrorThrowsTransientException() throws IOException {
        ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
        ClientHttpResponse response = createMockResponse(
            HttpStatus.INTERNAL_SERVER_ERROR,
            "Internal server error"
        );

        assertThatThrownBy(() -> handler.handleError(response))
            .isInstanceOf(TransientAiException.class)
            .hasMessageContaining("500");
    }

    private ClientHttpResponse createMockResponse(HttpStatus status, String body) 
            throws IOException {
        ClientHttpResponse response = Mockito.mock(ClientHttpResponse.class);
        Mockito.when(response.getStatusCode()).thenReturn(HttpStatusCode.valueOf(status.value()));
        Mockito.when(response.getBody()).thenReturn(
            new ByteArrayInputStream(body.getBytes(java.nio.charset.StandardCharsets.UTF_8))
        );
        return response;
    }

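    // Assumed to be injected from the Spring context in real tests (for example
    // with @Autowired in a @SpringBootTest); it is left unset in this sketch
    private ResponseErrorHandler handler;

    private ResponseErrorHandler createAutoConfiguredErrorHandler() {
        // Return the error handler provided by the auto-configuration
        return handler;
    }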
}

Exception Hierarchy

java.lang.Object
└── java.lang.Throwable
    └── java.lang.Exception
        └── java.lang.RuntimeException
            ├── org.springframework.ai.retry.TransientAiException
            │   ├── QuotaExceededException (custom)
            │   ├── ModelNotAvailableException (custom)
            │   └── (other custom transient exceptions)
            └── org.springframework.ai.retry.NonTransientAiException
                ├── InvalidApiKeyException (custom)
                ├── InvalidRequestFormatException (custom)
                └── (other custom non-transient exceptions)

Both exception types extend RuntimeException, making them unchecked exceptions that don't require explicit throws declarations in method signatures.

Benefits of unchecked exceptions:

  • Cleaner method signatures
  • Optional handling (caller can choose to catch or propagate)
  • Consistent with Spring framework exception patterns
  • Reduces boilerplate code

Serializable: Both exceptions are serializable (they inherit Serializable from java.lang.Throwable through RuntimeException), allowing them to:

  • Be transmitted across network boundaries (RMI, distributed systems)
  • Be stored in session (if needed)
  • Keep their message, cause, and stack trace when deserialized
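
The serialization behavior can be checked with a round trip through Java's built-in object streams. The sketch below uses a hypothetical stand-in exception (`DemoTransientException`) so that it runs with the JDK alone; the same approach should apply to any `RuntimeException` subclass whose fields are themselves serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// JDK-only sketch: round-trips an exception through Java serialization.
public final class ExceptionSerializationDemo {

    /** Hypothetical stand-in for TransientAiException. */
    public static class DemoTransientException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        public DemoTransientException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private ExceptionSerializationDemo() {
    }

    /** Serializes the throwable to bytes and reads it back as a new instance. */
    public static Throwable roundTrip(Throwable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Throwable) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so that callers of this demo need no checked-exception handling
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

A custom subclass that adds non-serializable fields would fail this round trip, so such fields should be marked `transient` or given serializable types.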
