Spring Boot auto-configuration for AI retry capabilities with exponential backoff and intelligent HTTP error handling
Exception types for classifying AI errors as transient or non-transient to control retry behavior. These exception types are part of the spring-ai-retry module and are used by the auto-configuration to determine whether errors should be retried.
Exception for transient AI errors where a retry of the same operation might succeed without any intervention.
// Package: org.springframework.ai.retry
/**
* Root of the hierarchy of Model access exceptions that are considered transient
* A previously failed operation might succeed when retried
*
* Thrown for:
* - Server errors (5xx)
* - Network errors (timeouts, connection failures)
* - Rate limits (429 when configured)
* - Temporary service unavailability (503)
* - Gateway errors (502, 504)
*
* Causes RetryTemplate to retry the operation
* Extends: java.lang.RuntimeException (unchecked exception)
* Thread-safe: Yes (immutable after construction)
* Serializable: Yes (extends RuntimeException which is Serializable)
*
* @since 0.8.1
*/
public class TransientAiException extends RuntimeException {
/**
* Constructs a new TransientAiException with the specified message
*
* Usage:
* throw new TransientAiException("HTTP 503 - Service temporarily unavailable");
*
* @param message Error message describing the transient failure (can be null)
*/
public TransientAiException(String message);
/**
* Constructs a new TransientAiException with message and cause
*
* Usage:
* throw new TransientAiException("Connection timeout", networkException);
*
* Preserves full stack trace of underlying cause
* Use this constructor to maintain exception chain
*
* @param message Error message describing the transient failure (can be null)
* @param cause The underlying cause of the exception (can be null)
*/
public TransientAiException(String message, Throwable cause);
}
When to throw TransientAiException:
Server errors (5xx)
Rate limiting
Network errors
Temporary unavailability
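Behavior with RetryTemplate: the operation is retried until it succeeds or the configured maximum number of attempts is reached.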
Exception for non-transient AI errors where a retry of the same operation will fail unless the cause is corrected.
// Package: org.springframework.ai.retry
/**
* Root of the hierarchy of Model access exceptions that are considered non-transient
* A retry of the same operation would fail unless the cause is corrected
*
* Thrown for:
* - Authentication errors (401)
* - Authorization errors (403)
* - Bad request errors (400)
* - Not found errors (404)
* - Client errors (4xx when configured)
* - Explicitly configured non-transient codes
* - Invalid configuration (malformed URLs, missing parameters)
*
* Causes RetryTemplate to fail immediately without retry
* Extends: java.lang.RuntimeException (unchecked exception)
* Thread-safe: Yes (immutable after construction)
* Serializable: Yes (extends RuntimeException which is Serializable)
*
* @since 0.8.1
*/
public class NonTransientAiException extends RuntimeException {
/**
* Constructs a new NonTransientAiException with the specified message
*
* Usage:
* throw new NonTransientAiException("HTTP 401 - Invalid API key");
*
* @param message Error message describing the non-transient failure (can be null)
*/
public NonTransientAiException(String message);
/**
* Constructs a new NonTransientAiException with message and cause
*
* Usage:
* throw new NonTransientAiException("Invalid API key", authException);
*
* Preserves full stack trace of underlying cause
* Use this constructor to maintain exception chain
*
* @param message Error message describing the non-transient failure (can be null)
* @param cause The underlying cause of the exception (can be null)
*/
public NonTransientAiException(String message, Throwable cause);
}
When to throw NonTransientAiException:
Authentication errors
Authorization errors
Client errors
Configuration errors
Behavior with RetryTemplate: the operation fails immediately and the exception propagates to the caller without any retry.
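The contrast between the two exception types can be seen in a minimal sketch. Everything below is an assumption made for illustration: the nested exception classes are local stand-ins so the example compiles without Spring on the classpath, and `MiniRetry.execute` is a hypothetical, simplified loop rather than Spring Retry's `RetryTemplate` (it has no backoff and no listeners).

```java
import java.util.function.Supplier;

final class MiniRetry {
    // Stand-ins for the Spring AI types (assumption: declared locally so the sketch has no dependencies).
    static class TransientAiException extends RuntimeException {
        TransientAiException(String message) { super(message); }
    }

    static class NonTransientAiException extends RuntimeException {
        NonTransientAiException(String message) { super(message); }
    }

    /** Hypothetical simplified loop: retries only the transient type, without backoff. */
    static <T> T execute(int maxAttempts, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientAiException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (TransientAiException e) {
                last = e; // transient: remember the failure and try again
            }
            // Anything else, including NonTransientAiException, is not caught
            // and therefore reaches the caller after a single attempt.
        }
        throw last; // attempts exhausted: surface the last transient failure
    }
}
```

Because the loop catches the transient base type, any subclass of it is retried as well, while a non-transient failure ends the call on the first attempt.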
Throw these exceptions to control retry behavior in your own code:
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.net.SocketTimeoutException;
public class CustomAiClient {
/**
* Example of throwing appropriate exceptions based on error type
*/
public String callApi(String apiKey, String request) {
// Validate input - configuration error, won't resolve on retry
if (apiKey == null || apiKey.isEmpty()) {
throw new NonTransientAiException("API key is required");
}
if (request == null || request.length() > 4096) {
throw new NonTransientAiException("Invalid request: must be 1-4096 characters");
}
try {
return performApiCall(apiKey, request);
} catch (SocketTimeoutException e) {
// Network timeout - transient, retry might succeed
throw new TransientAiException("Network timeout communicating with AI service", e);
} catch (IOException e) {
// General I/O error - could be transient
throw new TransientAiException("I/O error communicating with AI service", e);
} catch (InvalidApiKeyException e) {
// Auth error - non-transient, retry won't help
throw new NonTransientAiException("Invalid API key: " + e.getMessage(), e);
} catch (RateLimitException e) {
// Rate limit - transient, retry after backoff should succeed
throw new TransientAiException(
"Rate limit exceeded: " + e.getRetryAfterSeconds() + "s", e);
} catch (ServiceUnavailableException e) {
// Service down - transient, might recover
throw new TransientAiException("AI service temporarily unavailable", e);
}
}
private String performApiCall(String apiKey, String request) throws IOException {
// Implementation
return "result";
}
}
Handle these exceptions in application code:
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class AiService {
private static final Logger log = LoggerFactory.getLogger(AiService.class);
private final RetryTemplate retryTemplate;
private final AiClient aiClient;
public AiService(RetryTemplate retryTemplate, AiClient aiClient) {
this.retryTemplate = retryTemplate;
this.aiClient = aiClient;
}
/**
* Calls AI service with retry and graceful error handling
* Returns either the completion or a fallback message
*/
public String getCompletion(String prompt) {
try {
// RetryTemplate handles TransientAiException automatically
return retryTemplate.execute(context -> {
log.debug("Calling AI service, attempt {}", context.getRetryCount() + 1);
return aiClient.complete(prompt);
});
} catch (TransientAiException e) {
// All retries exhausted - still a transient error
log.error("AI service temporarily unavailable after retries were exhausted: {}",
e.getMessage());
return "Service temporarily unavailable. Please try again later.";
} catch (NonTransientAiException e) {
// Permanent failure - immediate failure, no retries
log.error("AI service error that cannot be resolved by retry: {}",
e.getMessage());
// Check specific error types (the message can be null, so normalize it first)
String message = String.valueOf(e.getMessage());
if (message.contains("401") || message.contains("Invalid API key")) {
return "Service configuration error. Please contact support.";
} else if (message.contains("400") || message.contains("Bad Request")) {
return "Invalid request format. Please check your input.";
} else {
return "Service error. Please check your configuration.";
}
}
}
/**
* Calls AI service with context-aware error handling
* Distinguishes between different failure modes
*/
public CompletionResult getCompletionWithDetails(String prompt) {
try {
String completion = retryTemplate.execute(context ->
aiClient.complete(prompt)
);
return CompletionResult.success(completion);
} catch (TransientAiException e) {
// Transient failure after all retries
return CompletionResult.failure(
FailureReason.TEMPORARY_UNAVAILABLE,
"Service temporarily unavailable: " + e.getMessage(),
true // retryable
);
} catch (NonTransientAiException e) {
// Determine specific failure reason (the message can be null, so normalize it first)
String message = String.valueOf(e.getMessage());
if (message.contains("401") || message.contains("403")) {
return CompletionResult.failure(
FailureReason.AUTHENTICATION_ERROR,
e.getMessage(),
false // not retryable
);
} else if (message.contains("400")) {
return CompletionResult.failure(
FailureReason.INVALID_REQUEST,
e.getMessage(),
false // not retryable
);
} else {
return CompletionResult.failure(
FailureReason.CONFIGURATION_ERROR,
e.getMessage(),
false // not retryable
);
}
}
}
}
The auto-configured ResponseErrorHandler automatically throws these exceptions:
import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.support.RetryTemplate;
@Configuration
public class RestConfig {
/**
* Configure RestTemplate with auto-configured error handler
* Error handler throws TransientAiException or NonTransientAiException
* based on HTTP status code
*/
@Bean
public RestTemplate restTemplate(ResponseErrorHandler errorHandler) {
RestTemplate template = new RestTemplate();
// ErrorHandler will be invoked for 4xx and 5xx responses
// Before response is returned to caller
template.setErrorHandler(errorHandler);
return template;
}
/**
* Example service using RestTemplate with retry
*/
@Bean
public AiService aiService(RestTemplate restTemplate, RetryTemplate retryTemplate) {
return new AiService() {
public String callAi(String prompt) {
return retryTemplate.execute(context -> {
// RestTemplate calls API
// If error response (4xx/5xx):
// 1. ResponseErrorHandler.hasError() returns true
// 2. ResponseErrorHandler.handleError() is invoked
// 3. Throws TransientAiException or NonTransientAiException
// 4. RetryTemplate catches and handles based on exception type
return restTemplate.postForObject(
"https://api.example.com/complete",
prompt,
String.class
);
});
}
};
}
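}
Flow when RestTemplate encounters an HTTP error:
1. ResponseErrorHandler.hasError() returns true for the 4xx/5xx response
2. ResponseErrorHandler.handleError() is invoked
3. The handler throws TransientAiException or NonTransientAiException
4. RetryTemplate retries or propagates the exception based on its type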
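The classification step of the error handler can be sketched as a pure function over the status code. This is a simplified illustration based on the rules stated in the Javadoc above (5xx is transient, 4xx is non-transient, and individual codes can be configured either way). It is not the auto-configuration's actual code, and the class name, enum, and both set parameters are hypothetical.

```java
import java.util.Set;

final class StatusClassifier {
    enum Kind { TRANSIENT, NON_TRANSIENT }

    /**
     * @param status          HTTP status code of an error response (4xx or 5xx)
     * @param alwaysTransient codes to retry even when they are 4xx, such as 429 (hypothetical parameter)
     * @param neverTransient  codes that must never be retried (hypothetical parameter)
     */
    static Kind classify(int status, Set<Integer> alwaysTransient, Set<Integer> neverTransient) {
        if (alwaysTransient.contains(status)) {
            return Kind.TRANSIENT;      // explicitly configured as retryable
        }
        if (neverTransient.contains(status)) {
            return Kind.NON_TRANSIENT;  // explicitly configured as not retryable
        }
        if (status >= 400 && status < 500) {
            return Kind.NON_TRANSIENT;  // client errors need a corrected request
        }
        return Kind.TRANSIENT;          // remaining errors (5xx) may resolve on retry
    }
}
```

With empty sets, `classify(429, ...)` yields `NON_TRANSIENT`; adding 429 to `alwaysTransient` flips it, which mirrors the "429 when configured" wording above.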
Errors that are temporary and might resolve on retry:
// 500 Internal Server Error
throw new TransientAiException("HTTP 500 - Internal server error");
// 502 Bad Gateway
throw new TransientAiException("HTTP 502 - Bad gateway");
// 503 Service Unavailable
throw new TransientAiException("HTTP 503 - Service temporarily unavailable");
// 504 Gateway Timeout
throw new TransientAiException("HTTP 504 - Gateway timeout");
Retry rationale:
// 429 Too Many Requests
throw new TransientAiException("HTTP 429 - Rate limit exceeded. Retry after 60s");
// With retry-after information
throw new TransientAiException(String.format(
"HTTP 429 - Rate limit exceeded. Retry after %d seconds",
retryAfterSeconds
));
Retry rationale:
import java.net.SocketTimeoutException;
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.net.NoRouteToHostException;
// Connection timeout
try {
makeConnection();
} catch (SocketTimeoutException e) {
throw new TransientAiException("Connection timeout to AI service", e);
}
// Connection refused
try {
makeConnection();
} catch (ConnectException e) {
throw new TransientAiException("Connection refused: " + e.getMessage(), e);
}
// DNS resolution failure
try {
resolveHost();
} catch (UnknownHostException e) {
throw new TransientAiException("DNS resolution failed: " + e.getMessage(), e);
}
// Network unreachable
try {
makeConnection();
} catch (NoRouteToHostException e) {
throw new TransientAiException("Network unreachable: " + e.getMessage(), e);
}
Retry rationale:
// Service maintenance
throw new TransientAiException(
"Service maintenance in progress. Expected completion: 15:00 UTC"
);
// Capacity issues
throw new TransientAiException(
"Service at capacity. Request queued."
);
// Circuit breaker open
throw new TransientAiException(
"Circuit breaker open due to high error rate. Retry after cooldown."
);
// Load balancer health check failure
throw new TransientAiException(
"No healthy instances available. Retry after health check interval."
);
Retry rationale:
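For the rate-limit case, the server's `Retry-After` header can carry either a number of seconds or an HTTP date. The helper below is a hedged sketch of parsing both forms with the JDK only. The class and method names are hypothetical, and nothing here is part of Spring AI.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

final class RetryAfter {
    /** Parses a Retry-After value given as delay-seconds or as an HTTP date; empty if unusable. */
    static Optional<Duration> parse(String headerValue, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return Optional.empty();
        }
        String value = headerValue.trim();
        try {
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Optional.of(Duration.ofSeconds(seconds)) : Optional.empty();
        } catch (NumberFormatException notANumber) {
            // Not a plain number: fall through and try the date form
        }
        try {
            Instant retryAt = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            Duration wait = Duration.between(now, retryAt);
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait); // a past date means "retry now"
        } catch (DateTimeParseException notADate) {
            return Optional.empty();
        }
    }
}
```

Passing `now` as a parameter keeps the function deterministic, which makes it straightforward to unit test.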
Errors that are permanent and won't resolve without intervention:
// 401 Unauthorized
throw new NonTransientAiException("HTTP 401 - Invalid API key");
// Invalid credentials
throw new NonTransientAiException(
"Authentication failed: API key not found in system"
);
// Expired token
throw new NonTransientAiException(
"Authentication failed: Token expired. Please refresh token."
);
// Missing auth header
throw new NonTransientAiException(
"Authentication required: Authorization header missing"
);
No retry rationale:
// 403 Forbidden
throw new NonTransientAiException("HTTP 403 - Insufficient permissions");
// Account disabled
throw new NonTransientAiException(
"Account disabled. Please contact support."
);
// Resource access denied
throw new NonTransientAiException(
"Access denied: User does not have permission to access model 'gpt-4'"
);
// Quota exceeded (permanent)
throw new NonTransientAiException(
"Monthly quota exceeded. Upgrade plan or wait for reset."
);
No retry rationale:
// 400 Bad Request
throw new NonTransientAiException(
"HTTP 400 - Invalid request format: JSON parsing error"
);
// 404 Not Found
throw new NonTransientAiException(
"HTTP 404 - Endpoint not found: /v1/completios (typo in URL)"
);
// 405 Method Not Allowed
throw new NonTransientAiException(
"HTTP 405 - Method GET not allowed for this endpoint. Use POST."
);
// 422 Unprocessable Entity
throw new NonTransientAiException(
"HTTP 422 - Validation failed: prompt length exceeds maximum (4096 chars)"
);
// 415 Unsupported Media Type
throw new NonTransientAiException(
"HTTP 415 - Content-Type must be application/json, received text/plain"
);
No retry rationale:
// Invalid URL
try {
new URL(endpoint);
} catch (MalformedURLException e) {
throw new NonTransientAiException("Invalid endpoint URL: " + endpoint, e);
}
// Missing required parameter
if (apiKey == null) {
throw new NonTransientAiException("Configuration error: API key not configured");
}
// Invalid parameter value
if (temperature < 0 || temperature > 2) {
throw new NonTransientAiException(
"Invalid temperature value: " + temperature + ". Must be between 0 and 2."
);
}
// Schema validation failure
throw new NonTransientAiException(
"Request schema validation failed: field 'model' is required"
);
No retry rationale:
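The network and configuration examples above suggest a rule of thumb for low-level causes, sketched below with JDK exception types only. The class and method names are hypothetical, and the final fallback follows the fail-fast preference for unknown errors rather than any documented Spring AI behavior.

```java
import java.io.IOException;
import java.net.MalformedURLException;

final class CauseClassifier {
    /** Returns true when the failure looks temporary and a retry could plausibly succeed. */
    static boolean isLikelyTransient(Throwable error) {
        // MalformedURLException is itself an IOException, so it must be tested
        // before the general I/O case or it would be misclassified as transient.
        if (error instanceof MalformedURLException) {
            return false; // configuration problem: the URL will not fix itself
        }
        if (error instanceof IllegalArgumentException) {
            return false; // invalid parameter: needs a code or configuration change
        }
        if (error instanceof IOException) {
            // Covers SocketTimeoutException, ConnectException, UnknownHostException
            // and NoRouteToHostException, which are all IOException subclasses.
            return true;
        }
        return false; // unknown condition: fail fast rather than retry blindly
    }
}
```

A caller could use the result to decide which of the two exception types to wrap the cause in.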
The auto-configured RetryTemplate is configured to retry on TransientAiException:
// Pseudocode for RetryTemplate configuration
RetryTemplate.builder()
.maxAttempts(10)
.retryOn(TransientAiException.class) // Retry these
.retryOn(ResourceAccessException.class) // Spring's network error
.retryOn(WebClientRequestException.class) // WebFlux network error (if available)
.exponentialBackoff(initialInterval, multiplier, maxInterval)
.build();
Exceptions NOT in the retry list (including NonTransientAiException) cause immediate failure.
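To make the `exponentialBackoff(initialInterval, multiplier, maxInterval)` call more concrete, the sketch below computes the waits between attempts as min(initial × multiplier^(n-1), max). The class and method names are hypothetical, and the sample values (2 s initial interval, multiplier 5, 3 min cap) are assumptions chosen to match the 2 s and 10 s waits in the sample output later in this section.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {
    /** Returns the wait (in milliseconds) before each retry, capped at maxMillis. */
    static List<Long> delaysMillis(long initialMillis, double multiplier, long maxMillis, int maxAttempts) {
        List<Long> delays = new ArrayList<>();
        double next = initialMillis;
        // maxAttempts attempts leave room for at most maxAttempts - 1 waits between them
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add((long) Math.min(next, maxMillis));
            next = Math.min(next * multiplier, maxMillis); // grow geometrically, never past the cap
        }
        return delays;
    }
}
```

For example, `delaysMillis(2000, 5.0, 180000, 5)` gives waits of 2000, 10000, 50000 and 180000 ms, so the cap takes effect from the fourth wait onward.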
1. Operation throws exception
↓
2. Is the exception on the retry list (TransientAiException, ResourceAccessException, or WebClientRequestException)?
↓ Yes → Go to step 3
↓ No → Fail immediately (propagate to caller)
↓
3. Have we reached max attempts?
↓ Yes → Propagate the last exception to caller
↓ No → Go to step 4
↓
4. Calculate backoff delay using exponential formula
↓
5. Wait for backoff delay
↓
6. Increment retry counter
↓
7. Log retry attempt (via RetryListener)
↓
8. Retry operation (go back to step 1)
import org.springframework.retry.support.RetryTemplate;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
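public class RetryFlowExample {
private final RetryTemplate retryTemplate;
private int callCount = 0;
public RetryFlowExample(RetryTemplate retryTemplate) {
this.retryTemplate = retryTemplate;
}
public String demonstrateRetryFlow() {
callCount = 0; // reset so attempt numbering starts at 1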
try {
return retryTemplate.execute(context -> {
callCount++;
System.out.println("Attempt " + callCount);
if (callCount == 1) {
// First call: transient error (will retry)
throw new TransientAiException("HTTP 503 - Service unavailable");
} else if (callCount == 2) {
// Second call: transient error (will retry)
throw new TransientAiException("HTTP 429 - Rate limit exceeded");
} else if (callCount == 3) {
// Third call: success
return "Success after 2 retries";
}
return "Should not reach here";
});
} catch (TransientAiException e) {
// Only reached if all retries exhausted
System.out.println("All retries failed: " + e.getMessage());
return "Failure";
}
}
public String demonstrateNonTransientFlow() {
callCount = 0; // reset so the call count printed below covers only this method
try {
return retryTemplate.execute(context -> {
callCount++;
System.out.println("Attempt " + callCount);
// Non-transient error: fails immediately, no retry
throw new NonTransientAiException("HTTP 401 - Invalid API key");
});
} catch (NonTransientAiException e) {
// Reached immediately after first attempt
System.out.println("Immediate failure: " + e.getMessage());
System.out.println("Call count: " + callCount); // Will be 1
return "Configuration error";
}
}
}
Output for demonstrateRetryFlow():
Attempt 1
[Wait 2s]
Attempt 2
[Wait 10s]
Attempt 3
Result: "Success after 2 retries"
Output for demonstrateNonTransientFlow():
Attempt 1
Immediate failure: HTTP 401 - Invalid API key
Call count: 1
Result: "Configuration error"
Extend the base exception types for domain-specific errors:
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.util.List;
/**
* Thrown when AI model quota is exceeded (transient - might resolve after time)
* Extends TransientAiException because quota may reset at billing period
*/
public class QuotaExceededException extends TransientAiException {
private final int quotaLimit;
private final int quotaUsed;
private final long resetTimeMillis;
public QuotaExceededException(String message, int quotaLimit, int quotaUsed, long resetTimeMillis) {
super(message);
this.quotaLimit = quotaLimit;
this.quotaUsed = quotaUsed;
this.resetTimeMillis = resetTimeMillis;
}
public int getQuotaLimit() { return quotaLimit; }
public int getQuotaUsed() { return quotaUsed; }
public long getResetTimeMillis() { return resetTimeMillis; }
@Override
public String getMessage() {
return String.format(
"Quota exceeded: %d/%d used. Resets at %tF %tT",
quotaUsed, quotaLimit, resetTimeMillis, resetTimeMillis
);
}
}
/**
* Thrown when API key is invalid (non-transient - needs configuration fix)
* Extends NonTransientAiException because API key won't become valid on retry
*/
public class InvalidApiKeyException extends NonTransientAiException {
private final String apiKeyPrefix; // For debugging (first 4 chars)
public InvalidApiKeyException(String message, String apiKeyPrefix) {
super(message);
this.apiKeyPrefix = apiKeyPrefix;
}
public String getApiKeyPrefix() { return apiKeyPrefix; }
@Override
public String getMessage() {
return super.getMessage() + " (key prefix: " + apiKeyPrefix + "...)";
}
}
/**
* Thrown when model is not available (transient - might be deployed later)
* Extends TransientAiException because model might become available
*/
public class ModelNotAvailableException extends TransientAiException {
private final String modelName;
private final List<String> availableModels;
public ModelNotAvailableException(String modelName, List<String> availableModels) {
super(String.format("Model '%s' not available", modelName));
this.modelName = modelName;
this.availableModels = availableModels;
}
public String getModelName() { return modelName; }
public List<String> getAvailableModels() { return availableModels; }
}
/**
* Thrown when request format is invalid (non-transient - needs code fix)
* Extends NonTransientAiException because format won't become valid on retry
*/
public class InvalidRequestFormatException extends NonTransientAiException {
private final String fieldName;
private final String expectedFormat;
private final String actualValue;
public InvalidRequestFormatException(
String fieldName,
String expectedFormat,
String actualValue) {
super(String.format(
"Invalid format for field '%s': expected %s, got %s",
fieldName, expectedFormat, actualValue
));
this.fieldName = fieldName;
this.expectedFormat = expectedFormat;
this.actualValue = actualValue;
}
public String getFieldName() { return fieldName; }
public String getExpectedFormat() { return expectedFormat; }
public String getActualValue() { return actualValue; }
}
Usage of domain-specific exceptions:
public class AiClient {
public String complete(String prompt, String model) {
// Check quota
if (isQuotaExceeded()) {
throw new QuotaExceededException(
"Monthly quota exceeded",
1000, // quota limit
1023, // quota used
getQuotaResetTime()
);
}
// Validate API key
if (!isValidApiKey(apiKey)) {
throw new InvalidApiKeyException(
"Invalid API key",
apiKey.substring(0, Math.min(4, apiKey.length())) // Up to the first 4 chars for debugging
);
}
// Check model availability
if (!isModelAvailable(model)) {
throw new ModelNotAvailableException(
model,
getAvailableModels()
);
}
// Validate prompt format
if (prompt.length() > 4096) {
throw new InvalidRequestFormatException(
"prompt",
"string (max 4096 chars)",
"string (" + prompt.length() + " chars)"
);
}
return performCompletion(prompt, model);
}
}
Create a custom error handler that throws these exceptions:
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;
/**
* Custom ResponseErrorHandler with detailed error classification
*/
public class CustomAiErrorHandler implements ResponseErrorHandler {
@Override
public boolean hasError(ClientHttpResponse response) throws IOException {
HttpStatusCode statusCode = response.getStatusCode();
// Treat 404 as non-error (resource might not exist yet)
if (statusCode.value() == 404) {
return false;
}
return statusCode.isError();
}
@Override
public void handleError(ClientHttpResponse response) throws IOException {
int statusCode = response.getStatusCode().value();
String errorBody = readErrorBody(response);
String errorMessage = "HTTP " + statusCode + " - " + errorBody;
// Detailed classification based on status code
switch (statusCode) {
// Authentication errors (non-transient)
case 401:
if (errorBody.contains("invalid_api_key")) {
throw new InvalidApiKeyException(
errorMessage,
extractApiKeyPrefix(errorBody)
);
}
throw new NonTransientAiException(errorMessage);
// Authorization errors (non-transient)
case 403:
if (errorBody.contains("quota_exceeded")) {
// Quota exceeded - transient if it resets
throw new QuotaExceededException(
errorMessage,
extractQuotaLimit(errorBody),
extractQuotaUsed(errorBody),
extractResetTime(errorBody)
);
}
throw new NonTransientAiException(errorMessage);
// Bad request (non-transient)
case 400:
throw new InvalidRequestFormatException(
extractFieldName(errorBody),
extractExpectedFormat(errorBody),
extractActualValue(errorBody)
);
// Rate limit (transient)
case 429:
int retryAfter = extractRetryAfter(response, errorBody);
throw new TransientAiException(
errorMessage + " - Retry after " + retryAfter + "s"
);
// Server errors (transient)
case 500:
case 502:
case 503:
case 504:
throw new TransientAiException(errorMessage);
// Default classification
default:
if (statusCode >= 400 && statusCode < 500) {
throw new NonTransientAiException(errorMessage);
} else {
throw new TransientAiException(errorMessage);
}
}
}
/**
* Read error response body with size limit
*/
private String readErrorBody(ClientHttpResponse response) throws IOException {
try (BufferedReader reader = new BufferedReader(
new InputStreamReader(response.getBody(), StandardCharsets.UTF_8))) {
// Read up to 4KB to prevent memory issues
char[] buffer = new char[4096];
int charsRead = reader.read(buffer);
if (charsRead == -1) {
return "No response body available";
}
return new String(buffer, 0, charsRead);
}
}
/**
* Extract retry-after value from response
*/
private int extractRetryAfter(ClientHttpResponse response, String errorBody) {
// Check Retry-After header
String retryAfterHeader = response.getHeaders().getFirst("Retry-After");
if (retryAfterHeader != null) {
try {
return Integer.parseInt(retryAfterHeader);
} catch (NumberFormatException e) {
// Ignore invalid header
}
}
// Parse from error body (example: {"retry_after": 60})
// Simplified - use JSON parser in production
int retryAfter = 60; // default
if (errorBody.contains("retry_after")) {
// Extract value (simplified)
retryAfter = 60;
}
return retryAfter;
}
// Helper methods for extracting information from error body
private String extractApiKeyPrefix(String errorBody) {
// Simplified - use JSON parser in production
return "sk-...";
}
private int extractQuotaLimit(String errorBody) {
// Simplified - use JSON parser in production
return 1000;
}
private int extractQuotaUsed(String errorBody) {
// Simplified - use JSON parser in production
return 1023;
}
private long extractResetTime(String errorBody) {
// Simplified - use JSON parser in production
return System.currentTimeMillis() + (24 * 60 * 60 * 1000); // +24 hours
}
private String extractFieldName(String errorBody) {
// Simplified - use JSON parser in production
return "prompt";
}
private String extractExpectedFormat(String errorBody) {
// Simplified - use JSON parser in production
return "string (max 4096 chars)";
}
private String extractActualValue(String errorBody) {
// Simplified - use JSON parser in production
return "string (5000 chars)";
}
}
The auto-configured ResponseErrorHandler creates error messages in this format:
HTTP {status_code} - {response_body}
Examples:
HTTP 429 - Rate limit exceeded. Please retry after 60 seconds.
HTTP 401 - Invalid authentication credentials.
HTTP 503 - Service temporarily unavailable due to maintenance.
HTTP 500 - Internal server error: NullPointerException at ServiceImpl.java:42
HTTP 400 - Invalid request: field 'prompt' is required
HTTP 404 - Model 'gpt-5' not found
Include context in your exception messages to aid debugging:
// Good: Includes context
throw new TransientAiException(
"HTTP 503 - AI model service unavailable. " +
"Retry attempt " + retryCount + " of " + maxRetries
);
// Good: Includes original cause
throw new NonTransientAiException(
"Failed to authenticate with AI service: " + originalException.getMessage(),
originalException
);
// Good: Includes relevant parameters
throw new NonTransientAiException(
"Invalid temperature parameter: " + temperature + ". Must be between 0 and 2."
);
// Less helpful: Vague message
throw new TransientAiException("Error occurred");
// Less helpful: Missing context
throw new NonTransientAiException("Invalid parameter");
Use TransientAiException only for truly temporary errors that might resolve on retry:
// GOOD: Transient - network might recover
throw new TransientAiException("Connection timeout", timeoutException);
// GOOD: Transient - service might come back up
throw new TransientAiException("HTTP 503 - Service unavailable");
// BAD: Non-transient classified as transient - wastes retries
throw new TransientAiException("HTTP 401 - Invalid API key");
// SHOULD BE:
throw new NonTransientAiException("HTTP 401 - Invalid API key");
// BAD: Transient classified as non-transient - misses recovery opportunity
throw new NonTransientAiException("HTTP 503 - Service unavailable");
// SHOULD BE:
throw new TransientAiException("HTTP 503 - Service unavailable");
Add relevant details to exception messages:
// GOOD: Includes status code, response body, and retry info
throw new TransientAiException(String.format(
"HTTP %d - %s (attempt %d/%d)",
statusCode, responseBody, attemptNumber, maxAttempts
));
// GOOD: Includes parameter name and valid range
throw new NonTransientAiException(String.format(
"Invalid parameter '%s': value %s is outside valid range [%s, %s]",
paramName, actualValue, minValue, maxValue
));
// BAD: No context
throw new TransientAiException("Error");
Pass the original exception as the cause for full stack traces:
// GOOD: Preserves full exception chain
try {
performOperation();
} catch (IOException e) {
throw new TransientAiException("Network error: " + e.getMessage(), e);
}
// BAD: Loses original stack trace
try {
performOperation();
} catch (IOException e) {
throw new TransientAiException("Network error"); // No cause
}
Document what exceptions your methods throw:
/**
* Calls AI completion API
*
* @param prompt The prompt text
* @return Completion result
* @throws TransientAiException If a temporary error occurs (network, server error, rate limit)
* @throws NonTransientAiException If a permanent error occurs (auth, validation, configuration)
*/
public String complete(String prompt) {
// Implementation
}
Catch TransientAiException after retry exhaustion to handle gracefully:
try {
return retryTemplate.execute(context -> callApi());
} catch (TransientAiException e) {
// All retries exhausted - provide fallback
log.error("Service unavailable after {} retries", maxAttempts);
return fallbackResponse();
}
When in doubt, use NonTransientAiException to avoid wasteful retries:
// If uncertain whether error is transient:
// Better to fail fast than waste time on futile retries
if (unknownErrorCondition) {
throw new NonTransientAiException("Unknown error condition");
// User can manually retry if they think it's transient
}
Log TransientAiException at WARN level, NonTransientAiException at ERROR level:
try {
return callApi();
} catch (TransientAiException e) {
// Transient - might recover
log.warn("Transient error (will retry): {}", e.getMessage());
throw e;
} catch (NonTransientAiException e) {
// Non-transient - requires intervention
log.error("Non-transient error (won't retry): {}", e.getMessage(), e);
throw e;
}
Use RetryUtils.SHORT_RETRY_TEMPLATE for tests:
import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.TransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
class TransientErrorRetryTest {
@Test
void testTransientErrorSucceedsAfterRetries() {
AtomicInteger attempts = new AtomicInteger(0);
String result = RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
int attemptNumber = attempts.incrementAndGet();
// Fail first 2 attempts
if (attemptNumber < 3) {
throw new TransientAiException("Transient failure " + attemptNumber);
}
// Succeed on 3rd attempt
return "success";
});
assertThat(result).isEqualTo("success");
assertThat(attempts.get()).isEqualTo(3); // Succeeded on 3rd attempt
}
@Test
void testTransientErrorExhaustsRetries() {
AtomicInteger attempts = new AtomicInteger(0);
assertThatThrownBy(() -> {
RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
attempts.incrementAndGet();
throw new TransientAiException("Always fails");
});
}).isInstanceOf(TransientAiException.class)
.hasMessageContaining("Always fails");
// SHORT_RETRY_TEMPLATE has maxAttempts=10
assertThat(attempts.get()).isEqualTo(10);
}
}
import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import java.util.concurrent.atomic.AtomicInteger;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
class NonTransientErrorTest {
@Test
void testNonTransientErrorFailsImmediately() {
AtomicInteger attempts = new AtomicInteger(0);
assertThatThrownBy(() -> {
RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
attempts.incrementAndGet();
throw new NonTransientAiException("Permanent failure");
});
}).isInstanceOf(NonTransientAiException.class)
.hasMessageContaining("Permanent failure");
// Should fail immediately without retry
assertThat(attempts.get()).isEqualTo(1);
}
@Test
void testMixedExceptions() {
AtomicInteger attempts = new AtomicInteger(0);
assertThatThrownBy(() -> {
RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
int attemptNumber = attempts.incrementAndGet();
if (attemptNumber == 1) {
// First: transient (will retry)
throw new TransientAiException("Transient error");
} else {
// Second: non-transient (will fail immediately)
throw new NonTransientAiException("Permanent error");
}
});
}).isInstanceOf(NonTransientAiException.class);
// Transient retry + non-transient failure = 2 attempts
assertThat(attempts.get()).isEqualTo(2);
}
}
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.http.HttpStatus;
import org.springframework.http.HttpStatusCode;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.junit.jupiter.api.Test;
import org.mockito.Mockito;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
class ErrorHandlerTest {
@Test
void testRateLimitThrowsTransientException() throws IOException {
ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
ClientHttpResponse response = createMockResponse(
HttpStatus.TOO_MANY_REQUESTS,
"Rate limit exceeded"
);
assertThatThrownBy(() -> handler.handleError(response))
.isInstanceOf(TransientAiException.class)
.hasMessageContaining("429")
.hasMessageContaining("Rate limit exceeded");
}
@Test
void testUnauthorizedThrowsNonTransientException() throws IOException {
ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
ClientHttpResponse response = createMockResponse(
HttpStatus.UNAUTHORIZED,
"Invalid API key"
);
assertThatThrownBy(() -> handler.handleError(response))
.isInstanceOf(NonTransientAiException.class)
.hasMessageContaining("401")
.hasMessageContaining("Invalid API key");
}
@Test
void testServerErrorThrowsTransientException() throws IOException {
ResponseErrorHandler handler = createAutoConfiguredErrorHandler();
ClientHttpResponse response = createMockResponse(
HttpStatus.INTERNAL_SERVER_ERROR,
"Internal server error"
);
assertThatThrownBy(() -> handler.handleError(response))
.isInstanceOf(TransientAiException.class)
.hasMessageContaining("500");
}
private ClientHttpResponse createMockResponse(HttpStatus status, String body)
throws IOException {
ClientHttpResponse response = Mockito.mock(ClientHttpResponse.class);
Mockito.when(response.getStatusCode()).thenReturn(HttpStatusCode.valueOf(status.value()));
Mockito.when(response.getBody()).thenReturn(
new ByteArrayInputStream(body.getBytes())
);
return response;
}
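private ResponseErrorHandler createAutoConfiguredErrorHandler() {
// Placeholder: in real tests, inject the auto-configured ResponseErrorHandler
// bean from the Spring context instead of calling this method
throw new UnsupportedOperationException("Inject the ResponseErrorHandler from the Spring context");
}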
}
java.lang.Object
└── java.lang.Throwable
└── java.lang.Exception
└── java.lang.RuntimeException
├── org.springframework.ai.retry.TransientAiException
│ ├── QuotaExceededException (custom)
│ ├── ModelNotAvailableException (custom)
│ └── (other custom transient exceptions)
└── org.springframework.ai.retry.NonTransientAiException
├── InvalidApiKeyException (custom)
├── InvalidRequestFormatException (custom)
└── (other custom non-transient exceptions)
Both exception types extend RuntimeException, making them unchecked exceptions that don't require explicit throws declarations in method signatures.
Benefits of unchecked exceptions:
Serializable: Both exceptions are serializable (RuntimeException implements Serializable), allowing them to:
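The serialization property can be checked with a round trip through the JDK's object streams. The sketch below uses a local stand-in exception (an assumption, so the example runs without Spring on the classpath), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class SerializationDemo {
    // Stand-in for an AI exception type; like the real ones it extends RuntimeException.
    static class DemoTransientException extends RuntimeException {
        private static final long serialVersionUID = 1L;
        DemoTransientException(String message, Throwable cause) { super(message, cause); }
    }

    /** Serializes the throwable to bytes and reads it back as a new instance. */
    static Throwable roundTrip(Throwable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Throwable) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

The message and the cause chain survive the round trip. Custom subclasses that add fields keep this property only if those fields are serializable as well.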