Spring Boot auto-configuration for AI retry capabilities with exponential backoff and intelligent HTTP error handling
Advanced scenarios, edge cases, and corner cases when using Spring AI Retry Auto Configuration.
API returns error status but empty response body.
ResponseErrorHandler uses placeholder text:
HTTP 503 - No response body available

@Bean
public ResponseErrorHandler customErrorHandler() {
return new ResponseErrorHandler() {
@Override
public boolean hasError(ClientHttpResponse response) throws IOException {
// Delegate to the status code so handleError runs for any 4xx/5xx response
return response.getStatusCode().isError();
}
@Override
public void handleError(ClientHttpResponse response) throws IOException {
int status = response.getStatusCode().value();
String body = readBody(response);
if (body == null || body.trim().isEmpty()) {
body = "No response body available";
}
String message = "HTTP " + status + " - " + body;
if (status >= 500) {
throw new TransientAiException(message);
} else {
throw new NonTransientAiException(message);
}
}
private String readBody(ClientHttpResponse response) throws IOException {
try (InputStream is = response.getBody()) {
return new String(is.readAllBytes(), StandardCharsets.UTF_8);
} catch (IOException e) {
return null;
}
}
};
}

The same HTTP status code appears in both onHttpCodes and excludeOnHttpCodes.
spring.ai.retry.on-http-codes=429,503
spring.ai.retry.exclude-on-http-codes=503,401

onHttpCodes takes precedence (highest priority).
Result: 429 and 503 are retried (both appear in on-http-codes); 401 is not retried.
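The precedence can be pictured as a simple decision helper (illustrative only, not the library's implementation; it assumes the documented order: on-http-codes first, then exclude-on-http-codes, then the 5xx default):

static boolean shouldRetry(int status, Set<Integer> onHttpCodes, Set<Integer> excludeOnHttpCodes) {
    if (onHttpCodes.contains(status)) return true;          // highest priority
    if (excludeOnHttpCodes.contains(status)) return false;  // explicit exclusion
    return status >= 500;                                    // default: 5xx is transient, 4xx is not
}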
Avoid conflicts by keeping lists mutually exclusive:
spring.ai.retry.on-http-codes=429
spring.ai.retry.exclude-on-http-codes=401,403
# 503 handled by default (5xx = retry)

WebClientRequestException class doesn't exist because WebFlux is not on the classpath.
Auto-configuration silently skips WebFlux support:
try {
Class<?> webClientRequestEx = Class.forName(
"org.springframework.web.reactive.function.client.WebClientRequestException"
);
retryTemplateBuilder.retryOn(webClientRequestEx.asSubclass(Throwable.class));
} catch (ClassNotFoundException ignore) {
// Silently skip - no error or warning
}

Result: WebClientRequestException is not registered as a retryable exception; the rest of the retry configuration is applied normally.

Make tests aware of whether WebFlux is on the classpath:
@Test
void testWebFluxDetection() {
boolean webFluxAvailable = false;
try {
Class.forName("org.springframework.web.reactive.function.client.WebClient");
webFluxAvailable = true;
} catch (ClassNotFoundException e) {
// WebFlux not available
}
if (webFluxAvailable) {
// Test WebClient retry behavior
} else {
// Test without WebFlux
}
}

Error response body exceeds memory limits.
Reading entire response into memory can cause OOM.
Limit response body size:
@Bean
public ResponseErrorHandler safeSizeErrorHandler() {
return new ResponseErrorHandler() {
private static final int MAX_BODY_SIZE = 4096; // 4KB
@Override
public boolean hasError(ClientHttpResponse response) throws IOException {
return response.getStatusCode().isError();
}
@Override
public void handleError(ClientHttpResponse response) throws IOException {
int status = response.getStatusCode().value();
String body = readBodySafely(response, MAX_BODY_SIZE);
String message = "HTTP " + status + " - " + body;
if (status >= 500) {
throw new TransientAiException(message);
} else {
throw new NonTransientAiException(message);
}
}
private String readBodySafely(ClientHttpResponse response, int maxSize)
throws IOException {
try (InputStream is = response.getBody()) {
byte[] buffer = new byte[maxSize];
int bytesRead = is.read(buffer);
if (bytesRead == -1) {
return "No response body";
}
String body = new String(buffer, 0, bytesRead, StandardCharsets.UTF_8);
// Check if there's more data
if (is.read() != -1) {
body += "... (truncated)";
}
return body;
}
}
};
}

Need to track retry attempts across multiple calls.
public String trackRetries(String prompt) {
AtomicInteger totalAttempts = new AtomicInteger(0);
return retryTemplate.execute(context -> {
int attemptNumber = totalAttempts.incrementAndGet();
int retryCount = context.getRetryCount();
log.info("Total attempts: {}, Retry count: {}",
attemptNumber, retryCount);
// Store in context for recovery callback
context.setAttribute("totalAttempts", attemptNumber);
return callApi(prompt);
}, context -> {
Integer attempts = (Integer) context.getAttribute("totalAttempts");
log.error("Failed after {} total attempts", attempts);
return "Fallback";
});
}

API returns non-standard codes (e.g., 460, 599).
Classification follows the standard logic (unlisted 5xx codes are retried, 4xx are not); list non-standard codes explicitly to control them:
spring.ai.retry.on-http-codes=460,599
spring.ai.retry.exclude-on-http-codes=461

Results: 460 and 599 are retried (explicitly listed in on-http-codes); 461 is never retried (excluded).
Set max-attempts to 0.
spring.ai.retry.max-attempts=0

No retries - immediate failure on first error.
Disable retries temporarily for debugging:
@Profile("debug")
@Configuration
class NoRetryConfig {
@Bean
public RetryTemplate retryTemplate() {
return RetryTemplate.builder()
.maxAttempts(1) // 1 attempt = no retries (the builder requires maxAttempts >= 1)
.build();
}
}

Set multiplier to 1 for fixed backoff.
spring.ai.retry.backoff.multiplier=1
spring.ai.retry.backoff.initial-interval=2s
spring.ai.retry.backoff.max-interval=2s

All retries wait exactly 2 seconds.
Attempt 1: 2 × 1^0 = 2s
Attempt 2: 2 × 1^1 = 2s
Attempt 3: 2 × 1^2 = 2s
...

Predictable timing for integration tests or APIs with fixed rate windows.
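If you prefer a bean over properties, the same fixed two-second backoff can be built programmatically with Spring Retry's builder (a minimal sketch; the bean name is illustrative):

@Bean
public RetryTemplate fixedBackoffRetryTemplate() {
    return RetryTemplate.builder()
        .maxAttempts(10)
        .fixedBackoff(2000) // every retry waits exactly 2 seconds
        .build();
}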
Multiple threads using same RetryTemplate.
Each thread gets its own RetryContext - thread-safe.
@Test
void testConcurrentRetries() throws InterruptedException {
RetryTemplate retryTemplate = createRetryTemplate();
ExecutorService executor = Executors.newFixedThreadPool(10);
CountDownLatch latch = new CountDownLatch(10);
for (int i = 0; i < 10; i++) {
final int threadId = i;
executor.submit(() -> {
try {
retryTemplate.execute(context -> {
// Each thread has independent context
log.info("Thread {} retry count: {}",
threadId, context.getRetryCount());
return performOperation();
});
} finally {
latch.countDown();
}
});
}
latch.await(30, TimeUnit.SECONDS);
executor.shutdown();
}

RecoveryCallback itself throws exception.
The exception is propagated to the caller. Catch failures inside the recovery callback and return a simple fallback instead:
public String safeRecovery(String prompt) {
return retryTemplate.execute(
context -> callApi(prompt),
context -> {
try {
return getExpensiveFallback();
} catch (Exception e) {
log.error("Recovery failed", e);
// Return simple fallback instead of throwing
return "Fallback unavailable";
}
}
);
}

Multiplier set to invalid value.
spring.ai.retry.backoff.multiplier=-1 # Invalid
spring.ai.retry.backoff.multiplier=1000 # Extreme

Spring Boot validation should catch this, but implement defensive checks:
@ConfigurationProperties("spring.ai.retry")
@Validated
public class SpringAiRetryProperties {
@Min(1)
private int multiplier = 5;
public void setMultiplier(int multiplier) {
if (multiplier < 1) {
throw new IllegalArgumentException("Multiplier must be >= 1");
}
if (multiplier > 100) {
log.warn("Very large multiplier ({}), backoff may be excessive", multiplier);
}
this.multiplier = multiplier;
}
}

Duration property set to invalid value.
spring.ai.retry.backoff.initial-interval=invalid
spring.ai.retry.backoff.max-interval=

Spring Boot fails to start with a property binding error (ConfigurationPropertiesBindException). Add a defensive check in the setter as well:
public void setInitialInterval(Duration initialInterval) {
if (initialInterval == null) {
throw new IllegalArgumentException("Initial interval cannot be null");
}
if (initialInterval.isNegative() || initialInterval.isZero()) {
throw new IllegalArgumentException("Initial interval must be positive");
}
this.initialInterval = initialInterval;
}
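To see this failure in a test rather than at application startup, an ApplicationContextRunner can assert that binding an invalid duration prevents the context from starting (a sketch; the auto-configuration class name and package are assumptions to verify against your Spring AI version):

import static org.assertj.core.api.Assertions.assertThat;

import org.junit.jupiter.api.Test;
import org.springframework.ai.retry.autoconfigure.SpringAiRetryAutoConfiguration; // assumed location
import org.springframework.boot.autoconfigure.AutoConfigurations;
import org.springframework.boot.test.context.runner.ApplicationContextRunner;

class InvalidDurationBindingTest {
    @Test
    void invalidDurationFailsContextStartup() {
        new ApplicationContextRunner()
            .withConfiguration(AutoConfigurations.of(SpringAiRetryAutoConfiguration.class))
            .withPropertyValues("spring.ai.retry.backoff.initial-interval=invalid")
            .run(context -> assertThat(context).hasFailed());
    }
}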
Multiple RetryTemplate beans defined.
Which one gets injected?
Use qualifiers:
@Configuration
class MultipleRetryConfig {
@Bean
@Primary
public RetryTemplate defaultRetryTemplate() {
return RetryTemplate.builder()
.maxAttempts(10)
.exponentialBackoff(2000, 5, 180000)
.build();
}
@Bean
@Qualifier("aggressive")
public RetryTemplate aggressiveRetryTemplate() {
return RetryTemplate.builder()
.maxAttempts(20)
.exponentialBackoff(500, 3, 60000)
.build();
}
}
@Service
class MyService {
private final RetryTemplate defaultRetry;
private final RetryTemplate aggressiveRetry;
public MyService(
RetryTemplate defaultRetry, // Injects @Primary
@Qualifier("aggressive") RetryTemplate aggressiveRetry) {
this.defaultRetry = defaultRetry;
this.aggressiveRetry = aggressiveRetry;
}
}

Retry in progress when application shuts down.
May cause incomplete operations or hung threads.
Implement graceful shutdown:
@Component
class GracefulRetryShutdown implements DisposableBean {
private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
private final Set<CompletableFuture<?>> inFlightRetries = ConcurrentHashMap.newKeySet();
public String executeWithShutdownAwareness(Supplier<String> operation) {
if (shuttingDown.get()) {
throw new IllegalStateException("Application is shutting down");
}
CompletableFuture<String> future = CompletableFuture.supplyAsync(operation);
inFlightRetries.add(future);
try {
return future.get();
} catch (Exception e) {
throw new RuntimeException(e);
} finally {
inFlightRetries.remove(future);
}
}
@Override
public void destroy() throws Exception {
log.info("Graceful shutdown initiated. {} retries in flight",
inFlightRetries.size());
shuttingDown.set(true);
// Wait for in-flight retries (with timeout)
CompletableFuture.allOf(inFlightRetries.toArray(new CompletableFuture[0]))
.orTimeout(30, TimeUnit.SECONDS)
.join();
log.info("All retries completed. Shutdown complete.");
}
}

Operation has side effects that persist across retries.
Retries may cause duplicate operations.
Use idempotency keys:
public String idempotentRetry(String request) {
String idempotencyKey = generateIdempotencyKey(request);
return retryTemplate.execute(context -> {
// Check if operation already completed
Optional<String> cached = checkIdempotencyCache(idempotencyKey);
if (cached.isPresent()) {
log.info("Operation already completed, returning cached result");
return cached.get();
}
// Perform operation with idempotency key
String result = performOperation(request, idempotencyKey);
// Cache result
cacheIdempotentResult(idempotencyKey, result);
return result;
});
}
private String generateIdempotencyKey(String request) {
// Computed once per logical request (before execute), so every retry attempt reuses the same key
return DigestUtils.md5Hex(request + System.currentTimeMillis());
}

Total timeout < max retry backoff interval.
spring.ai.retry.backoff.max-interval=60s
# But application timeout is 30s

Later retries never execute due to timeout.
Ensure backoff fits within timeout budget:
# If timeout is 30s total:
spring.ai.retry.max-attempts=5
spring.ai.retry.backoff.initial-interval=1s
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=5s
# Total backoff: 1 + 2 + 4 + 5 = 12s across the 4 waits between 5 attempts (fits in 30s)
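To sanity-check a configuration against a timeout budget, the worst-case backoff can be computed directly from the three values (an illustrative helper, not part of the auto-configuration):

// attempts - 1 waits, each scaled by the multiplier and capped at maxInterval
static Duration worstCaseBackoff(int maxAttempts, Duration initial, double multiplier, Duration max) {
    long totalMs = 0;
    double intervalMs = initial.toMillis();
    for (int wait = 0; wait < maxAttempts - 1; wait++) {
        totalMs += (long) Math.min(intervalMs, max.toMillis());
        intervalMs *= multiplier;
    }
    return Duration.ofMillis(totalMs);
}

// worstCaseBackoff(5, Duration.ofSeconds(1), 2, Duration.ofSeconds(5)) returns PT12S (12 seconds)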
Custom exception that should be retried but doesn't extend TransientAiException.

Create a custom RetryTemplate:
@Bean
public RetryTemplate customExceptionRetryTemplate() {
return RetryTemplate.builder()
.maxAttempts(10)
.retryOn(TransientAiException.class)
.retryOn(CustomRetryableException.class) // Custom exception
.retryOn(ResourceAccessException.class)
.exponentialBackoff(2000, 5, 180000)
.build();
}
class CustomRetryableException extends RuntimeException {
public CustomRetryableException(String message) {
super(message);
}
}

Heavy metrics collection during retries impacts performance.
Use sampling or async metrics:
@Configuration
class PerformantMetricsConfig {
@Bean
public RetryListener sampledMetricsListener(MeterRegistry registry) {
Counter retryCounter = registry.counter("ai.retry.sampled");
return new RetryListener() {
@Override
public <T, E extends Throwable> void onError(
RetryContext context,
RetryCallback<T, E> callback,
Throwable throwable) {
// Sample ~10% of retries; obtain ThreadLocalRandom on the calling thread, never cache it
if (ThreadLocalRandom.current().nextDouble() < 0.1) {
retryCounter.increment();
// Record detailed metrics
recordDetailedMetrics(context, throwable);
}
}
};
}
private void recordDetailedMetrics(RetryContext context, Throwable throwable) {
// Async metrics recording
CompletableFuture.runAsync(() -> {
// Heavy metrics work here
});
}
}

A test checklist covering these edge cases:

@SpringBootTest
class EdgeCaseTests {
@Autowired
private RetryTemplate retryTemplate;
@Test
void testEmptyResponseBody() {
// Test with mock that returns empty body
}
@Test
void testConflictingCodes() {
// Test precedence rules
}
@Test
void testZeroMaxAttempts() {
// Test immediate failure
}
@Test
void testConcurrentRetries() {
// Test thread safety
}
@Test
void testExceptionInRecovery() {
// Test recovery failure handling
}
}
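As a concrete starting point for the checklist, a minimal version of testZeroMaxAttempts might look like this (assuming a RetryTemplate built with maxAttempts(1), as in the debugging workaround above):

@Test
void testZeroMaxAttempts() {
    RetryTemplate noRetry = RetryTemplate.builder().maxAttempts(1).build();
    AtomicInteger calls = new AtomicInteger();
    assertThrows(TransientAiException.class, () ->
        noRetry.execute(context -> {
            calls.incrementAndGet();
            throw new TransientAiException("always fails");
        }));
    assertEquals(1, calls.get()); // a single attempt, no retries
}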