
tessl/maven-org-springframework-ai--spring-ai-retry

Spring AI utility library providing retry mechanisms for AI API interactions with comprehensive error handling and exception classification


Edge Cases and Common Pitfalls

This document covers common pitfalls and edge cases when using Spring AI Retry.

Common Pitfalls

Pitfall 1: Retrying Non-Idempotent Operations

Problem: Retrying operations that aren't safe to repeat (e.g., charging a credit card, creating resources).

Bad Example:

// BAD: May charge multiple times!
retryTemplate.execute(context -> {
    return paymentService.charge(amount);
});

Solution: Only retry idempotent operations, or implement idempotency keys.

// GOOD: Use an idempotency key, created once so every attempt sends the same key
String idempotencyKey = generateKey(request);
retryTemplate.execute(context -> {
    return paymentService.charge(amount, idempotencyKey);
});

// GOOD: Check if already processed
retryTemplate.execute(context -> {
    if (paymentService.isAlreadyProcessed(transactionId)) {
        return paymentService.getResult(transactionId);
    }
    return paymentService.charge(amount, transactionId);
});
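
An idempotency key only helps if every retry of the same logical payment sends the same key. One possible shape for a helper such as generateKey is sketched below: it derives the key deterministically from request fields. The class name, method name, and the choice of fields are hypothetical and not part of Spring AI Retry.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper (not part of Spring AI Retry): derives a stable
// idempotency key from the fields that identify one logical payment.
final class IdempotencyKeys {

    private IdempotencyKeys() {
    }

    // The same (customerId, orderId, amountInCents) always yields the same key,
    // so every retry of one payment carries an identical key.
    static String forPayment(String customerId, String orderId, long amountInCents) {
        String canonical = customerId + "|" + orderId + "|" + amountInCents;
        // nameUUIDFromBytes is deterministic: it hashes the input bytes
        // into a name-based (type 3) UUID.
        return UUID.nameUUIDFromBytes(canonical.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

Compute the key once, before calling retryTemplate.execute, and pass it into the callback. A random key generated inside the callback would differ on each attempt and provide no protection.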

Pitfall 2: Not Handling NonTransientAiException

Problem: Catching all exceptions and retrying, including non-transient ones.

Bad Example:

// BAD: Retrying everything manually defeats the purpose
try {
    result = retryTemplate.execute(context -> aiClient.call());
} catch (Exception e) {
    // Retrying again manually - wrong!
    result = aiClient.call();
}

Solution: Let the retry template handle retries, and distinguish exception types once it gives up.

// GOOD: Let retry template handle it
try {
    result = retryTemplate.execute(context -> aiClient.call());
} catch (NonTransientAiException e) {
    // Handle permanent failures appropriately
    logger.error("Invalid request: {}", e.getMessage());
    throw new BadRequestException("Invalid request", e);
} catch (TransientAiException e) {
    // All retries exhausted
    logger.error("Service unavailable after retries");
    throw new ServiceUnavailableException("Service down", e);
}

Pitfall 3: Blocking Reactive Threads

Problem: Using blocking retry in reactive applications.

Bad Example:

// BAD: Blocks reactive thread!
Mono<String> result = Mono.fromCallable(() -> {
    return RetryUtils.DEFAULT_RETRY_TEMPLATE.execute(context -> {
        return aiClient.generate(prompt);  // Blocking call!
    });
});

Solution: Use reactive retry mechanisms.

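// GOOD: Use reactive retry
// Note: the filter only matches if failures surface as TransientAiException,
// for example by mapping error statuses with onStatus(...) before bodyToMono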
Mono<String> result = webClient
    .post()
    .uri("/chat")
    .bodyValue(request)
    .retrieve()
    .bodyToMono(String.class)
    .retryWhen(Retry.backoff(10, Duration.ofSeconds(2))
        .maxBackoff(Duration.ofMinutes(3))
        .filter(e -> e instanceof TransientAiException));

Pitfall 4: Incorrect Error Classification

Problem: Classifying errors incorrectly (e.g., treating auth errors as transient).

Bad Example:

// BAD: Auth error will never succeed on retry!
if (response.getStatusCode() == 401) {
    throw new TransientAiException("Auth failed");
}

Solution: Carefully classify errors based on whether they'll succeed on retry.

// GOOD: Correct classification
if (response.getStatusCode() == 401) {
    throw new NonTransientAiException(
        "Invalid API key - check configuration");
}

if (response.getStatusCode() == 503) {
    throw new TransientAiException(
        "Service temporarily unavailable");
}
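
The classification rule can be centralized so that every call site applies the same policy. The sketch below is a hypothetical classifier written in plain Java. The policy it encodes (408 and 429 transient, other 4xx non-transient, all 5xx transient) is an assumption chosen for illustration, not the built-in behavior of Spring AI Retry.

```java
// Hypothetical classifier; the status-code policy is an illustrative assumption.
final class StatusClassifier {

    enum Kind { NOT_AN_ERROR, TRANSIENT, NON_TRANSIENT }

    private StatusClassifier() {
    }

    static Kind classify(int status) {
        if (status < 400) {
            return Kind.NOT_AN_ERROR;
        }
        // 408 (Request Timeout) and 429 (Too Many Requests) can succeed later.
        if (status == 408 || status == 429) {
            return Kind.TRANSIENT;
        }
        // Other 4xx codes (400, 401, 403, 404, ...) describe a problem with
        // the request itself, so repeating it unchanged will not help.
        if (status < 500) {
            return Kind.NON_TRANSIENT;
        }
        // 5xx: assumed to be a server-side problem that may clear up.
        return Kind.TRANSIENT;
    }
}
```

A caller would then throw TransientAiException for Kind.TRANSIENT and NonTransientAiException for Kind.NON_TRANSIENT.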

Pitfall 5: Not Setting Max Interval

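Problem: With a large multiplier, exponential backoff reaches long waits within a few attempts unless max-interval caps it at a value the caller can tolerate.

Bad Configuration:

# BAD: Aggressive multiplier with no explicit cap
spring.ai.retry.backoff:
  initial-interval: 1s
  multiplier: 10
  # max-interval not set - falls back to the default of 3 minutes,
  # so waits grow 1s, 10s, 100s and then sit at 180s per retry

Solution: Always set max-interval explicitly to a reasonable value.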

# GOOD: Reasonable cap
spring.ai.retry.backoff:
  initial-interval: 1s
  multiplier: 10
  max-interval: 60s  # Cap at 1 minute
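
To see what a given configuration means in practice, it can help to print the wait before each retry. The helper below is a minimal plain-Java sketch, assuming the usual capped exponential rule (each wait is the previous one times the multiplier, limited to the maximum). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes the wait before each retry under capped
// exponential backoff. It illustrates the arithmetic only and does not sleep.
final class BackoffSchedule {

    private BackoffSchedule() {
    }

    static List<Long> waits(long initialMs, double multiplier, long maxMs, int retries) {
        List<Long> waits = new ArrayList<>();
        double next = initialMs;
        for (int i = 0; i < retries; i++) {
            // Cap the current wait, then grow the uncapped value for the next round.
            waits.add((long) Math.min(next, (double) maxMs));
            next *= multiplier;
        }
        return waits;
    }
}
```

With initial-interval 1s, multiplier 10 and max-interval 60s, waits(1000, 10.0, 60_000, 4) returns [1000, 10000, 60000, 60000]. With a 3 minute cap the same call returns [1000, 10000, 100000, 180000].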

Edge Cases

Edge Case 1: Rate Limit with Retry-After Header

Problem: Need to respect the Retry-After header from rate limit responses.

Solution: Parse and use Retry-After header.

public class RateLimitAwareRetryListener implements RetryListener {
    
    @Override
    public <T, E extends Throwable> void onError(
            RetryContext context,
            RetryCallback<T, E> callback,
            Throwable throwable) {
        
        if (throwable instanceof TransientAiException) {
            String message = throwable.getMessage();
            
            // Parse Retry-After from the error message; getMessage() may return null
            if (message != null && message.contains("Retry-After:")) {
                int retryAfterSeconds = parseRetryAfter(message);
                if (retryAfterSeconds > 0) {
                    try {
                        // Note: this wait is added on top of the template's own backoff
                        Thread.sleep(retryAfterSeconds * 1000L);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }
    }
    
    private int parseRetryAfter(String message) {
        // Parse "Retry-After: 30" from message
        Pattern pattern = Pattern.compile("Retry-After:\\s*(\\d+)");
        Matcher matcher = pattern.matcher(message);
        if (matcher.find()) {
            return Integer.parseInt(matcher.group(1));
        }
        return 0;
    }
}
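
Per the HTTP specification, a Retry-After value can be either a number of seconds or an HTTP-date, so a parser that only accepts digits will miss the second form. The sketch below handles both in plain Java. The class name is hypothetical, and it assumes the raw header value is available rather than embedded in an exception message.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical parser for a raw Retry-After header value.
final class RetryAfterParser {

    private RetryAfterParser() {
    }

    // Returns how long to wait from 'now', or empty if the value is unusable.
    static Optional<Duration> parse(String headerValue, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return Optional.empty();
        }
        String value = headerValue.trim();

        // Form 1: delta-seconds, for example "30".
        if (value.chars().allMatch(Character::isDigit)) {
            try {
                return Optional.of(Duration.ofSeconds(Long.parseLong(value)));
            } catch (NumberFormatException e) {
                return Optional.empty(); // more digits than a long can hold
            }
        }

        // Form 2: HTTP-date, for example "Wed, 21 Oct 2015 07:28:00 GMT".
        try {
            Instant retryAt = ZonedDateTime
                .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
            Duration wait = Duration.between(now, retryAt);
            // A date in the past means the caller may retry immediately.
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait);
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

Passing the current time as a parameter keeps the method deterministic and easy to test. Production code would typically pass Instant.now().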

Edge Case 2: Circuit Breaker Open State

Problem: Circuit breaker is open, should not attempt requests.

Solution: Check circuit breaker state before attempting.

public String generate(String prompt) {
    return retryTemplate.execute(context -> {
        // Check circuit breaker state
        if (circuitBreaker.getState() == CircuitBreaker.State.OPEN) {
            // Fail immediately: NonTransientAiException is not retried, whereas
            // a TransientAiException here would trigger the full backoff cycle
            throw new NonTransientAiException(
                "Circuit breaker open - service recovering");
        }
        
        // Only attempt if circuit is closed or half-open
        return aiClient.generate(prompt);
    });
}

Edge Case 3: Quota Exceeded vs Temporary Rate Limit

Problem: Both may return 429, but one is permanent (quota) and one is temporary (rate limit).

Solution: Parse response body to distinguish.

public String generate(String prompt) {
    return retryTemplate.execute(context -> {
        try {
            return aiClient.generate(prompt);
        } catch (HttpStatusCodeException e) {
            if (e.getStatusCode().value() == 429) {
                String body = e.getResponseBodyAsString();
                
                // Check if it's quota exceeded (non-transient)
                if (body.contains("quota_exceeded") || 
                    body.contains("monthly_limit")) {
                    throw new NonTransientAiException(
                        "API quota exceeded - upgrade plan", e);
                }
                
                // Otherwise it's a temporary rate limit (transient)
                throw new TransientAiException(
                    "Rate limit exceeded - will retry", e);
            }
            throw e;
        }
    });
}

Edge Case 4: Scheduled Maintenance

Problem: Service returns 503 during scheduled maintenance, which may last hours.

Solution: Detect maintenance and fail fast.

public String generate(String prompt) {
    return retryTemplate.execute(context -> {
        try {
            return aiClient.generate(prompt);
        } catch (HttpStatusCodeException e) {
            if (e.getStatusCode().value() == 503) {
                String body = e.getResponseBodyAsString();
                
                // Check if it's scheduled maintenance
                if (body.contains("maintenance") || 
                    body.contains("scheduled_downtime")) {
                    throw new NonTransientAiException(
                        "Service in maintenance - check status page", e);
                }
                
                // Otherwise it's a temporary outage
                throw new TransientAiException(
                    "Service temporarily unavailable", e);
            }
            throw e;
        }
    });
}

Edge Case 5: Response Body Already Consumed

Problem: The response body stream can be read only once, so an error handler that reads it a second time fails or sees an empty body.

Solution: Read the body once into a buffer and reuse the buffered copy.

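public class BufferedResponseErrorHandler implements ResponseErrorHandler {
    
    @Override
    public boolean hasError(ClientHttpResponse response) throws IOException {
        // Required by ResponseErrorHandler: treat any 4xx/5xx status as an error
        return response.getStatusCode().isError();
    }
    
    @Override
    public void handleError(ClientHttpResponse response) throws IOException {
        // Read the body once into memory; reuse bodyString from here on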
        byte[] body = response.getBody().readAllBytes();
        String bodyString = new String(body, StandardCharsets.UTF_8);
        
        int statusCode = response.getStatusCode().value();
        
        if (statusCode >= 400 && statusCode < 500) {
            throw new NonTransientAiException(
                "Client error (" + statusCode + "): " + bodyString);
        } else {
            throw new TransientAiException(
                "Server error (" + statusCode + "): " + bodyString);
        }
    }
}

Edge Case 6: Timeout vs Connection Refused

Problem: Both are network errors but may have different retry strategies.

Solution: Classify network errors more specifically.

public String generate(String prompt) {
    return retryTemplate.execute(context -> {
        try {
            return aiClient.generate(prompt);
        } catch (SocketTimeoutException e) {
            // Timeout - service is slow but reachable
            throw new TransientAiException("Request timeout", e);
        } catch (ConnectException e) {
            // Connection refused - service is down
            // getRetryCount() is the number of attempts that have already failed
            if (context.getRetryCount() >= 5) {
                // Five attempts have already failed, likely not coming back soon
                throw new NonTransientAiException(
                    "Service unreachable - check status", e);
            }
            throw new TransientAiException("Connection refused", e);
        }
    });
}

Edge Case 7: Concurrent Request Limit

Problem: Service has concurrent request limit, not a rate limit.

Solution: Use semaphore or queue before retry.

public class ThrottledAiService {
    
    private final Semaphore concurrencyLimiter;
    private final RetryTemplate retryTemplate;
    
    public ThrottledAiService(int maxConcurrentRequests) {
        this.concurrencyLimiter = new Semaphore(maxConcurrentRequests);
        this.retryTemplate = RetryUtils.DEFAULT_RETRY_TEMPLATE;
    }
    
    public String generate(String prompt) {
        try {
            // Acquire permit before attempting
            concurrencyLimiter.acquire();
            
            try {
                return retryTemplate.execute(context -> 
                    aiClient.generate(prompt)
                );
            } finally {
                concurrencyLimiter.release();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted", e);
        }
    }
}

Edge Case 8: Partial Success in Batch Operations

Problem: A batch operation partially succeeds, and the items that already succeeded should not be retried.

Solution: Track successful items and only retry failures.

public List<String> generateBatch(List<String> prompts) {
    Map<String, String> results = new ConcurrentHashMap<>();
    Set<String> remaining = ConcurrentHashMap.newKeySet();
    remaining.addAll(prompts);
    
    int attempt = 0;
    while (!remaining.isEmpty() && attempt < 5) {
        attempt++;
        
        Set<String> currentBatch = new HashSet<>(remaining);
        for (String prompt : currentBatch) {
            try {
                String result = aiClient.generate(prompt);
                results.put(prompt, result);
                remaining.remove(prompt);
            } catch (TransientAiException e) {
                logger.warn("Attempt {} failed for prompt: {}", attempt, prompt);
                // Will retry in next iteration
            } catch (NonTransientAiException e) {
                logger.error("Prompt failed validation: {}", prompt);
                remaining.remove(prompt);  // Don't retry
            }
        }
        
        if (!remaining.isEmpty()) {
            try {
                Thread.sleep(2000 * attempt);  // Backoff
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }
    
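    // Prompts that never succeeded have no entry in results,
    // so their position in the returned list is null
    return prompts.stream()
        .map(results::get)
        .collect(Collectors.toList());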
}

Edge Case 9: Token Expiration Mid-Request

Problem: OAuth token expires while request is in flight.

Solution: Refresh token and retry.

public String generate(String prompt) {
    return retryTemplate.execute(context -> {
        try {
            return aiClient.generate(prompt);
        } catch (HttpStatusCodeException e) {
            if (e.getStatusCode().value() == 401) {
                String body = e.getResponseBodyAsString();
                
                // Check if it's token expiration (transient)
                if (body.contains("token_expired") || 
                    body.contains("expired_token")) {
                    
                    // Refresh token
                    logger.info("Token expired, refreshing");
                    tokenService.refresh();
                    
                    // Retry with new token
                    throw new TransientAiException(
                        "Token expired - refreshed", e);
                }
                
                // Otherwise it's invalid credentials (non-transient)
                throw new NonTransientAiException(
                    "Invalid credentials", e);
            }
            throw e;
        }
    });
}

Edge Case 10: Different Retry Strategies per Exception

Problem: Want different retry strategies for different error types.

Solution: Use multiple retry templates or custom retry policy.

public class MultiStrategyService {
    
    private final RetryTemplate fastRetry;
    private final RetryTemplate slowRetry;
    
    public MultiStrategyService() {
        // Fast retry for rate limits (fixed delay)
        this.fastRetry = RetryTemplate.builder()
            .maxAttempts(10)
            .fixedBackoff(10000)  // 10 seconds
            .retryOn(RateLimitException.class)
            .build();
        
        // Slow retry for service outages (exponential)
        this.slowRetry = RetryTemplate.builder()
            .maxAttempts(5)
            .exponentialBackoff(
                Duration.ofSeconds(5),
                3.0,
                Duration.ofMinutes(5)
            )
            .retryOn(ServiceUnavailableException.class)
            .build();
    }
    
    public String generate(String prompt) {
        try {
            return slowRetry.execute(context -> {
                try {
                    return fastRetry.execute(innerContext -> {
                        return aiClient.generate(prompt);
                    });
                } catch (RateLimitException e) {
                    // Fast retry exhausted
                    throw new TransientAiException("Rate limited", e);
                }
            });
        } catch (ServiceUnavailableException e) {
            throw new TransientAiException("Service down", e);
        }
    }
}

Testing Edge Cases

Test 1: Verify No Retry on Non-Transient

@Test
void testNonTransientNotRetried() {
    AtomicInteger attempts = new AtomicInteger(0);
    
    assertThrows(NonTransientAiException.class, () -> {
        RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
            attempts.incrementAndGet();
            throw new NonTransientAiException("Invalid API key");
        });
    });
    
    // Should only attempt once
    assertEquals(1, attempts.get());
}

Test 2: Verify Retry Exhaustion

@Test
void testRetryExhaustion() {
    AtomicInteger attempts = new AtomicInteger(0);
    
    assertThrows(TransientAiException.class, () -> {
        RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
            attempts.incrementAndGet();
            throw new TransientAiException("Always fails");
        });
    });
    
    // Should attempt all 10 times
    assertEquals(10, attempts.get());
}

Test 3: Verify Success After Retries

@Test
void testSuccessAfterRetries() {
    AtomicInteger attempts = new AtomicInteger(0);
    
    String result = RetryUtils.SHORT_RETRY_TEMPLATE.execute(context -> {
        int attempt = attempts.incrementAndGet();
        if (attempt < 3) {
            throw new TransientAiException("Temporary failure");
        }
        return "Success";
    });
    
    assertEquals("Success", result);
    assertEquals(3, attempts.get());
}

Next Steps

  • Integration Patterns - Real-world integration examples
  • Error Handling Strategies - Comprehensive error handling
  • API Reference - Complete API documentation

Install with Tessl CLI

npx tessl i tessl/maven-org-springframework-ai--spring-ai-retry@1.1.0
