tessl/maven-org-springframework-ai--spring-ai-spring-boot-autoconfigure

Spring AI Spring Boot Auto Configuration modules providing automatic setup for AI models, vector stores, MCP, and retry capabilities

Common - Retry Module

Name: tessl/maven-org-springframework-ai--spring-ai-spring-boot-autoconfigure
Author: tessl

The Retry module provides autoconfiguration for resilient AI operations. It classifies HTTP errors as transient (retryable) or non-transient (permanent) and retries transient failures automatically with exponential backoff.

Maven Coordinates

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-autoconfigure-retry</artifactId>
    <version>1.1.2</version>
</dependency>

Capabilities

Spring AI Retry AutoConfiguration

Automatically configures retry mechanisms for AI operations with exponential backoff and intelligent error handling.

/**
 * Autoconfigures retry support for Spring AI operations
 * 
 * Conditional Requirements:
 * - @ConditionalOnClass: org.springframework.ai.retry.RetryUtils
 * 
 * Configuration Properties: spring.ai.retry.*
 * 
 * @AutoConfiguration
 * @ConditionalOnClass(RetryUtils.class)
 * @EnableConfigurationProperties(SpringAiRetryProperties.class)
 */
@AutoConfiguration
@ConditionalOnClass(RetryUtils.class)
@EnableConfigurationProperties(SpringAiRetryProperties.class)
class SpringAiRetryAutoConfiguration {
    // Bean definitions for retry infrastructure
}

Retry Template Bean

Creates a configured RetryTemplate for AI operations with exponential backoff and retry listeners.

/**
 * Provides retry template with exponential backoff
 * 
 * @Bean
 * @ConditionalOnMissingBean
 * @param properties Configuration properties for retry behavior
 * @return RetryTemplate configured for AI operations
 * 
 * Retry Behavior:
 * - Retries on TransientAiException: Rate limits, timeouts, 5xx errors
 * - Retries on ResourceAccessException: Network failures, connection issues
 * - Optionally retries on WebClientRequestException: If WebFlux present
 * - Exponential backoff with configurable intervals
 * - Includes retry listener for logging attempts
 * 
 * Backoff Formula (n = retry number, starting at 1):
 * delay(n) = min(initialInterval * multiplier^(n - 1), maxInterval)
 * 
 * Example with defaults (delay slept before each retry):
 * - Retry 1: 2s
 * - Retry 2: 10s (2 * 5^1)
 * - Retry 3: 50s (2 * 5^2)
 * - Retry 4: 180s (2 * 5^3 = 250s, capped at maxInterval)
 */
@Bean
@ConditionalOnMissingBean
RetryTemplate retryTemplate(SpringAiRetryProperties properties) {
    RetryTemplate template = new RetryTemplate();
    
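    // Configure retry policy: retry only the exception types listed above.
    // A no-arg SimpleRetryPolicy would retry every exception, including NonTransientAiException.
    SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy(
            properties.getMaxAttempts(),
            Map.of(TransientAiException.class, true,
                   ResourceAccessException.class, true));
    template.setRetryPolicy(retryPolicy);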
    
    // Configure exponential backoff
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(properties.getBackoff().getInitialInterval().toMillis());
    backOffPolicy.setMultiplier(properties.getBackoff().getMultiplier());
    backOffPolicy.setMaxInterval(properties.getBackoff().getMaxInterval().toMillis());
    template.setBackOffPolicy(backOffPolicy);
    
    // Add retry listener for logging
    template.registerListener(new RetryListenerSupport() {
        @Override
        public <T, E extends Throwable> void onError(
                RetryContext context, 
                RetryCallback<T, E> callback, 
                Throwable throwable) {
            log.warn("Retry attempt {} failed: {}", 
                    context.getRetryCount(), 
                    throwable.getMessage());
        }
    });
    
    return template;
}
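
The backoff schedule described above can be sketched in plain Java. This is a minimal illustration of the formula, not the Spring Retry implementation; the class name `BackoffSchedule` and the method `delays` are hypothetical names introduced for this example. It assumes that `maxAttempts` counts the initial call, so a run of `maxAttempts` attempts sleeps at most `maxAttempts - 1` times.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class BackoffSchedule {

    /**
     * Delays slept between attempts. With maxAttempts total attempts there are
     * at most maxAttempts - 1 delays, each capped at maxInterval.
     */
    public static List<Duration> delays(Duration initialInterval, int multiplier,
                                        Duration maxInterval, int maxAttempts) {
        List<Duration> result = new ArrayList<>();
        long delayMs = initialInterval.toMillis();
        long maxMs = maxInterval.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            result.add(Duration.ofMillis(Math.min(delayMs, maxMs)));
            // Grow for the next retry and keep the value bounded by the cap
            delayMs = Math.min(delayMs * multiplier, maxMs);
        }
        return result;
    }

    public static void main(String[] args) {
        // Defaults described above: 2s initial, multiplier 5, 180s cap, 10 attempts
        System.out.println(delays(Duration.ofSeconds(2), 5, Duration.ofSeconds(180), 10));
    }
}
```

Printing the schedule for a candidate configuration is a quick way to check how long a caller could wait before committing the values to `application.properties`.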

Usage Example:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class ResilientAiService {
    private static final Logger log = LoggerFactory.getLogger(ResilientAiService.class);
    private final RetryTemplate retryTemplate;
    private final ChatModel chatModel;
    
    public ResilientAiService(RetryTemplate retryTemplate, ChatModel chatModel) {
        this.retryTemplate = retryTemplate;
        this.chatModel = chatModel;
    }
    
    public String callAiWithRetry(String prompt) {
        return retryTemplate.execute(context -> {
            // This operation will be retried on transient failures
            return chatModel.call(prompt);
        });
    }
    
    public String callAiWithRetryAndRecovery(String prompt) {
        return retryTemplate.execute(
            context -> {
                // Main operation
                return chatModel.call(prompt);
            },
            context -> {
                // Recovery callback - called after all retries exhausted
                log.error("All retries exhausted for prompt: {}", prompt);
                return "I apologize, but I'm currently unable to process your request. Please try again later.";
            }
        );
    }
}

Response Error Handler Bean

Creates a ResponseErrorHandler that classifies HTTP errors as transient or non-transient based on status codes and configuration.

/**
 * Handles HTTP response errors for AI operations
 * 
 * @Bean
 * @ConditionalOnMissingBean
 * @param properties Configuration for error classification
 * @return ResponseErrorHandler that throws appropriate exceptions
 * 
 * Error Classification Logic:
 * 1. Check excludeOnHttpCodes: If matched -> NonTransientAiException
 * 2. Check onHttpCodes: If matched -> TransientAiException
 * 3. Check onClientErrors: If true, 4xx -> TransientAiException
 * 4. Default 4xx -> NonTransientAiException
 * 5. Default 5xx -> TransientAiException
 * 
 * HTTP Status Code Handling (default settings: onClientErrors=false, empty code lists):
 * - 400 Bad Request: NonTransient (invalid request format)
 * - 401 Unauthorized: NonTransient (invalid API key)
 * - 403 Forbidden: NonTransient (insufficient permissions)
 * - 404 Not Found: NonTransient (invalid endpoint)
 * - 408 Request Timeout: NonTransient by default; add to onHttpCodes to retry
 * - 429 Too Many Requests: NonTransient by default; add to onHttpCodes to retry with backoff
 * - 500 Internal Server Error: Transient (temporary server issue)
 * - 502 Bad Gateway: Transient (temporary proxy issue)
 * - 503 Service Unavailable: Transient (temporary unavailability)
 * - 504 Gateway Timeout: Transient (timeout, can retry)
 * 
 * Custom Configuration Examples:
 * - Retry on specific 4xx: onHttpCodes=408,429
 * - Never retry auth errors: excludeOnHttpCodes=401,403
 * - Retry all 4xx: onClientErrors=true
 */
@Bean
@ConditionalOnMissingBean
ResponseErrorHandler responseErrorHandler(SpringAiRetryProperties properties) {
    return new ResponseErrorHandler() {
        @Override
        public boolean hasError(ClientHttpResponse response) throws IOException {
            return response.getStatusCode().isError();
        }
        
        @Override
        public void handleError(ClientHttpResponse response) throws IOException {
            int statusCode = response.getStatusCode().value();
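            // Decode with an explicit charset instead of the platform default
            String responseBody = new String(response.getBody().readAllBytes(),
                    java.nio.charset.StandardCharsets.UTF_8);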
            
            // Check exclude list first
            if (properties.getExcludeOnHttpCodes().contains(statusCode)) {
                throw new NonTransientAiException(
                    String.format("HTTP %d (excluded from retry): %s", 
                                statusCode, responseBody)
                );
            }
            
            // Check explicit retry list
            if (properties.getOnHttpCodes().contains(statusCode)) {
                throw new TransientAiException(
                    String.format("HTTP %d (will retry): %s", 
                                statusCode, responseBody)
                );
            }
            
            // Handle 4xx errors
            if (statusCode >= 400 && statusCode < 500) {
                if (properties.isOnClientErrors()) {
                    throw new TransientAiException(
                        String.format("HTTP %d (client error, will retry): %s", 
                                    statusCode, responseBody)
                    );
                } else {
                    throw new NonTransientAiException(
                        String.format("HTTP %d (client error, won't retry): %s", 
                                    statusCode, responseBody)
                    );
                }
            }
            
            // Handle 5xx errors (always transient)
            if (statusCode >= 500) {
                throw new TransientAiException(
                    String.format("HTTP %d (server error, will retry): %s", 
                                statusCode, responseBody)
                );
            }
        }
    };
}
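
The precedence of the classification rules can be checked in isolation. The following is a minimal sketch in plain Java, assuming the five-step order listed in the Javadoc above; `ErrorClassifier`, `Kind`, and `classify` are hypothetical names for this example and are not part of Spring AI.

```java
import java.util.List;

public class ErrorClassifier {

    public enum Kind { TRANSIENT, NON_TRANSIENT, NOT_AN_ERROR }

    /** Applies the precedence: exclude list, retry list, onClientErrors flag, then 4xx/5xx defaults. */
    public static Kind classify(int status, boolean onClientErrors,
                                List<Integer> onHttpCodes, List<Integer> excludeOnHttpCodes) {
        if (status < 400) {
            return Kind.NOT_AN_ERROR;
        }
        if (excludeOnHttpCodes.contains(status)) {
            return Kind.NON_TRANSIENT;          // exclude list wins over everything
        }
        if (onHttpCodes.contains(status)) {
            return Kind.TRANSIENT;              // explicit retry list
        }
        if (status < 500) {
            return onClientErrors ? Kind.TRANSIENT : Kind.NON_TRANSIENT;
        }
        return Kind.TRANSIENT;                  // 5xx defaults to retryable
    }

    public static void main(String[] args) {
        List<Integer> none = List.of();
        System.out.println(classify(429, false, none, none));          // NON_TRANSIENT
        System.out.println(classify(429, false, List.of(429), none));  // TRANSIENT
        System.out.println(classify(503, false, none, List.of(503)));  // NON_TRANSIENT
    }
}
```

The first call shows a consequence that is easy to miss: with default settings a 429 is a client error and is not retried unless it is listed in `onHttpCodes` or `onClientErrors` is enabled.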

Usage Example:

import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.stereotype.Component;

@Component
public class AiRestClient {
    private final RestTemplate restTemplate;
    
    public AiRestClient(ResponseErrorHandler errorHandler) {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setErrorHandler(errorHandler);
    }
    
    public String callAiApi(String endpoint, Object request) {
        // Error handler automatically classifies failures
        // Throws TransientAiException or NonTransientAiException
        return restTemplate.postForObject(endpoint, request, String.class);
    }
}

Exception Types

Spring AI provides two exception types for error classification with clear semantics for retry behavior.

/**
 * Exception indicating a transient (retryable) AI operation failure
 * 
 * Thrown for temporary errors that may succeed on retry:
 * - Rate limits (HTTP 429): Service is temporarily throttling requests (requires 429 in onHttpCodes or onClientErrors=true)
 * - Timeouts (HTTP 408, 504): Request took too long, may succeed if retried (408 requires the same configuration as 429)
 * - Server errors (HTTP 5xx): Temporary server issues
 * - Network errors: Connection failures, DNS issues
 * - Resource exhaustion: Temporary capacity issues
 * 
 * Retry Behavior:
 * - Will be automatically retried according to retry policy
 * - Uses exponential backoff between attempts
 * - Stops after max attempts reached
 * 
 * Best Practices:
 * - Use for errors that are likely to resolve with time
 * - Include original cause for debugging
 * - Log retry attempts for monitoring
 */
class TransientAiException extends RuntimeException {
    /**
     * Create exception with message
     * @param message Description of the transient failure
     */
    public TransientAiException(String message);
    
    /**
     * Create exception with message and cause
     * @param message Description of the transient failure
     * @param cause Original exception that caused the failure
     */
    public TransientAiException(String message, Throwable cause);
}

/**
 * Exception indicating a non-transient (permanent) AI operation failure
 * 
 * Thrown for permanent errors that will not succeed on retry:
 * - Invalid API key (HTTP 401): Credentials are wrong
 * - Forbidden (HTTP 403): Insufficient permissions
 * - Bad request (HTTP 400): Invalid request format or parameters
 * - Not found (HTTP 404): Invalid endpoint or resource
 * - Invalid model: Model name doesn't exist
 * - Quota exceeded: Account limits reached
 * - Content policy violation: Request violates provider policies
 * 
 * Retry Behavior:
 * - Will NOT be retried - fails immediately
 * - Allows fast failure for permanent issues
 * - Prevents wasting retry attempts on unfixable errors
 * 
 * Best Practices:
 * - Use for errors that require user intervention
 * - Provide clear error messages for troubleshooting
 * - Include specific details about what needs to be fixed
 */
class NonTransientAiException extends RuntimeException {
    /**
     * Create exception with message
     * @param message Description of the permanent failure
     */
    public NonTransientAiException(String message);
    
    /**
     * Create exception with message and cause
     * @param message Description of the permanent failure
     * @param cause Original exception that caused the failure
     */
    public NonTransientAiException(String message, Throwable cause);
}
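
The difference in retry semantics between the two exception types can be illustrated without Spring. This is a sketch under stated assumptions: `TransientFailure` and `PermanentFailure` are hypothetical stand-ins for `TransientAiException` and `NonTransientAiException`, and `callWithRetry` is a simplified loop that omits the backoff sleep.

```java
import java.util.function.Supplier;

public class RetrySemanticsSketch {

    // Hypothetical stand-ins for the Spring AI exception types
    public static class TransientFailure extends RuntimeException {
        public TransientFailure(String message) { super(message); }
    }

    public static class PermanentFailure extends RuntimeException {
        public PermanentFailure(String message) { super(message); }
    }

    /** Calls the operation up to maxAttempts times, retrying only TransientFailure. */
    public static <T> T callWithRetry(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (TransientFailure e) {
                last = e;   // retryable: loop again (a real policy would sleep here)
            }
            // PermanentFailure is not caught, so it propagates on its first occurrence
        }
        throw last;         // all attempts failed with a transient error
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientFailure("simulated 503");
            }
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```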

Exception Handling Example:

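import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

// ServiceUnavailableException and BadRequestException are application-defined exceptions
@Service
public class ChatService {
    private static final Logger log = LoggerFactory.getLogger(ChatService.class);
    private final ChatModel chatModel;
    private final RetryTemplate retryTemplate;

    public ChatService(ChatModel chatModel, RetryTemplate retryTemplate) {
        this.chatModel = chatModel;
        this.retryTemplate = retryTemplate;
    }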
    
    public String chat(String prompt) {
        try {
            return retryTemplate.execute(context -> {
                return chatModel.call(prompt);
            });
        } catch (TransientAiException e) {
            // All retries exhausted for a transient error
            // (RetryTemplate does not expose its policy, so the attempt count is not logged here)
            log.error("Transient failure after retries were exhausted: {}",
                     e.getMessage());
            throw new ServiceUnavailableException(
                "AI service is temporarily unavailable. Please try again later."
            );
        } catch (NonTransientAiException e) {
            // Permanent error - no retries attempted
            log.error("Non-transient failure: {}", e.getMessage());
            throw new BadRequestException(
                "Invalid request: " + e.getMessage()
            );
        }
    }
}

Configuration Properties

SpringAiRetryProperties

Configuration prefix: spring.ai.retry

/**
 * Configuration properties for Spring AI retry behavior
 * 
 * @ConfigurationProperties(prefix = "spring.ai.retry")
 */
class SpringAiRetryProperties {
    /**
     * Maximum number of attempts, including the initial call
     * 
     * Default: 10
     * Range: 1-100 (recommended)
     * 
     * Considerations:
     * - Higher values: More resilient but longer wait times
     * - Lower values: Faster failure but less resilient
     * - For rate limits: Use higher values (10-20)
     * - For auth errors: Use lower values (1-3)
     */
    private int maxAttempts = 10;
    
    /**
     * Whether to retry on 4xx client errors
     * 
     * Default: false
     * 
     * If false: 4xx errors throw NonTransientAiException (no retry)
     * If true: 4xx errors throw TransientAiException (will retry)
     * 
     * Use Cases:
     * - false: Most cases (4xx usually indicates client error)
     * - true: When 4xx might be transient (e.g., 429 rate limits)
     * 
     * Note: Specific codes can override this via onHttpCodes/excludeOnHttpCodes
     */
    private boolean onClientErrors = false;
    
    /**
     * HTTP status codes that should NOT trigger a retry
     * These codes will throw NonTransientAiException
     * 
     * Default: empty list
     * 
     * Common Values:
     * - 401: Unauthorized (invalid API key)
     * - 403: Forbidden (insufficient permissions)
     * - 400: Bad Request (invalid parameters)
     * - 404: Not Found (invalid endpoint)
     * 
     * Priority: Highest (overrides all other settings)
     */
    private List<Integer> excludeOnHttpCodes = new ArrayList<>();
    
    /**
     * HTTP status codes that SHOULD trigger a retry
     * These codes will throw TransientAiException
     * 
     * Default: empty list
     * 
     * Common Values:
     * - 429: Too Many Requests (rate limit)
     * - 408: Request Timeout
     * - 503: Service Unavailable
     * - 504: Gateway Timeout
     * 
     * Priority: High (overrides onClientErrors but not excludeOnHttpCodes)
     */
    private List<Integer> onHttpCodes = new ArrayList<>();
    
    /**
     * Exponential backoff configuration
     */
    private Backoff backoff = new Backoff();
    
    /**
     * Backoff configuration for retry attempts
     * Implements exponential backoff with configurable parameters
     */
    static class Backoff {
        /**
         * Initial sleep duration before first retry
         * 
         * Default: 2000ms (2 seconds)
         * Range: 100ms - 60000ms (recommended)
         * 
         * Considerations:
         * - Too low: May overwhelm rate-limited services
         * - Too high: Unnecessary delays for quick recoveries
         * - For rate limits: Use 2-5 seconds
         * - For network issues: Use 0.5-1 second
         */
        private Duration initialInterval = Duration.ofMillis(2000);
        
        /**
         * Multiplier for exponential backoff
         * 
         * Default: 5
         * Range: 2 - 10 (recommended; the field is an int, so fractional values are not accepted)
         * 
         * Formula: delay(n) = min(initialInterval * multiplier^(n - 1), maxInterval), n = retry number starting at 1
         * 
         * Examples with initialInterval=2s and the default maxInterval=180s:
         * - multiplier=2: 2s, 4s, 8s, 16s, 32s
         * - multiplier=5: 2s, 10s, 50s, 180s (250s capped at maxInterval)
         * - multiplier=10: 2s, 20s, 180s (200s capped at maxInterval)
         * 
         * Considerations:
         * - Higher values: Faster backoff growth, fewer retries in short time
         * - Lower values: Slower backoff growth, more retries in short time
         * - For rate limits: Use higher values (3-5)
         * - For network issues: Use lower values (2)
         */
        private int multiplier = 5;
        
        /**
         * Maximum backoff duration
         * 
         * Default: 180000ms (3 minutes)
         * Range: 10000ms - 600000ms (recommended)
         * 
         * Purpose: Caps exponential growth to prevent extremely long waits
         * 
         * Considerations:
         * - Too low: May not give service enough time to recover
         * - Too high: User may wait too long for response
         * - For user-facing APIs: Use 30-60 seconds
         * - For background jobs: Use 3-10 minutes
         */
        private Duration maxInterval = Duration.ofMillis(180000);
    }
}

Configuration Examples

Basic Retry Configuration:

# application.properties

# Set maximum retry attempts
spring.ai.retry.max-attempts=5

# Configure exponential backoff
spring.ai.retry.backoff.initial-interval=1000ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=60000ms

# Result: 5 total attempts with at most 4 delays between them: 1s, 2s, 4s, 8s

Advanced Error Handling:

# Retry on specific 4xx errors (rate limits and timeouts)
spring.ai.retry.on-client-errors=false
spring.ai.retry.on-http-codes=429,408

# Never retry on authentication/authorization errors
spring.ai.retry.exclude-on-http-codes=401,403

# More aggressive retry for transient failures
spring.ai.retry.max-attempts=15
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=3

YAML Configuration:

# application.yml
spring:
  ai:
    retry:
      max-attempts: 8
      on-client-errors: false
      on-http-codes:
        - 429  # Rate limit
        - 503  # Service unavailable
        - 504  # Gateway timeout
      exclude-on-http-codes:
        - 401  # Unauthorized - don't retry
        - 403  # Forbidden - don't retry
        - 400  # Bad Request - don't retry
      backoff:
        initial-interval: 1s
        multiplier: 2
        max-interval: 30s

Production Configuration:

# Production settings - balanced resilience and performance
spring.ai.retry.max-attempts=10
spring.ai.retry.backoff.initial-interval=2s
spring.ai.retry.backoff.multiplier=3
spring.ai.retry.backoff.max-interval=120s

# Retry rate limits and server errors
spring.ai.retry.on-http-codes=429,500,502,503,504

# Never retry auth and validation errors
spring.ai.retry.exclude-on-http-codes=401,403,400,422

# Enable retry logging
logging.level.org.springframework.ai.retry=DEBUG

Development Configuration:

# Development settings - faster feedback
spring.ai.retry.max-attempts=3
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=10s

# Retry fewer errors for faster failure
spring.ai.retry.on-http-codes=429

Integration Examples

Using with RestTemplate

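import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.web.client.RestTemplate;

// ChatRequest and ChatResponse are application-defined DTOs, not Spring AI types
@Service
public class OpenAiClient {
    private static final Logger log = LoggerFactory.getLogger(OpenAiClient.class);
    private final RestTemplate restTemplate;
    private final RetryTemplate retryTemplate;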
    
    public OpenAiClient(ResponseErrorHandler errorHandler, 
                        RetryTemplate retryTemplate) {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setErrorHandler(errorHandler);
        this.retryTemplate = retryTemplate;
    }
    
    public String chat(String prompt) {
        return retryTemplate.execute(context -> {
            try {
                ChatRequest request = new ChatRequest(prompt);
                return restTemplate.postForObject(
                    "https://api.openai.com/v1/chat/completions",
                    request,
                    ChatResponse.class
                ).getContent();
            } catch (TransientAiException e) {
                // Will be retried automatically
                log.debug("Transient error on attempt {}: {}", 
                         context.getRetryCount(), e.getMessage());
                throw e;
            } catch (NonTransientAiException e) {
                // Won't be retried - permanent failure
                log.error("Non-transient error: {}", e.getMessage());
                throw e;
            }
        });
    }
}

Custom Retry Logic

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class CustomRetryService {
    private static final Logger log = LoggerFactory.getLogger(CustomRetryService.class);
    private final RetryTemplate retryTemplate;
    
    public CustomRetryService(RetryTemplate retryTemplate) {
        this.retryTemplate = retryTemplate;
    }
    
    public String callWithCustomRetry(String prompt) {
        return retryTemplate.execute(
            new RetryCallback<String, RuntimeException>() {
                @Override
                public String doWithRetry(RetryContext context) {
                    // getRetryCount() is the number of failed attempts so far
                    int attempts = context.getRetryCount();
                    log.info("Attempt {}", attempts + 1);
                    
                    // Add custom logic based on attempt number
                    if (attempts >= 5) {
                        // Use a different model after 5 failed attempts
                        return callBackupModel(prompt);
                    }
                    
                    // Your AI operation here
                    return performOperation(prompt);
                }
            },
            context -> {
                // Recovery callback - called after all retries exhausted
                log.error("All retries exhausted after {} attempts", 
                         context.getRetryCount());
                return "I apologize, but I'm unable to process your request at this time.";
            }
        );
    }
    
    private String performOperation(String prompt) {
        // AI operation
        return "result";
    }
    
    private String callBackupModel(String prompt) {
        // Fallback to different model
        return "backup result";
    }
}

Programmatic Configuration

import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.backoff.ExponentialBackOffPolicy;
import org.springframework.retry.listener.RetryListenerSupport;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

@Configuration
public class CustomRetryConfig {
    
    @Bean
    @ConditionalOnMissingBean
    public RetryTemplate customRetryTemplate() {
        RetryTemplate template = new RetryTemplate();
        
        // Configure retry policy
        SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
        retryPolicy.setMaxAttempts(3);
        template.setRetryPolicy(retryPolicy);
        
        // Configure backoff
        ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
        backOffPolicy.setInitialInterval(1000);
        backOffPolicy.setMultiplier(2.0);
        backOffPolicy.setMaxInterval(10000);
        template.setBackOffPolicy(backOffPolicy);
        
        // Add custom retry listener
        template.registerListener(new RetryListenerSupport() {
            @Override
            public <T, E extends Throwable> void onError(
                    RetryContext context,
                    RetryCallback<T, E> callback,
                    Throwable throwable) {
                // Custom logging or metrics ("metrics" stands for an application-provided helper)
                metrics.incrementCounter("ai.retry.attempts");
            }
        });
        
        return template;
    }
}

Retry with Circuit Breaker

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class ResilientAiService {
    private final RetryTemplate retryTemplate;
    private final CircuitBreaker circuitBreaker;
    private final ChatModel chatModel;
    
    public ResilientAiService(RetryTemplate retryTemplate,
                             CircuitBreakerRegistry circuitBreakerRegistry,
                             ChatModel chatModel) {
        this.retryTemplate = retryTemplate;
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("chatModel");
        this.chatModel = chatModel;
    }
    
    public String chat(String prompt) {
        // Combine retry with circuit breaker
        return circuitBreaker.executeSupplier(() -> 
            retryTemplate.execute(context -> 
                chatModel.call(prompt)
            )
        );
    }
}

Conditional Requirements

The retry module activates when:

Class Present: org.springframework.ai.retry.RetryUtils is on the classpath
No Conflicting Beans: Each bean is created only if the application does not define its own; a custom RetryTemplate or ResponseErrorHandler bean replaces the corresponding autoconfigured one

Common Use Cases

Rate Limit Handling

# Retry on 429 Too Many Requests with aggressive backoff
spring.ai.retry.on-http-codes=429
spring.ai.retry.max-attempts=10
spring.ai.retry.backoff.initial-interval=5s
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=120s

# Result: 10 total attempts with 9 delays: 5s, 10s, 20s, 40s, 80s, 120s, 120s, 120s, 120s

API Key Validation

# Don't retry on authentication errors - fail fast
spring.ai.retry.exclude-on-http-codes=401,403
spring.ai.retry.max-attempts=3
spring.ai.retry.backoff.initial-interval=1s

# Result: Only 3 attempts for non-auth errors, immediate failure for auth

High-Availability Setup

# Aggressive retry for critical operations
spring.ai.retry.max-attempts=20
spring.ai.retry.on-http-codes=429,500,502,503,504
spring.ai.retry.backoff.initial-interval=1s
# The multiplier is declared as an int in SpringAiRetryProperties, so use a whole number
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=60s

# Result: Many attempts with gradual backoff

Network Resilience

# Quick retries for network issues
spring.ai.retry.max-attempts=5
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=10s

# Result: 5 total attempts with 4 delays: 0.5s, 1s, 2s, 4s - fast recovery for transient network issues

Best Practices

1. Choose Appropriate Max Attempts

# User-facing APIs: Lower attempts for faster response
spring.ai.retry.max-attempts=5

# Background jobs: Higher attempts for better success rate
spring.ai.retry.max-attempts=20

# Critical operations: Very high attempts
spring.ai.retry.max-attempts=50

2. Configure Backoff Based on Error Type

# Rate limits: Longer initial interval and higher multiplier
spring.ai.retry.backoff.initial-interval=5s
spring.ai.retry.backoff.multiplier=3

# Network issues: Shorter initial interval
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2

3. Use Exclude List for Permanent Errors

# Never retry these - they require user intervention
spring.ai.retry.exclude-on-http-codes=400,401,403,404,422

4. Monitor Retry Metrics

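// Spring Retry reports failed attempts through RetryListener callbacks rather than
// application events. Register this listener with retryTemplate.registerListener(...).
// Assumes Spring Retry 2.x, where the other RetryListener methods have default implementations.
@Component
public class RetryMetrics implements RetryListener {
    private final MeterRegistry meterRegistry;
    
    public RetryMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }
    
    @Override
    public <T, E extends Throwable> void onError(
            RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {
        meterRegistry.counter("ai.retry.attempts",
            "exception", throwable.getClass().getSimpleName(),
            "attempt", String.valueOf(context.getRetryCount())
        ).increment();
    }
}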

5. Implement Fallback Strategies

public String chatWithFallback(String prompt) {
    try {
        return retryTemplate.execute(context -> 
            primaryModel.call(prompt)
        );
    } catch (Exception e) {
        log.warn("Primary model failed, using fallback");
        return fallbackModel.call(prompt);
    }
}

Troubleshooting

Issue: Too Many Retries

Problem: Operations retry too many times, causing long delays

Solution:

# Reduce max attempts
spring.ai.retry.max-attempts=5

# Reduce max interval
spring.ai.retry.backoff.max-interval=30s

Issue: Not Retrying When Expected

Problem: Operations fail without retrying

Diagnostic:

# Enable debug logging
logging.level.org.springframework.ai.retry=DEBUG
logging.level.org.springframework.retry=DEBUG

Common Causes:

Error is classified as NonTransient
Error code is in excludeOnHttpCodes
Custom RetryTemplate bean overriding autoconfiguration

Issue: Retrying Permanent Errors

Problem: Retrying errors that will never succeed

Solution:

# Add to exclude list
spring.ai.retry.exclude-on-http-codes=400,401,403,404

# Disable client error retry
spring.ai.retry.on-client-errors=false

Issue: Rate Limits Not Respected

Problem: Still hitting rate limits despite retries

Solution:

# Increase backoff for rate limits
spring.ai.retry.on-http-codes=429
spring.ai.retry.backoff.initial-interval=10s
spring.ai.retry.backoff.multiplier=5
spring.ai.retry.backoff.max-interval=300s

# Consider implementing request queuing

Performance Considerations

Memory Usage

RetryTemplate is stateless and thread-safe
Each retry attempt uses minimal memory
Consider memory impact of storing large responses during retries

Thread Blocking

Retry operations block the calling thread
For high-concurrency scenarios, consider:
- Async retry with CompletableFuture
- Reactive retry with Reactor
- Separate thread pool for retry operations
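
The first option in the list above, async retry with CompletableFuture, can be sketched with the JDK alone. This is a minimal illustration, not a Spring AI or Spring Retry API: `AsyncRetrySketch`, `retryAsync`, and `attempt` are hypothetical names, the sketch requires Java 9 or later for `CompletableFuture.delayedExecutor`, and for brevity it retries on any exception rather than only transient ones.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class AsyncRetrySketch {

    /**
     * Runs the operation off the calling thread and schedules each retry after a
     * delay instead of sleeping, so no thread is blocked while waiting.
     */
    public static <T> CompletableFuture<T> retryAsync(Supplier<T> operation, int maxAttempts,
                                                      long initialDelayMs, int multiplier,
                                                      long maxDelayMs) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(operation, 1, maxAttempts, initialDelayMs, multiplier, maxDelayMs, result);
        return result;
    }

    private static <T> void attempt(Supplier<T> operation, int attemptNo, int maxAttempts,
                                    long delayMs, int multiplier, long maxDelayMs,
                                    CompletableFuture<T> result) {
        CompletableFuture.supplyAsync(operation).whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attemptNo >= maxAttempts) {
                // supplyAsync wraps failures in CompletionException; unwrap when possible
                result.completeExceptionally(error.getCause() != null ? error.getCause() : error);
            } else {
                Executor delayed = CompletableFuture.delayedExecutor(
                        Math.min(delayMs, maxDelayMs), TimeUnit.MILLISECONDS);
                long nextDelay = Math.min(delayMs * multiplier, maxDelayMs);
                delayed.execute(() -> attempt(operation, attemptNo + 1, maxAttempts,
                        nextDelay, multiplier, maxDelayMs, result));
            }
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        String value = retryAsync(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("simulated transient failure");
            }
            return "ok";
        }, 5, 10, 2, 100).join();
        System.out.println(value + " after " + calls.get() + " calls"); // ok after 3 calls
    }
}
```

The caller receives a future immediately and can compose it further; `join()` is used in `main` only to keep the demo alive until the result arrives.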

Timeout Configuration

# Set timeouts to prevent hanging
spring.ai.mcp.client.request-timeout=30s
spring.ai.openai.chat.options.timeout=60s

# Timeouts apply to each attempt, so worst-case latency is roughly
# maxAttempts * timeout plus the sum of the backoff delays between attempts
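
To make the interaction between timeouts and retries concrete, the following plain-Java sketch estimates an upper bound on total latency. It assumes every attempt runs into the per-request timeout and every backoff delay is slept in full; `WorstCaseLatency` and `worstCase` are hypothetical names for this example.

```java
import java.time.Duration;

public class WorstCaseLatency {

    /** Upper bound: each attempt times out and each backoff delay is slept in full. */
    public static Duration worstCase(Duration requestTimeout, Duration initialInterval,
                                     int multiplier, Duration maxInterval, int maxAttempts) {
        long totalMs = requestTimeout.toMillis() * maxAttempts;
        long delayMs = initialInterval.toMillis();
        long maxMs = maxInterval.toMillis();
        // maxAttempts attempts are separated by maxAttempts - 1 backoff delays
        for (int retry = 1; retry < maxAttempts; retry++) {
            totalMs += Math.min(delayMs, maxMs);
            delayMs = Math.min(delayMs * multiplier, maxMs);
        }
        return Duration.ofMillis(totalMs);
    }

    public static void main(String[] args) {
        // 30s timeout, 1s initial interval, multiplier 2, 30s cap, 5 attempts
        Duration bound = worstCase(Duration.ofSeconds(30), Duration.ofSeconds(1),
                2, Duration.ofSeconds(30), 5);
        System.out.println(bound.toSeconds() + "s"); // 165s
    }
}
```

In this example the five attempts contribute 150s and the four delays (1s, 2s, 4s, 8s) contribute 15s, which is worth comparing against any upstream gateway or client timeout.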

Summary

The Spring AI Retry module provides production-ready retry capabilities with:

Intelligent error classification (transient vs non-transient)
Exponential backoff with configurable parameters
Flexible configuration via properties
Integration with Spring Boot error handling
Support for custom retry logic and recovery strategies

Key benefits:

Resilience: Automatic recovery from transient failures
Efficiency: Exponential backoff prevents overwhelming services
Flexibility: Highly configurable for different scenarios
Observability: Built-in logging and metrics support

Install with Tessl CLI

npx tessl i tessl/maven-org-springframework-ai--spring-ai-spring-boot-autoconfigure
