
tessl/maven-org-springframework-ai--spring-ai-spring-boot-autoconfigure

Spring AI Spring Boot Auto Configuration modules providing automatic setup for AI models, vector stores, MCP, and retry capabilities


Common - Retry Module

The Retry module autoconfigures retry support for AI operations that fail transiently. It classifies HTTP errors as transient (retryable) or non-transient (permanent) and retries transient failures with exponential backoff.

Maven Coordinates

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-autoconfigure-retry</artifactId>
    <version>1.1.2</version>
</dependency>

Capabilities

Spring AI Retry AutoConfiguration

Automatically configures retry mechanisms for AI operations with exponential backoff and intelligent error handling.

/**
 * Autoconfigures retry support for Spring AI operations
 * 
 * Conditional Requirements:
 * - @ConditionalOnClass: org.springframework.ai.retry.RetryUtils
 * 
 * Configuration Properties: spring.ai.retry.*
 * 
 * @AutoConfiguration
 * @ConditionalOnClass(RetryUtils.class)
 * @EnableConfigurationProperties(SpringAiRetryProperties.class)
 */
@AutoConfiguration
@ConditionalOnClass(RetryUtils.class)
@EnableConfigurationProperties(SpringAiRetryProperties.class)
class SpringAiRetryAutoConfiguration {
    // Bean definitions for retry infrastructure
}

Retry Template Bean

Creates a configured RetryTemplate for AI operations with exponential backoff and retry listeners.

/**
 * Provides retry template with exponential backoff
 * 
 * @Bean
 * @ConditionalOnMissingBean
 * @param properties Configuration properties for retry behavior
 * @return RetryTemplate configured for AI operations
 * 
 * Retry Behavior:
 * - Retries on TransientAiException: Rate limits, timeouts, 5xx errors
 * - Retries on ResourceAccessException: Network failures, connection issues
 * - Optionally retries on WebClientRequestException: If WebFlux present
 * - Exponential backoff with configurable intervals
 * - Includes retry listener for logging attempts
 * 
 * Backoff Formula (n = 1 for the first retry):
 * delay(n) = min(initialInterval * multiplier^(n - 1), maxInterval)
 * 
 * Example with defaults (initialInterval=2s, multiplier=5, maxInterval=180s):
 * - Before retry 1: 2s
 * - Before retry 2: 10s (2 * 5^1)
 * - Before retry 3: 50s (2 * 5^2)
 * - Before retry 4: 180s (2 * 5^3 = 250s, capped at maxInterval)
 */
@Bean
@ConditionalOnMissingBean
RetryTemplate retryTemplate(SpringAiRetryProperties properties) {
    RetryTemplate template = new RetryTemplate();
    
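    // Configure retry policy: retry only the exception types listed in the
    // Javadoc above. A bare SimpleRetryPolicy would retry every Exception,
    // including NonTransientAiException.
    Map<Class<? extends Throwable>, Boolean> retryable = new HashMap<>();
    retryable.put(TransientAiException.class, true);
    retryable.put(ResourceAccessException.class, true);
    template.setRetryPolicy(new SimpleRetryPolicy(properties.getMaxAttempts(), retryable));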
    
    // Configure exponential backoff
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(properties.getBackoff().getInitialInterval().toMillis());
    backOffPolicy.setMultiplier(properties.getBackoff().getMultiplier());
    backOffPolicy.setMaxInterval(properties.getBackoff().getMaxInterval().toMillis());
    template.setBackOffPolicy(backOffPolicy);
    
    // Add retry listener for logging
    template.registerListener(new RetryListenerSupport() {
        @Override
        public <T, E extends Throwable> void onError(
                RetryContext context, 
                RetryCallback<T, E> callback, 
                Throwable throwable) {
            log.warn("Retry attempt {} failed: {}", 
                    context.getRetryCount(), 
                    throwable.getMessage());
        }
    });
    
    return template;
}
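The backoff formula in the Javadoc above can be checked with a few lines of plain Java. This is a minimal sketch of the arithmetic only, not Spring Retry's implementation; the class name `BackoffSchedule` and the method `delays` are hypothetical, and the sketch assumes that `maxAttempts` counts the initial call, so there is one delay fewer than there are attempts.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class BackoffSchedule {

    /**
     * Returns the delays slept between attempts. maxAttempts counts the
     * initial call, so there are at most maxAttempts - 1 delays.
     */
    public static List<Duration> delays(int maxAttempts, Duration initial,
                                        double multiplier, Duration max) {
        List<Duration> result = new ArrayList<>();
        double nextMillis = initial.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            // Cap each delay at maxInterval before recording it
            long capped = (long) Math.min(nextMillis, max.toMillis());
            result.add(Duration.ofMillis(capped));
            nextMillis = capped * multiplier;
        }
        return result;
    }

    public static void main(String[] args) {
        // Documented defaults (2s, x5, 3min cap), limited to 5 attempts for brevity
        System.out.println(delays(5, Duration.ofSeconds(2), 5, Duration.ofMinutes(3)));
    }
}
```

With the documented defaults and five attempts, the sketch yields four delays: 2s, 10s, 50s, and 180s (the uncapped fourth value would be 250s).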

Usage Example:

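import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class ResilientAiService {
    private static final Logger log = LoggerFactory.getLogger(ResilientAiService.class);
    private final RetryTemplate retryTemplate;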
    private final ChatModel chatModel;
    
    public ResilientAiService(RetryTemplate retryTemplate, ChatModel chatModel) {
        this.retryTemplate = retryTemplate;
        this.chatModel = chatModel;
    }
    
    public String callAiWithRetry(String prompt) {
        return retryTemplate.execute(context -> {
            // This operation will be retried on transient failures
            return chatModel.call(prompt);
        });
    }
    
    public String callAiWithRetryAndRecovery(String prompt) {
        return retryTemplate.execute(
            context -> {
                // Main operation
                return chatModel.call(prompt);
            },
            context -> {
                // Recovery callback - called after all retries exhausted
                log.error("All retries exhausted for prompt: {}", prompt);
                return "I apologize, but I'm currently unable to process your request. Please try again later.";
            }
        );
    }
}

Response Error Handler Bean

Creates a ResponseErrorHandler that classifies HTTP errors as transient or non-transient based on status codes and configuration.

/**
 * Handles HTTP response errors for AI operations
 * 
 * @Bean
 * @ConditionalOnMissingBean
 * @param properties Configuration for error classification
 * @return ResponseErrorHandler that throws appropriate exceptions
 * 
 * Error Classification Logic:
 * 1. Check excludeOnHttpCodes: If matched -> NonTransientAiException
 * 2. Check onHttpCodes: If matched -> TransientAiException
 * 3. Check onClientErrors: If true, 4xx -> TransientAiException
 * 4. Default 4xx -> NonTransientAiException
 * 5. Default 5xx -> TransientAiException
 * 
 * HTTP Status Code Handling:
 * - 400 Bad Request: NonTransient (invalid request format)
 * - 401 Unauthorized: NonTransient (invalid API key)
 * - 403 Forbidden: NonTransient (insufficient permissions)
 * - 404 Not Found: NonTransient (invalid endpoint)
 * - 408 Request Timeout: NonTransient by default (4xx); list in onHttpCodes to retry
 * - 429 Too Many Requests: NonTransient by default (4xx); list in onHttpCodes to retry with backoff
 * - 500 Internal Server Error: Transient (temporary server issue)
 * - 502 Bad Gateway: Transient (temporary proxy issue)
 * - 503 Service Unavailable: Transient (temporary unavailability)
 * - 504 Gateway Timeout: Transient (timeout, can retry)
 * 
 * Custom Configuration Examples:
 * - Retry on specific 4xx: onHttpCodes=408,429
 * - Never retry auth errors: excludeOnHttpCodes=401,403
 * - Retry all 4xx: onClientErrors=true
 */
@Bean
@ConditionalOnMissingBean
ResponseErrorHandler responseErrorHandler(SpringAiRetryProperties properties) {
    return new ResponseErrorHandler() {
        @Override
        public boolean hasError(ClientHttpResponse response) throws IOException {
            return response.getStatusCode().isError();
        }
        
        @Override
        public void handleError(ClientHttpResponse response) throws IOException {
            int statusCode = response.getStatusCode().value();
            String responseBody = new String(response.getBody().readAllBytes(), StandardCharsets.UTF_8);
            
            // Check exclude list first
            if (properties.getExcludeOnHttpCodes().contains(statusCode)) {
                throw new NonTransientAiException(
                    String.format("HTTP %d (excluded from retry): %s", 
                                statusCode, responseBody)
                );
            }
            
            // Check explicit retry list
            if (properties.getOnHttpCodes().contains(statusCode)) {
                throw new TransientAiException(
                    String.format("HTTP %d (will retry): %s", 
                                statusCode, responseBody)
                );
            }
            
            // Handle 4xx errors
            if (statusCode >= 400 && statusCode < 500) {
                if (properties.isOnClientErrors()) {
                    throw new TransientAiException(
                        String.format("HTTP %d (client error, will retry): %s", 
                                    statusCode, responseBody)
                    );
                } else {
                    throw new NonTransientAiException(
                        String.format("HTTP %d (client error, won't retry): %s", 
                                    statusCode, responseBody)
                    );
                }
            }
            
            // Handle 5xx errors (transient unless excluded above)
            if (statusCode >= 500) {
                throw new TransientAiException(
                    String.format("HTTP %d (server error, will retry): %s", 
                                statusCode, responseBody)
                );
            }
        }
    };
}
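The five-step classification order can be expressed as a small pure function, which makes the precedence rules easy to test without an HTTP client. This is a sketch of the logic as documented above, not the handler's actual source; `ErrorClassifier`, `Kind`, and `classify` are hypothetical names, and the function assumes the status code is already known to be an error (400 or above).

```java
import java.util.Set;

public class ErrorClassifier {

    public enum Kind { TRANSIENT, NON_TRANSIENT }

    /** Classifies an error status (>= 400) using the documented precedence. */
    public static Kind classify(int status, boolean onClientErrors,
                                Set<Integer> excludeOnHttpCodes,
                                Set<Integer> onHttpCodes) {
        // Step 1: the exclude list wins over everything else
        if (excludeOnHttpCodes.contains(status)) {
            return Kind.NON_TRANSIENT;
        }
        // Step 2: explicitly listed codes are retried
        if (onHttpCodes.contains(status)) {
            return Kind.TRANSIENT;
        }
        // Steps 3 and 4: remaining 4xx codes follow the onClientErrors flag
        if (status >= 400 && status < 500) {
            return onClientErrors ? Kind.TRANSIENT : Kind.NON_TRANSIENT;
        }
        // Step 5: remaining codes (5xx) are retried
        return Kind.TRANSIENT;
    }

    public static void main(String[] args) {
        Set<Integer> none = Set.of();
        System.out.println(classify(429, false, none, none));        // NON_TRANSIENT
        System.out.println(classify(429, false, none, Set.of(429))); // TRANSIENT
        System.out.println(classify(503, false, Set.of(503), none)); // NON_TRANSIENT
        System.out.println(classify(500, false, none, none));        // TRANSIENT
    }
}
```

The first two calls show a consequence of this ordering that is easy to miss: with default settings a 429 is treated like any other 4xx and is not retried until it is added to `onHttpCodes`.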

Usage Example:

import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.stereotype.Component;

@Component
public class AiRestClient {
    private final RestTemplate restTemplate;
    
    public AiRestClient(ResponseErrorHandler errorHandler) {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setErrorHandler(errorHandler);
    }
    
    public String callAiApi(String endpoint, Object request) {
        // Error handler automatically classifies failures
        // Throws TransientAiException or NonTransientAiException
        return restTemplate.postForObject(endpoint, request, String.class);
    }
}

Exception Types

Spring AI provides two exception types for error classification with clear semantics for retry behavior.

/**
 * Exception indicating a transient (retryable) AI operation failure
 * 
 * Thrown for temporary errors that may succeed on retry:
 * - Server errors (HTTP 5xx, including 504 timeouts): Temporary server issues
 * - Rate limits (HTTP 429) and request timeouts (HTTP 408): Only when listed in
 *   onHttpCodes or when onClientErrors=true; otherwise 4xx is non-transient
 * - Resource exhaustion: Temporary capacity issues
 * 
 * Network errors (connection failures, DNS issues) surface as
 * ResourceAccessException, which the retry template also retries.
 * 
 * Retry Behavior:
 * - Will be automatically retried according to retry policy
 * - Uses exponential backoff between attempts
 * - Stops after max attempts reached
 * 
 * Best Practices:
 * - Use for errors that are likely to resolve with time
 * - Include original cause for debugging
 * - Log retry attempts for monitoring
 */
class TransientAiException extends RuntimeException {
    /**
     * Create exception with message
     * @param message Description of the transient failure
     */
    public TransientAiException(String message);
    
    /**
     * Create exception with message and cause
     * @param message Description of the transient failure
     * @param cause Original exception that caused the failure
     */
    public TransientAiException(String message, Throwable cause);
}

/**
 * Exception indicating a non-transient (permanent) AI operation failure
 * 
 * Thrown for permanent errors that will not succeed on retry:
 * - Invalid API key (HTTP 401): Credentials are wrong
 * - Forbidden (HTTP 403): Insufficient permissions
 * - Bad request (HTTP 400): Invalid request format or parameters
 * - Not found (HTTP 404): Invalid endpoint or resource
 * - Invalid model: Model name doesn't exist
 * - Quota exceeded: Account limits reached
 * - Content policy violation: Request violates provider policies
 * 
 * Retry Behavior:
 * - Will NOT be retried - fails immediately
 * - Allows fast failure for permanent issues
 * - Prevents wasting retry attempts on unfixable errors
 * 
 * Best Practices:
 * - Use for errors that require user intervention
 * - Provide clear error messages for troubleshooting
 * - Include specific details about what needs to be fixed
 */
class NonTransientAiException extends RuntimeException {
    /**
     * Create exception with message
     * @param message Description of the permanent failure
     */
    public NonTransientAiException(String message);
    
    /**
     * Create exception with message and cause
     * @param message Description of the permanent failure
     * @param cause Original exception that caused the failure
     */
    public NonTransientAiException(String message, Throwable cause);
}

Exception Handling Example:

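import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

// ServiceUnavailableException and BadRequestException are application-defined
// unchecked exceptions, not part of Spring AI.
@Service
public class ChatService {
    private static final Logger log = LoggerFactory.getLogger(ChatService.class);
    private final ChatModel chatModel;
    private final RetryTemplate retryTemplate;

    public ChatService(ChatModel chatModel, RetryTemplate retryTemplate) {
        this.chatModel = chatModel;
        this.retryTemplate = retryTemplate;
    }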
    
    public String chat(String prompt) {
        try {
            return retryTemplate.execute(context -> {
                return chatModel.call(prompt);
            });
        } catch (TransientAiException e) {
            // All retries exhausted for a transient error
            log.error("Transient failure after retries were exhausted: {}",
                     e.getMessage());
            throw new ServiceUnavailableException(
                "AI service is temporarily unavailable. Please try again later."
            );
        } catch (NonTransientAiException e) {
            // Permanent error - no retries attempted
            log.error("Non-transient failure: {}", e.getMessage());
            throw new BadRequestException(
                "Invalid request: " + e.getMessage()
            );
        }
    }
}
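The difference between the two exception types comes down to what the retry loop does with them. The following sketch reproduces that behaviour in plain Java, without Spring Retry and without backoff; `RetrySemantics`, `TransientFailure`, and `PermanentFailure` are hypothetical stand-ins for the retry template and for `TransientAiException` and `NonTransientAiException`.

```java
import java.util.function.Supplier;

public class RetrySemantics {

    // Hypothetical stand-in for TransientAiException
    public static class TransientFailure extends RuntimeException {
        public TransientFailure(String message) { super(message); }
    }

    // Hypothetical stand-in for NonTransientAiException
    public static class PermanentFailure extends RuntimeException {
        public PermanentFailure(String message) { super(message); }
    }

    /** Calls op up to maxAttempts times; only TransientFailure triggers another attempt. */
    public static <T> T callWithRetry(Supplier<T> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (TransientFailure e) {
                last = e; // retryable: try again (a real policy would back off here)
            }
            // PermanentFailure is not caught, so it propagates on the first attempt
        }
        throw last; // attempts exhausted: rethrow the last transient failure
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String reply = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientFailure("HTTP 503");
            }
            return "ok";
        }, 5);
        System.out.println(reply + " after " + calls[0] + " calls");
    }
}
```

In this sketch a call that fails twice with a transient error succeeds on the third attempt, while a permanent error reaches the caller after exactly one call, which is why the two `catch` blocks above describe different situations.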

Configuration Properties

SpringAiRetryProperties

Configuration prefix: spring.ai.retry

/**
 * Configuration properties for Spring AI retry behavior
 * 
 * @ConfigurationProperties(prefix = "spring.ai.retry")
 */
class SpringAiRetryProperties {
    /**
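     * Maximum number of attempts, including the initial call
     * (maxAttempts=10 means 1 initial call plus up to 9 retries)
     * 
     * Default: 10
     * Range: 1-100 (recommended)
     * 
     * Considerations:
     * - Higher values: More resilient but longer wait times
     * - Lower values: Faster failure but less resilient
     * - For rate limits: Use higher values (10-20)
     * - Applies to every retried error; non-transient errors such as
     *   401/403 are never retried, whatever this value is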
     */
    private int maxAttempts = 10;
    
    /**
     * Whether to retry on 4xx client errors
     * 
     * Default: false
     * 
     * If false: 4xx errors throw NonTransientAiException (no retry)
     * If true: 4xx errors throw TransientAiException (will retry)
     * 
     * Use Cases:
     * - false: Most cases (4xx usually indicates client error)
     * - true: When 4xx might be transient (e.g., 429 rate limits)
     * 
     * Note: Specific codes can override this via onHttpCodes/excludeOnHttpCodes
     */
    private boolean onClientErrors = false;
    
    /**
     * HTTP status codes that should NOT trigger a retry
     * These codes will throw NonTransientAiException
     * 
     * Default: empty list
     * 
     * Common Values:
     * - 401: Unauthorized (invalid API key)
     * - 403: Forbidden (insufficient permissions)
     * - 400: Bad Request (invalid parameters)
     * - 404: Not Found (invalid endpoint)
     * 
     * Priority: Highest (overrides all other settings)
     */
    private List<Integer> excludeOnHttpCodes = new ArrayList<>();
    
    /**
     * HTTP status codes that SHOULD trigger a retry
     * These codes will throw TransientAiException
     * 
     * Default: empty list
     * 
     * Common Values:
     * - 429: Too Many Requests (rate limit)
     * - 408: Request Timeout
     * - 503: Service Unavailable
     * - 504: Gateway Timeout
     * 
     * Priority: High (overrides onClientErrors but not excludeOnHttpCodes)
     */
    private List<Integer> onHttpCodes = new ArrayList<>();
    
    /**
     * Exponential backoff configuration
     */
    private Backoff backoff = new Backoff();
    
    /**
     * Backoff configuration for retry attempts
     * Implements exponential backoff with configurable parameters
     */
    static class Backoff {
        /**
         * Initial sleep duration before first retry
         * 
         * Default: 2000ms (2 seconds)
         * Range: 100ms - 60000ms (recommended)
         * 
         * Considerations:
         * - Too low: May overwhelm rate-limited services
         * - Too high: Unnecessary delays for quick recoveries
         * - For rate limits: Use 2-5 seconds
         * - For network issues: Use 0.5-1 second
         */
        private Duration initialInterval = Duration.ofMillis(2000);
        
        /**
         * Multiplier for exponential backoff
         * 
         * Default: 5
         * Range: 2 - 10 (recommended; whole numbers only, since the field is an int)
         * 
         * Formula (n = 1 for the first retry):
         * delay(n) = min(initialInterval * multiplier^(n - 1), maxInterval)
         * 
         * Examples with initialInterval=2s and maxInterval=180s:
         * - multiplier=2: 2s, 4s, 8s, 16s, 32s
         * - multiplier=5: 2s, 10s, 50s, 180s (250s capped at maxInterval)
         * - multiplier=10: 2s, 20s, 180s (200s capped at maxInterval)
         * 
         * Considerations:
         * - Higher values: Faster backoff growth, fewer retries in short time
         * - Lower values: Slower backoff growth, more retries in short time
         * - For rate limits: Use higher values (3-5)
         * - For network issues: Use lower values (2)
         */
        private int multiplier = 5;
        
        /**
         * Maximum backoff duration
         * 
         * Default: 180000ms (3 minutes)
         * Range: 10000ms - 600000ms (recommended)
         * 
         * Purpose: Caps exponential growth to prevent extremely long waits
         * 
         * Considerations:
         * - Too low: May not give service enough time to recover
         * - Too high: User may wait too long for response
         * - For user-facing APIs: Use 30-60 seconds
         * - For background jobs: Use 3-10 minutes
         */
        private Duration maxInterval = Duration.ofMillis(180000);
    }
}
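Because these defaults combine ten attempts with a steep multiplier, it helps to know how long a call can spend sleeping before it finally fails. The sketch below sums the backoff delays for a given configuration; `WorstCaseBackoff` and `totalBackoffMillis` are hypothetical names, and the calculation assumes every attempt fails and ignores the time spent inside each request.

```java
public class WorstCaseBackoff {

    /** Sum of all backoff delays, in milliseconds, if every attempt fails. */
    public static long totalBackoffMillis(int maxAttempts, long initialMillis,
                                          int multiplier, long maxMillis) {
        long total = 0;
        long delay = Math.min(initialMillis, maxMillis);
        // maxAttempts includes the initial call, so there are maxAttempts - 1 delays
        for (int retry = 1; retry < maxAttempts; retry++) {
            total += delay;
            delay = Math.min(delay * multiplier, maxMillis);
        }
        return total;
    }

    public static void main(String[] args) {
        // Documented defaults: 10 attempts, 2s initial, multiplier 5, 180s cap
        long millis = totalBackoffMillis(10, 2_000, 5, 180_000);
        System.out.println(millis / 1000 + " seconds");
    }
}
```

For the documented defaults the sum is 2 + 10 + 50 + 6 × 180 = 1142 seconds, roughly 19 minutes of blocking on the calling thread, which is worth weighing against the latency a caller can tolerate.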

Configuration Examples

Basic Retry Configuration:

# application.properties

# Set maximum retry attempts
spring.ai.retry.max-attempts=5

# Configure exponential backoff
spring.ai.retry.backoff.initial-interval=1000ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=60000ms

# Result: 5 attempts in total (1 initial call + 4 retries), with delays of 1s, 2s, 4s, 8s

Advanced Error Handling:

# Retry on specific 4xx errors (rate limits and timeouts)
spring.ai.retry.on-client-errors=false
spring.ai.retry.on-http-codes=429,408

# Never retry on authentication/authorization errors
spring.ai.retry.exclude-on-http-codes=401,403

# More aggressive retry for transient failures
spring.ai.retry.max-attempts=15
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=3

YAML Configuration:

# application.yml
spring:
  ai:
    retry:
      max-attempts: 8
      on-client-errors: false
      on-http-codes:
        - 429  # Rate limit
        - 503  # Service unavailable
        - 504  # Gateway timeout
      exclude-on-http-codes:
        - 401  # Unauthorized - don't retry
        - 403  # Forbidden - don't retry
        - 400  # Bad Request - don't retry
      backoff:
        initial-interval: 1s
        multiplier: 2
        max-interval: 30s

Production Configuration:

# Production settings - balanced resilience and performance
spring.ai.retry.max-attempts=10
spring.ai.retry.backoff.initial-interval=2s
spring.ai.retry.backoff.multiplier=3
spring.ai.retry.backoff.max-interval=120s

# Retry rate limits and server errors
spring.ai.retry.on-http-codes=429,500,502,503,504

# Never retry auth and validation errors
spring.ai.retry.exclude-on-http-codes=401,403,400,422

# Enable retry logging
logging.level.org.springframework.ai.retry=DEBUG

Development Configuration:

# Development settings - faster feedback
spring.ai.retry.max-attempts=3
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=10s

# Retry fewer errors for faster failure
spring.ai.retry.on-http-codes=429

Integration Examples

Using with RestTemplate

import org.springframework.ai.retry.TransientAiException;
import org.springframework.ai.retry.NonTransientAiException;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.web.client.RestTemplate;
import org.springframework.web.client.ResponseErrorHandler;

@Service
public class OpenAiClient {
    private final RestTemplate restTemplate;
    private final RetryTemplate retryTemplate;
    
    public OpenAiClient(ResponseErrorHandler errorHandler, 
                        RetryTemplate retryTemplate) {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setErrorHandler(errorHandler);
        this.retryTemplate = retryTemplate;
    }
    
    public String chat(String prompt) {
        return retryTemplate.execute(context -> {
            try {
                ChatRequest request = new ChatRequest(prompt);
                return restTemplate.postForObject(
                    "https://api.openai.com/v1/chat/completions",
                    request,
                    ChatResponse.class
                ).getContent();
            } catch (TransientAiException e) {
                // Will be retried automatically
                log.debug("Transient error on attempt {}: {}", 
                         context.getRetryCount(), e.getMessage());
                throw e;
            } catch (NonTransientAiException e) {
                // Won't be retried - permanent failure
                log.error("Non-transient error: {}", e.getMessage());
                throw e;
            }
        });
    }
}

Custom Retry Logic

import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class CustomRetryService {
    private final RetryTemplate retryTemplate;
    
    public CustomRetryService(RetryTemplate retryTemplate) {
        this.retryTemplate = retryTemplate;
    }
    
    public String callWithCustomRetry(String prompt) {
        return retryTemplate.execute(
            new RetryCallback<String, RuntimeException>() {
                @Override
                public String doWithRetry(RetryContext context) {
                    // getRetryCount() is the number of failed attempts so far
                    int failedAttempts = context.getRetryCount();
                    log.info("Attempt {}", failedAttempts + 1);
                    
                    // Add custom logic based on attempt number
                    if (failedAttempts >= 5) {
                        // Use a different model once 5 attempts have failed
                        return callBackupModel(prompt);
                    }
                    
                    // Your AI operation here
                    return performOperation(prompt);
                }
            },
            context -> {
                // Recovery callback - called after all retries exhausted
                log.error("All retries exhausted after {} attempts", 
                         context.getRetryCount());
                return "I apologize, but I'm unable to process your request at this time.";
            }
        );
    }
    
    private String performOperation(String prompt) {
        // AI operation
        return "result";
    }
    
    private String callBackupModel(String prompt) {
        // Fallback to different model
        return "backup result";
    }
}

Programmatic Configuration

import java.util.HashMap;
import java.util.Map;

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.backoff.ExponentialBackOffPolicy;
import org.springframework.retry.listener.RetryListenerSupport;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.web.client.ResourceAccessException;

@Configuration
public class CustomRetryConfig {
    
    // Defining this bean makes the autoconfigured RetryTemplate back off,
    // because that one is declared with @ConditionalOnMissingBean.
    @Bean
    public RetryTemplate customRetryTemplate(MeterRegistry meterRegistry) {
        RetryTemplate template = new RetryTemplate();
        
        // Retry only transient failures, 3 attempts in total
        Map<Class<? extends Throwable>, Boolean> retryable = new HashMap<>();
        retryable.put(TransientAiException.class, true);
        retryable.put(ResourceAccessException.class, true);
        template.setRetryPolicy(new SimpleRetryPolicy(3, retryable));
        
        // Configure backoff: 1s before the first retry, 2s before the second
        ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
        backOffPolicy.setInitialInterval(1000);
        backOffPolicy.setMultiplier(2.0);
        backOffPolicy.setMaxInterval(10000);
        template.setBackOffPolicy(backOffPolicy);
        
        // Count failed attempts (assumes Micrometer is on the classpath)
        template.registerListener(new RetryListenerSupport() {
            @Override
            public <T, E extends Throwable> void onError(
                    RetryContext context,
                    RetryCallback<T, E> callback,
                    Throwable throwable) {
                meterRegistry.counter("ai.retry.attempts").increment();
            }
        });
        
        return template;
    }
}

Retry with Circuit Breaker

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Service
public class ResilientAiService {
    private final RetryTemplate retryTemplate;
    private final CircuitBreaker circuitBreaker;
    private final ChatModel chatModel;
    
    public ResilientAiService(RetryTemplate retryTemplate,
                             CircuitBreakerRegistry circuitBreakerRegistry,
                             ChatModel chatModel) {
        this.retryTemplate = retryTemplate;
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("chatModel");
        this.chatModel = chatModel;
    }
    
    public String chat(String prompt) {
        // The circuit breaker wraps the retry, so a call that exhausts all
        // retries is recorded by the breaker as a single failure
        return circuitBreaker.executeSupplier(() -> 
            retryTemplate.execute(context -> 
                chatModel.call(prompt)
            )
        );
    }
}

Conditional Requirements

The retry module activates when:

  1. Class Present: org.springframework.ai.retry.RetryUtils is on the classpath
  2. No Conflicting Beans: each bean is created only if the application does not define its own; declaring a custom RetryTemplate or ResponseErrorHandler bean replaces the autoconfigured one

Common Use Cases

Rate Limit Handling

# Retry on 429 Too Many Requests with aggressive backoff
spring.ai.retry.on-http-codes=429
spring.ai.retry.max-attempts=10
spring.ai.retry.backoff.initial-interval=5s
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=120s

# Result: up to 9 retries with delays of 5s, 10s, 20s, 40s, 80s, 120s, 120s, 120s, 120s

API Key Validation

# Don't retry on authentication errors - fail fast
spring.ai.retry.exclude-on-http-codes=401,403
spring.ai.retry.max-attempts=3
spring.ai.retry.backoff.initial-interval=1s

# Result: At most 3 attempts for transient errors; 401 and 403 fail immediately

High-Availability Setup

# Aggressive retry for critical operations
spring.ai.retry.max-attempts=20
spring.ai.retry.on-http-codes=429,500,502,503,504
spring.ai.retry.backoff.initial-interval=1s
# multiplier is declared as an int, so a fractional value such as 1.5 cannot be bound
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=60s

# Result: Many attempts with gradual backoff

Network Resilience

# Quick retries for network issues
spring.ai.retry.max-attempts=5
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2
spring.ai.retry.backoff.max-interval=10s

# Result: delays of 0.5s, 1s, 2s, 4s between the 5 attempts - fast recovery for transient network issues

Best Practices

1. Choose Appropriate Max Attempts

# User-facing APIs: Lower attempts for faster response
spring.ai.retry.max-attempts=5

# Background jobs: Higher attempts for better success rate
spring.ai.retry.max-attempts=20

# Critical operations: Very high attempts
spring.ai.retry.max-attempts=50

2. Configure Backoff Based on Error Type

# Rate limits: Longer initial interval and higher multiplier
spring.ai.retry.backoff.initial-interval=5s
spring.ai.retry.backoff.multiplier=3

# Network issues: Shorter initial interval
spring.ai.retry.backoff.initial-interval=500ms
spring.ai.retry.backoff.multiplier=2

3. Use Exclude List for Permanent Errors

# Never retry these - they require user intervention
spring.ai.retry.exclude-on-http-codes=400,401,403,404,422

4. Monitor Retry Metrics

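// Register this listener with retryTemplate.registerListener(...) so that
// onError is invoked after each failed attempt
@Component
public class RetryMetrics extends RetryListenerSupport {
    private final MeterRegistry meterRegistry;
    
    public RetryMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }
    
    @Override
    public <T, E extends Throwable> void onError(
            RetryContext context,
            RetryCallback<T, E> callback,
            Throwable throwable) {
        meterRegistry.counter("ai.retry.attempts",
            "exception", throwable.getClass().getSimpleName(),
            "attempt", String.valueOf(context.getRetryCount())
        ).increment();
    }
}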

5. Implement Fallback Strategies

public String chatWithFallback(String prompt) {
    try {
        return retryTemplate.execute(context -> 
            primaryModel.call(prompt)
        );
    } catch (Exception e) {
        log.warn("Primary model failed, using fallback");
        return fallbackModel.call(prompt);
    }
}

Troubleshooting

Issue: Too Many Retries

Problem: Operations retry too many times, causing long delays

Solution:

# Reduce max attempts
spring.ai.retry.max-attempts=5

# Reduce max interval
spring.ai.retry.backoff.max-interval=30s

Issue: Not Retrying When Expected

Problem: Operations fail without retrying

Diagnostic:

# Enable debug logging
logging.level.org.springframework.ai.retry=DEBUG
logging.level.org.springframework.retry=DEBUG

Common Causes:

  1. Error is classified as NonTransient (by default this covers every 4xx status, 429 included, unless the code is listed in onHttpCodes)
  2. Error code is in excludeOnHttpCodes
  3. Custom RetryTemplate bean overriding autoconfiguration

Issue: Retrying Permanent Errors

Problem: Retrying errors that will never succeed

Solution:

# Add to exclude list
spring.ai.retry.exclude-on-http-codes=400,401,403,404

# Disable client error retry
spring.ai.retry.on-client-errors=false

Issue: Rate Limits Not Respected

Problem: Still hitting rate limits despite retries

Solution:

# Increase backoff for rate limits
spring.ai.retry.on-http-codes=429
spring.ai.retry.backoff.initial-interval=10s
spring.ai.retry.backoff.multiplier=5
spring.ai.retry.backoff.max-interval=300s

# Consider implementing request queuing
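Request queuing can be as simple as funnelling all outbound calls through a single worker so that only one request is in flight at a time. The following is a minimal sketch of that idea using only the JDK; `SerializedRequestQueue` and `enqueue` are hypothetical names, and a production setup would more likely add a bounded queue and a rate limit tuned to the provider's quota.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class SerializedRequestQueue implements AutoCloseable {

    // A single worker thread runs queued requests one at a time, in submission order
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a request and returns a future that completes with its result. */
    public <T> CompletableFuture<T> enqueue(Supplier<T> request) {
        return CompletableFuture.supplyAsync(request, worker);
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        try (SerializedRequestQueue queue = new SerializedRequestQueue()) {
            // Each supplier stands in for a retry-wrapped model call
            CompletableFuture<String> first = queue.enqueue(() -> "first");
            CompletableFuture<String> second = queue.enqueue(() -> "second");
            System.out.println(first.join() + ", " + second.join());
        }
    }
}
```

Serializing calls trades throughput for predictability: a request that is backing off delays everything queued behind it, so this approach suits background workloads better than interactive ones.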

Performance Considerations

Memory Usage

  • RetryTemplate is stateless and thread-safe
  • Each retry attempt uses minimal memory
  • Consider memory impact of storing large responses during retries

Thread Blocking

  • Retry operations block the calling thread
  • For high-concurrency scenarios, consider:
    • Async retry with CompletableFuture
    • Reactive retry with Reactor
    • Separate thread pool for retry operations
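The first and third options in the list above can be combined: hand the blocking, retry-wrapped call to a dedicated executor and return a `CompletableFuture` to the caller. This is a minimal JDK-only sketch; `AsyncRetry` and `callAsync` are hypothetical names, and the supplier stands in for something like `() -> retryTemplate.execute(ctx -> chatModel.call(prompt))`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class AsyncRetry {

    /** Runs a blocking, retry-wrapped call on the given executor instead of the caller's thread. */
    public static <T> CompletableFuture<T> callAsync(Supplier<T> blockingCall,
                                                     Executor retryExecutor) {
        return CompletableFuture.supplyAsync(blockingCall, retryExecutor);
    }

    public static void main(String[] args) {
        // Dedicated pool: backoff sleeps occupy these threads, not request threads
        ExecutorService retryPool = Executors.newFixedThreadPool(4);
        try {
            // Stub in place of a retry-wrapped model call
            CompletableFuture<String> reply = callAsync(() -> "stubbed reply", retryPool);
            System.out.println(reply.join());
        } finally {
            retryPool.shutdown();
        }
    }
}
```

The backoff sleeps still block a thread, only now it is a pool thread, so the pool size effectively caps how many calls can be retrying at once.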

Timeout Configuration

# Set timeouts to prevent hanging
spring.ai.mcp.client.request-timeout=30s
spring.ai.openai.chat.options.timeout=60s

# Timeouts apply to each attempt: worst-case latency is roughly
# maxAttempts * timeout plus the sum of the backoff delays

Summary

The Spring AI Retry module provides production-ready retry capabilities with:

  • Intelligent error classification (transient vs non-transient)
  • Exponential backoff with configurable parameters
  • Flexible configuration via properties
  • Integration with Spring Boot error handling
  • Support for custom retry logic and recovery strategies

Key benefits:

  • Resilience: Automatic recovery from transient failures
  • Efficiency: Exponential backoff prevents overwhelming services
  • Flexibility: Highly configurable for different scenarios
  • Observability: Built-in logging and metrics support
