tessl/maven-org-springframework-ai--spring-ai-azure-openai

Spring AI integration for Azure OpenAI services providing chat completion, text embeddings, image generation, and audio transcription with GPT, DALL-E, and Whisper models

Error Handling Reference

Name: tessl/maven-org-springframework-ai--spring-ai-azure-openai
Author: tessl

Complete guide to exception handling and error recovery.

Exception Hierarchy

// Azure SDK exceptions
com.azure.core.exception.HttpResponseException  // Base HTTP error
com.azure.core.exception.ResourceNotFoundException  // 404 errors (subclass of HttpResponseException)

// Spring AI exceptions
org.springframework.ai.retry.NonTransientAiException  // Permanent failures
org.springframework.ai.retry.TransientAiException  // Temporary failures

// Java exceptions
java.lang.IllegalArgumentException  // Invalid parameters
java.lang.NullPointerException  // Null required parameters

HTTP Status Codes

400 Bad Request

Causes:

Invalid parameters
Malformed request
Incompatible options
Content policy violation
Token limit exceeded
Invalid image size
Prompt too long

Example:

try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 400) {
        String errorMessage = e.getMessage();
        
        if (errorMessage.contains("content_policy_violation")) {
            // Handle content filter
        } else if (errorMessage.contains("maximum context length")) {
            // Handle token limit
        } else if (errorMessage.contains("invalid_image_size")) {
            // Handle invalid dimensions
        }
    } else {
        throw e;  // Not a 400: propagate unchanged instead of swallowing it
    }
}

401 Unauthorized

Causes:

Invalid API key
Expired credentials
Wrong endpoint

Example:

try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 401) {
        throw new AuthenticationException(
            "Invalid Azure OpenAI credentials. Check API key and endpoint.",
            e
        );
    }
    throw e;  // Not a 401: propagate unchanged instead of swallowing it
}

403 Forbidden

Causes:

Insufficient permissions
Quota exceeded
Content filter triggered
Region restrictions

Example:

try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 403) {
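        // block() returns null when the body is empty, so guard before inspecting it
        String errorBody = e.getResponse().getBodyAsString().block();
        
        if (errorBody != null && errorBody.contains("content_filter")) {
            throw new ContentFilterException("Content blocked by filter", e);
        } else if (errorBody != null && errorBody.contains("quota")) {
            throw new QuotaException("Quota exceeded", e);
        }
    }
    throw e;  // Not a recognized 403 case: propagate unchanged
}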

404 Not Found

Causes:

Deployment not found
Invalid deployment name
Wrong region

Example:

try {
    response = chatModel.call(prompt);
} catch (ResourceNotFoundException e) {
    throw new ConfigurationException(
        "Deployment '" + options.getDeploymentName() + "' not found. " +
        "Check Azure portal for valid deployment names.",
        e
    );
}

429 Too Many Requests

Causes:

Rate limit exceeded
Too many concurrent requests
Quota limits reached

Example:

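try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() != 429) {
        throw e;  // Not rate limiting: propagate unchanged
    }
    
    // Extract Retry-After header (seconds); default to 60 if absent or not a number
    String retryAfter = e.getResponse()
        .getHeaders()
        .getValue("Retry-After");
    
    long waitSeconds = 60;
    if (retryAfter != null) {
        try {
            waitSeconds = Long.parseLong(retryAfter.trim());
        } catch (NumberFormatException ignored) {
            // Header held an HTTP-date or other non-numeric value; keep the default
        }
    }
    
    // Thread.sleep throws InterruptedException; the enclosing method must declare or handle it
    Thread.sleep(waitSeconds * 1000L);
    response = chatModel.call(prompt);  // Single retry
}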
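The `Retry-After` header can carry either a number of seconds or an HTTP-date, so a parser that handles both forms is more robust than a bare integer parse. The following is a minimal sketch using only the JDK; `RetryAfterParser` and its `parse` method are hypothetical names, not part of Spring AI or the Azure SDK.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper (an assumption for illustration, not a library class). */
final class RetryAfterParser {

    private RetryAfterParser() {}

    /**
     * Converts a Retry-After header value into a wait duration.
     * Accepts delta-seconds ("30") or an HTTP-date ("Wed, 21 Oct 2015 07:28:00 GMT").
     * Returns the fallback when the value is missing or unparseable.
     */
    static Duration parse(String headerValue, ZonedDateTime now, Duration fallback) {
        if (headerValue == null || headerValue.isBlank()) {
            return fallback;
        }
        String value = headerValue.trim();

        // First form: a non-negative number of seconds
        try {
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Duration.ofSeconds(seconds) : fallback;
        } catch (NumberFormatException notSeconds) {
            // Not a number; try the date form next
        }

        // Second form: an HTTP-date, interpreted relative to "now"
        try {
            ZonedDateTime retryAt =
                ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME);
            Duration wait = Duration.between(now, retryAt);
            return wait.isNegative() ? Duration.ZERO : wait;
        } catch (DateTimeParseException notADate) {
            return fallback;
        }
    }
}
```

Passing `now` as a parameter keeps the function deterministic in tests; production code would typically pass `ZonedDateTime.now()`.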

500 Internal Server Error

Causes:

Azure service error
Temporary outage
Model overload

Example:

try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 500) {
        // Transient error - retry with backoff
        Thread.sleep(2000);
        response = chatModel.call(prompt);
    } else {
        throw e;  // Not a 500: propagate unchanged instead of swallowing it
    }
}

503 Service Unavailable

Causes:

Service temporarily down
Maintenance
Overload

Example:

try {
    response = chatModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 503) {
        // Service down - retry after delay
        Thread.sleep(5000);
        response = chatModel.call(prompt);
    } else {
        throw e;  // Not a 503: propagate unchanged instead of swallowing it
    }
}

Retry Patterns

Exponential Backoff

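// Thread.sleep throws the checked InterruptedException, so the method must declare it
public ChatResponse callWithExponentialBackoff(Prompt prompt) throws InterruptedException {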
    int maxRetries = 5;
    int baseDelayMs = 1000;
    
    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return chatModel.call(prompt);
        } catch (HttpResponseException e) {
            int statusCode = e.getResponse().getStatusCode();
            
            // Only retry on transient errors
            if (statusCode == 429 || statusCode == 500 || statusCode == 503) {
                if (attempt < maxRetries - 1) {
                    int delayMs = baseDelayMs * (1 << attempt);
                    Thread.sleep(delayMs);
                    continue;
                }
            }
            throw e;
        }
    }
    throw new RuntimeException("Max retries exceeded");
}

Exponential Backoff with Jitter

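// Thread.sleep throws the checked InterruptedException, so the method must declare it
public ChatResponse callWithJitter(Prompt prompt) throws InterruptedException {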
    int maxRetries = 5;
    int baseDelayMs = 1000;
    
    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return chatModel.call(prompt);
        } catch (HttpResponseException e) {
            if (isRetryable(e) && attempt < maxRetries - 1) {
                int exponentialDelay = baseDelayMs * (1 << attempt);
                int jitter = ThreadLocalRandom.current()
                    .nextInt(0, exponentialDelay / 2);
                int totalDelay = Math.min(
                    exponentialDelay + jitter,
                    60000  // Cap at 60 seconds
                );
                Thread.sleep(totalDelay);
                continue;
            }
            throw e;
        }
    }
    throw new RuntimeException("Max retries exceeded");
}

private boolean isRetryable(HttpResponseException e) {
    int statusCode = e.getResponse().getStatusCode();
    return statusCode == 429 || statusCode == 500 || statusCode == 503;
}
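The delay arithmetic in the two retry methods can be pulled out into pure functions, which makes the schedule easy to unit test and avoids the `int` overflow that `baseDelayMs * (1 << attempt)` would hit at larger retry counts. This is a sketch under assumptions: `BackoffDelays` and its method names are hypothetical, and the jitter range (up to half the exponential delay) simply mirrors the example above.

```java
import java.util.Random;

/** Hypothetical helper (an assumption for illustration, not a library class). */
final class BackoffDelays {

    private BackoffDelays() {}

    /** Exponential part only: baseDelayMs * 2^attempt, saturating at capMs. */
    static long exponentialDelayMs(int attempt, long baseDelayMs, long capMs) {
        if (attempt < 0 || baseDelayMs <= 0 || capMs <= 0) {
            throw new IllegalArgumentException("attempt must be >= 0; delays must be > 0");
        }
        long delay = Math.min(baseDelayMs, capMs);
        // Double step by step and stop at the cap, so the value never overflows
        for (int i = 0; i < attempt && delay < capMs; i++) {
            delay = (delay > capMs / 2) ? capMs : delay * 2;
        }
        return delay;
    }

    /** Exponential delay plus random jitter in [0, exponential / 2), capped at capMs. */
    static long jitteredDelayMs(int attempt, long baseDelayMs, long capMs, Random random) {
        long exponential = exponentialDelayMs(attempt, baseDelayMs, capMs);
        long jitter = (long) (random.nextDouble() * (exponential / 2));
        // Limit the jitter to the remaining headroom below the cap
        return exponential + Math.min(jitter, capMs - exponential);
    }
}
```

With a base of 1000 ms and a cap of 60000 ms, the exponential part runs 1000, 2000, 4000, 8000, 16000, 32000 and then stays at 60000.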

Circuit Breaker Pattern

public class CircuitBreakerService {
    private enum State { CLOSED, OPEN, HALF_OPEN }
    
    private State state = State.CLOSED;
    private int failureCount = 0;
    private final int failureThreshold = 5;
    private long openedAt = 0;
    private final long resetTimeout = 60000; // 1 minute
    
    public ChatResponse callWithCircuitBreaker(Prompt prompt) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt > resetTimeout) {
                state = State.HALF_OPEN;
            } else {
                throw new RuntimeException("Circuit breaker is OPEN");
            }
        }
        
        try {
            ChatResponse response = chatModel.call(prompt);
            
            // Any success closes the circuit and clears the failure count,
            // so only consecutive failures can trip the breaker
            state = State.CLOSED;
            failureCount = 0;
            
            return response;
            
        } catch (HttpResponseException e) {
            failureCount++;
            
            if (failureCount >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            
            throw e;
        }
    }
}
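The class above keeps its state in plain fields, so it is not safe to share across threads, and it reads the wall clock directly, which makes the timeout path hard to test. The following is one possible variant, offered as a sketch rather than a definitive implementation: `SimpleCircuitBreaker` is a hypothetical class that wraps any `Supplier`, guards state transitions with `synchronized` methods, and takes the clock as a parameter. As a simplification, it lets every caller through while half-open instead of admitting a single trial call.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, JDK-only circuit breaker (an assumption for illustration). */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long resetTimeoutMs;
    private final LongSupplier clockMs;  // injected so tests can control time

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMs = 0;

    SimpleCircuitBreaker(int failureThreshold, long resetTimeoutMs, LongSupplier clockMs) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMs = resetTimeoutMs;
        this.clockMs = clockMs;
    }

    /** Runs the action outside the lock; only the bookkeeping is synchronized. */
    <T> T call(Supplier<T> action) {
        beforeCall();
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
    }

    synchronized State state() {
        return state;
    }

    private synchronized void beforeCall() {
        if (state == State.OPEN) {
            if (clockMs.getAsLong() - openedAtMs >= resetTimeoutMs) {
                state = State.HALF_OPEN;  // timeout elapsed: allow a trial call
            } else {
                throw new IllegalStateException("Circuit breaker is OPEN");
            }
        }
    }

    private synchronized void onSuccess() {
        state = State.CLOSED;
        consecutiveFailures = 0;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call reopens immediately; otherwise wait for the threshold
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAtMs = clockMs.getAsLong();
        }
    }
}
```

A call site might look like `breaker.call(() -> chatModel.call(prompt))`, with `System::currentTimeMillis` passed as the clock.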

Error Recovery Strategies

Graceful Degradation

public String getResponseWithFallback(String prompt) {
    try {
        // Try primary model
        ChatResponse response = chatModel.call(new Prompt(prompt));
        return response.getResult().getOutput().getText();
        
    } catch (HttpResponseException e) {
        if (e.getResponse().getStatusCode() == 429) {
            // Rate limited - use cached response if available
            String cached = cache.get(prompt);
            if (cached != null) {
                return cached;
            }
            
            // Fall back to simpler model
            return getFallbackResponse(prompt);
        }
        throw e;
    }
}

private String getFallbackResponse(String prompt) {
    AzureOpenAiChatOptions fallbackOptions = AzureOpenAiChatOptions.builder()
        .deploymentName("gpt-35-turbo")  // Cheaper, faster model
        .maxTokens(500)
        .build();
    
    ChatResponse response = chatModel.call(
        new Prompt(prompt, fallbackOptions)
    );
    return response.getResult().getOutput().getText();
}

Partial Results

public String streamWithPartialResults(Prompt prompt) {
    StringBuilder result = new StringBuilder();
    
    try {
        // blockLast() waits for the stream to finish. subscribe() would return
        // immediately, before any chunk has arrived, and yield an empty result.
        chatModel.stream(prompt)
            .doOnNext(chunk -> {
                // Some chunks may carry no generation
                if (chunk.getResult() != null) {
                    String token = chunk.getResult().getOutput().getText();
                    if (token != null) {
                        result.append(token);
                    }
                }
            })
            .blockLast();
        
        return result.toString();
        
    } catch (Exception e) {
        // Stream failed - return partial result if available
        if (result.length() > 0) {
            return result + "\n[Incomplete response]";
        }
        throw e;
    }
}

Validation Errors

Parameter Validation

public void validateChatOptions(AzureOpenAiChatOptions options) {
    if (options.getTemperature() != null) {
        double temp = options.getTemperature();
        if (temp < 0.0 || temp > 2.0) {
            throw new IllegalArgumentException(
                "Temperature must be between 0.0 and 2.0, got: " + temp
            );
        }
    }
    
    if (options.getMaxTokens() != null && 
        options.getMaxCompletionTokens() != null) {
        throw new IllegalArgumentException(
            "Cannot use both maxTokens and maxCompletionTokens"
        );
    }
    
    if (options.getN() != null && options.getN() < 1) {
        throw new IllegalArgumentException(
            "N must be >= 1, got: " + options.getN()
        );
    }
}

Input Validation

public void validatePrompt(String prompt) {
    if (prompt == null || prompt.trim().isEmpty()) {
        throw new IllegalArgumentException("Prompt cannot be null or empty");
    }
    
    int estimatedTokens = estimateTokenCount(prompt);
    if (estimatedTokens > 128000) {
        throw new IllegalArgumentException(
            "Prompt too long: " + estimatedTokens + " tokens (max: 128000)"
        );
    }
}
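`estimateTokenCount` is not defined in the example above. One rough way to fill it in, assuming about four characters per token (a common rule of thumb for English text, not a real tokenizer), is sketched below. `TokenEstimator` and `CHARS_PER_TOKEN` are hypothetical names; actual counts depend on the model's tokenizer and can differ substantially for code or non-English text.

```java
/** Hypothetical heuristic (an assumption for illustration, not a tokenizer). */
final class TokenEstimator {

    /** Assumed average characters per token; a rule of thumb only. */
    static final int CHARS_PER_TOKEN = 4;

    private TokenEstimator() {}

    /** Estimates the token count from the character length, rounding up. */
    static int estimateTokenCount(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // Ceiling division, so any non-empty string counts as at least one token
        return (text.length() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
    }
}
```

Because the estimate is approximate, it is safer to treat it as an early warning and still handle the 400 response the service returns when the real limit is exceeded.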

Logging and Monitoring

Structured Error Logging

public ChatResponse callWithLogging(Prompt prompt) {
    String requestId = UUID.randomUUID().toString();
    
    try {
        logger.info("API call started", Map.of(
            "requestId", requestId,
            "model", options.getDeploymentName(),
            "promptLength", prompt.getContents().length()
        ));
        
        ChatResponse response = chatModel.call(prompt);
        
        logger.info("API call succeeded", Map.of(
            "requestId", requestId,
            "tokensUsed", response.getMetadata().getUsage().getTotalTokens()
        ));
        
        return response;
        
    } catch (HttpResponseException e) {
        logger.error("API call failed", Map.of(
            "requestId", requestId,
            "statusCode", e.getResponse().getStatusCode(),
            "error", e.getMessage()
        ));
        throw e;
    }
}

Metrics Collection

public ChatResponse callWithMetrics(Prompt prompt) {
    long startTime = System.currentTimeMillis();
    
    try {
        ChatResponse response = chatModel.call(prompt);
        
        long duration = System.currentTimeMillis() - startTime;
        metrics.recordSuccess(duration);
        metrics.recordTokens(response.getMetadata().getUsage().getTotalTokens());
        
        return response;
        
    } catch (HttpResponseException e) {
        long duration = System.currentTimeMillis() - startTime;
        metrics.recordFailure(duration, e.getResponse().getStatusCode());
        throw e;
    }
}

Best Practices

Always handle exceptions: Don't let exceptions propagate unhandled
Implement retry logic: Use exponential backoff for transient errors
Validate inputs: Check parameters before making API calls
Log errors: Include request IDs and context for debugging
Monitor metrics: Track success rates, latencies, and error rates
Use circuit breakers: Prevent cascading failures
Provide fallbacks: Gracefully degrade when possible
Cache responses: Reduce load and improve resilience
Set timeouts: Don't wait indefinitely
Test error paths: Ensure error handling works correctly
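The "Set timeouts" item has no example elsewhere in this reference. A minimal JDK-only sketch is shown here: `TimeoutCalls.callWithTimeout` is a hypothetical helper that bounds how long the caller waits for any blocking call. It is an assumption for illustration, not the client's own timeout configuration, and it does not abort the underlying HTTP request.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper (an assumption for illustration, not a library class). */
final class TimeoutCalls {

    private TimeoutCalls() {}

    /** Runs the call on another thread and waits at most the given timeout. */
    static <T> T callWithTimeout(Supplier<T> call, Duration timeout) throws TimeoutException {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(call);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; the task itself may keep running
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Rethrow the original failure so callers can inspect it as usual
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            throw new IllegalStateException(cause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }
}
```

A call site might look like `callWithTimeout(() -> chatModel.call(prompt), Duration.ofSeconds(30))`, with the `TimeoutException` treated as a transient failure.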

Common Patterns

Complete Error Handling Template

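// Thread.sleep throws the checked InterruptedException, so the method must declare it
public ChatResponse robustCall(Prompt prompt) throws InterruptedException {
    // Validate input (validatePrompt takes the prompt text, not the Prompt object)
    validatePrompt(prompt.getContents());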
    
    // Retry with backoff
    int maxRetries = 3;
    int baseDelay = 1000;
    
    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return chatModel.call(prompt);
            
        } catch (HttpResponseException e) {
            int statusCode = e.getResponse().getStatusCode();
            
            // Handle specific errors
            if (statusCode == 401) {
                throw new AuthenticationException("Invalid credentials", e);
            } else if (statusCode == 404) {
                throw new ConfigurationException("Deployment not found", e);
            } else if (statusCode == 400) {
                if (e.getMessage().contains("content_policy")) {
                    throw new ContentFilterException("Content filtered", e);
                }
                throw new ValidationException("Invalid request", e);
            }
            
            // Retry transient errors
            if ((statusCode == 429 || statusCode >= 500) && 
                attempt < maxRetries - 1) {
                int delay = baseDelay * (1 << attempt);
                Thread.sleep(delay);
                continue;
            }
            
            throw e;
            
        } catch (Exception e) {
            // Unexpected error
            logger.error("Unexpected error", e);
            throw new RuntimeException("API call failed", e);
        }
    }
    
    throw new RuntimeException("Max retries exceeded");
}
