# Exception Handling & Error Management

The AWS Java SDK Core provides an exception hierarchy that separates failures raised on the client from errors returned by AWS services. Understanding this hierarchy is the basis for sound error handling, retry decisions, and application resilience.

## Exception Hierarchy

### Base Exception Classes

```java { .api }
// Base SDK exception class (extends RuntimeException)
class SdkBaseException extends RuntimeException {
    public SdkBaseException(String message);
    public SdkBaseException(String message, Throwable cause);
    public boolean isRetryable();
}

// Legacy client exception (kept for backward compatibility)
class AmazonClientException extends SdkBaseException {
    public AmazonClientException(String message);
    public AmazonClientException(String message, Throwable cause);
    public AmazonClientException(Throwable cause);
    public boolean isRetryable();
}

// Client-side exception (network, configuration, etc.).
// Extends AmazonClientException so that legacy catch blocks still match.
class SdkClientException extends AmazonClientException {
    public SdkClientException(String message);
    public SdkClientException(String message, Throwable cause);
    public boolean isRetryable();
}
```

### Service Exception Classes

```java { .api }
// Service-side exception (4xx/5xx HTTP responses)
class AmazonServiceException extends SdkClientException {
    // Error details
    public String getErrorCode();
    public String getErrorMessage();
    public ErrorType getErrorType();
    public int getStatusCode();
    public String getServiceName();
    public String getRequestId();

    // Raw response details
    public String getRawResponseContent();
    public byte[] getRawResponse();
    public Map<String, String> getHttpHeaders();
    public String getProxyHost();

    // Error classification
    public boolean isRetryable();

    // Setters for error details
    public void setErrorCode(String errorCode);
    public void setErrorMessage(String errorMessage);
    public void setErrorType(ErrorType errorType);
    public void setStatusCode(int statusCode);
    public void setServiceName(String serviceName);
    public void setRequestId(String requestId);
}

// Error type enumeration (nested: AmazonServiceException.ErrorType)
enum ErrorType {
    Client,   // The request was at fault (typically 4xx)
    Service,  // The service was at fault (typically 5xx)
    Unknown   // Unclassified errors
}
```

### Specific Exception Types

```java { .api }
// Operation aborted exception
class AbortedException extends SdkClientException {
    public AbortedException();
    public AbortedException(String message);
    public AbortedException(String message, Throwable cause);
}

// Stream reset exception
class ResetException extends SdkClientException {
    public ResetException(String message);
    public ResetException(String message, Throwable cause);

    public String getExtraInfo();
    public void setExtraInfo(String extraInfo);
}

// Interrupted operation exception (com.amazonaws.http.timers.client).
// This is a checked InterruptedException subtype used inside the SDK;
// callers normally see it wrapped as the cause of an AbortedException.
class SdkInterruptedException extends InterruptedException {
    public SdkInterruptedException(Response<?> response);
    public Response<?> getResponse();
}
```

## Error Handling Patterns

### Basic Exception Handling

```java
import com.amazonaws.AmazonServiceException;
import com.amazonaws.SdkClientException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;

public void handleS3Operations(AmazonS3 s3Client, String bucket, String key) {
    // try-with-resources closes the S3Object and releases its HTTP connection
    try (S3Object object = s3Client.getObject(new GetObjectRequest(bucket, key))) {
        // Read object.getObjectContent() here

    } catch (AmazonServiceException ase) {
        // Service-side error (4xx/5xx HTTP status codes)
        System.err.println("Service Error:");
        System.err.println("  Error Code:    " + ase.getErrorCode());
        System.err.println("  Error Message: " + ase.getErrorMessage());
        System.err.println("  HTTP Status:   " + ase.getStatusCode());
        System.err.println("  Service:       " + ase.getServiceName());
        System.err.println("  Request ID:    " + ase.getRequestId());

        // Handle specific service errors
        handleServiceError(ase);

    } catch (SdkClientException ace) {
        // Client-side error (network issues, configuration problems, etc.)
        System.err.println("Client Error: " + ace.getMessage());

        // Handle client errors
        handleClientError(ace);

    } catch (Exception e) {
        // Unexpected errors, including an IOException from closing the object
        System.err.println("Unexpected error: " + e.getMessage());
        throw new RuntimeException("Operation failed", e);
    }
}
```

### Service Error Handling by Type

```java
// The handle*Error helpers called below are application-defined.
private void handleServiceError(AmazonServiceException ase) {
    // Handle errors by HTTP status code
    switch (ase.getStatusCode()) {
        case 400: // Bad Request
            handleBadRequestError(ase);
            break;
        case 403: // Forbidden
            handlePermissionError(ase);
            break;
        case 404: // Not Found
            handleNotFoundError(ase);
            break;
        case 429: // Too Many Requests
            handleThrottlingError(ase);
            break;
        case 500: // Internal Server Error
        case 502: // Bad Gateway
        case 503: // Service Unavailable
            handleServerError(ase);
            break;
        default:
            handleUnknownServiceError(ase);
            break;
    }

    // Handle errors by error code (service-specific)
    String errorCode = ase.getErrorCode();
    if (errorCode == null) {
        // switch on a null String would throw NullPointerException
        System.err.println("Service error without an error code");
        return;
    }
    switch (errorCode) {
        case "NoSuchBucket":
            System.err.println("S3 bucket does not exist");
            break;
        case "NoSuchKey":
            System.err.println("S3 object does not exist");
            break;
        case "AccessDenied":
            System.err.println("Access denied - check permissions");
            break;
        case "InvalidBucketName":
            System.err.println("Invalid S3 bucket name");
            break;
        case "ServiceUnavailable":
            System.err.println("Service temporarily unavailable");
            break;
        default:
            System.err.println("Service error: " + errorCode);
            break;
    }
}
```

### Client Error Handling

```java
import com.amazonaws.AbortedException;
import com.amazonaws.ResetException;
import com.amazonaws.SdkClientException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import javax.net.ssl.SSLException;
import org.apache.http.conn.ConnectTimeoutException;

private void handleClientError(SdkClientException ace) {
    // Check for specific client error types
    if (ace instanceof AbortedException) {
        // An interrupted call surfaces as AbortedException with an
        // InterruptedException as its cause
        if (ace.getCause() instanceof InterruptedException) {
            System.err.println("Operation was interrupted");
        } else {
            System.err.println("Operation was aborted");
        }
        return;
    }
    if (ace instanceof ResetException) {
        ResetException re = (ResetException) ace;
        System.err.println("Stream reset failed: " + re.getExtraInfo());
        return;
    }

    // Generic client errors: inspect the cause chain rather than the message,
    // because the message is not guaranteed to name the underlying exception
    for (Throwable t = ace.getCause(); t != null; t = t.getCause()) {
        if (t instanceof UnknownHostException) {
            System.err.println("Network connectivity issue - check DNS/Internet connection");
            return;
        } else if (t instanceof ConnectTimeoutException) {
            System.err.println("Connection timeout - service may be unavailable");
            return;
        } else if (t instanceof SocketTimeoutException) {
            System.err.println("Socket timeout - operation took too long");
            return;
        } else if (t instanceof SSLException) {
            System.err.println("SSL/TLS error - check certificates and protocol");
            return;
        }
    }
    System.err.println("Client configuration or network error: " + ace.getMessage());
}
```

## Retry Strategy with Exception Handling

### Intelligent Retry Logic

```java
import com.amazonaws.AbortedException;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.ResetException;
import com.amazonaws.SdkClientException;
import java.net.ConnectException;
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import org.apache.http.conn.ConnectTimeoutException;

public class ResilientAwsOperations {
    private static final int MAX_RETRIES = 3;
    private static final long INITIAL_BACKOFF_MS = 1000;

    public <T> T executeWithRetry(Supplier<T> operation) {
        int attempt = 0;
        Exception lastException = null;

        while (attempt < MAX_RETRIES) {
            try {
                return operation.get();

            } catch (AmazonServiceException ase) {
                lastException = ase;

                // Check if error is retryable
                if (shouldRetryServiceException(ase)) {
                    attempt++;
                    if (attempt < MAX_RETRIES) {
                        waitWithBackoff(attempt);
                        continue;
                    }
                }
                // Non-retryable service error, or retries exhausted
                throw ase;

            } catch (SdkClientException ace) {
                lastException = ace;

                // Check if client error is retryable
                if (shouldRetryClientException(ace)) {
                    attempt++;
                    if (attempt < MAX_RETRIES) {
                        waitWithBackoff(attempt);
                        continue;
                    }
                }
                // Non-retryable client error, or retries exhausted
                throw ace;
            }
        }

        // Not reached in practice (the catch blocks rethrow once attempts run out);
        // required so that the method compiles
        throw new RuntimeException("Max retries exceeded", lastException);
    }

    private boolean shouldRetryServiceException(AmazonServiceException ase) {
        // Retry on server errors (5xx)
        if (ase.getStatusCode() >= 500) {
            return true;
        }

        // Retry on throttling (429)
        if (ase.getStatusCode() == 429) {
            return true;
        }

        // Retry on specific error codes
        String errorCode = ase.getErrorCode();
        return "ServiceUnavailable".equals(errorCode) ||
               "Throttling".equals(errorCode) ||
               "ThrottlingException".equals(errorCode) ||
               "RequestTimeout".equals(errorCode);
    }

    private boolean shouldRetryClientException(SdkClientException ace) {
        if (ace instanceof ResetException) {
            return true;
        }

        // Retry on network-related errors. Inspect the cause chain because the
        // message is not guaranteed to name the underlying exception class.
        for (Throwable t = ace.getCause(); t != null; t = t.getCause()) {
            if (t instanceof ConnectTimeoutException   // connect timed out
                    || t instanceof SocketTimeoutException // read timed out
                    || t instanceof ConnectException) {    // connection refused
                return true;
            }
            String message = t.getMessage();
            if (t instanceof SocketException && message != null
                    && message.toLowerCase(Locale.ROOT).contains("connection reset")) {
                return true;
            }
        }
        return false;
    }

    private void waitWithBackoff(int attempt) {
        long backoffMs = INITIAL_BACKOFF_MS * (1L << (attempt - 1)); // Exponential backoff
        try {
            TimeUnit.MILLISECONDS.sleep(backoffMs);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and abort instead of retrying further
            Thread.currentThread().interrupt();
            throw new AbortedException("Interrupted during backoff", e);
        }
    }
}
```
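
The `waitWithBackoff` method above doubles the delay on each attempt, with no upper bound and no randomness, so clients that fail at the same moment also tend to retry at the same moment. The following is a minimal sketch of one common remedy, a capped delay with random jitter. `JitteredBackoff`, `ceilingMs`, and `nextDelayMs` are hypothetical names chosen for illustration; they are not part of the SDK, and the sketch uses only the JDK.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper (not an SDK class): capped exponential backoff with jitter.
final class JitteredBackoff {
    private final long baseMs;
    private final long capMs;

    JitteredBackoff(long baseMs, long capMs) {
        if (baseMs <= 0 || capMs < baseMs) {
            throw new IllegalArgumentException("require 0 < baseMs <= capMs");
        }
        this.baseMs = baseMs;
        this.capMs = capMs;
    }

    // Upper bound for a 1-based attempt: min(capMs, baseMs * 2^(attempt - 1)).
    long ceilingMs(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt is 1-based");
        }
        long delay = baseMs;
        for (int i = 1; i < attempt && delay < capMs; i++) {
            // Comparing against capMs / 2 avoids overflow when doubling
            delay = (delay > capMs / 2) ? capMs : delay * 2;
        }
        return delay;
    }

    // Random delay in [0, ceilingMs(attempt)), so concurrent clients spread out.
    long nextDelayMs(int attempt) {
        return ThreadLocalRandom.current().nextLong(ceilingMs(attempt));
    }
}
```

A retry loop could sleep for `nextDelayMs(attempt)` in place of the fixed doubling. Whether to draw from the full range or to keep a minimum delay is a design choice that depends on how aggressively the service throttles.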

### Usage of Retry Logic

```java
ResilientAwsOperations resilientOps = new ResilientAwsOperations();

// Retry S3 operations
S3Object result = resilientOps.executeWithRetry(() ->
    s3Client.getObject("my-bucket", "my-key")
);

// Retry DynamoDB operations
GetItemResult dynamoResult = resilientOps.executeWithRetry(() ->
    dynamoClient.getItem("my-table", key)
);
```

## Error Logging and Monitoring

### Structured Error Logging

```java
import com.amazonaws.AmazonServiceException;
import com.amazonaws.SdkClientException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class AwsErrorLogger {
    private static final Logger logger = LoggerFactory.getLogger(AwsErrorLogger.class);

    public static void logServiceException(AmazonServiceException ase, String operation) {
        try {
            // Add context to MDC
            MDC.put("aws.service", ase.getServiceName());
            MDC.put("aws.errorCode", ase.getErrorCode());
            MDC.put("aws.requestId", ase.getRequestId());
            MDC.put("aws.statusCode", String.valueOf(ase.getStatusCode()));
            MDC.put("operation", operation);

            // Log based on error severity
            if (ase.getStatusCode() >= 500) {
                logger.error("AWS service error: {} - {}", ase.getErrorCode(), ase.getErrorMessage(), ase);
            } else {
                logger.warn("AWS client error: {} - {}", ase.getErrorCode(), ase.getErrorMessage());
            }

        } finally {
            // Remove only the keys added here; MDC.clear() would also wipe
            // context that the caller put into the MDC
            MDC.remove("aws.service");
            MDC.remove("aws.errorCode");
            MDC.remove("aws.requestId");
            MDC.remove("aws.statusCode");
            MDC.remove("operation");
        }
    }

    public static void logClientException(SdkClientException ace, String operation) {
        try {
            MDC.put("operation", operation);
            MDC.put("exceptionType", ace.getClass().getSimpleName());

            logger.error("AWS client error during {}: {}", operation, ace.getMessage(), ace);

        } finally {
            MDC.remove("operation");
            MDC.remove("exceptionType");
        }
    }
}
```

### Metrics Collection

```java
import com.amazonaws.AmazonServiceException;
import com.amazonaws.Request;
import com.amazonaws.Response;
import com.amazonaws.SdkClientException;
import com.amazonaws.handlers.RequestHandler2;
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// RequestMetricCollector.collectMetrics(Request, Response) does not receive the
// exception, so this collector hooks RequestHandler2.afterError instead.
// Register it through the client builder's withRequestHandlers(...).
public class ErrorMetricsCollector extends RequestHandler2 {
    private static final Logger logger = LoggerFactory.getLogger(ErrorMetricsCollector.class);

    private final AtomicLong clientErrors = new AtomicLong();
    private final AtomicLong serviceErrors = new AtomicLong();
    private final AtomicLong throttlingErrors = new AtomicLong();

    @Override
    public void afterError(Request<?> request, Response<?> response, Exception exception) {
        // AmazonServiceException is checked first because it is a subclass of SdkClientException
        if (exception instanceof AmazonServiceException) {
            AmazonServiceException ase = (AmazonServiceException) exception;
            serviceErrors.incrementAndGet();

            if (ase.getStatusCode() == 429 || "Throttling".equals(ase.getErrorCode())) {
                throttlingErrors.incrementAndGet();
            }

            // Log metrics
            logger.info("Service error metrics - Total: {}, Throttling: {}",
                serviceErrors.get(), throttlingErrors.get());

        } else if (exception instanceof SdkClientException) {
            clientErrors.incrementAndGet();

            logger.info("Client error metrics - Total: {}", clientErrors.get());
        }
    }

    // Expose metrics for monitoring systems
    public long getClientErrorCount() { return clientErrors.get(); }
    public long getServiceErrorCount() { return serviceErrors.get(); }
    public long getThrottlingErrorCount() { return throttlingErrors.get(); }
}
```

## Error Recovery Strategies

### Circuit Breaker Pattern

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class AwsCircuitBreaker {
    private enum State { CLOSED, OPEN, HALF_OPEN }

    private volatile State state = State.CLOSED;
    private volatile long lastFailureTime;
    // AtomicInteger because ++ on a volatile int is not an atomic operation
    private final AtomicInteger failureCount = new AtomicInteger();

    private final int failureThreshold;
    private final long timeoutMs;

    public AwsCircuitBreaker(int failureThreshold, long timeoutMs) {
        this.failureThreshold = failureThreshold;
        this.timeoutMs = timeoutMs;
    }

    public <T> T execute(Supplier<T> operation) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - lastFailureTime > timeoutMs) {
                // Timeout elapsed: let a trial call through
                state = State.HALF_OPEN;
            } else {
                throw new RuntimeException("Circuit breaker is OPEN");
            }
        }

        try {
            T result = operation.get();
            onSuccess();
            return result;

        } catch (RuntimeException e) {
            // A Supplier can only throw unchecked exceptions
            onFailure();
            throw e;
        }
    }

    private void onSuccess() {
        failureCount.set(0);
        state = State.CLOSED;
    }

    private void onFailure() {
        lastFailureTime = System.currentTimeMillis();

        // The count is not reset while HALF_OPEN, so a failed trial call
        // re-opens the circuit immediately
        if (failureCount.incrementAndGet() >= failureThreshold) {
            state = State.OPEN;
        }
    }
}
```

### Fallback Mechanisms

```java
import com.amazonaws.AmazonServiceException;
import com.amazonaws.SdkClientException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.S3Object;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FallbackAwsOperations {
    private static final Logger logger = LoggerFactory.getLogger(FallbackAwsOperations.class);

    private final AmazonS3 primaryS3Client;
    private final AmazonS3 fallbackS3Client;

    public FallbackAwsOperations(AmazonS3 primary, AmazonS3 fallback) {
        this.primaryS3Client = primary;
        this.fallbackS3Client = fallback;
    }

    public S3Object getObjectWithFallback(String bucket, String key) {
        try {
            return primaryS3Client.getObject(bucket, key);

        } catch (AmazonServiceException ase) {
            // Fall back only on server-side (5xx) failures; 4xx errors would fail again
            if (ase.getStatusCode() >= 500) {
                logger.warn("Primary S3 service error, trying fallback: {}", ase.getErrorCode());
                return fallbackS3Client.getObject(bucket, key);
            }
            throw ase;

        } catch (SdkClientException ace) {
            logger.warn("Primary S3 client error, trying fallback: {}", ace.getMessage());
            return fallbackS3Client.getObject(bucket, key);
        }
    }
}
```

## Best Practices

### Exception Handling Best Practices

1. **Always catch specific exceptions first** - Handle `AmazonServiceException` before `SdkClientException`, because it is the subclass
2. **Log error details comprehensively** - Include error codes, request IDs, and HTTP status codes
3. **Implement appropriate retry logic** - Use exponential backoff for retryable errors
4. **Don't ignore client exceptions** - They often indicate configuration or network issues
5. **Use structured logging** - Include operation context and error metadata
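
The first practice follows from how Java matches catch blocks: the first compatible block wins, so a subclass must be listed before its parent. The sketch below illustrates this with hypothetical stand-in classes (`DemoClientException`, `DemoServiceException`, `CatchOrderDemo`) that only mirror the parent/child relationship between `SdkClientException` and `AmazonServiceException`; they are not SDK types.

```java
// Hypothetical stand-in for a client-side exception (parent type).
class DemoClientException extends RuntimeException {
    DemoClientException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a service-side exception (child type).
class DemoServiceException extends DemoClientException {
    private final int statusCode;

    DemoServiceException(String message, int statusCode) {
        super(message);
        this.statusCode = statusCode;
    }

    int getStatusCode() {
        return statusCode;
    }
}

final class CatchOrderDemo {
    // Runs the call and reports which catch block handled the failure.
    static String classify(Runnable call) {
        try {
            call.run();
            return "ok";
        } catch (DemoServiceException e) {
            // Subclass first: status code is only available on this type
            return "service:" + e.getStatusCode();
        } catch (DemoClientException e) {
            // Parent second: reached only for non-service failures
            return "client";
        }
    }
}
```

Swapping the two catch blocks is rejected by the Java compiler, because the subclass block could never be reached.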
527
528
### Error Recovery Best Practices
529
530
1. **Implement circuit breakers** for external service calls
531
2. **Use fallback mechanisms** where appropriate
532
3. **Monitor error rates and patterns** for proactive issue detection
533
4. **Test error scenarios** in development and staging environments
534
5. **Document error handling strategies** for operations teams
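
Testing error scenarios is easier when failures can be injected on demand. One possible approach, sketched here with JDK types only, is a supplier that fails a fixed number of times before succeeding. `FlakySupplier` and its members are hypothetical names for illustration, not SDK classes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical test helper: throws for the first N calls, then returns a value.
final class FlakySupplier<T> implements Supplier<T> {
    private final int failuresBeforeSuccess;
    private final Supplier<? extends RuntimeException> failureFactory;
    private final T value;
    private final AtomicInteger calls = new AtomicInteger();

    FlakySupplier(int failuresBeforeSuccess,
                  Supplier<? extends RuntimeException> failureFactory,
                  T value) {
        this.failuresBeforeSuccess = failuresBeforeSuccess;
        this.failureFactory = failureFactory;
        this.value = value;
    }

    @Override
    public T get() {
        // Calls 1..N fail with a fresh exception; call N+1 onward succeeds
        if (calls.incrementAndGet() <= failuresBeforeSuccess) {
            throw failureFactory.get();
        }
        return value;
    }

    // Number of times get() has been invoked, for assertions in tests.
    int callCount() {
        return calls.get();
    }
}
```

Passing such a supplier to a retry wrapper or circuit breaker that accepts a `Supplier<T>` lets a test assert how many attempts were made and which exception eventually surfaced.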

### Performance Considerations

1. **Avoid excessive retries** on non-retryable errors
2. **Use appropriate timeout configurations** to prevent hanging operations
3. **Consider async patterns** for better resource utilization during retries
4. **Cache error classifications** to avoid repeated error type checking
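
As an illustration of the third point, a retry can be scheduled on an executor instead of sleeping on the calling thread. The following is a rough sketch using only `java.util.concurrent`; `AsyncRetry` and its parameters are hypothetical names, and the retry predicate is supplied by the caller rather than derived from SDK exception types.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper: retries with a doubling delay without blocking a thread while waiting.
final class AsyncRetry {
    private final ScheduledExecutorService scheduler;

    AsyncRetry(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // The first attempt runs on the calling thread; later attempts run on the scheduler.
    <T> CompletableFuture<T> execute(Supplier<T> operation,
                                     Predicate<Throwable> retryable,
                                     int maxAttempts,
                                     long initialDelayMs) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(operation, retryable, maxAttempts, initialDelayMs, 1, result);
        return result;
    }

    private <T> void attempt(Supplier<T> operation,
                             Predicate<Throwable> retryable,
                             int maxAttempts,
                             long delayMs,
                             int attemptNo,
                             CompletableFuture<T> result) {
        try {
            result.complete(operation.get());
        } catch (RuntimeException e) {
            if (attemptNo >= maxAttempts || !retryable.test(e)) {
                // Out of attempts, or the caller marked this failure as non-retryable
                result.completeExceptionally(e);
                return;
            }
            // Schedule the next attempt and double the delay for the one after it
            scheduler.schedule(
                () -> attempt(operation, retryable, maxAttempts, delayMs * 2, attemptNo + 1, result),
                delayMs, TimeUnit.MILLISECONDS);
        }
    }
}
```

The operation itself still blocks whichever thread runs it; only the waiting between attempts is freed up. A production version would also need to handle a scheduler that has been shut down.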

The AWS Java SDK's exception hierarchy provides the foundation for building resilient applications that can gracefully handle the various error conditions that occur in distributed cloud environments.