Package: org.springframework.boot.availability
Module: org.springframework.boot:spring-boot
Since: 2.3.0
Application availability state management for tracking and controlling whether an application is live (running correctly) and ready (able to handle traffic). Essential for Kubernetes liveness and readiness probes, load balancer integration, and graceful shutdown.
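Spring Boot's availability system provides a formal model for representing application state through two independent dimensions:
- Liveness: whether the application's internal state is correct. A liveness failure means the platform should restart the application.
- Readiness: whether the application is able to accept traffic. A readiness failure means the platform should stop routing requests to it.
Tracking the two separately enables proper integration with platform health checks and orchestration systems.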
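The two dimensions can be pictured as separate entries in a small state store, where the most recently published value for each state type wins and a conservative fallback applies until something is published. The following is a minimal sketch of that idea in plain Java. `AvailabilitySketch` and its nested types are hypothetical stand-ins written for illustration. They are not the Spring Boot classes and do not claim to mirror their implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model for illustration only (not the Spring Boot types).
final class AvailabilitySketch {

    // Stand-in for a tagging interface implemented by state enums.
    interface State {}

    enum Liveness implements State { CORRECT, BROKEN }

    enum Readiness implements State { ACCEPTING_TRAFFIC, REFUSING_TRAFFIC }

    // Latest state per state type; each dimension is stored independently.
    private final Map<Class<? extends State>, State> latest = new ConcurrentHashMap<>();

    // Record a state change; the last value published for a type wins.
    <S extends State> void publish(Class<S> type, S state) {
        latest.put(type, state);
    }

    // Return the latest state for the type, or the fallback if none was published.
    <S extends State> S getState(Class<S> type, S fallback) {
        State state = latest.get(type);
        return (state != null) ? type.cast(state) : fallback;
    }

    // Assumed conservative defaults: broken and refusing traffic until told otherwise.
    Liveness liveness() {
        return getState(Liveness.class, Liveness.BROKEN);
    }

    Readiness readiness() {
        return getState(Readiness.class, Readiness.REFUSING_TRAFFIC);
    }

    public static void main(String[] args) {
        AvailabilitySketch app = new AvailabilitySketch();
        // Nothing published yet, so both fallbacks apply.
        System.out.println(app.liveness() + " / " + app.readiness());

        app.publish(Liveness.class, Liveness.CORRECT);
        app.publish(Readiness.class, Readiness.ACCEPTING_TRAFFIC);

        // Maintenance: stop taking traffic while the process stays live.
        app.publish(Readiness.class, Readiness.REFUSING_TRAFFIC);
        System.out.println(app.liveness() + " / " + app.readiness());
    }
}
```

In this sketch, changing readiness never touches liveness, which is the property that lets a platform stop routing traffic to an instance without restarting it.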
Central interface for querying and updating application availability state.
package org.springframework.boot.availability;
/**
* Provides availability state information for the application.
* Applications can inject this bean to query or update availability state.
*
* Thread Safety: Thread-safe. All methods can be called concurrently.
*
* @since 2.3.0
*/
public interface ApplicationAvailability {
/**
* Return the current LivenessState of the application.
* Indicates whether the internal state of the application is correct.
*
* DEFAULT: Returns LivenessState.BROKEN if no event has been published yet.
*
* @return the liveness state (never null)
*/
LivenessState getLivenessState();
/**
* Return the current ReadinessState of the application.
* Indicates whether the application is ready to accept traffic.
*
* DEFAULT: Returns ReadinessState.REFUSING_TRAFFIC if no event has been published yet.
*
* @return the readiness state (never null)
*/
ReadinessState getReadinessState();
/**
* Return the current state of the application for the given type with a default.
* Generic method for querying any availability state type with a fallback value.
*
* @param <S> the state type
* @param stateType the state type class
* @param defaultState the default state to return if no event published yet
* @return the current state (never null)
*/
<S extends AvailabilityState> S getState(Class<S> stateType, S defaultState);
/**
* Return the current state of the application for the given type.
* Generic method for querying any availability state type.
*
* @param <S> the state type
* @param stateType the state type class
* @return the current state, or null if no event has been published yet for this type
*/
<S extends AvailabilityState> @Nullable S getState(Class<S> stateType);
/**
* Return the last AvailabilityChangeEvent received for a given state type.
* Useful for debugging state transitions.
*
* @param <S> the state type
* @param stateType the state type class
* @return the last change event, or null if no event received yet
*/
<S extends AvailabilityState> @Nullable AvailabilityChangeEvent<S> getLastChangeEvent(Class<S> stateType);
}
Marker interface for availability states.
package org.springframework.boot.availability;
/**
* Tagging interface used on ApplicationAvailability states.
* This interface is usually implemented on an enum type.
*
* @since 2.3.0
*/
public interface AvailabilityState {
}
Bean implementation that provides ApplicationAvailability by listening for change events.
package org.springframework.boot.availability;
import org.springframework.context.ApplicationListener;
/**
* Bean that provides an ApplicationAvailability implementation by listening for
* AvailabilityChangeEvent change events.
*
* Thread Safety: Thread-safe. Uses ConcurrentHashMap for state storage.
*
* @since 2.3.0
*/
public class ApplicationAvailabilityBean
implements ApplicationAvailability, ApplicationListener<AvailabilityChangeEvent<?>> {
/**
* Create a new ApplicationAvailabilityBean instance.
*/
public ApplicationAvailabilityBean() {
}
@Override
public <S extends AvailabilityState> S getState(Class<S> stateType, S defaultState) {
// Returns the state from the last stored event for stateType, or defaultState if none has been received
}
@Override
public <S extends AvailabilityState> @Nullable S getState(Class<S> stateType) {
// Returns the state from the last stored event for stateType, or null if none has been received
}
@Override
public <S extends AvailabilityState> @Nullable AvailabilityChangeEvent<S> getLastChangeEvent(Class<S> stateType) {
// Returns the last stored event for stateType, or null if none has been received
}
@Override
public void onApplicationEvent(AvailabilityChangeEvent<?> event) {
// Stores the event for later retrieval
}
}
Enum representing application liveness state.
package org.springframework.boot.availability;
/**
* "Liveness" state of the application.
*
* An application is considered "live" when it's running with a correct internal state.
* "Liveness" failure means that the internal state of the application is broken and
* we cannot recover from it. As a consequence, the platform should restart the application.
*
* @since 2.3.0
*/
public enum LivenessState implements AvailabilityState {
/**
* The application is running and its internal state is correct.
* Normal operating state - application should continue running.
*
* Kubernetes Action: Do nothing, application is healthy.
*/
CORRECT,
/**
* The application is running but its internal state is broken.
* The application cannot recover and should be restarted.
*
* Kubernetes Action: Restart the container (liveness probe fails).
*
* Examples:
* - Unrecoverable exception in critical component
* - Corrupted in-memory state
* - Deadlock detected
* - Critical resource permanently unavailable
*/
BROKEN;
}
Enum representing application readiness state.
package org.springframework.boot.availability;
/**
* "Readiness" state of the application.
*
* An application is considered "ready" when it's ready to accept traffic.
* "Readiness" failure means that the application is not able to accept traffic
* and that the infrastructure should not route requests to it.
*
* @since 2.3.0
*/
public enum ReadinessState implements AvailabilityState {
/**
* The application is ready to receive traffic.
* Load balancer should route requests to this instance.
*
* Kubernetes Action: Route traffic to pod (readiness probe succeeds).
*/
ACCEPTING_TRAFFIC,
/**
* The application is not willing to receive traffic.
* Load balancer should not route requests to this instance.
*
* Kubernetes Action: Stop routing traffic (readiness probe fails).
*
* Examples:
* - Application is starting up
* - Application is shutting down
* - Database connection pool exhausted (temporary)
* - Circuit breaker open
* - Planned maintenance mode
*/
REFUSING_TRAFFIC;
}
Event published when availability state changes.
package org.springframework.boot.availability;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.PayloadApplicationEvent;
import org.springframework.core.ResolvableType;
/**
* ApplicationEvent sent when the AvailabilityState of the application changes.
* Any application component can send such events to update the state of the application.
*
* @param <S> the availability state type
* @since 2.3.0
*/
public class AvailabilityChangeEvent<S extends AvailabilityState> extends PayloadApplicationEvent<S> {
/**
* Create a new AvailabilityChangeEvent instance.
*
* @param source the source of the event (typically the component changing state)
* @param state the new availability state (never null)
*/
public AvailabilityChangeEvent(Object source, S state) {
super(source, state);
}
/**
* Return the changed availability state.
*
* @return the availability state
*/
public S getState() {
return getPayload();
}
@Override
public ResolvableType getResolvableType() {
return ResolvableType.forClassWithGenerics(getClass(), getStateType());
}
private Class<?> getStateType() {
S state = getState();
if (state instanceof Enum<?> enumState) {
return enumState.getDeclaringClass();
}
return state.getClass();
}
/**
* Convenience method to publish an availability change event.
* Publishes the event through the ApplicationContext.
*
* @param <S> the availability state type
* @param context the application context to publish the event to
* @param state the new availability state
*/
public static <S extends AvailabilityState> void publish(
ApplicationContext context, S state) {
context.publishEvent(new AvailabilityChangeEvent<>(context, state));
}
/**
* Convenience method to publish an availability change event.
* Publishes the event through the ApplicationEventPublisher.
*
* @param <S> the availability state type
* @param publisher the event publisher
* @param source the source of the event
* @param state the new availability state
*/
public static <S extends AvailabilityState> void publish(
ApplicationEventPublisher publisher, Object source, S state) {
publisher.publishEvent(new AvailabilityChangeEvent<>(source, state));
}
}
package com.example.monitoring;
import org.springframework.boot.availability.ApplicationAvailability;
import org.springframework.boot.availability.LivenessState;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.stereotype.Component;
/**
* Component that queries availability state.
*/
@Component
public class AvailabilityMonitor {
private final ApplicationAvailability availability;
public AvailabilityMonitor(ApplicationAvailability availability) {
this.availability = availability;
}
public void checkApplicationHealth() {
LivenessState liveness = availability.getLivenessState();
ReadinessState readiness = availability.getReadinessState();
System.out.println("Liveness: " + liveness);
System.out.println("Readiness: " + readiness);
if (liveness == LivenessState.BROKEN) {
System.err.println("APPLICATION IS BROKEN - RESTART REQUIRED");
}
if (readiness == ReadinessState.REFUSING_TRAFFIC) {
System.out.println("Application not ready for traffic");
}
}
public boolean isHealthy() {
return availability.getLivenessState() == LivenessState.CORRECT &&
availability.getReadinessState() == ReadinessState.ACCEPTING_TRAFFIC;
}
/**
* Demonstrates using generic getState methods with proper null handling.
*/
public void checkGenericState() {
// Option 1: Use getState with default value (non-null guarantee)
LivenessState liveness = availability.getState(LivenessState.class, LivenessState.CORRECT);
System.out.println("Liveness (with default): " + liveness); // Never null
// Option 2: Use getState without default (nullable)
ReadinessState readiness = availability.getState(ReadinessState.class);
if (readiness != null) {
System.out.println("Readiness: " + readiness);
} else {
System.out.println("No readiness state published yet");
}
}
}
package com.example.management;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.LivenessState;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.stereotype.Component;
/**
* Component that updates availability state based on conditions.
*/
@Component
public class AvailabilityManager {
private final ApplicationEventPublisher eventPublisher;
public AvailabilityManager(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
/**
* Mark application as broken when unrecoverable error detected.
*/
public void markAsBroken() {
AvailabilityChangeEvent.publish(
eventPublisher,
this,
LivenessState.BROKEN
);
System.err.println("Application marked as BROKEN");
}
/**
* Refuse traffic temporarily (e.g., during maintenance).
*/
public void refuseTraffic() {
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
System.out.println("Application refusing traffic");
}
/**
* Accept traffic after maintenance or startup.
*/
public void acceptTraffic() {
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.ACCEPTING_TRAFFIC
);
System.out.println("Application accepting traffic");
}
/**
* Enter maintenance mode: refuse traffic but stay alive.
*/
public void enterMaintenanceMode() {
refuseTraffic();
System.out.println("Entered maintenance mode");
}
/**
* Exit maintenance mode: accept traffic again.
*/
public void exitMaintenanceMode() {
acceptTraffic();
System.out.println("Exited maintenance mode");
}
}
package com.example.monitoring;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.LivenessState;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
/**
* Listens to availability state changes and reacts accordingly.
*/
@Component
public class AvailabilityStateListener {
/**
* React to liveness state changes.
*/
@EventListener
public void onLivenessStateChange(AvailabilityChangeEvent<LivenessState> event) {
LivenessState state = event.getState();
switch (state) {
case CORRECT:
System.out.println("Application liveness: CORRECT");
break;
case BROKEN:
System.err.println("APPLICATION LIVENESS: BROKEN");
System.err.println("Platform should restart this instance");
// Notify monitoring systems
notifyMonitoring("Application marked as BROKEN");
break;
}
}
/**
* React to readiness state changes.
*/
@EventListener
public void onReadinessStateChange(AvailabilityChangeEvent<ReadinessState> event) {
ReadinessState state = event.getState();
switch (state) {
case ACCEPTING_TRAFFIC:
System.out.println("Application readiness: ACCEPTING TRAFFIC");
// Re-register with service discovery
registerWithServiceDiscovery();
break;
case REFUSING_TRAFFIC:
System.out.println("Application readiness: REFUSING TRAFFIC");
// Deregister from service discovery
deregisterFromServiceDiscovery();
break;
}
}
private void notifyMonitoring(String message) {
// Send alert to monitoring system
}
private void registerWithServiceDiscovery() {
// Register with Consul, Eureka, etc.
}
private void deregisterFromServiceDiscovery() {
// Deregister from service discovery
}
}
package com.example.actuator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.availability.ApplicationAvailability;
import org.springframework.boot.availability.LivenessState;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.stereotype.Component;
/**
* Custom health indicators for Kubernetes probes.
*/
@Component("livenessProbe")
public class LivenessHealthIndicator implements HealthIndicator {
private final ApplicationAvailability availability;
public LivenessHealthIndicator(ApplicationAvailability availability) {
this.availability = availability;
}
@Override
public Health health() {
LivenessState state = availability.getLivenessState();
return (state == LivenessState.CORRECT)
? Health.up().withDetail("state", "CORRECT").build()
: Health.down().withDetail("state", "BROKEN").build();
}
}
@Component("readinessProbe")
class ReadinessHealthIndicator implements HealthIndicator {
private final ApplicationAvailability availability;
public ReadinessHealthIndicator(ApplicationAvailability availability) {
this.availability = availability;
}
@Override
public Health health() {
ReadinessState state = availability.getReadinessState();
return (state == ReadinessState.ACCEPTING_TRAFFIC)
? Health.up().withDetail("state", "ACCEPTING_TRAFFIC").build()
: Health.outOfService().withDetail("state", "REFUSING_TRAFFIC").build();
}
}
Kubernetes Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    spec:
      containers:
        - name: app
          image: myapp:latest
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
            timeoutSeconds: 3
package com.example.resilience;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.stereotype.Component;
/**
* Integrates circuit breaker with readiness state.
*/
@Component
public class CircuitBreakerAvailabilityController {
private final ApplicationEventPublisher eventPublisher;
public CircuitBreakerAvailabilityController(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
/**
* Called when circuit breaker opens (too many failures).
* Refuse traffic until circuit closes.
*/
public void onCircuitBreakerOpen(String circuitName) {
System.err.println("Circuit breaker opened: " + circuitName);
// Refuse traffic to prevent cascade failures
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
}
/**
* Called when circuit breaker closes (service recovered).
* Resume accepting traffic.
*/
public void onCircuitBreakerClosed(String circuitName) {
System.out.println("Circuit breaker closed: " + circuitName);
// Accept traffic again
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.ACCEPTING_TRAFFIC
);
}
}
package com.example.lifecycle;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.ContextClosedEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
/**
* Ensures graceful shutdown by refusing traffic before shutdown.
*/
@Component
public class GracefulShutdownManager {
private final ApplicationEventPublisher eventPublisher;
public GracefulShutdownManager(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
/**
* Refuse traffic when context starts closing.
* Gives load balancer time to stop routing requests.
*/
@EventListener
public void onContextClosed(ContextClosedEvent event) {
System.out.println("Context closing - refusing traffic");
// Immediately refuse traffic
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
// Give load balancer time to deregister (5 seconds)
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
System.out.println("Proceeding with shutdown");
}
}
package com.example.database;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import javax.sql.DataSource;
import java.sql.Connection;
/**
* Monitors database connection pool and updates readiness.
*/
@Component
public class DatabaseConnectionMonitor {
private final DataSource dataSource;
private final ApplicationEventPublisher eventPublisher;
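// volatile so the last observed result is visible if the scheduler runs this task on different threads
private volatile boolean wasHealthy = true;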
public DatabaseConnectionMonitor(
DataSource dataSource,
ApplicationEventPublisher eventPublisher) {
this.dataSource = dataSource;
this.eventPublisher = eventPublisher;
}
@Scheduled(fixedRate = 5000) // Check every 5 seconds
public void checkDatabaseHealth() {
boolean isHealthy = canConnectToDatabase();
// Only update state if it changed
if (isHealthy != wasHealthy) {
ReadinessState newState = isHealthy
? ReadinessState.ACCEPTING_TRAFFIC
: ReadinessState.REFUSING_TRAFFIC;
AvailabilityChangeEvent.publish(eventPublisher, this, newState);
System.out.printf("Database health changed: %s -> readiness: %s%n",
isHealthy ? "HEALTHY" : "UNHEALTHY", newState);
wasHealthy = isHealthy;
}
}
private boolean canConnectToDatabase() {
try (Connection conn = dataSource.getConnection()) {
return conn.isValid(3);
} catch (Exception e) {
System.err.println("Database connection check failed: " + e.getMessage());
return false;
}
}
}
package com.example.diagnostics;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.LivenessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
/**
* Detects deadlocks and marks application as broken.
*/
@Component
public class DeadlockDetector {
private final ApplicationEventPublisher eventPublisher;
private final ThreadMXBean threadMXBean;
public DeadlockDetector(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
this.threadMXBean = ManagementFactory.getThreadMXBean();
}
@Scheduled(fixedRate = 60000) // Check every minute
public void detectDeadlocks() {
long[] deadlockedThreads = threadMXBean.findDeadlockedThreads();
if (deadlockedThreads != null && deadlockedThreads.length > 0) {
System.err.println("DEADLOCK DETECTED: " + deadlockedThreads.length + " threads");
// Mark application as broken - requires restart
AvailabilityChangeEvent.publish(
eventPublisher,
this,
LivenessState.BROKEN
);
}
}
}
Complete maintenance mode implementation with scheduled operations:
package com.example.maintenance;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
/**
* Controls application maintenance mode with automatic traffic management.
*/
@Service
public class MaintenanceModeController {
private final ApplicationEventPublisher eventPublisher;
private final AtomicBoolean maintenanceMode = new AtomicBoolean(false);
private final AtomicReference<Instant> maintenanceStartTime = new AtomicReference<>();
private final Duration maxMaintenanceDuration = Duration.ofHours(2);
public MaintenanceModeController(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
/**
* Enter maintenance mode and refuse new traffic.
*
* @param reason reason for maintenance
*/
public void enterMaintenanceMode(String reason) {
if (maintenanceMode.compareAndSet(false, true)) {
System.out.println("Entering maintenance mode: " + reason);
maintenanceStartTime.set(Instant.now());
// Refuse traffic
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
System.out.println("Application is now in maintenance mode");
} else {
System.out.println("Already in maintenance mode");
}
}
/**
* Exit maintenance mode and accept traffic.
*/
public void exitMaintenanceMode() {
if (maintenanceMode.compareAndSet(true, false)) {
System.out.println("Exiting maintenance mode");
maintenanceStartTime.set(null);
// Accept traffic
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.ACCEPTING_TRAFFIC
);
System.out.println("Application is accepting traffic");
} else {
System.out.println("Not in maintenance mode");
}
}
/**
* Check if currently in maintenance mode.
*
* @return true if in maintenance
*/
public boolean isInMaintenanceMode() {
return maintenanceMode.get();
}
/**
* Get duration in maintenance mode.
*
* @return duration or null if not in maintenance
*/
public Duration getMaintenanceDuration() {
Instant startTime = maintenanceStartTime.get();
return startTime != null
? Duration.between(startTime, Instant.now())
: null;
}
/**
* Scheduled check to prevent indefinite maintenance mode.
*/
@Scheduled(fixedRate = 60000) // Check every minute
public void checkMaintenanceTimeout() {
if (!maintenanceMode.get()) {
return;
}
Duration duration = getMaintenanceDuration();
if (duration != null && duration.compareTo(maxMaintenanceDuration) > 0) {
System.err.println("WARNING: Maintenance mode exceeded maximum duration");
System.err.println("Auto-exiting maintenance mode");
exitMaintenanceMode();
}
}
}
Monitor external dependencies and update readiness accordingly:
package com.example.health;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import javax.sql.DataSource;
import java.sql.Connection;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
/**
* Monitors critical dependencies and updates readiness state.
*/
@Service
public class DependencyHealthMonitor {
private final ApplicationEventPublisher eventPublisher;
private final DataSource dataSource;
private final AtomicInteger consecutiveFailures = new AtomicInteger(0);
private static final int FAILURE_THRESHOLD = 3;
private volatile boolean wasReady = true;
public DependencyHealthMonitor(ApplicationEventPublisher eventPublisher,
DataSource dataSource) {
this.eventPublisher = eventPublisher;
this.dataSource = dataSource;
}
@Scheduled(fixedRate = 5000) // Check every 5 seconds
public void checkDependencies() {
List<HealthCheck> checks = performHealthChecks();
boolean allHealthy = checks.stream().allMatch(HealthCheck::isHealthy);
// Update the consecutive-failure count before deciding readiness
int failures;
if (allHealthy) {
consecutiveFailures.set(0);
failures = 0;
} else {
failures = consecutiveFailures.incrementAndGet();
System.err.printf("Health check failed (%d consecutive failures)%n", failures);
}
// Refuse traffic only after FAILURE_THRESHOLD consecutive failures; recover on the first success
boolean isReady = failures < FAILURE_THRESHOLD;
// Only update state if it changed
if (isReady != wasReady) {
updateReadinessState(isReady);
wasReady = isReady;
}
}
private List<HealthCheck> performHealthChecks() {
List<HealthCheck> results = new ArrayList<>();
// Database health check
results.add(checkDatabase());
// Add more dependency checks here
// results.add(checkRedis());
// results.add(checkMessageQueue());
return results;
}
private HealthCheck checkDatabase() {
try (Connection conn = dataSource.getConnection()) {
boolean valid = conn.isValid(3);
return new HealthCheck("database", valid, valid ? null : "Connection invalid");
} catch (Exception e) {
return new HealthCheck("database", false, e.getMessage());
}
}
private void updateReadinessState(boolean ready) {
ReadinessState newState = ready
? ReadinessState.ACCEPTING_TRAFFIC
: ReadinessState.REFUSING_TRAFFIC;
AvailabilityChangeEvent.publish(eventPublisher, this, newState);
System.out.printf("Readiness state changed to: %s%n", newState);
}
record HealthCheck(String name, boolean isHealthy, String error) {}
}
Implement gradual service degradation based on system load:
package com.example.degradation;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
/**
* Implements graceful degradation by refusing traffic under high load.
*/
@Service
public class GracefulDegradationHandler {
private final ApplicationEventPublisher eventPublisher;
private final MemoryMXBean memoryMXBean;
private final OperatingSystemMXBean osMXBean;
private static final double CPU_THRESHOLD = 0.90; // 90%
private static final double MEMORY_THRESHOLD = 0.85; // 85%
private volatile boolean degraded = false;
public GracefulDegradationHandler(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
this.memoryMXBean = ManagementFactory.getMemoryMXBean();
this.osMXBean = ManagementFactory.getOperatingSystemMXBean();
}
@Scheduled(fixedRate = 10000) // Check every 10 seconds
public void monitorSystemResources() {
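// getSystemLoadAverage() is a load average, not a 0-1 fraction, and is negative when unavailable;
// divide by the processor count so it can be compared against CPU_THRESHOLD
double loadAverage = osMXBean.getSystemLoadAverage();
double cpuLoad = loadAverage < 0 ? 0.0 : loadAverage / osMXBean.getAvailableProcessors();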
double memoryUsage = getMemoryUsagePercentage();
boolean shouldDegrade = cpuLoad > CPU_THRESHOLD || memoryUsage > MEMORY_THRESHOLD;
if (shouldDegrade && !degraded) {
enterDegradedMode(cpuLoad, memoryUsage);
} else if (!shouldDegrade && degraded) {
exitDegradedMode();
}
}
private void enterDegradedMode(double cpuLoad, double memoryUsage) {
System.err.printf("Entering degraded mode: CPU=%.2f, Memory=%.2f%%%n",
cpuLoad, memoryUsage * 100);
degraded = true;
// Refuse new traffic to allow system to recover
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
}
private void exitDegradedMode() {
System.out.println("Exiting degraded mode: system resources recovered");
degraded = false;
// Accept traffic again
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.ACCEPTING_TRAFFIC
);
}
private double getMemoryUsagePercentage() {
long used = memoryMXBean.getHeapMemoryUsage().getUsed();
long max = memoryMXBean.getHeapMemoryUsage().getMax();
// getMax() returns -1 when the maximum heap size is undefined
return max > 0 ? (double) used / max : 0.0;
}
public boolean isDegraded() {
return degraded;
}
}
Control when application becomes ready after startup checks:
package com.example.startup;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.ReadinessState;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
/**
* Controls when application becomes ready after startup validation.
*/
@Component
public class StartupReadinessGate {
private final ApplicationEventPublisher eventPublisher;
public StartupReadinessGate(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
@EventListener
public void onApplicationReady(ApplicationReadyEvent event) {
System.out.println("Application started, running readiness checks...");
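// Initially refuse traffic.
// Caveat: Spring Boot publishes ReadinessState.ACCEPTING_TRAFFIC right after ApplicationReadyEvent,
// so this state is overridden once this listener returns. Checks that must hold back traffic
// are better run in an ApplicationRunner or CommandLineRunner, which execute before that event.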
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
// Run readiness checks asynchronously
CompletableFuture.runAsync(() -> performReadinessChecks())
.orTimeout(30, TimeUnit.SECONDS)
.thenRun(() -> markAsReady())
.exceptionally(throwable -> {
System.err.println("Readiness checks failed: " + throwable.getMessage());
return null;
});
}
private void performReadinessChecks() {
List<ReadinessCheck> checks = new ArrayList<>();
// Add various readiness checks
checks.add(new DatabaseReadinessCheck());
checks.add(new CacheWarmupCheck());
checks.add(new ConfigurationValidationCheck());
for (ReadinessCheck check : checks) {
System.out.println("Running check: " + check.getName());
if (!check.execute()) {
throw new IllegalStateException("Check failed: " + check.getName());
}
}
System.out.println("All readiness checks passed");
}
private void markAsReady() {
System.out.println("Application is now ready to accept traffic");
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.ACCEPTING_TRAFFIC
);
}
interface ReadinessCheck {
String getName();
boolean execute();
}
static class DatabaseReadinessCheck implements ReadinessCheck {
@Override
public String getName() {
return "Database Connection";
}
@Override
public boolean execute() {
// Check database connection
return true;
}
}
static class CacheWarmupCheck implements ReadinessCheck {
@Override
public String getName() {
return "Cache Warmup";
}
@Override
public boolean execute() {
// Warm up caches
return true;
}
}
static class ConfigurationValidationCheck implements ReadinessCheck {
@Override
public String getName() {
return "Configuration Validation";
}
@Override
public boolean execute() {
// Validate configuration
return true;
}
}
}
Detect and recover from broken liveness state:
package com.example.recovery;
import org.springframework.boot.availability.ApplicationAvailability;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.LivenessState;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
/**
* Attempts automatic recovery when liveness state becomes broken.
*/
@Component
public class AutomaticRecoveryManager {
private final ApplicationAvailability availability;
private final ApplicationEventPublisher eventPublisher;
private final ScheduledExecutorService scheduler;
private volatile Instant brokenSince;
private static final Duration MAX_RECOVERY_TIME = Duration.ofMinutes(5);
public AutomaticRecoveryManager(ApplicationAvailability availability,
ApplicationEventPublisher eventPublisher) {
this.availability = availability;
this.eventPublisher = eventPublisher;
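// Daemon thread so a pending retry cannot keep the JVM alive after shutdown
this.scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
Thread thread = new Thread(runnable, "availability-recovery");
thread.setDaemon(true);
return thread;
});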
}
@EventListener
public void onLivenessStateChange(AvailabilityChangeEvent<LivenessState> event) {
if (event.getState() == LivenessState.BROKEN) {
handleBrokenState();
} else if (event.getState() == LivenessState.CORRECT) {
handleCorrectState();
}
}
private void handleBrokenState() {
System.err.println("Liveness state is BROKEN, attempting recovery...");
brokenSince = Instant.now();
// Schedule recovery attempts
scheduler.schedule(
this::attemptRecovery,
10,
TimeUnit.SECONDS
);
}
private void handleCorrectState() {
if (brokenSince != null) {
Duration downtime = Duration.between(brokenSince, Instant.now());
System.out.println("Recovered from broken state after: " + downtime);
brokenSince = null;
}
}
private void attemptRecovery() {
if (brokenSince == null) {
return; // Already recovered
}
Duration downtime = Duration.between(brokenSince, Instant.now());
if (downtime.compareTo(MAX_RECOVERY_TIME) > 0) {
System.err.println("Recovery failed, exceeded maximum recovery time");
return;
}
System.out.println("Attempting to recover application state...");
boolean recovered = performRecoveryActions();
if (recovered) {
System.out.println("Recovery successful, marking as CORRECT");
AvailabilityChangeEvent.publish(
eventPublisher,
this,
LivenessState.CORRECT
);
} else {
System.err.println("Recovery attempt failed, will retry");
scheduler.schedule(
this::attemptRecovery,
30,
TimeUnit.SECONDS
);
}
}
private boolean performRecoveryActions() {
try {
// Clear caches
clearCaches();
// Reset connection pools
resetConnectionPools();
// Verify critical services
return verifyCriticalServices();
} catch (Exception e) {
System.err.println("Recovery action failed: " + e.getMessage());
return false;
}
}
private void clearCaches() {
System.out.println("Clearing caches...");
// Implementation
}
private void resetConnectionPools() {
System.out.println("Resetting connection pools...");
// Implementation
}
private boolean verifyCriticalServices() {
System.out.println("Verifying critical services...");
// Implementation
return true;
}
}
Problem: Using liveness probes for temporary issues that should use readiness
Error: Kubernetes restarts pod unnecessarily, causing cascading failures
Solution:
// WRONG: Marking as BROKEN for temporary database issue
if (!canConnectToDatabase()) {
AvailabilityChangeEvent.publish(eventPublisher, this, LivenessState.BROKEN);
}
// CORRECT: Use readiness for temporary issues
if (!canConnectToDatabase()) {
AvailabilityChangeEvent.publish(eventPublisher, this, ReadinessState.REFUSING_TRAFFIC);
}
// Use BROKEN only for truly unrecoverable situations
if (deadlockDetected() || corruptedMemoryState()) {
AvailabilityChangeEvent.publish(eventPublisher, this, LivenessState.BROKEN);
}
Rationale: A BROKEN liveness state makes the liveness probe fail, which triggers a pod restart. Use readiness to stop traffic temporarily while keeping application state. Reserve liveness for situations where a restart is the only remedy.
Problem: Injecting ApplicationAvailability in older Spring Boot versions
Error:
NoSuchBeanDefinitionException: No qualifying bean of type 'org.springframework.boot.availability.ApplicationAvailability'
Solution:
// Inject through ObjectProvider (org.springframework.beans.factory.ObjectProvider)
// so a missing bean does not fail startup. @Autowired(required = false) on a single
// constructor parameter is ignored by the core container and does not make it optional.
@Component
public class AvailabilityAwareService {
private final ApplicationAvailability availability;
public AvailabilityAwareService(
ObjectProvider<ApplicationAvailability> availabilityProvider) {
// Null when no ApplicationAvailability bean is registered
this.availability = availabilityProvider.getIfAvailable();
}
public boolean isApplicationReady() {
if (availability == null) {
// Fallback for older Spring Boot versions
return true;
}
return availability.getReadinessState() == ReadinessState.ACCEPTING_TRAFFIC;
}
}
Rationale: ApplicationAvailability was added in Spring Boot 2.3.0. Applications supporting older versions must handle its absence gracefully.
Problem: Availability state not exposed via health endpoints
Error: /actuator/health/liveness and /actuator/health/readiness return 404
Solution:
<!-- Add actuator dependency -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
# Enable health endpoints
management.endpoint.health.probes.enabled=true
management.health.livenessstate.enabled=true
management.health.readinessstate.enabled=true
Rationale: Availability state is automatically exposed via actuator health endpoints, but only when actuator is present and probes are enabled.
Problem: Rapid state changes between ACCEPTING and REFUSING traffic
Error: Load balancer constantly adds/removes instance, causing instability
Solution:
// Implement hysteresis with threshold
public class StableAvailabilityManager {
private final AtomicInteger consecutiveHealthyChecks = new AtomicInteger(0);
private final AtomicInteger consecutiveUnhealthyChecks = new AtomicInteger(0);
private static final int HEALTHY_THRESHOLD = 3;
private static final int UNHEALTHY_THRESHOLD = 2;
public void checkHealth() {
boolean healthy = performHealthCheck();
if (healthy) {
consecutiveUnhealthyChecks.set(0);
if (consecutiveHealthyChecks.incrementAndGet() >= HEALTHY_THRESHOLD) {
transitionToReady();
}
} else {
consecutiveHealthyChecks.set(0);
if (consecutiveUnhealthyChecks.incrementAndGet() >= UNHEALTHY_THRESHOLD) {
transitionToNotReady();
}
}
}
}
Rationale: A single failed health check shouldn't immediately change state. Require multiple consecutive results to prevent flapping and give transient issues time to resolve.
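The fragment above leaves performHealthCheck(), transitionToReady(), and transitionToNotReady() undefined. As a minimal, framework-free sketch of the same thresholding idea, the hypothetical class below (ReadinessHysteresis is an illustrative name, not a Spring Boot type) tracks consecutive results and reports when its ready flag flips. It assumes the caller publishes the matching AvailabilityChangeEvent whenever record() returns true.

```java
/**
 * Hypothetical helper (not part of Spring Boot): flips a ready flag only after
 * a run of consecutive identical health-check results.
 */
class ReadinessHysteresis {
    private final int healthyThreshold;
    private final int unhealthyThreshold;
    private int consecutiveHealthy;
    private int consecutiveUnhealthy;
    private boolean ready;

    ReadinessHysteresis(int healthyThreshold, int unhealthyThreshold, boolean initiallyReady) {
        if (healthyThreshold < 1 || unhealthyThreshold < 1) {
            throw new IllegalArgumentException("thresholds must be >= 1");
        }
        this.healthyThreshold = healthyThreshold;
        this.unhealthyThreshold = unhealthyThreshold;
        this.ready = initiallyReady;
    }

    /** Records one check result and returns true only if the ready flag changed. */
    synchronized boolean record(boolean healthy) {
        if (healthy) {
            consecutiveUnhealthy = 0;
            // Cap the counter at the threshold so it can never overflow.
            consecutiveHealthy = Math.min(consecutiveHealthy + 1, healthyThreshold);
            if (!ready && consecutiveHealthy >= healthyThreshold) {
                ready = true;
                return true;
            }
        } else {
            consecutiveHealthy = 0;
            consecutiveUnhealthy = Math.min(consecutiveUnhealthy + 1, unhealthyThreshold);
            if (ready && consecutiveUnhealthy >= unhealthyThreshold) {
                ready = false;
                return true;
            }
        }
        return false;
    }

    synchronized boolean isReady() {
        return ready;
    }
}
```

Returning true only on an actual flip means repeated healthy results do not republish ACCEPTING_TRAFFIC on every check once the instance is already ready.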
Problem: Not refusing traffic during shutdown
Error: In-flight requests aborted, clients receive connection errors
Solution:
@Component
public class GracefulShutdownListener {
private final ApplicationEventPublisher eventPublisher;
// Constructor injection: the final field must be initialized
public GracefulShutdownListener(ApplicationEventPublisher eventPublisher) {
this.eventPublisher = eventPublisher;
}
@PreDestroy
public void onShutdown() {
System.out.println("Shutdown initiated, refusing traffic");
// Immediately refuse traffic
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
// Give load balancer time to deregister
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
Rationale: Applications must refuse traffic before shutdown to allow load balancers to stop routing requests, preventing connection errors.
Problem: Long-running operations in availability event listeners
Error: State change blocks application, causing timeouts
Solution:
// WRONG: Blocking operation in event listener
@EventListener
public void onReadinessChange(AvailabilityChangeEvent<ReadinessState> event) {
if (event.getState() == ReadinessState.REFUSING_TRAFFIC) {
// This blocks event publishing!
performLengthyCleanup();
}
}
// CORRECT: Use async execution (requires @EnableAsync on a configuration class)
@EventListener
@Async
public void onReadinessChange(AvailabilityChangeEvent<ReadinessState> event) {
if (event.getState() == ReadinessState.REFUSING_TRAFFIC) {
performLengthyCleanup();
}
}
// Or use executor
@EventListener
public void onReadinessChange(AvailabilityChangeEvent<ReadinessState> event) {
if (event.getState() == ReadinessState.REFUSING_TRAFFIC) {
CompletableFuture.runAsync(this::performLengthyCleanup);
}
}
Rationale: Event listeners execute synchronously in the publisher's thread by default. Long operations block state changes and can cause cascading delays.
Problem: Readiness check hangs indefinitely
Error: Kubernetes never marks pod as ready, deployment stuck
Solution:
public boolean checkReadiness() {
CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
return performHealthCheck();
});
try {
// Set timeout to prevent hanging
return future.get(5, TimeUnit.SECONDS);
} catch (TimeoutException e) {
System.err.println("Health check timed out");
return false;
} catch (Exception e) {
System.err.println("Health check failed: " + e.getMessage());
return false;
}
}
Rationale: Health checks that hang prevent proper availability state management. Always set timeouts to fail fast rather than hanging indefinitely.
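One caveat with the snippet above: when future.get() times out, the task started by CompletableFuture.supplyAsync keeps running, and CompletableFuture.cancel does not interrupt it. The sketch below shows one possible alternative that runs the check on a dedicated ExecutorService so the worker thread can be interrupted on timeout. BoundedCheckRunner and its method names are hypothetical, and the check itself must respond to interruption for the cancel to have any effect.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: runs a health check with a hard upper bound on wait time. */
class BoundedCheckRunner implements AutoCloseable {
    // Daemon threads so a stuck check cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "readiness-check");
        thread.setDaemon(true);
        return thread;
    });

    /** Returns the check result, or false if it times out, fails, or is interrupted. */
    boolean run(BooleanSupplier check, Duration timeout) {
        Callable<Boolean> task = check::getAsBoolean;
        Future<Boolean> future = executor.submit(task);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread running the check
            return false;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        } catch (ExecutionException e) {
            return false; // the check itself threw an exception
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A caller might invoke runner.run(this::performHealthCheck, Duration.ofSeconds(5)) and treat a false result the same way as a failed check.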
Problem: No visibility into state transitions
Error: Cannot diagnose why pod was restarted or traffic stopped
Solution:
@Component
public class AvailabilityStateLogger {
private static final Logger log = LoggerFactory.getLogger(AvailabilityStateLogger.class);
@EventListener
public void onLivenessChange(AvailabilityChangeEvent<LivenessState> event) {
log.warn("Liveness state changed to: {} by {}",
event.getState(),
event.getSource().getClass().getSimpleName());
if (event.getState() == LivenessState.BROKEN) {
log.error("APPLICATION MARKED AS BROKEN - POD WILL BE RESTARTED");
logThreadDump();
}
}
@EventListener
public void onReadinessChange(AvailabilityChangeEvent<ReadinessState> event) {
log.info("Readiness state changed to: {} by {}",
event.getState(),
event.getSource().getClass().getSimpleName());
}
}
Rationale: State transitions are critical events. Always log them with context to enable post-mortem analysis of outages.
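The logger above calls logThreadDump() without defining it. One possible body, sketched here with the JDK's standard ThreadMXBean API, collects a snapshot of all live threads and the number of deadlocked ones. ThreadDumps is a hypothetical helper name, and the output format is simply whatever ThreadInfo.toString() produces, which truncates long stack traces.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: one possible implementation behind a logThreadDump() method. */
final class ThreadDumps {
    private ThreadDumps() {
    }

    /** Returns a textual snapshot of all live threads, preceded by a deadlock count. */
    static String capture() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder dump = new StringBuilder();

        // Both find* methods return null when no deadlock is detected.
        long[] deadlocked = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        dump.append("Deadlocked threads: ")
            .append(deadlocked == null ? 0 : deadlocked.length)
            .append(System.lineSeparator());

        // Only request lock details that this JVM reports it can provide.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            if (info != null) {
                dump.append(info); // ThreadInfo.toString() includes a truncated stack trace
            }
        }
        return dump.toString();
    }
}
```

Inside the listener, logThreadDump() could then be a one-liner such as log.error("Thread dump at BROKEN transition:\n{}", ThreadDumps.capture()).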
Problem: Probes checking wrong endpoints or with bad timing
Error: Pods restarted too aggressively or not at all
Solution:
# Correct Kubernetes probe configuration
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30  # Wait for startup
  periodSeconds: 10        # Check every 10s
  failureThreshold: 3      # Fail after 3 consecutive failures
  timeoutSeconds: 5        # Timeout after 5s
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10  # Can check sooner
  periodSeconds: 5         # Check more frequently
  failureThreshold: 3      # Stop traffic after 3 failures
  successThreshold: 1      # Resume traffic after 1 success
  timeoutSeconds: 3
Rationale: Probe timing must balance responsiveness with stability. Probes that are too aggressive cause unnecessary restarts; probes that are too lenient delay recovery.
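To make the trade-off concrete, the sketch below estimates how long a fault can go unnoticed for a given probe configuration. It is a rough approximation, not a Kubernetes formula: it assumes the fault starts just after a successful probe, that every later probe runs to its full timeout, and it ignores kubelet scheduling jitter. ProbeTiming is a hypothetical helper name.

```java
/** Hypothetical helper for reasoning about probe timing; not a Kubernetes or Spring API. */
final class ProbeTiming {
    private ProbeTiming() {
    }

    /**
     * Rough upper bound, in seconds, between the onset of a fault and the probe
     * being declared failed: failureThreshold probe periods plus one timeout.
     */
    static int worstCaseDetectionSeconds(int periodSeconds, int failureThreshold, int timeoutSeconds) {
        if (periodSeconds < 1 || failureThreshold < 1 || timeoutSeconds < 1) {
            throw new IllegalArgumentException("all probe settings must be >= 1");
        }
        return periodSeconds * failureThreshold + timeoutSeconds;
    }
}
```

With the values from the configuration above, ProbeTiming.worstCaseDetectionSeconds(10, 3, 5) gives 35 for the liveness probe and ProbeTiming.worstCaseDetectionSeconds(5, 3, 3) gives 18 for the readiness probe. Under these assumptions traffic stops roughly twice as fast as a restart is triggered.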
Problem: Availability logic not tested, fails in production
Error: Application doesn't properly transition states under load
Solution:
@SpringBootTest
class AvailabilityTest {
@Autowired
private ApplicationAvailability availability;
@Autowired
private ApplicationEventPublisher eventPublisher;
@Test
void shouldRefuseTrafficWhenMarkedNotReady() {
// Given: Application is ready
assertEquals(ReadinessState.ACCEPTING_TRAFFIC,
availability.getReadinessState());
// When: Mark as not ready
AvailabilityChangeEvent.publish(
eventPublisher,
this,
ReadinessState.REFUSING_TRAFFIC
);
// Then: State should change
assertEquals(ReadinessState.REFUSING_TRAFFIC,
availability.getReadinessState());
}
@Test
void shouldMarkAsBrokenOnDeadlock() {
// Test liveness state transitions
DeadlockDetector detector = new DeadlockDetector(eventPublisher);
// Simulate deadlock
detector.onDeadlockDetected();
// Verify state
assertEquals(LivenessState.BROKEN,
availability.getLivenessState());
}
}
Rationale: Availability transitions are critical to application reliability. Test them thoroughly to ensure they work correctly under various failure scenarios.
All availability components are fully thread-safe.
// Availability core
import org.springframework.boot.availability.ApplicationAvailability;
import org.springframework.boot.availability.AvailabilityChangeEvent;
import org.springframework.boot.availability.AvailabilityState;
import org.springframework.boot.availability.LivenessState;
import org.springframework.boot.availability.ReadinessState;
// Spring context
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.EventListener;
// Actuator (for health indicators)
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;