tessl/maven-io-dropwizard-metrics--metrics-core

Comprehensive metrics collection and monitoring library providing counters, gauges, histograms, meters, and timers for Java applications.

Reservoirs and Sampling

Reservoirs are the sampling strategies used by histograms and timers to manage memory usage while preserving statistical accuracy. They determine which values are kept for statistical analysis and which are discarded, implementing various algorithms to maintain representative samples of potentially unlimited data streams.

Reservoir Interface

All reservoir implementations conform to the Reservoir interface, providing a consistent API for value storage and statistical snapshot generation.

public interface Reservoir {
    // Current number of values stored
    int size();
    
    // Add a new value to the reservoir
    void update(long value);
    
    // Get statistical snapshot of current values
    Snapshot getSnapshot();
}

UniformReservoir

UniformReservoir implements uniform random sampling using Vitter's Algorithm R, ensuring that all values have an equal probability of being retained regardless of when they were recorded. This provides an unbiased sample of the entire data stream.

public class UniformReservoir implements Reservoir {
    // Constructors
    public UniformReservoir();              // Default size: 1028 samples
    public UniformReservoir(int size);      // Custom sample size
    
    // Reservoir interface implementation
    public int size();
    public void update(long value);
    public Snapshot getSnapshot();
}
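The Algorithm R sampling described above can be sketched in a few lines. This is a simplified, single-threaded illustration, not the library's implementation; `AlgorithmRSketch` and its members are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: a simplified, single-threaded Algorithm R sampler.
// AlgorithmRSketch is a hypothetical name, not a metrics-core class.
final class AlgorithmRSketch {
    private final long[] samples;
    private long seen; // total number of values offered so far

    AlgorithmRSketch(int capacity) {
        this.samples = new long[capacity];
    }

    void update(long value) {
        seen++;
        if (seen <= samples.length) {
            // Fill phase: keep every value until the array is full.
            samples[(int) (seen - 1)] = value;
        } else {
            // Replacement phase: draw a slot in [0, seen). The new value is kept
            // only if the slot lies inside the array, i.e. with probability
            // capacity / seen, which keeps every value equally likely to survive.
            long slot = ThreadLocalRandom.current().nextLong(seen);
            if (slot < samples.length) {
                samples[(int) slot] = value;
            }
        }
    }

    int size() {
        return (int) Math.min(seen, samples.length);
    }

    long[] values() {
        return Arrays.copyOf(samples, size());
    }
}
```

Because the retention probability shrinks as more values arrive, the sample stays bounded at `capacity` entries no matter how long the stream runs.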

Usage Examples

Basic Uniform Sampling:

// Default uniform reservoir (1028 samples)
Histogram responseTimeHistogram = new Histogram(new UniformReservoir());
registry.register("api.response.time", responseTimeHistogram);

// Custom size uniform reservoir
Histogram customHistogram = new Histogram(new UniformReservoir(500));
registry.register("custom.measurements", customHistogram);

When to Use UniformReservoir:

  • When you need unbiased sampling across the entire data stream
  • For long-running applications where historical data is as important as recent data
  • When statistical accuracy across all time periods is critical
  • For baseline measurements where time-based weighting isn't relevant

// Example: Measuring file sizes across entire application lifetime
UniformReservoir fileSizes = new UniformReservoir(2000);
Histogram fileSizeHistogram = new Histogram(fileSizes);

// All file sizes have equal probability of being sampled
for (File file : getAllApplicationFiles()) {
    fileSizeHistogram.update(file.length());
}

ExponentiallyDecayingReservoir

ExponentiallyDecayingReservoir implements time-biased sampling that favors recent values over older ones using an exponential decay algorithm. This reservoir is ideal for applications where recent behavior is more indicative of current system state.

public class ExponentiallyDecayingReservoir implements Reservoir {
    // Constructors
    public ExponentiallyDecayingReservoir();                           // Default: 1028 samples, α=0.015
    public ExponentiallyDecayingReservoir(int size, double alpha);     // Custom size and decay factor
    public ExponentiallyDecayingReservoir(int size, double alpha, Clock clock);  // Custom clock for testing
    
    // Reservoir interface implementation
    public int size();
    public void update(long value);
    public Snapshot getSnapshot();
    
    // Extended functionality
    public void update(long value, long timestamp);  // Update with explicit timestamp (epoch seconds)
}

Usage Examples

Default Time-Biased Sampling:

// Default exponentially decaying reservoir
Histogram requestLatency = new Histogram(new ExponentiallyDecayingReservoir());
Timer requestTimer = new Timer(new ExponentiallyDecayingReservoir());

// Recent values are more likely to be included in statistics
requestLatency.update(responseTime);
requestTimer.update(processingTime, TimeUnit.MILLISECONDS);

Custom Decay Parameters:

// Faster decay (α=0.1) - emphasizes very recent values  
ExponentiallyDecayingReservoir fastDecay = new ExponentiallyDecayingReservoir(1000, 0.1);
Histogram recentErrorRates = new Histogram(fastDecay);

// Slower decay (α=0.005) - maintains more historical influence
ExponentiallyDecayingReservoir slowDecay = new ExponentiallyDecayingReservoir(1000, 0.005);
Histogram backgroundProcessingTimes = new Histogram(slowDecay);

Explicit Timestamp Updates:

ExponentiallyDecayingReservoir timestampedReservoir = new ExponentiallyDecayingReservoir();

// Update with explicit timestamp in epoch seconds (useful for processing historical data)
long eventTimestamp = getEventTimestamp();
long eventValue = getEventValue();
timestampedReservoir.update(eventValue, eventTimestamp);

Understanding Alpha (Decay Factor):

// Decay is driven by elapsed time, not by the number of updates.
// α = 0.015 (default): Values lose ~50% weight after ~46 seconds
// α = 0.1:            Values lose ~50% weight after ~7 seconds
// α = 0.001:          Values lose ~50% weight after ~693 seconds

// Shorter memory (half-life ≈ 14 s): suits high-frequency metrics (many updates per second)
ExponentiallyDecayingReservoir highFreq = new ExponentiallyDecayingReservoir(1000, 0.05);

// Longer memory (half-life ≈ 11.5 min): suits low-frequency metrics (few updates per minute)
ExponentiallyDecayingReservoir lowFreq = new ExponentiallyDecayingReservoir(1000, 0.001);
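To reason about a candidate α, it can help to compute its half-life directly. The sketch below assumes that a sample's relative weight decays as exp(-α · age), with age measured in seconds, so the half-life is ln(2) / α. `DecayMath` is a hypothetical helper introduced for illustration, not part of metrics-core.

```java
// Illustrative helper; DecayMath is a hypothetical name, not part of metrics-core.
final class DecayMath {
    private DecayMath() {}

    // Relative weight of a sample that is ageSeconds old compared with a fresh
    // one, assuming weights decay as exp(-alpha * age).
    static double relativeWeight(double alpha, double ageSeconds) {
        return Math.exp(-alpha * ageSeconds);
    }

    // Age in seconds at which the relative weight has dropped to one half.
    static double halfLifeSeconds(double alpha) {
        return Math.log(2.0) / alpha;
    }
}
```

For example, `DecayMath.halfLifeSeconds(0.015)` is roughly 46.2, and `DecayMath.halfLifeSeconds(0.001)` is roughly 693.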

SlidingWindowReservoir

SlidingWindowReservoir maintains a fixed-size window of the most recent N values, providing a simple "last N samples" sampling strategy. When full, new values replace the oldest values in the window.

public class SlidingWindowReservoir implements Reservoir {
    // Constructor
    public SlidingWindowReservoir(int size);
    
    // Reservoir interface implementation
    public int size();
    public void update(long value);
    public Snapshot getSnapshot();
}
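The "last N values" behaviour can be pictured as a ring buffer in which each new value overwrites the oldest slot. The following is a minimal, single-threaded sketch under that assumption; `LastNSketch` is a hypothetical name and is not how the library class is necessarily written.

```java
import java.util.Arrays;

// Illustrative only: a single-threaded "last N values" ring buffer.
// LastNSketch is a hypothetical name, not a metrics-core class.
final class LastNSketch {
    private final long[] window;
    private long count; // total number of updates so far

    LastNSketch(int size) {
        this.window = new long[size];
    }

    void update(long value) {
        // The slot index wraps around, so update N+1 overwrites update 1.
        window[(int) (count % window.length)] = value;
        count++;
    }

    int size() {
        return (int) Math.min(count, window.length);
    }

    // Returns the retained values in slot order (not insertion order).
    long[] values() {
        return Arrays.copyOf(window, size());
    }
}
```

Memory use is fixed at construction time, which is why this strategy is easy to budget for.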

Usage Examples

Fixed-Size Recent Sample Window:

// Keep last 100 response times
SlidingWindowReservoir recentResponses = new SlidingWindowReservoir(100);
Histogram responseHistogram = new Histogram(recentResponses);

// Keep last 50 error counts
SlidingWindowReservoir recentErrors = new SlidingWindowReservoir(50);
Histogram errorHistogram = new Histogram(recentErrors);

When to Use SlidingWindowReservoir:

  • When you need statistics for exactly the last N values
  • For debugging or development environments where recent behavior is most relevant
  • When memory usage needs to be precisely controlled
  • For systems with predictable, steady-state behavior

// Example: Monitoring last 200 database query times
SlidingWindowReservoir queryTimes = new SlidingWindowReservoir(200);
Timer dbQueryTimer = new Timer(queryTimes);

// Statistics always represent the last 200 queries
for (Query query : queries) {
    Timer.Context context = dbQueryTimer.time();
    executeQuery(query);
    context.stop();
}

// Get statistics for last 200 queries only
Snapshot snapshot = dbQueryTimer.getSnapshot();
double avgLast200 = snapshot.getMean();

SlidingTimeWindowReservoir

SlidingTimeWindowReservoir maintains all values recorded within a specific time window (e.g., last 5 minutes), automatically discarding values that fall outside the time window. This provides time-based rather than count-based sampling.

public class SlidingTimeWindowReservoir implements Reservoir {
    // Constructors
    public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit);
    public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit, Clock clock);
    
    // Reservoir interface implementation
    public int size();
    public void update(long value);
    public Snapshot getSnapshot();
}
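The trimming behaviour described above can be sketched with a queue of timestamped values that drops entries older than the window. This is a simplified, single-threaded illustration that takes the current time as an explicit argument and assumes timestamps never decrease; `TimeWindowSketch` is a hypothetical name, not the library's implementation.

```java
import java.util.ArrayDeque;
import java.util.concurrent.TimeUnit;

// Illustrative only: a single-threaded time-window buffer.
// TimeWindowSketch is a hypothetical name, not a metrics-core class.
final class TimeWindowSketch {
    private final long windowNanos;
    // Each entry is {timestampNanos, value}, oldest first.
    private final ArrayDeque<long[]> entries = new ArrayDeque<>();

    TimeWindowSketch(long window, TimeUnit unit) {
        this.windowNanos = unit.toNanos(window);
    }

    // Assumes nowNanos is non-decreasing across calls.
    void update(long value, long nowNanos) {
        trim(nowNanos);
        entries.addLast(new long[] {nowNanos, value});
    }

    int size(long nowNanos) {
        trim(nowNanos);
        return entries.size();
    }

    // Drop every entry recorded before (now - window).
    private void trim(long nowNanos) {
        long cutoff = nowNanos - windowNanos;
        while (!entries.isEmpty() && entries.peekFirst()[0] < cutoff) {
            entries.removeFirst();
        }
    }
}
```

Unlike the count-based window, the number of retained entries here depends entirely on how many updates arrive within the window, so memory use grows with traffic.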

Usage Examples

Time-Based Sampling Windows:

// Keep all values from last 5 minutes
SlidingTimeWindowReservoir last5Minutes = new SlidingTimeWindowReservoir(5, TimeUnit.MINUTES);
Histogram recentActivity = new Histogram(last5Minutes);

// Keep all values from last 30 seconds
SlidingTimeWindowReservoir last30Seconds = new SlidingTimeWindowReservoir(30, TimeUnit.SECONDS);
Timer recentRequests = new Timer(last30Seconds);

Real-Time Monitoring:

// Monitor error rates in real-time (last 1 minute)
SlidingTimeWindowReservoir errorWindow = new SlidingTimeWindowReservoir(1, TimeUnit.MINUTES);
Histogram errorRateHistogram = new Histogram(errorWindow);

// All statistics reflect only the last minute of activity
errorRateHistogram.update(errorCount);
Snapshot recentErrors = errorRateHistogram.getSnapshot();
System.out.println("Samples recorded in last minute: " + recentErrors.size());

Custom Clock for Testing:

// Controllable clock for testing time-based sampling.
// A named subclass is used so that advance() is callable from test code.
class ManualClock extends Clock {
    private long currentTimeMillis = 0;
    @Override public long getTick() { return TimeUnit.MILLISECONDS.toNanos(currentTimeMillis); }
    @Override public long getTime() { return currentTimeMillis; }
    public void advance(long millis) { currentTimeMillis += millis; }
}

ManualClock testClock = new ManualClock();
SlidingTimeWindowReservoir testReservoir = 
    new SlidingTimeWindowReservoir(10, TimeUnit.SECONDS, testClock);

// Add values and advance time for deterministic testing
testReservoir.update(100);
testClock.advance(5000);  // Advance 5 seconds
testReservoir.update(200);
testClock.advance(6000);  // Advance 6 more seconds (11 total)

// First value should be expired, only second value remains
assertEquals(1, testReservoir.size());

Statistical Snapshots

All reservoirs provide statistical snapshots through the Snapshot class, which offers comprehensive statistical analysis of the sampled values.

public abstract class Snapshot {
    // Quantile access
    public abstract double getValue(double quantile);  // 0.0 to 1.0
    public double getMedian();                         // 50th percentile
    public double get75thPercentile();
    public double get95thPercentile();
    public double get98thPercentile();
    public double get99thPercentile();
    public double get999thPercentile();
    
    // Descriptive statistics
    public abstract long[] getValues();               // All sampled values
    public abstract int size();                       // Number of values
    public abstract long getMax();
    public abstract long getMin(); 
    public abstract double getMean();
    public abstract double getStdDev();
    
    // Utility
    public abstract void dump(OutputStream output);   // Export values
}
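To make the quantile accessors concrete, here is a sketch of one common way to compute a quantile from a set of sampled values: sort them, locate the position q · (n + 1), and interpolate linearly between the two neighbouring values. The interpolation rule is an assumption chosen for illustration, and `QuantileSketch` is a hypothetical name, not a metrics-core class.

```java
import java.util.Arrays;

// Illustrative only: quantile lookup with linear interpolation.
// QuantileSketch is a hypothetical name, not a metrics-core class.
final class QuantileSketch {
    private QuantileSketch() {}

    static double quantile(long[] samples, double q) {
        if (Double.isNaN(q) || q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException(q + " is not in [0..1]");
        }
        if (samples.length == 0) {
            return 0.0;
        }
        long[] sorted = Arrays.copyOf(samples, samples.length);
        Arrays.sort(sorted);

        // 1-based fractional position of the requested quantile.
        double pos = q * (sorted.length + 1);
        int index = (int) pos;
        if (index < 1) {
            return sorted[0];                 // below the first sample
        }
        if (index >= sorted.length) {
            return sorted[sorted.length - 1]; // at or beyond the last sample
        }
        double lower = sorted[index - 1];
        double upper = sorted[index];
        // Interpolate between the two neighbouring samples.
        return lower + (pos - Math.floor(pos)) * (upper - lower);
    }
}
```

With the samples `{1, 2, 3, 4, 5}`, `quantile(samples, 0.5)` yields 3.0 and `quantile(samples, 0.75)` yields 4.5.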

Snapshot Usage Examples

Histogram histogram = new Histogram(new ExponentiallyDecayingReservoir());
Random random = new Random();  // java.util.Random, used to generate sample data

// Record some values
for (int i = 0; i < 1000; i++) {
    histogram.update(random.nextInt(1000));
}

// Get comprehensive statistics
Snapshot snapshot = histogram.getSnapshot();

System.out.printf("Sample count: %d%n", snapshot.size());
System.out.printf("Min: %d, Max: %d%n", snapshot.getMin(), snapshot.getMax());
System.out.printf("Mean: %.2f, StdDev: %.2f%n", snapshot.getMean(), snapshot.getStdDev());
System.out.printf("Median: %.2f%n", snapshot.getMedian());
System.out.printf("95th percentile: %.2f%n", snapshot.get95thPercentile());
System.out.printf("99th percentile: %.2f%n", snapshot.get99thPercentile());

// Custom quantiles
double q90 = snapshot.getValue(0.90);    // 90th percentile
double q999 = snapshot.getValue(0.999);  // 99.9th percentile

// Export all values to file
try (FileOutputStream fos = new FileOutputStream("histogram-values.txt")) {
    snapshot.dump(fos);
}

Choosing the Right Reservoir

Decision Matrix

| Use Case | Recommended Reservoir | Reason |
|---|---|---|
| Long-running application metrics | ExponentiallyDecayingReservoir | Emphasizes recent behavior while maintaining historical context |
| Real-time monitoring dashboard | SlidingTimeWindowReservoir | Provides exact time-based windows for real-time analysis |
| Debug/development environments | SlidingWindowReservoir | Simple, predictable sampling of recent values |
| Baseline/benchmark measurements | UniformReservoir | Unbiased sampling across entire measurement period |
| High-frequency metrics | ExponentiallyDecayingReservoir with higher α | Faster decay to emphasize very recent values |
| Low-frequency metrics | ExponentiallyDecayingReservoir with lower α | Slower decay to maintain longer historical influence |

Memory Usage Considerations

// Memory usage comparison for different reservoirs:

// Fixed memory: 1028 * 8 bytes = ~8KB (plus small overhead)
UniformReservoir uniform = new UniformReservoir();

// Bounded memory: at most 1028 entries, each holding a value and a weight
// (1028 * (8 + 8) bytes = ~16KB) plus per-entry object overhead
ExponentiallyDecayingReservoir exponential = new ExponentiallyDecayingReservoir();

// Fixed memory: 100 * 8 bytes = ~800 bytes
SlidingWindowReservoir sliding = new SlidingWindowReservoir(100);

// Variable memory: depends on activity in time window
SlidingTimeWindowReservoir timeWindow = new SlidingTimeWindowReservoir(1, TimeUnit.MINUTES);
// Could be 0 bytes (no activity) to unbounded (high activity)

Performance Characteristics

// Performance comparison (approximate):

// Fastest updates, good for high-frequency metrics
SlidingWindowReservoir fastest = new SlidingWindowReservoir(1000);

// Fast updates with periodic cleanup, good balance
ExponentiallyDecayingReservoir balanced = new ExponentiallyDecayingReservoir();

// Moderate performance, random access for updates
UniformReservoir moderate = new UniformReservoir();

// Slowest updates due to time-based cleanup, but accurate time windows
SlidingTimeWindowReservoir precise = new SlidingTimeWindowReservoir(5, TimeUnit.MINUTES);

Advanced Usage

Custom Reservoir Selection Strategy

// MetricType is assumed to be an application-defined enum; it is not used in the selection below
public class AdaptiveReservoirFactory {
    public static Reservoir createReservoir(String metricName, MetricType type) {
        if (metricName.contains("error") || metricName.contains("exception")) {
            // For error metrics, emphasize recent events
            return new ExponentiallyDecayingReservoir(500, 0.1);
        } else if (metricName.contains("response.time")) {
            // For response times, use time-based window for real-time monitoring
            return new SlidingTimeWindowReservoir(2, TimeUnit.MINUTES);
        } else if (metricName.contains("throughput")) {
            // For throughput, use uniform sampling for unbiased measurement
            return new UniformReservoir(2000);  
        } else {
            // Default case
            return new ExponentiallyDecayingReservoir();
        }
    }
}

// Usage
Histogram errorHistogram = new Histogram(
    AdaptiveReservoirFactory.createReservoir("api.errors", MetricType.HISTOGRAM));

Reservoir Monitoring

// Monitor reservoir efficiency
public void analyzeReservoirEfficiency(Reservoir reservoir) {
    Snapshot snapshot = reservoir.getSnapshot();
    
    System.out.printf("Reservoir size: %d samples%n", reservoir.size());
    System.out.printf("Statistical range: %d - %d%n", 
        snapshot.getMin(), snapshot.getMax());
    System.out.printf("Distribution spread (std dev): %.2f%n", 
        snapshot.getStdDev());
    
    // Check if reservoir is providing good statistical coverage
    double mean = snapshot.getMean();
    if (reservoir.size() == 0 || mean == 0.0) {
        return;  // Ratios below are undefined without samples or with a zero mean
    }
    double range = snapshot.getMax() - snapshot.getMin();
    double stdDevRatio = snapshot.getStdDev() / mean;
    
    if (stdDevRatio > 1.0) {
        System.out.println("High variability - consider larger reservoir");
    }
    if (range < mean * 0.1) {
        System.out.println("Low variability - smaller reservoir may suffice");
    }
}

Best Practices

Reservoir Configuration

  • Start with ExponentiallyDecayingReservoir as the default choice
  • Use SlidingTimeWindowReservoir for real-time dashboards and alerting
  • Choose reservoir size based on expected data variability and accuracy requirements
  • Consider memory constraints when selecting reservoir types and sizes

Performance Optimization

  • Pre-allocate reservoirs during application startup
  • Avoid creating new reservoirs frequently during runtime
  • Monitor reservoir sizes in time-windowed reservoirs to detect memory issues
  • Use appropriate α values for exponential decay based on update frequency

Testing Strategies

  • Use custom Clock implementations for deterministic testing of time-based reservoirs
  • Test reservoir behavior under various load patterns (high frequency, bursts, sparse)
  • Verify statistical accuracy by comparing reservoir snapshots with expected distributions
  • Test memory usage patterns, especially with time-windowed reservoirs

Monitoring and Debugging

  • Log reservoir sizes periodically to understand sampling behavior
  • Compare statistics from different reservoir types on the same data stream
  • Monitor reservoir performance under production load patterns
  • Use snapshot dumps for detailed analysis of sampled value distributions

Install with Tessl CLI

npx tessl i tessl/maven-io-dropwizard-metrics--metrics-core
