or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

advanced-gauges.mdcore-metrics.mdindex.mdreporting.mdreservoirs-sampling.mdutilities.md

reservoirs-sampling.mddocs/

0

# Reservoirs and Sampling

1

2

Reservoirs are the sampling strategies used by histograms and timers to manage memory usage while preserving statistical accuracy. They determine which values are kept for statistical analysis and which are discarded, implementing various algorithms to maintain representative samples of potentially unlimited data streams.

3

4

## Reservoir Interface

5

6

All reservoir implementations conform to the `Reservoir` interface, providing a consistent API for value storage and statistical snapshot generation.

7

8

```java { .api }

9

public interface Reservoir {

10

// Current number of values stored

11

int size();

12

13

// Add a new value to the reservoir

14

void update(long value);

15

16

// Get statistical snapshot of current values

17

Snapshot getSnapshot();

18

}

19

```

20

21

## UniformReservoir

22

23

`UniformReservoir` implements uniform random sampling using Vitter's Algorithm R, ensuring that all values have an equal probability of being retained regardless of when they were recorded. This provides an unbiased sample of the entire data stream.

24

25

```java { .api }

26

public class UniformReservoir implements Reservoir {

27

// Constructors

28

public UniformReservoir(); // Default size: 1028 samples

29

public UniformReservoir(int size); // Custom sample size

30

31

// Reservoir interface implementation

32

public int size();

33

public void update(long value);

34

public Snapshot getSnapshot();

35

}

36

```

37

38

### Usage Examples

39

40

**Basic Uniform Sampling:**

41

```java

42

// Default uniform reservoir (1028 samples)

43

Histogram responseTimeHistogram = new Histogram(new UniformReservoir());

44

registry.register("api.response.time", responseTimeHistogram);

45

46

// Custom size uniform reservoir

47

Histogram customHistogram = new Histogram(new UniformReservoir(500));

48

registry.register("custom.measurements", customHistogram);

49

```

50

51

**When to Use UniformReservoir:**

52

- When you need unbiased sampling across the entire data stream

53

- For long-running applications where historical data is as important as recent data

54

- When statistical accuracy across all time periods is critical

55

- For baseline measurements where time-based weighting isn't relevant

56

57

```java

58

// Example: Measuring file sizes across entire application lifetime

59

UniformReservoir filesSizes = new UniformReservoir(2000);

60

Histogram fileSizeHistogram = new Histogram(filesSizes);

61

62

// All file sizes have equal probability of being sampled

63

for (File file : getAllApplicationFiles()) {

64

fileSizeHistogram.update(file.length());

65

}

66

```

67

68

## ExponentiallyDecayingReservoir

69

70

`ExponentiallyDecayingReservoir` implements time-biased sampling that favors recent values over older ones using an exponential decay algorithm. This reservoir is ideal for applications where recent behavior is more indicative of current system state.

71

72

```java { .api }

73

public class ExponentiallyDecayingReservoir implements Reservoir {

74

// Constructors

75

public ExponentiallyDecayingReservoir(); // Default: 1028 samples, α=0.015

76

public ExponentiallyDecayingReservoir(int size, double alpha); // Custom size and decay factor

77

public ExponentiallyDecayingReservoir(int size, double alpha, Clock clock); // Custom clock for testing

78

79

// Reservoir interface implementation

80

public int size();

81

public void update(long value);

82

public Snapshot getSnapshot();

83

84

// Extended functionality

85

public void update(long value, long timestamp); // Update with explicit timestamp

86

}

87

```

88

89

### Usage Examples

90

91

**Default Time-Biased Sampling:**

92

```java

93

// Default exponentially decaying reservoir

94

Histogram requestLatency = new Histogram(new ExponentiallyDecayingReservoir());

95

Timer requestTimer = new Timer(new ExponentiallyDecayingReservoir());

96

97

// Recent values are more likely to be included in statistics

98

requestLatency.update(responseTime);

99

requestTimer.update(processingTime, TimeUnit.MILLISECONDS);

100

```

101

102

**Custom Decay Parameters:**

103

```java

104

// Faster decay (α=0.1) - emphasizes very recent values

105

ExponentiallyDecayingReservoir fastDecay = new ExponentiallyDecayingReservoir(1000, 0.1);

106

Histogram recentErrorRates = new Histogram(fastDecay);

107

108

// Slower decay (α=0.005) - maintains more historical influence

109

ExponentiallyDecayingReservoir slowDecay = new ExponentiallyDecayingReservoir(1000, 0.005);

110

Histogram backgroundProcessingTimes = new Histogram(slowDecay);

111

```

112

113

**Explicit Timestamp Updates:**

114

```java

115

ExponentiallyDecayingReservoir timestampedReservoir = new ExponentiallyDecayingReservoir();

116

117

// Update with explicit timestamp (useful for processing historical data)

118

long eventTimestamp = getEventTimestamp();

119

long eventValue = getEventValue();

120

timestampedReservoir.update(eventValue, eventTimestamp);

121

```

122

123

**Understanding Alpha (Decay Factor):**

124

```java

125

// α = 0.015 (default): Values lose ~50% weight after ~46 samples

126

// α = 0.1: Values lose ~50% weight after ~7 samples

127

// α = 0.001: Values lose ~50% weight after ~693 samples

128

129

// For high-frequency metrics (many updates per second)

130

ExponentiallyDecayingReservoir highFreq = new ExponentiallyDecayingReservoir(1000, 0.05);

131

132

// For low-frequency metrics (few updates per minute)

133

ExponentiallyDecayingReservoir lowFreq = new ExponentiallyDecayingReservoir(1000, 0.001);

134

```

135

136

## SlidingWindowReservoir

137

138

`SlidingWindowReservoir` maintains a fixed-size window of the most recent N values, providing a simple "last N samples" sampling strategy. When full, new values replace the oldest values in the window.

139

140

```java { .api }

141

public class SlidingWindowReservoir implements Reservoir {

142

// Constructor

143

public SlidingWindowReservoir(int size);

144

145

// Reservoir interface implementation

146

public int size();

147

public void update(long value);

148

public Snapshot getSnapshot();

149

}

150

```

151

152

### Usage Examples

153

154

**Fixed-Size Recent Sample Window:**

155

```java

156

// Keep last 100 response times

157

SlidingWindowReservoir recentResponses = new SlidingWindowReservoir(100);

158

Histogram responseHistogram = new Histogram(recentResponses);

159

160

// Keep last 50 error counts

161

SlidingWindowReservoir recentErrors = new SlidingWindowReservoir(50);

162

Histogram errorHistogram = new Histogram(recentErrors);

163

```

164

165

**When to Use SlidingWindowReservoir:**

166

- When you need statistics for exactly the last N values

167

- For debugging or development environments where recent behavior is most relevant

168

- When memory usage needs to be precisely controlled

169

- For systems with predictable, steady-state behavior

170

171

```java

172

// Example: Monitoring last 200 database query times

173

SlidingWindowReservoir queryTimes = new SlidingWindowReservoir(200);

174

Timer dbQueryTimer = new Timer(queryTimes);

175

176

// Statistics always represent the last 200 queries

177

for (Query query : queries) {

178

Timer.Context context = dbQueryTimer.time();

179

executeQuery(query);

180

context.stop();

181

}

182

183

// Get statistics for last 200 queries only

184

Snapshot snapshot = dbQueryTimer.getSnapshot();

185

double avgLast200 = snapshot.getMean();

186

```

187

188

## SlidingTimeWindowReservoir

189

190

`SlidingTimeWindowReservoir` maintains all values recorded within a specific time window (e.g., last 5 minutes), automatically discarding values that fall outside the time window. This provides time-based rather than count-based sampling.

191

192

```java { .api }

193

public class SlidingTimeWindowReservoir implements Reservoir {

194

// Constructors

195

public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit);

196

public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit, Clock clock);

197

198

// Reservoir interface implementation

199

public int size();

200

public void update(long value);

201

public Snapshot getSnapshot();

202

}

203

```

204

205

### Usage Examples

206

207

**Time-Based Sampling Windows:**

208

```java

209

// Keep all values from last 5 minutes

210

SlidingTimeWindowReservoir last5Minutes = new SlidingTimeWindowReservoir(5, TimeUnit.MINUTES);

211

Histogram recentActivity = new Histogram(last5Minutes);

212

213

// Keep all values from last 30 seconds

214

SlidingTimeWindowReservoir last30Seconds = new SlidingTimeWindowReservoir(30, TimeUnit.SECONDS);

215

Timer recentRequests = new Timer(last30Seconds);

216

```

217

218

**Real-Time Monitoring:**

219

```java

220

// Monitor error rates in real-time (last 1 minute)

221

SlidingTimeWindowReservoir errorWindow = new SlidingTimeWindowReservoir(1, TimeUnit.MINUTES);

222

Histogram errorRateHistogram = new Histogram(errorWindow);

223

224

// All statistics reflect only the last minute of activity

225

errorRateHistogram.update(errorCount);

226

Snapshot recentErrors = errorRateHistogram.getSnapshot();

227

System.out.println("Errors in last minute: " + recentErrors.size());

228

```

229

230

**Custom Clock for Testing:**

231

```java

232

// Controllable clock for testing time-based sampling

233

Clock testClock = new Clock() {

234

private long currentTime = 0;

235

public long getTick() { return currentTime * 1_000_000; }

236

public long getTime() { return currentTime; }

237

public void advance(long millis) { currentTime += millis; }

238

};

239

240

SlidingTimeWindowReservoir testReservoir =

241

new SlidingTimeWindowReservoir(10, TimeUnit.SECONDS, testClock);

242

243

// Add values and advance time for deterministic testing

244

testReservoir.update(100);

245

testClock.advance(5000); // Advance 5 seconds

246

testReservoir.update(200);

247

testClock.advance(6000); // Advance 6 more seconds (11 total)

248

249

// First value should be expired, only second value remains

250

assertEquals(1, testReservoir.size());

251

```

252

253

## Statistical Snapshots

254

255

All reservoirs provide statistical snapshots through the `Snapshot` class, which offers comprehensive statistical analysis of the sampled values.

256

257

```java { .api }

258

public abstract class Snapshot {

259

// Quantile access

260

public abstract double getValue(double quantile); // 0.0 to 1.0

261

public double getMedian(); // 50th percentile

262

public double get75thPercentile();

263

public double get95thPercentile();

264

public double get98thPercentile();

265

public double get99thPercentile();

266

public double get999thPercentile();

267

268

// Descriptive statistics

269

public abstract long[] getValues(); // All sampled values

270

public abstract int size(); // Number of values

271

public abstract long getMax();

272

public abstract long getMin();

273

public abstract double getMean();

274

public abstract double getStdDev();

275

276

// Utility

277

public abstract void dump(OutputStream output); // Export values

278

}

279

```

280

281

### Snapshot Usage Examples

282

283

```java

284

Histogram histogram = new Histogram(new ExponentiallyDecayingReservoir());

285

286

// Record some values

287

for (int i = 0; i < 1000; i++) {

288

histogram.update(random.nextInt(1000));

289

}

290

291

// Get comprehensive statistics

292

Snapshot snapshot = histogram.getSnapshot();

293

294

System.out.printf("Count: %d%n", snapshot.size());

295

System.out.printf("Min: %d, Max: %d%n", snapshot.getMin(), snapshot.getMax());

296

System.out.printf("Mean: %.2f, StdDev: %.2f%n", snapshot.getMean(), snapshot.getStdDev());

297

System.out.printf("Median: %.2f%n", snapshot.getMedian());

298

System.out.printf("95th percentile: %.2f%n", snapshot.get95thPercentile());

299

System.out.printf("99th percentile: %.2f%n", snapshot.get99thPercentile());

300

301

// Custom quantiles

302

double q90 = snapshot.getValue(0.90); // 90th percentile

303

double q999 = snapshot.getValue(0.999); // 99.9th percentile

304

305

// Export all values to file

306

try (FileOutputStream fos = new FileOutputStream("histogram-values.txt")) {

307

snapshot.dump(fos);

308

}

309

```

310

311

## Choosing the Right Reservoir

312

313

### Decision Matrix

314

315

| Use Case | Recommended Reservoir | Reason |

316

|----------|----------------------|---------|

317

| Long-running application metrics | `ExponentiallyDecayingReservoir` | Emphasizes recent behavior while maintaining historical context |

318

| Real-time monitoring dashboard | `SlidingTimeWindowReservoir` | Provides exact time-based windows for real-time analysis |

319

| Debug/development environments | `SlidingWindowReservoir` | Simple, predictable sampling of recent values |

320

| Baseline/benchmark measurements | `UniformReservoir` | Unbiased sampling across entire measurement period |

321

| High-frequency metrics | `ExponentiallyDecayingReservoir` with higher α | Faster decay to emphasize very recent values |

322

| Low-frequency metrics | `ExponentiallyDecayingReservoir` with lower α | Slower decay to maintain longer historical influence |

323

324

### Memory Usage Considerations

325

326

```java

327

// Memory usage comparison for different reservoirs:

328

329

// Fixed memory: 1028 * 8 bytes = ~8KB (plus small overhead)

330

UniformReservoir uniform = new UniformReservoir();

331

332

// Fixed memory: 1028 * (8 + 8) bytes = ~16KB (value + weight)

333

ExponentiallyDecayingReservoir exponential = new ExponentiallyDecayingReservoir();

334

335

// Fixed memory: 100 * 8 bytes = ~800 bytes

336

SlidingWindowReservoir sliding = new SlidingWindowReservoir(100);

337

338

// Variable memory: depends on activity in time window

339

SlidingTimeWindowReservoir timeWindow = new SlidingTimeWindowReservoir(1, TimeUnit.MINUTES);

340

// Could be 0 bytes (no activity) to unbounded (high activity)

341

```

342

343

### Performance Characteristics

344

345

```java

346

// Performance comparison (approximate):

347

348

// Fastest updates, good for high-frequency metrics

349

SlidingWindowReservoir fastest = new SlidingWindowReservoir(1000);

350

351

// Fast updates with periodic cleanup, good balance

352

ExponentiallyDecayingReservoir balanced = new ExponentiallyDecayingReservoir();

353

354

// Moderate performance, random access for updates

355

UniformReservoir moderate = new UniformReservoir();

356

357

// Slowest updates due to time-based cleanup, but accurate time windows

358

SlidingTimeWindowReservoir precise = new SlidingTimeWindowReservoir(5, TimeUnit.MINUTES);

359

```

360

361

## Advanced Usage

362

363

### Custom Reservoir Selection Strategy

364

365

```java

366

public class AdaptiveReservoirFactory {

367

public static Reservoir createReservoir(String metricName, MetricType type) {

368

if (metricName.contains("error") || metricName.contains("exception")) {

369

// For error metrics, emphasize recent events

370

return new ExponentiallyDecayingReservoir(500, 0.1);

371

} else if (metricName.contains("response.time")) {

372

// For response times, use time-based window for real-time monitoring

373

return new SlidingTimeWindowReservoir(2, TimeUnit.MINUTES);

374

} else if (metricName.contains("throughput")) {

375

// For throughput, use uniform sampling for unbiased measurement

376

return new UniformReservoir(2000);

377

} else {

378

// Default case

379

return new ExponentiallyDecayingReservoir();

380

}

381

}

382

}

383

384

// Usage

385

Histogram errorHistogram = new Histogram(

386

AdaptiveReservoirFactory.createReservoir("api.errors", MetricType.HISTOGRAM));

387

```

388

389

### Reservoir Monitoring

390

391

```java

392

// Monitor reservoir efficiency

393

public void analyzeReservoirEfficiency(Reservoir reservoir) {

394

Snapshot snapshot = reservoir.getSnapshot();

395

396

System.out.printf("Reservoir size: %d samples%n", reservoir.size());

397

System.out.printf("Statistical range: %d - %d%n",

398

snapshot.getMin(), snapshot.getMax());

399

System.out.printf("Distribution spread (std dev): %.2f%n",

400

snapshot.getStdDev());

401

402

// Check if reservoir is providing good statistical coverage

403

double range = snapshot.getMax() - snapshot.getMin();

404

double stdDevRatio = snapshot.getStdDev() / snapshot.getMean();

405

406

if (stdDevRatio > 1.0) {

407

System.out.println("High variability - consider larger reservoir");

408

}

409

if (range < snapshot.getMean() * 0.1) {

410

System.out.println("Low variability - smaller reservoir may suffice");

411

}

412

}

413

```

414

415

## Best Practices

416

417

### Reservoir Configuration

418

- Start with `ExponentiallyDecayingReservoir` as the default choice

419

- Use `SlidingTimeWindowReservoir` for real-time dashboards and alerting

420

- Choose reservoir size based on expected data variability and accuracy requirements

421

- Consider memory constraints when selecting reservoir types and sizes

422

423

### Performance Optimization

424

- Pre-allocate reservoirs during application startup

425

- Avoid creating new reservoirs frequently during runtime

426

- Monitor reservoir sizes in time-windowed reservoirs to detect memory issues

427

- Use appropriate α values for exponential decay based on update frequency

428

429

### Testing Strategies

430

- Use custom `Clock` implementations for deterministic testing of time-based reservoirs

431

- Test reservoir behavior under various load patterns (high frequency, bursts, sparse)

432

- Verify statistical accuracy by comparing reservoir snapshots with expected distributions

433

- Test memory usage patterns, especially with time-windowed reservoirs

434

435

### Monitoring and Debugging

436

- Log reservoir sizes periodically to understand sampling behavior

437

- Compare statistics from different reservoir types on the same data stream

438

- Monitor reservoir performance under production load patterns

439

- Use snapshot dumps for detailed analysis of sampled value distributions