# Results Storage and Comparison

## Overview

pytest-benchmark provides robust storage backends for persisting benchmark results and powerful comparison capabilities for tracking performance over time. Results can be stored in files or Elasticsearch and compared across different runs, commits, or environments.

## Storage Backends

### FileStorage Class

```python { .api }
class FileStorage:
    """File-based storage backend for benchmark results."""

    def __init__(self, path: str, logger, default_machine_id: str = None):
        """
        Initialize file storage.

        Args:
            path: Directory path for storing benchmark files
            logger: Logger instance for output
            default_machine_id: Default machine identifier
        """

    def save(self, output_json: dict, save: str) -> str:
        """
        Save benchmark results to file.

        Args:
            output_json: Benchmark data in JSON format
            save: Save identifier/name

        Returns:
            str: Path to saved file
        """

    def load(self, name: str) -> dict:
        """
        Load benchmark results from file.

        Args:
            name: File identifier to load

        Returns:
            dict: Loaded benchmark data
        """
```
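
The save/load cycle of a file backend can be illustrated with a minimal, self-contained sketch. This is an illustrative stand-in using only the standard library, not the plugin's actual code; the `MiniFileStorage` name and its numbering scheme are assumptions modeled on the `0001_<name>.json` file layout described later in this document:

```python
import json
import tempfile
from pathlib import Path


class MiniFileStorage:
    """Toy file backend: numbered JSON files such as 0001_baseline.json."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)

    def _next_num(self):
        existing = sorted(self.path.glob("[0-9][0-9][0-9][0-9]_*.json"))
        return int(existing[-1].name[:4]) + 1 if existing else 1

    def save(self, output_json, save):
        target = self.path / f"{self._next_num():04d}_{save}.json"
        target.write_text(json.dumps(output_json, indent=2))
        return str(target)

    def load(self, name):
        # Match on the save name: "baseline" -> .../0001_baseline.json
        matches = sorted(self.path.glob(f"*_{name}.json"))
        return json.loads(matches[-1].read_text())


storage = MiniFileStorage(tempfile.mkdtemp())
storage.save({"benchmarks": [{"name": "test_sum", "stats": {"mean": 0.012}}]}, "baseline")
print(storage.load("baseline")["benchmarks"][0]["stats"]["mean"])  # 0.012
```

Loading the most recent match mirrors how comparisons default to the latest saved run.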

### ElasticsearchStorage Class

```python { .api }
class ElasticsearchStorage:
    """Elasticsearch storage backend for benchmark results."""

    def __init__(self, hosts: list, index: str, doctype: str, project_name: str = None, logger=None, **kwargs):
        """
        Initialize Elasticsearch storage.

        Args:
            hosts: List of Elasticsearch host URLs
            index: Index name for storing benchmarks
            doctype: Document type (deprecated in ES 7+)
            project_name: Project identifier
            logger: Logger instance
            **kwargs: Additional Elasticsearch client options
        """

    def save(self, output_json: dict, save: str) -> str:
        """Save benchmark results to Elasticsearch."""

    def load(self, name: str) -> dict:
        """Load benchmark results from Elasticsearch."""
```

## Storage Configuration

### File Storage URIs

```bash { .api }
# Default file storage
--benchmark-storage=file://./.benchmarks

# Absolute path
--benchmark-storage=file:///home/user/benchmarks

# Relative path
--benchmark-storage=file://./results/benchmarks
```

### Elasticsearch URIs

```bash { .api }
# Basic Elasticsearch
--benchmark-storage=elasticsearch+http://localhost:9200/benchmarks/results

# With authentication
--benchmark-storage=elasticsearch+https://user:pass@host:9200/index/doctype

# Multiple hosts
--benchmark-storage=elasticsearch+http://host1:9200,host2:9200/index/doctype

# With project name
--benchmark-storage=elasticsearch+http://host:9200/index/doctype?project_name=myproject
```

## Saving Results

### Manual Saving

```bash
# Save with custom name
pytest --benchmark-save=baseline

# Save with descriptive name
pytest --benchmark-save=feature-x-implementation

# Auto-save with timestamp
pytest --benchmark-autosave
```

### Programmatic Saving

```python
def test_with_custom_save(benchmark):
    def my_function():
        return sum(range(1000))

    result = benchmark(my_function)

    # Results are saved automatically if --benchmark-save is used
    assert result == 499500
```

### Save Data Options

```bash
# Save only statistics (default)
pytest --benchmark-save=baseline

# Save complete timing data
pytest --benchmark-save=baseline --benchmark-save-data
```

## Result Comparison

### Command-Line Comparison

```bash { .api }
# Compare against the latest saved run
pytest --benchmark-compare

# Compare against a specific run
pytest --benchmark-compare=baseline
pytest --benchmark-compare=0001

# Compare with failure thresholds
pytest --benchmark-compare=baseline --benchmark-compare-fail=mean:10%
```

### CLI Tool Usage

```bash
# List available runs
pytest-benchmark list

# Compare specific runs
pytest-benchmark compare 0001 0002

# Compare runs matching a glob pattern
pytest-benchmark compare 'Linux-CPython-3.9-64bit/*'

# Show all comparison options
pytest-benchmark compare --help
```

## Comparison Examples

### Basic Comparison

```bash
# First, establish baseline
pytest --benchmark-save=baseline tests/

# Later, compare new implementation
pytest --benchmark-compare=baseline tests/
```

### Continuous Integration Workflow

```bash
# In CI pipeline
# 1. Run benchmarks and save
pytest --benchmark-only --benchmark-save=commit-${BUILD_ID}

# 2. Compare against master baseline
pytest --benchmark-only --benchmark-compare=master-baseline \
    --benchmark-compare-fail=mean:15%
```

### Multiple Environment Comparison

```bash
# Save results for different Python versions
pytest --benchmark-save=python38 tests/
pytest --benchmark-save=python39 tests/
pytest --benchmark-save=python310 tests/

# Compare across versions (saved runs are matched by glob)
pytest-benchmark compare '*python38*' '*python39*' '*python310*'
```

## Performance Regression Detection

### Failure Thresholds

```python { .api }
# Threshold expression formats:
"mean:5%"      # Mean increased by more than 5%
"min:0.001"    # Min increased by more than 1ms (0.001s)
"max:10%"      # Max increased by more than 10%
"stddev:25%"   # Standard deviation increased by more than 25%
```
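
The semantics of these expressions can be sketched as a small checker. This is an illustrative reimplementation of the rule, not the plugin's actual parsing code, and `fails_threshold` is a hypothetical helper name:

```python
def fails_threshold(expr, baseline, current):
    """Return True if `current` regresses past `expr` relative to `baseline`.

    expr is e.g. "mean:5%" (relative) or "min:0.001" (absolute, in seconds);
    baseline and current map stat names to measured values in seconds.
    """
    stat, _, limit = expr.partition(":")
    increase = current[stat] - baseline[stat]
    if limit.endswith("%"):
        # Relative threshold: compare the increase against a fraction of baseline
        return increase > baseline[stat] * float(limit[:-1]) / 100.0
    # Absolute threshold: compare the increase against a fixed number of seconds
    return increase > float(limit)


baseline = {"mean": 0.100, "min": 0.090}
current = {"mean": 0.112, "min": 0.0905}

print(fails_threshold("mean:5%", baseline, current))    # True: mean is 12% slower
print(fails_threshold("min:0.001", baseline, current))  # False: min rose only ~0.5ms
```

Only increases count against the threshold; a faster run never fails the check.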

### Multiple Thresholds

```bash
# Multiple failure conditions
pytest --benchmark-compare=baseline \
    --benchmark-compare-fail=mean:10% \
    --benchmark-compare-fail=max:20% \
    --benchmark-compare-fail=min:0.005
```

### Example Regression Detection

```python
def test_performance_sensitive_function(benchmark):
    def critical_function():
        # This function's performance is critical
        return sum(x**2 for x in range(10000))

    result = benchmark(critical_function)
    assert result == 333283335000

# Run with regression detection:
# pytest --benchmark-compare=baseline --benchmark-compare-fail=mean:5%
```

## Machine Information Tracking

### Automatic Machine Detection

```python { .api }
def pytest_benchmark_generate_machine_info() -> dict:
    """
    Generate machine information for benchmark context.

    Returns:
        dict: Machine information including:
            - node: Machine hostname
            - processor: Processor name
            - machine: Machine architecture
            - python_implementation: CPython/PyPy/etc
            - python_version: Python version
            - system: Operating system
            - cpu: CPU information from py-cpuinfo
    """
```
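
The fields above correspond closely to Python's standard `platform` module. A rough stdlib-only stand-in might look like the following (`generate_machine_info` is a hypothetical name, and the real hook fills `cpu` with data from py-cpuinfo):

```python
import platform


def generate_machine_info():
    """Approximate the machine_info fields using only the standard library."""
    return {
        "node": platform.node(),
        "processor": platform.processor(),
        "machine": platform.machine(),
        "python_implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
        "system": platform.system(),
        # The real hook adds detailed CPU data gathered by py-cpuinfo here
        "cpu": {},
    }


info = generate_machine_info()
print(info["python_implementation"], info["python_version"], info["system"])
```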

### Machine Info Comparison

```bash
# Benchmarks warn if machine info differs
pytest --benchmark-compare=baseline
# Warning: Benchmark machine_info is different. Current: {...} VS saved: {...}
```

## Storage Management

### File Storage Structure

```
.benchmarks/
└── Linux-CPython-3.9-64bit/
    ├── 0001_baseline.json
    ├── 0002_feature_x.json
    └── 0003_master.json
```

Results are grouped into one directory per machine identifier; machine information is embedded in each result file rather than stored separately.

### Elasticsearch Document Structure

```json
{
  "_index": "benchmarks",
  "_type": "results",
  "_id": "0001_baseline",
  "_source": {
    "machine_info": {...},
    "commit_info": {...},
    "benchmarks": [...],
    "datetime": "2023-01-01T12:00:00Z",
    "version": "5.1.0"
  }
}
```

## JSON Export Format

### Complete JSON Export

```bash
# Export with full timing data
pytest --benchmark-json=complete.json --benchmark-save-data
```

### JSON Structure

```python { .api }
# Complete benchmark JSON format:
{
    "machine_info": {
        "node": str,
        "processor": str,
        "machine": str,
        "python_implementation": str,
        "python_version": str,
        "system": str,
        "cpu": dict
    },
    "commit_info": {
        "id": str,
        "time": str,
        "author_time": str,
        "author_name": str,
        "author_email": str,
        "message": str,
        "branch": str
    },
    "benchmarks": [
        {
            "group": str,
            "name": str,
            "fullname": str,
            "params": dict,
            "param": str,
            "extra_info": dict,
            "stats": {
                "min": float,
                "max": float,
                "mean": float,
                "stddev": float,
                "rounds": int,
                "median": float,
                "iqr": float,
                "q1": float,
                "q3": float,
                "iqr_outliers": int,
                "stddev_outliers": int,
                "outliers": str,
                "ld15iqr": float,
                "hd15iqr": float,
                "ops": float,
                "total": float
            },
            "data": [float, ...]  # If --benchmark-save-data used
        }
    ],
    "datetime": str,
    "version": str
}
```
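
Because saved runs are plain JSON in this shape, quick ad-hoc comparisons can be scripted directly against the exported files. A minimal sketch follows; the inline sample dicts stand in for two runs that would normally be read with `json.load`, and `mean_times`/`compare_runs` are hypothetical helper names:

```python
def mean_times(run):
    """Map benchmark fullname -> mean time from a loaded results dict."""
    return {b["fullname"]: b["stats"]["mean"] for b in run["benchmarks"]}


def compare_runs(old, new):
    """Yield (name, old_mean, new_mean, percent change) for shared benchmarks."""
    old_means, new_means = mean_times(old), mean_times(new)
    for name in sorted(old_means.keys() & new_means.keys()):
        change = (new_means[name] - old_means[name]) / old_means[name] * 100
        yield name, old_means[name], new_means[name], change


# Normally loaded from disk, e.g. json.load(open("0001_baseline.json"))
old = {"benchmarks": [{"fullname": "test_sort", "stats": {"mean": 0.20}}]}
new = {"benchmarks": [{"fullname": "test_sort", "stats": {"mean": 0.25}}]}

for name, before, after, pct in compare_runs(old, new):
    print(f"{name}: {before:.3f}s -> {after:.3f}s ({pct:+.1f}%)")
```

Benchmarks present in only one run are skipped, matching the intuition that a comparison needs both sides.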

## Advanced Usage

### Custom Commit Information

```python
def pytest_benchmark_generate_commit_info(config):
    """Custom commit info generation."""
    return {
        "id": "custom-build-123",
        "branch": "feature/optimization",
        "message": "Performance improvements",
        "time": "2023-01-01T12:00:00Z"
    }
```

### Storage Authentication

```bash
# Using netrc for Elasticsearch auth
echo "machine elasticsearch.example.com login user password secret" >> ~/.netrc

pytest --benchmark-storage=elasticsearch+https://elasticsearch.example.com:9200/bench/result \
    --benchmark-netrc=~/.netrc
```

### Filtering Comparisons

```bash
# Compare only runs matching glob patterns
pytest-benchmark compare '*baseline*' '*current*'

# Group the comparison table, e.g. by benchmark group
pytest-benchmark compare '*baseline*' '*current*' --group-by=group
```

## Troubleshooting

### Storage Issues

```bash
# Check storage connectivity
pytest --benchmark-storage=file://./test-storage --benchmark-save=test

# Verify Elasticsearch connection
pytest --benchmark-storage=elasticsearch+http://localhost:9200/test/bench \
    --benchmark-save=connectivity-test
```

### Comparison Failures

```bash
# Debug comparison issues
pytest --benchmark-compare=baseline --benchmark-verbose

# List the runs available for comparison
pytest-benchmark --storage=file://./.benchmarks list
```