# Results Storage and Comparison

## Overview

pytest-benchmark provides storage backends for persisting benchmark results and comparison features for tracking performance over time. Results can be stored in files or Elasticsearch and compared across different runs, commits, or environments.

## Storage Backends

### FileStorage Class

```python { .api }
class FileStorage:
    """File-based storage backend for benchmark results."""

    def __init__(self, path: str, logger, default_machine_id: str = None):
        """
        Initialize file storage.

        Args:
            path: Directory path for storing benchmark files
            logger: Logger instance for output
            default_machine_id: Default machine identifier
        """

    def save(self, output_json: dict, save: str) -> str:
        """
        Save benchmark results to file.

        Args:
            output_json: Benchmark data in JSON format
            save: Save identifier/name

        Returns:
            str: Path to saved file
        """

    def load(self, name: str) -> dict:
        """
        Load benchmark results from file.

        Args:
            name: File identifier to load

        Returns:
            dict: Loaded benchmark data
        """
```
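Because saved runs are plain JSON files, they can also be read without going through the storage class. The sketch below is illustrative only: the helper name `load_latest_run` is ours, and it assumes the on-disk layout described later in this document (per-machine subdirectories containing `NNNN_<name>.json` files).

```python
import json
from pathlib import Path
from typing import Optional


def load_latest_run(storage_dir: str) -> Optional[dict]:
    # Saved runs are plain JSON files named like NNNN_<name>.json
    # inside per-machine subdirectories of the storage path.
    runs = sorted(Path(storage_dir).glob("*/[0-9][0-9][0-9][0-9]_*.json"))
    return json.loads(runs[-1].read_text()) if runs else None
```

Sorting the paths lexicographically works here because the four-digit counter prefix is zero-padded.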

### ElasticsearchStorage Class

```python { .api }
class ElasticsearchStorage:
    """Elasticsearch storage backend for benchmark results."""

    def __init__(self, hosts: list, index: str, doctype: str, project_name: str = None, logger=None, **kwargs):
        """
        Initialize Elasticsearch storage.

        Args:
            hosts: List of Elasticsearch host URLs
            index: Index name for storing benchmarks
            doctype: Document type (deprecated in ES 7+)
            project_name: Project identifier
            logger: Logger instance
            **kwargs: Additional Elasticsearch client options
        """

    def save(self, output_json: dict, save: str) -> str:
        """Save benchmark results to Elasticsearch."""

    def load(self, name: str) -> dict:
        """Load benchmark results from Elasticsearch."""
```

## Storage Configuration

### File Storage URIs

```bash { .api }
# Default file storage
--benchmark-storage=file://./.benchmarks

# Absolute path
--benchmark-storage=file:///home/user/benchmarks

# Relative path
--benchmark-storage=file://./results/benchmarks
```

### Elasticsearch URIs

```bash { .api }
# Basic Elasticsearch
--benchmark-storage=elasticsearch+http://localhost:9200/benchmarks/results

# With authentication
--benchmark-storage=elasticsearch+https://user:pass@host:9200/index/doctype

# Multiple hosts
--benchmark-storage=elasticsearch+http://host1:9200,host2:9200/index/doctype

# With project name
--benchmark-storage=elasticsearch+http://host:9200/index/doctype?project_name=myproject
```

## Saving Results

### Manual Saving

```bash
# Save with a custom name
pytest --benchmark-save=baseline

# Save with a descriptive name
pytest --benchmark-save=feature-x-implementation

# Auto-save with a timestamped name
pytest --benchmark-autosave
```

### Programmatic Saving

```python
def test_with_custom_save(benchmark):
    def my_function():
        return sum(range(1000))

    result = benchmark(my_function)

    # Results are automatically saved if --benchmark-save is used
    assert result == 499500
```

### Save Data Options

```bash
# Save only statistics (default)
pytest --benchmark-save=baseline

# Save complete timing data as well
pytest --benchmark-save=baseline --benchmark-save-data
```
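Either way, what lands on disk is JSON with a `benchmarks` list carrying per-test statistics, so it is easy to post-process. A small sketch, using a trimmed stand-in document with hypothetical values rather than a real saved file:

```python
import json

# A trimmed stand-in for a saved results file (values are hypothetical);
# real files carry the full structure documented later in this section.
saved = json.loads("""
{
  "benchmarks": [
    {"name": "test_sort", "stats": {"mean": 0.0021, "rounds": 120}},
    {"name": "test_parse", "stats": {"mean": 0.0007, "rounds": 450}}
  ]
}
""")

for bench in saved["benchmarks"]:
    stats = bench["stats"]
    print(f"{bench['name']}: mean={stats['mean']:.4f}s over {stats['rounds']} rounds")
```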

## Result Comparison

### Command-Line Comparison

```bash { .api }
# Compare against the most recent saved run
pytest --benchmark-compare

# Compare against a specific run
pytest --benchmark-compare=baseline
pytest --benchmark-compare=0001

# Compare with failure thresholds
pytest --benchmark-compare=baseline --benchmark-compare-fail=mean:10%
```

### CLI Tool Usage

```bash
# List available runs
pytest-benchmark list

# Compare specific runs
pytest-benchmark compare 0001 0002

# Compare runs matching a glob pattern
pytest-benchmark compare 'Linux-CPython-3.9-64bit/*'

# Show all comparison options
pytest-benchmark compare --help
```

## Comparison Examples

### Basic Comparison

```bash
# First, establish a baseline
pytest --benchmark-save=baseline tests/

# Later, compare the new implementation against it
pytest --benchmark-compare=baseline tests/
```

### Continuous Integration Workflow

```bash
# In a CI pipeline:
# 1. Run benchmarks and save
pytest --benchmark-only --benchmark-save=commit-${BUILD_ID}

# 2. Compare against the master baseline, failing on regressions
pytest --benchmark-only --benchmark-compare=master-baseline \
    --benchmark-compare-fail=mean:15%
```

### Multiple Environment Comparison

```bash
# Save results for different Python versions
pytest --benchmark-save=python38 tests/
pytest --benchmark-save=python39 tests/
pytest --benchmark-save=python310 tests/

# Compare across versions
pytest-benchmark compare python38 python39 python310
```

## Performance Regression Detection

### Failure Thresholds

```python { .api }
# Threshold expression formats for --benchmark-compare-fail:
"mean:5%"     # Fail if the mean increased by more than 5%
"min:0.001"   # Fail if the min increased by more than 0.001s (1ms)
"max:10%"     # Fail if the max increased by more than 10%
"stddev:25%"  # Fail if the standard deviation increased by more than 25%
```
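The threshold semantics can be illustrated in a few lines. This is a sketch of the comparison logic, not pytest-benchmark's actual implementation; the `regressed` helper and the sample numbers are ours:

```python
def regressed(baseline: dict, current: dict, spec: str) -> bool:
    # spec mirrors a --benchmark-compare-fail expression, e.g. "mean:10%" or "min:0.005".
    field, limit = spec.split(":")
    base, cur = baseline[field], current[field]
    # Percentage limits are relative to the baseline; bare numbers are absolute seconds.
    allowed = base * float(limit[:-1]) / 100.0 if limit.endswith("%") else float(limit)
    return (cur - base) > allowed


base = {"mean": 0.100, "min": 0.095}
cur = {"mean": 0.112, "min": 0.101}
print(regressed(base, cur, "mean:10%"))   # True: a 12% slowdown exceeds the 10% budget
print(regressed(base, cur, "min:0.010"))  # False: +6ms is within the 10ms budget
```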

### Multiple Thresholds

```bash
# Multiple failure conditions
pytest --benchmark-compare=baseline \
    --benchmark-compare-fail=mean:10% \
    --benchmark-compare-fail=max:20% \
    --benchmark-compare-fail=min:0.005
```

### Example Regression Detection

```python
def test_performance_sensitive_function(benchmark):
    def critical_function():
        # This function's performance is critical
        return sum(x**2 for x in range(10000))

    result = benchmark(critical_function)
    assert result == 333283335000

# Run with regression detection:
# pytest --benchmark-compare=baseline --benchmark-compare-fail=mean:5%
```

## Machine Information Tracking

### Automatic Machine Detection

```python { .api }
def pytest_benchmark_generate_machine_info() -> dict:
    """
    Generate machine information for the benchmark context.

    Returns:
        dict: Machine information including:
            - node: Machine hostname
            - processor: Processor name
            - machine: Machine architecture
            - python_implementation: CPython/PyPy/etc.
            - python_version: Python version
            - system: Operating system
            - cpu: CPU information from py-cpuinfo
    """
```
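Most of these fields come straight from the standard library's `platform` module; only the `cpu` entry needs the third-party py-cpuinfo package. A rough stdlib-only approximation (the helper name `rough_machine_info` is ours):

```python
import platform


def rough_machine_info() -> dict:
    # Approximates the fields pytest-benchmark records; the "cpu" entry
    # additionally requires py-cpuinfo, so it is omitted here.
    return {
        "node": platform.node(),
        "processor": platform.processor(),
        "machine": platform.machine(),
        "python_implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
        "system": platform.system(),
    }


info = rough_machine_info()
print(info["python_implementation"], info["system"])
```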

### Machine Info Comparison

```bash
# Benchmarks warn if machine info differs from the saved run
pytest --benchmark-compare=baseline
# Warning: Benchmark machine_info is different. Current: {...} VS saved: {...}
```

## Storage Management

### File Storage Structure

```
.benchmarks/
├── Linux-CPython-3.9-64bit/
│   ├── 0001_baseline.json
│   ├── 0002_feature_x.json
│   └── 0003_master.json
└── machine_info.json
```

### Elasticsearch Document Structure

```json
{
  "_index": "benchmarks",
  "_type": "results",
  "_id": "0001_baseline",
  "_source": {
    "machine_info": {...},
    "commit_info": {...},
    "benchmarks": [...],
    "datetime": "2023-01-01T12:00:00Z",
    "version": "5.1.0"
  }
}
```

## JSON Export Format

### Complete JSON Export

```bash
# Export with full timing data
pytest --benchmark-json=complete.json --benchmark-save-data
```

### JSON Structure

```python { .api }
# Complete benchmark JSON format:
{
    "machine_info": {
        "node": str,
        "processor": str,
        "machine": str,
        "python_implementation": str,
        "python_version": str,
        "system": str,
        "cpu": dict
    },
    "commit_info": {
        "id": str,
        "time": str,
        "author_time": str,
        "author_name": str,
        "author_email": str,
        "message": str,
        "branch": str
    },
    "benchmarks": [
        {
            "group": str,
            "name": str,
            "fullname": str,
            "params": dict,
            "param": str,
            "extra_info": dict,
            "stats": {
                "min": float,
                "max": float,
                "mean": float,
                "stddev": float,
                "rounds": int,
                "median": float,
                "iqr": float,
                "q1": float,
                "q3": float,
                "iqr_outliers": int,
                "stddev_outliers": int,
                "outliers": str,
                "ld15iqr": float,
                "hd15iqr": float,
                "ops": float,
                "total": float
            },
            "data": [float, ...]  # Present if --benchmark-save-data was used
        }
    ],
    "datetime": str,
    "version": str
}
```
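An export following this structure is straightforward to post-process. A minimal stand-in with hypothetical numbers, ranking benchmarks by mean time:

```python
# A minimal export stand-in (hypothetical numbers) following the structure above.
export = {
    "machine_info": {"system": "Linux", "python_version": "3.9.18"},
    "benchmarks": [
        {"name": "test_fast", "stats": {"mean": 0.001, "ops": 1000.0}},
        {"name": "test_slow", "stats": {"mean": 0.004, "ops": 250.0}},
    ],
    "datetime": "2023-01-01T12:00:00Z",
    "version": "5.1.0",
}

# Rank benchmarks by mean time, slowest first.
ranked = sorted(export["benchmarks"], key=lambda b: b["stats"]["mean"], reverse=True)
print([b["name"] for b in ranked])  # ['test_slow', 'test_fast']
```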

## Advanced Usage

### Custom Commit Information

```python
def pytest_benchmark_generate_commit_info(config):
    """Custom commit info generation."""
    return {
        "id": "custom-build-123",
        "branch": "feature/optimization",
        "message": "Performance improvements",
        "time": "2023-01-01T12:00:00Z"
    }
```

### Storage Authentication

```bash
# Using netrc for Elasticsearch auth
echo "machine elasticsearch.example.com login user password secret" >> ~/.netrc

pytest --benchmark-storage=elasticsearch+https://elasticsearch.example.com:9200/bench/result \
    --benchmark-netrc=~/.netrc
```

### Filtering Comparisons

```bash
# Compare only runs matching glob patterns
pytest-benchmark compare 'baseline*' 'current*'

# Group the comparison table by benchmark group
pytest-benchmark compare 0001 0002 --group-by=group
```

## Troubleshooting

### Storage Issues

```bash
# Check that file storage works end to end
pytest --benchmark-storage=file://./test-storage --benchmark-save=test

# Verify the Elasticsearch connection
pytest --benchmark-storage=elasticsearch+http://localhost:9200/test/bench \
    --benchmark-save=connectivity-test
```

### Comparison Failures

```bash
# Debug comparison issues with verbose output
pytest --benchmark-compare=baseline --benchmark-verbose

# List runs available for comparison
pytest-benchmark list --storage=file://.benchmarks
```