# Data Management and Analysis

Comprehensive data structures for storing, analyzing, and managing benchmark results. Provides statistical analysis, metadata handling, and serialization capabilities for persistent storage and result sharing.

## Capabilities

### Run Class

Represents a single benchmark run containing timing values and execution metadata.
```python { .api }
class Run:
    def __init__(self, values, warmups=None, metadata=None, collect_metadata=True):
        """
        Create a benchmark run.

        Args:
            values: Sequence of timing measurements (numbers > 0)
            warmups: Optional sequence of (loops, value) tuples for warmup measurements
            metadata: Optional dictionary of metadata
            collect_metadata: Whether to automatically collect system metadata
        """

    def get_metadata(self) -> dict:
        """Get run metadata including system information."""

    def get_loops(self) -> int:
        """Get number of loops per measurement."""

    def get_inner_loops(self) -> int:
        """
        Get inner loop count from metadata.

        Returns:
            Inner loop count (default: 1 if not specified)
        """

    def get_total_loops(self) -> int:
        """
        Get total loops (loops * inner_loops).

        Returns:
            Total number of loop iterations per measurement
        """

    @property
    def warmups(self) -> tuple:
        """Tuple of warmup measurements."""

    @property
    def values(self) -> tuple:
        """Tuple of timing values."""
```
### Benchmark Class

Collection of benchmark runs with comprehensive statistical analysis capabilities.
```python { .api }
class Benchmark:
    def __init__(self, runs):
        """
        Create a benchmark from a sequence of Run objects.

        Args:
            runs: Non-empty sequence of Run objects
        """

    def get_name(self) -> str:
        """Get benchmark name."""

    def get_metadata(self) -> dict:
        """Get metadata common to all runs."""

    def get_runs(self) -> list:
        """Get list of Run objects."""

    def get_values(self) -> tuple:
        """Get all measurement values from all runs."""

    def get_nrun(self) -> int:
        """Get number of runs."""

    def get_nvalue(self) -> int:
        """Get total number of measurement values."""

    def get_unit(self) -> str:
        """Get measurement unit (e.g., 'second', 'byte')."""

    def get_total_duration(self) -> float:
        """Get total execution time for all runs."""

    def get_dates(self) -> tuple:
        """
        Get benchmark execution date range.

        Returns:
            Tuple of (start_date, end_date) as datetime objects,
            or None if no date information is available
        """

    def mean(self) -> float:
        """Mean of all measurement values (result is cached)."""

    def stdev(self) -> float:
        """Standard deviation of all measurement values (result is cached)."""

    def median(self) -> float:
        """Median of all measurement values (result is cached)."""

    def median_abs_dev(self) -> float:
        """Median absolute deviation of all measurement values (result is cached)."""

    def required_nprocesses(self) -> int:
        """
        Get recommended number of processes for stable results.

        Returns:
            Number of processes needed for 95% certainty with <1% variance,
            or None if there is insufficient data
        """

    def percentile(self, p: float) -> float:
        """
        Get percentile value.

        Args:
            p: Percentile (0-100)

        Returns:
            Value at the specified percentile
        """

    def add_run(self, run: Run):
        """
        Add a new run to this benchmark.

        Args:
            run: Run object to add
        """

    def add_runs(self, benchmark: 'Benchmark'):
        """
        Merge runs from another benchmark.

        Args:
            benchmark: Benchmark object to merge runs from
        """

    def format_value(self, value: float) -> str:
        """
        Format a single value for display.

        Args:
            value: Value to format

        Returns:
            Formatted string representation
        """

    def format_values(self, values: tuple) -> list:
        """
        Format multiple values for display.

        Args:
            values: Tuple of values to format

        Returns:
            List of formatted string representations
        """

    def update_metadata(self, metadata: dict):
        """
        Update benchmark metadata.

        Args:
            metadata: Dictionary of metadata to merge
        """

    @staticmethod
    def load(file) -> 'Benchmark':
        """
        Load benchmark from JSON file.

        Args:
            file: File path or file object

        Returns:
            Benchmark object loaded from file
        """

    @staticmethod
    def loads(string: str) -> 'Benchmark':
        """
        Load benchmark from JSON string.

        Args:
            string: JSON string representation

        Returns:
            Benchmark object parsed from string
        """

    def dump(self, file, compact=True, replace=False):
        """
        Save benchmark to JSON file.

        Args:
            file: Output file path or file object
            compact: Whether to use compact JSON format
            replace: Whether to replace an existing file
        """
```
### BenchmarkSuite Class

Collection of benchmarks that can be managed and saved together.
```python { .api }
class BenchmarkSuite:
    def __init__(self, benchmarks, filename=None):
        """
        Create a benchmark suite.

        Args:
            benchmarks: Non-empty sequence of Benchmark objects
            filename: Optional filename for reference
        """

    def get_benchmarks(self) -> list:
        """Get list of Benchmark objects."""

    def get_benchmark(self, name: str) -> Benchmark:
        """
        Get benchmark by name.

        Args:
            name: Benchmark name

        Returns:
            Benchmark object with matching name
        """

    def get_benchmark_names(self) -> list:
        """Get list of benchmark names."""

    def get_metadata(self) -> dict:
        """Get suite-wide metadata."""

    def add_benchmark(self, benchmark: Benchmark):
        """
        Add a benchmark to this suite.

        Args:
            benchmark: Benchmark object to add
        """

    def add_runs(self, result):
        """
        Add runs from a Benchmark or BenchmarkSuite.

        Args:
            result: Benchmark or BenchmarkSuite object
        """

    def get_total_duration(self) -> float:
        """Get total execution time for all benchmarks."""

    def get_dates(self) -> tuple:
        """Get (start_date, end_date) timestamps for the entire suite."""

    @classmethod
    def load(cls, file) -> 'BenchmarkSuite':
        """
        Load benchmark suite from JSON file.

        Args:
            file: File path or file object

        Returns:
            BenchmarkSuite object loaded from file
        """

    @classmethod
    def loads(cls, string: str) -> 'BenchmarkSuite':
        """
        Load benchmark suite from JSON string.

        Args:
            string: JSON string representation

        Returns:
            BenchmarkSuite object parsed from string
        """

    def dump(self, file, compact=True, replace=False):
        """
        Save benchmark suite to JSON file.

        Args:
            file: Output file path or file object
            compact: Whether to use compact JSON format
            replace: Whether to replace an existing file
        """
```
### Suite Management Functions

Utility functions for working with benchmark files and data merging.
```python { .api }
def add_runs(filename: str, result):
    """
    Add benchmark results to an existing JSON file.

    Args:
        filename: Path to JSON file
        result: Benchmark or BenchmarkSuite object to add
    """
```
## Usage Examples

### Creating and Managing Runs

```python
import pyperf

# Create runs manually; runs merged into one benchmark must share a name
values = [0.001, 0.0012, 0.0011, 0.001, 0.0013]
run1 = pyperf.Run(values, metadata={'name': 'example', 'version': '1.0'})
run2 = pyperf.Run([0.0009, 0.001, 0.0011], metadata={'name': 'example', 'version': '1.0'})

# Create benchmark from runs
benchmark = pyperf.Benchmark([run1, run2])
print(f"Mean: {benchmark.mean():.6f} seconds")
print(f"Median: {benchmark.median():.6f} seconds")
print(f"Standard deviation: {benchmark.stdev():.6f} seconds")
```
### Statistical Analysis

```python
import pyperf

# Load benchmark from file
benchmark = pyperf.Benchmark.load('results.json')

# Basic statistics
print(f"Number of runs: {benchmark.get_nrun()}")
print(f"Total values: {benchmark.get_nvalue()}")
print(f"Mean: {benchmark.mean():.6f} ± {benchmark.stdev():.6f} seconds")

# Percentile analysis
print(f"Min (0th percentile): {benchmark.percentile(0):.6f}")
print(f"25th percentile: {benchmark.percentile(25):.6f}")
print(f"Median (50th percentile): {benchmark.percentile(50):.6f}")
print(f"75th percentile: {benchmark.percentile(75):.6f}")
print(f"Max (100th percentile): {benchmark.percentile(100):.6f}")

# Median absolute deviation
print(f"Median absolute deviation: {benchmark.median_abs_dev():.6f}")
```
### Data Persistence

```python
import pyperf

# Save single benchmark
runner = pyperf.Runner()
benchmark = runner.timeit('test', '[i for i in range(100)]')
benchmark.dump('single_benchmark.json')

# Load and modify
loaded = pyperf.Benchmark.load('single_benchmark.json')
loaded.update_metadata({'test_type': 'list_comprehension'})
loaded.dump('modified_benchmark.json')

# Create and save benchmark suite
suite = pyperf.BenchmarkSuite([benchmark])
suite.dump('benchmark_suite.json')

# Add results to existing file
new_benchmark = runner.timeit('test2', 'sum(range(100))')
pyperf.add_runs('benchmark_suite.json', new_benchmark)
```
### Working with Metadata

```python
import pyperf

benchmark = pyperf.Benchmark.load('results.json')

# Examine metadata
metadata = benchmark.get_metadata()
print(f"Python version: {metadata.get('python_version')}")
print(f"CPU model: {metadata.get('cpu_model_name')}")
print(f"Platform: {metadata.get('platform')}")

# Get run-specific metadata
for i, run in enumerate(benchmark.get_runs()):
    run_metadata = run.get_metadata()
    print(f"Run {i}: {run_metadata.get('date')}")
```
### Benchmark Merging

```python
import pyperf

# Load multiple benchmark files
bench1 = pyperf.Benchmark.load('results1.json')
bench2 = pyperf.Benchmark.load('results2.json')

# Merge benchmarks (must have compatible metadata)
bench1.add_runs(bench2)
print(f"Combined runs: {bench1.get_nrun()}")

# Create suite from multiple benchmarks
suite = pyperf.BenchmarkSuite([bench1, bench2])
print(f"Suite benchmarks: {suite.get_benchmark_names()}")
```