
# Core Benchmarking

## Overview

The core benchmarking functionality provides the primary `benchmark` fixture, which automatically calibrates test runs for accurate performance measurements. It offers two main modes: automatic benchmarking with `benchmark()` and pedantic mode with `benchmark.pedantic()` for fine-grained control.

## Imports

```python
# The benchmark fixture is automatically available in pytest tests
import pytest

# For programmatic access to benchmarking classes (rarely needed)
from pytest_benchmark.fixture import BenchmarkFixture, FixtureAlreadyUsed
```

## Core APIs

### benchmark fixture

```python { .api }
@pytest.fixture
def benchmark(request) -> BenchmarkFixture:
    """
    Primary pytest fixture for benchmarking functions.

    Returns:
        BenchmarkFixture: The benchmark fixture instance for the current test.
    """
```

### BenchmarkFixture.__call__

```python { .api }
def __call__(self, function_to_benchmark, *args, **kwargs) -> Any:
    """
    Benchmark a function with automatic calibration and timing.

    Args:
        function_to_benchmark: The function to benchmark
        *args: Positional arguments to pass to the function
        **kwargs: Keyword arguments to pass to the function

    Returns:
        Any: The return value of the benchmarked function

    Raises:
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.pedantic

```python { .api }
def pedantic(self, target, args=(), kwargs=None, setup=None, rounds=1, warmup_rounds=0, iterations=1) -> Any:
    """
    Benchmark with precise control over execution parameters.

    Args:
        target: The function to benchmark
        args: Tuple of positional arguments (default: ())
        kwargs: Dict of keyword arguments (default: None)
        setup: Setup function called before each round (default: None)
        rounds: Number of measurement rounds (default: 1)
        warmup_rounds: Number of warmup rounds (default: 0)
        iterations: Number of iterations per round (default: 1)

    Returns:
        Any: The return value of the benchmarked function

    Raises:
        ValueError: If iterations, rounds, or warmup_rounds are invalid
        TypeError: If setup returns arguments when args/kwargs are also provided
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.weave

```python { .api }
def weave(self, target, **kwargs) -> None:
    """
    Apply benchmarking to a target function using aspect-oriented programming.

    Args:
        target: The function, method, or object to benchmark
        **kwargs: Additional arguments passed to aspectlib.weave()

    Raises:
        ImportError: If aspectlib is not installed
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.patch

```python { .api }
# Alias for weave method
patch = weave
```

### BenchmarkFixture Properties and Attributes

```python { .api }
@property
def enabled(self) -> bool:
    """Whether benchmarking is enabled (not disabled)."""

# Instance attributes (not properties)
name: str                 # The test function name
fullname: str             # The full test node ID
param: str | None         # Test parametrization ID if parametrized, None otherwise
params: dict | None       # Test parametrization parameters if parametrized, None otherwise
group: str | None         # Benchmark group name if specified, None otherwise
has_error: bool           # Whether the benchmark encountered an error
extra_info: dict          # Additional benchmark information
stats: 'Metadata' | None  # Benchmark statistics after execution, None before execution
```

## Usage Examples

### Basic Function Benchmarking

```python
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def test_fibonacci_benchmark(benchmark):
    # Automatic calibration determines optimal iterations
    result = benchmark(fibonacci, 20)
    assert result == 6765
```

### Benchmarking with Arguments

```python
def string_operations(text, count):
    result = []
    for _ in range(count):
        result.append(text.upper().replace(' ', '_'))
    return result

def test_string_operations(benchmark):
    result = benchmark(string_operations, "hello world", 1000)
    assert len(result) == 1000
    assert result[0] == "HELLO_WORLD"
```

### Pedantic Mode Examples

#### Basic Pedantic Benchmarking

```python
def test_pedantic_basic(benchmark):
    def expensive_operation():
        return sum(x**2 for x in range(10000))

    result = benchmark.pedantic(
        target=expensive_operation,
        rounds=5,       # Run 5 measurement rounds
        iterations=10,  # 10 iterations per round
    )
    assert result == 333283335000
```

#### Pedantic with Setup Function

```python
def test_pedantic_with_setup(benchmark):
    def create_data():
        # Setup function returns (args, kwargs)
        data = list(range(10000))
        return (data,), {}

    def process_data(data):
        return sum(x**2 for x in data)

    result = benchmark.pedantic(
        target=process_data,
        setup=create_data,
        rounds=3,
        warmup_rounds=1,
    )
    assert result == 333283335000
```

#### Pedantic with Explicit Arguments

```python
def test_pedantic_with_args(benchmark):
    def multiply_matrices(a, b, size):
        result = [[0] * size for _ in range(size)]
        for i in range(size):
            for j in range(size):
                for k in range(size):
                    result[i][j] += a[i][k] * b[k][j]
        return result

    # Create test matrices
    size = 50
    matrix_a = [[1] * size for _ in range(size)]
    matrix_b = [[2] * size for _ in range(size)]

    result = benchmark.pedantic(
        target=multiply_matrices,
        args=(matrix_a, matrix_b, size),
        rounds=3,
        iterations=1,
    )
    assert result[0][0] == 100  # 50 * 1 * 2
```

## Calibration and Timing

The benchmark fixture automatically calibrates the number of iterations to ensure reliable measurements:

1. **Timer Precision**: Computes timer precision for the platform
2. **Minimum Time**: Ensures each measurement round meets minimum time thresholds
3. **Calibration**: Automatically determines optimal number of iterations
4. **Warmup**: Optional warmup rounds to stabilize performance
5. **Statistics**: Collects timing data across multiple rounds

### Calibration Process

```python
def test_calibration_behavior(benchmark):
    def fast_function():
        return sum(range(100))

    # The fixture will automatically:
    # 1. Measure timer precision
    # 2. Run calibration to find optimal iterations
    # 3. Execute warmup rounds if configured
    # 4. Run the actual benchmark rounds
    # 5. Collect and analyze statistics
    result = benchmark(fast_function)
    assert result == 4950
```

## Exception Handling

### FixtureAlreadyUsed

```python { .api }
class FixtureAlreadyUsed(Exception):
    """Raised when benchmark fixture is used more than once in a test."""
```

```python
def test_fixture_single_use(benchmark):
    benchmark(lambda: 42)

    # A second call would raise FixtureAlreadyUsed
    # benchmark(lambda: 24)  # Error!
```

### Error States

```python
def test_error_handling(benchmark):
    def failing_function():
        raise ValueError("Something went wrong")

    # The benchmark fixture propagates exceptions from the benchmarked function
    with pytest.raises(ValueError):
        benchmark(failing_function)

    # The fixture's has_error attribute will be True
    assert benchmark.has_error
```

## Integration with pytest

### Test Parametrization

```python
@pytest.mark.parametrize("size", [100, 1000, 10000])
def test_scaling_benchmark(benchmark, size):
    def process_list(n):
        return sum(range(n))

    result = benchmark(process_list, size)
    expected = size * (size - 1) // 2
    assert result == expected
```

### Test Collection and Skipping

The benchmark fixture integrates with pytest's test collection and skipping mechanisms. Tests with benchmarks are automatically identified and can be controlled via command-line options like `--benchmark-skip` and `--benchmark-only`.
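For example, assuming pytest-benchmark is installed, the suite can be filtered from the command line:

```shell
# Skip every test that uses the benchmark fixture
pytest --benchmark-skip

# Run only the tests that use the benchmark fixture
pytest --benchmark-only
```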