
# Core Benchmarking

## Overview

The core benchmarking functionality provides the primary `benchmark` fixture, which automatically calibrates test runs for accurate performance measurements. It offers two main modes: automatic benchmarking with `benchmark()` and pedantic mode with `benchmark.pedantic()` for fine-grained control.

## Imports

```python
# The benchmark fixture is automatically available in pytest tests
import pytest

# For programmatic access to benchmarking classes (rarely needed)
from pytest_benchmark.fixture import BenchmarkFixture, FixtureAlreadyUsed
```

## Core APIs

### benchmark fixture

```python { .api }
@pytest.fixture
def benchmark(request) -> BenchmarkFixture:
    """
    Primary pytest fixture for benchmarking functions.

    Returns:
        BenchmarkFixture: The benchmark fixture instance for the current test.
    """
```

### BenchmarkFixture.__call__

```python { .api }
def __call__(self, function_to_benchmark, *args, **kwargs) -> Any:
    """
    Benchmark a function with automatic calibration and timing.

    Args:
        function_to_benchmark: The function to benchmark
        *args: Positional arguments to pass to the function
        **kwargs: Keyword arguments to pass to the function

    Returns:
        Any: The return value of the benchmarked function

    Raises:
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.pedantic

```python { .api }
def pedantic(self, target, args=(), kwargs=None, setup=None, rounds=1, warmup_rounds=0, iterations=1) -> Any:
    """
    Benchmark with precise control over execution parameters.

    Args:
        target: The function to benchmark
        args: Tuple of positional arguments (default: ())
        kwargs: Dict of keyword arguments (default: None)
        setup: Setup function called before each round (default: None)
        rounds: Number of measurement rounds (default: 1)
        warmup_rounds: Number of warmup rounds (default: 0)
        iterations: Number of iterations per round (default: 1)

    Returns:
        Any: The return value of the benchmarked function

    Raises:
        ValueError: If iterations, rounds, or warmup_rounds are invalid
        TypeError: If setup returns arguments when args/kwargs are also provided
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.weave

```python { .api }
def weave(self, target, **kwargs) -> None:
    """
    Apply benchmarking to a target function using aspect-oriented programming.

    Args:
        target: The function, method, or object to benchmark
        **kwargs: Additional arguments passed to aspectlib.weave()

    Raises:
        ImportError: If aspectlib is not installed
        FixtureAlreadyUsed: If the fixture has already been used in this test
    """
```

### BenchmarkFixture.patch

```python { .api }
# Alias for weave method
patch = weave
```

### BenchmarkFixture Properties and Attributes

```python { .api }
@property
def enabled(self) -> bool:
    """Whether benchmarking is enabled (not disabled)."""

# Instance attributes (not properties)
name: str                 # The test function name
fullname: str             # The full test node ID
param: str | None         # Test parametrization ID if parametrized, None otherwise
params: dict | None       # Test parametrization parameters if parametrized, None otherwise
group: str | None         # Benchmark group name if specified, None otherwise
has_error: bool           # Whether the benchmark encountered an error
extra_info: dict          # Additional benchmark information
stats: 'Metadata' | None  # Benchmark statistics after execution, None before execution
```

## Usage Examples

### Basic Function Benchmarking

```python
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def test_fibonacci_benchmark(benchmark):
    # Automatic calibration determines optimal iterations
    result = benchmark(fibonacci, 20)
    assert result == 6765
```

### Benchmarking with Arguments

```python
def string_operations(text, count):
    result = []
    for _ in range(count):
        result.append(text.upper().replace(' ', '_'))
    return result

def test_string_operations(benchmark):
    result = benchmark(string_operations, "hello world", 1000)
    assert len(result) == 1000
    assert result[0] == "HELLO_WORLD"
```

### Pedantic Mode Examples

#### Basic Pedantic Benchmarking

```python
def test_pedantic_basic(benchmark):
    def expensive_operation():
        return sum(x**2 for x in range(10000))

    result = benchmark.pedantic(
        target=expensive_operation,
        rounds=5,       # Run 5 measurement rounds
        iterations=10,  # 10 iterations per round
    )
    assert result == 333283335000
```

#### Pedantic with Setup Function

```python
def test_pedantic_with_setup(benchmark):
    def create_data():
        # Setup function returns (args, kwargs)
        data = list(range(10000))
        return (data,), {}

    def process_data(data):
        return sum(x**2 for x in data)

    result = benchmark.pedantic(
        target=process_data,
        setup=create_data,
        rounds=3,
        warmup_rounds=1,
    )
    assert result == 333283335000
```

#### Pedantic with Explicit Arguments

```python
def test_pedantic_with_args(benchmark):
    def multiply_matrices(a, b, size):
        result = [[0] * size for _ in range(size)]
        for i in range(size):
            for j in range(size):
                for k in range(size):
                    result[i][j] += a[i][k] * b[k][j]
        return result

    # Create test matrices
    size = 50
    matrix_a = [[1] * size for _ in range(size)]
    matrix_b = [[2] * size for _ in range(size)]

    result = benchmark.pedantic(
        target=multiply_matrices,
        args=(matrix_a, matrix_b, size),
        rounds=3,
        iterations=1,
    )
    assert result[0][0] == 100  # 50 * 1 * 2
```

## Calibration and Timing

The benchmark fixture automatically calibrates the number of iterations to ensure reliable measurements:

1. **Timer Precision**: Computes timer precision for the platform
2. **Minimum Time**: Ensures each measurement round meets minimum time thresholds
3. **Calibration**: Automatically determines optimal number of iterations
4. **Warmup**: Optional warmup rounds to stabilize performance
5. **Statistics**: Collects timing data across multiple rounds

### Calibration Process

```python
def test_calibration_behavior(benchmark):
    def fast_function():
        return sum(range(100))

    # The fixture will automatically:
    # 1. Measure timer precision
    # 2. Run calibration to find optimal iterations
    # 3. Execute warmup rounds if configured
    # 4. Run the actual benchmark rounds
    # 5. Collect and analyze statistics
    result = benchmark(fast_function)
    assert result == 4950
```

## Exception Handling

### FixtureAlreadyUsed

```python { .api }
class FixtureAlreadyUsed(Exception):
    """Raised when benchmark fixture is used more than once in a test."""
```

```python
def test_fixture_single_use(benchmark):
    benchmark(lambda: 42)

    # A second call would raise FixtureAlreadyUsed
    # benchmark(lambda: 24)  # Error!
```

### Error States

```python
def test_error_handling(benchmark):
    def failing_function():
        raise ValueError("Something went wrong")

    # The benchmark fixture propagates exceptions from the benchmarked function
    with pytest.raises(ValueError):
        benchmark(failing_function)

    # The fixture's has_error attribute will be True
    assert benchmark.has_error
```

## Integration with pytest

### Test Parametrization

```python
@pytest.mark.parametrize("size", [100, 1000, 10000])
def test_scaling_benchmark(benchmark, size):
    def process_list(n):
        return sum(range(n))

    result = benchmark(process_list, size)
    expected = size * (size - 1) // 2
    assert result == expected
```

### Test Collection and Skipping

The benchmark fixture integrates with pytest's test collection and skipping mechanisms. Tests with benchmarks are automatically identified and can be controlled via command-line options like `--benchmark-skip` and `--benchmark-only`.
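For example, assuming pytest-benchmark is installed, the suite can be filtered from the command line:

```shell
# Skip every test that uses the benchmark fixture
pytest --benchmark-skip

# Run only the tests that use the benchmark fixture
pytest --benchmark-only
```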