# Atheris

A coverage-guided Python fuzzing engine based on libFuzzer. Atheris enables fuzzing of Python code and native extensions, providing feedback-guided testing to discover bugs through automatic input generation and mutation.

## Package Information

- **Package Name**: atheris
- **Language**: Python
- **Installation**: `pip install atheris`
- **Supported Platforms**: Linux (32-bit, 64-bit), Mac OS X
- **Python Versions**: 3.6 - 3.11

## Core Imports

```python
import atheris
```

For specific functionality:

```python
from atheris import Setup, Fuzz, FuzzedDataProvider
from atheris import instrument_imports, instrument_func, instrument_all
```

## Basic Usage

```python
#!/usr/bin/python3

import atheris
import sys

# Import libraries to fuzz within instrumentation context
with atheris.instrument_imports():
    import some_library

def TestOneInput(data):
    """Fuzzer entry point that receives fuzzer-generated bytes from libFuzzer."""
    some_library.parse(data)

# Configure and start fuzzing
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
```

Advanced usage with data provider:

```python
import atheris
import sys

with atheris.instrument_imports():
    import target_module

def TestOneInput(data):
    fdp = atheris.FuzzedDataProvider(data)

    # Extract structured data from raw bytes
    length = fdp.ConsumeIntInRange(1, 100)
    text = fdp.ConsumeUnicodeNoSurrogates(length)
    flag = fdp.ConsumeBool()

    target_module.process(text, flag)

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
```

## Architecture

Atheris implements coverage-guided fuzzing through several key components:

- **libFuzzer Integration**: Uses LLVM's libFuzzer engine for input generation, mutation, and coverage feedback
- **Bytecode Instrumentation**: Dynamically patches Python bytecode to collect branch and comparison coverage
- **Import-time Instrumentation**: Hooks module imports to automatically instrument loaded code
- **Native Extension Support**: Instruments C/C++ extensions when built with appropriate compiler flags
- **Custom Mutators**: Supports user-defined mutation strategies for domain-specific input generation
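
The feedback loop behind these components can be sketched in plain Python. The block below is a toy illustration only, not how Atheris or libFuzzer are implemented: it substitutes `sys.settrace` line coverage for bytecode instrumentation and a naive single-byte mutator for libFuzzer's mutation engine. The names `target`, `run_with_coverage`, `mutate`, and `fuzz` are hypothetical.

```python
import random
import sys


def target(data: bytes) -> None:
    """Hypothetical code under test: nested checks that blind guessing rarely passes."""
    if len(data) > 0 and data[0] == ord("F"):
        if len(data) > 1 and data[1] == ord("U"):
            if len(data) > 2 and data[2] == ord("Z"):
                raise ValueError("reached the deepest branch")


def run_with_coverage(func, data):
    """Call func(data); return (line numbers executed inside func, exception or None)."""
    covered = set()

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # ignore frames other than the target's
        if event == "line":
            covered.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(data)
        return covered, None
    except Exception as exc:  # a real fuzzer would report this as a crash
        return covered, exc
    finally:
        sys.settrace(previous)


def mutate(rng, data):
    """Insert, overwrite, or delete one random byte."""
    buf = bytearray(data)
    action = rng.choice(("insert", "overwrite", "delete")) if buf else "insert"
    if action == "insert":
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif action == "overwrite":
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(func, iterations=200_000, seed=0):
    """Return (crashing input, exception), or (None, None) if nothing crashed."""
    rng = random.Random(seed)
    corpus = [b""]
    seen = set()
    for _ in range(iterations):
        candidate = mutate(rng, rng.choice(corpus))
        covered, error = run_with_coverage(func, candidate)
        if error is not None:
            return candidate, error
        if not covered <= seen:  # new lines reached: keep this input for further mutation
            seen |= covered
            corpus.append(candidate)
    return None, None
```

Because inputs that reach a new line are kept and mutated further, the loop can work through the nested checks one byte at a time instead of having to guess all three bytes at once.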
78
79
## Capabilities
80
81
### Core Fuzzing Functions
82
83
Essential functions for setting up and running the fuzzer, including configuration and execution control.
84
85
```python { .api }
86
def Setup(args, test_one_input, internal_libfuzzer=None, custom_mutator=None, custom_crossover=None):
87
"""
88
Configure the fuzzer with test function and options.
89
90
Args:
91
args (list): Command-line arguments (typically sys.argv)
92
test_one_input (callable): Function that takes bytes and performs testing
93
internal_libfuzzer (bool, optional): Use internal libfuzzer (auto-detected if None)
94
custom_mutator (callable, optional): Custom mutation function
95
custom_crossover (callable, optional): Custom crossover function
96
97
Returns:
98
list: Remaining command-line arguments after fuzzer consumption
99
"""
100
101
def Fuzz():
102
"""
103
Start the fuzzing loop. Must call Setup() first.
104
105
This function does not return - it runs until the fuzzer stops
106
due to finding a crash, reaching run limits, or external termination.
107
"""
108
109
def Mutate(data, max_size):
110
"""
111
Mutate input data using libFuzzer's built-in mutator.
112
113
Args:
114
data (bytes): Input data to mutate
115
max_size (int): Maximum size of mutated output
116
117
Returns:
118
bytes: Mutated data
119
"""
120
```
121
122
[Core Fuzzing](./core-fuzzing.md)

### Data Provider

Utilities for converting raw fuzzer bytes into structured data types for more effective testing.

```python { .api }
class FuzzedDataProvider:
    """Converts raw fuzzer bytes into various data types."""

    def __init__(self, input_bytes: bytes):
        """Initialize with raw fuzzer input."""

    def ConsumeBytes(self, count: int) -> bytes:
        """Consume count bytes from the input."""

    def ConsumeInt(self, byte_size: int) -> int:
        """Consume a signed integer of specified byte size."""

    def ConsumeIntInRange(self, min_val: int, max_val: int) -> int:
        """Consume an integer in the range [min_val, max_val]."""

    def ConsumeBool(self) -> bool:
        """Consume either True or False."""
```

[Data Provider](./data-provider.md)
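
One way to keep a harness readable is to move the byte-to-structure conversion into its own function. The sketch below builds a hypothetical request dictionary using only the methods listed above plus `ConsumeUnicodeNoSurrogates` from the usage example. `build_request`, `handle_request`, `METHODS`, and the field names are invented for illustration. `atheris` is imported inside `main()` so that the builder can be tested with any object exposing the same methods.

```python
import sys

METHODS = ("GET", "POST", "PUT")


def build_request(fdp):
    """Turn fuzzer bytes into a structured request via FuzzedDataProvider-style calls."""
    method = METHODS[fdp.ConsumeIntInRange(0, len(METHODS) - 1)]
    keep_alive = fdp.ConsumeBool()
    path_length = fdp.ConsumeIntInRange(0, 64)
    path = "/" + fdp.ConsumeUnicodeNoSurrogates(path_length)
    body = fdp.ConsumeBytes(fdp.ConsumeIntInRange(0, 256))
    return {"method": method, "keep_alive": keep_alive, "path": path, "body": body}


def handle_request(request):
    """Hypothetical code under test; replace with the real entry point."""
    if request["method"] not in METHODS:
        raise ValueError("unknown method")
    return len(request["path"]) + len(request["body"])


def main():
    import atheris

    def test_one_input(data):
        handle_request(build_request(atheris.FuzzedDataProvider(data)))

    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()
```

Calling `main()` from the script's entry point starts the fuzzer. Drawing lengths with `ConsumeIntInRange` before consuming the text and body keeps every field bounded, whatever bytes the fuzzer supplies.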

### Code Instrumentation

Functions for adding coverage instrumentation to Python code at import-time, runtime, or globally.

```python { .api }
def instrument_imports(include=None, exclude=None, enable_loader_override=True):
    """
    Context manager that instruments Python modules imported within the context.

    Args:
        include (list, optional): Module names to include
        exclude (list, optional): Module names to exclude
        enable_loader_override (bool): Enable custom loader instrumentation

    Returns:
        Context manager for instrumented imports
    """

def instrument_func(func):
    """
    Decorator that instruments a specific Python function.

    Args:
        func (callable): Function to instrument

    Returns:
        callable: Instrumented function
    """

def instrument_all():
    """Instrument all currently loaded Python functions."""
```

[Code Instrumentation](./instrumentation.md)
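
Functions defined in the fuzzing script itself are never imported, so `instrument_imports` does not cover them; `instrument_func` handles that case. In the sketch below `classify` is a hypothetical target. `instrument_func` is applied as a plain call inside `main()` rather than as a decorator, an arrangement chosen here only so that `classify` can be run without Atheris installed.

```python
import sys


def classify(data: bytes) -> str:
    """Hypothetical target with several branches worth covering."""
    if data.startswith(b"\x89PNG"):
        return "png"
    if data.startswith(b"GIF8"):
        return "gif"
    if data[:2] == b"\xff\xd8":
        return "jpeg"
    return "unknown"


def main():
    import atheris

    # Same effect as decorating classify with @atheris.instrument_func.
    instrumented = atheris.instrument_func(classify)

    def test_one_input(data):
        instrumented(data)

    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()
```

Calling `main()` from the script's entry point starts the fuzzer. In a script that always runs under Atheris, writing `@atheris.instrument_func` directly above `classify` is the more conventional form.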

### Advanced Features

Hook management, custom mutators, and specialized instrumentation for regex and string operations.

```python { .api }
# Hook management
enabled_hooks = EnabledHooks()  # Global hook manager

def gen_match(pattern):
    """
    Generate a string that matches a regex pattern.

    Args:
        pattern (str or bytes): Regular expression pattern

    Returns:
        str or bytes: String that matches the pattern
    """

def path() -> str:
    """
    Get the path to the Atheris installation directory.

    Returns:
        str: Path to the directory containing Atheris files
    """

# Constants
ALL_REMAINING: int  # Special value for FuzzedDataProvider to consume all remaining bytes
```

[Advanced Features](./advanced-features.md)

## Types

```python { .api }
class EnabledHooks:
    """Manages the set of enabled instrumentation hooks."""

    def add(self, hook: str) -> None:
        """
        Enable a specific hook.

        Args:
            hook (str): Hook name ('RegEx' or 'str')
        """

    def __contains__(self, hook: str) -> bool:
        """
        Check if a hook is enabled.

        Args:
            hook (str): Hook name to check

        Returns:
            bool: True if the hook is enabled
        """
```