tessl/pypi-atheris

A coverage-guided fuzzer for Python and Python extensions based on libFuzzer

Code Instrumentation

Atheris provides multiple methods for adding coverage instrumentation to Python code. Instrumentation is essential for effective fuzzing as it provides feedback to the fuzzer about which code paths have been executed, enabling guided exploration of the program's behavior.
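To make "feedback about which code paths have been executed" concrete, the sketch below records which lines of a function run for a given input, using the standard library's sys.settrace. It is only a conceptual illustration: Atheris rewrites bytecode rather than installing a trace function, and the names executed_lines and target are hypothetical.

```python
import sys

def executed_lines(func, *args):
    """Return line offsets (relative to the def line) executed inside func."""
    code = func.__code__
    lines = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if event == "line" and frame.f_code is code:
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return lines

# Hypothetical target with two paths that depend on the input length.
def target(data):
    if len(data) < 4:
        return "short"
    return "long"

short_path = executed_lines(target, b"ab")
long_path = executed_lines(target, b"abcdef")
print(sorted(short_path), sorted(long_path))
```

The two inputs produce different line sets. A difference of this kind is the signal a coverage-guided fuzzer uses to decide that an input reached new behavior and is worth keeping.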

Capabilities

Import-Time Instrumentation

Automatically instrument Python modules as they are imported.

def instrument_imports(include=None, exclude=None, enable_loader_override=True):
    """
    Context manager that instruments Python modules imported within the context.
    
    This is the recommended approach for most fuzzing scenarios as it automatically
    instruments libraries without requiring manual intervention.
    
    Args:
        include (list, optional): List of fully-qualified module names to instrument.
                                If provided, only these modules will be instrumented.
        exclude (list, optional): List of fully-qualified module names to exclude
                                from instrumentation.
        enable_loader_override (bool): Whether to enable the experimental feature of
                                     instrumenting custom loaders (default: True).
    
    Returns:
        Context manager: Use with 'with' statement to enable instrumentation
    """

Usage Examples:

import atheris

# Basic usage - instrument all imported modules
with atheris.instrument_imports():
    import json
    import urllib.parse
    from xml.etree import ElementTree

def TestOneInput(data):
    # All the imported modules are now instrumented
    try:
        parsed = json.loads(data.decode('utf-8', errors='ignore'))
    except json.JSONDecodeError:
        return  # Malformed JSON is expected input, not a finding
    # ...

# Selective instrumentation
with atheris.instrument_imports(include=['mypackage', 'mypackage.submodule']):
    import mypackage  # Will be instrumented
    import os        # Will NOT be instrumented

# Exclusion-based instrumentation  
with atheris.instrument_imports(exclude=['requests.sessions']):
    import requests  # Most of requests will be instrumented
    # except requests.sessions which is excluded

Important Notes:

  • Only modules imported within the context will be instrumented
  • Previously imported modules cannot be instrumented with this method
  • Use instrument_all() for already-imported modules
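The notes above follow from how Python's import system works: a module that is already in sys.modules is served from that cache, so an import hook never sees it. The sketch below shows this with a hypothetical RecordingFinder that only logs module names. It illustrates the mechanism and is not Atheris's actual hook, which also rewrites the bytecode of the modules it loads.

```python
import contextlib
import sys

class RecordingFinder:
    """Meta path finder that only records names, then defers to the real finders."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path=None, target=None):
        self.seen.append(fullname)
        return None  # None means "not handled here"; the normal finders take over

@contextlib.contextmanager
def record_imports():
    """Install the recording finder for the duration of the with block."""
    finder = RecordingFinder()
    sys.meta_path.insert(0, finder)
    try:
        yield finder.seen
    finally:
        sys.meta_path.remove(finder)

import json                        # imported before the context, so it is cached
sys.modules.pop("colorsys", None)  # make sure colorsys is NOT cached

with record_imports() as seen:
    import json      # cache hit: the hook is never consulted
    import colorsys  # fresh import: the hook sees it

print(seen)
```

Here json never reaches the hook because it was imported earlier, which is the situation where instrument_all() or instrument_func() is needed instead.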

Function-Level Instrumentation

Instrument individual functions with a decorator or direct call.

def instrument_func(func):
    """
    Instrument a specific Python function for coverage tracking.
    
    Can be used as a decorator or called directly on a function object.
    The function is instrumented in-place, affecting all call sites.
    
    Args:
        func (callable): Function to instrument
    
    Returns:
        callable: The same function, now instrumented
    """

Usage Examples:

import atheris

# As a decorator
@atheris.instrument_func
def my_function(x, y):
    if x > y:
        return x * 2
    else:
        return y * 2

# Direct instrumentation
def another_function(data):
    return process_data(data)  # process_data is a placeholder for your own code

atheris.instrument_func(another_function)

# Instrumenting the test function itself (recommended)
@atheris.instrument_func
def TestOneInput(data):
    if len(data) < 2:
        return  # Too short to index data[0] and data[1]
    my_function(data[0], data[1])
    another_function(data[2:])

When to Use Function-Level Instrumentation:

  • For specific functions you want to ensure are instrumented
  • When you need instrumentation but can't use instrument_imports()
  • For the TestOneInput function itself to avoid "no interesting inputs" errors
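The statement that a function is instrumented in place, "affecting all call sites", can be illustrated with plain Python. One plausible reading, assumed here, is that the function object keeps its identity and only its code object is replaced. The sketch swaps in a hand-written stand-in rather than real instrumentation, and the names parse and _instrumented_parse are hypothetical.

```python
def parse(value):
    return "original"

def _instrumented_parse(value):
    # Stand-in for an instrumented body. Real instrumentation would keep the
    # original behaviour and only add coverage bookkeeping.
    return "patched"

alias = parse  # a second reference, as another module or call site might hold

# Swap the code object in place; the function object itself is unchanged.
parse.__code__ = _instrumented_parse.__code__

print(alias("x"), parse is alias)
```

Because every reference points at the same function object, callers that captured the function before the swap still run the new code. No re-import or rebinding is needed.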

Global Instrumentation

Instrument all currently loaded Python functions.

def instrument_all():
    """
    Instrument all currently loaded Python functions.
    
    This scans the entire Python interpreter and instruments every function
    it finds, including core Python functions and previously imported modules.
    This operation can be slow but provides comprehensive coverage.
    
    Note: This is an experimental feature.
    """

Usage Example:

import atheris
import sys

# Import modules first
import target_module

def TestOneInput(data):
    target_module.process(data)

# Instrument everything that's currently loaded
atheris.instrument_all()

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

When to Use Global Instrumentation:

  • When you need to instrument modules that were imported before you could use instrument_imports()
  • For comprehensive coverage including Python standard library functions
  • When troubleshooting coverage issues
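To get a feel for how much code "all currently loaded Python functions" covers, the sketch below counts live function objects using the garbage collector. It assumes nothing about how instrument_all() finds functions internally; loaded_python_functions is a hypothetical helper.

```python
import gc
import types

def loaded_python_functions():
    """Return every live function object that carries Python bytecode."""
    return [obj for obj in gc.get_objects() if isinstance(obj, types.FunctionType)]

def sample():
    return 1

functions = loaded_python_functions()
print(f"{len(functions)} Python functions currently loaded")

# Built-ins implemented in C, such as len, are not Python functions and
# have no bytecode for a bytecode-level instrumenter to rewrite.
print(isinstance(len, types.FunctionType))
```

Even a small script typically has a large number of functions loaded. That is consistent with the warning that global instrumentation can be slow.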

Low-Level Bytecode Instrumentation

Direct bytecode manipulation for advanced use cases.

def patch_code(code, trace_dataflow, nested=False):
    """
    Patch Python bytecode with instrumentation for coverage tracking.
    
    This is a low-level function used internally by other instrumentation methods.
    Most users should use the higher-level functions instead.
    
    Args:
        code (types.CodeType): Python code object to patch
        trace_dataflow (bool): Whether to trace dataflow (comparisons)
        nested (bool): Whether this is a nested code object (default: False)
    
    Returns:
        types.CodeType: New code object with instrumentation added
    """

Instrumentation Best Practices

Recommended Pattern:

#!/usr/bin/python3

import atheris
import sys

# Import system modules before instrumentation
import os

# Import and instrument target modules
with atheris.instrument_imports():
    import target_library
    from mypackage import mymodule

@atheris.instrument_func
def TestOneInput(data):
    # Instrumented test function
    target_library.parse(data)
    mymodule.process(data)

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

Handling "No Interesting Inputs" Error:

This error means the first few executions produced no coverage feedback, usually because none of the code they ran was instrumented:

ERROR: no interesting inputs were found. Is the code instrumented for coverage? Exiting.

Solutions:

  1. Add @atheris.instrument_func to your TestOneInput function
  2. Use atheris.instrument_all() if import-time instrumentation isn't sufficient
  3. Ensure your test function actually calls instrumented code in the first few executions

Example Fix:

# This might cause "no interesting inputs" error
def TestOneInput(data):
    if len(data) < 10:
        return  # Early return with no instrumented code
    target_function(data)

# Better approach
@atheris.instrument_func  # Instrument the test function itself
def TestOneInput(data):
    if len(data) < 10:
        return  # Now this branch is instrumented
    target_function(data)

Performance Considerations

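  • Import-time instrumentation rewrites each module once, at import; after that, the only cost is the coverage bookkeeping in the instrumented code
  • Global instrumentation can be slow during the instrumentation phase but has similar runtime performance afterwards
  • Function-level instrumentation provides fine-grained control and adds overhead only to the functions you select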

Module Compatibility

Some modules may not work well with instrumentation:

# Skip problematic modules
with atheris.instrument_imports(exclude=['numpy.core', 'matplotlib._internal']):
    import numpy
    import matplotlib.pyplot as plt

Common modules to exclude:

  • Native extension internals (e.g., numpy.core)
  • Modules that use dynamic code generation
  • Modules with C extensions that don't interact well with bytecode instrumentation
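When deciding what to put in exclude, it can help to know which modules have Python source at all, since bytecode instrumentation has nothing to rewrite in a compiled extension. The sketch below uses only importlib; has_python_source is a hypothetical helper, not part of Atheris.

```python
import importlib.machinery
import importlib.util

def has_python_source(module_name):
    """True if the module would be loaded from a .py file by the standard loader."""
    spec = importlib.util.find_spec(module_name)
    return spec is not None and isinstance(
        spec.loader, importlib.machinery.SourceFileLoader
    )

# json is a pure-Python package; sys is compiled into the interpreter.
print(has_python_source("json"), has_python_source("sys"))
```

A False result is a prompt to check manually and not a verdict: frozen and zip-imported modules also return False even though they contain Python bytecode.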

Testing Instrumentation

Verify that instrumentation is working:

import atheris
import sys

with atheris.instrument_imports():
    import target_module

def TestOneInput(data):
    print(f"Testing with {len(data)} bytes")
    target_module.function(data)

# Run with limited iterations to verify coverage
# Command: python fuzzer.py -atheris_runs=10
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

If instrumentation is working, the fuzzer's status lines should report a nonzero coverage count (the cov: field in libFuzzer output).
