tessl/pypi-pyevmasm

Ethereum Virtual Machine (EVM) assembler and disassembler library for working with EVM bytecode and assembly instructions

Disassembly Operations

Convert EVM bytecode to assembly language. The functions accept bytes, byte arrays, hex strings, and iterators, return per-instruction metadata, and take a fork name so that each opcode is interpreted according to the selected Ethereum hard fork.

Capabilities

Single Instruction Disassembly

Disassemble a single EVM instruction from bytecode, extracting complete instruction metadata including operands.

def disassemble_one(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> Instruction:
    """
    Disassemble a single instruction from bytecode.

    Parameters:
    - bytecode (str | bytes | bytearray | iterator): The bytecode stream
    - pc (int, optional): Program counter of the instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    Instruction: An Instruction object with complete metadata, or None if no instruction found

    Note:
    If the bytecode ends before the operand is complete, None is returned (see Error Handling)
    """

Usage Examples:

from pyevmasm import disassemble_one

# Disassemble from bytes
instruction = disassemble_one(b'\x60\x40')  # PUSH1 0x40
print(f"Name: {instruction.name}")           # PUSH1
print(f"Operand: 0x{instruction.operand:x}")  # 0x40
print(f"Gas: {instruction.fee}")            # 3

# From a hex string: convert to bytes first (a str argument is read as raw Latin-1 bytes, not as hex)
instruction = disassemble_one(bytes.fromhex("6040"))
print(f"Same instruction: {instruction.name}")  # PUSH1

# Unassigned opcodes become INVALID (0xff itself is SELFDESTRUCT, so use an unassigned byte)
invalid = disassemble_one(b'\x0c')
print(f"Invalid opcode: {invalid.name}")  # INVALID

# With program counter
instruction = disassemble_one(b'\x56', pc=100)  # JUMP
print(f"PC: {instruction.pc}")  # 100
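
To make the operand handling concrete, here is a minimal, self-contained sketch of how a single instruction can be decoded. It is not pyevmasm's implementation: the `decode_one` helper and the three-entry `NAMES` table are hypothetical and cover only a handful of opcodes. The sketch relies on one property of the EVM encoding: opcodes 0x60 through 0x7f are PUSH1 through PUSH32, and PUSHn is followed by n bytes of big-endian immediate data.

```python
from typing import Optional, Tuple

# Hypothetical, deliberately tiny name table (illustration only).
NAMES = {0x52: "MSTORE", 0x56: "JUMP", 0x5b: "JUMPDEST"}


def decode_one(code: bytes, offset: int = 0) -> Optional[Tuple[str, Optional[int], int]]:
    """Return (name, operand, size) for the instruction at offset, or None."""
    if offset >= len(code):
        return None  # empty input: nothing to decode
    opcode = code[offset]
    if 0x60 <= opcode <= 0x7f:
        width = opcode - 0x5f  # PUSH1 -> 1 byte ... PUSH32 -> 32 bytes
        raw = code[offset + 1 : offset + 1 + width]
        if len(raw) < width:
            return None  # truncated operand
        return f"PUSH{width}", int.from_bytes(raw, "big"), 1 + width
    # Unknown opcodes are reported as INVALID instead of raising.
    return NAMES.get(opcode, "INVALID"), None, 1


print(decode_one(b"\x60\x40"))  # ('PUSH1', 64, 2)
```

Returning the size alongside the name and operand lets a caller advance to the next instruction without re-inspecting the opcode.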

Multiple Instruction Disassembly

Disassemble all instructions in a bytecode sequence, returning a generator for memory-efficient processing of large bytecode.

def disassemble_all(bytecode, pc: int = 0, fork: str = DEFAULT_FORK):
    """
    Disassemble all instructions in bytecode.

    Parameters:
    - bytecode (str | bytes | bytearray | iterator): An EVM bytecode (binary)
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    Generator[Instruction]: A generator of Instruction objects

    Note:
    The generator stops at the end of the input, or earlier if the remaining bytes are too few to hold an instruction's operand
    """

Usage Examples:

from pyevmasm import disassemble_all

# Disassemble complete bytecode
bytecode = b'\x60\x80\x60\x40\x52\x60\x04\x36\x10'
instructions = list(disassemble_all(bytecode))

for instr in instructions:
    print(f"{instr.pc:08x}: {instr}")
# Output:
# 00000000: PUSH1 0x80
# 00000002: PUSH1 0x40
# 00000004: MSTORE
# 00000005: PUSH1 0x4
# 00000007: CALLDATASIZE
# 00000008: LT

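# Memory-efficient processing of large bytecode
large_bytecode = bytecode + b'\x60\x00\x56'  # stand-in for a large contract; appends PUSH1 0x0, JUMP
for instruction in disassemble_all(large_bytecode):
    if instruction.is_branch:
        print(f"Branch at 0x{instruction.pc:x}: {instruction}")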

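
The program counters in the listing above follow from instruction sizes: each instruction occupies one byte plus its immediate data. A rough sketch of that linear sweep, written as a generator, is shown below. The `sweep` function is a hypothetical stand-in, not the library's code, and it yields raw `(pc, opcode, operand)` tuples rather than `Instruction` objects.

```python
from typing import Iterator, Optional, Tuple


def sweep(code: bytes, pc: int = 0) -> Iterator[Tuple[int, int, Optional[int]]]:
    """Lazily yield (pc, opcode, operand) for each instruction in code."""
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        # PUSH1..PUSH32 (0x60..0x7f) carry 1..32 bytes of immediate data.
        width = opcode - 0x5f if 0x60 <= opcode <= 0x7f else 0
        raw = code[offset + 1 : offset + 1 + width]
        if len(raw) < width:
            return  # truncated operand: stop the generator
        operand = int.from_bytes(raw, "big") if width else None
        yield pc + offset, opcode, operand
        offset += 1 + width


listing = list(sweep(bytes.fromhex("608060405260043610")))
print([pc for pc, _, _ in listing])  # [0, 2, 4, 5, 7, 8]
```

Because the function is a generator, a caller that stops iterating early never decodes the rest of the input.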
Text Disassembly

Disassemble bytecode to human-readable assembly text format, suitable for display and analysis.

def disassemble(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
    """
    Disassemble an EVM bytecode to text representation.

    Parameters:
    - bytecode (str | bytes | bytearray): Binary representation of EVM bytecode
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    str: The text representation of the assembly code (newline-separated)
    """

Hexadecimal Disassembly

Disassemble hexadecimal bytecode strings to assembly text. The input may be given with or without a 0x prefix.

def disassemble_hex(bytecode: str, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
    """
    Disassemble hexadecimal EVM bytecode to assembly text.

    Parameters:
    - bytecode (str): Canonical representation of EVM bytecode (hexadecimal, with or without 0x prefix)
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    str: The text representation of the assembly code (newline-separated)
    """

Usage Examples:

from pyevmasm import disassemble, disassemble_hex

# From binary bytecode
binary_code = b'\x60\x80\x60\x40\x52'
assembly = disassemble(binary_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE

# From hex string (most common usage)
hex_code = "0x608060405260043610"
assembly = disassemble_hex(hex_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE
# PUSH1 0x4
# CALLDATASIZE
# LT

# Hex without 0x prefix also works
assembly = disassemble_hex("608060405260043610")
# Same result

Input Format Support

PyEVMAsm's disassembly functions accept multiple input formats:

Binary Formats

  • bytes: Native Python bytes objects
  • bytearray: Mutable byte arrays
  • str (binary): Latin-1 encoded strings (legacy support)
  • iterator: Any iterator yielding integer byte values
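
One way to picture how these binary inputs can share a single decoding path is to normalize each of them to an iterator of integers first. The helper below is a hypothetical sketch under that assumption, not the library's code. Note that a `str` is interpreted here as Latin-1 encoded raw bytes, not as hexadecimal text.

```python
from typing import Iterable, Iterator, Union


def byte_iter(data: Union[str, bytes, bytearray, Iterable[int]]) -> Iterator[int]:
    """Normalize supported binary inputs to an iterator of ints in 0..255."""
    if isinstance(data, str):
        # Legacy form: each character stands for one byte (Latin-1).
        data = data.encode("latin-1")
    return iter(data)  # bytes, bytearray and int iterables all yield ints


assert list(byte_iter(b"\x60\x40")) == [0x60, 0x40]
assert list(byte_iter(bytearray(b"\x60\x40"))) == [0x60, 0x40]
assert list(byte_iter("\x60\x40")) == [0x60, 0x40]
assert list(byte_iter(iter([0x60, 0x40]))) == [0x60, 0x40]
# A hex-looking str is NOT decoded as hex: "6040" is four ASCII bytes.
assert list(byte_iter("6040")) == [0x36, 0x30, 0x34, 0x30]
```

The last assertion shows why hex text should go through a hex-aware entry point or be converted with `bytes.fromhex` first.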

Hexadecimal Formats

  • 0x-prefixed: "0x608060405260043610"
  • Plain hex: "608060405260043610"
  • Mixed case: Case-insensitive hex parsing

Special Format Handling

The hexadecimal entry point normalizes its input before decoding:

from pyevmasm import disassemble_hex

# Automatically handles 0x prefix
code1 = disassemble_hex("0x6080604052")
code2 = disassemble_hex("6080604052")
assert code1 == code2

# Binary Ninja format detection (EVM prefix)
code3 = disassemble_hex("EVM6080604052")  # Strips "EVM" prefix

# Hex digits are case-insensitive
code4 = disassemble_hex("60FF")
assert code4 == disassemble_hex("60ff")
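
The prefix and case handling can be approximated with the standard library alone. The `hex_to_bytes` helper below is a hypothetical illustration of the normalization step, not pyevmasm's own code, and it recognizes only the lowercase `0x` prefix.

```python
from binascii import unhexlify


def hex_to_bytes(text: str) -> bytes:
    """Convert hex text, with or without a 0x prefix, to raw bytes."""
    text = text.strip()
    if text.startswith("0x"):
        text = text[2:]
    return unhexlify(text)  # accepts upper- and lowercase digits


assert hex_to_bytes("0x6080604052") == hex_to_bytes("6080604052")
assert hex_to_bytes("60AbCd") == b"\x60\xab\xcd"
```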

Fork-Specific Disassembly

Different Ethereum forks have different instruction sets. Pass the fork name to decode opcodes according to that fork; an opcode the fork does not define decodes as INVALID:

from pyevmasm import disassemble_one

# Byzantium introduced RETURNDATASIZE (0x3d)
instr = disassemble_one(b'\x3d', fork="byzantium")
print(instr.name)  # "RETURNDATASIZE"

# Same opcode in frontier fork
instr = disassemble_one(b'\x3d', fork="frontier")
print(instr.name)  # "INVALID"

# Constantinople introduced shift operations
instr = disassemble_one(b'\x1b', fork="constantinople")
print(instr.name)  # "SHL"

instr = disassemble_one(b'\x1b', fork="byzantium")
print(instr.name)  # "INVALID"
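
The behavior above is consistent with per-fork opcode tables in which each fork extends its predecessor. The sketch below models that idea with a few well-known additions. The `FORK_ADDITIONS` layout and the `table_for` and `name_for` helpers are hypothetical, the tables are far from complete, and intermediate forks are omitted.

```python
# Opcodes added by each fork (partial, for illustration only).
FORK_ADDITIONS = [
    ("frontier", {0x01: "ADD", 0x56: "JUMP"}),
    ("byzantium", {0x3d: "RETURNDATASIZE", 0x3e: "RETURNDATACOPY",
                   0xfa: "STATICCALL", 0xfd: "REVERT"}),
    ("constantinople", {0x1b: "SHL", 0x1c: "SHR", 0x1d: "SAR", 0xf5: "CREATE2"}),
]


def table_for(fork: str) -> dict:
    """Accumulate opcode names from the first fork up to and including fork."""
    table = {}
    for name, additions in FORK_ADDITIONS:
        table.update(additions)
        if name == fork:
            return table
    raise KeyError(f"unknown fork: {fork}")


def name_for(opcode: int, fork: str) -> str:
    """Look up an opcode name; undefined opcodes map to INVALID."""
    return table_for(fork).get(opcode, "INVALID")


print(name_for(0x3d, "frontier"), name_for(0x3d, "byzantium"))        # INVALID RETURNDATASIZE
print(name_for(0x1b, "byzantium"), name_for(0x1b, "constantinople"))  # INVALID SHL
```

Layering the tables this way keeps each fork's definition limited to what it changed, and an opcode lookup for an older fork never sees later additions.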

Error Handling

Disassembly functions report most problems through their return values and raise only when the input cannot be decoded at all:

  • Insufficient data: Returns None (single) or stops the generator (multiple) when too few bytes remain for an operand
  • Invalid opcodes: Creates INVALID instruction objects for unknown opcodes
  • Empty input: Returns None (single) or an empty generator (multiple)
  • Malformed hex: Raises an exception when the string cannot be decoded as hexadecimal (for example, it contains non-hex characters)
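
For the malformed-hex case, the exact exception depends on how the hex text is decoded. Assuming a decoder built on the standard library's `binascii.unhexlify`, which is an assumption rather than something this page states, both odd-length input and non-hex characters raise `binascii.Error`, a subclass of `ValueError`. The `try_unhex` wrapper is hypothetical and shows one way to turn that exception into a `None` result.

```python
import binascii
from typing import Optional


def try_unhex(text: str) -> Optional[bytes]:
    """Return decoded bytes, or None if text is not valid hex."""
    try:
        return binascii.unhexlify(text)
    except ValueError:
        # Covers binascii.Error (odd length, non-hex digits)
        # and the ValueError raised for non-ASCII characters.
        return None


assert try_unhex("6080") == b"\x60\x80"
assert try_unhex("608") is None   # odd number of digits
assert try_unhex("60zz") is None  # non-hexadecimal characters
assert issubclass(binascii.Error, ValueError)
```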
