Ethereum Virtual Machine (EVM) assembler and disassembler library for working with EVM bytecode and assembly instructions
Convert EVM bytecode to assembly from bytes, hex strings, or byte iterators. Each decoded instruction carries metadata such as its name, operand, program counter, and gas fee, and the fork parameter selects the opcode table so that opcodes are interpreted as defined for that hard fork.
Disassemble a single EVM instruction from bytecode, extracting complete instruction metadata including operands.
def disassemble_one(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> Instruction:
"""
Disassemble a single instruction from bytecode.
Parameters:
- bytecode (str | bytes | bytearray | iterator): The bytecode stream
- pc (int, optional): Program counter of the instruction. Default: 0
- fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")
Returns:
Instruction: An Instruction object with complete metadata, or None if no instruction found
Raises:
ParseError: If bytecode is malformed or insufficient for operand parsing
"""
Usage Examples:
from pyevmasm import disassemble_one
# Disassemble from bytes
instruction = disassemble_one(b'\x60\x40') # PUSH1 0x40
print(f"Name: {instruction.name}") # PUSH1
print(f"Operand: 0x{instruction.operand:x}") # 0x40
print(f"Gas: {instruction.fee}") # 3
# From a hex string: decode to bytes first (disassemble_hex takes hex text directly)
instruction = disassemble_one(bytes.fromhex("6040"))
print(f"Same instruction: {instruction.name}") # PUSH1
# Invalid instructions decode as INVALID; 0xfe is the designated invalid opcode
# (0xff is SELFDESTRUCT, not an invalid opcode)
invalid = disassemble_one(b'\xfe')
print(f"Invalid opcode: {invalid.name}") # INVALID
# With program counter
instruction = disassemble_one(b'\x56', pc=100) # JUMP
print(f"PC: {instruction.pc}") # 100
Disassemble all instructions in a bytecode sequence, returning a generator for memory-efficient processing of large bytecode.
def disassemble_all(bytecode, pc: int = 0, fork: str = DEFAULT_FORK):
"""
Disassemble all instructions in bytecode.
Parameters:
- bytecode (str | bytes | bytearray | iterator): An EVM bytecode (binary)
- pc (int, optional): Program counter of the first instruction. Default: 0
- fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")
Returns:
Generator[Instruction]: A generator of Instruction objects
Note:
Generator stops when no more valid instructions can be parsed
"""
Usage Examples:
from pyevmasm import disassemble_all
# Disassemble complete bytecode
bytecode = b'\x60\x80\x60\x40\x52\x60\x04\x36\x10'
instructions = list(disassemble_all(bytecode))
for instr in instructions:
    print(f"{instr.pc:08x}: {instr}")
# Output:
# 00000000: PUSH1 0x80
# 00000002: PUSH1 0x40
# 00000004: MSTORE
# 00000005: PUSH1 0x4
# 00000007: CALLDATASIZE
# 00000008: LT
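The lazy, instruction-by-instruction decoding shown above can be pictured with a standard-library sketch. This is not pyevmasm's implementation: the names `Insn`, `OPCODES`, and `sweep` are hypothetical, and the opcode table is deliberately tiny. It only illustrates how a generator might walk bytecode, read PUSH operands, and keep the program counter in step.

```python
from typing import Iterable, Iterator, NamedTuple, Optional

# Tiny illustrative table (hypothetical); a real table covers every opcode of a fork.
OPCODES = {0x10: "LT", 0x36: "CALLDATASIZE", 0x52: "MSTORE", 0x56: "JUMP", 0x57: "JUMPI"}


class Insn(NamedTuple):
    pc: int
    name: str
    operand: Optional[int]

    def __str__(self) -> str:
        return self.name if self.operand is None else f"{self.name} 0x{self.operand:x}"


def sweep(code: Iterable[int], pc: int = 0) -> Iterator[Insn]:
    """Lazily decode one instruction at a time (linear sweep)."""
    stream = iter(code)
    for opcode in stream:
        if 0x60 <= opcode <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            width = opcode - 0x5F
            # range comes first in zip so no extra byte is consumed from the stream
            raw = bytes(b for _, b in zip(range(width), stream))
            if len(raw) < width:  # truncated operand: stop decoding
                return
            yield Insn(pc, f"PUSH{width}", int.from_bytes(raw, "big"))
            pc += 1 + width
        else:
            yield Insn(pc, OPCODES.get(opcode, "INVALID"), None)
            pc += 1


listing = [f"{i.pc:08x}: {i}" for i in sweep(b"\x60\x80\x60\x40\x52\x60\x04\x36\x10")]
```

Because `sweep` is a generator, a caller can filter or stop early without materializing the whole listing.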
# Memory-efficient processing of large bytecode
# (large_bytecode is a placeholder for any bytes object holding contract code)
for instruction in disassemble_all(large_bytecode):
    if instruction.is_branch:
        print(f"Branch at 0x{instruction.pc:x}: {instruction}")
Disassemble bytecode to human-readable assembly text format, suitable for display and analysis.
def disassemble(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
"""
Disassemble an EVM bytecode to text representation.
Parameters:
- bytecode (str | bytes | bytearray): Binary representation of EVM bytecode
- pc (int, optional): Program counter of the first instruction. Default: 0
- fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")
Returns:
str: The text representation of the assembly code (newline-separated)
"""
Disassemble hexadecimal bytecode strings to assembly text, handling common hex formats automatically.
def disassemble_hex(bytecode: str, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
"""
Disassemble hexadecimal EVM bytecode to assembly text.
Parameters:
- bytecode (str): Canonical representation of EVM bytecode (hexadecimal, with or without 0x prefix)
- pc (int, optional): Program counter of the first instruction. Default: 0
- fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")
Returns:
str: The text representation of the assembly code (newline-separated)
"""
Usage Examples:
from pyevmasm import disassemble, disassemble_hex
# From binary bytecode
binary_code = b'\x60\x80\x60\x40\x52'
assembly = disassemble(binary_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE
# From hex string (most common usage)
hex_code = "0x608060405260043610"
assembly = disassemble_hex(hex_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE
# PUSH1 0x4
# CALLDATASIZE
# LT
# Hex without 0x prefix also works
assembly = disassemble_hex("608060405260043610")
# Same result
PyEVMAsm's disassembly functions accept multiple input formats: str, bytes, and bytearray, plus iterators for disassemble_one and disassemble_all.
disassemble_hex accepts hex strings with or without the 0x prefix:
from pyevmasm import disassemble_hex
# Automatically handles 0x prefix
code1 = disassemble_hex("0x6080604052")
code2 = disassemble_hex("6080604052")
assert code1 == code2
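The prefix handling above can be approximated with the standard library when raw bytes are needed for the other functions. The helper below is a hypothetical sketch, not part of pyevmasm: the name `hex_to_bytes` and its `prefixes` parameter are assumptions made for illustration.

```python
from typing import Tuple


def hex_to_bytes(text: str, prefixes: Tuple[str, ...] = ("0x", "EVM")) -> bytes:
    """Strip one optional known prefix, then decode hex text to raw bytes."""
    text = text.strip()
    for prefix in prefixes:
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    # bytes.fromhex raises ValueError on non-hex characters or an odd digit count
    return bytes.fromhex(text)


raw = hex_to_bytes("0x6080604052")
```

The resulting bytes object can then be handed to any function that takes binary bytecode.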
# Binary Ninja-style input carries an "EVM" prefix; slicing it off explicitly
# works whether or not the library strips it
code3 = disassemble_hex("EVM6080604052"[3:])
assert code3 == code1
# A string of bare hex digits is valid input even without 0x
bare_hex = "6080604052"
Different Ethereum forks define different instruction sets. PyEVMAsm decodes each opcode according to the fork passed in the fork argument:
from pyevmasm import disassemble_one
# Byzantium introduced RETURNDATASIZE (0x3d)
instr = disassemble_one(b'\x3d', fork="byzantium")
print(instr.name) # "RETURNDATASIZE"
# Same opcode in frontier fork
instr = disassemble_one(b'\x3d', fork="frontier")
print(instr.name) # "INVALID"
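One way to picture fork-dependent decoding is as layered lookup tables, where each fork adds opcodes on top of its predecessor. The sketch below is an illustration under that assumption, not pyevmasm's internal data structure: the tables are heavily abbreviated, intermediate forks are omitted, and `opcode_name` is a hypothetical helper.

```python
from collections import ChainMap

# Abbreviated tables: a few opcodes per fork, intermediate forks left out.
FRONTIER = {0x01: "ADD", 0x56: "JUMP"}
BYZANTIUM = ChainMap(
    {0x3D: "RETURNDATASIZE", 0x3E: "RETURNDATACOPY", 0xFA: "STATICCALL", 0xFD: "REVERT"},
    FRONTIER,
)
CONSTANTINOPLE = ChainMap({0x1B: "SHL", 0x1C: "SHR", 0x1D: "SAR"}, BYZANTIUM)

TABLES = {"frontier": FRONTIER, "byzantium": BYZANTIUM, "constantinople": CONSTANTINOPLE}


def opcode_name(opcode: int, fork: str) -> str:
    """Look the opcode up in the fork's table; unknown opcodes map to INVALID."""
    return TABLES[fork].get(opcode, "INVALID")


names = [opcode_name(0x3D, fork) for fork in ("frontier", "byzantium", "constantinople")]
```

A later fork inherits everything defined earlier, so only the newly introduced opcodes need to be listed per layer.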
# Constantinople introduced shift operations
instr = disassemble_one(b'\x1b', fork="constantinople")
print(instr.name) # "SHL"
instr = disassemble_one(b'\x1b', fork="byzantium")
print(instr.name) # "INVALID"
Disassembly functions handle various error conditions gracefully:
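On the caller's side, the two outcomes documented for disassemble_one (a None result and a ParseError) can be handled in one small wrapper. The sketch below is hypothetical: `describe_first` and `fake_decode` are illustrative names, the decoder is injected so the example runs without the library, and the exception types to catch are a parameter because where the library exposes ParseError is not confirmed here.

```python
from typing import Callable, Optional, Tuple, Type


def describe_first(
    bytecode: bytes,
    decode_one: Callable[[bytes], Optional[object]],
    expected_errors: Tuple[Type[Exception], ...] = (ValueError,),
) -> str:
    """Summarize the first instruction, covering the None and exception outcomes."""
    try:
        instruction = decode_one(bytecode)
    except expected_errors as exc:
        return f"malformed bytecode: {exc}"
    if instruction is None:
        return "no instruction decoded"
    return str(instruction)


# Stand-in decoder used only to exercise the wrapper; it is not pyevmasm.
def fake_decode(bytecode: bytes) -> Optional[str]:
    if not bytecode:
        return None
    if bytecode[0] == 0x60:
        if len(bytecode) < 2:
            raise ValueError("PUSH1 is missing its operand")
        return "PUSH1 0x%x" % bytecode[1]
    return "INVALID"


summary = describe_first(b"\x60\x40", fake_decode)
```

With the library installed, disassemble_one could be passed as `decode_one`, with the library's own error type supplied through `expected_errors`.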
Install with Tessl CLI
npx tessl i tessl/pypi-pyevmasm