# Disassembly Operations

Convert EVM bytecode to assembly language with support for various input formats including bytes, hex strings, and iterators. Provides detailed instruction analysis and supports all Ethereum hard forks for accurate opcode interpretation.

## Capabilities

### Single Instruction Disassembly

Disassemble a single EVM instruction from bytecode, extracting complete instruction metadata including operands.

```python { .api }
def disassemble_one(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> Instruction:
    """
    Disassemble a single instruction from bytecode.

    Parameters:
    - bytecode (str | bytes | bytearray | iterator): The bytecode stream
    - pc (int, optional): Program counter of the instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    Instruction: An Instruction object with complete metadata, or None if no instruction found

    Raises:
    ParseError: If bytecode is malformed or insufficient for operand parsing
    """
```

**Usage Examples:**

```python
from pyevmasm import disassemble_one

# Disassemble from bytes
instruction = disassemble_one(b'\x60\x40')  # PUSH1 0x40
print(f"Name: {instruction.name}")  # PUSH1
print(f"Operand: 0x{instruction.operand:x}")  # 0x40
print(f"Gas: {instruction.fee}")  # 3

# From a hex string: decode to bytes first, because str input is read as
# Latin-1 binary data rather than as hex text
instruction = disassemble_one(bytes.fromhex("6040"))
print(f"Same instruction: {instruction.name}")  # PUSH1

# 0xfe is the designated INVALID opcode (0xff is SELFDESTRUCT)
invalid = disassemble_one(b'\xfe')
print(f"Invalid opcode: {invalid.name}")  # INVALID

# With program counter
instruction = disassemble_one(b'\x56', pc=100)  # JUMP
print(f"PC: {instruction.pc}")  # 100
```
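
To make the decoding step concrete, here is a minimal pure-Python sketch of what disassembling one instruction involves: read the opcode byte, look it up, then read any operand bytes. `TOY_TABLE`, `ToyInstruction`, and `toy_disassemble_one` are hypothetical names invented for this illustration and cover only a handful of opcodes; this is not pyevmasm's implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset only: opcode -> (mnemonic, operand size in bytes).
# A real disassembler uses a complete, fork-specific table.
TOY_TABLE = {
    0x01: ("ADD", 0),
    0x52: ("MSTORE", 0),
    0x56: ("JUMP", 0),
    0x60: ("PUSH1", 1),
    0x61: ("PUSH2", 2),
}


@dataclass
class ToyInstruction:
    pc: int
    name: str
    operand: Optional[int] = None


def toy_disassemble_one(bytecode: bytes, pc: int = 0) -> Optional[ToyInstruction]:
    """Decode the first instruction of `bytecode`, or return None."""
    if not bytecode:
        return None  # empty input: nothing to decode
    # Opcodes missing from the table are reported as INVALID.
    name, operand_size = TOY_TABLE.get(bytecode[0], ("INVALID", 0))
    if operand_size == 0:
        return ToyInstruction(pc, name)
    operand_bytes = bytecode[1:1 + operand_size]
    if len(operand_bytes) < operand_size:
        return None  # truncated operand
    # EVM immediates are big-endian.
    return ToyInstruction(pc, name, int.from_bytes(operand_bytes, "big"))
```

In this sketch the program counter is simply recorded on the result; it does not affect how the bytes are decoded.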

### Multiple Instruction Disassembly

Disassemble all instructions in a bytecode sequence, returning a generator for memory-efficient processing of large bytecode.

```python { .api }
def disassemble_all(bytecode, pc: int = 0, fork: str = DEFAULT_FORK):
    """
    Disassemble all instructions in bytecode.

    Parameters:
    - bytecode (str | bytes | bytearray | iterator): An EVM bytecode (binary)
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    Generator[Instruction]: A generator of Instruction objects

    Note:
    Generator stops when no more valid instructions can be parsed
    """
```

**Usage Examples:**

```python
from pyevmasm import disassemble_all

# Disassemble complete bytecode
bytecode = b'\x60\x80\x60\x40\x52\x60\x04\x36\x10'
instructions = list(disassemble_all(bytecode))

for instr in instructions:
    print(f"{instr.pc:08x}: {instr}")
# Output:
# 00000000: PUSH1 0x80
# 00000002: PUSH1 0x40
# 00000004: MSTORE
# 00000005: PUSH1 0x4
# 00000007: CALLDATASIZE
# 00000008: LT

# Memory-efficient processing of large bytecode
large_bytecode = bytecode  # placeholder: substitute real contract bytecode
for instruction in disassemble_all(large_bytecode):
    if instruction.is_branch:
        print(f"Branch at 0x{instruction.pc:x}: {instruction}")
```
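
The generator behaviour can be pictured as a linear sweep that advances by one opcode byte plus the operand size, which is why the program counters in the output above jump from 0 to 2 to 4. The sketch below is a simplified stand-in, not pyevmasm's code: `toy_disassemble_all` is a hypothetical name, and it only distinguishes `PUSH1`..`PUSH32` (opcodes `0x60`..`0x7f`) from everything else.

```python
from typing import Iterator, Tuple


def toy_disassemble_all(bytecode: bytes, pc: int = 0) -> Iterator[Tuple[int, str]]:
    """Lazily yield (pc, text) pairs, one instruction at a time."""
    offset = 0
    while offset < len(bytecode):
        opcode = bytecode[offset]
        if 0x60 <= opcode <= 0x7F:  # PUSH1..PUSH32 carry 1..32 operand bytes
            size = opcode - 0x5F
            operand = bytecode[offset + 1:offset + 1 + size]
            if len(operand) < size:
                return  # truncated operand: stop the sweep
            text = f"PUSH{size} 0x{int.from_bytes(operand, 'big'):x}"
        else:
            size = 0
            text = f"OP_{opcode:02x}"  # placeholder mnemonic for this sketch
        yield pc + offset, text
        offset += 1 + size
```

Because nothing is decoded until the caller asks for the next item, a consumer that stops early never pays for the rest of the bytecode.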

### Text Disassembly

Disassemble bytecode to human-readable assembly text format, suitable for display and analysis.

```python { .api }
def disassemble(bytecode, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
    """
    Disassemble an EVM bytecode to text representation.

    Parameters:
    - bytecode (str | bytes | bytearray): Binary representation of EVM bytecode
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    str: The text representation of the assembly code (newline-separated)
    """
```

### Hexadecimal Disassembly

Disassemble hexadecimal bytecode strings to assembly text, handling common hex formats automatically.

```python { .api }
def disassemble_hex(bytecode: str, pc: int = 0, fork: str = DEFAULT_FORK) -> str:
    """
    Disassemble hexadecimal EVM bytecode to assembly text.

    Parameters:
    - bytecode (str): Canonical representation of EVM bytecode (hexadecimal, with or without 0x prefix)
    - pc (int, optional): Program counter of the first instruction. Default: 0
    - fork (str, optional): Fork name. Default: DEFAULT_FORK ("istanbul")

    Returns:
    str: The text representation of the assembly code (newline-separated)
    """
```

**Usage Examples:**

```python
from pyevmasm import disassemble, disassemble_hex

# From binary bytecode
binary_code = b'\x60\x80\x60\x40\x52'
assembly = disassemble(binary_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE

# From hex string (most common usage)
hex_code = "0x608060405260043610"
assembly = disassemble_hex(hex_code)
print(assembly)
# PUSH1 0x80
# PUSH1 0x40
# MSTORE
# PUSH1 0x4
# CALLDATASIZE
# LT

# Hex without 0x prefix also works
assembly = disassemble_hex("608060405260043610")
# Same result
```

## Input Format Support

PyEVMAsm's disassembly functions accept multiple input formats:

### Binary Formats

- **bytes**: Native Python bytes objects
- **bytearray**: Mutable byte arrays
- **str (binary)**: Latin-1 encoded strings (legacy support)
- **iterator**: Any iterator yielding integer byte values
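
One way to picture how these four input types can be treated uniformly is to normalize each of them into an iterator of integer byte values. The helper below is a sketch under that assumption; `to_byte_iterator` is a hypothetical name and is not part of pyevmasm's API.

```python
from typing import Iterable, Iterator, Union

BytecodeInput = Union[str, bytes, bytearray, Iterable[int]]


def to_byte_iterator(bytecode: BytecodeInput) -> Iterator[int]:
    """Normalize any supported binary input into an iterator of ints."""
    if isinstance(bytecode, str):
        # Legacy binary strings: each character maps to one byte (Latin-1).
        bytecode = bytecode.encode("latin-1")
    # bytes, bytearray, lists and iterators of ints all iterate as ints.
    return iter(bytecode)
```

Note that under this reading a `str` holding hex text such as `"6040"` is four bytes of ASCII characters, not the two bytes `0x60 0x40`; hex text needs to be decoded first.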

### Hexadecimal Formats

- **0x-prefixed**: "0x608060405260043610"
- **Plain hex**: "608060405260043610"
- **Mixed case**: Case-insensitive hex parsing
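
The three hex variants reduce to the same bytes once the optional prefix is removed. The following standard-library sketch shows that normalization; `hex_to_bytes` is a hypothetical helper written for this illustration, not a pyevmasm function.

```python
def hex_to_bytes(text: str) -> bytes:
    """Decode hex bytecode text, with or without a leading 0x."""
    text = text.strip()
    if text.startswith("0x"):
        text = text[2:]
    # bytes.fromhex accepts upper- and lower-case digits alike.
    return bytes.fromhex(text)
```

The resulting `bytes` object can then be handed to any of the binary-input functions.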

### Special Format Handling

The disassembly functions include intelligent format detection:

```python
from pyevmasm import disassemble_hex

# Automatically handles 0x prefix
code1 = disassemble_hex("0x6080604052")
code2 = disassemble_hex("6080604052")
assert code1 == code2

# Binary Ninja format (EVM prefix): strip the prefix explicitly so the
# call does not depend on automatic prefix detection
binja_dump = "EVM6080604052"
code3 = disassemble_hex(binja_dump[3:] if binja_dump.startswith("EVM") else binja_dump)

# Plain hex without 0x is decoded the same way
plain_hex = "6080604052"
assert disassemble_hex(plain_hex) == code1
```

## Fork-Specific Disassembly

Different Ethereum forks have different instruction sets. PyEVMAsm provides accurate disassembly for each fork:

```python
from pyevmasm import disassemble_one

# Byzantium introduced RETURNDATASIZE (0x3d)
instr = disassemble_one(b'\x3d', fork="byzantium")
print(instr.name)  # "RETURNDATASIZE"

# Same opcode in frontier fork
instr = disassemble_one(b'\x3d', fork="frontier")
print(instr.name)  # "INVALID"

# Constantinople introduced shift operations
instr = disassemble_one(b'\x1b', fork="constantinople")
print(instr.name)  # "SHL"

instr = disassemble_one(b'\x1b', fork="byzantium")
print(instr.name)  # "INVALID"
```
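
Fork-dependent decoding can be modelled as a stack of opcode tables, where each fork starts from its predecessor's table and adds the opcodes it introduced. The sketch below illustrates that idea with a small subset of opcodes and skips the forks between Frontier and Byzantium; `TOY_TABLES` and `toy_mnemonic` are hypothetical names, and this is not how pyevmasm's tables are actually defined.

```python
from typing import Dict

# Illustrative subsets only; real tables cover every opcode of each fork.
FRONTIER: Dict[int, str] = {0x01: "ADD", 0x56: "JUMP", 0x60: "PUSH1"}
BYZANTIUM_ADDITIONS: Dict[int, str] = {
    0x3D: "RETURNDATASIZE",
    0x3E: "RETURNDATACOPY",
    0xFA: "STATICCALL",
    0xFD: "REVERT",
}
CONSTANTINOPLE_ADDITIONS: Dict[int, str] = {
    0x1B: "SHL",
    0x1C: "SHR",
    0x1D: "SAR",
    0x3F: "EXTCODEHASH",
    0xF5: "CREATE2",
}

# Each fork inherits the previous table and layers its additions on top.
TOY_TABLES: Dict[str, Dict[int, str]] = {"frontier": dict(FRONTIER)}
TOY_TABLES["byzantium"] = {**TOY_TABLES["frontier"], **BYZANTIUM_ADDITIONS}
TOY_TABLES["constantinople"] = {**TOY_TABLES["byzantium"], **CONSTANTINOPLE_ADDITIONS}


def toy_mnemonic(opcode: int, fork: str) -> str:
    """Return the mnemonic for `opcode` under `fork`, or INVALID."""
    return TOY_TABLES[fork].get(opcode, "INVALID")
```

Looking up an opcode in an earlier fork's table falls through to `INVALID`, which mirrors the behaviour shown for `0x3d` and `0x1b` above.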

## Error Handling

Disassembly functions handle various error conditions gracefully:

- **Insufficient data**: Returns None (single instruction) or stops the generator (multiple instructions) when there are not enough bytes for an operand
- **Invalid opcodes**: Creates INVALID instruction objects for unknown opcodes
- **Empty input**: Returns None (single) or empty generator (multiple)
- **Malformed hex**: Raises an exception when the hex string cannot be decoded
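
For the malformed-hex case, it can help to validate input before disassembling. The sketch below uses the standard library's `binascii.unhexlify`, which rejects non-hex characters and odd-length strings; whether pyevmasm decodes hex through this exact call is an assumption here, and `decode_hex_or_none` is a hypothetical helper.

```python
import binascii
from typing import Optional


def decode_hex_or_none(text: str) -> Optional[bytes]:
    """Return the decoded bytes, or None if `text` is not valid hex."""
    try:
        return binascii.unhexlify(text)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
```

Returning None keeps the calling code in the same style as the None checks used for insufficient data and empty input.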