# pefile

A comprehensive Python module for parsing and working with Portable Executable (PE) files. Pefile provides access to virtually all information contained in PE file headers, sections, and data directories, enabling detailed analysis and modification of Windows executable files, DLLs, and drivers.

## Package Information

- **Package Name**: pefile
- **Language**: Python
- **Installation**: `pip install pefile`

## Core Imports

```python
import pefile
```

For utilities and packer detection:

```python
import peutils
```

For ordinal lookups:

```python
import ordlookup
```

## Basic Usage

```python
import pefile

# Load PE file from path
pe = pefile.PE('path/to/executable.exe')

# Or load from raw data
with open('path/to/executable.exe', 'rb') as f:
    pe = pefile.PE(data=f.read())

# Access basic information
print(f"Machine type: {pe.FILE_HEADER.Machine}")
print(f"Number of sections: {pe.FILE_HEADER.NumberOfSections}")
print(f"Is DLL: {pe.is_dll()}")
print(f"Is executable: {pe.is_exe()}")

# Access sections
for section in pe.sections:
    # Section names are fixed-width and NUL-padded, so strip the padding bytes
    name = section.Name.rstrip(b'\x00').decode('utf-8', errors='replace')
    print(f"Section: {name}")
    print(f"Virtual Address: 0x{section.VirtualAddress:08x}")
    print(f"Size: {section.SizeOfRawData}")

# Access imports (if present)
if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(f"DLL: {entry.dll.decode('utf-8')}")
        for imp in entry.imports:
            if imp.name:
                print(f"  Function: {imp.name.decode('utf-8')}")

# Clean up resources
pe.close()
```

## Architecture

The pefile module is built around a hierarchical structure that mirrors the PE file format:

- **PE class**: Main parser that loads and provides access to all PE components
- **Structure classes**: Represent binary data structures (headers, directories, sections)
- **Data container classes**: Organize parsed information (imports, exports, resources, debug info)
- **Utility functions**: Support address translation, data access, and format validation

The module tolerates many corrupted and malformed PE files, recording parse anomalies as warnings rather than failing outright, which makes it suitable for malware analysis and security research.
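
The hierarchy above can be walked generically. The following is a minimal sketch, not part of pefile: `summarize_pe` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in so the snippet runs without a real file. It relies only on attributes shown in this document plus `get_warnings()`, which pefile exposes for the anomalies it records while parsing.

```python
from types import SimpleNamespace


def summarize_pe(pe):
    """Collect a few fields from each level of a parsed PE-like object."""
    return {
        "machine": hex(pe.FILE_HEADER.Machine),
        "entry_point_rva": hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint),
        # Section names are 8 bytes wide and NUL-padded
        "sections": [
            s.Name.rstrip(b"\x00").decode("ascii", "replace") for s in pe.sections
        ],
        # Directory attributes only exist when the directory is present and parsed
        "has_imports": hasattr(pe, "DIRECTORY_ENTRY_IMPORT"),
        "has_exports": hasattr(pe, "DIRECTORY_ENTRY_EXPORT"),
        "warnings": list(pe.get_warnings()),
    }


# Stand-in for a pefile.PE instance (illustrative values only)
fake_pe = SimpleNamespace(
    FILE_HEADER=SimpleNamespace(Machine=0x8664),
    OPTIONAL_HEADER=SimpleNamespace(AddressOfEntryPoint=0x1000),
    sections=[SimpleNamespace(Name=b".text\x00\x00\x00")],
    DIRECTORY_ENTRY_IMPORT=[],
    get_warnings=lambda: ["hypothetical warning"],
)
summary = summarize_pe(fake_pe)
print(summary)
```

With a real file, the same function should accept the object returned by `pefile.PE(path)`.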

## Capabilities

### PE File Parsing

Core functionality for loading, parsing, and accessing PE file structures including headers, sections, and data directories.

```python { .api }
class PE:
    def __init__(self, name=None, data=None, fast_load=None, max_symbol_exports=8192, max_repeated_symbol=120): ...
    def __enter__(self): ...
    def __exit__(self, type, value, traceback): ...
    def close(self): ...
    def write(self, filename=None): ...
    def full_load(self): ...
```

[PE File Parsing](./pe-parsing.md)
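
Before any structure becomes available, a parser has to locate the PE headers. The sketch below uses only the standard library to show the first checks the format requires: the `MZ` magic, the `e_lfanew` pointer at offset 0x3C, and the `PE\0\0` signature. It illustrates the file format rather than pefile's implementation, and `read_pe_basics` is a hypothetical name.

```python
import struct


def read_pe_basics(data: bytes):
    """Return (Machine, NumberOfSections) after validating the two magic values."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ magic in DOS header")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF file header starts right after the 4-byte signature
    machine, number_of_sections = struct.unpack_from("<HH", data, e_lfanew + 4)
    return machine, number_of_sections


# Hand-built header fragment, just large enough for the fields read above
sample = bytearray(0x80)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x40)
sample[0x40:0x44] = b"PE\x00\x00"
struct.pack_into("<HH", sample, 0x44, 0x14C, 3)

print(read_pe_basics(bytes(sample)))
```

The two returned values correspond to `pe.FILE_HEADER.Machine` and `pe.FILE_HEADER.NumberOfSections` in pefile.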

### Data Access and Modification

Methods for reading and writing data within PE files, including address translation between file offsets and relative virtual addresses (RVAs).

```python { .api }
def get_data(self, rva=0, length=None): ...
def get_string_at_rva(self, rva, max_length=1048576): ...
def get_string_u_at_rva(self, rva, max_length=65536, encoding=None): ...
def get_dword_at_rva(self, rva): ...
def get_word_at_rva(self, rva): ...
def get_qword_at_rva(self, rva): ...
def set_dword_at_rva(self, rva, dword): ...
def set_word_at_rva(self, rva, word): ...
def set_qword_at_rva(self, rva, qword): ...
def set_bytes_at_rva(self, rva, data): ...
def get_offset_from_rva(self, rva): ...
def get_rva_from_offset(self, offset): ...
def get_physical_by_rva(self, rva): ...
```

[Data Access](./data-access.md)
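
Address translation is easiest to see with the section table in hand. The following simplified sketch shows the arithmetic behind RVA and file-offset conversion; it ignores alignment rounding and virtual-only section tails, so it is not a substitute for `get_offset_from_rva` or `get_rva_from_offset`. `SectionRange`, `rva_to_offset`, and `offset_to_rva` are hypothetical names, and the field names mirror the section header fields.

```python
from dataclasses import dataclass


@dataclass
class SectionRange:
    """Hypothetical stand-in carrying the three section-header fields used here."""
    VirtualAddress: int
    PointerToRawData: int
    SizeOfRawData: int


def rva_to_offset(sections, rva):
    """Map an RVA to a file offset, or return None if no file bytes back it."""
    for s in sections:
        if s.VirtualAddress <= rva < s.VirtualAddress + s.SizeOfRawData:
            return rva - s.VirtualAddress + s.PointerToRawData
    # RVAs below the first section fall in the headers, which map 1:1
    first = min((s.VirtualAddress for s in sections), default=0)
    return rva if rva < first else None


def offset_to_rva(sections, offset):
    """Map a file offset to an RVA, or return None if it lies outside all sections."""
    for s in sections:
        if s.PointerToRawData <= offset < s.PointerToRawData + s.SizeOfRawData:
            return offset - s.PointerToRawData + s.VirtualAddress
    first = min((s.PointerToRawData for s in sections), default=0)
    return offset if offset < first else None


sections = [
    SectionRange(VirtualAddress=0x1000, PointerToRawData=0x400, SizeOfRawData=0x600),
    SectionRange(VirtualAddress=0x2000, PointerToRawData=0xA00, SizeOfRawData=0x200),
]
print(hex(rva_to_offset(sections, 0x1010)))
print(hex(offset_to_rva(sections, 0x410)))
```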

### Import and Export Analysis

Functionality for analyzing import and export tables, including generation of import/export hashes for malware analysis.

```python { .api }
def get_imphash(self): ...
def get_exphash(self): ...
def parse_import_directory(self, rva, size, dllnames_only=False): ...
def parse_export_directory(self, rva, size, forwarded_only=False): ...
```

[Import Export Analysis](./import-export.md)
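
To make the idea of an import hash concrete, here is a minimal sketch of the commonly described imphash scheme: lowercase each `library.function` pair, drop a `.dll`, `.ocx`, or `.sys` extension from the library name, join the pairs with commas in import order, and take the MD5. `imphash_sketch` is a hypothetical helper that handles named imports only; `get_imphash` also deals with imports by ordinal, so results can differ for such files.

```python
import hashlib


def imphash_sketch(imports):
    """Hash an ordered list of (dll_name, function_name) string pairs."""
    parts = []
    for dll, func in imports:
        lib = dll.lower()
        head, dot, ext = lib.rpartition(".")
        if dot and ext in ("dll", "ocx", "sys"):
            lib = head  # strip the known extension
        parts.append(f"{lib}.{func.lower()}")
    return hashlib.md5(",".join(parts).encode()).hexdigest()


imports = [("KERNEL32.dll", "CreateFileA"), ("USER32.DLL", "MessageBoxA")]
digest = imphash_sketch(imports)
print(digest)
```

Because the pairs are joined in table order, two binaries importing the same functions in a different order produce different hashes.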

### Section Operations

Methods for working with PE sections, including accessing section data and metadata.

```python { .api }
def get_section_by_rva(self, rva): ...
def get_section_by_offset(self, offset): ...
def merge_modified_section_data(self): ...
```

[Section Operations](./sections.md)
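
Section metadata is often summarized as a small table. The sketch below assumes section objects with the header fields `Name`, `VirtualAddress`, `SizeOfRawData`, and `Characteristics`; `permissions` and `describe_sections` are hypothetical helpers, and the stand-in objects exist only so the snippet runs without a file. The three flag values come from the PE specification.

```python
from types import SimpleNamespace

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def permissions(characteristics):
    """Render the memory-protection flags as an 'rwx'-style string."""
    return (
        ("r" if characteristics & IMAGE_SCN_MEM_READ else "-")
        + ("w" if characteristics & IMAGE_SCN_MEM_WRITE else "-")
        + ("x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-")
    )


def describe_sections(sections):
    """Return (name, rva, raw_size, permissions) for each section."""
    rows = []
    for s in sections:
        name = s.Name.rstrip(b"\x00").decode("ascii", "replace")
        rows.append((name, s.VirtualAddress, s.SizeOfRawData,
                     permissions(s.Characteristics)))
    return rows


# Stand-ins for entries of pe.sections (illustrative values only)
fake_sections = [
    SimpleNamespace(Name=b".text\x00\x00\x00", VirtualAddress=0x1000,
                    SizeOfRawData=0x600, Characteristics=0x60000020),
    SimpleNamespace(Name=b".data\x00\x00\x00", VirtualAddress=0x2000,
                    SizeOfRawData=0x200, Characteristics=0xC0000040),
]
rows = describe_sections(fake_sections)
print(rows)
```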

### Memory Layout and Relocations

Functions for memory mapping PE files and handling base relocations for different load addresses.

```python { .api }
def get_memory_mapped_image(self, max_virtual_address=268435456, ImageBase=None): ...
def relocate_image(self, new_ImageBase): ...
def has_relocs(self): ...
def has_dynamic_relocs(self): ...
def get_overlay(self): ...
def get_overlay_data_start_offset(self): ...
def trim(self): ...
```

[Memory Layout](./memory.md)
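
An overlay is data appended to the file beyond what the section table accounts for. As a rough sketch of the idea (an assumption-laden simplification, since `get_overlay_data_start_offset` may take more of the file layout into account), the overlay starts where the furthest-reaching section's raw data ends. `overlay_start` is a hypothetical helper taking `(PointerToRawData, SizeOfRawData)` pairs.

```python
def overlay_start(section_ranges, file_size):
    """Return the offset where appended data begins, or None if there is none."""
    end_of_sections = max(
        (pointer + size for pointer, size in section_ranges), default=0
    )
    return end_of_sections if end_of_sections < file_size else None


ranges = [(0x400, 0x600), (0xA00, 0x200)]   # raw data ends at 0xC00
print(overlay_start(ranges, file_size=0xC10))
print(overlay_start(ranges, file_size=0xC00))
```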

### Resource Analysis

Access to embedded resources including strings, icons, version information, and other resource types.

```python { .api }
def get_resources_strings(self): ...
def parse_resources_directory(self, rva, size=0, base_rva=None, level=0, dirs=None): ...
def parse_version_information(self, version_struct): ...
```

[Resource Analysis](./resources.md)
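
PE resources form a tree that is conventionally three levels deep: type, then name or ID, then language. The sketch below walks such a tree recursively. It assumes each entry exposes `id` and `name`, that directory nodes carry a `directory.entries` list, and that leaves carry `data.struct.OffsetToData` and `data.struct.Size`; `iter_resource_leaves` is a hypothetical helper and the tree here is a hand-built stand-in.

```python
from types import SimpleNamespace as NS


def iter_resource_leaves(entries, path=()):
    """Yield (path, rva, size) for every leaf below the given entries."""
    for entry in entries:
        label = str(entry.name) if entry.name is not None else entry.id
        if hasattr(entry, "directory"):
            # Interior node: descend one level
            yield from iter_resource_leaves(entry.directory.entries, path + (label,))
        elif hasattr(entry, "data"):
            # Leaf node: points at the resource bytes
            yield (path + (label,),
                   entry.data.struct.OffsetToData,
                   entry.data.struct.Size)


# Stand-in tree: type 16 (RT_VERSION) -> id 1 -> language 1033
leaf = NS(id=1033, name=None,
          data=NS(struct=NS(OffsetToData=0x3058, Size=0x2C4)))
tree = [NS(id=16, name=None,
           directory=NS(entries=[NS(id=1, name=None,
                                    directory=NS(entries=[leaf]))]))]
leaves = list(iter_resource_leaves(tree))
print(leaves)
```

With a parsed file, the starting point would be `pe.DIRECTORY_ENTRY_RESOURCE.entries`, when that attribute is present.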

### Debug Information

Access to debug directories and related debugging information embedded in PE files.

```python { .api }
def parse_debug_directory(self, rva, size): ...
```

[Debug Information](./debug.md)
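
A common use of the debug directory is recovering the PDB path a binary was built with. The sketch below assumes the parsed entries are available as `pe.DIRECTORY_ENTRY_DEBUG`, that each one has `struct.Type`, and that CodeView entries expose the path as a NUL-terminated `entry.PdbFileName` bytes field. `pdb_paths` is a hypothetical helper and the object it is run against is a stand-in.

```python
from types import SimpleNamespace as NS

IMAGE_DEBUG_TYPE_CODEVIEW = 2  # value from the PE specification


def pdb_paths(pe):
    """Return decoded PDB paths from CodeView debug entries, if any."""
    paths = []
    for dbg in getattr(pe, "DIRECTORY_ENTRY_DEBUG", []):
        if dbg.struct.Type != IMAGE_DEBUG_TYPE_CODEVIEW or dbg.entry is None:
            continue
        raw = getattr(dbg.entry, "PdbFileName", None)
        if raw:
            paths.append(raw.rstrip(b"\x00").decode("utf-8", "replace"))
    return paths


# Stand-in for a parsed PE with one CodeView entry and one unrelated entry
fake_pe = NS(DIRECTORY_ENTRY_DEBUG=[
    NS(struct=NS(Type=2), entry=NS(PdbFileName=b"C:\\build\\app.pdb\x00")),
    NS(struct=NS(Type=4), entry=None),
])
found = pdb_paths(fake_pe)
print(found)
```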

### Hash and Verification

Checksum verification and various hash calculation methods for file integrity and identification.

```python { .api }
def verify_checksum(self): ...
def generate_checksum(self): ...
def get_rich_header_hash(self, algorithm="md5"): ...
def is_exe(self): ...
def is_dll(self): ...
def is_driver(self): ...
```

[Hash Verification](./hashing.md)
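
When triaging files it can help to distinguish an unset checksum from a wrong one, since many ordinary executables ship with the field left at zero. The sketch below compares the stored `OPTIONAL_HEADER.CheckSum` with the value from `generate_checksum()`. `checksum_report` is a hypothetical helper, and the objects it is run against are stand-ins with made-up values.

```python
from types import SimpleNamespace as NS


def checksum_report(pe):
    """Return (stored, computed, status) for a parsed PE-like object."""
    stored = pe.OPTIONAL_HEADER.CheckSum
    computed = pe.generate_checksum()
    if stored == 0:
        status = "unset"
    elif stored == computed:
        status = "valid"
    else:
        status = "mismatch"
    return stored, computed, status


# Stand-ins for pefile.PE instances (illustrative values only)
good = NS(OPTIONAL_HEADER=NS(CheckSum=0x1234), generate_checksum=lambda: 0x1234)
blank = NS(OPTIONAL_HEADER=NS(CheckSum=0), generate_checksum=lambda: 0x1234)
print(checksum_report(good))
print(checksum_report(blank))
```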
192
193
### Packer Detection (peutils)
194
195
Utilities for detecting packed executables and identifying packers/compilers using signature databases.
196
197
```python { .api }
198
class SignatureDatabase:
199
def __init__(self, filename=None, data=None): ...
200
def match(self, pe, ep_only=True, section_start_only=False): ...
201
def match_all(self, pe, ep_only=True, section_start_only=False): ...
202
def load(self, filename=None, data=None): ...
203
def generate_ep_signature(self, pe, name, sig_length=512): ...
204
205
def is_probably_packed(pe): ...
206
def is_suspicious(pe): ...
207
def is_valid(pe): ...
208
```
209
210
[Packer Detection](./packer-detection.md)
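
Besides signatures, packed or encrypted content is usually spotted through byte entropy, which approaches 8 bits per byte for compressed data. The following is a minimal sketch of that measurement; `shannon_entropy` and `high_entropy_ratio` are hypothetical helpers, and the 7.0 threshold is an illustrative assumption rather than the value peutils uses.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def high_entropy_ratio(chunks, threshold=7.0):
    """Fraction of all bytes that live in chunks whose entropy exceeds threshold."""
    total = sum(len(c) for c in chunks)
    if total == 0:
        return 0.0
    high = sum(len(c) for c in chunks if shannon_entropy(c) > threshold)
    return high / total


uniform = bytes(range(256)) * 4   # every byte value equally likely
padding = b"\x00" * 1024          # a single repeated value
print(shannon_entropy(uniform), shannon_entropy(padding))
print(high_entropy_ratio([uniform, padding]))
```

On a parsed file, the per-section figure is available directly from `section.get_entropy()`.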

### Ordinal Lookups (ordlookup)

Database of ordinal to symbol name mappings for common Windows DLLs.

```python { .api }
def ordLookup(libname, ord_val, make_name=False): ...
def formatOrdString(ord_val): ...
```

[Ordinal Lookups](./ordinal-lookups.md)
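
The idea behind an ordinal database can be sketched with a small table. This is a stand-in with a similar call shape to the signatures above, not the ordlookup implementation: `ORDINALS`, `format_ord`, and `lookup` are hypothetical names, the table holds only three entries for illustration, and the real module's fallback behavior may differ in detail.

```python
# Tiny illustrative table; a real database covers far more entries
ORDINALS = {
    b"ws2_32.dll": {1: b"accept", 2: b"bind", 3: b"closesocket"},
}


def format_ord(ord_val):
    """Synthesize a placeholder name such as b'ord42'."""
    return f"ord{ord_val}".encode()


def lookup(libname, ord_val, make_name=False):
    """Resolve an ordinal to a symbol name, optionally synthesizing one."""
    names = ORDINALS.get(libname.lower())
    if names is not None and ord_val in names:
        return names[ord_val]
    return format_ord(ord_val) if make_name else None


print(lookup(b"WS2_32.dll", 2))
print(lookup(b"unknown.dll", 7, make_name=True))
```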

## Types

```python { .api }
class PE:
    """Main PE file parser class."""
    DOS_HEADER: Structure
    NT_HEADERS: Structure
    FILE_HEADER: Structure
    OPTIONAL_HEADER: Structure
    sections: list

class Structure:
    """Base class for binary data structures."""
    def __init__(self, format, name=None, file_offset=None): ...
    def get_field_absolute_offset(self, field_name): ...
    def get_field_relative_offset(self, field_name): ...
    def sizeof(self): ...
    def dump(self, indentation=0): ...
    def dump_dict(self): ...

class SectionStructure(Structure):
    """Section structure with data access methods."""
    def get_data(self, start=None, length=None, ignore_padding=False): ...
    def get_entropy(self): ...
    def get_hash_md5(self): ...
    def get_hash_sha1(self): ...
    def get_hash_sha256(self): ...
    def get_hash_sha512(self): ...
    def contains_rva(self, rva): ...
    def contains_offset(self, offset): ...

class ImportDescData:
    """Import descriptor data container."""
    struct: Structure
    imports: list
    dll: bytes

class ImportData:
    """Individual import data container."""
    struct: Structure
    name: bytes
    import_by_ordinal: bool
    ordinal: int
    bound: int
    address: int
    hint: int

class ExportDirData:
    """Export directory data container."""
    struct: Structure
    symbols: list

class ExportData:
    """Individual export data container."""
    struct: Structure
    name: bytes
    ordinal: int
    address: int
    forwarder: bytes

class ResourceDirData:
    """Resource directory data container."""
    struct: Structure
    entries: list

class DebugData:
    """Debug directory data container."""
    struct: Structure
    entry: Structure

class BaseRelocationData:
    """Base relocation data container."""
    struct: Structure
    entries: list

class RelocationData:
    """Individual relocation data container."""
    struct: Structure
    type: int
    base_rva: int
    rva: int

class TlsData:
    """TLS directory data container."""
    struct: Structure

class BoundImportDescData:
    """Bound import descriptor data container."""
    struct: Structure
    entries: list

class LoadConfigData:
    """Load config data container."""
    struct: Structure

class SignatureDatabase:
    """PEiD signature database for packer detection."""
    signature_tree_eponly_true: dict
    signature_tree_eponly_false: dict
    signature_tree_section_start: dict
    signature_count_eponly_true: int
    signature_count_eponly_false: int
    signature_count_section_start: int
    max_depth: int

class PEFormatError(Exception):
    """Exception raised for PE format errors."""
    pass
```
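
The `type`, `base_rva`, and `rva` fields of `RelocationData` reflect how the PE format packs base relocations: each entry in a block is a 16-bit word whose high 4 bits give the relocation type and whose low 12 bits give an offset from the block's base RVA. The sketch below decodes such words with the standard library; `decode_reloc_block` is a hypothetical helper illustrating the format, not pefile's parser.

```python
import struct


def decode_reloc_block(base_rva, raw_entries):
    """Split packed little-endian 16-bit entries into (type, rva) pairs."""
    decoded = []
    for (word,) in struct.iter_unpack("<H", raw_entries):
        reloc_type = word >> 12     # high 4 bits: relocation type
        offset = word & 0x0FFF      # low 12 bits: offset within the 4 KiB page
        decoded.append((reloc_type, base_rva + offset))
    return decoded


# Three entries: HIGHLOW (3) at +0x010, DIR64 (10) at +0x008, ABSOLUTE (0) padding
block = struct.pack("<3H", 0x3010, 0xA008, 0x0000)
decoded = decode_reloc_block(0x1000, block)
print(decoded)
```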