# DEX Analysis

Parsing and bytecode analysis for Dalvik Executable (DEX) files, providing access to classes, methods, instructions, and control flow structures. A DEX file holds an Android application's compiled code as Dalvik bytecode.

## Capabilities

### DEX Class

The main class for DEX file analysis and manipulation. It provides access to all DEX components, including classes, methods, fields, and bytecode instructions.

```python { .api }
class DEX:
    def __init__(self, buff: bytes, decompiler=None, config=None, using_api: int = None):
        """
        Initialize DEX analysis.

        Parameters:
        - buff: Raw DEX file bytes
        - decompiler: Associated decompiler object (optional)
        - config: Configuration options (optional)
        - using_api: API level to use for analysis (optional)
        """

    def get_classes_names(self) -> list[str]:
        """Return list of all class names in the DEX file."""

    def get_classes(self) -> list:
        """Return list of all ClassDefItem objects."""

    def get_class(self, name: str):
        """
        Return specific class by name.

        Parameters:
        - name: Full class name (e.g., 'Lcom/example/MyClass;')

        Returns:
        ClassDefItem object or None if not found
        """

    def get_methods(self) -> list:
        """Return list of all EncodedMethod objects."""

    def get_fields(self) -> list:
        """Return list of all EncodedField objects."""

    def get_strings(self) -> list[str]:
        """Return list of all strings in the DEX string pool."""
```
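
Before handing raw bytes to `DEX`, it can be useful to sanity-check the fixed-size file header. The sketch below is a standalone illustration using only the standard library. The helper `parse_dex_header` is hypothetical and not part of androguard; the field layout follows the published Dalvik Executable format.

```python
import struct
import zlib

# 112-byte DEX header: magic (8 bytes), Adler-32 checksum, SHA-1 signature
# (20 bytes), then 20 little-endian uint32 fields.
HEADER_STRUCT = struct.Struct("<8sI20s20I")
FIELD_NAMES = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


def parse_dex_header(buff: bytes) -> dict:
    """Parse the DEX header (hypothetical helper, not an androguard API)."""
    if len(buff) < HEADER_STRUCT.size:
        raise ValueError("buffer too small for a DEX header")
    magic, checksum, _signature, *fields = HEADER_STRUCT.unpack_from(buff)
    if not (magic.startswith(b"dex\n") and magic.endswith(b"\x00")):
        raise ValueError("not a DEX file")
    header = dict(zip(FIELD_NAMES, fields))
    header["version"] = magic[4:7].decode("ascii")
    # The checksum covers everything after the magic and checksum fields.
    header["checksum_ok"] = checksum == zlib.adler32(buff[12:])
    return header


# Build a synthetic, header-only buffer to exercise the parser.
_body = bytes(20) + struct.pack("<20I", 112, 112, 0x12345678, *([0] * 17))
_sample = b"dex\n035\x00" + struct.pack("<I", zlib.adler32(_body)) + _body
info = parse_dex_header(_sample)
print(info["version"], info["header_size"], info["checksum_ok"])
```

On a real file, the same function could be applied to the bytes read from `classes.dex` before constructing `DEX(buff)`.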

### Class Definition Access

Access class definitions and their metadata within DEX files.

```python { .api }
def get_class_manager(self):
    """Return ClassManager for advanced class resolution."""

def get_format_type(self) -> str:
    """Return DEX format type ('DEX' or 'ODEX')."""

def get_dex_object(self, item):
    """Get object representation of DEX item."""

def save(self, filename: str) -> None:
    """
    Save DEX file to disk.

    Parameters:
    - filename: Output file path
    """

def get_raw(self) -> bytes:
    """Return raw bytes of the entire DEX file."""

def get_length(self) -> int:
    """Return length of DEX file in bytes."""
```
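
The two format types can usually be told apart from the first bytes of the file: plain DEX files begin with `dex\n`, while optimized ODEX files begin with `dey\n`. The following is a minimal sketch of that check. `detect_format_type` is a hypothetical helper, and how `get_format_type()` decides internally is not stated here.

```python
def detect_format_type(buff: bytes) -> str:
    """Classify a buffer by its magic bytes (hypothetical helper)."""
    magic = buff[:4]
    if magic == b"dex\n":
        return "DEX"
    if magic == b"dey\n":
        return "ODEX"
    raise ValueError(f"unrecognized magic: {magic!r}")


print(detect_format_type(b"dex\n035\x00"))
print(detect_format_type(b"dey\n036\x00"))
```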

### Method Analysis

Access method definitions, signatures, and bytecode within classes.

```python { .api }
def get_method(self, class_name: str, method_name: str, descriptor: str = None):
    """
    Get specific method by class and method name.

    Parameters:
    - class_name: Full class name
    - method_name: Method name
    - descriptor: Method descriptor (optional)

    Returns:
    EncodedMethod object or None if not found
    """

def get_method_signature(self, method, predef_sign: str = None) -> str:
    """
    Generate method signature string.

    Parameters:
    - method: EncodedMethod object
    - predef_sign: Predefined signature format

    Returns:
    Method signature string
    """

def get_method_bytecode(self, method):
    """
    Get bytecode instructions for method.

    Parameters:
    - method: EncodedMethod object

    Returns:
    DalvikCode object with instructions
    """
```
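
A method descriptor such as `(Landroid/os/Bundle;)V` packs the parameter types and the return type into one string. The sketch below shows one way to split it using plain string handling. `split_method_descriptor` is a hypothetical helper, not an androguard function. It tolerates spaces between parameters as an assumption about how some tools format descriptors.

```python
def split_method_descriptor(descriptor: str) -> tuple[list[str], str]:
    """Split '(params)ret' into (parameter types, return type)."""
    if not descriptor.startswith("(") or ")" not in descriptor:
        raise ValueError(f"not a method descriptor: {descriptor!r}")
    params_part, return_type = descriptor[1:].split(")", 1)
    params_part = params_part.replace(" ", "")
    params = []
    i = 0
    while i < len(params_part):
        start = i
        while params_part[i] == "[":  # skip array dimensions
            i += 1
        if params_part[i] == "L":  # class types run up to the next ';'
            i = params_part.index(";", i)
        i += 1
        params.append(params_part[start:i])
    return params, return_type


print(split_method_descriptor("(Landroid/os/Bundle;)V"))
print(split_method_descriptor("(I[JLjava/lang/String;[[B)Z"))
```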

### Field Analysis

Access field definitions and metadata within classes.

```python { .api }
def get_field_signature(self, field, predef_sign: str = None) -> str:
    """
    Generate field signature string.

    Parameters:
    - field: EncodedField object
    - predef_sign: Predefined signature format

    Returns:
    Field signature string
    """

def get_fields_ids(self) -> list:
    """Return list of all FieldIdItem objects."""

def get_methods_ids(self) -> list:
    """Return list of all MethodIdItem objects."""

def get_types(self) -> list:
    """Return list of all TypeIdItem objects."""

def get_protos(self) -> list:
    """Return list of all ProtoIdItem objects."""
```

### String and Type Analysis

Access string pools and type information within DEX files.

```python { .api }
def get_string(self, idx: int) -> str:
    """
    Get string by index from string pool.

    Parameters:
    - idx: String index

    Returns:
    String value or empty string if invalid index
    """

def get_type(self, idx: int) -> str:
    """
    Get type name by index.

    Parameters:
    - idx: Type index

    Returns:
    Type name string
    """

def get_proto(self, idx: int):
    """
    Get method prototype by index.

    Parameters:
    - idx: Prototype index

    Returns:
    ProtoIdItem object
    """
```
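
In the DEX format, each entry in the string pool is stored with a ULEB128-encoded length prefix followed by MUTF-8 data. As background for how string lookups work at the byte level, here is a small standalone ULEB128 decoder. `read_uleb128` is an illustrative helper written for this sketch, not an androguard function.

```python
def read_uleb128(buff: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one ULEB128 value and return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = buff[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if byte & 0x80 == 0:  # high bit clear marks the last byte
            return result, offset
        shift += 7


print(read_uleb128(b"\x7f"))
print(read_uleb128(b"\x80\x7f"))
```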

## ODEX File Support

Support for Optimized DEX (ODEX) files with additional optimization metadata.

```python { .api }
class ODEX(DEX):
    def __init__(self, buff: bytes, odex: bytes = None):
        """
        Initialize ODEX analysis.

        Parameters:
        - buff: Raw ODEX file bytes
        - odex: Additional ODEX metadata
        """

    def get_optimization_data(self) -> bytes:
        """Return optimization metadata from ODEX file."""

    def get_dependencies(self) -> list[str]:
        """Return list of dependency libraries."""

    def is_valid_odex(self) -> bool:
        """Return True if ODEX file structure is valid."""
```

## Class Definition Objects

Individual class definition and metadata access.

```python { .api }
class ClassDefItem:
    def get_class_idx(self) -> int:
        """Return class type index."""

    def get_access_flags(self) -> int:
        """Return access flags bitmask."""

    def get_superclass_idx(self) -> int:
        """Return superclass type index."""

    def get_interfaces_off(self) -> int:
        """Return interfaces list offset."""

    def get_source_file_idx(self) -> int:
        """Return source filename index."""

    def get_annotations_off(self) -> int:
        """Return annotations offset."""

    def get_class_data_off(self) -> int:
        """Return class data offset."""

    def get_static_values_off(self) -> int:
        """Return static values offset."""

    def get_name(self) -> str:
        """Return class name."""

    def get_superclassname(self) -> str:
        """Return superclass name."""

    def get_interfaces(self) -> list[str]:
        """Return list of implemented interface names."""

    def get_methods(self) -> list:
        """Return list of all methods in class."""

    def get_fields(self) -> list:
        """Return list of all fields in class."""
```
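
The value returned by `get_access_flags()` is a bitmask. As a rough illustration of how such a mask maps to flag names, the sketch below decodes the class-level bits defined by the DEX format. `describe_class_flags` and its table are written for this example only, and the exact output format of androguard's own helpers may differ.

```python
# Class-level access flag bits from the DEX format specification.
CLASS_ACCESS_FLAGS = {
    0x0001: "public",
    0x0002: "private",
    0x0004: "protected",
    0x0008: "static",
    0x0010: "final",
    0x0200: "interface",
    0x0400: "abstract",
    0x1000: "synthetic",
    0x2000: "annotation",
    0x4000: "enum",
}


def describe_class_flags(access_flags: int) -> str:
    """Return a space-separated list of the flag names set in the mask."""
    names = [name for bit, name in CLASS_ACCESS_FLAGS.items() if access_flags & bit]
    return " ".join(names)


print(describe_class_flags(0x0011))
print(describe_class_flags(0x0601))
```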

## Method Objects

Individual method definition and bytecode access.

```python { .api }
class EncodedMethod:
    def get_method_idx(self) -> int:
        """Return method ID index."""

    def get_access_flags(self) -> int:
        """Return method access flags."""

    def get_code_off(self) -> int:
        """Return code offset."""

    def get_name(self) -> str:
        """Return method name."""

    def get_descriptor(self) -> str:
        """Return method descriptor."""

    def get_class_name(self) -> str:
        """Return containing class name."""

    def get_code(self):
        """Return CodeItem with bytecode."""

    def is_external(self) -> bool:
        """Return True if method is external/native."""

    def is_android_api(self) -> bool:
        """Return True if method is part of Android API."""

    def get_length(self) -> int:
        """Return method bytecode length."""

    def pretty_show(self, m_a=None) -> str:
        """Return formatted method representation."""
```

## Field Objects

Individual field definition and metadata access.

```python { .api }
class EncodedField:
    def get_field_idx(self) -> int:
        """Return field ID index."""

    def get_access_flags(self) -> int:
        """Return field access flags."""

    def get_name(self) -> str:
        """Return field name."""

    def get_descriptor(self) -> str:
        """Return field type descriptor."""

    def get_class_name(self) -> str:
        """Return containing class name."""

    def get_init_value(self):
        """Return field initialization value if any."""

    def is_static(self) -> bool:
        """Return True if field is static."""

    def pretty_show(self, f_a=None) -> str:
        """Return formatted field representation."""
```

## Bytecode Instructions

Access to individual Dalvik bytecode instructions and their operands.

```python { .api }
class Instruction:
    def get_name(self) -> str:
        """Return instruction mnemonic."""

    def get_op_value(self) -> int:
        """Return opcode value."""

    def get_literals(self) -> list:
        """Return literal values in instruction."""

    def get_operands(self) -> list:
        """Return operand values."""

    def get_output(self, idx: int = 0) -> str:
        """Return formatted instruction output."""

    def get_length(self) -> int:
        """Return instruction length in 16-bit code units."""

class DalvikCode:
    def __init__(self, class_manager, code_item):
        """
        Initialize Dalvik code representation.

        Parameters:
        - class_manager: ClassManager instance
        - code_item: CodeItem with bytecode
        """

    def get_instructions(self) -> list[Instruction]:
        """Return list of all instructions."""

    def get_instruction(self, idx: int) -> Instruction:
        """Get instruction at specific index."""

    def get_length(self) -> int:
        """Return total code length."""

    def get_registers_size(self) -> int:
        """Return number of registers used."""

    def get_ins_size(self) -> int:
        """Return number of words of incoming arguments."""

    def get_outs_size(self) -> int:
        """Return number of words of outgoing argument space needed for method invocations."""

    def get_tries(self) -> list:
        """Return exception handling try blocks."""
```
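
Because instructions vary in length, an instruction's position in a method is the running sum of the lengths before it, not its list index. The sketch below illustrates this with stand-in objects, assuming `get_length()` reports 16-bit code units as documented above. `StubInstruction` and `with_offsets` are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass
class StubInstruction:
    """Stand-in exposing the two accessors used below (not an androguard class)."""
    name: str
    length: int  # length in 16-bit code units

    def get_name(self) -> str:
        return self.name

    def get_length(self) -> int:
        return self.length


def with_offsets(instructions):
    """Yield (offset, instruction) pairs, with offsets in 16-bit code units."""
    offset = 0
    for instruction in instructions:
        yield offset, instruction
        offset += instruction.get_length()


stubs = [
    StubInstruction("const/4", 1),
    StubInstruction("invoke-virtual", 3),
    StubInstruction("return-void", 1),
]
for offset, ins in with_offsets(stubs):
    print(f"{offset:04x}: {ins.get_name()}")
```

The same generator could be applied to the list returned by `get_instructions()`, provided the length unit matches.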

## Usage Examples

### Basic DEX Analysis

```python
from androguard.core.dex import DEX

# Load DEX file (the context manager closes the file handle promptly)
with open("classes.dex", "rb") as f:
    dex = DEX(f.read())

# Get basic information
print(f"Classes: {len(dex.get_classes())}")
print(f"Methods: {len(dex.get_methods())}")
print(f"Strings: {len(dex.get_strings())}")

# List all classes
for class_name in dex.get_classes_names():
    print(f"Class: {class_name}")
```

### Class and Method Inspection

```python
# Continues from the previous example: `dex` is the loaded DEX object
from androguard.core.dex import DalvikCode

# Get specific class
target_class = dex.get_class("Lcom/example/MainActivity;")
if target_class:
    print(f"Superclass: {target_class.get_superclassname()}")
    print(f"Interfaces: {target_class.get_interfaces()}")

    # Examine methods
    for method in target_class.get_methods():
        print(f"Method: {method.get_name()}")
        print(f" Descriptor: {method.get_descriptor()}")
        print(f" Access flags: {hex(method.get_access_flags())}")

        # Get bytecode if available
        code = method.get_code()
        if code:
            dalvik_code = DalvikCode(dex.get_class_manager(), code)
            print(f" Instructions: {len(dalvik_code.get_instructions())}")
```

### Bytecode Analysis

```python
# Find specific method and analyze bytecode
method = dex.get_method("Lcom/example/MainActivity;", "onCreate", "(Landroid/os/Bundle;)V")
if method:
    code = method.get_code()
    if code:
        dalvik_code = DalvikCode(dex.get_class_manager(), code)

        print(f"Registers: {dalvik_code.get_registers_size()}")
        print("Instructions:")

        # `i` is the instruction's position in the list, not its code offset
        for i, instruction in enumerate(dalvik_code.get_instructions()):
            print(f" {i:04x}: {instruction.get_name()} {instruction.get_operands()}")

            # Heuristic: treat each integer literal as a possible string pool
            # index; get_string() returns an empty string for invalid indices
            literals = instruction.get_literals()
            if literals:
                for literal in literals:
                    if isinstance(literal, int):
                        string_val = dex.get_string(literal)
                        if string_val:
                            print(f" -> '{string_val}'")
```

### String and Resource Analysis

```python
# Analyze all strings in DEX
strings = dex.get_strings()
print(f"Total strings: {len(strings)}")

# Look for specific patterns
api_calls = []
for i, string in enumerate(strings):
    if "android" in string.lower():
        api_calls.append((i, string))

print("Potential API calls:")
for idx, call in api_calls[:10]:  # Show first 10
    print(f" [{idx}] {call}")

# Find methods that use specific strings
target_string = "password"
for i, string in enumerate(strings):
    if target_string.lower() in string.lower():
        print(f"Found '{string}' at index {i}")

        # Find methods that reference this string
        for method in dex.get_methods():
            code = method.get_code()
            if code:
                dalvik_code = DalvikCode(dex.get_class_manager(), code)
                for instruction in dalvik_code.get_instructions():
                    if i in instruction.get_literals():
                        print(f" Used in {method.get_class_name()}->{method.get_name()}")
```

### ODEX File Analysis

```python
from androguard.core.dex import ODEX

# Load ODEX file (the context manager closes the file handle promptly)
with open("system_app.odex", "rb") as f:
    odex = ODEX(f.read())

if odex.is_valid_odex():
    print("Valid ODEX file")
    print(f"Dependencies: {odex.get_dependencies()}")

    # ODEX objects expose the same analysis methods as DEX
    classes = odex.get_classes_names()
    print(f"Classes in ODEX: {len(classes)}")
```

## Utility Functions

```python { .api }
def get_access_flags_string(access_flags: int, flag_type: str) -> str:
    """
    Convert access flags bitmask to human-readable string.

    Parameters:
    - access_flags: Access flags bitmask
    - flag_type: Type of flags ('class', 'method', or 'field')

    Returns:
    Space-separated string of flag names
    """

def clean_name_instruction(instruction_name: str) -> str:
    """
    Clean and normalize instruction name.

    Parameters:
    - instruction_name: Raw instruction name

    Returns:
    Cleaned instruction name
    """

def get_type(descriptor: str) -> str:
    """
    Convert type descriptor to readable type name.

    Parameters:
    - descriptor: Type descriptor (e.g., 'Ljava/lang/String;')

    Returns:
    Human-readable type name
    """

def static_operand_instruction(instruction) -> dict:
    """
    Get static operand information for instruction.

    Parameters:
    - instruction: Instruction object

    Returns:
    Dictionary with operand details
    """
```
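
To make the descriptor-to-name conversion concrete, here is a minimal standalone sketch of the idea behind a function like `get_type`. The helper `readable_type` is hypothetical and its output format (dotted class names, `[]` suffixes for arrays) is an assumption, so androguard's own result may be formatted differently.

```python
# Single-letter primitive descriptors used by the DEX format.
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}


def readable_type(descriptor: str) -> str:
    """Convert a type descriptor to a Java-style type name."""
    dims = len(descriptor) - len(descriptor.lstrip("["))  # leading '[' count
    base = descriptor[dims:]
    if base.startswith("L") and base.endswith(";"):
        name = base[1:-1].replace("/", ".")
    elif base in PRIMITIVES:
        name = PRIMITIVES[base]
    else:
        raise ValueError(f"unrecognized descriptor: {descriptor!r}")
    return name + "[]" * dims


print(readable_type("Ljava/lang/String;"))
print(readable_type("[[I"))
```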