# Decompilation

Java-like source code generation from Android bytecode using the DAD (Dex to Android Decompiler) engine. The decompiler converts Dalvik bytecode back into readable Java-like source code with proper control flow structures.

## Capabilities

### DAD Decompiler

The main decompiler implementation that converts DEX bytecode to readable Java-like source code.

```python { .api }
class DecompilerDAD:
    def __init__(self, vm, vmx):
        """
        Initialize DAD decompiler.

        Parameters:
        - vm: DEX object containing bytecode
        - vmx: Analysis object with cross-references
        """

    def get_source_class(self, _class) -> str:
        """
        Decompile entire class to source code.

        Parameters:
        - _class: ClassDefItem to decompile

        Returns:
        Complete Java-like source code for class
        """

    def get_source_method(self, method) -> str:
        """
        Decompile single method to source code.

        Parameters:
        - method: EncodedMethod to decompile

        Returns:
        Java-like source code for method
        """

    def display_source(self, method) -> None:
        """
        Print decompiled method source to stdout.

        Parameters:
        - method: EncodedMethod to display
        """

    def get_ast(self, method):
        """
        Get Abstract Syntax Tree for method.

        Parameters:
        - method: EncodedMethod to analyze

        Returns:
        AST representation of method
        """
```

## Decompiled Class Objects

High-level objects representing decompiled classes with source code access.

```python { .api }
class DvClass:
    def __init__(self, class_obj, vmx):
        """
        Initialize decompiled class wrapper.

        Parameters:
        - class_obj: ClassDefItem object
        - vmx: Analysis object
        """

    def get_source(self) -> str:
        """
        Get complete source code for class.

        Returns:
        Java-like source code including all methods and fields
        """

    def get_name(self) -> str:
        """Return class name."""

    def get_superclass_name(self) -> str:
        """Return superclass name."""

    def get_interfaces(self) -> list[str]:
        """Return list of implemented interface names."""

    def get_access_flags_string(self) -> str:
        """Return access flags as readable string."""

    def get_methods(self) -> list:
        """
        Get all decompiled methods in class.

        Returns:
        List of DvMethod objects
        """

    def get_fields(self) -> list:
        """
        Get all fields in class.

        Returns:
        List of field objects with source information
        """

    def get_method(self, name: str, descriptor: str = None):
        """
        Get specific decompiled method by name.

        Parameters:
        - name: Method name
        - descriptor: Method descriptor (optional)

        Returns:
        DvMethod object or None if not found
        """
```
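`get_access_flags_string()` is described above as returning the flags in readable form. As a rough illustration of what such a conversion involves, the sketch below decodes a Dalvik `access_flags` bitmask into keywords. The helper name `access_flags_to_string` and the lookup table are illustrative assumptions, not androguard code; the bit values are a subset of those defined by the DEX format.

```python
# Illustrative sketch only: this is not androguard's implementation.
# Bit values follow the DEX format's access_flags definitions (subset).
ACCESS_FLAGS = [
    (0x0001, "public"),
    (0x0002, "private"),
    (0x0004, "protected"),
    (0x0008, "static"),
    (0x0010, "final"),
    (0x0200, "interface"),
    (0x0400, "abstract"),
]


def access_flags_to_string(flags: int) -> str:
    """Return the keyword for every bit set in `flags`, in table order."""
    return " ".join(name for bit, name in ACCESS_FLAGS if flags & bit)


print(access_flags_to_string(0x0019))  # public static final
```

Keeping the table as an ordered list (rather than a dict keyed by name) makes the output order deterministic and matches the usual Java keyword order.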

## Decompiled Method Objects

Individual decompiled methods with source code and metadata access.

```python { .api }
class DvMethod:
    def __init__(self, method_obj, vmx):
        """
        Initialize decompiled method wrapper.

        Parameters:
        - method_obj: EncodedMethod object
        - vmx: Analysis object
        """

    def get_source(self) -> str:
        """
        Get source code for method.

        Returns:
        Java-like source code for method implementation
        """

    def get_name(self) -> str:
        """Return method name."""

    def get_descriptor(self) -> str:
        """Return method descriptor."""

    def get_class_name(self) -> str:
        """Return containing class name."""

    def get_access_flags_string(self) -> str:
        """Return access flags as readable string."""

    def is_external(self) -> bool:
        """Return True if method is external."""

    def is_android_api(self) -> bool:
        """Return True if method is Android API."""

    def get_method_analysis(self):
        """
        Get MethodAnalysis object.

        Returns:
        MethodAnalysis with cross-references and control flow
        """

    def get_length(self) -> int:
        """Return method code length."""

    def show_source(self) -> None:
        """Print method source code to stdout."""
```

### AST (Abstract Syntax Tree) Access

Access to the decompiler's internal AST representation for advanced analysis.

```python { .api }
def get_ast(self):
    """
    Get Abstract Syntax Tree representation.

    Returns:
    AST node representing method structure
    """

def get_params_type(self) -> list[str]:
    """
    Get parameter types.

    Returns:
    List of parameter type strings
    """

def get_information(self) -> dict:
    """
    Get detailed method information.

    Returns:
    Dictionary with method metadata
    """

def get_locals(self) -> list:
    """Get local variable information."""

def get_arguments(self) -> list:
    """Get method argument information."""
```
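Method descriptors and parameter type strings use Dalvik type notation (`I` for `int`, `Ljava/lang/String;` for an object type, a leading `[` per array dimension). The sketch below shows one way to split the parameter part of a descriptor into individual type strings. `parse_param_types` is a hypothetical helper written for this illustration, and it assumes a well-formed descriptor.

```python
def parse_param_types(descriptor: str) -> list[str]:
    """Split the parameter part of a Dalvik method descriptor into type strings."""
    params = descriptor[descriptor.index("(") + 1 : descriptor.index(")")]
    types = []
    i = 0
    while i < len(params):
        if params[i].isspace():  # some tools separate parameters with spaces
            i += 1
            continue
        start = i
        while params[i] == "[":  # array dimensions prefix the element type
            i += 1
        if params[i] == "L":  # object types run up to the closing ';'
            i = params.index(";", i) + 1
        else:  # primitive types are a single character
            i += 1
        types.append(params[start:i])
    return types


print(parse_param_types("(ILjava/lang/String;[B)V"))
# ['I', 'Ljava/lang/String;', '[B']
```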

## Decompiler Configuration

Control decompiler behavior and output formatting.

```python { .api }
class DecompilerOptions:
    def __init__(self):
        """Initialize decompiler configuration."""

    def set_pretty_show(self, enable: bool) -> None:
        """
        Enable/disable pretty formatting.

        Parameters:
        - enable: True to enable pretty printing
        """

    def set_colors(self, enable: bool) -> None:
        """
        Enable/disable syntax coloring.

        Parameters:
        - enable: True to enable colors
        """

    def set_show_exceptions(self, enable: bool) -> None:
        """
        Enable/disable exception information.

        Parameters:
        - enable: True to show exception details
        """

    def set_escape_unicode(self, enable: bool) -> None:
        """
        Enable/disable Unicode escaping.

        Parameters:
        - enable: True to escape Unicode characters
        """
```
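To make the effect of an option such as Unicode escaping concrete, here is a minimal sketch of Java-style `\uXXXX` escaping for non-ASCII characters. The function `escape_unicode` is an assumption made for illustration; the decompiler's own escaping rules may differ (for example in how it treats control characters).

```python
def escape_unicode(text: str) -> str:
    """Replace every non-ASCII character with Java-style \\uXXXX escapes."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        # Encode as UTF-16 so characters outside the BMP become surrogate pairs,
        # which is how Java string literals represent them.
        data = ch.encode("utf-16-be")
        out.extend(
            f"\\u{int.from_bytes(data[i:i + 2], 'big'):04x}"
            for i in range(0, len(data), 2)
        )
    return "".join(out)


print(escape_unicode("café"))  # caf\u00e9
```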

### Advanced Decompilation Options

Fine-tune decompilation output and behavior.

```python { .api }
def set_decompiler_options(self, options: dict) -> None:
    """
    Set advanced decompiler options.

    Parameters:
    - options: Dictionary of option name to value mappings
    """

def get_decompiler_type(self) -> str:
    """Return decompiler type identifier."""

def process_folder(self, input_folder: str, output_folder: str) -> None:
    """
    Batch decompile entire folder.

    Parameters:
    - input_folder: Path to folder containing DEX/APK files
    - output_folder: Path to output decompiled source files
    """
```
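A batch run over a folder first needs the list of input files. The sketch below shows one way to gather DEX/APK files recursively with the standard library. `collect_inputs` and its `suffixes` parameter are hypothetical names for this illustration and say nothing about how `process_folder` is actually implemented.

```python
import tempfile
from pathlib import Path


def collect_inputs(input_folder: str, suffixes=(".apk", ".dex")) -> list[Path]:
    """Return files under input_folder with a matching extension, sorted by path."""
    root = Path(input_folder)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in suffixes
    )


# Demo on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.apk", "sub/b.dex", "notes.txt"):
        path = Path(tmp, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = [p.relative_to(tmp).as_posix() for p in collect_inputs(tmp)]

print(found)  # ['a.apk', 'sub/b.dex']
```

Sorting the result keeps batch output reproducible between runs, which helps when comparing decompilation logs.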

## AST Node Types

Different types of AST nodes for representing code structures.

```python { .api }
class ASTNode:
    def get_type(self) -> str:
        """Return AST node type."""

    def get_children(self) -> list:
        """Return list of child nodes."""

class ExpressionNode(ASTNode):
    """AST node representing expressions."""

    def get_value(self) -> object:
        """Return expression value."""

class StatementNode(ASTNode):
    """AST node representing statements."""

    def is_compound(self) -> bool:
        """Return True if compound statement."""
```
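A common first step in AST analysis is a traversal that visits every node. The sketch below counts node types using only the `get_type()` and `get_children()` methods listed above. The `Node` class is a stand-in defined here so the example runs on its own; the type names (`"method"`, `"statement"`, `"expression"`) are placeholders, not values the decompiler is known to emit.

```python
from collections import Counter


class Node:
    """Stand-in exposing the two ASTNode methods documented above."""

    def __init__(self, node_type, children=()):
        self._type = node_type
        self._children = list(children)

    def get_type(self):
        return self._type

    def get_children(self):
        return self._children


def count_node_types(root) -> Counter:
    """Depth-first walk that tallies how often each node type occurs."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        counts[node.get_type()] += 1
        stack.extend(node.get_children())
    return counts


tree = Node("method", [
    Node("statement", [Node("expression"), Node("expression")]),
    Node("statement"),
])
print(count_node_types(tree))
```

An explicit stack avoids Python's recursion limit on deeply nested trees.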

## Intermediate Representation (IR)

Core intermediate representation classes for decompiler analysis and transformation.

```python { .api }
class IRForm:
    """Base class for all intermediate representation forms."""

    def get_type(self) -> str:
        """Return IR form type identifier."""

    def accept(self, visitor) -> None:
        """Accept visitor for traversal patterns."""

class Constant(IRForm):
    """Constant value representation in IR."""

    def __init__(self, value: object, const_type: str):
        """
        Initialize constant IR form.

        Parameters:
        - value: The constant value
        - const_type: Type of the constant
        """

    def get_value(self) -> object:
        """Return the constant value."""

    def get_const_type(self) -> str:
        """Return the constant type."""

class Variable(IRForm):
    """Variable reference representation in IR."""

    def __init__(self, name: str, var_type: str):
        """
        Initialize variable IR form.

        Parameters:
        - name: Variable name
        - var_type: Variable type
        """

    def get_name(self) -> str:
        """Return variable name."""

    def get_var_type(self) -> str:
        """Return variable type."""

class BinaryOperation(IRForm):
    """Binary operation representation in IR."""

    def __init__(self, operator: str, left: IRForm, right: IRForm):
        """
        Initialize binary operation IR form.

        Parameters:
        - operator: Operation operator
        - left: Left operand
        - right: Right operand
        """

    def get_operator(self) -> str:
        """Return operation operator."""

    def get_left_operand(self) -> IRForm:
        """Return left operand."""

    def get_right_operand(self) -> IRForm:
        """Return right operand."""

class AssignExpression(IRForm):
    """Assignment expression representation in IR."""

    def __init__(self, target: Variable, value: IRForm):
        """
        Initialize assignment expression.

        Parameters:
        - target: Assignment target variable
        - value: Value to assign
        """

    def get_target(self) -> Variable:
        """Return assignment target."""

    def get_value(self) -> IRForm:
        """Return assignment value."""

class InvokeInstruction(IRForm):
    """Method invocation representation in IR."""

    def __init__(self, method_name: str, args: list[IRForm], invoke_type: str):
        """
        Initialize method invocation.

        Parameters:
        - method_name: Name of invoked method
        - args: List of arguments
        - invoke_type: Type of invocation
        """

    def get_method_name(self) -> str:
        """Return method name."""

    def get_arguments(self) -> list[IRForm]:
        """Return argument list."""

    def get_invoke_type(self) -> str:
        """Return invocation type."""

class FieldAccess(IRForm):
    """Field access representation in IR."""

    def __init__(self, field_name: str, instance: IRForm = None):
        """
        Initialize field access.

        Parameters:
        - field_name: Name of accessed field
        - instance: Instance object (None for static fields)
        """

    def get_field_name(self) -> str:
        """Return field name."""

    def get_instance(self) -> IRForm:
        """Return instance object (None for static)."""
```
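IR forms nest: a binary operation holds two operands that are themselves IR forms. The sketch below renders such a tree as an expression string by recursing over it. The three dataclasses are simplified stand-ins that mirror the constructor parameters listed above so the example is self-contained; `render` is a hypothetical helper, not part of the documented API.

```python
from dataclasses import dataclass


@dataclass
class Constant:
    value: object
    const_type: str


@dataclass
class Variable:
    name: str
    var_type: str


@dataclass
class BinaryOperation:
    operator: str
    left: object
    right: object


def render(node) -> str:
    """Recursively turn an IR tree into a fully parenthesised expression."""
    if isinstance(node, Constant):
        return repr(node.value)
    if isinstance(node, Variable):
        return node.name
    if isinstance(node, BinaryOperation):
        return f"({render(node.left)} {node.operator} {render(node.right)})"
    raise TypeError(f"unsupported IR form: {type(node).__name__}")


expr = BinaryOperation(
    "*",
    BinaryOperation("+", Variable("x", "int"), Constant(5, "int")),
    Variable("y", "int"),
)
print(render(expr))  # ((x + 5) * y)
```

Parenthesising every binary operation sidesteps operator-precedence handling, at the cost of some redundant brackets in the output.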

## Basic Block Representation

Control flow and basic block analysis structures for advanced decompilation.

```python { .api }
class BasicBlock:
    """Represents a basic block in control flow analysis."""

    def __init__(self, block_id: int):
        """
        Initialize basic block.

        Parameters:
        - block_id: Unique identifier for the block
        """

    def get_id(self) -> int:
        """Return block identifier."""

    def get_instructions(self) -> list[IRForm]:
        """Return list of IR instructions in this block."""

    def get_predecessors(self) -> list:
        """Return list of predecessor blocks."""

    def get_successors(self) -> list:
        """Return list of successor blocks."""

    def add_instruction(self, instr: IRForm) -> None:
        """Add instruction to this block."""

class StatementBlock(BasicBlock):
    """Basic block containing statement instructions."""

    def get_statements(self) -> list[IRForm]:
        """Return list of statement IR forms."""

class ConditionalBlock(BasicBlock):
    """Basic block with conditional branching."""

    def get_condition(self) -> IRForm:
        """Return conditional expression."""

    def get_true_block(self) -> BasicBlock:
        """Return block for true branch."""

    def get_false_block(self) -> BasicBlock:
        """Return block for false branch."""

class LoopBlock(BasicBlock):
    """Basic block representing loop structures."""

    def get_loop_condition(self) -> IRForm:
        """Return loop condition expression."""

    def get_loop_body(self) -> list[BasicBlock]:
        """Return blocks in loop body."""

    def is_while_loop(self) -> bool:
        """Return True if while loop."""

    def is_for_loop(self) -> bool:
        """Return True if for loop."""

class TryBlock(BasicBlock):
    """Basic block for exception handling structures."""

    def get_try_body(self) -> list[BasicBlock]:
        """Return blocks in try body."""

    def get_catch_blocks(self) -> list[BasicBlock]:
        """Return catch handler blocks."""

    def get_finally_block(self) -> BasicBlock:
        """Return finally block (if any)."""

    def get_exception_types(self) -> list[str]:
        """Return list of handled exception types."""

class ReturnBlock(BasicBlock):
    """Basic block containing return statements."""

    def get_return_value(self) -> IRForm:
        """Return the returned value expression."""

    def has_return_value(self) -> bool:
        """Return True if returns a value."""
```
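Basic blocks linked through their successors form a control flow graph, and many analyses start by finding which blocks are reachable from the entry block. The sketch below does this with a breadth-first traversal that relies only on `get_id()` and `get_successors()`. `Block` is a stand-in class and `add_successor` is a hypothetical helper used to build the demo graph; neither is part of the API listed above.

```python
from collections import deque


class Block:
    """Stand-in exposing get_id() and get_successors() as documented above."""

    def __init__(self, block_id):
        self._id = block_id
        self._successors = []

    def get_id(self):
        return self._id

    def get_successors(self):
        return self._successors

    def add_successor(self, block):  # hypothetical helper for the demo
        self._successors.append(block)


def reachable_ids(entry) -> list[int]:
    """Breadth-first traversal from entry; returns block ids in visit order."""
    seen = {entry.get_id()}
    order = []
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        order.append(block.get_id())
        for succ in block.get_successors():
            if succ.get_id() not in seen:  # guards against loops (back edges)
                seen.add(succ.get_id())
                queue.append(succ)
    return order


b1, b2, b3, b4, b5 = (Block(i) for i in range(1, 6))
b1.add_successor(b2)
b1.add_successor(b3)
b2.add_successor(b4)
b3.add_successor(b4)
b4.add_successor(b2)  # back edge forming a loop
print(reachable_ids(b1))  # [1, 2, 3, 4]
```

Block 5 has no incoming edge, so it never appears in the result; blocks like this are candidates for dead-code reporting.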

## Advanced IR Usage Examples

### Working with IR Forms

```python
from androguard.decompiler.instruction import IRForm, Constant, Variable, BinaryOperation
from androguard.decompiler.basic_blocks import BasicBlock, StatementBlock

# Create IR forms programmatically
const_5 = Constant(5, "int")
var_x = Variable("x", "int")
add_op = BinaryOperation("+", var_x, const_5)

print(f"IR Expression: {var_x.get_name()} {add_op.get_operator()} {const_5.get_value()}")

# Create basic blocks
block = StatementBlock(1)
block.add_instruction(add_op)

print(f"Block {block.get_id()} has {len(block.get_instructions())} instructions")
```

### Control Flow Analysis

```python
from androguard.decompiler.basic_blocks import ConditionalBlock, LoopBlock

# Analyze conditional structures
cond_block = ConditionalBlock(2)
condition = cond_block.get_condition()

if condition:
    true_path = cond_block.get_true_block()
    false_path = cond_block.get_false_block()
    print(f"Conditional block branches to {true_path.get_id()} or {false_path.get_id()}")

# Analyze loop structures
loop_block = LoopBlock(3)
if loop_block.is_while_loop():
    condition = loop_block.get_loop_condition()
    body_blocks = loop_block.get_loop_body()
    print(f"While loop with {len(body_blocks)} body blocks")
```

### Exception Handling Analysis

```python
from androguard.decompiler.basic_blocks import TryBlock

try_block = TryBlock(4)
exception_types = try_block.get_exception_types()

print(f"Try block handles {len(exception_types)} exception types:")
for exc_type in exception_types:
    print(f"  - {exc_type}")

# Get catch handlers
catch_blocks = try_block.get_catch_blocks()
for i, catch_block in enumerate(catch_blocks):
    print(f"Catch block {i+1}: {catch_block.get_id()}")

# Check for finally block
finally_block = try_block.get_finally_block()
if finally_block:
    print(f"Finally block: {finally_block.get_id()}")
```

Further AST node definitions, which appear to continue the listing under AST Node Types:

```python { .api }
# Fragment: `accept` is presumably a method of ASTNode
def accept(self, visitor) -> None:
    """Accept visitor for AST traversal."""

class MethodNode(ASTNode):
    def get_body(self):
        """Return method body node."""

    def get_parameters(self) -> list:
        """Return parameter nodes."""

    def get_return_type(self) -> str:
        """Return return type string."""

class ClassNode(ASTNode):
    def get_methods(self) -> list:
        """Return method nodes."""

    def get_fields(self) -> list:
        """Return field nodes."""

    def get_superclass(self) -> str:
        """Return superclass name."""
```

## Usage Examples

### Basic Decompilation

```python
from androguard.misc import AnalyzeAPK
# Assumption: DvClass lives in the same module this document imports
# DecompilerDAD from; the path may differ between androguard releases.
from androguard.decompiler.dad.decompile import DvClass

# Analyze APK with decompilation
apk, dex_objects, dx = AnalyzeAPK("app.apk")

# Get all decompiled classes
print(f"Decompiling {len(dex_objects)} DEX files...")

for dex in dex_objects:
    classes = dex.get_classes()
    print(f"Classes in DEX: {len(classes)}")

    # Decompile each class
    for class_obj in classes[:5]:  # First 5 classes
        class_name = class_obj.get_name()
        print(f"\nDecompiling class: {class_name}")

        # Create DvClass object
        dv_class = DvClass(class_obj, dx)

        # Get source code
        source_code = dv_class.get_source()
        print(f"Source length: {len(source_code)} characters")

        # Save to file
        filename = class_name.replace('/', '_').replace(';', '') + '.java'
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(source_code)
```

### Method-Level Decompilation

```python
# Continues from Basic Decompilation: `dx` is the Analysis object.
# Assumption: DvMethod is importable from the same module as DecompilerDAD.
from androguard.decompiler.dad.decompile import DvMethod

# Find and decompile specific methods.
# The keyword is `methodname` and the value is a regular expression,
# so anchor it to avoid matching names such as onCreateView.
oncreate_methods = dx.find_methods(methodname="^onCreate$")

for method_analysis in oncreate_methods:
    method_obj = method_analysis.get_method()

    print(f"\nDecompiling: {method_analysis.get_class_name()}.{method_analysis.get_name()}")

    # Create DvMethod object
    dv_method = DvMethod(method_obj, dx)

    # Get decompiled source
    source = dv_method.get_source()
    print("Decompiled source:")
    print(source)

    # Get method information
    info = dv_method.get_information()
    print(f"Method info: {info}")
```

### Targeted Class Decompilation

```python
# Find MainActivity and decompile completely
main_activities = dx.find_classes(r".*MainActivity.*")

for class_analysis in main_activities:
    class_obj = class_analysis.get_class()

    print(f"Decompiling MainActivity: {class_analysis.get_name()}")

    # Create decompiled class
    dv_class = DvClass(class_obj, dx)

    # Get class metadata
    print(f"Superclass: {dv_class.get_superclass_name()}")
    print(f"Interfaces: {dv_class.get_interfaces()}")
    print(f"Access flags: {dv_class.get_access_flags_string()}")

    # Decompile all methods
    dv_methods = dv_class.get_methods()
    print(f"Methods to decompile: {len(dv_methods)}")

    for dv_method in dv_methods:
        method_name = dv_method.get_name()
        print(f"\n--- Method: {method_name} ---")

        try:
            source = dv_method.get_source()
            print(source)
        except Exception as e:
            print(f"Decompilation failed: {e}")
```

### Advanced AST Analysis

```python
from androguard.decompiler.dad.decompile import DecompilerDAD

# Create DAD decompiler
decompiler = DecompilerDAD(dex_objects[0], dx)

# Find interesting methods (the keyword is `methodname`; the value is a regex)
crypto_methods = dx.find_methods(methodname=r".*(encrypt|decrypt|hash).*")

for method_analysis in crypto_methods:
    method_obj = method_analysis.get_method()

    print(f"\nAnalyzing AST for: {method_analysis.get_name()}")

    try:
        # Get AST representation
        ast = decompiler.get_ast(method_obj)

        if ast:
            print("AST structure available")
            # Custom AST analysis would go here

        # Get source with decompiler directly
        source = decompiler.get_source_method(method_obj)
        print("Direct decompilation successful")
        print(source[:200] + "..." if len(source) > 200 else source)

    except Exception as e:
        print(f"AST analysis failed: {e}")
```

### Batch Decompilation with Error Handling

```python
import os

def safe_decompile_class(class_obj, dx, output_dir):
    """Safely decompile a class with error handling."""
    try:
        dv_class = DvClass(class_obj, dx)
        class_name = dv_class.get_name()

        # Clean filename: drop only the leading 'L' and trailing ';' of the
        # descriptor (replace('L', '') would also remove every other 'L')
        safe_name = class_name[1:] if class_name.startswith('L') else class_name
        safe_name = safe_name.replace(';', '').replace('/', '_')
        filename = os.path.join(output_dir, safe_name + '.java')

        # Get source
        source = dv_class.get_source()

        # Write to file
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(f"// Decompiled class: {class_name}\n")
            f.write(f"// Superclass: {dv_class.get_superclass_name()}\n")
            f.write(f"// Interfaces: {', '.join(dv_class.get_interfaces())}\n\n")
            f.write(source)

        return filename, None
    except Exception as e:
        return None, str(e)

# Batch decompile all classes
output_directory = "decompiled_sources"
os.makedirs(output_directory, exist_ok=True)

total_classes = 0
successful = 0
failed = 0

for dex in dex_objects:
    classes = dex.get_classes()
    total_classes += len(classes)

    for class_obj in classes:
        filename, error = safe_decompile_class(class_obj, dx, output_directory)

        if filename:
            successful += 1
            print(f"✓ {os.path.basename(filename)}")
        else:
            failed += 1
            class_name = class_obj.get_name()
            print(f"✗ {class_name}: {error}")

print("\nDecompilation complete:")
print(f"Total classes: {total_classes}")
print(f"Successful: {successful}")
print(f"Failed: {failed}")
if total_classes:  # avoid dividing by zero when no classes were found
    print(f"Success rate: {successful/total_classes*100:.1f}%")
```
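The batch example flattens each class name into a single file name. An alternative is to mirror the package structure on disk, as Java source trees do. The sketch below converts a Dalvik class descriptor into a relative path; `descriptor_to_relpath` is a hypothetical helper, and it assumes descriptors of the form `Lpackage/Name;`.

```python
def descriptor_to_relpath(descriptor: str, ext: str = ".java") -> str:
    """Turn 'Lcom/example/Foo;' into 'com/example/Foo.java'."""
    name = descriptor
    # Strip only the wrapping 'L' ... ';' so names containing 'L' stay intact
    if name.startswith("L") and name.endswith(";"):
        name = name[1:-1]
    return name + ext


print(descriptor_to_relpath("Lcom/example/Login;"))  # com/example/Login.java
```

The returned path uses `/` separators; passing it through `os.path.join(output_dir, *path.split("/"))` would adapt it to the host platform.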

### Decompiler Comparison and Validation

```python
from itertools import islice

# Compare different decompilation approaches
# (`dex_objects` and `dx` come from the AnalyzeAPK call in Basic Decompilation)
def compare_decompilation_methods(method_obj, dx):
    """Compare different ways to decompile a method."""

    method_name = f"{method_obj.get_class_name()}.{method_obj.get_name()}"
    print(f"\nComparing decompilation for: {method_name}")

    # Method 1: Direct DAD decompiler
    try:
        decompiler = DecompilerDAD(dex_objects[0], dx)
        source1 = decompiler.get_source_method(method_obj)
        print("✓ DAD decompiler successful")
    except Exception as e:
        source1 = None
        print(f"✗ DAD decompiler failed: {e}")

    # Method 2: DvMethod wrapper
    try:
        dv_method = DvMethod(method_obj, dx)
        source2 = dv_method.get_source()
        print("✓ DvMethod wrapper successful")
    except Exception as e:
        source2 = None
        print(f"✗ DvMethod wrapper failed: {e}")

    # Compare results
    if source1 and source2:
        if source1 == source2:
            print("✓ Both methods produce identical results")
        else:
            print("⚠ Methods produce different results")
            print(f"  DAD length: {len(source1)}")
            print(f"  DvMethod length: {len(source2)}")

    return source1 or source2

# Test on various method types.
# find_methods() yields results lazily, so take the first few with islice
# instead of slicing.
test_methods = []
test_methods.extend(islice(dx.find_methods(methodname="<init>"), 3))  # Constructors
test_methods.extend(islice(dx.find_methods(methodname="^onCreate$"), 2))  # Lifecycle
test_methods.extend(islice(dx.find_methods(accessflags=r".*static.*"), 2))  # Static methods

for method_analysis in test_methods:
    method_obj = method_analysis.get_method()
    source = compare_decompilation_methods(method_obj, dx)

    if source:
        print(f"Sample output ({len(source)} chars):")
        print(source[:150] + "..." if len(source) > 150 else source)
```

## Utility Functions

```python { .api }
def auto_vm(filename: str):
    """
    Automatically determine file type and create appropriate VM.

    Parameters:
    - filename: Path to DEX/APK/ODEX file

    Returns:
    Tuple of (DEX_object, Analysis_object)
    """

def pretty_show(vmx, method, colors: bool = True) -> str:
    """
    Pretty print decompiled method with optional colors.

    Parameters:
    - vmx: Analysis object
    - method: Method to decompile
    - colors: Enable syntax highlighting

    Returns:
    Formatted source code string
    """

def export_source_to_disk(output_dir: str, vmx, java: bool = True, raw: bool = False) -> None:
    """
    Export all decompiled source to disk.

    Parameters:
    - output_dir: Output directory path
    - vmx: Analysis object
    - java: Export as .java files
    - raw: Export raw bytecode alongside source
    """
```
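Automatic file type detection of the kind `auto_vm` describes is commonly done by inspecting the first bytes of the file. The sketch below distinguishes DEX, ODEX, and ZIP-based APK inputs by their magic numbers. `sniff_file_type` is a hypothetical helper and is not claimed to match how `auto_vm` works internally; note that any ZIP archive, not only an APK, starts with the same signature.

```python
def sniff_file_type(header: bytes) -> str:
    """Classify a file from its leading bytes."""
    if header.startswith(b"dex\n"):
        return "DEX"
    if header.startswith(b"dey\n"):
        return "ODEX"
    if header.startswith(b"PK\x03\x04"):  # ZIP local file header; APKs are ZIPs
        return "APK"
    return "UNKNOWN"


print(sniff_file_type(b"dex\n035\x00"))  # DEX
```

In practice the header would come from reading the first few bytes of the file, for example `open(path, "rb").read(8)`.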