# Android Formats

Specialized support for Android application formats including DEX bytecode, ART runtime files, OAT optimized executables, and VDEX verification data. These formats enable analysis of Android applications, runtime optimization, and security research.

## Capabilities

### Format Detection

Identify Android-specific formats for proper parsing and analysis.

```python { .api }
def is_dex(file: Union[str, Sequence[int]]) -> bool:
    """Check if file is DEX format."""

def is_art(file: Union[str, Sequence[int]]) -> bool:
    """Check if file is ART format."""

def is_oat(file: Union[str, Sequence[int], ELF.Binary]) -> bool:
    """Check if file is OAT format."""

def is_vdex(file: Union[str, Sequence[int]]) -> bool:
    """Check if file is VDEX format."""
```

Usage example:
```python
import lief

# Check Android format types. Each path must point to a readable file on disk,
# so classes.dex is assumed to have been extracted from base.apk beforehand.
apk_dex = "classes.dex"
if lief.is_dex(apk_dex):
    print("DEX file detected")

oat_file = "/data/dalvik-cache/arm64/system@framework@boot.oat"
if lief.is_oat(oat_file):
    print("OAT file detected")

art_file = "/data/dalvik-cache/arm64/system@framework@boot.art"
if lief.is_art(art_file):
    print("ART file detected")
```
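
Each of these formats begins with a recognizable magic value, so a first-pass check is possible without any library. The sketch below is only an illustration of that idea, not how LIEF implements detection; `sniff_android_format` is a hypothetical helper name. An OAT file on disk is an ELF container, so it starts with the ELF magic rather than `oat\n`, and the sketch can only flag it as a candidate.

```python
def sniff_android_format(data: bytes) -> str:
    """Classify a buffer by its leading magic bytes (illustrative only)."""
    if data.startswith(b"dex\n"):
        return "dex"
    if data.startswith(b"art\n"):
        return "art"
    if data.startswith(b"vdex"):
        return "vdex"
    if data.startswith(b"\x7fELF"):
        # OAT payloads live inside an ELF container; a real check would
        # still need to locate the embedded "oat\n" header.
        return "elf (possible oat)"
    return "unknown"


print(sniff_android_format(b"dex\n035\x00"))
print(sniff_android_format(b"\x7fELF\x02\x01\x01"))
```

For anything beyond a quick triage, the `lief.is_*` helpers remain the better choice because they work on paths as well as raw buffers.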

### Android Version Support

Identify and work with different Android API versions and their characteristics.

```python { .api }
class ANDROID_VERSIONS(enum.Enum):
    UNKNOWN = 0
    VERSION_601 = 1  # Android 6.0.1
    VERSION_700 = 2  # Android 7.0
    VERSION_710 = 3  # Android 7.1.0
    VERSION_712 = 4  # Android 7.1.2
    VERSION_800 = 5  # Android 8.0
    VERSION_810 = 6  # Android 8.1
    VERSION_900 = 7  # Android 9.0

def code_name(version: ANDROID_VERSIONS) -> str:
    """Get Android codename for version."""

def version_string(version: ANDROID_VERSIONS) -> str:
    """Get version string for Android version."""
```

Usage example:
```python
import lief.Android as Android

# Work with Android versions
version = Android.ANDROID_VERSIONS.VERSION_900
print(f"Codename: {Android.code_name(version)}")
print(f"Version: {Android.version_string(version)}")

# Map all versions
for version in Android.ANDROID_VERSIONS:
    if version != Android.ANDROID_VERSIONS.UNKNOWN:
        codename = Android.code_name(version)
        version_str = Android.version_string(version)
        print(f"{version_str} ({codename})")
```
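
When the API level matters as well (for example, to decide which platform features a sample can rely on), a small lookup table keyed by the enum member names can complement the helpers above. This is a sketch outside of LIEF: `ANDROID_RELEASES` and `describe` are hypothetical names, and the API levels are the publicly documented ones for each release.

```python
# Hypothetical lookup table keyed by the enum member names listed above:
# member name -> (release, codename, API level)
ANDROID_RELEASES = {
    "VERSION_601": ("6.0.1", "Marshmallow", 23),
    "VERSION_700": ("7.0", "Nougat", 24),
    "VERSION_710": ("7.1.0", "Nougat", 25),
    "VERSION_712": ("7.1.2", "Nougat", 25),
    "VERSION_800": ("8.0", "Oreo", 26),
    "VERSION_810": ("8.1", "Oreo", 27),
    "VERSION_900": ("9.0", "Pie", 28),
}


def describe(member_name: str) -> str:
    """Format one table entry as a human-readable label."""
    release, codename, api_level = ANDROID_RELEASES[member_name]
    return f"Android {release} ({codename}, API {api_level})"


print(describe("VERSION_900"))
```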

### DEX Format Analysis

Analyze DEX (Dalvik Executable) files containing Android application bytecode.

```python { .api }
# Access through lief.DEX module
import lief.DEX as DEX

def parse(file: Union[str, bytes, io.IOBase]) -> Optional[File]

class File:
    header: Header
    classes: Iterator[Class]
    methods: Iterator[Method]
    strings: Iterator[str]
    types: Iterator[Type]
    fields: Iterator[Field]
    prototypes: Iterator[Prototype]

class Header:
    magic: bytes
    version: int
    checksum: int
    signature: bytes
    file_size: int
    header_size: int
    link_size: int
    link_offset: int
    map_offset: int
    strings_offset: int
    types_offset: int
    prototypes_offset: int
    fields_offset: int
    methods_offset: int
    classes_offset: int
    data_size: int
    data_offset: int

class Class:
    fullname: str
    package_name: str
    name: str
    pretty_name: str
    access_flags: int
    parent: Optional[Class]
    source_filename: str
    methods: Iterator[Method]
    fields: Iterator[Field]

class Method:
    name: str
    index: int
    pretty_name: str
    access_flags: int
    code_offset: int
    prototype: Prototype

class Field:
    name: str
    index: int
    access_flags: int
    type: Type

class Type:
    type: str
    dim: int

class Prototype:
    return_type: Type
    parameters_type: List[Type]
```

DEX files contain the Dalvik bytecode that an Android application's Java classes are compiled to. They include:

- **Class definitions**: Java classes compiled to Dalvik bytecode
- **Method signatures**: Function definitions and implementations
- **String constants**: All string literals used in the application
- **Type definitions**: Class and primitive type information
- **Field definitions**: Class member variables and their types

Usage example:
```python
import lief.DEX as DEX

# Parse a DEX file extracted from an APK
dex_binary = DEX.parse("classes.dex")
if dex_binary:
    print("DEX file parsed successfully")
    # Walk the classes and methods exposed by the File object
    for cls in dex_binary.classes:
        print(cls.pretty_name)
        for method in cls.methods:
            print(f"  {method.name}")
```
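
To make the header fields listed above more concrete, the following sketch decodes the fixed 112-byte DEX header with the standard library and checks its Adler-32 checksum. It is independent of LIEF: `read_dex_header` is a hypothetical helper, the field names follow the public DEX format description rather than LIEF's attribute names, and the sample buffer is a synthetic, otherwise empty header built only to exercise the parser.

```python
import struct
import zlib

# 8-byte magic, u32 checksum, 20-byte SHA-1 signature, then 20 u32 fields.
DEX_HEADER = struct.Struct("<8sI20s20I")
FIELD_NAMES = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


def read_dex_header(data: bytes) -> dict:
    """Decode the fixed-size header and verify the Adler-32 checksum."""
    magic, checksum, _signature, *values = DEX_HEADER.unpack_from(data)
    if not magic.startswith(b"dex\n"):
        raise ValueError("not a DEX file")
    header = dict(zip(FIELD_NAMES, values))
    header["version"] = magic[4:7].decode("ascii")
    # The checksum covers everything after the magic and checksum fields.
    header["checksum_ok"] = zlib.adler32(data[12:]) == checksum
    return header


# Synthetic header: zeroed signature, file_size = header_size = 112.
body = bytes(20) + struct.pack("<20I", 112, 112, 0x12345678, *([0] * 17))
sample = b"dex\n035\x00" + struct.pack("<I", zlib.adler32(body)) + body
info = read_dex_header(sample)
print(info["version"], info["file_size"], info["checksum_ok"])
```

A checksum mismatch on a real file is a useful early signal that the DEX was truncated or modified after it was built.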

### ART Format Analysis

Analyze ART (Android Runtime) files used for ahead-of-time compilation.

```python { .api }
# Access through lief.ART module
import lief.ART as ART

def parse(file: Union[str, bytes, io.IOBase]) -> Optional[File]

class File:
    header: Header

class Header:
    magic: bytes
    version: str
    image_base: int
    image_size: int
    image_methods_offset: int
    image_methods_count: int
    image_roots_offset: int
    oat_checksum: int
    oat_file_begin: int
    oat_data_begin: int
    oat_data_end: int
    oat_file_end: int
    boot_image_begin: int
    boot_image_size: int
    boot_oat_begin: int
    boot_oat_size: int
    patch_delta: int
    image_roots_size: int
    pointer_size: int
    compile_pic: bool
    is_pic: bool
    storage_mode: int
    data_size: int
```

ART image files (`.art`) are used by the Android Runtime alongside ahead-of-time (AOT) compiled code:

- **Compiled code references**: Locations of the native code held in the companion OAT file (see the `oat_*` header fields)
- **Runtime metadata**: Information needed for execution and garbage collection
- **Object layouts**: Memory layout information for Java objects
- **Method information**: Mapping between DEX methods and compiled code

Usage example:
```python
import lief.ART as ART

# Parse ART file
art_binary = ART.parse("boot.art")
if art_binary:
    print("ART file parsed successfully")
    # ART-specific analysis would be available
```
228
229
### OAT Format Analysis
230
231
Analyze OAT (Optimized Android executables) files containing compiled DEX code.
232
233
```python { .api }
234
# Access through lief.OAT module (OAT files are ELF-based)
235
import lief.OAT as OAT
236
237
def parse(file: Union[str, bytes, io.IOBase]) -> Optional[Binary]
238
239
class Binary(lief.ELF.Binary):
240
header: Header
241
dex_files: Iterator[DexFile]
242
classes: Iterator[Class]
243
methods: Iterator[Method]
244
245
class Header:
246
magic: bytes
247
version: str
248
adler32_checksum: int
249
instruction_set: INSTRUCTION_SETS
250
instruction_set_features: int
251
dex_file_count: int
252
executable_offset: int
253
interpreter_to_interpreter_bridge_offset: int
254
interpreter_to_compiled_code_bridge_offset: int
255
jni_dlsym_lookup_offset: int
256
quick_generic_jni_trampoline_offset: int
257
quick_imt_conflict_trampoline_offset: int
258
quick_resolution_trampoline_offset: int
259
quick_to_interpreter_bridge_offset: int
260
image_patch_delta: int
261
image_file_location_oat_checksum: int
262
image_file_location_oat_data_begin: int
263
key_value_size: int
264
265
class DexFile:
266
location: str
267
checksum: int
268
dex_file_offset: int
269
classes_offsets: List[int]
270
lookup_table_offset: int
271
272
class Class:
273
status: int
274
type: str
275
bitmap: List[int]
276
methods: Iterator[Method]
277
278
class Method:
279
native_method_addr: int
280
native_quick_generic_jni_trampoline: int
281
native_quick_to_interpreter_bridge: int
282
283
enum INSTRUCTION_SETS:
284
NONE = 0
285
ARM = 1
286
ARM64 = 2
287
X86 = 3
288
X86_64 = 4
289
MIPS = 5
290
MIPS64 = 6
291
```

OAT files are ELF files with Android-specific extensions:

- **ELF structure**: Standard ELF container whose dynamic symbols (`oatdata`, `oatexec`, `oatlastword`) delimit the OAT payload
- **Compiled DEX code**: Native code compiled from DEX bytecode
- **DEX file embedding**: Original DEX files embedded within OAT
- **Runtime information**: Data needed for execution and optimization

Usage example:
```python
import lief
import lief.ELF as ELF

# OAT files are ELF files, so parse with ELF module
oat_binary = ELF.parse("boot.oat")
if oat_binary and lief.is_oat(oat_binary):
    print("OAT file parsed as ELF")

    # Look for the Android-specific symbols that mark the OAT payload
    for symbol in oat_binary.dynamic_symbols:
        if symbol.name.startswith("oat"):
            print(f"OAT symbol: {symbol.name}")
```
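
The `key_value_size` field in the header describes a key-value store that follows the fixed header fields and holds NUL-terminated strings such as compiler options. As a minimal sketch of how such a blob could be decoded once it has been sliced out of the file, assuming a plain `key\0value\0` layout: `parse_key_value_store` is a hypothetical helper, and the sample keys and values are illustrative.

```python
def parse_key_value_store(blob: bytes) -> dict:
    """Split a buffer of NUL-terminated strings into key/value pairs."""
    parts = blob.split(b"\x00")
    # A well-formed store ends with a NUL, leaving one empty trailing item.
    if parts and parts[-1] == b"":
        parts.pop()
    if len(parts) % 2:
        raise ValueError("key without a value")
    strings = [part.decode("utf-8", errors="replace") for part in parts]
    # Even positions are keys, odd positions are their values.
    return dict(zip(strings[0::2], strings[1::2]))


sample = b"compiler-filter\x00speed\x00debuggable\x00false\x00"
print(parse_key_value_store(sample))
```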

### VDEX Format Analysis

Analyze VDEX (Verified DEX) files containing verification and optimization data.

```python { .api }
# Access through lief.VDEX module
import lief.VDEX as VDEX

def parse(file: Union[str, bytes, io.IOBase]) -> Optional[File]

class File:
    header: Header
    dex_files: Iterator[DEX.File]

class Header:
    magic: bytes
    version: str
    number_of_dex_files: int
    dex_size: int
    dex_shared_data_size: int
    quickening_info_size: int
    verifier_deps_size: int
    bootclasspath_checksums_size: int
```

VDEX files contain verification and optimization data:

- **Verification data**: Information about DEX file verification status
- **Optimization information**: Data for runtime optimizations
- **DEX contents**: The original DEX file data
- **Dependencies**: Information about class dependencies and loading

Usage example:
```python
import lief.VDEX as VDEX

# Parse VDEX file
vdex_binary = VDEX.parse("base.vdex")
if vdex_binary:
    print("VDEX file parsed successfully")
    # VDEX-specific analysis would be available
```

### APK Analysis Integration

Integrate Android format analysis with APK (Android Package) files.

```python { .api }
# APK files are ZIP archives containing:
# - classes.dex (DEX files)
# - AndroidManifest.xml
# - Resources (res/)
# - Native libraries (lib/)
# - Assets (assets/)
```

APK files can be analyzed by extracting their components:

Usage example:
```python
import zipfile
import lief

def analyze_apk(apk_path):
    """Analyze Android APK file contents."""
    with zipfile.ZipFile(apk_path, 'r') as apk:
        # List all files
        print("APK contents:")
        for file_info in apk.filelist:
            print(f"  {file_info.filename}")

        # Analyze DEX files
        for filename in apk.namelist():
            if filename.endswith('.dex'):
                print(f"Found DEX file: {filename}")
                # Convert bytes to a list of ints to match Sequence[int]
                dex_data = list(apk.read(filename))
                if lief.is_dex(dex_data):
                    print(f"  Valid DEX file: {len(dex_data)} bytes")

        # Check for native libraries
        native_libs = [f for f in apk.namelist() if f.startswith('lib/')]
        if native_libs:
            print("Native libraries:")
            for lib in native_libs:
                print(f"  {lib}")
                lib_data = list(apk.read(lib))
                if lief.is_elf(lib_data):
                    elf_binary = lief.ELF.parse(lib_data)
                    if elf_binary:
                        print(f"    ELF: {elf_binary.header.machine_type}")

# Usage
analyze_apk("app.apk")
```
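
Multidex applications ship several bytecode files (`classes.dex`, `classes2.dex`, and so on), and a plain string sort would place `classes10.dex` before `classes2.dex`. The sketch below lists only the top-level entries in numeric order, using nothing but the standard library; `list_dex_entries` is a hypothetical helper, and the in-memory archive stands in for a real APK.

```python
import io
import re
import zipfile

# Matches classes.dex, classes2.dex, ... at the archive root only.
DEX_ENTRY = re.compile(r"^classes(\d*)\.dex$")


def list_dex_entries(apk: zipfile.ZipFile) -> list:
    """Return top-level classesN.dex entries in numeric order."""
    found = []
    for name in apk.namelist():
        match = DEX_ENTRY.match(name)
        if match:
            # classes.dex carries no number and is treated as the first file.
            found.append((int(match.group(1) or 1), name))
    return [name for _, name in sorted(found)]


# Build a tiny stand-in archive in memory to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for entry in ("classes10.dex", "AndroidManifest.xml", "classes2.dex",
                  "classes.dex", "assets/extra.dex"):
        archive.writestr(entry, b"")

with zipfile.ZipFile(buffer) as archive:
    ordered = list_dex_entries(archive)
print(ordered)
```

Anchoring the pattern at the archive root is a deliberate choice here: DEX files stored under `assets/` are not loaded automatically, so they are usually worth reporting separately.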

### Runtime Analysis Support

Support for analyzing Android runtime environments and optimization.

DEX Optimization Process:
1. **DEX files** are packaged in APK
2. **VDEX files** contain verification data
3. **OAT files** contain compiled native code
4. **ART files** contain runtime metadata

Analysis workflow:
```python
from pathlib import Path

def analyze_android_optimization(package_path):
    """Analyze Android app optimization artifacts."""
    root = Path(package_path)

    # Check for different optimization stages (matched by file extension;
    # app-level OAT files commonly use the .odex extension)
    dex_files = list(root.rglob("*.dex"))
    vdex_files = list(root.rglob("*.vdex"))
    oat_files = [*root.rglob("*.oat"), *root.rglob("*.odex")]
    art_files = list(root.rglob("*.art"))

    print(f"Found {len(dex_files)} DEX files")
    print(f"Found {len(vdex_files)} VDEX files")
    print(f"Found {len(oat_files)} OAT files")
    print(f"Found {len(art_files)} ART files")

    # Analyze optimization level
    if art_files:
        print("Ahead-of-time (AOT) compilation detected")
    elif oat_files:
        print("Optimized DEX compilation detected")
    elif vdex_files:
        print("Verified DEX files detected")
    else:
        print("Unoptimized DEX files only")
```

### Security Analysis

Android format analysis for security research and malware detection.

Common security analysis tasks:
- **DEX code analysis**: Examine Android application logic
- **Native library analysis**: Analyze JNI libraries in APKs
- **OAT inspection**: Understand runtime optimizations
- **Packing detection**: Identify obfuscated or packed applications

Usage example:
```python
import zipfile
import lief

def security_analysis(apk_path):
    """Perform security analysis on Android APK."""

    with zipfile.ZipFile(apk_path, 'r') as apk:
        # Check for suspicious files
        suspicious_files = []

        for filename in apk.namelist():
            # Look for native libraries
            if filename.startswith('lib/') and filename.endswith('.so'):
                # Convert bytes to a list of ints to match Sequence[int]
                lib_data = list(apk.read(filename))
                if lief.is_elf(lib_data):
                    elf_binary = lief.ELF.parse(lib_data)
                    if elf_binary:
                        # Check for packing indicators
                        if any('upx' in section.name.lower() for section in elf_binary.sections):
                            suspicious_files.append(f"{filename}: UPX packed")

                        # Check for anti-debugging
                        if elf_binary.has_symbol("ptrace"):
                            suspicious_files.append(f"{filename}: ptrace usage")

            # Analyze DEX files
            elif filename.endswith('.dex'):
                dex_data = list(apk.read(filename))
                if lief.is_dex(dex_data):
                    # DEX-specific security checks would go here
                    print(f"Analyzing DEX: {filename}")

        if suspicious_files:
            print("Suspicious indicators found:")
            for indicator in suspicious_files:
                print(f"  {indicator}")
```
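
Section names are easy for a packer to change, so packing detection is often complemented by an entropy check: compressed or encrypted data tends to approach 8 bits per byte. The following is a minimal sketch of that heuristic using only the standard library. `shannon_entropy`, `looks_packed`, and the 7.2 threshold are assumptions chosen for illustration, and a high score is a hint rather than proof.

```python
import math
from collections import Counter

# Assumed cut-off for illustration; tune it against known samples.
PACKED_THRESHOLD = 7.2


def shannon_entropy(data: bytes) -> float:
    """Return the entropy of a buffer in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(data).values())


def looks_packed(data: bytes) -> bool:
    """Flag buffers whose byte distribution is close to uniform."""
    return shannon_entropy(data) >= PACKED_THRESHOLD


print(shannon_entropy(bytes(range(256))))  # uniform distribution
print(looks_packed(b"\x00" * 1024))        # constant padding
```

In the workflow above, the same function could be applied to the raw bytes of each native library or DEX entry read from the APK.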

## Types

```python { .api }
# Android-specific enumerations and constants

class ANDROID_VERSIONS(enum.Enum):
    UNKNOWN = 0
    VERSION_601 = 1  # Marshmallow 6.0.1
    VERSION_700 = 2  # Nougat 7.0
    VERSION_710 = 3  # Nougat 7.1.0
    VERSION_712 = 4  # Nougat 7.1.2
    VERSION_800 = 5  # Oreo 8.0
    VERSION_810 = 6  # Oreo 8.1
    VERSION_900 = 7  # Pie 9.0

# DEX file format constants
DEX_FILE_MAGIC = "dex\n"
DEX_FILE_VERSION_035 = "035\0"
DEX_FILE_VERSION_037 = "037\0"
DEX_FILE_VERSION_038 = "038\0"
DEX_FILE_VERSION_039 = "039\0"

# OAT file format constants
OAT_MAGIC = "oat\n"
OAT_VERSION_MINIMUM = "045"
OAT_VERSION_MAXIMUM = "199"

# ART file format constants
ART_MAGIC = "art\n"

# VDEX file format constants
VDEX_MAGIC = "vdex"
```