This tutorial introduces Android formats as well as the API to use them. We are talking about DEX, OAT, VDEX and ART.
Files used in this tutorial are available on the tutorials repository
By Romain Thomas - @rh0main
Let’s start with a quick reminder about compilation, installation and the execution of Android applications.
In application development, most of the code is usually written in Java. Developers can also write native code (C/C++) through the Java Native Interface (JNI).
In the APK building process, the Java code is eventually transformed into Dalvik bytecode, which is interpreted by the Android Java virtual machine. The Android JVM differs from Oracle's implementation: among other differences, it is register-based whereas Oracle's is stack-based.
To produce the Dalvik bytecode, Java sources are first compiled with javac into Java bytecode, and Android then transforms this bytecode into Dalvik bytecode using the dx compiler (or its successor, D8). This bytecode is finally wrapped in one or more DEX files such as classes.dex. The DEX format is specific to Android and the documentation is available here.
During the installation of the APK, the system applies optimizations to this DEX file in order to speed up execution. Indeed, interpreting bytecode is not as efficient as executing native code, and the Dalvik virtual machine is based on 32-bit wide registers whereas most recent CPUs are 64-bit.
To address this issue, prior to Android 4.4 (KitKat) the runtime used JIT compilation to transform Dalvik bytecode into native code. The JIT compilation occurred during execution and was repeated each time the application was run. Starting with Android 4.4, Android moved to a new runtime (ART) which, among other features, performs the optimizations during installation. Consequently, installation takes more time but the transformation to native code is done only once.
To optimize the Dalvik bytecode, the original DEX file (e.g. classes.dex) is transformed into another file that contains the native code. This new file usually has the .odex or .oat extension and is wrapped in the ELF format. Using the ELF format makes sense for two main reasons:
It’s the default format used by Linux and Android to package native code.
It allows the same loader to be reused: /system/bin/linker{64}
OAT files are in fact ELF files, which is why we chose to add this new format to LIEF. The ELF format is actually used as a wrapper around another format that is specific to Android: the OAT format.
Basically, the associated ELF exports a few symbols:
import lief
oat = lief.parse("SomeOAT")
for s in oat.dynamic_symbols:
    print(s)
oatdata OBJECT GLOBAL 1000 1262000
oatexec OBJECT GLOBAL 1263000 10d4060
oatlastword OBJECT GLOBAL 233705c 4
oatbss OBJECT GLOBAL 2338000 f5050
oatbsslastword OBJECT GLOBAL 242d04c 4
These symbols act as pointers to specific parts of the OAT format. For example, oatdata points to the beginning of the underlying OAT format whereas oatexec points to the native code. For those interested in a deeper understanding of the OAT internal structures, see:
These different formats can be a bit confusing, so to summarize:
DEX files are transformed into .odex files, which are primarily ELF files wrapping a custom OAT format.
The structure of the OAT is poorly documented and its internal structures change for each version of Android without backward compatibility. It means that OAT files produced on Android 6.0.1 can only be used on this version.
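Because the layout depends on the version, a sensible first step when handling raw OAT data is to read the version from the header. The following is a minimal sketch, assuming the data pointed to by oatdata starts with the 4-byte magic "oat\n" followed by a version encoded as three ASCII digits and a NUL byte. The oat_version helper and the synthetic header are hypothetical and only serve as illustration.

```python
def oat_version(oatdata: bytes) -> int:
    """Return the OAT version stored right after the magic."""
    if oatdata[:4] != b"oat\n":
        raise ValueError("not an OAT header")
    # Assumed layout: three ASCII digits followed by a NUL byte, e.g. b"064\x00"
    return int(oatdata[4:7].decode("ascii"))


# Synthetic header used for illustration only (not taken from a real file)
header = b"oat\n064\x00" + bytes(8)
print(oat_version(header))  # prints 64
```

In practice the bytes would come from the content located at the oatdata symbol rather than from a hand-built buffer.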
In the Android framework, the dex2oat executable is responsible for converting and optimizing the APK's DEX files into OAT files. This executable is located in the /system/bin/ directory and its output can be observed through logcat:
$ adb logcat -s "dex2oat:I"
...
05-04 10:16:37.218 1987 1987 I dex2oat : /system/bin/dex2oat --compiler-filter=speed --dex-file=/data/user/0/com.google.android.gms/snet/installed/snet.jar --oat-file=/data/user/0/com.google.android.gms/snet/dalvik-cache/snet.dex
05-04 10:16:37.688 1987 1998 W dex2oat : Compilation of void com.google.android.snet.Snet.enterSnetIdle(android.content.Context, android.os.Bundle) took 116.995ms
05-04 10:16:37.768 1987 1987 I dex2oat : ----------------------------------------------------
05-04 10:16:37.768 1987 1987 I dex2oat : <SS>: S T A R T I N G . . .
05-04 10:16:37.768 1987 1987 E dex2oat : <SS>: oat location is not valid /data/user/0/com.google.android.gms/snet/dalvik-cache/snet.dex
05-04 10:16:37.768 1987 1987 I dex2oat : dex2oat took 552.045ms (threads: 8) arena alloc=3MB java alloc=1150KB native alloc=8MB free=3MB
05-04 12:25:50.878 10460 10460 I dex2oat : /system/bin/dex2oat --compiler-filter=speed
...
The output above shows the transformation of the SafetyNet DEX files, located in /data/user/0/com.google.android.gms/snet/installed/snet.jar, into an OAT saved in /data/user/0/com.google.android.gms/snet/dalvik-cache/snet.dex.
One can notice that the extension is .dex, so it should be a DEX file and not an OAT. However, if we check the type:
$ file snet.dex
snet.dex: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, stripped
We can see an ELF.
Warning
Do not trust extensions: .dex can be DEX or OAT, .odex are OAT, .oat are OAT, …
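Since extensions are unreliable, the format can instead be guessed from the first bytes of the file. The sketch below is only an illustration: the identify and identify_file helpers are hypothetical names, and the check is limited to the leading four bytes (an ELF file may or may not wrap an OAT, which would require a closer look at its symbols).

```python
# Magic bytes found at the very start of each format
MAGICS = {
    b"dex\n": "DEX",
    b"\x7fELF": "ELF (possibly an OAT wrapper)",
    b"vdex": "VDEX",
    b"art\n": "ART",
}


def identify(header: bytes) -> str:
    """Guess the format from the first four bytes of a file."""
    return MAGICS.get(header[:4], "unknown")


def identify_file(path: str) -> str:
    """Same as identify(), but reads the bytes from disk."""
    with open(path, "rb") as f:
        return identify(f.read(4))


print(identify(b"\x7fELF\x02\x01\x01"))  # header of an ELF such as snet.dex
```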
The process of converting Java sources into the OAT can be simplified with the following diagram:
If we analyze applications from the Google Play Store, we usually have the classes.dex file(s) in the APK. As this file contains the Dalvik bytecode, most tools rely on it to perform their analysis (decompilation, static analysis, …).
However, when analyzing vendor firmware (or ROMs), these DEX files may be missing. For example, if we are interested in com.android.settings from Samsung, the application is associated with the /system/priv-app/SecSettings2 directory, which has the following structure:
$ tree system/priv-app/SecSettings2
├── oat
│ └── arm64
│ └── SecSettings2.odex
└── SecSettings2.apk
2 directories, 2 files
Looking at the files in SecSettings2.apk, we can’t find any .dex files:
$ unzip -l ./SecSettings2.apk|grep -c "classes.dex"
0
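The same check can be scripted with Python's standard zipfile module, which is convenient when scanning a whole firmware image. This is a small sketch: dex_entries is a hypothetical helper, and the archive built in memory merely stands in for an APK that ships without DEX files.

```python
import io
import zipfile


def dex_entries(apk):
    """List the classes*.dex entries of an APK (path or file-like object)."""
    with zipfile.ZipFile(apk) as zf:
        return [name for name in zf.namelist()
                if name.startswith("classes") and name.endswith(".dex")]


# Synthetic archive without any DEX file, standing in for a pre-optimized APK
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("AndroidManifest.xml", b"")

print(dex_entries(buf))  # prints []
```

An empty list suggests that the bytecode has to be recovered from the neighbouring OAT (or VDEX) file.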
Next to SecSettings2.apk, we find SecSettings2.odex, which is the OAT file resulting from the optimization of the missing DEX file. As ROM developers control the Android version and the target architecture, they only have to provide the OAT file.
They can also use this “feature” to avoid analysis and reverse engineering of the application. As the Dalvik bytecode is located in the DEX file, without this file the analysis is quite limited.
Thankfully, there is a copy of the original DEX within the OAT! Actually, it’s not an exact copy, as dex2oat replaces some Dalvik instructions (like invoke-virtual) with optimized ones [1], but starting from Android N we can also recover the original instructions.
Prior to Android Oreo (8.0.0), DEX files were embedded in the OAT itself. Since Oreo, the transformation performed by dex2oat generates two files:
classes.odex: OAT containing native code
classes.vdex: VDEX file containing copy of original DEX files
The DEX files originally located in the OAT have been moved to a new file with a new format: the VDEX format. This new format is completely different from OAT; in particular, it is not an ELF.
In the same way as OAT format, VDEX internal structures change for each version of Android without backward compatibility.
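To make the split concrete, the sketch below picks the file expected to hold the DEX copy for a given OAT. It rests on two assumptions that should be checked against real firmware: that OAT version 124 (Android 8.0.0) is the first one relying on VDEX, and that the VDEX file sits next to the .odex with the same base name. The dex_container helper is hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: OAT v124 (Android 8.0.0, Oreo) is the first version using VDEX
FIRST_VDEX_OAT_VERSION = 124


def dex_container(odex_path: str, oat_version: int) -> str:
    """Return the file expected to hold the DEX copy for the given OAT."""
    if oat_version >= FIRST_VDEX_OAT_VERSION:
        # Assumption: the VDEX is a sibling of the ODEX with the same stem
        return str(PurePosixPath(odex_path).with_suffix(".vdex"))
    return odex_path


print(dex_container("oat/arm64/SecSettings2.odex", 124))
print(dex_container("oat/arm64/SecSettings2.odex", 64))
```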
There are also tools [4] [5] [6] to extract DEX from OAT/VDEX files, but the extraction [3] is limited either to OAT [4] or to VDEX [5]. With LIEF we aim to provide a single framework to deal with these formats.
As explained in the previous part, the internal structures of these formats change with each version of Android. LIEF provides an abstraction over these modifications, so the user can deal with OAT or VDEX files without caring about the underlying version.
It currently supports OAT files from Android 6.0 Marshmallow (OAT v64) to Android 8.1.0 Oreo (OAT v131).
The OAT version is available through the lief.OAT.version() function:
>>> import lief
>>> lief.OAT.version("classes.odex") # From Android 6
64
>>> lief.OAT.version("classes.odex") # From Android 7
88
One can also access the associated Android version by using lief.OAT.android_version():
>>> lief.OAT.android_version(64)
ANDROID_VERSIONS.VERSION_601
>>> lief.OAT.android_version(124)
ANDROID_VERSIONS.VERSION_800
>>> lief.Android.code_name(lief.Android.ANDROID_VERSIONS.VERSION_800)
'Oreo'
>>> lief.Android.version_string(lief.Android.ANDROID_VERSIONS.VERSION_800)
"8.0.0"
To express the fact that OAT files are first and foremost ELF files, the lief.OAT.Binary class extends lief.ELF.Binary:
>>> import lief
>>> oat = lief.parse("classes.odex")
>>> type(oat)
_pylief.OAT.Binary
>>> isinstance(oat, lief.ELF.Binary)
True
Thus the same ELF API is available (adding sections, modifying dynamic entries, etc.), and the lief.OAT.Binary object adds the following methods:
OAT binary representation
Return the abstract representation of the current binary (lief.Binary)
Overloaded function.
add(self, arg: lief._lief.ELF.DynamicEntry, /) -> lief._lief.ELF.DynamicEntry
dynamic_entry
add(self, section: lief._lief.ELF.Section, loaded: bool = True) -> lief._lief.ELF.Section
Add the given Section to the binary. If the section does not aim at being loaded in memory, the loaded parameter has to be set to False (default: True)
add(self, segment: lief._lief.ELF.Segment, base: int = 0) -> lief._lief.ELF.Segment
Add a new Segment in the binary
add(self, note: lief._lief.ELF.Note) -> lief._lief.ELF.Note
Add a new Note in the binary
Add a new dynamic relocation.
We consider a dynamic relocation as a relocation which is not plt-related.
Add a dynamic Symbol to the binary
The function also takes an optional lief.ELF.SymbolVersion
Create a symbol for the function at the given address and create an export
Add a library with the given name as dependency
Add relocation for object file (.o)
The first parameter is the section to add while the second parameter is the Section associated with the relocation.
If there is an error, this function returns a nullptr. Otherwise, it returns the relocation added.
Add a .plt.got relocation. This kind of relocation is usually associated with a PLT stub that aims at resolving the underlying symbol.
Add a static Symbol to the binary
Return an iterator over Class
The concrete representation of the binary. Basically, this property casts a lief.Binary into a lief.PE.Binary, lief.ELF.Binary or lief.MachO.Binary.
See also: lief.Binary.abstract
Constructor functions that are called prior to any other functions
Return an iterator over File
List of the binary destructors (typically, the functions located in the .fini_array)
Return an iterator to DynamicEntry entries as a list
Return an iterator over dynamic Relocation
Return an iterator to dynamic Symbol
Binary’s entrypoint
Return the last offset used by the ELF binary according to both, the sections table and the segments table.
Overloaded function.
export_symbol(self, symbol: lief._lief.ELF.Symbol) -> lief._lief.ELF.Symbol
Export the given symbol and create an entry if it doesn’t exist
export_symbol(self, symbol_name: str, value: int = 0) -> lief._lief.ELF.Symbol
Export the symbol with the given name and create an entry if it doesn’t exist
Return the binary’s exported Function
Return dynamic Symbol which are exported
Overloaded function.
extend(self, segment: lief._lief.ELF.Segment, size: int) -> lief._lief.ELF.Segment
Extend the given Segment by the given size
extend(self, segment: lief._lief.ELF.Section, size: int) -> lief._lief.ELF.Section
Extend the given Section by the given size
File format EXE_FORMATS of the underlying binary.
List of the functions found in the binary
Overloaded function.
get(self, tag: lief._lief.ELF.DYNAMIC_TAGS) -> lief._lief.ELF.DynamicEntry
Return the first binary’s DynamicEntry from the given DYNAMIC_TAGS. It returns None if the dynamic entry can’t be found.
get(self, type: lief._lief.ELF.SEGMENT_TYPES) -> lief._lief.ELF.Segment
Return the first binary’s Segment from the given SEGMENT_TYPES. It returns None if the segment can’t be found.
get(self, type: lief._lief.ELF.NOTE_TYPES) -> lief._lief.ELF.Note
Return the first binary’s Note from the given NOTE_TYPES. It returns None if the note can’t be found.
get(self, type: lief._lief.ELF.SECTION_TYPES) -> lief._lief.ELF.Section
Return the first binary’s Section from the given ELF_SECTION_TYPES. It returns None if the section can’t be found.
Overloaded function.
get_class(self, class_name: str) -> lief._lief.OAT.Class
Return the Class from its name
get_class(self, class_index: int) -> lief._lief.OAT.Class
Return the Class from its index
Return the content located at the provided virtual address. The virtual address is specified in the first argument and size to read (in bytes) in the second.
If the underlying binary is a PE, one can specify if the virtual address is a RVA or a VA. By default, it is set to AUTO.
Get the dynamic symbol from the given name.
It returns None if it can’t be found.
Return the address of the given function name
Return the DynamicEntryLibrary with the given name
It returns None if the library can’t be found.
Overloaded function.
get_relocation(self, symbol_name: str) -> lief._lief.ELF.Relocation
Return the Relocation associated with the given symbol name
get_relocation(self, symbol: lief._lief.ELF.Symbol) -> lief._lief.ELF.Relocation
Return the Relocation associated with the given Symbol
get_relocation(self, address: int) -> lief._lief.ELF.Relocation
Return the Relocation associated with the given address
Return the Section with the given name
It returns None if the section can’t be found.
Get the static symbol from the given name.
It returns None if it can’t be found.
Return list of strings used in the current ELF file with a minimal size given in first parameter (Default: 5). It looks for strings in the .rodata section
Return the Symbol from the given name.
If the symbol can’t be found, it returns None.
Return the GnuHash object
Hashes are used by the loader to speed up symbol resolution (GNU version)
Overloaded function.
has(self, tag: lief._lief.ELF.DYNAMIC_TAGS) -> bool
Check if a DynamicEntry with the given DYNAMIC_TAGS exists
has(self, type: lief._lief.ELF.SEGMENT_TYPES) -> bool
Check if a Segment of type (SEGMENT_TYPES) exists
has(self, type: lief._lief.ELF.NOTE_TYPES) -> bool
Check if a Note of type (NOTE_TYPES) exists
has(self, type: lief._lief.ELF.SECTION_TYPES) -> bool
Check if a Section of type (SECTION_TYPES) exists
Check if the class with the given name is present in the current OAT binary
Check if the symbol with the given name exists in the dynamic symbol table
Check if the binary uses a loader (also named linker or interpreter)
Check if the given library name exists in the current binary
True if the binary contains notes
Check if the binary has NX protection (non executable stack)
True if data are appended to the end of the binary
Check if a Section with the given name exists in the binary
Check if a Section that encompasses the given offset exists
Check if a Section that encompasses the given virtual address exists
Check if the symbol with the given name exists in the static symbol table
Check if a Symbol with the given name exists
Return the OAT Header
Return the program image base (e.g. 0x400000)
Return the binary’s imported Function (name)
Return dynamic Symbol which are imported
ELF interpreter (loader) if any (e.g. /lib64/ld-linux-x86-64.so.2)
Check if the binary has been compiled with -fpie -pie flags
To do so we check if there is a PT_INTERP segment and if the binary type is ET_DYN (Shared object)
Iterator over lief._lief.OAT.Class
Iterator over lief._lief.DEX.File
Iterator over lief._lief.ELF.Symbol
Iterator over lief._lief.ELF.DynamicEntry
Iterator over lief._lief.ELF.Relocation
Iterator over lief._lief.ELF.Symbol
Iterator over lief._lief.OAT.Method
Iterator over lief._lief.ELF.Note
Iterator over lief._lief.OAT.DexFile
Iterator over lief._lief.ELF.Relocation
Iterator over lief._lief.ELF.Section
Iterator over lief._lief.ELF.Segment
Iterator over lief._lief.ELF.Symbol
Iterator over lief._lief.ELF.SymbolVersion
Iterator over lief._lief.ELF.SymbolVersionDefinition
Iterator over lief._lief.ELF.SymbolVersionRequirement
Return the last offset used in binary according to sections table
Return the last offset used in binary according to segments table
Return binary’s imported libraries (name)
Return an iterator over Method
Return the next virtual address available
Return an iterator over the Note entries
Return an iterator over DexFile
Return an iterator over object Relocation
Convert an offset into a virtual address.
Overlay data that are not a part of the ELF format
Overloaded function.
patch_address(self, address: int, patch_value: list[int], va_type: lief._lief.Binary.VA_TYPES = lief._lief.VA_TYPES.AUTO) -> None
patch_address(self, address: int, patch_value: int, size: int = 8, va_type: lief._lief.Binary.VA_TYPES = lief._lief.VA_TYPES.AUTO) -> None
Overloaded function.
patch_pltgot(self, symbol_name: str, address: int) -> None
Patch the imported symbol’s name with the address
patch_pltgot(self, symbol: lief._lief.ELF.Symbol, address: int) -> None
Patch the imported Symbol with the address
Apply the given permutation on the dynamic symbols table
Return an iterator over PLT/GOT Relocation
Force relocating the segments table in a specific way (see: PHDR_RELOC).
This function can be used to enforce a specific relocation of the segments table. Upon successful relocation, the function returns the offset of the relocated segments table. Otherwise, if the function fails, it returns 0
Return an iterator over all Relocation
Overloaded function.
remove(self, dynamic_entry: lief._lief.ELF.DynamicEntry) -> None
Remove the given DynamicEntry from the dynamic table
remove(self, tag: lief._lief.ELF.DYNAMIC_TAGS) -> None
Remove all the DynamicEntry with the given DYNAMIC_TAGS
remove(self, section: lief._lief.ELF.Section, clear: bool = False) -> None
Remove the given Section. The clear parameter specifies whether or not we must fill its content with 0 before removing
remove(self, note: lief._lief.ELF.Note) -> None
Remove the given Note
remove(self, type: lief._lief.ELF.NOTE_TYPES) -> None
Remove all the Note with the given NOTE_TYPES
Overloaded function.
remove_dynamic_symbol(self, arg: lief._lief.ELF.Symbol, /) -> None
Remove the given Symbol from the .dynsym section
remove_dynamic_symbol(self, arg: str, /) -> None
Remove the Symbol with the name given in parameter from the .dynsym section
Remove the given library
Remove the section with the given name
Remove the given Symbol from the .symtab section
Replace the Segment given in 2nd parameter with the Segment given in the first parameter and return the updated segment.
Warning
The original_segment is no longer valid after this function
Return the Section which encompasses the given offset. It returns None if a section can’t be found.
If skip_nobits is set (which is the case by default), this function won’t consider sections for which the type is SHT_NOBITS (like .bss, .tbss, ...)
Return the Section which encompasses the given virtual address. It returns None if a section can’t be found.
If skip_nobits is set (which is the case by default), this function won’t consider sections for which the type is SHT_NOBITS (like .bss, .tbss, ...)
Return an iterator over binary’s Section
Return the Segment which encompasses the given offset. It returns None if a segment can’t be found.
Return the Segment which encompasses the given virtual address. It returns None if a segment can’t be found.
Return an iterator to binary’s Segment
Return an iterator to static Symbol
Return list of strings used in the current ELF file. Basically this function looks for strings in the .rodata section
Strip the binary
Return an iterator over both static and dynamic Symbol
Return an iterator SymbolVersion
Return an iterator to SymbolVersionDefinition
Return an iterator to SymbolVersionRequirement
Return the SysvHash object
Hashes are used by the loader to speed up symbol resolution (SYSV version)
Return the binary’s ELF_CLASS
True if GNU hash is used
True if SYSV hash is used
Convert the virtual address to a file offset
Return the size of the mapped binary
Overloaded function.
write(self, output: str) -> None
Rebuild the binary and write it in a file
write(self, output: str, config: lief._lief.ELF.Builder.config_t) -> None
Rebuild the binary with the given configuration and write it in a file
Return all virtual addresses that use the address given in parameter
If the given OAT targets Android Marshmallow or Nougat (6 or 7), then DEX files can be retrieved with the lief.OAT.Binary.dex_files attribute:
>>> len(oat.dex_files) # > 1 if multi-dex
1
>>> dex = oat.dex_files[0]
>>> dex.save("/tmp/classes.dex")
From the code above, the lief.DEX.File has been extracted to /tmp/classes.dex (with de-optimization).
If the given OAT targets Android Oreo or above, then the extraction is done using the VDEX file. The lief.OAT.parse() function accepts either an OAT file alone or an OAT file together with a VDEX file. When the VDEX file is provided as the second parameter, the lief.OAT.Binary object has the same functionality as for pre-Oreo OAT files.
If the VDEX file is not provided, then the lief.OAT.Binary object will have limited information:
# Without VDEX file
>>> oat_oreo = lief.parse("KeyChain.odex")
>>> len(oat_oreo.dex_files)
0
>>> len(oat_oreo.classes)
0
>>> len(oat_oreo.oat_dex_files)
1
>>> oat_dex_file = oat_oreo.oat_dex_files[0]
>>> print(oat_dex_file)
/system/app/KeyChain/KeyChain.apk - (Checksum: 0x206c8ab1)
# With VDEX file
>>> oat_oreo = lief.OAT.parse("KeyChain.odex", "KeyChain.vdex")
>>> len(oat_oreo.dex_files)
1
>>> len(oat_oreo.classes)
17
>>> oat_oreo.dex_files[0].save("/tmp/classes.dex")
We can also use LIEF’s VDEX module directly:
>>> vdex = lief.VDEX.parse("KeyChain.vdex")
As the VDEX format is completely different from OAT, ELF, PE and Mach-O, the VDEX parser creates a lief.VDEX.File object and not a Binary. We can also extract DEX files with the lief.VDEX.File.dex_files attribute:
>>> len(vdex.dex_files)
1
>>> vdex.dex_files[0].save("/tmp/KeyChain.dex") # With de-optimization
The previous part was about the OAT/VDEX formats and how to access the underlying DEX. This part introduces the main API of the lief.DEX.File object.
The LIEF DEX module gives access to information about the Java code such as strings, class names, Dalvik bytecode, …
Note
As the LIEF project focuses only on formats, there is no Dalvik disassembler in the DEX module.
The main API for a DEX file is in the lief.DEX.File object. This object can be obtained in several ways:
>>> oat = lief.parse("SecSettings2.odex")
>>> type(oat.dex_files[0])
_pylief.DEX.File
>>> vdex = lief.VDEX.parse("SecSettings2.vdex")
>>> type(vdex.dex_files[0])
_pylief.DEX.File
>>> dex = lief.DEX.parse("classes.dex")
>>> type(dex)
_pylief.DEX.File
Once created, we can access the strings with the lief.DEX.File.strings attribute:
>>> len(dex.strings)
23529
>>> for s in dex.strings:
...     if "http" in s:
...         print(s)
https://analytics.mopub.com/i/jot/exchange_client_event
https://app-measurement.com/a
https://mobilecrashreporting.googleapis.com/v1/crashes:batchCreate?key=
https://pagead2.googlesyndication.com/pagead/gen_204?id=gmob-apps
https://plus.google.com/
https://ssl.google-analytics.com
https://support.google.com/dfp_premium/answer/7160685#push
https://www.google.com
...
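Once the strings are available, a natural next step is to summarize which hosts an application may contact. The snippet below is a sketch using only the standard library; contacted_hosts is a hypothetical helper that accepts any iterable of strings (such as dex.strings), and the sample list simply reuses a few of the URLs shown above.

```python
from urllib.parse import urlsplit


def contacted_hosts(strings):
    """Return the sorted set of host names found in http(s) URL strings."""
    hosts = set()
    for s in strings:
        if not s.startswith(("http://", "https://")):
            continue
        try:
            host = urlsplit(s).hostname
        except ValueError:
            # Malformed URL-like string: skip it
            continue
        if host:
            hosts.add(host)
    return sorted(hosts)


sample = [
    "https://plus.google.com/",
    "https://ssl.google-analytics.com",
    "https://app-measurement.com/a",
    "onCreate",
]
print(contacted_hosts(sample))
```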
Similarly, classes and methods are available through the lief.DEX.File.classes and lief.DEX.File.methods attributes:
for cls in dex.classes:
    if cls.source_filename:
        print(cls)
com.avast.android.sdk.antitheft.internal.protection.wipe.a - CalendarWiper.java - 3 Methods
com.avast.android.account.internal.identity.a - AvastIdentityProvider.java - 17 Methods
com.avast.android.account.internal.identity.d - FacebookIdentityProvider.java - 19 Methods
com.avast.android.lib.wifiscanner.internal.b$a - WifiScannerComponentFactory.java - 1 Methods
In the DEX file format, classes have a special attribute that records the original source filename: source_file_idx. Some obfuscators mangle class names but keep this attribute! Since Java source filenames are tied to class names, we can easily recover the original name using:
for cls in dex.classes:
    if cls.source_filename:
        print(cls.pretty_name + ": ---> " + cls.source_filename)
com.avast.android.sdk.antitheft.internal.protection.wipe.a: ---> CalendarWiper.java
com.avast.android.account.internal.identity.a: ---> AvastIdentityProvider.java
com.avast.android.account.internal.identity.d: ---> FacebookIdentityProvider.java
com.avast.android.lib.wifiscanner.internal.b$a: ---> WifiScannerComponentFactory.java
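The renaming itself can be automated. The sketch below is a heuristic rather than a guaranteed recovery: it assumes the usual Java convention that a source file is named after its top-level class, so only the outer class name is restored and inner class names (after the $) stay obfuscated. The recover_class_name helper is hypothetical.

```python
def recover_class_name(pretty_name: str, source_filename: str) -> str:
    """Rebuild a likely class name from an obfuscated name and its source file."""
    package, _, simple = pretty_name.rpartition(".")
    # Keep any inner-class suffix ("$a") untouched: only the outer name is known
    _, sep, inner = simple.partition("$")
    stem = source_filename.rsplit(".", 1)[0]
    prefix = package + "." if package else ""
    return prefix + stem + sep + inner


print(recover_class_name(
    "com.avast.android.account.internal.identity.a",
    "AvastIdentityProvider.java",
))
# prints com.avast.android.account.internal.identity.AvastIdentityProvider
```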
If we are interested in DEX methods, they are represented by the lief.DEX.Method object and the raw Dalvik bytecode is accessible through the lief.DEX.Method.bytecode attribute.
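Even without a disassembler, the raw bytes can be inspected by hand. As a minimal sketch, the helper below names the opcode of the first instruction only, relying on the fact that Dalvik instructions are built from 16-bit little-endian code units whose low byte is the opcode. The first_opcode helper and its deliberately tiny opcode table are illustrative assumptions, and the input is assumed to be a bytes object or a sequence of integers.

```python
# A few opcodes from the Dalvik bytecode specification (far from complete)
OPCODES = {
    0x00: "nop",
    0x0E: "return-void",
    0x12: "const/4",
    0x6E: "invoke-virtual",
}


def first_opcode(bytecode) -> str:
    """Name the opcode of the first instruction in a raw bytecode buffer."""
    raw = bytes(bytecode)
    if not raw:
        return "<empty>"
    # The low byte of the leading 16-bit code unit comes first in the buffer
    return OPCODES.get(raw[0], f"unknown (0x{raw[0]:02x})")


print(first_opcode([0x0E, 0x00]))  # prints return-void
```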
ART is the name of the Android Runtime but it’s also a format! This format is used for optimization purposes by the Android framework.
As discussed previously, Android has its own implementation of the Java virtual machine, based on the Dalvik bytecode. This JVM is implemented in C++ and Java primitives (java.lang.String, java.lang.Object, etc.) are mirrored by C++ objects:
java.lang.Class: art::mirror::Class
java.lang.String: art::mirror::String
java.lang.reflect.Method: art::mirror::Method
…
When a new Java class is instantiated, a mirrored C++ object is created (memory allocation, constructor calls, …) and the JVM holds a reference to this C++ object. To speed up the boot process and to avoid instantiating well-known classes [2] at each boot, Android uses the ART format to store instances of C++ objects. To simplify, it can be seen as a heap dump of C++ objects.
In the same way as OAT and VDEX, the internal structures of this format change for each version of Android.
LIEF 0.9 has very basic support for this format and only exposes the ART header (lief.ART.Header). The main API is available in the lief.ART.File object.
art = lief.ART.parse("boot.art")
print(art.header)
Version: 46
Image Begin: 0x70000000
Image Size: 0x238ac8
Checksum: 0x997c0fb0
OAT File Begin: 0x70a5b000
OAT File End: 0x71272000
OAT Data Begin: 0x70a5c000
OAT Data End: 0x7126df70
Patch Delta: 0
Pointer Size: 8
Compile pic: true
Number of sections: 10
Number of methods: 7
Boot Image Begin: 0
Boot Image Size: 0
Boot OAT Begin: 0
Boot OAT Size: 0
Storage Mode: UNCOMPRESSED
Data Size: 0x2389f0
LIEF 0.9 is read-only for these formats, but future versions should make it possible to modify them (add methods, change names, patch checksums, …).
Enjoy!
Notes
API