thesis/4_class_loader/_removed_from_intro.typ
Jean-Marie Mineau 81f49f87d3
2025-08-19 23:27:25 +02:00


/*
When an application is built with Android Studio, its source code is compiled to Java bytecode, which is then converted to Dalvik bytecode.
The Dalvik bytecode is then packed into a zip archive with other resources, such as the application manifest, and the archive is signed.
Android Studio handles this whole process behind the scenes.
At runtime, the Dalvik bytecode is either interpreted or compiled to native code by #ART, and it is up to #ART to handle the loading of the classes.
Both behaviors can coexist within a single application, and it is up to Android to choose which parts of an application are compiled to native code.
*/
Android applications are distributed through application markets.
Market maintainers have the difficult task of discovering suspicious applications and removing those that prove to be malicious.
For such a task, some automated analysis is performed, but a manual investigation is sometimes required.
A reverser is then in charge of studying the application: they usually perform a static analysis and a dynamic analysis.
In the first phase, the reverser uses static analysis tools to access and review the code of the application.
If this first phase is not carried out accurately, for example if the reverser fails to access a critical class, they may wrongly conclude that a malicious application is safe.
Additionally, as stated by Li #etal~@Li2017 in their conclusions, such a task is complicated by dynamic code loading, reflective calls, native code, and multi-threading, which cannot be easily handled statically.
Nevertheless, even if we do not consider these aspects, determining statically how the regular class loading system of Android works is a difficult task.
Class loading occurs at runtime and is handled by the components of #ART, even when the application is partially or fully compiled ahead of time.
At the development stage, however, Android Studio handles the resolution of the different classes, which can be internal to the application.
When building, the code is linked to the standard library, i.e., the code contained in `android.jar`.
In this article, we call these classes "Development #SDK classes".
`android.jar` is not added to the application because its classes will be available at runtime in other `.jar` files.
To distinguish those classes found at runtime from Development #SDK classes, we call them #Asdkc.
When releasing the application, the build process of Android Studio can handle different versions of the #Asdk, reported in the manifest as the "#SDK versions".
Indeed, some parts of the core #Asdkc can be embedded in the application for backward compatibility: depending on the specified minimum #SDK version and target #SDK version, the code of extra #Asdkc is stored in the APK file.
As a consequence, it is common to find inside applications some classes that come from the `com.android` packages.
At runtime, each smartphone runs a single version of Android but, because the application is deployed on multiple versions of Android, it is difficult to predict which classes will be loaded from the #Asdkc and which from the APK file itself.
This complexity increases with the multi-#DEX format of recent #APK files that can contain several bytecode files.
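
As a first step toward knowing where a class may come from, a reverser can enumerate the bytecode files of an #APK.
The following is a minimal sketch, not a definitive tool: it assumes that the bytecode files sit at the root of the archive and follow the usual `classes.dex`, `classes2.dex`, ... naming scheme, and the helper names `list_dex_files` and `build_demo_apk` are ours, introduced for illustration only.

```python
import io
import re
import zipfile

# Assumption: bytecode files are stored at the archive root and are named
# classes.dex, classes2.dex, classes3.dex, ...
DEX_NAME = re.compile(r"classes(\d*)\.dex")


def list_dex_files(apk):
    """Return the root-level DEX file names of an APK, in numeric order."""
    found = []
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            match = DEX_NAME.fullmatch(name)
            if match:
                # classes.dex carries no number: treat it as index 1.
                index = int(match.group(1)) if match.group(1) else 1
                found.append((index, name))
    return [name for _, name in sorted(found)]


def build_demo_apk():
    """Build a tiny in-memory archive that stands in for a real APK."""
    buffer = io.BytesIO()
    names = [
        "classes10.dex",
        "AndroidManifest.xml",
        "classes.dex",
        "classes2.dex",
        "assets/classes.dex",  # not at the root, so it is ignored
    ]
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(list_dex_files(build_demo_apk()))
```

Sorting on the numeric suffix rather than on the file name keeps `classes10.dex` after `classes2.dex`, which matters as soon as the order of the bytecode files is used to decide between several definitions of the same class.
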
Returning to the reverser who studies a suspicious application statically, they use tools to disassemble the application~@mauthe_large-scale_2021 and track the flows of data in the bytecode.
As an example, for a spyware potentially leaking personal information, the reverser can unpack the application with Apktool and, after manually locating a method that they suspect of reading sensitive data (by reading the unpacked bytecode), they can compute with FlowDroid~@Arzt2014a whether there is a flow from this method to methods performing HTTP requests.
During these steps, the reverser faces the problem of statically resolving which classes are loaded from the APK file and which from the #Asdkc.
If they, or the tools they use, choose the wrong version of the class, they may obtain wrong conclusions about the code.
Thus, the possibility of shadowing classes could be exploited by an attacker in order to obfuscate the code.
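
To make the ambiguity concrete, the sketch below models the choice between the two origins of a class.
It rests on an assumption that we state explicitly: the class loader delegates to the platform first, so a platform class takes precedence over a class of the same name stored in the APK file.
The function names, the class names and the #SDK levels are hypothetical and only serve the illustration.

```python
def resolve_origin(class_name, apk_classes, platform_classes):
    """Model parent-first delegation (assumption): the platform wins."""
    if class_name in platform_classes:
        return "platform"
    if class_name in apk_classes:
        return "apk"
    return None  # the class cannot be resolved at all


def version_dependent_classes(apk_classes, platform_by_api):
    """Return the APK classes whose origin changes with the SDK level."""
    result = {}
    for name in sorted(apk_classes):
        origins = {
            api: resolve_origin(name, apk_classes, classes)
            for api, classes in platform_by_api.items()
        }
        # More than one distinct origin means the answer depends on the device.
        if len(set(origins.values())) > 1:
            result[name] = origins
    return result


# Hypothetical data: one class shipped in the APK also appears in the
# platform classes of the most recent SDK level only.
APK_CLASSES = {"com.example.Main", "com.android.demo.Helper"}
PLATFORM_BY_API = {
    28: {"android.app.Activity"},
    33: {"android.app.Activity", "com.android.demo.Helper"},
}

if __name__ == "__main__":
    print(version_dependent_classes(APK_CLASSES, PLATFORM_BY_API))
```

Under this model, a tool that always trusts the copy found in the APK file analyzes code that is never executed on the devices where the platform provides the class, which is exactly the situation an attacker could exploit.
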