thesis/2_background/4_2_classloader.typ
Jean-Marie 'Histausse' Mineau a3fcff0c19
2025-09-30 23:40:43 +02:00


#import "../lib.typ": SDK, API, DEX, pb2, pb2-text, etal, APIs, ie
#import "../lib.typ": todo
=== Android Class Loading <sec:bg-soa-cl>
#pb2-text
This subsection is mainly dedicated to class loading in Java and Android.
Because we focus on the _default_ class loading algorithm, we will not focus on dynamic code loading (#ie loading of additional bytecode while the application is already running).
However, even without dynamic code loading, class loading is used to load classes other than those of the application itself: the platform classes.
In the second part of this subsection, we will look at the work that has been done related to those platform classes.
==== Class Loading <sec:bg-cl>
Class loading is a fundamental element of Java; it defines which classes are loaded from where.
In Android, this is often associated with dynamic code loading, as `ClassLoader` objects are used to load code at runtime.
However, class loading is also involved in loading platform classes and classes from the application itself, and thus requires some attention when analysing an application.
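
To make the notion of "which classes are loaded from where" concrete, the following minimal sketch walks the delegation chain of a class loader using only the standard `java.lang.ClassLoader` API.
The class name `LoaderChain` and the helper `delegationChain` are hypothetical names introduced for this illustration.
The sketch targets a desktop JVM, where a platform class such as `String` is defined by the bootstrap loader and `getClassLoader()` returns `null` for it; the concrete loader classes and the way the boot loader is exposed differ on Android.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class LoaderChain {
    /**
     * Returns the names of the loaders in the delegation chain of a class,
     * starting at the loader that defined the class and ending with a
     * "<bootstrap>" marker (a null loader stands for the bootstrap loader
     * on a desktop JVM).
     */
    static List<String> delegationChain(Class<?> cls) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = cls.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("<bootstrap>");
        return chain;
    }

    public static void main(String[] args) {
        // A platform class: no application-level loader is involved.
        System.out.println("String:      " + delegationChain(String.class));
        // A class of the program itself: defined by an application-level loader.
        System.out.println("LoaderChain: " + delegationChain(LoaderChain.class));
    }
}
```

The two calls in `main` contrast a platform class with a class of the program itself: both are resolved through class loading, even though no code is loaded dynamically.
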
Class loading mechanisms have been studied in the general context of the Java language.
Gong~@gong_secure_1998 describes the JDK 1.2 class loading architecture and capabilities.
One of the main advantages of class loading is the type safety property that prevents type spoofing.
As explained by Liang and Bracha~@liang_dynamic_1998, by capturing events at runtime (new loaders, new classes) and maintaining constraints on the multiple loaders and their delegation hierarchy, the authors avoid confusion when a spoofed class is loaded.
This behaviour is now implemented in modern Java virtual machines.
Later, Tozawa and Hagiya~@tozawa_formalization_2002 proposed a formalisation of the Java Virtual Machine supporting dynamic class loading in order to ensure type safety.
Those works ensure strong safety for the Java Virtual Machine, in particular when linking new classes at runtime.
Although Android has a similar mechanism, its implementation is not shared with Oracle's JVM.
Additionally, our problem statement does not focus on spoofing classes at runtime, but on the confusion that arises when a reverse engineer uses a static analyser to understand the code loading process offline.
Contributions about Android class loading focus on using the capabilities of class loading to extend Android features or to prevent reverse engineering of Android applications.
For instance, Zhou #etal~@zhou_dynamic_2022 extend the class loading mechanism of Android to support regular Java bytecode, and Kriz and Maly~@kriz_provisioning_2015 propose a new class loader to automatically load modules of an application without user interaction.
Regarding reverse engineering, class loading mechanisms are frequently used by packers for hiding all or parts of the code of an application~@Duan2018.
For example, packers exploit the class loading capability of Android to load new code.
They also combine the loading with code generation from ciphered assets or code modification from native code calls~@liao2016automated to make the code harder to recover.
Because parts of the original code are only available at runtime, deobfuscation approaches propose techniques that track #DEX structures as they are manipulated by the application~@zhang2015dexhunter @xue2017adaptive @wong2018tackling.
Those contributions interact with the class loading mechanism of Android to collect the #DEX structures at the right moment.
Some classes, however, are not loaded from the application, nor dynamically loaded by the application.
Those classes are platform classes, and apart from dynamically loaded code, they are the main reason class loading is needed by Android.
We will now look at the literature related to them.
==== Platform Classes <sec:bg-soa-platform>
Platform classes are divided between #SDK classes that are documented, and the other classes, often referred to as hidden #APIs.
#SDK classes are clearly listed and documented by Google, so they do not require as much attention as hidden #APIs.
As we said earlier, hidden #APIs are undocumented methods that can be used by an application, thus making them a potential blind spot when analysing an application.
However, little research has been done on the subject.
Li #etal conducted an empirical study of the usage and evolution of hidden #APIs~@li_accessing_2016.
They found that hidden #APIs are added and removed in every release of Android, and that they are used by both benign and malicious applications.
More recently, He #etal~@he_systematic_2023 conducted a systematic study of hidden service #APIs related to security.
They studied how hidden #APIs can be used to bypass Android security restrictions and found that, although Google's countermeasures are effective, they need to be implemented inside the system services rather than in the hidden #APIs themselves.
The reason is the lack of in-app privilege isolation: the framework code runs in the same process as the user code, meaning any restriction in the framework can be bypassed by the user.
Unfortunately, those two contributions do not further explore the consequences of the use of hidden #APIs for a reverse engineer.
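
As an illustration of why such methods can be a blind spot, the sketch below resolves a method purely from strings through the standard Java reflection API, which is one way an application can reach a method that is absent from the documented SDK.
The class `HiddenApiLookup` and its `resolve` helper are hypothetical names introduced for this example.
The target `android.app.ActivityThread.currentApplication()` is used only as an example of a framework method outside the documented SDK, and whether the lookup succeeds on a device is an assumption that depends on the Android version and its non-SDK interface restrictions.
On a desktop JVM the class does not exist, so the lookup simply returns an empty result.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical helper, for illustration only.
class HiddenApiLookup {
    /**
     * Tries to resolve a method from its class name and method name.
     * Returns an empty Optional when the class or method cannot be found
     * or when the runtime refuses the lookup.
     */
    static Optional<Method> resolve(String className, String methodName, Class<?>... params) {
        try {
            Class<?> cls = Class.forName(className);
            return Optional.of(cls.getDeclaredMethod(methodName, params));
        } catch (ReflectiveOperationException | LinkageError | SecurityException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Example target only: a framework method outside the documented SDK.
        Optional<Method> hidden = resolve("android.app.ActivityThread", "currentApplication");
        System.out.println("hidden method resolved: " + hidden.isPresent());

        // A documented platform method resolves the same way.
        Optional<Method> documented = resolve("java.lang.String", "length");
        System.out.println("documented method resolved: " + documented.isPresent());
    }
}
```

Since the target class is provided by the platform and not by the application, an analysis that only considers the bytecode of the application has no definition to associate with such a call.
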
#v(2em)
In conclusion, class loading mechanisms have been studied carefully in the context of the Java language.
However, the same cannot be said about Android, whose implementation diverges significantly from classic Java Virtual Machines.
Most work done on Android focuses on extending Android capabilities using class loading, or on dynamically analysing the code loading operations of an application.
In @sec:cl, we will model the behaviour of Android when loading classes used by an application that does not use dynamic code loading, and check whether this behaviour matches that of common analysis tools.
We will also take some time to check whether the state of the art related to hidden #APIs is up to date with current Android versions.