thesis/2_background/4_2_classloader.typ
Jean-Marie Mineau 471a176683
2025-09-24 17:19:23 +02:00


#import "../lib.typ": SDK, API, DEX, pb2, pb2-text, etal
#import "../lib.typ": todo
== Android Class Loading <sec:bg-soa-cl>
#todo[Refactor]
=== Platform Classes <sec:bg-soa-platform>
As we said earlier, hidden #API are undocumented methods that can be used by an application, thus making them a potential blind spot when analysing an application.
However, little research has been done on the subject.
Li #etal did an empirical study of the usage and evolution of hidden #API~@li_accessing_2016.
They found that hidden #API are added and removed in every release of Android, and that they are used both by benign and malicious applications.
More recently, He #etal~@he_systematic_2023 did a systematic study of hidden service #API related to security.
They studied how hidden #API can be used to bypass Android security restrictions and found that, although Google's countermeasures are effective, they need to be implemented inside the system services and not in the hidden #API themselves.
This is due to the lack of in-app privilege isolation: the framework code runs in the same process as the user code, so any restriction enforced in the framework can be bypassed by the user.
Unfortunately, these two contributions do not further explore the consequences of the use of hidden #API for a reverse engineer.
=== Class Loading <sec:bg-cl>
Another rarely considered element of Android is its class loading mechanism.
Class loading is a fundamental element of Java: it defines which classes are loaded, and from where.
In Android, it is often associated with dynamic code loading, as `ClassLoader` objects are used to load code at runtime.
However, class loading is also involved in loading platform classes and classes from the application itself, and thus requires some attention when analysing an application.
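
To make the distinction between platform and application classes concrete, the following minimal sketch asks a class for the loader that defined it and walks the delegation chain of that loader. It is written for a standard JVM, where the bootstrap loader is represented by `null`. The class name `LoaderInspection` and its helper methods are hypothetical, and the loader names printed on an Android runtime are expected to differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (standard JVM assumed).
final class LoaderInspection {

    // Names the loader that defined `cls`.
    // On a standard JVM, getClassLoader() returns null for classes
    // defined by the bootstrap loader (e.g. java.lang.String).
    static String definingLoader(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    // Lists the delegation chain from `loader` up to the bootstrap loader.
    static List<String> delegationChain(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            chain.add(current.getClass().getName());
        }
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("String defined by: " + definingLoader(String.class));
        System.out.println("LoaderInspection defined by: " + definingLoader(LoaderInspection.class));
        System.out.println("Delegation chain: "
                + delegationChain(LoaderInspection.class.getClassLoader()));
    }
}
```

Even in this small program, two different loaders are involved, although no code is loaded dynamically.
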
Class loading mechanisms have been studied in the general context of the Java language.
Gong~@gong_secure_1998 describes the JDK 1.2 class loading architecture and capabilities.
One of the main advantages of class loading is the type safety property that prevents type spoofing.
As explained by Liang and Bracha~@liang_dynamic_1998, by capturing events at runtime (new loaders, new classes) and maintaining constraints on the multiple loaders and their delegation hierarchy, the virtual machine can avoid confusion when a spoofed class is loaded.
This behavior is now implemented in modern Java virtual machines.
Later, Tozawa and Hagiya~@tozawa_formalization_2002 proposed a formalization of the Java Virtual Machine supporting dynamic class loading in order to ensure type safety.
Those works ensure strong safety for the Java Virtual Machine, in particular when linking new classes at runtime.
Although Android has a similar mechanism, its implementation is not shared with Oracle's JVM.
Additionally, in this paper, we do not focus on spoofing classes at runtime, but on the confusion that arises when a reverse engineer uses a static analyser to understand the code loading process offline.
Contributions about Android class loading focus on using the capabilities of class loading to extend Android features or to prevent reverse engineering of Android applications.
For instance, Zhou #etal~@zhou_dynamic_2022 extend the class loading mechanism of Android to support regular Java bytecode, and Kriz and Maly~@kriz_provisioning_2015 propose a new class loader to automatically load modules of an application without user interaction.
Regarding reverse engineering, class loading mechanisms are frequently used by packers for hiding all or parts of the code of an application~@Duan2018.
For example, packers exploit the class loading capability of Android to load new code.
They also combine this loading with code generation from ciphered assets or code modification from native code calls~@liao2016automated to make the code harder to recover.
Because parts of the original code are only available at runtime, deobfuscation approaches propose techniques that track #DEX structures when they are manipulated by the application~@zhang2015dexhunter @xue2017adaptive @wong2018tackling.
Those contributions interact with the class loading mechanism of Android to collect the #DEX structures at the right moment.
Deobfuscating an application is the first problem the reverse engineer has to solve.
Nevertheless, even if the reverse engineer recovers all the classes of the code, determining which classes are actually loaded by Android remains an additional problem.
The reverse engineer may assume that what they see in the bytecode is what is loaded at runtime, whereas the system can choose an alternative implementation of a class.
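
One way such a mismatch can arise is sketched below as a toy model. It assumes a parent-first delegation policy like the one used by the standard `java.lang.ClassLoader.loadClass`, and it does not reproduce the actual Android implementation. The `ToyLoader` class, the class names and the implementation labels are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a class loader (illustration only, not Android's code):
// each loader maps class names to a label describing the implementation it holds.
final class ToyLoader {
    private final ToyLoader parent; // null for the root loader
    private final Map<String, String> ownClasses = new HashMap<>();

    ToyLoader(ToyLoader parent) {
        this.parent = parent;
    }

    // Registers an implementation of `className` in this loader.
    ToyLoader define(String className, String implementation) {
        ownClasses.put(className, implementation);
        return this;
    }

    // Parent-first lookup: the parent is asked before this loader,
    // so a class known to the parent shadows the loader's own copy.
    String resolve(String className) {
        if (parent != null) {
            String fromParent = parent.resolve(className);
            if (fromParent != null) {
                return fromParent;
            }
        }
        return ownClasses.get(className);
    }

    public static void main(String[] args) {
        ToyLoader platform = new ToyLoader(null)
                .define("com.example.Shadowed", "platform implementation");
        ToyLoader app = new ToyLoader(platform)
                .define("com.example.Shadowed", "application implementation")
                .define("com.example.AppOnly", "application implementation");

        // The application's copy of Shadowed is present but is not the one selected.
        System.out.println("Shadowed -> " + app.resolve("com.example.Shadowed"));
        System.out.println("AppOnly  -> " + app.resolve("com.example.AppOnly"));
    }
}
```

In this model, a static analyser that only reads the classes shipped with the application would examine the application's copy of `com.example.Shadowed`, while the lookup selects the copy held by the parent loader.
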
#v(2em)
Class loading mechanisms have been studied carefully in the context of the Java language.
However, the same cannot be said about Android, whose implementation differs significantly from that of classic Java Virtual Machines.
Most work done on Android focuses on extending Android capabilities using class loading, or on dynamically analysing the code loading operations of an application.
This leaves open the question of the actual default class loading behavior of Android, leading us to #pb2:
#pb2-text