
#import "../lib.typ": etal, paragraph, DEX
#import "X_var.typ": *

== State of the Art <sec:cl-soa>

#paragraph([Class loading])[
Class loading mechanisms have been studied in the general context of the Java language.
Gong~@gong_secure_1998 describes the JDK 1.2 class loading architecture and capabilities.
One of the main advantages of class loading is the type safety property that prevents type spoofing.
As explained by Liang and Bracha~@liang_dynamic_1998, by capturing events at runtime (new loaders, new classes) and maintaining constraints on the multiple loaders and their delegation hierarchy, the authors avoid confusion when a spoofed class is loaded.
This behavior is now implemented in modern Java virtual machines.
Later, Tozawa and Hagiya~@tozawa_formalization_2002 proposed a formalization of the Java Virtual Machine supporting dynamic class loading in order to ensure type safety.
Those works ensure strong safety for the Java Virtual Machine, in particular when linking new classes at runtime.
Although Android has a similar mechanism, its implementation is not shared with Oracle's JVM.
Additionally, in this paper, we do not focus on spoofing classes at runtime, but on the confusion that arises when a reverser relies on a static analyzer to understand the code loading process offline.
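
To make the delegation hierarchy concrete, the following Java sketch shows the parent-first lookup performed by `java.lang.ClassLoader` on a standard JVM.
It is only an illustration: the names `DelegationSketch`, `RecordingLoader` and `com.example.NotOnTheClasspath` are hypothetical, and the sketch does not model the class loaders of Android.
The child loader is consulted only once its parents have failed, so it cannot substitute its own version of a class such as `java.lang.String`.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationSketch {
    /** Child loader that records every request its parents could not satisfy. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> askedLocally = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Reached only after the whole parent chain has failed to load `name`.
            askedLocally.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns true if `name` was resolved by a parent, without consulting the child. */
    public static boolean resolvedByParent(String name) {
        RecordingLoader child = new RecordingLoader(DelegationSketch.class.getClassLoader());
        try {
            child.loadClass(name);
            return child.askedLocally.isEmpty();
        } catch (ClassNotFoundException e) {
            // Neither the parents nor the child could provide the class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolvedByParent("java.lang.String"));              // prints true
        System.out.println(resolvedByParent("com.example.NotOnTheClasspath")); // prints false
    }
}
```

The first call never reaches `findClass`, because the request is answered higher in the hierarchy; the second reaches it only after every parent has given up.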

Contributions about Android class loading focus on using the capabilities of class loading to extend Android features or to prevent reverse engineering of Android applications.
For instance, Zhou #etal~@zhou_dynamic_2022 extend the class loading mechanism of Android to support regular Java bytecode, and Kriz and Maly~@kriz_provisioning_2015 propose a new class loader to automatically load modules of an application without user interaction.

Regarding reverse engineering, class loading mechanisms are frequently used by packers for hiding all or parts of the code of an application~@Duan2018.
The problem to be solved consists in locating secondary #dexfiles that can be decrypted just before being loaded.
Dynamic hooking mechanisms are then needed to intercept the bytecode at load time.
These techniques can help the reverser, but they require instrumenting the source code of AOSP or the application itself.
The engineering cost is high and anti-debugging techniques can slow down the process.
Thus, a reverser usually starts by studying an application with static analysis tools~@Li2017, and only moves on to dynamic analysis~@Egele2012 if further, more costly analysis is needed (for example, if they spot the use of a custom class loader).
Performing a static analysis of an application can be time-consuming if the programmer uses obfuscation techniques such as native code, packing techniques, value encryption, or reflection.
Such techniques can partially hide the Java bytecode from a static analysis investigation as they modify it at runtime.
For example, packers exploit the class loading capability of Android to load new code.
They also combine the loading with code generation from ciphered assets or code modification from native code calls~@liao2016automated to make the code harder to recover.
Because parts of the original code will be only available at runtime, deobfuscation approaches propose techniques that track #DEX structures when manipulated by the application~@zhang2015dexhunter @xue2017adaptive @wong2018tackling. All those contributions are directly related to the class loading mechanism of Android.

Deobfuscating an application is the first problem the reverse engineer has to solve.
Nevertheless, even if all classes of the code are recovered, understanding which classes are actually loaded by Android is an additional problem.
The reverse engineer may believe that what they see in the bytecode is what is loaded at runtime, whereas the system can choose alternative implementations of a class.
Our goal is to show that tools mentioned in the literature~@Li2017 can suffer from attacks exploiting confusion inside regular class loading mechanisms of Android.
]

#paragraph([Hidden APIs])[
Li #etal conducted an empirical study of the usage and evolution of hidden APIs~@li_accessing_2016.
They found that hidden APIs are added and removed in every release of Android, and that they are used both by benign and malicious applications.
More recently, He #etal~@he_systematic_2023 conducted a systematic study of security-related hidden service APIs.
They studied how hidden APIs can be used to bypass Android security restrictions and found that, although Google's countermeasures are effective, they need to be implemented inside the system services rather than in the hidden API itself.
The reason is the lack of in-app privilege isolation: the framework code runs in the same process as the user code, so any restriction enforced in the framework can be bypassed by the user.
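
The in-process argument can be illustrated with a small Java analogy, runnable on a standard JVM.
`FakeFramework`, its `callerAllowed` flag and `hiddenService` are hypothetical stand-ins, not Android APIs, and the actual hidden API enforcement of Android is more involved.
The sketch only shows that a check whose state lives in the caller's process can be switched off by the caller through reflection.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class InProcessBypassSketch {
    /** Hypothetical stand-in for framework code loaded in the application's process. */
    static final class FakeFramework {
        private boolean callerAllowed = false; // policy state kept in-process

        // Hypothetical hidden method, not meant to be called by applications.
        private String hiddenService() {
            if (!callerAllowed) {
                throw new SecurityException("caller not allowed");
            }
            return "sensitive result";
        }
    }

    /** Calls the hidden method reflectively, optionally flipping the guard first. */
    public static String callHidden(boolean flipGuard) {
        FakeFramework framework = new FakeFramework();
        try {
            if (flipGuard) {
                // Same process, so user code can reach the guard itself.
                Field guard = FakeFramework.class.getDeclaredField("callerAllowed");
                guard.setAccessible(true);
                guard.setBoolean(framework, true);
            }
            Method hidden = FakeFramework.class.getDeclaredMethod("hiddenService");
            hidden.setAccessible(true);
            return (String) hidden.invoke(framework);
        } catch (InvocationTargetException e) {
            return "denied"; // the in-process check threw SecurityException
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callHidden(false)); // prints "denied"
        System.out.println(callHidden(true));  // prints "sensitive result"
    }
}
```

A check performed in a separate process, as a system service does, cannot be reached by this kind of in-process manipulation.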
]