parent
c9752714db
commit
f23390279c
7 changed files with 177 additions and 182 deletions
|
@ -1,17 +1,17 @@
|
|||
#import "../lib.typ": todo, ie, etal, num, DEX, ART, SDK, API, APK, APIs, AOSP
|
||||
#import "X_var.typ": *
|
||||
|
||||
== Analyzing the Class Loading Process <sec:cl-loading>
|
||||
== Analysing the Class Loading Process <sec:cl-loading>
|
||||
|
||||
For building obfuscation techniques based on the confusion of tools with class loaders, we manually studied the code of Android that handles class loading.
|
||||
In this section, we report the inner workings of #ART and we focus on the specificities of class loading that can bring confusion.
|
||||
Because the class loading implementation has evolved over time during the multiple iterations of the Android operating system, we mainly describe the behavior of #ART from Android version 14 (#SDK 34).
|
||||
In this section, we report the inner workings of #ART, and we focus on the specificities of class loading that can bring confusion.
|
||||
Because the class loading implementation has evolved over time during the multiple iterations of the Android operating system, we mainly describe the behaviour of #ART from Android version 14 (#SDK 34).
|
||||
|
||||
=== Class Loaders
|
||||
|
||||
When #ART needs to access a class, it queries a `ClassLoader` to retrieve its implementation.
|
||||
When #ART needs to access a class, it queries an object implementing the `ClassLoader` class to retrieve its implementation.
|
||||
Each class has a reference to the `ClassLoader` that loaded it, and this class loader is the one that will be used to load supplementary classes used by the original class.
|
||||
For example in @lst:cl-expl-cl-loading, when calling `A.f()`, the #ART will load `B` with the class loader that was used to load `A`.
|
||||
For example, in @lst:cl-expl-cl-loading, when calling `A.f()`, #ART will load `B` with the class loader that was used to load `A`.
|
||||
|
||||
#figure(
|
||||
```java
|
||||
|
@ -24,16 +24,16 @@ For example in @lst:cl-expl-cl-loading, when calling `A.f()`, the #ART will load
|
|||
caption: [Class instantiation],
|
||||
) <lst:cl-expl-cl-loading>
|
||||
|
||||
This behavior has been inherited from Java and most of the core classes regarding class loaders have been kept in Android.
|
||||
Nevertheless, the Android implementation has slight differences and new class loaders have been added.
|
||||
For example, the java class loader `URLClassLoader` is still present in Android, but contrary to the official documentation, most of its methods have been removed or replaced by a stub that just raises an exception.
|
||||
This behaviour has been inherited from Java, and most of the core classes regarding class loaders have been kept in Android.
|
||||
Nevertheless, the Android implementation has slight differences, and new class loaders have been added.
|
||||
For example, the Java class loader `URLClassLoader` is still present in Android, but contrary to the official documentation, most of its methods have been removed or replaced by a stub that just raises an exception.
|
||||
Moreover, rather than using the Java class loaders `SecureClassLoader` or `URLClassLoader`, Android has several new class loaders that inherit from `ClassLoader` and override the appropriate methods.
|
||||
|
||||
The left part of @fig:cl-class_loading_classes shows the different class loaders specific to Android in white and the stubs of the original Java class loaders in grey.
|
||||
The main difference between the original Java class loaders and the ones used by Android is that they do not support the Java bytecode format.
|
||||
Instead, the Android-specific class loaders load their classes from (many) different file formats specific to Android.
|
||||
Usually, when used by a programmer, the classes are loaded from memory or from a file using the #DEX format (`.dex`).
|
||||
When used directly by #ART, the classes are usually stored in an application file (`.apk`) or in an optimized format (`OAR/ODEX`).
|
||||
When used directly by #ART, the classes are usually stored in an application file (`.apk`) or in an optimised format (`OAT/ODEX`).
|
||||
|
||||
#figure([
|
||||
#image(
|
||||
|
@ -50,7 +50,7 @@ When used directly by #ART, the classes are usually stored in an application fil
|
|||
On the runtime side, there are 5 boxes: bootClassLoader, appClassLoader (multi dex), systemClassLoader,
|
||||
Specific delegator with two delegates, X.
|
||||
Arrows labeled delegate go from appClassLoader, systemClassLoader, and Specific delegator to bootClassLoader, and from Specific delegator to X.
|
||||
bootClassLoader, appClassLoader, and systemClassLoader are grouped in a dotted box labeled Android default behavior.
|
||||
bootClassLoader, appClassLoader, and systemClassLoader are grouped in a dotted box labeled Android default behaviour.
|
||||
Dotted lines labeled instance go across the central demarcation from appClassLoader to PathClassLoader, from systemClassLoader to PathClassLoader, and from Specific delegator to DelegateLastClassLoader.
|
||||
Another dotted line labeled instance singleton goes from bootClassLoader to BootClassLoader.
|
||||
"
|
||||
|
@ -63,15 +63,15 @@ When used directly by #ART, the classes are usually stored in an application fil
|
|||
=== Delegation <sec:cl-delegation>
|
||||
|
||||
The order in which classes are loaded at runtime requires special attention.
|
||||
All the specific Android class loaders (`DexClassLoader`, `InMemoryClassLoader`, etc.) have the same behavior (except `DelegateLastClassLoader`) but they handle specificities for the input format.
|
||||
All the specific Android class loaders (`DexClassLoader`, `InMemoryClassLoader`, etc.) have the same behaviour (except `DelegateLastClassLoader`), but they handle specificities for the input format.
|
||||
Each class loader has a delegate class loader, represented in the right part of @fig:cl-class_loading_classes by black plain arrows for an instance of `PathClassLoader` and an instance of `DelegateLastClassLoader` (the other class loaders also have this delegate).
|
||||
This delegate is a concept specific to class loaders and has nothing to do with class inheritance.
|
||||
By default, class loaders will delegate to the singleton class `BootClassLoader`, except if a specific class loader is provided when instantiating the new class loader.
|
||||
When a class loader needs to load a class, except for `DelegateLastClassLoader`, it will first ask the delegate, i.e. `BootClassLoader`, and if the delegate does not find the class, the class loader will try to load the class on its own.
|
||||
This behavior implements a priority and avoids redefining by error a core class of the system, for example redefining `java.lang.String` that would be loaded by a child class loader instead of its delegates.
|
||||
`DelegateLastClassLoader` behaves slightly differently: it will first delegate to `BootClassLoader` then, it will check its files and finally, it will delegate to its actual delegate (given when instantiating the `DelegateLastClassLoader`).
|
||||
This behavior is useful for overriding specific classes of a class loader while keeping the other classes.
|
||||
A normal class loader would prioritize the classes of its delegate over its own.
|
||||
This behaviour implements a priority and avoids redefining by error a core class of the system, for example, redefining `java.lang.String` that would be loaded by a child class loader instead of its delegates.
|
||||
`DelegateLastClassLoader` behaves slightly differently: it will first delegate to `BootClassLoader`, then it will check its files, and finally, it will delegate to its actual delegate (given when instantiating the `DelegateLastClassLoader`).
|
||||
This behaviour is useful for overriding specific classes of a class loader while keeping the other classes.
|
||||
A normal class loader would prioritise the classes of its delegate over its own.
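
The two delegation orders described above can be illustrated with a small toy model.
This is only a sketch under simplifying assumptions, not the Android implementation: the class `ToyLoader` and its fields are hypothetical names, each loader is reduced to a set of class names, and `load_class` returns the name of the loader that ends up defining the class.

```python
class ToyLoader:
    """Toy class loader: `classes` is the set of class names it can define itself."""

    def __init__(self, name, classes, delegate=None, delegate_last=False):
        self.name = name
        self.classes = set(classes)
        self.delegate = delegate            # None means "delegate to boot"
        self.delegate_last = delegate_last  # mimics DelegateLastClassLoader

    def find_own(self, cls):
        # Return this loader's name if it can define the class itself.
        return self.name if cls in self.classes else None

    def load_class(self, cls, boot):
        if self is boot:
            # The boot loader has no delegate of its own.
            return self.find_own(cls)
        if self.delegate_last:
            # Boot first, then own files, then the actual delegate.
            found = boot.load_class(cls, boot) or self.find_own(cls)
            if found is None and self.delegate is not None:
                found = self.delegate.load_class(cls, boot)
            return found
        # Default order: ask the delegate first, then fall back to own files.
        delegate = self.delegate if self.delegate is not None else boot
        return delegate.load_class(cls, boot) or self.find_own(cls)


boot = ToyLoader("boot", {"java.lang.String"})
app = ToyLoader("app", {"java.lang.String", "Foo", "Bar"}, delegate=boot)
normal = ToyLoader("plugin", {"Foo"}, delegate=app)
last = ToyLoader("plugin", {"Foo"}, delegate=app, delegate_last=True)

print(normal.load_class("Foo", boot))  # the delegate's Foo is used
print(last.load_class("Foo", boot))    # the loader's own Foo is used
```

In this model, the copy of `java.lang.String` shipped by `app` is never selected, and only the delegate-last loader lets `plugin` override `Foo` while still falling back to `app` for `Bar`.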
|
||||
|
||||
#figure(
|
||||
```python
|
||||
|
@ -105,19 +105,19 @@ It is the direct delegate of the two other class loaders instantiated by Android
|
|||
`systemClassLoader` is a `PathClassLoader` pointing to `'.'`, the working directory of the application, which is `'/'` by default.
|
||||
The documentation of `ClassLoader.getSystemClassLoader` reports that this class loader is the default context class loader for the main application thread.
|
||||
In reality, the #platc are loaded by `bootClassLoader` and the classes from the application are loaded by `appClassLoader`.
|
||||
`systemClassLoader` is never used.
|
||||
`systemClassLoader` is never used in production according to our careful reading of the #AOSP sources.
|
||||
|
||||
In addition to the class loaders instantiated by ART when starting an application, the developer of an application can use class loaders explicitly by calling to ones from the #Asdk, or by recoding custom class loaders that inherit from the `ClassLoader` class.
|
||||
At this point, modeling accurately the complete class loading algorithm becomes impossible: the developer can program any algorithm of their choice.
|
||||
For this reason, this case is excluded from this chapter and we focus on the default behavior where the context class loader is the one pointing to the `.apk` file and where its delegate is `BootClassLoader`.
|
||||
With such a hypothesis, the delegation process can be modeled by the pseudo-code of method `load_class` given in <lst:cl-listing3>.
|
||||
In addition to the class loaders instantiated by #ART when starting an application, the developer of an application can use class loaders explicitly by calling those provided by the #Asdk, or by writing custom class loaders that inherit from the `ClassLoader` class.
|
||||
At this point, accurately modelling the complete class loading algorithm becomes impossible: the developer can program any algorithm of their choice.
|
||||
For this reason, this case is excluded from this chapter, and we focus on the default behaviour where the context class loader is the one pointing to the `.apk` file and where its delegate is `BootClassLoader`.
|
||||
With such a hypothesis, the delegation process can be modelled by the pseudo-code of method `load_class` given in @lst:cl-loading-alg.
|
||||
|
||||
In addition, it is important to distinguish the two types of #platc handled by `BootClassLoader`, both of which have priority over classes from the application at runtime:
|
||||
|
||||
- the ones available in the *#Asdk* (normally visible in the documentation);
|
||||
- the ones that are internal and that should not be used by the developer. We call them *#hidec*~@he_systematic_2023 @li_accessing_2016 (not documented).
|
||||
|
||||
As a preliminary conclusion, we observe that a priority exists in the class loading mechanism and that an attacker could use it to prioritize an implementation over another one.
|
||||
As a preliminary conclusion, we observe that a priority exists in the class loading mechanism and that an attacker could use it to prioritise an implementation over another one.
|
||||
This could mislead the reverser if they use the one that has the lowest priority.
|
||||
To determine whether a class is impacted by the priority given to `BootClassLoader`, we need to obtain the list of classes that are part of Android, #ie the #platc.
|
||||
We discuss in the next section how to obtain these classes from the emulator.
|
||||
|
@ -147,22 +147,22 @@ On the top right, a diagram of a web browser open at https//develoer.android.com
|
|||
|
||||
@fig:cl-archisdk shows how classes of Android are used in the development environment and at runtime.
|
||||
In the development environment, Android Studio uses `android.jar` and the specific classes written by the developer.
|
||||
After compilation, only the classes of the developer, and sometimes extra classes computed by Android Studio are zipped in the #APK file, using the multi-dex format.
|
||||
After compilation, only the classes of the developer, and sometimes extra classes computed by Android Studio, are zipped in the #APK file, using the multi-dex format.
|
||||
At runtime, the application uses `BootClassLoader` to load the #platc from Android.
|
||||
Until our work, previous works~@he_systematic_2023 @li_accessing_2016 considered both #Asdk and #hidec to be in the file `/system/framework/framework.jar` found in the phone itself, but we found that the classes loaded by `bootClassLoader` are not all present in `framework.jar`.
|
||||
For example, He #etal~@he_systematic_2023 counted 495 thousand #APIs (fields and methods) in Android 12, based on Google documentation on restriction for non #SDK interfaces#footnote[https://developer.android.com/guide/app-compatibility/restrictions-non-sdk-interfaces].
|
||||
Until our work, previous works~@he_systematic_2023 @li_accessing_2016 considered both #Asdk and #hidec to be in the file `/system/framework/framework.jar` found in the phone itself, but we found that the classes loaded by `bootClassLoader` are not all present in `framework.jar`.
|
||||
For example, He #etal~@he_systematic_2023 counted 495 thousand #APIs (fields and methods) in Android 12, based on Google documentation on restrictions for non #SDK interfaces#footnote[https://developer.android.com/guide/app-compatibility/restrictions-non-sdk-interfaces].
|
||||
However, when looking at the content of `framework.jar`, we only found #num(333) thousand #APIs.
|
||||
Indeed, classes such as `com.android.okhttp.OkHttpClient` are loaded by `bootClassLoader`, listed by Google, but not in `framework.jar`.
|
||||
|
||||
For optimization purposes, classes are now loaded from `boot.art`.
|
||||
For optimisation purposes, classes are now loaded from `boot.art`.
|
||||
This file is used to speed up the start-up time of applications: it stores a dump of the C++ objects representing the *#platc* (#Asdk and #hidec) so that they do not need to be generated each time an application starts.
|
||||
Unfortunately, this format is undocumented and not backward-compatible between Android versions, and is thus difficult to parse.
|
||||
An easier solution to investigate the #platc is to look at the `BOOTCLASSPATH` environment variable in an emulator.
|
||||
This variable is used to load the classes without the `boot.art` optimization.
|
||||
This variable is used to load the classes without the `boot.art` optimisation.
|
||||
We found 25 `.jar` files, including `framework.jar`, in the `BOOTCLASSPATH` of the standard emulator for Android 12 (#SDK 32), 31 for Android 13 (#SDK 33), and 35 for Android 14 (#SDK 34), containing respectively a total of #num(499837), #num(539236) and #num(605098) #API methods and fields.
|
||||
@tab:cl-platform_apis) summarizes the discrepancies we found between Google's list and the #platc we found in Android emulators.
|
||||
@tab:cl-platform_apis summarises the discrepancies we found between Google's list and the #platc we found in Android emulators.
|
||||
Note that some methods may also be found _only_ in the documentation.
|
||||
Our manual investigations suggest that the documentation is not well synchronized with the evolution of the #platc and that Google has almost solved this issue in #API 34.
|
||||
Our manual investigations suggest that the documentation is not well synchronised with the evolution of the #platc and that Google has almost solved this issue in #API 34.
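
The `BOOTCLASSPATH` inspection described above can be sketched as follows.
This is a minimal illustration that assumes the variable is a colon-separated list of paths, as can be printed on an emulator with `adb shell 'echo $BOOTCLASSPATH'`; the helper names and the sample value are hypothetical and much shorter than a real one.

```python
import posixpath


def boot_jars(bootclasspath):
    """Split a colon-separated BOOTCLASSPATH value, dropping empty entries."""
    return [path for path in bootclasspath.split(":") if path]


def jars_outside_framework(bootclasspath):
    """List the boot class path entries that are not framework.jar."""
    return [
        path
        for path in boot_jars(bootclasspath)
        if posixpath.basename(path) != "framework.jar"
    ]


# Illustrative value only: a real emulator lists several dozen entries.
sample = (
    "/apex/com.android.art/javalib/core-oj.jar:"
    "/system/framework/framework.jar:"
    "/apex/com.android.conscrypt/javalib/conscrypt.jar"
)

print(len(boot_jars(sample)))
print(jars_outside_framework(sample))
```

Every entry returned by `jars_outside_framework` is a file whose classes would be missed by an analysis that only considers `framework.jar`.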
|
||||
|
||||
|
||||
#figure({
|
||||
|
@ -175,10 +175,10 @@ Our manual investigations suggest that the documentation is not well synchronize
|
|||
table.hline(),
|
||||
table.header(
|
||||
table.cell(colspan: 5, inset: 3pt)[],
|
||||
table.cell(rowspan: 2)[*SDK version*],
|
||||
table.cell(rowspan: 2)[*#SDK version*],
|
||||
table.vline(end: 3),
|
||||
table.vline(start: 4),
|
||||
table.cell(colspan: 4)[*Number of API methods*],
|
||||
table.cell(colspan: 4)[*Number of #API methods*],
|
||||
[Documented], [In emulator], [Only documented], [Only in emulator],
|
||||
),
|
||||
table.cell(colspan: 5, inset: 3pt)[],
|
||||
|
@ -193,7 +193,7 @@ Our manual investigations suggest that the documentation is not well synchronize
|
|||
table.hline(),
|
||||
)},
|
||||
|
||||
caption: [Comparison for #API methods between documentation and emulators],
|
||||
caption: [Comparison of #API methods between documentation and emulators],
|
||||
)<tab:cl-platform_apis>
|
||||
|
||||
We conclude that it can be dangerous to trust the documentation and that gathering information from the emulator or phone is the only reliable source.
|
||||
|
@ -202,20 +202,20 @@ Gathering the precise list of classes and the associated bytecode is not a trivi
|
|||
=== Multiple #DEX Files <sec:cl-collision>
|
||||
|
||||
For the application class files, Android uses its specific format called #DEX: all the classes of an application are loaded from the file `classes.dex`.
|
||||
With the increasing complexity of Android applications, the need arrised to load more methods than the #DEX format could support in one #dexfile.
|
||||
With the increasing complexity of Android applications, the need arose to load more methods than the #DEX format could support in one #dexfile.
|
||||
To solve this problem, Android started storing classes in multiple files named `classesX.dex`, as illustrated by @lst:cl-dexname, which generates the filenames read by class loaders.
|
||||
Android starts loading the file `GetMultiDexClassesDexName(0)` (`classes.dex`), then `GetMultiDexClassesDexName(1)` (`classes2.dex`), and continues until finding a value `n` for which `GetMultiDexClassesDexName(n)` does not exist.
|
||||
Even if Android emits a warning message when it finds more than 100 #dexfiles, it will still load any number of #dexfiles that way.
|
||||
This change had the unintended consequence of permitting two classes with the same name but different implementations to be stored in the same `.apk` file using two #dexfiles.
|
||||
This change had the unintended consequence of permitting two classes with the same name but different implementations to be stored in the same `.apk` file using two #dexfiles (#eg the class `Foo` can be defined both in `classes.dex` and `classes2.dex`).
|
||||
|
||||
Android explicitly performs checks that prevent several classes from using the same name inside a #dexfile.
|
||||
However, this check does not apply to multiple #dexfiles in the same `.apk` file, and a `.dex` can contain a class with a name already used by another class in another #dexfile of the application.
|
||||
Of course, such a situation should not happen when multiple #dexfiles have been generated by properly Android Studio.
|
||||
Of course, such a situation should not happen when multiple #dexfiles have been generated properly by Android Studio.
|
||||
Nevertheless, for an attacker controlling the process, this issue raises the question of which class is selected when several classes sharing the same name are present in `.apk` files.
|
||||
|
||||
We found that Android loads the class whose implementation is found first when searching the #dexfiles in the order generated by the method `GetMultiDexClassesDexName`.
|
||||
We will show later in @sec:cl-evaltools that this choice is not the most intuitive and can lead to fool analysis tools when reversing an application.
|
||||
As a conclusion, we model both the multi-dex and delegation behaviors in the pseudo-code of @lst:cl-loading-alg.
|
||||
We will show later in @sec:cl-evaltools that this choice is not the most intuitive and can lead to fooling analysis tools when reversing an application.
|
||||
As a conclusion, we model both the multi-dex and delegation behaviours in the pseudo-code of @lst:cl-loading-alg.
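
The first-match rule for multiple `.dex` files can also be illustrated in isolation.
The sketch below assumes that the content of each file is already available as a list of class names (a real tool would need a `.dex` parser); the function names and the sample application are hypothetical.

```python
def multidex_name(index):
    """Naming scheme described above: classes.dex, classes2.dex, classes3.dex, ..."""
    return "classes.dex" if index == 0 else f"classes{index + 1}.dex"


def loaded_dex_files(apk_entries):
    """Enumerate files in loading order, stopping at the first missing name."""
    names = []
    index = 0
    while multidex_name(index) in apk_entries:
        names.append(multidex_name(index))
        index += 1
    return names


def resolve_collisions(dex_contents):
    """Map each class to the file that wins, and list the files it shadows."""
    winner = {}
    shadowed = {}
    for dex in loaded_dex_files(dex_contents):
        for cls in dex_contents[dex]:
            if cls in winner:
                # A definition in an earlier file already took this name.
                shadowed.setdefault(cls, []).append(dex)
            else:
                winner[cls] = dex
    return winner, shadowed


# Hypothetical application: Foo is defined twice, and classes3.dex is missing.
apk = {
    "classes.dex": ["Foo", "Main"],
    "classes2.dex": ["Foo", "Bar"],
    "classes4.dex": ["Baz"],
}
winner, shadowed = resolve_collisions(apk)
print(winner["Foo"], shadowed)
```

In this example, the definition of `Foo` in `classes2.dex` is shadowed by the one in `classes.dex`, and `classes4.dex` is never read because the enumeration stops at the missing `classes3.dex`.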
|
||||
|
||||
#figure(
|
||||
```C
|
||||
|
|