wip
All checks were successful
/ test_checkout (push) Successful in 1m1s

Jean-Marie Mineau 2025-07-21 22:00:29 +02:00
parent fd4d6fa239
commit ea82a3ca8b
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
10 changed files with 119 additions and 98 deletions

@@ -1,4 +1,4 @@
-#import "../lib.typ": etal, ie
+#import "../lib.typ": etal, ie, ART, DEX, APK, SDK
#import "X_var.typ": *
== Introduction
@@ -7,7 +7,7 @@
When building an application with Android Studio, the source codes of applications are compiled to Java bytecode, which is then converted to Dalvik bytecode.
Dalvik bytecode is then put in a zip archive with other resources such as the application manifest, and the zip archive is then signed.
All this process is handled by Android Studio, behind the scene.
-At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by ART in order to execute native code and it is up to these components to handle the loading of the classes.
+At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by #ART in order to execute native code and it is up to these components to handle the loading of the classes.
Both behaviors are possible at the same time for a single application, and it is up to Android to choose which part of an application is compiled in native code.
*/
@@ -20,17 +20,17 @@ If this first phase is not accurately driven, for example if they fail to access
Additionally, as stated by Li #etal@Li2017 in their conclusions, such a task is complexified by dynamic code loading, reflective calls, native code, and multi-threading which cannot be easily handled statically.
Nevertheless, even if we do not consider these aspects, determining statically how the regular class loading system of Android is working is a difficult task.
-Class loading occurs at runtime and is handled by the components of Android Runtime (ART), even when the application is partially or fully compiled ahead of time.
+Class loading occurs at runtime and is handled by the components of #ART, even when the application is partially or fully compiled ahead of time.
Nevertheless, at the development stage, Android Studio handles the resolution of the different classes that can be internal to the application.
When building, the code is linked to the standard library i.e. the code contained in `android.jar`.
-In this article, we call these classes "Development SDK classes".
+In this article, we call these classes "Development #SDK classes".
`android.jar` is not added to the application because its classes will be available at runtime in others `.jar` files.
-To distinguish those classes found at runtime from Dev SDK classes, we call them #Asdkc.
-When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "SDK versions".
-Indeed, some parts of the core #Asdkc can be embedded in the application, for retro compatibility purposes: by comparing the specified minimum SDK version and the target SDK version, the code of extra #Asdkc is stored in the APK file.
+To distinguish those classes found at runtime from Dev #SDK classes, we call them #Asdkc.
+When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "#SDK versions".
+Indeed, some parts of the core #Asdkc can be embedded in the application, for retro compatibility purposes: by comparing the specified minimum #SDK version and the target #SDK version, the code of extra #Asdkc is stored in the APK file.
As a consequence, it is frequent to find inside applications some classes that come from the `com.android` packages.
At runtime each smartphone runs a unique version of Android, but, as the application is deployed on multiple versions of Android, it is difficult to predict which classes will be loaded from the #Asdkc or from the APK file itself.
-This complexity increases with the multi-DEX format of recent APK files that can contain several bytecode files.
+This complexity increases with the multi-#DEX format of recent #APK files that can contain several bytecode files.
Going back to the problem of a reverser studying a suspicious application statically, the reverser uses tools to disassemble the application@mauthe_large-scale_2021 and track the flows of data in the bytecode.
As an example, for a spyware potentially leaking personal information, the reverser can unpack the application with Apktool and, after manually locating a method that they suspect to read sensitive data (by reading the unpacked bytecode), they can compute with FlowDroid@Arzt2014a if there is a flow from this method to methods performing HTTP requests.
@@ -47,7 +47,7 @@ The goal of such an attack is to confuse them during the reversing process: at r
This attack can be applied to regular classes of the #Asdk or to hidden classes of Android@he_systematic_2023 @li_accessing_2016.
We show how these attacks can confuse the tools of the reverser when he performs a static analysis.
In order to evaluate if such attacks are already used in the wild, we analyzed #nbapk applications from 2023 that we extracted randomly from AndroZoo@allixAndroZooCollectingMillions2016.
-Our main result is that #shadowsdk of these applications contain shadow collisions against the SDK and #shadowhidden against hidden classes.
+Our main result is that #shadowsdk of these applications contain shadow collisions against the #SDK and #shadowhidden against hidden classes.
Our investigations conclude that most of these collisions are not voluntary attacks, but we highlight one specific malware sample performing strong obfuscation revealed by our detection of one shadow attack.
The paper is structured as follows.

@@ -40,15 +40,3 @@ They found that hidden APIs are added and removed in every release of Android, a
More recently, He #etal @he_systematic_2023 did a systematic study of hidden service API related to security.
They studied how the hidden API can be used to bypass Android security restrictions and found that although Google countermeasures are effective, they need to be implemented inside the system services and not the hidden API due to the lack of in-app privilege isolation: the framework code is in the same process as the user code, meaning any restriction in the framework can be bypassed by the user.
]
-#paragraph([Static analysis tools])[
-Static analysis tools are used to perform operations on an APK file, for example extracting its bytecode or information from the Manifest file.
-Because of the complexity of Android, few tools have followed all the evolutions of the file format and are robust enough to analyze all applications without crashing@mineau_evaluating_2024.
-The tools can share the backend used to manipulate the code.
-For example, Apktool is often called in a subprocess to extracte the bytecode.
-Another example is Soot@Arzt2013, a Java framework that allows to manipulate the bytecode from an object representation of instructions.
-This framework enables advanced features such as inserting or removing bytecode instructions but can require a lot of memory and time to perform its operations.
-The most known tool built on top of Soot is FlowDroid@Arzt2014a, which enables to compute information flows statically into the code.
-Because these tools are used by reversers, we will evaluate the accuracy of the provided results in the case of an application developer exploits the possible confusions that brings the class loading mechanisms of Android.
-]