wip
Jean-Marie Mineau 2025-07-21 22:00:29 +02:00
parent fd4d6fa239
commit ea82a3ca8b
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
10 changed files with 119 additions and 98 deletions

@@ -1,4 +1,4 @@
-#import "../lib.typ": etal, ie
+#import "../lib.typ": etal, ie, ART, DEX, APK, SDK
#import "X_var.typ": *
== Introduction
@@ -7,7 +7,7 @@
When building an application with Android Studio, the source codes of applications are compiled to Java bytecode, which is then converted to Dalvik bytecode.
Dalvik bytecode is then put in a zip archive with other resources such as the application manifest, and the zip archive is then signed.
All this process is handled by Android Studio, behind the scene.
-At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by ART in order to execute native code and it is up to these components to handle the loading of the classes.
+At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by #ART in order to execute native code and it is up to these components to handle the loading of the classes.
Both behaviors are possible at the same time for a single application, and it is up to Android to choose which part of an application is compiled in native code.
*/
@@ -20,17 +20,17 @@ If this first phase is not accurately driven, for example if they fail to access
Additionally, as stated by Li #etal@Li2017 in their conclusions, such a task is complexified by dynamic code loading, reflective calls, native code, and multi-threading which cannot be easily handled statically.
Nevertheless, even if we do not consider these aspects, determining statically how the regular class loading system of Android is working is a difficult task.
-Class loading occurs at runtime and is handled by the components of Android Runtime (ART), even when the application is partially or fully compiled ahead of time.
+Class loading occurs at runtime and is handled by the components of #ART, even when the application is partially or fully compiled ahead of time.
Nevertheless, at the development stage, Android Studio handles the resolution of the different classes that can be internal to the application.
When building, the code is linked to the standard library i.e. the code contained in `android.jar`.
-In this article, we call these classes "Development SDK classes".
+In this article, we call these classes "Development #SDK classes".
`android.jar` is not added to the application because its classes will be available at runtime in others `.jar` files.
-To distinguish those classes found at runtime from Dev SDK classes, we call them #Asdkc.
-When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "SDK versions".
-Indeed, some parts of the core #Asdkc can be embedded in the application, for retro compatibility purposes: by comparing the specified minimum SDK version and the target SDK version, the code of extra #Asdkc is stored in the APK file.
+To distinguish those classes found at runtime from Dev #SDK classes, we call them #Asdkc.
+When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "#SDK versions".
+Indeed, some parts of the core #Asdkc can be embedded in the application, for retro compatibility purposes: by comparing the specified minimum #SDK version and the target #SDK version, the code of extra #Asdkc is stored in the APK file.
As a consequence, it is frequent to find inside applications some classes that come from the `com.android` packages.
At runtime each smartphone runs a unique version of Android, but, as the application is deployed on multiple versions of Android, it is difficult to predict which classes will be loaded from the #Asdkc or from the APK file itself.
-This complexity increases with the multi-DEX format of recent APK files that can contain several bytecode files.
+This complexity increases with the multi-#DEX format of recent #APK files that can contain several bytecode files.
Going back to the problem of a reverser studying a suspicious application statically, the reverser uses tools to disassemble the application@mauthe_large-scale_2021 and track the flows of data in the bytecode.
As an example, for a spyware potentially leaking personal information, the reverser can unpack the application with Apktool and, after manually locating a method that they suspect to read sensitive data (by reading the unpacked bytecode), they can compute with FlowDroid@Arzt2014a if there is a flow from this method to methods performing HTTP requests.
@@ -47,7 +47,7 @@ The goal of such an attack is to confuse them during the reversing process: at r
This attack can be applied to regular classes of the #Asdk or to hidden classes of Android@he_systematic_2023 @li_accessing_2016.
We show how these attacks can confuse the tools of the reverser when he performs a static analysis.
In order to evaluate if such attacks are already used in the wild, we analyzed #nbapk applications from 2023 that we extracted randomly from AndroZoo@allixAndroZooCollectingMillions2016.
-Our main result is that #shadowsdk of these applications contain shadow collisions against the SDK and #shadowhidden against hidden classes.
+Our main result is that #shadowsdk of these applications contain shadow collisions against the #SDK and #shadowhidden against hidden classes.
Our investigations conclude that most of these collisions are not voluntary attacks, but we highlight one specific malware sample performing strong obfuscation revealed by our detection of one shadow attack.
The paper is structured as follows.