wip
All checks were successful
/ test_checkout (push) Successful in 51s

Jean-Marie 'Histausse' Mineau 2025-07-16 16:01:35 +02:00
parent 655bff8de2
commit e6c8b0ee6c
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
9 changed files with 28 additions and 14 deletions

@ -1,7 +1,7 @@
#import "../lib.typ": etal, paragraph, DEX #import "../lib.typ": etal, paragraph, DEX
#import "X_var.typ": * #import "X_var.typ": *
== State of the art <sec:cl-soa> == State of the Art <sec:cl-soa>
#paragraph([Class loading])[ #paragraph([Class loading])[
Class loading mechanisms have been studied in the general context of the Java language. Class loading mechanisms have been studied in the general context of the Java language.

@ -1,13 +1,13 @@
#import "../lib.typ": todo, ie, etal, num #import "../lib.typ": todo, ie, etal, num, DEX
#import "X_var.typ": * #import "X_var.typ": *
== Analyzing the class loading process <sec:cl-loading> == Analyzing the Class Loading Process <sec:cl-loading>
For building obfuscation techniques based on the confusion of tools with class loaders, we manually studied the code of Android that handles class loading. For building obfuscation techniques based on the confusion of tools with class loaders, we manually studied the code of Android that handles class loading.
In this section, we report the inner workings of ART and we focus on the specificities of class loading that can bring confusion. In this section, we report the inner workings of ART and we focus on the specificities of class loading that can bring confusion.
Because the class loading implementation has evolved over time during the multiple iterations of the Android operating system, we mainly describe the behavior of ART from Android version 14 (SDK 34). Because the class loading implementation has evolved over time during the multiple iterations of the Android operating system, we mainly describe the behavior of ART from Android version 14 (SDK 34).
=== Class loaders === Class Loaders
When ART needs to access a class, it queries a `ClassLoader` to retrieve its implementation. When ART needs to access a class, it queries a `ClassLoader` to retrieve its implementation.
Each class has a reference to the `ClassLoader` that loaded it, and this class loader is the one that will be used to load supplementary classes used by the original class. Each class has a reference to the `ClassLoader` that loaded it, and this class loader is the one that will be used to load supplementary classes used by the original class.
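The link between a class and its loader can be observed on a standard JVM. The following is a minimal sketch, not ART code: the class name `LoaderChain` and its helpers are hypothetical, and the loader names printed depend on the runtime. It follows the reference from a class to the loader that defined it, then walks the chain of parent loaders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, written for a standard JVM (not ART).
public class LoaderChain {
    /** Lists the loaders reachable from the one that defined `clz`, parents last. */
    static List<String> chainOf(Class<?> clz) {
        List<String> chain = new ArrayList<>();
        // Each class keeps a reference to the loader that defined it.
        ClassLoader loader = clz.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // A null loader denotes the bootstrap loader, which has no Java object.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // Application class: defined by an application-level loader.
        System.out.println(chainOf(LoaderChain.class));
        // Core class: defined directly by the bootstrap loader.
        System.out.println(chainOf(String.class));
    }
}
```

On a standard JVM, the chain of `String.class` contains only the bootstrap entry, while the chain of an application class starts with at least one application-level loader.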
@ -109,7 +109,7 @@ This could mislead the reverser if they use the one that has the lowest priority
To determine if a class is impacted by the priority given to `BootClassLoader`, we need to obtain the list of classes that are part of Android #ie the #platc. To determine if a class is impacted by the priority given to `BootClassLoader`, we need to obtain the list of classes that are part of Android #ie the #platc.
We discuss in the next section how to obtain these classes from the emulator. We discuss in the next section how to obtain these classes from the emulator.
=== Determining #platc === Determining Platform Classes
#figure( #figure(
image( image(
@ -174,7 +174,7 @@ Our manual investigations suggest that the documentation is not well synchronize
We conclude that it can be dangerous to trust the documentation and that gathering information from the emulator or phone is the only reliable source. We conclude that it can be dangerous to trust the documentation and that gathering information from the emulator or phone is the only reliable source.
Gathering the precise list of classes and the associated bytecode is not a trivial task. Gathering the precise list of classes and the associated bytecode is not a trivial task.
=== Multiple DEX files <sec:cl-collision> === Multiple #DEX Files <sec:cl-collision>
For the application class files, Android uses its specific format called DEX: all the classes of an application are loaded from the file `classes.dex`. For the application class files, Android uses its specific format called DEX: all the classes of an application are loaded from the file `classes.dex`.
With the increasing complexity of Android applications, the need arose to load more methods than the DEX format could support in one #dexfile. With the increasing complexity of Android applications, the need arose to load more methods than the DEX format could support in one #dexfile.

@ -60,7 +60,7 @@ Because ART will give priority to the internal version of the class, the version
Such shadow attacks are more difficult for a reverser to detect, as they may not know that this specific hidden class exists in Android. Such shadow attacks are more difficult for a reverser to detect, as they may not know that this specific hidden class exists in Android.
] ]
=== Impact on static analysis tools <sec:cl-evaltools> === Impact on Static Analysis Tools <sec:cl-evaltools>
#figure( #figure(
```java ```java
@ -248,7 +248,7 @@ In addition to the data flow in hidden classes, Flowdroid needs a list of data s
We believe that analysis tools can handle shadow attacks to some degree. We believe that analysis tools can handle shadow attacks to some degree.
The implementation of the solution will differ depending on the nature of the tool and may not always require the same implementation effort. The implementation of the solution will differ depending on the nature of the tool and may not always require the same implementation effort.
=== Relation with obfuscation techniques <sec:cl-cross-obf> === Relation with Obfuscation Techniques <sec:cl-cross-obf>
As described in the state of the art, reverse engineers face other techniques of obfuscation such as packers or native code. As described in the state of the art, reverse engineers face other techniques of obfuscation such as packers or native code.
These techniques rely on custom class loaders that load new parts of the application from ciphered assets or from the network. These techniques rely on custom class loaders that load new parts of the application from ciphered assets or from the network.

@ -1,7 +1,7 @@
#import "../lib.typ": num, todo, paragraph #import "../lib.typ": num, todo, paragraph
#import "X_var.typ": * #import "X_var.typ": *
== Shadow attacks in the wild <sec:cl-wild> == Shadow Attacks in the Wild <sec:cl-wild>
In this section, we evaluate whether applications found in the wild, on the Play Store or other markets, use one of the shadow techniques. In this section, we evaluate whether applications found in the wild, on the Play Store or other markets, use one of the shadow techniques.
Our goal is to explore the usage of shadow techniques in real applications. Our goal is to explore the usage of shadow techniques in real applications.
@ -203,7 +203,7 @@ The top 3 packages whose code actually differs from the ones found in Android ar
All these hidden shadow classes are libraries included by the developers who probably did not know that they were already embedded in Android. All these hidden shadow classes are libraries included by the developers who probably did not know that they were already embedded in Android.
] ]
=== Shadowing in malware applications <sec:cl-malware> === Shadowing in Malware Applications <sec:cl-malware>
#figure( #figure(
```java ```java

@ -1,4 +1,4 @@
== Threat to validity <sec:cl-ttv> == Threat to Validity <sec:cl-ttv>
During the analysis of the ART internals, we made the hypothesis that its different operating modes are equivalent: we analyzed the loading process for classes stored as non-optimized `.dex` format, and not for the pre-compiled `.oat`. During the analysis of the ART internals, we made the hypothesis that its different operating modes are equivalent: we analyzed the loading process for classes stored as non-optimized `.dex` format, and not for the pre-compiled `.oat`.
It is a reasonable hypothesis to suppose that the two implementations have been produced from the same algorithm using two compilation workflows. It is a reasonable hypothesis to suppose that the two implementations have been produced from the same algorithm using two compilation workflows.

@ -1,6 +1,6 @@
#import "../lib.typ": todo, epigraph #import "../lib.typ": todo, epigraph
= Class loaders in the middle: confusing Android static analyzers = Class Loaders in the Middle: Confusing Android Static Analyzers
#epigraph("Esmerelda Weatherwax, Wyrd Sisters, Terry Pratchett")[Things that try to look like things often do look more like things than things.] #epigraph("Esmerelda Weatherwax, Wyrd Sisters, Terry Pratchett")[Things that try to look like things often do look more like things than things.]

@ -47,7 +47,7 @@ When instantiating an object with `Object obj = cst.newInstance("Hello Void")`,
#figure( #figure(
```java ```java
Method mth = clz.getMethod("myMethod", String.class); Method mth = clz.getMethod("myMethod", String.class);
Object[] args = {(Object)"an argument"} Object[] args = {(Object)"an argument"};
String retData = (String) mth.invoke(obj, args); String retData = (String) mth.invoke(obj, args);
```, ```,
caption: [Calling a method using reflection] caption: [Calling a method using reflection]
@ -133,7 +133,7 @@ In those cases, the parameters could be used directly without the detour inside
) <lst:-th-expl-cl-call-trans> ) <lst:-th-expl-cl-call-trans>
=== Code loading <sec:th-trans-cl> === Code Loading <sec:th-trans-cl>
An application can dynamically import code from several formats such as #DEX, #APK, #JAR or #OAT, either stored in memory or in a file. An application can dynamically import code from several formats such as #DEX, #APK, #JAR or #OAT, either stored in memory or in a file.
Because it is an internal, platform-dependent format, we elected to ignore the #OAT format. Because it is an internal, platform-dependent format, we elected to ignore the #OAT format.

@ -0,0 +1,13 @@
#import "../lib.typ": todo
== Collecting Runtime Information <sec:th-dyn>
In order to perform the transformations described in @sec:th-trans, we need information like the name and signature of the method called with reflection, or the actual bytecode loaded dynamically.
We perform these transformations specifically because this information is difficult to extract statically.
Hence, we are using dynamic analysis to collect the runtime information we need.
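As an illustration of the reflection data mentioned above, the name and signature of a reflectively called method could be recorded as a single string. This is only a sketch under assumptions: the class `ReflectionRecord`, its methods, and the smali-like output notation are hypothetical and are not taken from the tooling described here. It relies on the standard `MethodType.toMethodDescriptorString()` to build the JVM-style signature.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical helper: formats the target of a reflective call.
public class ReflectionRecord {
    /** Formats a method as `Lpkg/Class;->name(params)ret` (smali-like notation). */
    static String describe(Method mth) {
        // Binary class name with '/' separators, wrapped as a type descriptor.
        String owner = "L" + mth.getDeclaringClass().getName().replace('.', '/') + ";";
        // JVM method descriptor built from the return and parameter types.
        String signature = MethodType
            .methodType(mth.getReturnType(), mth.getParameterTypes())
            .toMethodDescriptorString();
        return owner + "->" + mth.getName() + signature;
    }

    /** Describes `String.concat(String)`, used here as a stand-in target. */
    static String demo() {
        try {
            return describe(String.class.getMethod("concat", String.class));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Ljava/lang/String;->concat(Ljava/lang/String;)Ljava/lang/String;"
        System.out.println(demo());
    }
}
```

In a dynamic analysis setting, a hook placed on `Method.invoke` could call such a formatter on its receiver. How that hook is installed is outside the scope of this sketch.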
=== Collect Bytecode
=== Collect Reflection Data
=== Application Execution

@ -88,6 +88,7 @@
// Keep interline in table // Keep interline in table
#show table: set par(leading: 0.65em) if paper_draft #show table: set par(leading: 0.65em) if paper_draft
#todo[Normalize classloaders vs class loaders]
#include("1_introduction/main.typ") #include("1_introduction/main.typ")
#include("2_background/main.typ") #include("2_background/main.typ")