pass chap 4
Jean-Marie 'Histausse' Mineau 2025-09-29 03:10:59 +02:00
parent c9752714db
commit f23390279c
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
7 changed files with 177 additions and 182 deletions

@ -6,9 +6,9 @@
In this section, we present new obfuscation techniques that take advantage of the complexity of the class loading process.
Then, in order to evaluate their efficiency, we reviewed some common Android reverse analysis tools to see how they behave when collisions occur between classes of the #APK or between a class of the #APK and classes of Android (#Asdk or #hidec).
We call this collision "*class shadowing*", because the attacker version of the class shadows the one that will be used at runtime.
To evaluate if such shadow attacks are working, we handcrafted three applications implementing shadowing techniques to test their impact on static analysis tools.
Then, we manually inspected the output of the tools in order to check its consistency with what Android is really doing at runtime.
We call this collision "*class shadowing*", because the attacker's version of the class shadows the one that will be used at runtime.
To evaluate if such shadow attacks are working, we handcrafted three applications implementing shadowing techniques to test their impact on static analysis tools.
Then, we manually inspected the output of the tools in order to check their consistency with what Android is really doing at runtime.
For example, for Apktool, we look at the output disassembled code, and for Flowdroid~@Arzt2014a, we check that a flow between `Taint.source()` and `Taint.sink()` is correctly computed.
@ -24,7 +24,7 @@ on peut shadow une classe hidden
=== Obfuscation Techniques
From the results presented in @sec:cl-loading, three approaches can be designed to hide the behavior of an application.
From the results presented in @sec:cl-loading, three approaches can be designed to hide the behaviour of an application.
/*
#paragraph([Hidden classes])[
@ -41,24 +41,24 @@ On the other hand, using #hidec leave classes without implementation in the appl
*/
#paragraph([*Self shadow*: shadowing a class with another from #APK])[
This method consists in hiding the implementation of a class with another one by exploiting the possible collision of class names, as described in @sec:cl-collision with multiple #dexfiles.
This method consists of hiding the implementation of a class with another one by exploiting the possible collision of class names, as described in @sec:cl-collision with multiple #dexfiles.
If reversers or tools ignore the priority order of a multi-dex file, they can take into account the wrong version of a class.
]
// priority goes to SDK classes even if a shadow class is defined in the APK (all because of Boot)
#paragraph([*SDK shadow*: shadowing a #SDK class])[
This method consists in presenting to the reverser a fake implementation of a class of the #SDK.
This class is embedded in the #APK file and has the same name as the one of the #SDK.
This method consists of presenting to the reverser a fake implementation of a class of the #SDK.
This class is embedded in the #APK file and has the same name as one of the #SDK.
Because `BootClassLoader` will give priority to the #Asdk at runtime, the reverser or tool should ignore any version of a class that is contained in the #APK.
The only constraint when shadowing an #SDK class is that the shadowing implementation must respect the signature of real classes.
Note that, by introducing a custom class loader, the attacker could inverse the priority, but this case is out of the scope of this chapter.
Note that, by introducing a custom class loader, the attacker could invert the priority, but this case is out of the scope of this chapter.
]
// priority goes to hidden classes (because they come from the SDK) even if a shadow class is defined in the APK
#paragraph([*Hidden shadow*: shadowing an hidden class])[
#paragraph([*Hidden shadow*: shadowing a hidden class])[
This method is similar to the previous one, except the class that is shadowed is a #hidecsingular.
Because #ART will give priority to the internal version of the class, the version provided in the #APK file will be ignored.
Such shadow attacks are more difficult to detect by a reverser, that may not know the existence of this specific hidden class in Android.
Such shadow attacks are more difficult for a reverse engineer to detect, as they may not know of the existence of this specific hidden class in Android.
]
=== Impact on Static Analysis Tools <sec:cl-evaltools>
@ -73,7 +73,7 @@ Such shadow attacks are more difficult to detect by a reverser, that may not kno
}
}
// customized for each obfuscation technique
// customised for each obfuscation technique
public class Obfuscation {
public static String hide_flow(String personal_data) { ... }
}
@ -81,28 +81,28 @@ Such shadow attacks are more difficult to detect by a reverser, that may not kno
caption: [Main body of test apps]
)<lst:cl-testapp>
We selected tools that are commonly used to unpack and reverse Android applications.
The only two tools that we found to still be alive in @sec:rasta-src-select: Androguard#footnote[https://github.com/androguard/androguard] and Flowdroid~@Arzt2014a.
We also selected Jadx#footnote[https://github.com/skylot/jadx], a state-of-the-art decompiler for Android applications, as well as Apktool#footnote[https://apktool.org/], a disassembler/repackager used by 9 of the tools tested in @sec:rasta and often used by reverser when Jadx fails.
In @sec:rasta (@sec:rasta-src-select), we found only two tools to be still actively maintained: Androguard#footnote[https://github.com/androguard/androguard] and Flowdroid#footnote[https://github.com/secure-software-engineering/FlowDroid].
We also noticed that Apktool#footnote[https://apktool.org/] was a common dependency for a lot of the tools we tested in @sec:rasta (see @tab:rasta-rec-deps), and is still used today.
Consequently, we will test the impact of shadow attacks on those three tools.
Lastly, because it is a state-of-the-art decompiler for Android applications, we added Jadx#footnote[https://github.com/skylot/jadx] to the list of tools we tested.
To evaluate the tools, we designed a single application that we can customize for different tests.
To evaluate the tools, we designed a single application that we can customise for different tests.
@lst:cl-testapp shows the main body implementing:
- a possible flow to evaluate FlowDroid: a flow from a method `Taint.source()` to a method `Taint.sink(Activity, String)` through a method `Obfuscation.hide_flow(String)`.
- a possible use of a #SDK or hidden class inside the class `Obfuscation` to evaluate #platc shadowing for other tools.
We used 4 versions of this application:
+ A control application that does not do anything special: `Obfuscation.hide_flow(String personal_data)` simply return `personal_data`.
It will be used for checking the expecting result of tools.
+ A version that implements self shadowing: the class `Obfuscation` is duplicated: one is the same as the in the control app (`Obfuscation.hide_flow(String)` returns its arguments), and the other version returns a constant string.
+ A control application that does not do anything special: `Obfuscation.hide_flow(String personal_data)` returns `personal_data`.
It will be used for checking the expected result of tools.
+ A version that implements self-shadowing, in which the class `Obfuscation` is duplicated: one version is the same as the one in the control app (`Obfuscation.hide_flow(String)` returns its argument), and the other returns a constant string.
These two versions are embedded in different #dexfiles of a multi-dex application.
+ The third version implement #SDK shadowing and needs an existing class of the #SDK.
We used the #SDK class `Pair` that we try to shadow.
+ The third version implements #SDK shadowing and needs an existing class of the #SDK.
We used the #SDK class `Pair` as the class to shadow.
We put data in a new `Pair` instance and reread the data from the `Pair`.
The colliding `Pair` class we created discards the data at the initialisation and stores `null` instead of the argument values.
This decoy class break the flow of information: Flowdroid will detect the information flow if it uses the actuall #SDK implementation of `Pair` to compute the #DFG, but not if it uses the decoy.
This decoy class breaks the flow of information: Flowdroid will detect the information flow if it uses the actual #SDK implementation of `Pair` to compute the #DFG, but not if it uses the decoy.
+ The last version tests for Hidden #API shadowing.
As in the third version, we store data in `com.android.okhttp.Request` and then retrieve it.
Again, the shadowing implementation discards the data.
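
The following listing is a minimal sketch of the decoy idea used in the third and fourth versions, not the code of the test applications. In the real attack, the decoy carries the same name as the platform class (for example `android.util.Pair`). Here, the two hypothetical classes `SdkLikePair` and `DecoyPair` have distinct names so that both behaviours can be run side by side on a plain JVM, and `SdkLikePair` is only a stand-in that assumes the platform class keeps its constructor arguments.

```java
// Sketch only: all class and method names below are hypothetical.
final class PairShadowSketch {

    /** Stand-in for the platform class: keeps both constructor arguments. */
    static final class SdkLikePair<F, S> {
        public final F first;
        public final S second;

        SdkLikePair(F first, S second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Decoy with the same shape: the constructor discards its arguments. */
    static final class DecoyPair<F, S> {
        public final F first;
        public final S second;

        DecoyPair(F first, S second) {
            this.first = null;
            this.second = null;
        }
    }

    /** What happens at runtime if the platform-like class is the one loaded. */
    static String hideFlowWithSdkPair(String personalData) {
        SdkLikePair<String, String> pair =
            new SdkLikePair<String, String>(personalData, "unused");
        return pair.first;
    }

    /** What a tool sees if it analyses the decoy embedded in the application. */
    static String hideFlowWithDecoy(String personalData) {
        DecoyPair<String, String> pair =
            new DecoyPair<String, String>(personalData, "unused");
        return pair.first;
    }

    public static void main(String[] args) {
        String secret = "personal data";
        System.out.println(hideFlowWithSdkPair(secret));
        System.out.println(hideFlowWithDecoy(secret));
    }
}
```

A tool that resolves the class to the decoy concludes that the value returned by `hide_flow` no longer depends on its argument, whereas the value does reach the sink when the platform implementation is the one loaded.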
@ -113,7 +113,7 @@ In @tab:cl-results, we report on the types of shadowing that can trick each tool
A plain circle is a shadow attack that leads to a wrong result.
A white circle indicates a tool emitting warnings or that displays the two versions of the class.
A cross is a tool not impacted by a shadow attack.
We explain in more detail in the following the results for each considered tool.
//We explain in more detail in the following the results for each considered tool.
#figure({
table(
@ -128,7 +128,7 @@ We explain in more detail in the following the results for each considered tool.
table.vline(end: 3),
table.vline(start: 4),
table.cell(colspan: 3)[Shadow Attack],
[Self], [SDK], [Hidden],
[Self], [#SDK], [Hidden],
),
table.cell(colspan: 5, inset: 3pt)[],
table.hline(),
@ -149,22 +149,22 @@ We explain in more detail in the following the results for each considered tool.
==== Jadx
Jadx is a reverse engineering tool that regenerates the Java source code of an application.
It processes all the classes present in the application, but only save/display one class by name, even if two versions are present in multiple #dexfiles.
//Jadx is a reverse engineering tool that regenerates the Java source code of an application.
Jadx processes all the classes present in the application, but only saves/displays one class by name, even if two versions are present in multiple #dexfiles.
Nevertheless, when multiple classes with the same name are found, Jadx reports it in a comment added to the generated Java source code.
This warning stipulates that a possible collision exists and lists the files that contain the different versions of the class.
Unfortunately, after reviewing the code of Jadx, we believe that the selection of the displayed class is an undefined behavior.
At least for the version 1.5.0 that we tested, we found that Jadx selects the wrong implementation when a class with the same name is present.
For example in `classes2.dex` and `classes3.dex`.
Unfortunately, after reviewing the code of Jadx, we believe that the selection of the displayed class is undefined behaviour.
At least for version 1.5.0, which we tested, we found that Jadx selects the wrong implementation when several classes with the same name are present, for example, in `classes2.dex` and `classes3.dex`.
We report it with a "#warn" because warnings are issued.
//Using #hidec does not affect Jadx beyond the fact that #hidec are not decompiled, which is to be expected by the user anyway.
Shadowing #Asdk and #hidec is possible in Jadx: there is only one implementation of the class in the application and Jadx does not have a list of the internal classes of Android: no warning is issued to the reverser that the displayed class is not the one used by Android.
Shadowing #Asdk and #hidec is possible in Jadx: there is only one implementation of the class in the application, and Jadx does not have a list of the internal classes of Android, so no warning is issued to the reverser that the displayed class is not the one used by Android.
==== Apktool
Apktool generates Smali files, an assembler language for #DEX bytecode.
//Apktool generates Smali files, an assembler language for #DEX bytecode.
Apktool will store the disassembled classes in a folder that matches the #dexfile that stores the bytecode.
This means that when shadowing a class with two versions in two #dexfiles, the shadow implementations will be disassembled into two directories.
No indication is displayed that a collision is possible.
@ -178,8 +178,8 @@ Androguard has different usages, with different levels of analysis.
The documentation highlights the analysis commands that compute three types of objects: an #APK object, a list of #DEX objects, and an Analysis object.
The #APK and the list of #dexfiles are a one-to-one representation of the content of an application, and have the same issues that we discussed with Apktool: they provide the different versions of a shadow class contained in multiple #dexfiles.
The Analysis object is used to compute a method call graph and we found that this algorithm may choose the wrong version of a shadowed class when using the cross references that are computed.
This leads to an invalid call graph as shown in @fig:cl-andro_obf_cg: the two methods `doSomething()` are represented in the graph, but the one linked to `main()` on the graph is the one calling the method `good()` when in fact the method `bad()` is called when running the application.
The Analysis object is used to compute a method call graph, and we found that this algorithm may choose the wrong version of a shadowed class when using the cross-references that are computed.
This leads to an invalid call graph, as shown in @fig:cl-andro_obf_cg: the two methods `doSomething()` are represented in the graph, but the one linked to `main()` on the graph is the one calling the method `good()` when in fact the method `bad()` is called when running the application.
Androguard has a method `.is_external()` to detect if the implementation of a class is not provided inside the application and a method `.is_android_api()` to detect if the class is part of the Android #API.
Regrettably, the documentation of `.is_android_api()` explains that the method is still experimental and just checks a few package names.
@ -224,11 +224,12 @@ Because of that, like for Apktool and Jadx, Androguard has no way to warn the re
) <fig:cl-andro_obf_cg>
])
h(1em)},
caption: [Call Graphs of an application calling `Main.bad()` from a shadowed `Obfuscation` class.],
caption: [Call Graphs of an application calling `Main.bad()` from a shadowed `Obfuscation` class],
)<fig:cl-androguard_call_graph>
==== Flowdroid
/*
#jfl-note[Flowdroid~@Arzt2014a is used to detect if an application can leak sensitive information.
To do so, the analyst provides a list of source and sink methods.
The return value of a method marked as source is considered sensitive and the argument of a method marked as sink is considered to be leaked.
@ -237,22 +238,24 @@ Flowdroid is built on top of the Soot~@Arzt2013 framework that handles, among ot
already said in chap. 2?
No, but we should have; it will come, and this will need to be modified at that point
]
]*/
We found that when selecting the implementation of a class in a multi-dex #APK, Soot uses an algorithm close to what #ART is performing:
Soot sorts the `.dex` bytecode files with a specified `prioritizer` (a comparison function that defines an order for #dexfiles) and selects the first implementation found when iterating over the sorted files.
Unfortunately, the `prioritizer` used by Soot is not exactly the same as the one used by #ART.
The Soot `prioritizer` will give priority to `classes.dex` and then give priority to files whose name starts with `classes` over other files and finally will use the alphabetical order.
This order is good enough for application with a small number of #dexfiles generated by Android Studio, but because it uses the alphabetical order and does not check the exact format used by Android, a malicious developer could hide the implementation of a class in `classes2.dex` by putting a false implementation in `classes0.dex`, `classes1.dex` or `classes12.dex`.
The Soot `prioritizer` will give priority to `classes.dex` and then give priority to files whose name starts with `classes` over other files, and finally will use alphabetical order.
This order is good enough for applications with a small number of #dexfiles generated by Android Studio, but because it uses alphabetical order and does not check the exact format used by Android, a malicious developer could hide the implementation of a class in `classes2.dex` by putting a false implementation in `classes0.dex`, `classes1.dex` or `classes12.dex`.
Because Flowdroid is based on Soot, it inherits this issue from it.
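
The difference between the two orders can be illustrated with the following sketch. It is not the code of Soot or of #ART: `SOOT_LIKE_ORDER` only transcribes the ordering described above, and `androidOrder` assumes the usual multi-dex naming scheme (`classes.dex`, then `classes2.dex`, `classes3.dex`, and so on, stopping at the first missing index). All names in the listing are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Sketch only: transcribes the orders described in the text, not tool code.
final class DexOrderSketch {

    /** `classes.dex` first, then names starting with "classes", then alphabetical. */
    static final Comparator<String> SOOT_LIKE_ORDER = (a, b) -> {
        if (a.equals(b)) return 0;
        if (a.equals("classes.dex")) return -1;
        if (b.equals("classes.dex")) return 1;
        boolean aIsClasses = a.startsWith("classes");
        boolean bIsClasses = b.startsWith("classes");
        if (aIsClasses != bIsClasses) return aIsClasses ? -1 : 1;
        return a.compareTo(b);
    };

    /** Files in the order in which the Soot-like prioritizer visits them. */
    static List<String> sootLikeOrder(Collection<String> dexNames) {
        List<String> sorted = new ArrayList<>(dexNames);
        sorted.sort(SOOT_LIKE_ORDER);
        return sorted;
    }

    /** Assumed Android order: classes.dex, classes2.dex, ... until one is missing. */
    static List<String> androidOrder(Collection<String> dexNames) {
        List<String> loaded = new ArrayList<>();
        if (!dexNames.contains("classes.dex")) {
            return loaded;
        }
        loaded.add("classes.dex");
        for (int i = 2; dexNames.contains("classes" + i + ".dex"); i++) {
            loaded.add("classes" + i + ".dex");
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> dexNames =
            Arrays.asList("classes2.dex", "classes0.dex", "classes.dex");
        System.out.println("Soot-like order: " + sootLikeOrder(dexNames));
        System.out.println("Android order:   " + androidOrder(dexNames));
    }
}
```

With these three file names, the Soot-like order visits `classes0.dex` before `classes2.dex`, while the assumed Android order never considers `classes0.dex`. A class defined in both files is therefore resolved differently by the two orders.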
// TODO This could use more investigation
In addition to self shadowing, Flowdroid is sensitive to the use of #platc, as it needs the bytecode of those classes to be able to track data flows.
This is solved for #SDK classes by providing `android.jar` to Flowdroid.
Flowdroid gives priority to the classes from the #SDK over the classes implemented in the application, thus defeating #SDK shadow attacks.
Unfortunately, `android.jar` only contains classes from the #Asdk, meaning that using #hidec breaks the flow tracking.
Solving this issue would require finding the bytecode of all the platform classes of the Android version targeted and as we said previously it requires extracting this information from the emulator.
In addition to self-shadowing, Flowdroid is sensitive to the use of #platc, as it needs the bytecode of those classes to be able to track data flows.
//This is solved for #SDK classes by providing `android.jar` to Flowdroid.
Flowdroid does have a record of #SDK classes, and gives priority to the actual #SDK classes over the classes implemented in the application, thus defeating #SDK shadow attacks.
//Unfortunately, `android.jar` only contains classes from the #Asdk, meaning that using #hidec breaks the flow tracking.
Unfortunately, Flowdroid does not have a record of all platform classes, meaning that using #hidec breaks the flow tracking.
Solving this issue would require finding the bytecode of all the platform classes of the targeted Android version, which, as we said previously, requires extracting this information from an emulator or a phone.
#v(2em)
We have seen that tools can be impacted by shadow attacks. In the next section, we will investigate if these attacks are used in the wild.
We have seen that tools can be impacted by shadow attacks. In the next section, we will investigate whether these attacks are used in the wild.