wip
Jean-Marie Mineau 2025-07-22 16:53:39 +02:00
parent ea82a3ca8b
commit d9ab1b8d6a
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
3 changed files with 182 additions and 8 deletions


@ -1,4 +1,5 @@
#import "../lib.typ": todo, APK
#import "@preview/diagraph:0.3.3": raw-render
== Android Reverse Engineering Techniques <sec:bg-techniques>
@ -14,13 +15,117 @@ For malware, dynamic analysis is also limited by evading techniques that may pre
=== Static Analysis <sec:bg-static>
Static analysis tools are used to perform operations on an #APK file, such as extracting its bytecode or information from the `AndroidManifest.xml` file. Static analysis programs examine an #APK file without executing it to extract information from it.
Basic static analysis can include extracting information from the `AndroidManifest.xml` file or decompiling bytecode to Java code.
#todo[Explain control flow graph, data flow graph, and link to tools?] More advanced analyses consist in computing the control flow of an application and its data flow@Li2017.
A classic goal of static analysis is to compute data flows to detect potential information leaks@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015 by analyzing the bytecode of an Android application. The most basic form of control-flow analysis is to build a call graph.
A call graph is a graph where the nodes represent the methods in the application, and the edges represent calls from one method to another.
@fig:bg-fizzbuzz-cg-cfg b) shows the call graph of the code in @fig:bg-fizzbuzz-cg-cfg a).
A more advanced control-flow analysis consists in building the control-flow graph.
This time, instead of methods, the nodes represent instructions, and the edges indicate which instruction can follow which.
@fig:bg-fizzbuzz-cg-cfg c) represents the control-flow graph of @fig:bg-fizzbuzz-cg-cfg a), with code statements instead of bytecode instructions.
#figure({
set align(center)
stack(dir: ttb,[
#figure(
```java
public static void fizzBuzz(int n) {
for (int i = 1; i <= n; i++) {
if (i % 3 == 0 && i % 5 == 0) {
Buzzer.fizzBuzz();
} else if (i % 3 == 0) {
Buzzer.fizz();
} else if (i % 5 == 0) {
Buzzer.buzz();
} else {
Log.e("fizzbuzz", String.valueOf(i));
}
}
}
```,
supplement: none,
kind: "bg-fizzbuzz-cg-cfg subfig",
caption: [a) A Java program],
) <fig:bg-fizzbuzz-java>], v(2em), stack(dir: ltr, [
#figure(
raw-render(```
digraph {
rankdir=LR
"fizzBuzz(int)" -> "Buzzer.fizzBuzz()"
"fizzBuzz(int)" -> "Buzzer.fizz()"
"fizzBuzz(int)" -> "Buzzer.buzz()"
"fizzBuzz(int)" -> "String.valueOf(int)"
"fizzBuzz(int)" -> "Log.e(String, String)"
}
```,
width: 40%
),
supplement: none,
kind: "bg-fizzbuzz-cg-cfg subfig",
caption: [b) Corresponding Call Graph]
) <fig:bg-fizzbuzz-cg>],[
#figure(
raw-render(```
digraph {
l1
l2
l3
l4
l5
l6
l7
l9
l1 -> l2
l2 -> l3
l3 -> l1
l2 -> l4
l4 -> l5
l5 -> l1
l4 -> l6
l6 -> l7
l7 -> l1
l6 -> l9
l9 -> l1
}
```,
labels: (
"l1": `for (int i = 1; i <= n; i++) {`,
"l2": `if (i % 3 == 0 && i % 5 == 0) {`,
"l3": `Buzzer.fizzBuzz();`,
"l4": `} else if (i % 3 == 0) {`,
"l5": `Buzzer.fizz();`,
"l6": `} else if (i % 5 == 0) {`,
"l7": `Buzzer.buzz();`,
"l9": `Log.e("fizzbuzz", String.valueOf(i));`,
),
width: 50%
),
supplement: none,
kind: "bg-fizzbuzz-cg-cfg subfig",
caption: [c) Corresponding Control-Flow Graph]
) <fig:bg-fizzbuzz-cfg>]))
h(1em)},
supplement: [Figure],
caption: [Source code for a simple Java method and its Call and Control Flow Graphs],
)<fig:bg-fizzbuzz-cg-cfg>
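
To make the notion of call graph more concrete, the following is a minimal sketch in Java that stores the call graph of the `fizzBuzz` example as an adjacency map and lists the methods reachable from an entry point. The class `CallGraphSketch` and its methods are hypothetical names chosen for illustration; real tools build this graph from bytecode, not by hand.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy call graph: nodes are method names, edges mean "caller calls callee".
// All names are illustrative and not taken from any analysis framework.
public class CallGraphSketch {
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    /** Records that `caller` contains a call to `callee`. */
    public void addCall(String caller, String callee) {
        edges.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(callee);
        edges.computeIfAbsent(callee, k -> new LinkedHashSet<>());
    }

    /** Returns every method reachable from `entry` (including `entry`). */
    public Set<String> reachableFrom(String entry) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String method = worklist.pop();
            if (seen.add(method)) {
                for (String callee : edges.getOrDefault(method, Set.of())) {
                    worklist.push(callee);
                }
            }
        }
        return seen;
    }

    /** Builds the call graph of the fizzBuzz example by hand. */
    public static CallGraphSketch fizzBuzzExample() {
        CallGraphSketch graph = new CallGraphSketch();
        graph.addCall("fizzBuzz(int)", "Buzzer.fizzBuzz()");
        graph.addCall("fizzBuzz(int)", "Buzzer.fizz()");
        graph.addCall("fizzBuzz(int)", "Buzzer.buzz()");
        graph.addCall("fizzBuzz(int)", "String.valueOf(int)");
        graph.addCall("fizzBuzz(int)", "Log.e(String, String)");
        return graph;
    }

    public static void main(String[] args) {
        System.out.println(fizzBuzzExample().reachableFrom("fizzBuzz(int)"));
    }
}
```

Reachability from the entry points is one of the simplest questions a call graph can answer: a method that is not reachable from any entry point cannot be executed through ordinary calls.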
Once the control-flow graph is computed, it can be used to compute data-flows.
Data-flow analysis, also called taint tracking, makes it possible to follow the flow of information in the application.
By defining a list of methods and fields that can generate critical information (taint sources) and a list of methods that can consume information (taint sinks), taint tracking can detect potential data leaks (when a data flow links a taint source to a taint sink).
For example, `TelephonyManager.getImei()` returns a unique, persistent device identifier.
This identifier can be used to identify the user and cannot be changed if compromised.
This makes `TelephonyManager.getImei()` a good candidate for a taint source.
On the other hand, `UrlRequest.start()` sends a request to an external server, making it a taint sink.
If a data flow is found linking `TelephonyManager.getImei()` to `UrlRequest.start()`, this means the application is potentially leaking critical information to an external entity, a behavior that the user probably does not want.
Data-flow analysis is the subject of many contributions@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable being FlowDroid@Arzt2014a.
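
The source and sink idea can be sketched with a deliberately simplified model. The Java code below assumes a straight-line program made of statements of the form `target = method(args)`, and conservatively treats the result of a call as tainted as soon as one argument is tainted. The class `TaintSketch`, the `Stmt` type and the `Helper` methods are hypothetical; only the source and sink names come from the text above. Real tools such as FlowDroid work on bytecode and handle branches, fields and aliasing, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy forward taint propagation over a straight-line list of statements.
public class TaintSketch {
    /** Simplified statement `target = method(args...)`; target may be null. */
    public static final class Stmt {
        final String target;
        final String method;
        final List<String> args;

        public Stmt(String target, String method, List<String> args) {
            this.target = target;
            this.method = method;
            this.args = args;
        }
    }

    static final Set<String> SOURCES = Set.of("TelephonyManager.getImei()");
    static final Set<String> SINKS = Set.of("UrlRequest.start()");

    /** Returns the sink calls that receive data derived from a source. */
    public static List<String> findLeaks(List<Stmt> program) {
        Set<String> tainted = new HashSet<>();
        List<String> leaks = new ArrayList<>();
        for (Stmt s : program) {
            boolean taintedInput = s.args.stream().anyMatch(tainted::contains);
            if (SINKS.contains(s.method) && taintedInput) {
                leaks.add(s.method);
            }
            if (s.target != null) {
                if (SOURCES.contains(s.method) || taintedInput) {
                    tainted.add(s.target);
                } else {
                    tainted.remove(s.target); // overwritten with untainted data
                }
            }
        }
        return leaks;
    }

    /** Hypothetical program: the IMEI is concatenated to a URL, then sent. */
    public static List<Stmt> example() {
        return List.of(
            new Stmt("imei", "TelephonyManager.getImei()", List.of("tm")),
            new Stmt("url", "Helper.buildUrl(String, String)", List.of("base", "imei")),
            new Stmt("req", "Helper.buildRequest(String)", List.of("url")),
            new Stmt(null, "UrlRequest.start()", List.of("req")));
    }

    public static void main(String[] args) {
        System.out.println(findLeaks(example())); // prints [UrlRequest.start()]
    }
}
```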
Static analysis is powerful as it can detect unwanted behavior in an application even if the behavior does not manifest itself when running the application.
However, static analysis tools must overcome many challenges when analysing Android applications:
/ the Java object-oriented paradigm: A call to a method can in fact correspond to a call to any method overriding the original method in subclasses
/ the multiplicity of entry points: Each component of an application can be an entry point for the application
/ the event driven architecture: Methods in the application can be called in many different orders depending on external events
/ the interleaving of native code and bytecode: Native code can be called from bytecode and vice versa, but tools often only handle one of those formats
@ -29,13 +134,19 @@ Static analysis tools for Android application must overcom many difficulties:
/ the continual evolution of Android: each new version brings new features that analysis tools must be aware of
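
The difficulty caused by method overriding, listed above, can be illustrated with a sketch in the spirit of class hierarchy analysis: for a call on a receiver of a given static type, every override declared in a subtype is kept as a possible target. The Java code below is a simplified illustration; the class `DispatchSketch` and the `Sender` hierarchy are hypothetical, and interfaces, abstract methods and visibility are ignored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy resolution of virtual calls from a class hierarchy.
public class DispatchSketch {
    private final Map<String, String> parent = new HashMap<>();        // class -> superclass
    private final Map<String, Set<String>> declared = new HashMap<>(); // class -> declared methods

    public void addClass(String name, String superName, String... methods) {
        parent.put(name, superName);
        declared.put(name, new HashSet<>(Arrays.asList(methods)));
    }

    private boolean isSubtypeOf(String cls, String ancestor) {
        for (String c = cls; c != null; c = parent.get(c)) {
            if (c.equals(ancestor)) return true;
        }
        return false;
    }

    /** Implementations a call to `method` may reach for a receiver of static type `type`. */
    public Set<String> possibleTargets(String type, String method) {
        Set<String> targets = new TreeSet<>();
        // Implementation declared or inherited by the static type itself.
        for (String c = type; c != null; c = parent.get(c)) {
            if (declared.getOrDefault(c, Set.of()).contains(method)) {
                targets.add(c + "." + method);
                break;
            }
        }
        // Overrides declared in any subtype of the static type.
        for (String c : declared.keySet()) {
            if (isSubtypeOf(c, type) && declared.get(c).contains(method)) {
                targets.add(c + "." + method);
            }
        }
        return targets;
    }

    /** Hypothetical hierarchy used only for this illustration. */
    public static DispatchSketch example() {
        DispatchSketch h = new DispatchSketch();
        h.addClass("Sender", null, "send()");
        h.addClass("HttpSender", "Sender", "send()");
        h.addClass("SmsSender", "Sender", "send()");
        h.addClass("LoggingHttpSender", "HttpSender");
        return h;
    }

    public static void main(String[] args) {
        // prints [HttpSender.send(), Sender.send(), SmsSender.send()]
        System.out.println(example().possibleTargets("Sender", "send()"));
    }
}
```

A single call site on a `Sender` therefore produces three edges in the call graph, which shows how overriding quickly increases the size and the imprecision of the graph.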
The tools can share the backend used to interact with the bytecode.
For example, Apktool is often called in a subprocess to extract the bytecode, and the Soot framework is commonly used both to analyse bytecode and to modify it.
Soot@Arzt2013 is a Java framework that allows manipulating the bytecode through an object representation of instructions.
The best-known tool built on top of Soot is FlowDroid@Arzt2014a, which statically computes information flows in the code.
=== Dynamic Analysis <sec:bg-dynamic>
#todo[there is still work to do]
- #todo[evasion: droid DroidDungeon @ruggia_unmasking_2024]
- #todo[DroidScope@droidscope180237 and CopperDroid@Tam2015]
- #todo[Xposed: DroidHook / Mirage: Toward a stealthier and modular malware analysis sandbox for android]
- #todo[Frida: CamoDroid]
- #todo[modified android framework: RealDroid]
=== Hybrid Analysis <sec:bg-hybrid>
- #todo[DyDroid, audit of Dynamic Code Loading@qu_dydroid_2017]


@ -2,7 +2,7 @@
/*
* Talk about dex lego and the paper that encodes the results of anger in jimple
* argggg https://dl.acm.org/doi/10.1145/2931037.2931044 is verrryyyyy close
*
*/


@ -916,3 +916,66 @@
pages = {423--426},
file = {IEEE Xplore Abstract Record:/home/histausse/Zotero/storage/QEQLZHMD/7129009.html:text/html;Kriz and Maly - 2015 - Provisioning of application modules to Android dev.pdf:/home/histausse/Zotero/storage/8GRUYQLQ/Kriz and Maly - 2015 - Provisioning of application modules to Android dev.pdf:application/pdf},
}
@inproceedings{ruggia_unmasking_2024,
address = {New York, NY, USA},
series = {{ASIA} {CCS} '24},
title = {Unmasking the {Veiled}: {A} {Comprehensive} {Analysis} of {Android} {Evasive} {Malware}},
isbn = {979-8-4007-0482-6},
shorttitle = {Unmasking the {Veiled}},
url = {https://dl.acm.org/doi/10.1145/3634737.3637658},
doi = {10.1145/3634737.3637658},
abstract = {Since Android is the most widespread operating system, malware targeting it poses a severe threat to the security and privacy of millions of users and is increasing from year to year. The response from the community was swift, and many researchers have ventured to defend this system. In this cat-and-mouse game, attackers pay special attention to flying under the radar of analysis tools, and the techniques to understand whether their app is under analysis have become more and more sophisticated. Moreover, these evasive techniques are also adopted by benign apps to deter reverse engineering, making this phenomenon pervasive in the Android app ecosystem.While the scientific literature has proposed many evasive techniques and investigated their impact, one aspect still needs to be studied: how and to what extent Android apps, both malware and goodware, use such controls. This paper fills this gap by introducing a comprehensive taxonomy of evasive controls for the Android ecosystem and a proof-of-concept app that implements them all. We release the app as open source to help researchers and practitioners to assess whether their app analysis systems are sufficiently resilient to known evasion techniques. We also propose DroidDungeon, a novel probe-based sandbox, which circumvents evasive techniques thanks to a substantial engineering effort, making the apps under analysis believe they are running on an actual device. To the best of our knowledge, currently, DroidDungeon is the only solution providing anti-evasion capabilities, maintainability, and scalability at once.Using our sandbox, we studied evasive controls in both benign and malicious Android apps, revealing insights about their purpose, differences, and relationships between evasive controls and packers/protectors. Finally, we analyzed how the execution of an app differs depending on the presence or absence of evasive counter-measures. 
Our main finding is that 14\% and 4\% of malicious and benign samples refrain from running in an analysis environment that does not correctly mitigate evasive controls.},
urldate = {2025-07-22},
booktitle = {Proceedings of the 19th {ACM} {Asia} {Conference} on {Computer} and {Communications} {Security}},
publisher = {Association for Computing Machinery},
author = {Ruggia, Antonio and Nisi, Dario and Dambra, Savino and Merlo, Alessio and Balzarotti, Davide and Aonzo, Simone},
month = jul,
year = {2024},
pages = {383--398},
file = {Full Text PDF:/home/histausse/Zotero/storage/V5LLQ8SP/Ruggia et al. - 2024 - Unmasking the Veiled A Comprehensive Analysis of Android Evasive Malware.pdf:application/pdf},
}
@inproceedings {droidscope180237,
author = {Lok Kwong Yan and Heng Yin},
title = {{DroidScope}: Seamlessly Reconstructing the {OS} and Dalvik Semantic Views for Dynamic Android Malware Analysis},
booktitle = {21st USENIX Security Symposium (USENIX Security 12)},
year = {2012},
isbn = {978-931971-95-9},
address = {Bellevue, WA},
pages = {569--584},
url = {https://www.usenix.org/conference/usenixsecurity12/technical-sessions/presentation/yan},
publisher = {USENIX Association},
month = aug
}
@inproceedings{Tam2015,
address = {San Diego, California, USA},
title = {{CopperDroid}: {Automatic} {Reconstruction} of {Android} {Malware} {Behaviors}},
abstract = {Mobile devices and their application marketplaces drive the entire economy of the todays mobile landscape. Android platforms alone have produced staggering revenues, exceeding five billion USD, which has attracted cybercriminals and increased malware in Android markets at an alarming rate. To better understand this slew of threats, we present CopperDroid , an automatic VMI-based dynamic analysis system to reconstruct the behaviors of Android malware. The novelty of CopperDroid lies in its agnostic approach to identify interesting OS- and high-level Android-specific behaviors. It reconstructs these behaviors by observing and dissecting system calls and, therefore, is resistant to the multitude of alterations the Android runtime is subjected to over its life-cycle. CopperDroid automatically and accurately reconstructs events of interest that describe, not only well-known process-OS interactions (e.g., file and process creation), but also complex intra- and inter-process communications (e.g., SMS reception), whose semantics are typically contextualized through complex Android objects. Because CopperDroid s reconstruction mechanisms are agnostic to the underlying action invocation methods, it is able to capture actions initiated both from Java and native code execution. CopperDroid s analysis generates detailed behavioral profiles that abstract a large stream of low-level—often uninteresting—events into concise, high-level semantics, which are well-suited to provide insightful behavioral traits and open the possibility to further research directions. We carried out an extensive evaluation to assess the capabilities and performance of CopperDroid on more than 2,900 Android malware samples. Our experiments show that CopperDroid faithfully reconstructs OS- and Android-specific behaviors. Additionally, we demonstrate how CopperDroid can be leveraged to disclose additional behaviors through the use of a simple, yet effective, app stimulation technique. 
Using this technique, we successfully triggered and disclosed additional behaviors on more than 60\% of the analyzed malware samples. This qualitatively demonstrates the versatility of CopperDroid s ability to improve dynamic-based code coverage.},
booktitle = {22nd {Annual} {Network} and {Distributed} {System} {Security} {Symposium}},
publisher = {The Internet Society},
author = {Tam, Kimberly and Khan, Salahuddin and Fattori, Aristide and Cavallaro, Lorenzo},
month = feb,
year = {2015},
file = {PDF:/home/histausse/Zotero/storage/7TF382QC/Tam et al. - 2015 - CopperDroid Automatic Reconstruction of Android Malware Behaviors.pdf:application/pdf},
}
@inproceedings{qu_dydroid_2017,
title = {{DyDroid}: {Measuring} {Dynamic} {Code} {Loading} and {Its} {Security} {Implications} in {Android} {Applications}},
shorttitle = {{DyDroid}},
url = {https://ieeexplore.ieee.org/abstract/document/8023141},
doi = {10.1109/DSN.2017.14},
abstract = {Android has provided dynamic code loading (DCL) since API level one. DCL allows an app developer to load additional code at runtime. DCL raises numerous challenges with regards to security and accountability analysis of apps. While previous studies have investigated DCL on Android, in this paper we formulate and answer three critical questions that are missing from previous studies: (1) Where does the loaded code come from (remotely fetched or locally packaged), and who is the responsible entity to invoke its functionality? (2) In what ways is DCL utilized to harden mobile apps, specifically, application obfuscation? (3) What are the security risks and implications that can be found from DCL in off-the-shelf apps? We design and implement DyDroid, a system which uses both dynamic and static analysis to analyze dynamically loaded code. Dynamic analysis is used to automatically exercise apps, capture DCL behavior, and intercept the loaded code. Static analysis is used to investigate malicious behavior and privacy leakage in that dynamically loaded code. We have used DyDroid to analyze over 46K apps with little manual intervention, allowing us to conduct a large-scale measurement to investigate five aspects of DCL, such as source identification, malware detection, vulnerability analysis, obfuscation analysis, and privacy tracking analysis. We have several interesting findings. (1) 27 apps are found to violate the content policy of Google Play by executing code downloaded from remote servers. (2) We determine the distribution, pros/cons, and implications of several common obfuscation methods, including DEX encryption/loading. (3) DCL's stealthiness enables it to be a channel to deploy malware, and we find 87 apps loading malicious binaries which are not detected by existing antivirus tools. (4) We found 14 apps that are vulnerable to code injection attacks due to dynamically loading code which is writable by other apps. 
(5) DCL is mainly used by third-party SDKs, meaning that app developers may not know what sort of sensitive functionality is injected into their apps.},
urldate = {2024-04-30},
booktitle = {2017 47th {Annual} {IEEE}/{IFIP} {International} {Conference} on {Dependable} {Systems} and {Networks} ({DSN})},
author = {Qu, Zhengyang and Alam, Shahid and Chen, Yan and Zhou, Xiaoyong and Hong, Wangjun and Riley, Ryan},
month = jun,
year = {2017},
note = {ISSN: 2158-3927},
keywords = {Security, Android, Androids, Google, Humanoid robots, Malware, Dynamic analysis, Dynamic Code Loading, Loading, Measurement, Mobile security, Runtime, Smartphone},
pages = {415--426},
file = {IEEE Xplore Abstract Record:/home/histausse/Zotero/storage/RFUDH972/8023141.html:text/html;Qu et al. - 2017 - DyDroid Measuring Dynamic Code Loading and Its Se.pdf:/home/histausse/Zotero/storage/27Z9P5T4/Qu et al. - 2017 - DyDroid Measuring Dynamic Code Loading and Its Se.pdf:application/pdf},
}