wip
Jean-Marie Mineau 2025-07-21 22:00:29 +02:00
parent fd4d6fa239
commit ea82a3ca8b
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
10 changed files with 119 additions and 98 deletions


@ -1,19 +1,25 @@
#import "../lib.typ": todo
#import "../lib.typ": todo, epigraph
= Introduction <sec:intro>
#todo[Write an introduction]
// https://youtu.be/si9iqF5uTFk?t=1512
#epigraph("Rear Admiral Grace Hopper")[If during the next 12 months any one of you says "but we have always done it that way", I will instantly materialize beside you and I will haunt you for 24 hours.]
// Since the dawn of time, people have made Android apps ...
Android has been the most used mobile operating system since 2014 and, since 2017, it has even surpassed Windows across all platforms combined#footnote[https://gs.statcounter.com/os-market-share#monthly-200901-202304].
The public adoption of Android is confirmed by application developers, with 1.3 million apps available in the Google Play Store in 2014 and 3.5 million apps available in 2017#footnote[https://www.statista.com/statistics/266210].
Its popularity makes Android a prime target for malware developers.
For example, various applications have been shown to steal personal information@shanSelfhidingBehaviorAndroid2018.
Consequently, Android has also been an important subject for security research.
#todo[expand the "Since the dawn of time, people ..." opening]
#todo[Introduce the problem statement:]
#todo[1) results too good on easy datasets]
#todo[2) easy to fool: shadow attacks]
#todo[3) cannot handle dynamic loading and reflection]
/*
*
* Since the dawn of time, people have made Android apps ...
*
* Introduce the notion of a reverser who wants to analyse an app
*
* Android analysis tools are problematic:
* - results too good on easy datasets
* - easy to fool: shadow attacks
* - cannot handle dynamic loading and reflection
*
* Problem statement: todo
*/


@ -57,12 +57,10 @@ In addition to decompilling #DEX files, Jadx can also decode Android manifests a
=== Soot <sec:bg-soot>
#todo[soot ref]
Soot#footnote[https://github.com/soot-oss/soot] is a Java optimization framework.
Soot#footnote[https://github.com/soot-oss/soot] @Arzt2013 is a Java optimization framework.
It can lift Java bytecode to other intermediate representations, which can be used to perform optimizations and then be converted back to bytecode.
Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse Android applications.
One of the best known example of Soot usage for Android analysis is Flowdroid #todo[ref], a tool that compute data flow in an application.
One of the best-known examples of Soot usage for Android analysis is Flowdroid@Arzt2014a, a tool that computes data flows in an application.
A new version of Soot, SootUp#footnote[https://github.com/soot-oss/SootUp], is currently being developed.
Compared to Soot, it has a modernized interface and architecture, but it is not yet feature-complete and some tools like Flowdroid are still using Soot.


@ -0,0 +1,41 @@
#import "../lib.typ": todo, APK
== Android Reverse Engineering Techniques <sec:bg-techniques>
#todo[swap with tool section ?]
In the past fifteen years, the research community released many tools to detect or analyze malicious behaviors in applications.
Two main approaches can be distinguished: static and dynamic analysis@Li2017.
Dynamic analysis requires running the application in a controlled environment to observe runtime values and/or interactions with the operating system.
For example, an Android emulator with a patched kernel can capture these interactions, but applying the required modifications is not a trivial task.
Such an approach is limited by the time required to execute even a small part of the application, with no guarantee on the code coverage obtained.
For malware, dynamic analysis is also limited by evasion techniques that may prevent the execution of malicious parts of the code.
//As a consequence, a lot of efforts have been put in static approaches, which is the focus of this paper.
=== Static Analysis <sec:bg-static>
Static analysis tools are used to perform operations on an #APK file, like extracting its bytecode or information from the `AndroidManifest.xml` file.
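As a minimal sketch of such an operation (not taken from any of the tools discussed here), the following Python snippet treats an #APK file as the zip archive it is and lists its bytecode files.
The function name `summarize_apk` and the layout of the returned dictionary are illustrative assumptions.
The manifest inside an #APK is stored in a binary XML encoding, so the sketch only checks that the entry exists and does not try to decode it.

```python
import re
import zipfile

# Bytecode files at the root of the archive: classes.dex, classes2.dex, ...
DEX_NAME = re.compile(r"classes\d*\.dex")


def summarize_apk(path):
    """Return a small summary of the archive entries of an APK.

    `path` can be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as apk:
        names = apk.namelist()
    dex_files = sorted(name for name in names if DEX_NAME.fullmatch(name))
    return {
        # Only presence is checked: the manifest is binary XML.
        "has_manifest": "AndroidManifest.xml" in names,
        "dex_files": dex_files,
    }
```

Real tools go further and parse the content of these entries, which is where most of the difficulties listed in this section appear.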
#todo[Explain control flow graph, data flow graph, and link to tools?]
A classic goal of a static analysis is to compute data flows to detect potential information leaks@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015 by analyzing the bytecode of an Android application.
Static analysis tools for Android applications must overcome many difficulties:
/ the multiplicity of entry points: Each component of an application can be an entry point for the application
/ the event-driven architecture: Methods in the application can be called in many different orders depending on external events
/ the interleaving of native code and bytecode: Native code can be called from bytecode and vice versa, but tools often handle only one of those formats
/ the potential dynamic code loading: An application can run code that was not originally in the application
/ the use of reflection: Methods can be called from their name as a string object, which is not necessarily known statically
/ the continual evolution of Android: Each new version brings new features that analysis tools must be aware of
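
The reflection item can be illustrated with a small Python analogue.
Android applications would use `java.lang.reflect` instead, and every name in the snippet is hypothetical.
The name of the invoked method only exists as an encoded string until run time, so a tool that looks for the call target in the code without evaluating that string cannot resolve it.

```python
import base64


class Telephony:
    """Hypothetical stand-in for a class exposing sensitive data."""

    def get_device_id(self):
        return "device-id"


def call_by_name(target, encoded_name):
    # The method name is decoded at run time: it does not appear
    # in clear text at the call site.
    name = base64.b64decode(encoded_name).decode("ascii")
    return getattr(target, name)()


# Computed here for readability; an obfuscated application would
# ship only the encoded form of the name.
ENCODED_NAME = base64.b64encode(b"get_device_id")
result = call_by_name(Telephony(), ENCODED_NAME)
```

Here `result` holds the value returned by `get_device_id`, although no call to that method is written anywhere in the code.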
The tools can share the backend used to interact with the bytecode.
For example, Apktool is often called in a subprocess to extract the bytecode.
Another example is Soot@Arzt2013, a Java framework that allows manipulating the bytecode through an object representation of instructions.
The best-known tool built on top of Soot is FlowDroid@Arzt2014a, which statically computes information flows in the code.
=== Dynamic Analysis <sec:bg-dynamic>
#todo[there is work to do here]
=== Hybrid Analysis <sec:bg-hybrid>


@ -0,0 +1,23 @@
#import "../lib.typ": todo, etal, APK
== Application Datasets <sec:bg-datasets>
Determining whether an application contains a possible information flow is an example of a static analysis goal.
Some datasets have been built especially for evaluating tools that compute information flows inside Android applications.
One of the first well-known datasets is DroidBench, which was released with the tool Flowdroid@Arzt2014a.
Later, the dataset ICC-Bench was introduced with the tool Amandroid@weiAmandroidPreciseGeneral2014 to complement DroidBench by introducing applications using Inter-Component data flows.
These datasets contain carefully crafted applications containing flows that the tools should be able to detect.
These hand-crafted applications can also be used for testing purposes or to detect any regression when the software code evolves.
Contrary to real world applications, the behavior of these hand-crafted applications is known in advance, thus providing the ground truth that the tools try to compute.
However, these datasets are not representative of real-world applications@Pendlebury2018 and the obtained results can be misleading.
Contrary to DroidBench and ICC-Bench, some approaches use real-world applications.
Bosu #etal@bosuCollusiveDataLeak2017 use DIALDroid to perform a threat analysis of Inter-Application communication and published DIALDroid-Bench, an associated dataset.
Similarly, Luo #etal released TaintBench@luoTaintBenchAutomaticRealworld2022, a real-world dataset, along with recommendations for building such a dataset.
These datasets are useful for carefully spotting missing taint flows, but contain only a few dozen applications.

In addition to those datasets, AndroZoo@allixAndroZooCollectingMillions2016 collects applications from several application marketplaces, including the Google Play store (the official Google application store), Anzhi and AppChina (two Chinese stores), and F-Droid (a store dedicated to free and open source applications).
Currently, AndroZoo contains more than 25 million applications, which researchers can download using the SHA256 hash of the application.
AndroZoo provides additional information about the applications, like the date the application was detected for the first time by AndroZoo or the number of antivirus engines from VirusTotal that flagged the application as malicious.
In addition to providing researchers with easy access to real-world applications, AndroZoo makes it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of their SHA256 hashes is enough.
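
As a hedged illustration of this workflow, the sketch below computes the SHA256 identifier of a local #APK file and builds the matching download URL.
The helper names are hypothetical, and the endpoint `https://androzoo.uni.lu/api/download` with its `apikey` and `sha256` parameters is assumed from the public AndroZoo access instructions; it should be checked against the current documentation before use.

```python
import hashlib
from urllib.parse import urlencode

# Assumed endpoint, to be checked against the AndroZoo documentation.
ANDROZOO_DOWNLOAD = "https://androzoo.uni.lu/api/download"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file by chunks so that large APKs are not loaded in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as apk:
        for chunk in iter(lambda: apk.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_url(sha256, apikey):
    """Build the download URL for one application (no request is sent)."""
    query = urlencode({"apikey": apikey, "sha256": sha256})
    return ANDROZOO_DOWNLOAD + "?" + query
```

A dataset can then be published as a plain text file with one hash per line, and rebuilt by anyone holding an API key.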


@ -4,10 +4,10 @@
#epigraph("Alexis \"Lex\" Murphy, Jurassic Park")[This is a Unix system. I know this.]
#todo[Present field background and related work]
#include("X_android.typ")
#include("X_tools.typ")
#include("1_android.typ")
#include("2_tools.typ")
#include("3_analysis_techniques.typ")
#include("4_datasets.typ")
/*
* Generic course on Android


@ -3,33 +3,16 @@
== Introduction
Android is the most used mobile operating system since 2014, and since 2017, it even surpasses Windows all platforms combined#footnote[https://gs.statcounter.com/os-market-share#monthly-200901-202304].
The public adoption of Android is confirmed by application developers, with 1.3 millions apps available in the Google Play Store in 2014, and 3.5 millions apps available in 2017#footnote[https://www.statista.com/statistics/266210].
Its popularity makes Android a prime target for malware developers. // For example, various applications have been shown to steal personal information@shanSelfhidingBehaviorAndroid2018.
Consequently, Android has also been an important subject for security research.
In the past fifteen years, the research community released many tools to detect or analyze malicious behaviors in applications. Two main approaches can be distinguished: static and dynamic analysis@Li2017.
Dynamic analysis requires to run the application in a controlled environment to observe runtime values and/or interactions with the operating system.
For example, an Android emulator with a patched kernel can capture these interactions but the modifications to apply are not a trivial task.
// Such approach is limited by the required time to execute a limited part of the application with no guarantee on the obtained code coverage.
// For malware, dynamic analysis is also limited by evading techniques that may prevent the execution of malicious parts of the code. // To explain better if we restore these sentences about malware + evading.
As a consequence, a lot of efforts have been put in static approaches, which is the focus of this paper.
The usual goal of a static analysis is to compute data flows to detect potential information leaks@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15,@octeauCompositeConstantPropagation2015,@liIccTADetectingInterComponent2015 by analyzing the bytecode of an Android application.
The associated developed tools should support the Dalvik bytecode format, the multiplicity of entry points, the event driven architecture of Android applications, the interleaving of native code and bytecode, possibly loaded dynamically, the use of reflection, to name a few.
All these obstacles threaten the research efforts.
When using a more recent version of Android or a recent set of applications, the results previously obtained may become outdated and the developed tools may not work correctly anymore.
In this paper/*#footnote[This work was supported by the ANR Research under the Plan France 2030 bearing the reference ANR-22-PECY-0007.]*/, we study the reusability of open source static analysis tools that appeared between 2011 and 2017, on a recent Android dataset.
In this chapter, we study the reusability of open source static analysis tools that appeared between 2011 and 2017, on a recent Android dataset.
The scope of our study is *not* to quantify if the output results are accurate for ensuring reproducibility, because all the studied static analysis tools have different goals in the end.
On the contrary, we take as hypothesis that the provided tools compute the intended result but may crash or fail to compute a result due to the evolution of the internals of an Android application, raising unexpected bugs during an analysis.
This paper intends to show that sharing the software artifacts of a paper may not be sufficient to ensure that the provided software would be reusable.
This chapter intends to show that sharing the software artifacts of a paper may not be sufficient to ensure that the provided software would be reusable.
Thus, our contributions are the following.
We carefully retrieved static analysis tools for Android applications that were selected by Li #etal@Li2017 between 2011 and 2017.
We contacted the authors, whenever possible, to select the best candidate versions and to confirm the correct usage of the tools.
We rebuilt the tools in their original environment and we plan to share our Docker images with this paper.
We evaluated the reusability of the tools by measuring the number of successful analysis of applications taken /*in the Drebin dataset@Arp2014 and */ in a custom dataset that contains more recent applications (#NBTOTALSTRING in total).
We evaluated the reusability of the tools by measuring the number of successful analyses of applications taken in the Drebin dataset@Arp2014 and in a custom dataset that contains more recent applications (#NBTOTALSTRING in total).
Observing the success or failure of these analyses enables us to answer the following research questions:
/ RQ1: What Android static analysis tools that are more than 5 years old are still available and can be reused without crashing with a reasonable effort?
@ -45,7 +28,7 @@ As a summary, the contributions of this paper are the following:
- We discuss the effect of applications features (date, size, SDK version, goodware/malware) on static analysis tools and the nature of the issues we found by studying statistics on the errors captured during our experiments.
*/
The paper is structured as follows.
The chapter is structured as follows.
@sec:rasta-soa presents a summary of previous works dedicated to Android static analysis tools.
@sec:rasta-methodology presents the methodology employed to build our evaluation process and @sec:rasta-xp gives the associated experimental results.
// @sec:rasta-discussion investigates the reasons behind the observed failures of some of the tools.


@ -9,38 +9,7 @@
// For example, taint analysis datasets should provide the source and expected sink of a taint.
// In some cases, the datasets are provided with additional software for automatizing part of the analysis.
// Thus,
We review in this section the past existing datasets provided by the community and the papers related to static analysis tools reusability.
=== Application Datasets
Computing if an application contains a possible information flow is an example of a static analysis goal.
Some datasets have been built especially for evaluating tools that are computing information flows inside Android applications.
One of the first well known dataset is DroidBench, that was released with the tool Flowdroid@Arzt2014a.
Later, the dataset ICC-Bench was introduced with the tool Amandroid@weiAmandroidPreciseGeneral2014 to complement DroidBench by introducing applications using Inter-Component data flows.
These datasets contain carefully crafted applications containing flows that the tools should be able to detect.
These hand-crafted applications can also be used for testing purposes or to detect any regression when the software code evolves.
Contrary to real world applications, the behavior of these hand-crafted applications is known in advance, thus providing the ground truth that the tools try to compute.
However, these datasets are not representative of real-world applications@Pendlebury2018 and the obtained results can be misleading.
//, especially for performance or reliability evaluation.
Contrary to DroidBench and ICC-Bench, some approaches use real-world applications.
Bosu #etal@bosuCollusiveDataLeak2017 use DIALDroid to perform a threat analysis of Inter-Application communication and published DIALDroid-Bench, an associated dataset.
Similarly, Luo #etal released TaintBench@luoTaintBenchAutomaticRealworld2022 a real-world dataset and the associated recommendations to build such a dataset.
These datasets confirmed that some tools such as Amandroid@weiAmandroidPreciseGeneral2014 and Flowdroid@Arzt2014a are less efficient on real-world applications.
These datasets are useful for carefully spotting missing taint flows, but contain only a few dozen of applications.
// A larger number of applications would be more suitable for our goal, #ie evaluating the reusability of a variety of static analysis tools.
Pauck #etal@pauckAndroidTaintAnalysis2018 used those three datasets to compare Amandroid@weiAmandroidPreciseGeneral2014, DIAL-Droid@bosuCollusiveDataLeak2017, DidFail@klieberAndroidTaintFlow2014, DroidSafe@DBLPconfndssGordonKPGNR15, FlowDroid@Arzt2014a and IccTA@liIccTADetectingInterComponent2015 -- all these tools will be also compared in this paper.
To perform their comparison, they introduced the AQL (Android App Analysis Query Language) format.
AQL can be used as a common language to describe the computed taint flow as well as the expected result for the datasets.
It is interesting to notice that all the tested tools timed out at least once on real-world applications, and that Amandroid@weiAmandroidPreciseGeneral2014, DidFail@klieberAndroidTaintFlow2014, DroidSafe@DBLPconfndssGordonKPGNR15, IccTA@liIccTADetectingInterComponent2015 and ApkCombiner@liApkCombinerCombiningMultiple2015 (a tool used to combine applications) all failed to run on applications built for Android API 26.
These results suggest that a more thorough study of the link between application characteristics (#eg date, size) should be conducted.
Luo #etal@luoTaintBenchAutomaticRealworld2022 used the framework introduced by Pauck #etal to compare Amandroid@weiAmandroidPreciseGeneral2014 and Flowdroid@Arzt2014a on DroidBench and their own dataset TaintBench, composed of real-world android malware.
They found out that those tools have a low recall on real-world malware, and are thus over adapted to micro-datasets.
Unfortunately, because AQL is only focused on taint flows, we cannot use it to evaluate tools performing more generic analysis.
=== Static Analysis Tools Reusability
In this section, we review existing contributions related to the reusability of static analysis tools.
Several papers have reviewed Android analysis tools produced by researchers.
Li #etal@Li2017 published a systematic literature review for Android static analysis before May 2015.
@ -49,6 +18,19 @@ In particular, they listed 27 approaches with an open-source implementation avai
Nevertheless, experiments to evaluate the reusability of the pointed out software were not performed.
We believe that the effort of reviewing the literature for making a comprehensive overview of available approaches should be pushed further: a published approach whose software cannot be used for technical reasons endangers both the reproducibility and reusability of research.
As we saw in @sec:bg-datasets, the need for a ground truth to test analysis tools often leads test datasets to be handcrafted.
The few datasets composed of real-world applications confirmed that some tools such as Amandroid@weiAmandroidPreciseGeneral2014 and Flowdroid@Arzt2014a are less efficient on real-world applications@bosuCollusiveDataLeak2017 @luoTaintBenchAutomaticRealworld2022.
Unfortunately, those real-world application datasets are rather small, and a larger number of applications would be more suitable for our goal, #ie evaluating the reusability of a variety of static analysis tools.
Pauck #etal@pauckAndroidTaintAnalysis2018 used DroidBench@Arzt2014a, ICC-Bench@weiAmandroidPreciseGeneral2014 and DIALDroid-Bench@bosuCollusiveDataLeak2017 to compare Amandroid@weiAmandroidPreciseGeneral2014, DIAL-Droid@bosuCollusiveDataLeak2017, DidFail@klieberAndroidTaintFlow2014, DroidSafe@DBLPconfndssGordonKPGNR15, FlowDroid@Arzt2014a and IccTA@liIccTADetectingInterComponent2015 -- all of these tools will also be compared in this chapter.
To perform their comparison, they introduced the AQL (Android App Analysis Query Language) format.
AQL can be used as a common language to describe the computed taint flow as well as the expected result for the datasets.
It is interesting to notice that all the tested tools timed out at least once on real-world applications, and that Amandroid@weiAmandroidPreciseGeneral2014, DidFail@klieberAndroidTaintFlow2014, DroidSafe@DBLPconfndssGordonKPGNR15, IccTA@liIccTADetectingInterComponent2015 and ApkCombiner@liApkCombinerCombiningMultiple2015 (a tool used to combine applications) all failed to run on applications built for Android API 26.
These results suggest that a more thorough study of the link between application characteristics (#eg date, size) should be conducted.
Luo #etal@luoTaintBenchAutomaticRealworld2022 used the framework introduced by Pauck #etal to compare Amandroid@weiAmandroidPreciseGeneral2014 and Flowdroid@Arzt2014a on DroidBench and their own dataset TaintBench, composed of real-world Android malware.
They found that those tools have a low recall on real-world malware, and are thus over-adapted to micro-datasets.
Unfortunately, because AQL is only focused on taint flows, we cannot use it to evaluate tools performing more generic analysis.
A first work about quantifying the reusability of static analysis tools was proposed by Reaves #etal@reaves_droid_2016.
Seven Android analysis tools (Amandroid@weiAmandroidPreciseGeneral2014, AppAudit@xiaEffectiveRealTimeAndroid2015, DroidSafe@DBLPconfndssGordonKPGNR15, Epicc@octeau2013effective, FlowDroid@Arzt2014a, MalloDroid@fahlWhyEveMallory2012 and TaintDroid@Enck2010) were selected to check if they were still readily usable.
For each tool, both the usability and results of the tool were evaluated by asking auditors to install and use it on DroidBench and 16 real world applications.
@ -56,7 +38,7 @@ The auditors reported that most of the tools require a significant amount of tim
Reaves #etal propose to solve these issues by distributing a Virtual Machine with a functional build of the tool in addition to the source code.
Regrettably, these Virtual Machines were not made available, preventing future researchers from taking advantage of the work done by the auditors.
Reaves #etal also report that real world applications are more challenging to analyze, with tools having lower results, taking more time and memory to run, sometimes to the point of not being able to run the analysis.
We will confirm and expand this result in this paper with a larger dataset than only 16 real-world applications.
We will confirm and expand this result in this chapter with a larger dataset than only 16 real-world applications.
// Indeed, a more diverse dataset would assess the results and give more insight about the factors impacting the performances of the tools.
Finally, our approach is similar to the methodology employed by Mauthe #etal for decompilers@mauthe_large-scale_2021.


@ -1,4 +1,4 @@
#import "../lib.typ": etal, ie
#import "../lib.typ": etal, ie, ART, DEX, APK, SDK
#import "X_var.typ": *
== Introduction
@ -7,7 +7,7 @@
When building an application with Android Studio, the source codes of applications are compiled to Java bytecode, which is then converted to Dalvik bytecode.
Dalvik bytecode is then put in a zip archive with other resources such as the application manifest, and the zip archive is then signed.
All this process is handled by Android Studio, behind the scene.
At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by ART in order to execute native code and it is up to these components to handle the loading of the classes.
At runtime, the Dalvik bytecode is either interpreted by the Dalvik virtual machine or compiled by #ART in order to execute native code and it is up to these components to handle the loading of the classes.
Both behaviors are possible at the same time for a single application, and it is up to Android to choose which part of an application is compiled in native code.
*/
@ -20,17 +20,17 @@ If this first phase is not accurately driven, for example if they fail to access
Additionally, as stated by Li #etal@Li2017 in their conclusions, such a task is complexified by dynamic code loading, reflective calls, native code, and multi-threading which cannot be easily handled statically.
Nevertheless, even if we do not consider these aspects, determining statically how the regular class loading system of Android is working is a difficult task.
Class loading occurs at runtime and is handled by the components of Android Runtime (ART), even when the application is partially or fully compiled ahead of time.
Class loading occurs at runtime and is handled by the components of #ART, even when the application is partially or fully compiled ahead of time.
Nevertheless, at the development stage, Android Studio handles the resolution of the different classes that can be internal to the application.
When building, the code is linked to the standard library i.e. the code contained in `android.jar`.
In this article, we call these classes "Development SDK classes".
In this article, we call these classes "Development #SDK classes".
`android.jar` is not added to the application because its classes will be available at runtime in other `.jar` files.
To distinguish those classes found at runtime from Dev SDK classes, we call them #Asdkc.
When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "SDK versions".
Indeed, some parts of the core #Asdkc can be embedded in the application, for retro compatibility purposes: by comparing the specified minimum SDK version and the target SDK version, the code of extra #Asdkc is stored in the APK file.
To distinguish those classes found at runtime from Dev #SDK classes, we call them #Asdkc.
When releasing the application, the building process of Android Studio can manage different versions of the #Asdk, reported in the Manifest as the "#SDK versions".
Indeed, some parts of the core #Asdkc can be embedded in the application, for backward compatibility purposes: by comparing the specified minimum #SDK version and the target #SDK version, the code of extra #Asdkc is stored in the APK file.
As a consequence, it is frequent to find inside applications some classes that come from the `com.android` packages.
At runtime each smartphone runs a unique version of Android, but, as the application is deployed on multiple versions of Android, it is difficult to predict which classes will be loaded from the #Asdkc or from the APK file itself.
This complexity increases with the multi-DEX format of recent APK files that can contain several bytecode files.
This complexity increases with the multi-#DEX format of recent #APK files that can contain several bytecode files.
Going back to the problem of a reverser studying a suspicious application statically, the reverser uses tools to disassemble the application@mauthe_large-scale_2021 and track the flows of data in the bytecode.
As an example, for a spyware potentially leaking personal information, the reverser can unpack the application with Apktool and, after manually locating a method that they suspect to read sensitive data (by reading the unpacked bytecode), they can compute with FlowDroid@Arzt2014a if there is a flow from this method to methods performing HTTP requests.
@ -47,7 +47,7 @@ The goal of such an attack is to confuse them during the reversing process: at r
This attack can be applied to regular classes of the #Asdk or to hidden classes of Android@he_systematic_2023 @li_accessing_2016.
We show how these attacks can confuse the tools of the reverser when he performs a static analysis.
In order to evaluate if such attacks are already used in the wild, we analyzed #nbapk applications from 2023 that we extracted randomly from AndroZoo@allixAndroZooCollectingMillions2016.
Our main result is that #shadowsdk of these applications contain shadow collisions against the SDK and #shadowhidden against hidden classes.
Our main result is that #shadowsdk of these applications contain shadow collisions against the #SDK and #shadowhidden against hidden classes.
Our investigations conclude that most of these collisions are not voluntary attacks, but we highlight one specific malware sample performing strong obfuscation revealed by our detection of one shadow attack.
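
A minimal sketch of the kind of check behind these numbers, assuming the class names defined by the application and by the platform have already been extracted as collections of strings.
The function name and the sample class names are hypothetical, and the actual detection pipeline may differ.

```python
def shadow_collisions(app_classes, platform_classes):
    """Return the classes defined in the application whose fully
    qualified name is also defined by the platform."""
    return sorted(set(app_classes) & set(platform_classes))


# Hypothetical class names, in the Dalvik descriptor notation.
app = {"Lcom/example/Main;", "Landroid/util/Log;"}
sdk = {"Landroid/util/Log;", "Landroid/app/Activity;"}
collisions = shadow_collisions(app, sdk)  # ["Landroid/util/Log;"]
```

The same function can be applied to the list of hidden classes instead of the #SDK classes.
A collision found this way is only a candidate: as stated above, most collisions are not voluntary attacks.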
The paper is structured as follows.


@ -40,15 +40,3 @@ They found that hidden APIs are added and removed in every release of Android, a
More recently, He #etal @he_systematic_2023 did a systematic study of hidden service API related to security.
They studied how the hidden API can be used to bypass Android security restrictions and found that although Google countermeasures are effective, they need to be implemented inside the system services and not the hidden API due to the lack of in-app privilege isolation: the framework code is in the same process as the user code, meaning any restriction in the framework can be bypassed by the user.
]
#paragraph([Static analysis tools])[
Static analysis tools are used to perform operations on an APK file, for example extracting its bytecode or information from the Manifest file.
Because of the complexity of Android, few tools have followed all the evolutions of the file format and are robust enough to analyze all applications without crashing@mineau_evaluating_2024.
The tools can share the backend used to manipulate the code.
For example, Apktool is often called in a subprocess to extracte the bytecode.
Another example is Soot@Arzt2013, a Java framework that allows to manipulate the bytecode from an object representation of instructions.
This framework enables advanced features such as inserting or removing bytecode instructions but can require a lot of memory and time to perform its operations.
The most known tool built on top of Soot is FlowDroid@Arzt2014a, which enables to compute information flows statically into the code.
Because these tools are used by reversers, we will evaluate the accuracy of the provided results in the case of an application developer exploits the possible confusions that brings the class loading mechanisms of Android.
]