This commit is contained in:
parent
471a176683
commit
c34eb1b838
7 changed files with 43 additions and 61 deletions
|
@ -4,41 +4,26 @@
|
|||
|
||||
=== Reusability of Static Analysis Tools <sec:bg-soa-rasta>
|
||||
|
||||
//== Android Reverse Engineering Techniques <sec:bg-techniques>
|
||||
|
||||
//#todo[swap with tool section ?]
|
||||
//
|
||||
|
||||
#todo[Refactor]
|
||||
|
||||
==== Static Analysis <sec:bg-soa-static>
|
||||
#pb1-text
|
||||
|
||||
Over the past fifteen years, the research community has released many tools to detect or analyse malicious behaviors in applications.
|
||||
Two main approaches can be distinguished: static and dynamic analysis~@Li2017.
|
||||
Dynamic analysis requires running the application in a controlled environment to observe runtime values and/or interactions with the operating system.
|
||||
For example, an Android emulator with a patched kernel can capture these interactions, but applying the required modifications is not a trivial task.
|
||||
Such an approach is limited by the time required to execute even a limited part of the application, with no guarantee on the code coverage obtained.
|
||||
Dynamic analysis is also limited by evasion techniques that may prevent the execution of malicious parts of the code.
|
||||
As a consequence, a lot of effort has been put into static approaches. //, which is the focus of this paper.
|
||||
|
||||
Data-flow analysis is the subject of many contributions~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a.
|
||||
|
||||
#todo[Describe the different contributions in relations to the issues they tackle, be more critical]
|
||||
|
||||
A lot of those more advanced tools rely on common tools to interact with Android applications/#DEX bytecode~@Li2017.
|
||||
Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a).
|
||||
|
||||
The number of publications related to static analysis can make it difficult to find the right tool for the right task.
|
||||
The first step to answer this question is to list those previously published tools.
|
||||
The number of publications related to static analysis can make it difficult to find the right tool for the right task.
|
||||
Li #etal~@Li2017 published a systematic literature review for Android static analysis before May 2015.
|
||||
They analysed 92 publications and classified them by goal, method used to solve the problem and underlying technical solution for handling the bytecode when performing the static analysis.
|
||||
In particular, they listed 27 approaches with an open-source implementation available.
|
||||
Nevertheless, experiments to evaluate the reusability of the listed software were not performed.
|
||||
|
||||
Interestingly, a lot of the tools listed rely on common tools to interact with Android applications/#DEX bytecode.
|
||||
Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a).
|
||||
This strengthens our idea that being able to reuse previous tools is important.
|
||||
Those tools are built incrementally, on top of each other.
|
||||
|
||||
Nevertheless, Li #etal did not perform experiments to evaluate the reusability of the listed software.
|
||||
#jfl-note[We believe that the effort of reviewing the literature to make a comprehensive overview of available approaches should be pushed further: a published approach whose software cannot be used for technical reasons endangers both the reproducibility and reusability of research.][To highlight?]
|
||||
In the next section, we will look at the work that has been done to evaluate different analysis tools.
|
||||
|
||||
//Data-flow analysis is the subject of many contribution~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a.
|
||||
|
||||
==== Evaluating Static Analysis Tools <sec:bg-eval-tools>
|
||||
|
||||
We will now explore this direction further by looking at the work that has been done to evaluate different analysis tools.
|
||||
Works that benchmark tools follow a similar method.
|
||||
They start by selecting a set of tools with similar goals.
|
||||
Usually, those contributions compare existing tools to their own, but some contributions do not introduce a new tool and focus on surveying the state of the art for some technique.
|
||||
|
@ -49,9 +34,10 @@ Several factors can be considered to compare the results of the tools:
|
|||
the number of false positives, false negatives, or even the time it took to finish the analysis.
|
||||
Occasionally, the number of applications a tool simply failed to analyse is also compared.
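
As a purely illustrative sketch of such a comparison, the findings reported by a tool can be checked against a ground truth as follows. The `ToolRun` structure, the `evaluate` helper and the finding identifiers are hypothetical and are not taken from any of the cited benchmarks.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class ToolRun:
    """Outcome of one tool on one application (hypothetical structure)."""
    reported: Optional[Set[str]]  # findings reported, or None if the tool failed
    seconds: float                # wall-clock time spent on the analysis


def evaluate(runs: Dict[str, ToolRun], truth: Dict[str, Set[str]]) -> Dict[str, float]:
    """Aggregate the comparison factors over a set of applications."""
    false_positives = false_negatives = failures = 0
    total_time = 0.0
    for app, run in runs.items():
        total_time += run.seconds
        if run.reported is None:
            # The tool did not finish: count a failure, not missed findings.
            failures += 1
            continue
        false_positives += len(run.reported - truth[app])
        false_negatives += len(truth[app] - run.reported)
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "failures": failures,
        "total_time": total_time,
    }
```

In this sketch, failed analyses are counted separately rather than as false negatives. This is only one possible convention, and a benchmark may reasonably choose the other.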
|
||||
|
||||
In @sec:bg-datasets we will look at the datasets used in the community to compare analysis tools, and in @sec:rasta-soa we will go through the contributions that benchmarked those tools #jm-note[to see if they can be used as an indication as to which tools can still be used today.] [To highlight]
|
||||
In @sec:bg-datasets we will look at the datasets used in the community to compare analysis tools.
|
||||
Then in @sec:bg-bench we will go through the contributions that benchmarked those tools #jm-note[to see if they can be used as an indication as to which tools can still be used today.][To highlight]
|
||||
|
||||
===== Application Datasets <sec:bg-datasets>
|
||||
==== Application Datasets <sec:bg-datasets>
|
||||
|
||||
Research contributions often rely on existing datasets or provide new ones in order to evaluate the developed software.
|
||||
Raw datasets such as Drebin~@Arp2014 contain little information about the provided applications.
|
||||
|
@ -74,7 +60,7 @@ Currently, Androzoo contains more than 25 millions applications, that can be dow
|
|||
Androzoo also provides additional information about the applications, like the date the application was detected for the first time by Androzoo or the number of antivirus engines from VirusTotal that flagged the application as malicious.
|
||||
In addition to providing researchers with easy access to real-world applications, Androzoo makes it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of their SHA256 hashes is enough.
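
As an illustration, such a list can be produced from a local directory of #APK files with a few lines of Python. This is a minimal sketch: the function names and the assumption that the files use the `.apk` extension are ours, and it does not interact with Androzoo itself.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as apk:
        for chunk in iter(lambda: apk.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_hash_list(apk_dir: Path) -> List[str]:
    """Return the sorted SHA256 digests of every .apk file in a directory."""
    return sorted(sha256_of_file(p) for p in apk_dir.glob("*.apk"))
```

Sorting the digests makes the resulting list independent of the order in which the file system enumerates the files, so two copies of the same dataset yield the same list.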
|
||||
|
||||
===== Benchmarking <sec:rasta-soa>
|
||||
==== Benchmarking <sec:bg-bench>
|
||||
|
||||
The few datasets composed of real-world applications confirmed that some tools such as Amandroid~@weiAmandroidPreciseGeneral2014 and Flowdroid~@Arzt2014a are less efficient on real-world applications~@bosuCollusiveDataLeak2017 @luoTaintBenchAutomaticRealworld2022.
|
||||
Unfortunately, those real-world application datasets are rather small, and a larger number of applications would be more suitable for our goal, #ie evaluating the reusability of a variety of static analysis tools.
|
||||
|
@ -158,10 +144,12 @@ DroidBench@Arzt2014a
|
|||
|
||||
#v(2em)
|
||||
|
||||
Reaves #etal raised two major concerns about the use of Android static analysis tools.
|
||||
To summarize, Li #etal made a systematic literature review of static analysis for Android that listed 27 open-source tools.
|
||||
However, they did not test those tools.
|
||||
Reaves #etal did so for some of them and analysed the difficulty of using them.
|
||||
They raised two major concerns about the use of Android static analysis tools.
|
||||
First, they can be quite difficult to set up, and second, they appear to have difficulties analysing real-world applications.
|
||||
This is problematic for a reverse engineer: not only do they need to invest a significant amount of work to set up a tool properly, but they also have no guarantee that the tool will actually manage to analyse the application they are investigating.
|
||||
Hence our first problem statement #pb1:
|
||||
|
||||
#pb1-text
|
||||
This is problematic for a reverse engineer: not only do they need to invest a significant amount of work to set up a tool properly, but they also have no guarantee that the tool will actually manage to analyse the application they are investigating.
|
||||
|
||||
In @sec:rasta, we will try to set up the tools listed by Li #etal and test them on a large number of real-world applications to see which can be used today.
|
||||
We will also aim to identify which characteristics of real-world applications make them harder to analyse.
|
||||
|
|