In this section, we review existing contributions related to the reusability of static analysis tools.
Several papers have reviewed Android analysis tools produced by researchers.
Li #etal~@Li2017 published a systematic literature review of Android static analysis work published before May 2015.
They analyzed 92 publications and classified them by goal, by the method used to solve the problem, and by the underlying technical solution for handling the bytecode when performing the static analysis.
In particular, they listed 27 approaches with an open-source implementation available.
Nevertheless, no experiments were performed to evaluate the reusability of the software they identified.
We believe that this effort of reviewing the literature to build a comprehensive overview of available approaches should be pushed further: a published approach whose software cannot be used for technical reasons endangers both the reproducibility and the reusability of research.
As we saw in @sec:bg-datasets, the need for a ground truth to test analysis tools means that test datasets are often handcrafted.
The few datasets composed of real-world applications confirmed that some tools, such as Amandroid~@weiAmandroidPreciseGeneral2014 and FlowDroid~@Arzt2014a, are less efficient on real-world applications~@bosuCollusiveDataLeak2017 @luoTaintBenchAutomaticRealworld2022.
Unfortunately, these real-world application datasets are rather small, and a larger number of applications would be more suitable for our goal, #ie evaluating the reusability of a variety of static analysis tools.
Pauck #etal~@pauckAndroidTaintAnalysis2018 used DroidBench~@Arzt2014a, ICC-Bench~@weiAmandroidPreciseGeneral2014 and DIALDroid-Bench~@bosuCollusiveDataLeak2017 to compare Amandroid~@weiAmandroidPreciseGeneral2014, DIAL-Droid~@bosuCollusiveDataLeak2017, DidFail~@klieberAndroidTaintFlow2014, DroidSafe~@DBLPconfndssGordonKPGNR15, FlowDroid~@Arzt2014a and IccTA~@liIccTADetectingInterComponent2015 -- all these tools will also be compared in this chapter.
To perform their comparison, they introduced the AQL (Android App Analysis Query Language) format.
AQL can be used as a common language to describe both the computed taint flows and the expected results for the datasets.
It is worth noting that all the tested tools timed out at least once on real-world applications, and that Amandroid~@weiAmandroidPreciseGeneral2014, DidFail~@klieberAndroidTaintFlow2014, DroidSafe~@DBLPconfndssGordonKPGNR15, IccTA~@liIccTADetectingInterComponent2015 and ApkCombiner~@liApkCombinerCombiningMultiple2015 (a tool used to combine applications) all failed to run on applications built for Android API 26.
These results suggest that the link between application characteristics (#eg date, size) and tool failures should be studied more thoroughly.
Luo #etal~@luoTaintBenchAutomaticRealworld2022 used the framework introduced by Pauck #etal to compare Amandroid~@weiAmandroidPreciseGeneral2014 and FlowDroid~@Arzt2014a on DroidBench and their own dataset TaintBench, composed of real-world Android malware.
They found that those tools have a low recall on real-world malware, and are thus over-adapted to micro-datasets.
Unfortunately, because AQL focuses solely on taint flows, we cannot use it to evaluate tools that perform more generic analyses.
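Conceptually, once tool results and the ground truth are expressed in a common format such as AQL, comparing a tool against a benchmark reduces to set operations over flows. The sketch below illustrates this idea with invented flow data (these pairs are illustrative and do not reproduce actual AQL syntax): taint flows are modeled as (source, sink) pairs, and precision and recall are computed against the expected flows.

```python
# Illustrative sketch: taint flows reduced to (source, sink) pairs.
# The API names below are invented examples, not actual AQL output.

def precision_recall(reported, expected):
    """Return (precision, recall) of reported flows against expected flows."""
    reported, expected = set(reported), set(expected)
    true_positives = len(reported & expected)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(expected) if expected else 1.0
    return precision, recall

expected = {
    ("TelephonyManager.getDeviceId()", "SmsManager.sendTextMessage()"),
    ("LocationManager.getLastKnownLocation()", "Log.d()"),
}
reported = {
    ("TelephonyManager.getDeviceId()", "SmsManager.sendTextMessage()"),
    ("SharedPreferences.getString()", "Log.d()"),  # a false positive
}

p, r = precision_recall(reported, expected)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.50
```

This also shows why a taint-flow-only format is limiting: tools whose results are not expressible as source-to-sink pairs cannot be scored this way.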
A first work about quantifying the reusability of static analysis tools was proposed by Reaves #etal~@reaves_droid_2016.
Seven Android analysis tools (Amandroid~@weiAmandroidPreciseGeneral2014, AppAudit~@xiaEffectiveRealTimeAndroid2015, DroidSafe~@DBLPconfndssGordonKPGNR15, Epicc~@octeau2013effective, FlowDroid~@Arzt2014a, MalloDroid~@fahlWhyEveMallory2012 and TaintDroid~@Enck2010) were selected to check if they were still readily usable.
For each tool, both its usability and its results were evaluated by asking auditors to install it and run it on DroidBench and 16 real-world applications.
The auditors reported that most of the tools required a significant amount of time to set up, often due to dependency issues and operating system incompatibilities.
Reaves #etal proposed to solve these issues by distributing a virtual machine containing a functional build of each tool in addition to its source code.
Reaves #etal also report that real-world applications are more challenging to analyze.
We will confirm and expand this result in this chapter with a larger dataset than only 16 real-world applications.
// Indeed, a more diverse dataset would assess the results and give more insight about the factors impacting the performances of the tools.
Finally, our approach is similar to the methodology employed by Mauthe #etal for decompilers~@mauthe_large-scale_2021.
To assess the robustness of Android decompilers, Mauthe #etal ran 4 decompilers on a dataset of 40 000 applications.
The error messages of the decompilers were parsed to list the methods that failed to decompile, and this information was used to estimate the main causes of failure.
It was found that the failure rate is correlated with the size of the method, and that a substantial share of the failures originate in third-party libraries rather than in the core code of the application.
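The log-parsing step described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical error-message format (the log lines and regular expression below are invented for the example, not taken from any actual decompiler): failing methods are extracted from the log, then the reported causes are tallied to estimate the main sources of failure.

```python
import re
from collections import Counter

# Hypothetical sketch of the log-parsing idea: extract each method that
# failed to decompile, then tally the reported failure causes.
# The log format below is invented for illustration.
LOG_LINE = re.compile(r"ERROR: could not decompile (?P<method>\S+): (?P<cause>.+)")

def failure_causes(log_text):
    """Map each failing method to its reported cause, and count the causes."""
    failures = {}
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            failures[match.group("method")] = match.group("cause")
    return failures, Counter(failures.values())

log = """\
ERROR: could not decompile com/example/app/Main.run: unsupported opcode
ERROR: could not decompile okhttp3/Dispatcher.enqueue: goto into try block
ERROR: could not decompile okhttp3/Cache.put: goto into try block
"""

failures, causes = failure_causes(log)
print(causes.most_common(1))  # [('goto into try block', 2)]
```

Note that the package prefix of each failing method (here `okhttp3/` versus `com/example/app/`) is what would allow attributing failures to third-party libraries rather than to the application's core code.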