Jean-Marie 'Histausse' Mineau 2025-07-29 16:23:42 +02:00
parent 243b9df134
commit c060e88996
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
17 changed files with 264 additions and 96 deletions

@@ -331,13 +331,13 @@ Our attempts to upgrade those dependencies led to new errors appearing: we concl
 === State of the art comparison
-Luo #etal released TaintBench@luoTaintBenchAutomaticRealworld2022 a real-world benchmark and the associated recommendations to build such a benchmark.
+Luo #etal released TaintBench~@luoTaintBenchAutomaticRealworld2022 a real-world benchmark and the associated recommendations to build such a benchmark.
 These benchmarks confirmed that some tools such as Amandroid and Flowdroid are less efficient on real-world applications.
 // Pauck #etal@pauckAndroidTaintAnalysis2018
 // Reaves #etal@reaves_droid_2016
-We finally compare our results to the conclusions and discussions of previous papers@luoTaintBenchAutomaticRealworld2022 @pauckAndroidTaintAnalysis2018 @reaves_droid_2016.
-First we confirm the hypothesis of Luo #etal that real-world applications lead to less efficient analysis than using hand crafted test applications or old datasets@luoTaintBenchAutomaticRealworld2022.
+We finally compare our results to the conclusions and discussions of previous papers~@luoTaintBenchAutomaticRealworld2022 @pauckAndroidTaintAnalysis2018 @reaves_droid_2016.
+First we confirm the hypothesis of Luo #etal that real-world applications lead to less efficient analysis than using hand crafted test applications or old datasets~@luoTaintBenchAutomaticRealworld2022.
 Even if Drebin is not hand-crafted, it is quite old and we obtained really good results compared to the Rasta dataset.
 When considering real-world applications, the size is rather different from hand crafted application, which impacts the success rate.
 We believe that it is explained by the fact that the complexity of the code increases with its size.
@@ -354,10 +354,10 @@ We believe that it is explained by the fact that the complexity of the code incr
 === State-of-the-art comparison
-Our finding are consistent with the numerical results of Pauck #etal that showed that #mypercent(106, 180) of DIALDroid-Bench@bosuCollusiveDataLeak2017 real-world applications are analyzed successfully with the 6 evaluated tools@pauckAndroidTaintAnalysis2018.
+Our finding are consistent with the numerical results of Pauck #etal that showed that #mypercent(106, 180) of DIALDroid-Bench~@bosuCollusiveDataLeak2017 real-world applications are analyzed successfully with the 6 evaluated tools~@pauckAndroidTaintAnalysis2018.
 Six years after the release of DIALDroid-Bench, we obtain a lower ratio of #mypercent(40.05, 100) for the same set of 6 tools but using the Rasta dataset of #NBTOTALSTRING applications.
 We extended this result to a set of #nbtoolsvariationsrun tools and obtained a global success rate of #resultratio.
-We confirmed that most tools require a significant amount of work to get them running@reaves_droid_2016.
+We confirmed that most tools require a significant amount of work to get them running~@reaves_droid_2016.
 Our investigations of crashes also confirmed that dependencies to older versions of Apktool are impacting the performances of Anadroid, Saaf and Wognsen #etal in addition to DroidSafe and IccTa, already identified by Pauck #etal.
 /*