in the end I ignored a lot of feedback, sorry jfl

2025-09-30 23:40:43 +02:00 · a3fcff0c19
commit a3fcff0c19
parent 346151125e
3 changed files with 25 additions and 20 deletions
--- a/2_background/4_1_rasta.typ
+++ b/2_background/4_1_rasta.typ
@@ -14,28 +14,26 @@ They analysed 92 publications and classified them by goal, method used to solve
 In particular, they listed 27 approaches with an open-source implementation available.

 Interestingly, a lot of the tools listed rely on common tools to interact with Android applications/#DEX bytecode.
-Reccuring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Appareciumn~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a).
+Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a): those tools are built incrementally, on top of each other.
 This strengthens our idea that being able to reuse previous tools is important.
-Those tools are built incrementally, on top of each other.

-Nevertheless, experiments to evaluate the reusability of the pointed out software were not performed by Li #etal
+Nevertheless, Li #etal focus more on the techniques and features described in the reviewed publications, and did not perform experiments to evaluate whether the software they point out is still usable.
 #jfl-note[We believe that the effort of reviewing the literature for making a comprehensive overview of available approaches should be pushed further: an existing published approach with a software that cannot be used for technical reasons endangers both the reproducibility and reusability of research.][A mettre en avant?]

 //Data-flow analysis is the subject of many contribution~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a.

-We will now explore this direction further by looking at the work that has been done to evaluate different analysis tools.
-Works that perform benchmarks of tools follow a similar method.
-They start by selecting a set of tools with similar goals.
+We will now explore this direction further by looking at other works that have been done to evaluate different analysis tools.
+Those evaluations often take the form of benchmarks and follow a similar method (we will look at the different contributions in more detail in @sec:bg-bench).
+They start by selecting a set of tools with similar goals to compare.
 Usually, those contributions are comparing existing tools to their own, but some contributions do not introduce a new tool and focus on surveying the state of the art for some technique.
 They then select a dataset of applications to analyse.
 We will see in @sec:bg-datasets that those datasets are often hand-crafted, except for some studies that select a few real-world applications that they manually reverse-engineered to get a ground truth to compare to the tool's result.
-Once the tools and test dataset are selected, the tools are run on the application dataset, and the results of the tools are compared to the ground truth to determine the accuracy of each tool.
-Several factors can be considered to compare the results of the tools:
-the number of false positives, false negatives, or even the time it took to finish the analysis.
+Once the tools and test dataset are selected, the tools are run on the application dataset, and the results of the tools are compared to the expected results (ground truth) to determine the accuracy of each tool.
+Additional factors are sometimes compared as well: the number of false positives, false negatives, or even the time it took to finish the analysis.
 Occasionally, the number of applications a tool simply failed to analyse is also compared.

 In @sec:bg-datasets, we will look at the datasets used in the community to compare analysis tools.
-Then in @sec:bg-bench> we will go through the contributions that benchmarked those tools #jm-note[to see if they can be used as an indication as to which tools can still be used today.][Mettre en avant]
+Then, in @sec:bg-bench, we will go through the contributions that benchmarked those tools #jm-note[to see if they can be used as an indication as to which tools can still be used today.][Mettre en avant]

 ==== Application Datasets <sec:bg-datasets>

@@ -57,12 +55,16 @@ These datasets are useful for carefully spotting missing taint flows, but contai

 In addition to those datasets, AndroZoo~@allixAndroZooCollectingMillions2016 collects applications from several application marketplaces, including the Google Play store (the official Google application store), Anzhi and AppChina (two Chinese stores), or FDroid (a store dedicated to free and open source applications).
 Currently, Androzoo contains more than 25 million applications that can be downloaded by researchers from the SHA256 hash of the application.
-Androzoo also provide additional information about the applications, like the date the application was detected for the first time by Androzoo or the number of antiviruses from VirusTotal that flagged the application as malicious.
+Androzoo also provides additional information about the applications, like the date the application was detected for the first time by Androzoo or the number of antiviruses from VirusTotal that flagged the application as malicious.
+This will allow us to sample a dataset of applications evenly distributed over the years.
 In addition to providing researchers with easy access to real-world applications, Androzoo makes it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of SHA256 hashes is enough.

+
 ==== Benchmarking <sec:bg-bench>

-The few datasets composed of real-world applications confirmed that some tools, such as Amandroid~@weiAmandroidPreciseGeneral2014 and Flowdroid~@Arzt2014a, are less efficient on real-world applications~@bosuCollusiveDataLeak2017 @luoTaintBenchAutomaticRealworld2022.
+We will now go through the different contributions that evaluated different static analysis tools to see if they can give us some insights into the current usability of the tools.
+
+The few experiments with datasets composed of real-world applications confirmed that some tools, such as Amandroid~@weiAmandroidPreciseGeneral2014 and Flowdroid~@Arzt2014a, are less efficient on real-world applications~@bosuCollusiveDataLeak2017 @luoTaintBenchAutomaticRealworld2022.
 Unfortunately, those real-world application datasets are rather small, and a larger number of applications would be more suitable for our goal, #ie evaluating the reusability of a variety of static analysis tools.

 Pauck #etal~@pauckAndroidTaintAnalysis2018 used DroidBench~@Arzt2014a, ICC-Bench~@weiAmandroidPreciseGeneral2014 and DIALDroid-Bench~@bosuCollusiveDataLeak2017 to compare Amandroid~@weiAmandroidPreciseGeneral2014, DIAL-Droid~@bosuCollusiveDataLeak2017, DidFail~@klieberAndroidTaintFlow2014, DroidSafe~@DBLPconfndssGordonKPGNR15, FlowDroid~@Arzt2014a and IccTA~@liIccTADetectingInterComponent2015. //-- all these tools will also be compared in this chapter.