wip

2025-07-29 16:23:42 +02:00 · 2025-07-29 16:23:42 +02:00 · c060e88996
commit c060e88996
parent 243b9df134
17 changed files with 264 additions and 96 deletions
--- a/3_rasta/2_methodology.typ
+++ b/3_rasta/2_methodology.typ
@@ -66,18 +66,18 @@
    *documentation*: #okk: excellent, MWE, #ok: few inconsistencies, #bad: bad quality, #ko: not available\
    *decision*: #ok: considered; #bad: considered but not built; #ko: out of scope of the study
  ]},
-  caption: [Considered tools@Li2017: availability and usage reliability],
+  caption: [Considered tools~@Li2017: availability and usage reliability],
 ) <tab:rasta-tools>

-We collected the static analysis tools from@Li2017, plus one additional paper encountered during our review of the state-of-the-art (DidFail@klieberAndroidTaintFlow2014). 
+We collected the static analysis tools from~@Li2017, plus one additional paper encountered during our review of the state-of-the-art (DidFail~@klieberAndroidTaintFlow2014). 
 They are listed in @tab:rasta-tools, with the original release date and associated paper. 
-We intentionally limited the collected tools to the ones selected by Li #etal@Li2017 for several reasons.
+We intentionally limited the collected tools to the ones selected by Li #etal~@Li2017 for several reasons.
 First, not using recent tools ensures a gap of at least 5 years between the publication and the more recent APK files, which makes it possible to measure the reusability of previous contributions after a reasonable amount of time.
-Second, collecting new tools would require describing these tools in depth, similarly to what has been done by Li #etal@Li2017, which is not the primary goal of this paper.
+Second, collecting new tools would require describing these tools in depth, similarly to what has been done by Li #etal~@Li2017, which is not the primary goal of this paper.
 Additionally, selection criteria such as the publication venue or number of citations would be necessary to select a subset of tools, which would require an additional methodology. 
 These possible contributions are left for future work.

-Some tools use hybrid analysis (both static and dynamic): A3E@DBLPconfoopslaAzimN13, A5@vidasA5AutomatedAnalysis2014, Android-app-analysis@geneiatakisPermissionVerificationApproach2015, StaDynA@zhauniarovichStaDynAAddressingProblem2015. 
+Some tools use hybrid analysis (both static and dynamic): A3E~@DBLPconfoopslaAzimN13, A5~@vidasA5AutomatedAnalysis2014, Android-app-analysis~@geneiatakisPermissionVerificationApproach2015, StaDynA~@zhauniarovichStaDynAAddressingProblem2015. 
 They have been excluded from this paper. 
 We manually searched for the tool repository when the website mentioned in the paper was no longer available (#eg when the repository has been migrated from Google Code to GitHub), and for each tool we searched for:

@@ -89,7 +89,7 @@ In @tab:rasta-tools we rated the quality of these artifacts with "#ok" when avai
 Results show that documentation is often missing or very poor (#eg Lotrack), which makes both the rebuild process and the first analysis of a MWE very complex.


-We finally excluded Choi #etal@CHOI2014620, as their tool works on the sources of Android applications, and Poeplau #etal@DBLPconfndssPoeplauFBKV14, as they focus on Android hardening.
+We finally excluded Choi #etal~@CHOI2014620, as their tool works on the sources of Android applications, and Poeplau #etal~@DBLPconfndssPoeplauFBKV14, as they focus on Android hardening.
 In summary, we end up with #nbtoolsselected tools to compare.
 Some specificities should be noted. 
 The IC3 tool will be duplicated in our experiments because two versions are available: the original version of the authors and a fork used by other tools like IccTa.   
@@ -255,11 +255,11 @@ Problem 2: for sampling, we use the APK size deciles, but for our
 */

 Two datasets are used in the experiments of this section. 
-The first one is *Drebin*@Arp2014, from which we extracted the malware part (5479 samples that we could retrieve) for comparison purposes only.
-It is a well-known and very old dataset that should not be used anymore because it contains temporal and spatial biases@Pendlebury2018.
+The first one is *Drebin*~@Arp2014, from which we extracted the malware part (5479 samples that we could retrieve) for comparison purposes only.
+It is a well-known and very old dataset that should not be used anymore because it contains temporal and spatial biases~@Pendlebury2018.
 We intend to compare the rate of success on this old dataset with a more recent one. 
 The second one, *Rasta*, we built to cover all dates between 2010 and 2023.
-This dataset is a random extract of Androzoo@allixAndroZooCollectingMillions2016, for which we balanced applications between years and size. 
+This dataset is a random extract of Androzoo~@allixAndroZooCollectingMillions2016, for which we balanced applications between years and size. 
 For each year and inter-decile range of size in Androzoo, 500 applications have been extracted with an arbitrary proportion of 7% of malware. 
 This ratio has been chosen because it is the proportion of malware that we observed when performing a raw extract of Androzoo.
 For checking the maliciousness of an Android application we rely on the VirusTotal detection indicators.
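
The sampling described in the hunk above (500 applications per year and inter-decile size range, 7% of them malware) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' extraction script: the record fields `year`, `size` and `is_malware`, the function names, and the choice to compute the size deciles over the whole pool are all hypothetical, and how the VirusTotal indicators map to `is_malware` is not specified here.

```python
import random
from collections import defaultdict


def decile_edges(sizes):
    """Return the 9 inner decile boundaries of a list of sizes."""
    ordered = sorted(sizes)
    n = len(ordered)
    return [ordered[min(n - 1, (k * n) // 10)] for k in range(1, 10)]


def size_bin(size, edges):
    """Index (0..9) of the inter-decile range that contains `size`."""
    return sum(size >= edge for edge in edges)


def sample_balanced(apps, per_cell=500, malware_ratio=0.07, seed=0):
    """Draw up to `per_cell` apps for each (year, size range) cell.

    `apps` is a list of dicts with the assumed keys
    'year', 'size' and 'is_malware' (hypothetical field names).
    """
    rng = random.Random(seed)
    # Assumption: deciles are computed once, over the whole pool.
    edges = decile_edges([app["size"] for app in apps])

    # Group apps by (year, size range), split into malware / goodware.
    cells = defaultdict(lambda: {True: [], False: []})
    for app in apps:
        key = (app["year"], size_bin(app["size"], edges))
        cells[key][app["is_malware"]].append(app)

    n_malware = round(per_cell * malware_ratio)
    n_goodware = per_cell - n_malware

    selected = []
    for key in sorted(cells):
        pools = cells[key]
        # Take fewer apps when a cell does not hold enough candidates.
        selected += rng.sample(pools[True], min(n_malware, len(pools[True])))
        selected += rng.sample(pools[False], min(n_goodware, len(pools[False])))
    return selected
```

With the defaults, each cell contributes 35 malware and 465 goodware samples when enough candidates exist; cells with too few candidates simply yield fewer applications, which is one possible policy among others.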