develop rasta methodology section

2025-08-13 17:52:52 +02:00
commit af1187f041
parent 01ce20ffda
4 changed files with 82 additions and 9 deletions
--- a/2_background/3_static_analysis.typ
+++ b/2_background/3_static_analysis.typ
@@ -1,6 +1,6 @@
 #import "../lib.typ": APK, etal, ART, SDK, DEX, eg, 
 #import "../lib.typ": todo, jm-note, jfl-note
-#import "@preview/diagraph:0.3.3": raw-render
+#import "@preview/diagraph:0.3.5": raw-render

 //== Android Reverse Engineering Techniques <sec:bg-techniques>

@@ -29,6 +29,8 @@ A more advance control-flow analysis consist in building the control-flow graph.
 This time, instead of methods, the nodes represent instructions, and the edges indicate which instruction can follow which instruction.
@fig:bg-fizzbuzz-cg-cfg c) represents the control-flow graph of @fig:bg-fizzbuzz-cg-cfg a), with code statement instead of bytecode instructions.

+#todo[Add alt text for @fig:bg-fizzbuzz-cg and @fig:bg-fizzbuzz-cfg]
+
 #figure({
  set align(center)
  stack(dir: ttb,[
@@ -63,7 +65,8 @@ This time, instead of methods, the nodes represent instructions, and the edges i
        "fizzBuzz(int)" -> "Log.e(String, String)"
      }
      ```,
-      width: 40%
+      width: 40%,
+      alt: "",
    ),
    supplement: none,
    kind: "bg-fizzbuzz-cg-cfg subfig",
@@ -104,7 +107,8 @@ This time, instead of methods, the nodes represent instructions, and the edges i
        "l7": `Buzzer.buzz();`,
        "l9": `Log.e("fizzbuzz", String.valueOf(i));`,
      ),
-      width: 50%
+      width: 50%,
+      alt: "",
    ),
    supplement: none,
    kind: "bg-fizzbuzz-cg-cfg subfig",
@@ -114,7 +118,6 @@ This time, instead of methods, the nodes represent instructions, and the edges i
  supplement: [Figure],
  caption: [Source code for a simple Java method and its Call and Control Flow Graphs],
 )<fig:bg-fizzbuzz-cg-cfg>
-
 Once the control-flow graph is computed, it can be used to compute data-flows.
 Data-flow analysis, also called taint-tracking, allows to follow the flow of information in the application.
 Be defining a list of methods and fields that can generate critical information (taint sources) and a list of methods that can consume information (taint sink), taint-tracking allows to detect potential data leaks (if a data flow link a taint source and a taint sink).
--- a/2_background/X_dynamic_analysis.typ
+++ b/2_background/X_dynamic_analysis.typ
@@ -1,5 +1,4 @@
 #import "../lib.typ": todo, APK, etal, ART, SDK, eg, jm-note, jfl-note
-#import "@preview/diagraph:0.3.3": raw-render

 === Dynamic Analysis <sec:bg-dynamic>

--- a/3_rasta/1_intro.typ
+++ b/3_rasta/1_intro.typ
@@ -12,7 +12,7 @@ Thus, our contributions are the following.
 We carefully retrieved static analysis tools for Android applications that were selected by Li #etal~@Li2017 between 2011 and 2017. 
 #jm-note[Many of those tools where presented in @sec:bg-static.][Yes but not really, @sec:bg-static do not present the contributions in detail \ FIX: develop @sec:bg-static]
 We contacted the authors, whenever possible, for selecting the best candidate versions and to confirm the good usage of the tools.
-We rebuild the tools in their original environment and #jm-note[share our Docker images.][ref]
+We rebuilt the tools in their original environment and share our Docker images.#footnote[On Docker Hub as `histausse/rasta-<toolname>:icsr2024`.]
 We evaluated the reusability of the tools by measuring the number of successful analysis of applications taken in the Drebin dataset~@Arp2014 and in a custom dataset that contains more recent applications (#NBTOTALSTRING in total). 
 The observation of the success or failure of these analysis enables us to answer the following research questions: 

--- a/3_rasta/2_methodology.typ
+++ b/3_rasta/2_methodology.typ
@@ -1,3 +1,4 @@
+#import "@preview/diagraph:0.3.5": raw-render
 #import "../lib.typ": etal, eg, MWE, HPC, SDK, SDKs, APKs, DEX
 #import "../lib.typ": todo, jfl-note
 #import "X_var.typ": *
@@ -5,8 +6,11 @@

 == Methodology <sec:rasta-methodology>

-#todo[small intro: resumé approche + schema?]
-#jfl-note[Add diagram: Li etal -> [tool selection] -> drop/ - selected -> [select source version] -> [packaging] -> docker / -> singularity -> [exp]]
+In this section, we describe our methodology to evaluate the reusability of Android static analysis tools.
+@fig:rasta-methodo-collection and @fig:rasta-overview summarize our approach. 
+We collected the tools listed as open source by Li #etal, checked that they only use static analysis techniques, and selected the most recent version of each tool.
+We then packaged the tools inside containers and checked our choices with the authors.
+Finally, we ran those tools on a large dataset that we sampled and collected the exit status of each run (whether the tool completed the analysis or not).

 === Collecting Tools

@@ -203,6 +207,73 @@ To guarantee reproducibility we published the results, datasets, Dockerfiles and
 - on Docker Hub as `histausse/rasta-<toolname>:icsr2024`.
 ]

+#todo[alt text for @fig:rasta-methodo-collection]
+
+#figure(
+  raw-render(```
+    digraph {
+      rankdir=TB
+      node [shape=none]
+
+      {
+        rank=same
+
+        Li
+        ST
+        TS
+        SV
+        Pack
+        Dock
+      }
+      {
+        rank=same
+
+        Drop0
+        Drop1
+        Drop2
+      }
+
+      Li -> ST
+      ST -> TS
+      TS -> SV
+      SV -> Pack
+      Pack -> Dock
+      ST -> Drop0
+      TS -> Drop1
+      Pack -> Drop2
+    }
+    ```,
+    labels: (
+      "Li": align(center)[Tools from\ Li #etal],
+      "ST": block(stroke: black, inset: 1em)[Search Tools],
+      "TS": block(stroke: black, inset: 1em)[Select Tools],
+      "Drop0": "Drop",
+      "Drop1": "Drop",
+      "Drop2": [Not Reusable],
+      "SV": block(stroke: black, inset: 1em)[Select Source Version],
+      "Pack": block(stroke: black, inset: 1em)[Package],
+      "Dock": [Docker\ Images],
+    ),
+    edges: (
+      //"ST": ("Drop0": align(center, block(inset: 1em)[Tool no longer\ available])),
+      "TS": ("Drop1": align(center, block(inset: 1em)[Uses Dynamic\ Analysis])),
+      "Pack": ("Drop2": align(center, block(inset: 1em)[Could Not Setup\ in 4 days])),
+    ),
+    width: 100%,
+    alt: "",
+  ),
+  caption: [Tool selection methodology overview],
+) <fig:rasta-methodo-collection>
+
+@fig:rasta-methodo-collection summarizes our tool selection process.
+We first looked for the tools listed as open source by Li #etal.
+For the tools still available, we checked whether they used dynamic analysis and dropped those that did.
+We then checked whether more recent updates of the remaining tools were available and selected the most relevant version.
+Finally, we marked as non-reusable the tools that we could not set up within 4 days, even with the help of the authors.
+
+
+
+
 === Runtime Conditions

 #figure(
@@ -211,7 +282,7 @@ To guarantee reproducibility we published the results, datasets, Dockerfiles and
    width: 100%,
    alt: "A diagram representing the methodology. The word 'Tool' is linked to a box labeled 'Docker image' by an arrow labeled 'building'. The box 'Docker image' is linked to a box labeled 'Singularity image' by an arrow labeled 'conversion'. The box 'Singularity image' is linked to a box labeled 'Execution monitoring' by a dotted arrow labeled 'Manuel tests' and to an image of a server labeled 'Singularity cluster' by an arrow labeled deployment. An image of three android logo labeled 'apks' is also linked to the 'Singularity cluster' by an arrow labeled 'running the tool analysis'. The 'Singularity cluster' image is linked to the 'Execution monitoring' box by an arrow labeled 'log capture'. The 'Execution monitoring' box linked to the words 'Exit status' by an unlabeled arrow.",
  ),
-  caption: [Methodology overview],
+  caption: [Experiment methodology overview],
 ) <fig:rasta-overview>

 As shown in @fig:rasta-overview, before benchmarking the tools, we built and installed them in a Docker containers for facilitating any reuse of other researchers.
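
The exit-status collection described in the new methodology introduction (run a tool on an application, record whether it completed the analysis) could be sketched as below. This is only an illustration under stated assumptions, not the harness used in the thesis: the function name `run_analysis`, the three status labels, the timeout value, and the convention that exit code 0 means a completed analysis are all hypothetical. The command to run is passed in as a list, so the sketch makes no claim about how the published container images are invoked.

```python
import subprocess
import sys


def run_analysis(cmd, timeout_s):
    """Run one analysis command and classify how it ended.

    Hypothetical helper: the labels and the "exit code 0 means the
    analysis completed" rule are assumptions made for illustration.
    """
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,  # keep stdout/stderr for later inspection
            text=True,
            timeout=timeout_s,    # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    except OSError:
        # The command could not be started at all (e.g. missing executable).
        return "failed"
    return "finished" if proc.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in commands; a real run would launch the packaged tool instead.
    print(run_analysis([sys.executable, "-c", "pass"], timeout_s=30))
    print(run_analysis([sys.executable, "-c", "raise SystemExit(2)"], timeout_s=30))
```

Taking the command as an argument keeps the classification logic independent of the container runtime, so the same function could wrap a Docker or a Singularity invocation.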