I declare this manuscript finished

2025-10-07 17:16:32 +02:00 · 5c3a6955bd
commit 5c3a6955bd
parent 9f39ded209
14 changed files with 162 additions and 131 deletions
--- a/2_background/2_3_static_analysis.typ
+++ b/2_background/2_3_static_analysis.typ
@@ -4,23 +4,6 @@

 === Static Analysis <sec:bg-static>

-A static analysis program examines an #APK file without executing it to extract information from it.
-Basic static analysis can include extracting information from the `AndroidManifest.xml` file or decompiling bytecode to Java code with tools like Apktool or Jadx.
-Unfortunately, simply reading the bytecode does not scale.
-To do so, a human analyst is needed, making it complicated to analyse a large number of applications, and even for single applications, the size and complexity of some applications can quickly overwhelm the reverse engineer.
-
-Control flow analysis is often used to mitigate this issue.
-The idea is to extract the behaviour, the flow, of the application from the bytecode, and to represent it as a graph.
-A graph representation is easier to work with than a list of instructions and can be used for further analysis.
-Depending on the level of precision required, different types of graphs can be computed.
-The most basic of those graphs is the call graph.
-A call graph is a graph where the nodes represent the methods in the application, and the edges represent calls from one method to another.
-@fig:bg-fizzbuzz-cg-cfg b) show the call graph of the code in @fig:bg-fizzbuzz-cg-cfg a).
-A more advanced control-flow analysis consists of building the control-flow graph.
-This time, instead of methods, the nodes represent instructions, and the edges indicate which instruction can follow which instruction.
-@fig:bg-fizzbuzz-cg-cfg c) represents the control-flow graph of @fig:bg-fizzbuzz-cg-cfg a), with code statements instead of bytecode instructions.
-
-
 #figure({
  set align(center)
  stack(dir: ttb,[
@@ -119,6 +102,22 @@ This time, instead of methods, the nodes represent instructions, and the edges i
  caption: [Source code for a simple Java method and its Call and Control Flow Graphs],
 )<fig:bg-fizzbuzz-cg-cfg>

+A static analysis program examines an #APK file without executing it to extract information from it.
+Basic static analysis can include extracting information from the `AndroidManifest.xml` file or decompiling bytecode to Java code with tools like Apktool or Jadx.
+Unfortunately, manually reading the bytecode does not scale.
+It requires a human analyst, which makes analysing a large number of applications impractical, and even for a single application, the size and complexity of the code can quickly overwhelm the reverse engineer.
+
+Control-flow analysis is often used to mitigate this issue.
+The idea is to extract the behaviour, the flow, of the application from the bytecode, and to represent it as a graph.
+A graph representation is easier to work with than a list of instructions and can be used for further analysis.
+Depending on the level of precision required, different types of graphs can be computed.
+The most basic of those graphs is the call graph.
+A call graph is a graph where the nodes represent the methods in the application, and the edges represent calls from one method to another.
+@fig:bg-fizzbuzz-cg-cfg b) shows the call graph of the code in @fig:bg-fizzbuzz-cg-cfg a).
+A more advanced control-flow analysis consists of building the control-flow graph.
+This time, the nodes represent instructions instead of methods, and an edge from one instruction to another indicates that the second can be executed immediately after the first.
+@fig:bg-fizzbuzz-cg-cfg c) represents the control-flow graph of @fig:bg-fizzbuzz-cg-cfg a), with code statements instead of bytecode instructions.
+
 Once the control-flow graph is computed, it can be used to compute data-flows.
 Data-flow analysis, also called taint-tracking, is used to follow the flow of information in the application.
 By defining a list of methods and fields that can generate critical information (taint sources) and a list of methods that can consume information (taint sinks), taint-tracking detects potential data leaks (if a data flow links a taint source and a taint sink).