diff --git a/2_background/1_intro.typ b/2_background/1_intro.typ index f67433d..95ae29c 100644 --- a/2_background/1_intro.typ +++ b/2_background/1_intro.typ @@ -3,9 +3,10 @@ == Introduction In order to understand the challenges of reverse engineering Android applications, we first need to understand some key concepts and specificities of Android. -In particular, the format in which applications are distributed, as well as the runtime environment that runs those applications, is very specific to Android. +In particular, the format in which applications are distributed, as well as the runtime environment that runs those applications, are very specific to Android. To handle those specificities, a reverse engineer must have appropriate tools. Some of those tools are used recurrently, either by the reverse engineer themself, or as a basis for other more complex tools that implement more advanced analysis techniques. +// NOTE: "reverse engineer themself": both themself and themselves are correct, I prefer themself here because it's one specific non-gendered engineer. Among those techniques, the ones that do not require running the application are called static analysis. Over time, many of those tools have been released. diff --git a/2_background/2_1_android.typ b/2_background/2_1_android.typ index 86540f1..3c917a9 100644 --- a/2_background/2_1_android.typ +++ b/2_background/2_1_android.typ @@ -11,7 +11,7 @@ Those changes make Android a unique operating system. ==== Android Applications Applications in the Android ecosystem are distributed in the #APK format. -#APK files are #JAR files with additional features, which are themself #ZIP files with additional features. +#APK files are #JAR files with additional features, which are themselves #ZIP files with additional features. A minimal #APK file contains a file `AndroidManifest.xml`, the `META-INF/` folder containing the #JAR manifest and signature files, and an #APK Signing Block at the end of the #ZIP file. 
The code of the application is then stored in a custom format, the Dalvik bytecode, or in the binary ELF format, called native code in the Android ecosystem, or both. diff --git a/2_background/2_3_static_analysis.typ b/2_background/2_3_static_analysis.typ index 545ae98..aca75fa 100644 --- a/2_background/2_3_static_analysis.typ +++ b/2_background/2_3_static_analysis.typ @@ -119,7 +119,7 @@ This time, instead of methods, the nodes represent instructions, and the edges i @fig:bg-fizzbuzz-cg-cfg c) represents the control-flow graph of @fig:bg-fizzbuzz-cg-cfg a), with code statements instead of bytecode instructions. Once the control-flow graph is computed, it can be used to compute data-flows. -Data-flow analysis, also called taint-tracking, is used to follow the flow of information in the application. +Data-flow analysis /*, also called taint-tracking, reviewer note: not really, taint-tracking \in data flow analysis*/is used to follow the flow of information in the application. By defining a list of methods and fields that can generate critical information (taint sources) and a list of methods that can consume information (taint sinks), taint-tracking detects potential data leaks (if a data flow links a taint source and a taint sink). For example, `TelephonyManager.getImei()` returns a unique, persistent, device identifier. This can be used to identify the user, and it cannot be changed if compromised. diff --git a/2_background/4_1_rasta.typ b/2_background/4_1_rasta.typ index 5bf989f..7807036 100644 --- a/2_background/4_1_rasta.typ +++ b/2_background/4_1_rasta.typ @@ -14,7 +14,7 @@ They analysed 92 publications and classified them by goal, method used to solve In particular, they listed 27 approaches with an open-source implementation available. Interestingly, a lot of the tools listed rely on common tools to interact with Android applications/#DEX bytecode. 
-Reccuring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Appareciumn~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a): those tools are built incrementally, on top of each other. +Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a): those tools are built incrementally, on top of each other. This strengthens our idea that being able to reuse previous tools is important. Nevertheless, Li #etal focus more on the techniques and features described in the reviewed publications, and experiments to evaluate whether the pointed out software are still usable were not performed. @@ -23,7 +23,7 @@ Nevertheless, Li #etal focus more on the techniques and features described in th //Data-flow analysis is the subject of many contribution~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a. We will now explore this direction further by looking at other works that have been done to evaluate different analysis tools. 
-Those evaluations often take the form of benchmarks and follow a similar method (we will look at the different contributions in more detail in @sec:bg-bench). +Those evaluations often take the form of benchmarks and follow a similar method (we will look at the different contributions in more details in @sec:bg-bench). They start by selecting a set of tools with similar goals to compare. Usually, those contributions are comparing existing tools to their own, but some contributions do not introduce a new tool and focus on surveying the state of the art for some technique. They then selected a dataset of applications to analyse. @@ -57,7 +57,7 @@ In addition to those datasets, AndroZoo~@allixAndroZooCollectingMillions2016 col Currently, Androzoo contains more than 25 million applications that can be downloaded by researchers from the SHA256 hash of the application. Androzoo also provides additional information about the applications, like the date the application was detected for the first time by Androzoo or the number of antiviruses from VirusTotal that flagged the application as malicious. This will allow us to sample a dataset of applications evenly distributed over the years. -In addition to providing researchers with easy access to real-world applications, Androzoo make it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of SHA256 is enough. +In addition to providing researchers with easy access to real-world applications, Androzoo makes it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of SHA256 is enough. 
==== Benchmarking diff --git a/2_background/4_2_classloader.typ b/2_background/4_2_classloader.typ index 1b2aecc..6c2353a 100644 --- a/2_background/4_2_classloader.typ +++ b/2_background/4_2_classloader.typ @@ -29,7 +29,7 @@ Additionally, our problem statement does not focus on spoofing classes at runtim Contributions about Android class loading focus on using the capabilities of class loading to extend Android features or to prevent reverse engineering of Android applications. For instance, Zhou #etal~@zhou_dynamic_2022 extend the class loading mechanism of Android to support regular Java bytecode, and Kritz and Maly~@kriz_provisioning_2015 propose a new class loader to automatically load modules of an application without user interactions. -Regarding reverse engineering, class loading mechanisms are frequently used by packers for hiding all or parts of the code of an application~@Duan2018. +Regarding reverse engineering, class loading mechanisms are frequently used by packers, applications that load their actual code at runtime, for hiding all or parts of the code of an application~@Duan2018. For example, packers exploits the class loading capability of Android to load new code. They also combine the loading with code generation from ciphered assets or code modification from native code calls~@liao2016automated to increase the difficulty of recovery of the code. Because parts of the original code will be only available at runtime, deobfuscation approaches propose techniques that track #DEX structures when manipulated by the application~@zhang2015dexhunter @xue2017adaptive @wong2018tackling. 
diff --git a/2_background/4_3_theseus.typ b/2_background/4_3_theseus.typ index e5f5fb8..b967755 100644 --- a/2_background/4_3_theseus.typ +++ b/2_background/4_3_theseus.typ @@ -46,7 +46,7 @@ It resulted that dynamic code loading was mostly related to mobile advertisement Similarly, StaDynA~@zhauniarovichStaDynAAddressingProblem2015 is a framework that generates a call graph statically, then uses dynamic analysis to analyse dynamic code loading and reflection calls to complete this call graph. The issue with those approaches is that they are only compatible with their own subsequent analysis. -For instance, StaDynA only provide the call graph, and cannot be used as is to improve the capacity of Flowdroid. +For instance, StaDynA only provides the call graph, and cannot be used as is to improve the capacity of Flowdroid. This is unfortunate: the reverse engineer's next step will depend on the context. Not being able to reuse the result of a previous analysis with any ad hoc tools greatly limits their options. AppSpear has an interesting solution to this issue: the code it intercepts is repackaged inside a new #APK file that Android analysis tools should be able to analyse. @@ -74,7 +74,7 @@ Samhi #etal~@samhi_jucify_2022 followed this direction to unify the analysis of Their tool, JuCify, uses Angr~@angrPeople to generate the call graph of the native code, and uses heuristics to encode this call graph into Jimple that can then be added to the Jimple generated by Soot from the bytecode of the application. Like IccTa, they use Flowdroid to analyse this new augmented representation of the application, but it should be usable by any analysis tools relying on Soot. -Finally, DroidRA~@li_droidra_2016 use the COAL~@octeauCompositeConstantPropagation2015 solver to statically compute the reflection information. +Finally, DroidRA~@li_droidra_2016 uses the COAL~@octeauCompositeConstantPropagation2015 solver to statically compute the reflection information. 
The reflection calls are transformed into direct calls inside the application using Soot. Using COAL makes DroidRA quite good at solving the simpler cases, where the names of classes and methods targeted by reflection are already present in the application. Those cases are quite common; being able to solve those without resorting to dynamic analysis is quite useful. diff --git a/3_rasta/1_intro.typ b/3_rasta/1_intro.typ index dd97163..59842b2 100644 --- a/3_rasta/1_intro.typ +++ b/3_rasta/1_intro.typ @@ -33,4 +33,4 @@ The chapter is structured as follows. @sec:rasta-failure-analysis investigates the reasons behind the observed failures of some of the tools. We then compare in @sec:rasta-soa-comp our results with the contributions presented in @sec:bg. In @sec:rasta-reco, we give recommendations for tool development that we drew from our experience running our experiment. -Finally, @sec:rasta-limit lists the limit of our approach, @sec:rasta-futur presents further avenues that did not have time to pursue and @sec:rasta-conclusion concludes the chapter. +Finally, @sec:rasta-limit lists the limits of our approach, @sec:rasta-future presents further avenues that we did not have time to pursue and @sec:rasta-conclusion concludes the chapter. diff --git a/3_rasta/2_methodology.typ b/3_rasta/2_methodology.typ index 5921c60..b0bd950 100644 --- a/3_rasta/2_methodology.typ +++ b/3_rasta/2_methodology.typ @@ -204,7 +204,7 @@ To guarantee reproducibility, we published the results, datasets, Dockerfiles an - https://zenodo.org/records/10144014 . - https://zenodo.org/records/10980349 . - on Docker Hub as `histausse/rasta-:icsr2024`. -] +]. 
#figure( raw-render(``` diff --git a/3_rasta/4_failures_analysis.typ b/3_rasta/4_failures_analysis.typ index 89cb7ce..347880b 100644 --- a/3_rasta/4_failures_analysis.typ +++ b/3_rasta/4_failures_analysis.typ @@ -263,7 +263,7 @@ dad: SError #paragraph([Mallodroid and Apparecium])[ Mallodroid and Apparecium stand out as the tools that raised the most errors in one run. -They can raise more than #num(10000) error by analysis. +They can raise more than #num(10000) errors by analysis. However, it happened only for a few dozen #APKs, and conspicuously, the same #APKs raised the same high number of errors for both tools. The recurring error is a `KeyError` raised by Androguard when trying to find a string by its identifier. Although this error is logged, it seems successfully handled, and during a manual analysis of the execution, both tools seemingly perform their analysis without issue. @@ -286,7 +286,7 @@ Instruction10x% */ #paragraph([Blueseal])[ -Because Blueseal rarely log more than one error when crashing, it is easy to identify the relevant error. +Because Blueseal rarely logs more than one error when crashing, it is easy to identify the relevant error. The majority of crashes come from unsupported Android versions (due to the magic number of the DEX files not being supported by the version of back smali used by Blueseal) and methods whose implementation is not found (like native methods). ] diff --git a/3_rasta/6_recommendations.typ b/3_rasta/6_recommendations.typ index 192c5e8..0a390fb 100644 --- a/3_rasta/6_recommendations.typ +++ b/3_rasta/6_recommendations.typ @@ -40,7 +40,7 @@ Good error reporting can allow future users to solve issues encountered using th This issue could easily be fixed by changing the filenames used to store the results. In contrast, the errors generated by Flowdroid are so opaque that we have no idea how we could solve them. -And at last, an important remark concerns the libraries used by a tool. 
+Finally, an important remark concerns the libraries used by a tool. We have seen two types of libraries: - internal libraries manipulating internal data of the tool. - external libraries that are used to manipulate the input data (APKs, bytecode, resources). diff --git a/3_rasta/7_limitations.typ b/3_rasta/7_limitations.typ index 1b1da77..bec8e43 100644 --- a/3_rasta/7_limitations.typ +++ b/3_rasta/7_limitations.typ @@ -2,7 +2,7 @@ Some limitations of our approach should be kept in mind. -Our application dataset is biased in favour of Androguard, because Androzoo have already used Androguard internally when collecting applications and discarded any application that cannot be processed with this tool. +Our application dataset is biased in favour of Androguard, because Androzoo has already used Androguard internally when collecting applications and discarded any application that cannot be processed with this tool. Despite our best efforts, it is possible that we made mistakes when building or using the tools. It is also possible that we wrongly classified a result as a failure. diff --git a/3_rasta/8_futur_works.typ b/3_rasta/8_future_works.typ similarity index 96% rename from 3_rasta/8_futur_works.typ rename to 3_rasta/8_future_works.typ index 0b1477b..d039691 100644 --- a/3_rasta/8_futur_works.typ +++ b/3_rasta/8_future_works.typ @@ -2,7 +2,7 @@ #import "X_var.typ": tool_info #import "X_lib.typ": ok -== Futur Works +== Future Works A first extension to this work would obviously be to study more tools. We restricted ourselves to the tools listed by Li #etal, but it would be interesting to compare our result to the finishing rate of recently released tools. @@ -17,7 +17,7 @@ Such datasets would need to be updated regularly: we saw that there is a trend f In addition to the finishing rate, it would be both interesting and useful to have reference values. @tab:rasta-rec-deps list common Android-related dependencies we encountered when packaging the tools. 
-We can see that each tools use at least one of those dependencies. +We can see that each tool uses at least one of those dependencies. It would be reasonable to consider the best finishing ratio a tool can have to be the finishing ratio of a tool that would perform an "empty analysis" using the same dependencies. Considering the prevalence of those dependencies, having those theoretical minimums could also guide future tool developers when choosing their dependencies. diff --git a/3_rasta/9_conclusion.typ b/3_rasta/9_conclusion.typ index c87ef0a..6f7e5b6 100644 --- a/3_rasta/9_conclusion.typ +++ b/3_rasta/9_conclusion.typ @@ -29,5 +29,5 @@ This will allow the research community to use the tools directly without the bui In some cases, it was due to our inability to set up the tool correctly. Mostly, it was due to the high failure rate when analysing real-world applications. Results show that large applications cause more crashes, as do applications with a higher min #SDK target. - Goodware also appear to generate more analysis failures than malware. + Goodware also appears to generate more analysis failures than malware. 
]))) diff --git a/3_rasta/figs/running.svg b/3_rasta/figs/running.svg index bb977fc..694d6ad 100644 --- a/3_rasta/figs/running.svg +++ b/3_rasta/figs/running.svg @@ -1,225 +1,1607 @@ [SVG markup omitted: 225 lines removed, 1607 lines added] diff --git a/3_rasta/main.typ b/3_rasta/main.typ index 7c5f63d..95e8eb9 100644 --- a/3_rasta/main.typ +++ b/3_rasta/main.typ @@ -21,7 +21,7 @@ This chapter intends to explore the robustness of past software dedicated to static analysis of Android applications. We pursue the community effort that identified software supporting publications that perform static analysis of mobile applications, and we propose a method for evaluating the reliability of this software. We extensively evaluate static analysis tools on a recent dataset of Android applications, including goodware and malware, that we designed to measure the influence of parameters such as the date and size of applications. - Our results show that #resultunusable of the evaluated tools are no longer usable and that the size of the bytecode and the min #SDK version have the greatest influence on the reliability of tested the tools. 
+ Our results show that #resultunusable of the evaluated tools are no longer usable and that the size of the bytecode and the min #SDK version have the greatest influence on the reliability of the tested tools. ]))) @@ -32,5 +32,5 @@ #include("5_soa_comp.typ") #include("6_recommendations.typ") #include("7_limitations.typ") -#include("8_futur_works.typ") +#include("8_future_works.typ") #include("9_conclusion.typ") diff --git a/jury.typ b/jury.typ index 0f4fe7f..6fc3823 100644 --- a/jury.typ +++ b/jury.typ @@ -8,12 +8,10 @@ column-gutter: 2em, stroke: 0pt, inset: (x: 0pt, y: .5em), - //"Présidente :", "", "", "", + "Président :", "Olivier Barais", "Professeur des Universités", "Université de Rennes", "Rapporteurs :", "Vincent Nicomette", "Professeur des Universités", "INSA de Toulouse", "", "Julien Signoles", "Directeur de Recherche", "CEA LIST", - "Examinateurs :", "Olivier Barais", "Professeur des Universités", "Université de Rennes", - //"", "Guillaume Doyen", "Professeur", "IMT Atlantique", - "", "Simone Aonzo", /*"Assistant Professor"*/ "Maître de Conférences", "Eurecom", + "Examinateurs :", "Simone Aonzo", /*"Assistant Professor"*/ "Maître de Conférences", "Eurecom", "Dir. de thèse :", "Jean-François Lalande", "Professeur des Universités", "CentraleSupélec", "", "Valérie Viet Triem Tong", "Professeure", "CentraleSupélec", )