parent
25c79da4f9
commit
021ac36e73
15 changed files with 110 additions and 75 deletions
|
@ -1,4 +1,5 @@
|
|||
#import "../lib.typ": todo, epigraph, eg, APK, API, highlight-block, jm-note, pb1-text, pb2-text, pb3-text
|
||||
#import "../lib.typ": epigraph, eg, APK, API, highlight-block, pb1-text, pb2-text, pb3-text
|
||||
#import "../lib.typ": todo, jfl-note, jm-note
|
||||
|
||||
= Introduction <sec:intro>
|
||||
|
||||
|
@ -8,88 +9,99 @@
|
|||
|
||||
// Since the dawn of time, people have made Android apps ...
|
||||
Android has been the most used mobile operating system since 2014, and since 2017, it has even surpassed Windows across all platforms combined#footnote[https://gs.statcounter.com/os-market-share#monthly-200901-202304].
|
||||
The public adoption of Android is confirmed by application developers, with 1.3 millions apps available in the Google Play Store in 2014, and 3.5 millions apps available in 2017#footnote[https://www.statista.com/statistics/266210].
|
||||
The public adoption of Android is confirmed by application developers, with 1.3 million apps available in the Google Play Store in 2014, and 3.5 million apps available in 2017#footnote[https://www.statista.com/statistics/266210].
|
||||
Its popularity makes Android a prime target for malware developers.
|
||||
Various applications have been shown to behave maliciously, from stealing personal informations~@shanSelfhidingBehaviorAndroid2018 to hijacking the phone computing ressources to mine cryptocurrency~@adjibi_devil_2022.
|
||||
Indeed, various applications have been shown to behave maliciously, from stealing personal information~@shanSelfhidingBehaviorAndroid2018 to hijacking the smartphone's computing resources to mine cryptocurrency~@adjibi_devil_2022.
|
||||
|
||||
Considering the importance of Android in the everyday live of so many people, Google, the company that develops Android, defined a very strong security model that addresses an extensive threat model~@mayrhofer_android_2021.
|
||||
This threat model goes as far as to consider that an adversarie can have physical access to an unlocked device (#eg an abusive partner, or a border control). // Americaaaaa
|
||||
On the device, this security model imply the sandboxing of each applications, with a system of permissions to allow the applications to perform potentially unwanted actions.
|
||||
For example, an applications cannot access the contact list without requesting the permission to the user first.
|
||||
Android keep improving its security version from version, be it by improving the sandboxing (#eg starting with Android 10, application can no longer access the clipboard if they are not focused) or safer default (#eg since Android 9, by default, all network connection must use TLS).
|
||||
Considering the importance of Android in the everyday life of so many people, Google, the company that develops Android, defined a very strong security model that addresses an extensive threat model~@mayrhofer_android_2021.
|
||||
This threat model goes as far as to consider that an adversary can have physical access to an unlocked device (#eg an abusive partner, or a border control). // Americaaaaa
|
||||
On the device, this security model includes the sandboxing of each application, controlled using a system of permissions to allow the applications to perform potentially unwanted actions.
|
||||
For example, an application cannot access the contact list without requesting permission from the user first.
|
||||
Android keeps improving its security from version to version by improving the sandboxing (#eg starting with Android 10, applications can no longer access the clipboard if they are not focused) or by using safer defaults (#eg since Android 9, by default, all network connections must use TLS).
|
||||
// Android Bouncer, which still does not work very well, etc. (stalkerware?)
|
||||
|
||||
In the spirit of _defence in depth_, Google develloped a _Bouncer_ service that scan applications in the store for malicious software#footnote[https://googlemobile.blogspot.com/2012/02/android-and-security.html].
|
||||
Although its operating is kept secret, it seems that the Bouncer is both comparing the applications with known malware code and running the applications in Google's cloud infrastructure to detect hidden behavior.
|
||||
In the spirit of _defence in depth_, Google developed a _Bouncer_ service that scans applications in the store for malicious software#footnote[https://googlemobile.blogspot.com/2012/02/android-and-security.html].
|
||||
Although its #jm-note[operation][I would have said "operating" but grammarly disagrees] is kept secret, it seems that the Bouncer is both comparing the applications with known malware code and running the applications in Google's cloud infrastructure to detect hidden behavior.
|
||||
Despite Google's efforts, malicious applications are still found in the Play Store~@adjibi_devil_2022.
|
||||
Also, it is not uncommmon for people in abusive situation to have their abuser install on their phone a stalkerware (spying application) found outside of the Play Store~@stateofstalkerware.
|
||||
Also, it is not uncommon for people in abusive situations #jfl-note[to have their abuser install][jfl says "install#strong[ing]", jm says no, grammarly is on the side of jm] on their phone a stalkerware (spying application) found outside of the Play Store~@stateofstalkerware.
|
||||
|
||||
For this reasons, it is important to be able to analyse an application and understand was it does.
|
||||
For these reasons, it is important to be able to analyse an application and understand what it does.
|
||||
This process is called reverse engineering.
|
||||
A lot of work has been done to reverse engineering computer software, but Android applications come with specific challenges that need to be address.
|
||||
For instance, Android application have a distributed in a specific file format, the #APK format, and the code of the application is mainly compile into an Android specific bytecode: Dalvik.
|
||||
An Android reverse engineer will need tools that can read those Android specific formats.
|
||||
A lot of work has been done to reverse engineer computer software, but Android applications come with specific challenges that need to be addressed.
|
||||
For instance, Android applications are distributed in a specific file format, the #APK format, and the code of the application is mainly compiled into an Android-specific bytecode: Dalvik.
|
||||
An Android reverse engineer will need tools that can read those Android-specific formats.
|
||||
A first step in the process of reverse engineering an application would be to simply read the content of the application and the code in it.
|
||||
Tools like apktool can be used to convert the binary files of an application in a human readable format.
|
||||
Other tools like Jadx can go farther and try to generate Java code from the bytecode in the application.
|
||||
Because Android applications tend to be massive, it can be quite tedious to understand what it doest juste from reading its bytecode.
|
||||
To help, many tools/approches have been developed~@Li2017 @sutter_dynamic_2024.
|
||||
For example, Flowdroid~@Arzt2014a aims to detect information leak: given a set of methods that can generate private information, and a set of methods that send information to the outside, Flowdroid will detect if private information is send to the outside.
|
||||
Once again, those kind of tools need to target Android specifically.
|
||||
Android run its applications code differently very than a computer would run software.
|
||||
One example would be entry points: computer software usually have one starting point, when Android applications have many, that Android will chose depending on the situation.
|
||||
Unfortunately, those tools are hard to use, and even when the work on small example application, it is not uncommon for them to fail to run on real-live applications~@reaves_droid_2016.
|
||||
Tools like Apktool can be used to convert the binary files of an application into a human-readable format.
|
||||
Other tools like Jadx can go further and try to generate Java code from the bytecode in the application.
|
||||
Because Android applications tend to be quite large, it can be quite tedious to understand what an application does just from reading its bytecode.
|
||||
To address this issue, many tools/approaches have been developed~@Li2017 @sutter_dynamic_2024 to extract higher-level information about the behavior of the application without having to manually analyse the application.
|
||||
For example, Flowdroid~@Arzt2014a aims to detect information leaks: given a set of methods that can generate private information, and a set of methods that send information to the outside, Flowdroid will detect if private information is sent to the outside.
|
||||
Once again, those kinds of tools need to target Android specifically.
|
||||
Android runs the code of its applications differently than a computer would run software.
|
||||
One example would be the handling of entry points: computer software usually has one entry point, whereas Android applications have many, and Android will choose depending on context.
|
||||
Unfortunately, those tools are hard to use, and even when they work on small example applications, it is not uncommon for them to fail to run on real-life applications~@reaves_droid_2016.
|
||||
This is worrying.
|
||||
Android applications are becoming more complexe every years and tools that cannot handle this complexity only fail more often.
|
||||
Android applications are becoming more complex every year, and tools that cannot handle this complexity will fail more often.
|
||||
This leads us to our first problem statement:
|
||||
// Quantify the contributions with experiments that ignore the apps that crash the tools?
|
||||
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb1*: #pb1-text
|
||||
|
||||
Many tools have been published to analyse Android applications, but the Android ecosystem is fast evolving.
|
||||
Many tools have been published to analyse Android applications, but the Android ecosystem is evolving rapidly.
|
||||
Tools developed 5 years ago might not be usable anymore.
|
||||
We will endeavor to identify which tools are still usable today, and for the others, what causes them to no longer be an option.
|
||||
] <pb-1>
|
||||
|
||||
Another issue is that Android application developpers sometime use various techniques to slow down reverse engineering.
|
||||
This process called obfuscation.
|
||||
Malware developpers do that to hide malicious behavior and avoid detection, but the use of obfuscation is not a proof that and application is malicious.
|
||||
Indeed, legitimate applications developpers can also use obfuscation to protect their intellectual property. // burrkkk
|
||||
Thus, developpers and reverse engineers are playing a game of cats and mouse, constantly inventing new technique to hide or reveal the behavior of an application.
|
||||
Another issue is that Android application developers sometimes use various techniques to slow down reverse engineering.
|
||||
This process is called obfuscation.
|
||||
Malware developers do that to hide malicious behavior and avoid detection, but the use of obfuscation is not proof that an application is malicious.
|
||||
Indeed, legitimate application developers can also use obfuscation to protect their intellectual property. // burrkkk
|
||||
Thus, developers and reverse engineers are playing a game of cat and mouse, constantly inventing new techniques to hide or reveal the behavior of an application.
|
||||
|
||||
They are two types of reverse engineering: static and dynamic.
|
||||
Static analysis consists of examining the application without running it, while dynamic analysis studdy the action of the application durring its run.
|
||||
There are two types of reverse engineering techniques: static and dynamic.
|
||||
Static analysis consists #jfl-note[of][jfl asks "in"?\ grammarly says "of"] examining the application without running it, while dynamic analysis studies the action of the application while it is running.
|
||||
Both methods have their drawbacks, and obfuscation techniques will often capitalize on the drawbacks of one of those methods.
|
||||
For instance, an application can try to detect if it is running in a sandbox environment and not act maliciously if it is the case.
|
||||
Similarly, an application can dynamicaly load bytecode at runtime, and this bytecode will not be available during a static analysis.
|
||||
Dynamic code loading rely on Java classes called `ClassLoader` that are central components of the Android runtime environment.
|
||||
Because dynamic code loading is such a difficult probleme for static analysis, dynamic class loading is often ignore when doing static analysis.
|
||||
Similarly, an application can dynamically load bytecode at runtime, and this bytecode will not be available during a static analysis.
|
||||
Dynamic code loading relies on Java classes called `ClassLoader` that are central components of the Android runtime environment.
|
||||
Because dynamic code loading is such a difficult problem for static analysis, dynamic class loading is often ignored when doing static analysis.
|
||||
However, class loading is not limited to dynamic code loading.
|
||||
In fact, the Android Runtime is constantly performing class loading to load classes from the application of from the Android platform itself.
|
||||
As a matter of fact, the Android Runtime is constantly performing class loading to load classes from the application or from the Android platform itself.
|
||||
This blind spot in static analysis tools raises our second problem statement:
|
||||
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb2*: #pb2-text
|
||||
|
||||
Class loading is an operation often ignored in static analysis.
|
||||
Class loading is an operation often ignored by static analysis tools.
|
||||
The exact algorithm used is not well known and might not be accurately modeled by static analysis tools.
|
||||
If this is the case, discrepancies between the model of the tools and the one used by Android could be used as a base for new obfuscation techniques.
|
||||
] <pb-2>
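To make the notion of class loading more concrete, the sketch below illustrates the parent-delegation scheme documented for `java.lang.ClassLoader` on a desktop JVM. It is only an illustration under that assumption: it does not claim to reproduce the algorithm used by the Android Runtime, which is precisely what Pb2 asks to model. The names `DelegationDemo`, `RecordingLoader` and `reachesChild` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    // Hypothetical loader that records every request that reaches findClass(),
    // i.e. every class that the parent loader could not resolve itself.
    static class RecordingLoader extends ClassLoader {
        final List<String> findCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findCalls.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    // Returns true if the child loader was asked to define the class itself.
    // Assumption: the standard JDK loadClass() is used, which asks the parent first.
    static boolean reachesChild(String className) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            // The class is unknown to both the parent and the child.
        }
        return loader.findCalls.contains(className);
    }

    public static void main(String[] args) {
        System.out.println(reachesChild("java.lang.String"));         // resolved by the parent
        System.out.println(reachesChild("com.example.DoesNotExist")); // falls through to the child
    }
}
```

The point of the sketch is only that lookup order decides which definition wins: a tool that resolves a class name in a different order than the runtime may end up analysing a different definition than the one actually executed.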
|
||||
|
||||
#jfl-note[
|
||||
Reflection is another common obfuscation technique against static analysis.
|
||||
Instead of directly invoking methods, the generic `Method.invoke()` #API is used, and the method is retrieved from its name in the form of a character string.
|
||||
The value of this string can be quite difficult to determine statically, so it is once again an issue more suitable for dynamic analysis.
|
||||
A reverse engineer can obtain the relevant information with dynamic analysing, but there is no standard way to make static analysis tools aware of it.
|
||||
This lead us to our last problem statement:
|
||||
When encountering a complex case of reflection (#ie using ciphered strings) or code loading, a reverse engineer will switch to dynamic analysis to collect the relevant data (the name of the methods called or the code that was loaded), then switch back to static analysis.
|
||||
This is doable for a manual analysis; unfortunately, the more automated tools that would require that runtime information to perform an accurate analysis may not have a way to access this new data.
|
||||
This led us to our last problem statement:
|
||||
][
|
||||
|
||||
Underdeveloped.
|
||||
Explain that a reverse engineer who finds reflection or dynamic loading can potentially capture the data with dynamic analysis.
|
||||
But this data then becomes useless if they go back to static analysis.
|
||||
Indeed, they often alternate between the two.
|
||||
They would need the data from the dynamic analysis to be taken into account by the static analysis, for example...
|
||||
|
||||
TODO: find an example that is simple to state
|
||||
]
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb3*: #pb3-text
|
||||
|
||||
Dynamic code loading and reflection are problems most suited for dynamic analysis.
|
||||
However, static analysis tools do not have access to the data collected at runtime.
|
||||
Encoding this information inside valid applications could be a way to make it universally available to any static analysis tool.
|
||||
#todo[say something about the impact that can have on tools?]
|
||||
Ideally, this encoding should not degrade the quality of the static analysis compared to the original application.
|
||||
] <pb-3>
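As an illustration of why reflection resists static analysis, the following minimal sketch hides both the class name and the method name behind encoded strings before calling `Method.invoke()`. Base64 is used here only as a stand-in for the ciphered strings mentioned above, and `ReflectionDemo`, `decode` and `callHidden` are hypothetical names, not code taken from a real application.

```java
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ReflectionDemo {
    // Hypothetical obfuscated constants (Base64 stands in for a real cipher).
    static final String ENC_CLASS = "amF2YS5sYW5nLlN0cmluZw=="; // "java.lang.String"
    static final String ENC_METHOD = "dG9VcHBlckNhc2U=";         // "toUpperCase"

    // Recovers the clear-text name at runtime only.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    // Invokes a method whose name never appears in clear text in the code.
    static Object callHidden(Object target) {
        try {
            Class<?> cls = Class.forName(decode(ENC_CLASS));
            Method method = cls.getMethod(decode(ENC_METHOD));
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callHidden("hidden call")); // prints HIDDEN CALL
    }
}
```

Statically, only the two opaque constants and the generic `invoke` call are visible. Running the code reveals that `String.toUpperCase()` is the actual target, which is the kind of runtime information that Pb3 proposes to make available to static tools.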
|
||||
|
||||
#[
|
||||
|
@ -100,29 +112,29 @@ This lead us to our last problem statement:
|
|||
The contributions of this thesis are the following:
|
||||
|
||||
+ We evaluate the reusability of Android static analysis tools published by the community:
|
||||
We rebuild the tools in their original environment as container images.
|
||||
With those containers, those tools are now readilly available capable of running either Docker of Singularity.
|
||||
We also tested those tools on a dataset of real-life applications balanced in order to have a significant number of applications with different caracteristics to assess which caracteristic impact the success of a tools.
|
||||
we rebuild the tools in their original environment as container images.
|
||||
With those containers, those tools are now readily available on any environment capable of running either Docker or Singularity.
|
||||
We tested those tools on a dataset of real-life applications balanced in order to have a significant number of applications with different characteristics to assess which characteristics impact the success of a tool.
|
||||
This work was presented at the ICSR 2024 conference~@rasta.
|
||||
+ We model the default class loading behavior of Android.
|
||||
Based on this model, we defined a class of obfuscation technique that we called _shadow attacks_ where an class definition in an #APK shadows the actual class definition.
|
||||
We show that common state of the arts tools like Jadx or Flowdroid do not implement this model correctly and thus can fall for those shadow attacks.
|
||||
We surveilled a large number of rescent Android applications and found that applications with classes shadowing the actual definition do exists, those are the result of quirks in the #APK compilation process and not deliberate obfuscation attempts.
|
||||
This work was publish in the Digital Threats journal~@classloaderinthemiddle. #todo[update ref when not 'just published' anymore]
|
||||
+ We propose an approach to allow static analysis tools to analyse application that perform dynamic code loading:
|
||||
We collect at runtime the bytecode dynamically loaded and the reflection calls informations, an patch the #APK file to perform those operation statically.
|
||||
Finally, we evaluate the impact this transformation has on the #jm-note[resiliance][wrong word?] of the tools we containerized previously.
|
||||
Based on this model, we define a class of obfuscation techniques that we call _shadow attacks_ where a class definition in an #APK shadows the actual class definition.
|
||||
We show that common state-of-the-art tools like Jadx or Flowdroid do not implement this model correctly and thus can fall for those shadow attacks.
|
||||
We analysed a large number of recent Android applications and found that applications with class shadowing do exist, though they are the result of quirks in the #APK compilation process and not deliberate obfuscation attempts.
|
||||
This work was published in the Digital Threats journal~@classloaderinthemiddle. #todo[update ref when not 'just published' anymore]
|
||||
+ We propose an approach to allow static analysis tools to analyse applications that perform dynamic code loading:
|
||||
We collect at runtime the dynamically loaded bytecode and the information about reflection calls, and patch the #APK file to perform those operations statically.
|
||||
Finally, we evaluate the impact this transformation has on the tools we containerized previously.
|
||||
|
||||
== Outline
|
||||
|
||||
This dissertation is composed of 6 chapters.
|
||||
This introduction is the first chapter.
|
||||
It is followed by @sec:bg that gives background information about Android and the different analysis techniques targetting Android applications.
|
||||
It is followed by @sec:bg, which gives background information about Android and the different analysis techniques targeting Android applications.
|
||||
|
||||
The next 3 chapters are dedicated to the contributions of this thesis.
|
||||
First @sec:rasta studdies the reusability of static analysis tools.
|
||||
Next in @sec:cl, we model the default class loading algorithm used by Android and the show the consequences for reverse engineering tools that implement a wrong model.
|
||||
First, @sec:rasta studies the reusability of static analysis tools.
|
||||
Next in @sec:cl, we model the default class loading algorithm used by Android and show the consequences for reverse engineering tools that implement a wrong model.
|
||||
Then, @sec:th presents an approach that allows static analysis tools to analyse applications that load bytecode at runtime.
|
||||
|
||||
Finally, @sec:conclusion summarizes the contributions of this thesis and opens perspectives for futur work.
|
||||
Finally, @sec:conclusion summarizes the contributions of this thesis and opens perspectives for future work.
|
||||
]
|
||||
|
|
|
@ -9,7 +9,7 @@
|
|||
|
||||
== Static Analysis <sec:bg-static>
|
||||
|
||||
In the past fifteen years, the research community released many tools to detect or analyze malicious behaviors in applications.
|
||||
In the past fifteen years, the research community released many tools to detect or analyse malicious behaviors in applications.
|
||||
Two main approaches can be distinguished: static and dynamic analysis~@Li2017.
|
||||
Dynamic analysis requires running the application in a controlled environment to observe runtime values and/or interactions with the operating system.
|
||||
For example, an Android emulator with a patched kernel can capture these interactions, but applying the required modifications is not a trivial task.
|
||||
|
@ -147,7 +147,7 @@ Reccuring examples of such support tools are Appktool (#eg Amandroid~@weiAmandro
|
|||
|
||||
The number of publications related to static analysis can make it difficult to find the right tool for the right task.
|
||||
Li #etal~@Li2017 published a systematic literature review for Android static analysis before May 2015.
|
||||
They analyzed 92 publications and classified them by goal, method used to solve the problem and underlying technical solution for handling the bytecode when performing the static analysis.
|
||||
They analysed 92 publications and classified them by goal, method used to solve the problem and underlying technical solution for handling the bytecode when performing the static analysis.
|
||||
In particular, they listed 27 approaches with an open-source implementation available.
|
||||
Nevertheless, experiments to evaluate the reusability of the listed software were not performed.
|
||||
#jfl-note[We believe that the effort of reviewing the literature for making a comprehensive overview of available approaches should be pushed further: an existing published approach with software that cannot be used for technical reasons endangers both the reproducibility and reusability of research.][To be highlighted?]
|
||||
|
|
|
@ -59,7 +59,7 @@ For each tool, both the usability and results of the tool were evaluated by aski
|
|||
The auditors reported that most of the tools require a significant amount of time to set up, often due to dependency issues and operating system incompatibilities.
|
||||
Reaves #etal propose to solve these issues by distributing a Virtual Machine with a functional build of the tool in addition to the source code.
|
||||
Regrettably, these Virtual Machines were not made available, preventing future researchers from taking advantage of the work done by the auditors.
|
||||
Reaves #etal also report that real world applications are more challenging to analyze, with tools having lower results, taking more time and memory to run, sometimes to the point of not being able to run the analysis.
|
||||
Reaves #etal also report that real world applications are more challenging to analyse, with tools having lower results, taking more time and memory to run, sometimes to the point of not being able to run the analysis.
|
||||
This result is worrying considering it was noticed on a dataset of only 16 real-world applications.
|
||||
A more diverse dataset would be needed to better assess the extent of the issue and give more insight into the factors impacting the performance of the tools.
|
||||
//We will confirm and expand this result in @sec:rasta with a larger dataset than only 16 real-world applications.
|
||||
|
|
|
@ -23,7 +23,7 @@ The observation of the success or failure of these analysis enables us to answer
|
|||
/*
|
||||
As a summary, the contributions of this paper are the following:
|
||||
|
||||
- We provide containers with a compiled version of all studied analysis tools, which ensures the reproducibility of our experiments and an easy way to analyze applications for other researchers. Additionally receipts for rebuilding such containers are provided.
|
||||
- We provide containers with a compiled version of all studied analysis tools, which ensures the reproducibility of our experiments and an easy way to analyse applications for other researchers. Additionally, recipes for rebuilding such containers are provided.
|
||||
- We provide a recent dataset of #NBTOTALSTRING applications balanced over the time interval 2010-2023.
|
||||
- We point out which static analysis tools of the SLR paper of Li #etal~@Li2017 can safely be used and we show that #resultunusable of the evaluated tools are unusable (considering that a tool that fails more than 50% of the time is unusable). In total, the success rate of the tools we could run is #resultratio on our dataset.
|
||||
- We discuss the effect of application features (date, size, SDK version, goodware/malware) on static analysis tools and the nature of the issues we found by studying statistics on the errors captured during our experiments.
|
||||
|
|
|
@ -176,7 +176,7 @@ We refer to this variant of usage as androguard_dad.
|
|||
In a second step, we explored the best sources to be selected among the possible forks of a tool.
|
||||
We reported some indicators about the explored forks and our decision about the selected one in @tab:rasta-sources.
|
||||
For each source code repository called "Origin", we reported in @tab:rasta-sources the number of GitHub stars attributed by users and we mentioned if the project is still alive (#ok in column Alive when a commit exists in the last two years).
|
||||
Then, we analyzed the fork tree of the project.
|
||||
Then, we analysed the fork tree of the project.
|
||||
We searched recursively if any forked repository contains a more recent commit than the last one of the branch mentioned in the documentation of the original repository.
|
||||
If such a commit is found (the number of such commits is reported in column Alive Forks Nb), we manually looked at the reasons behind this commit and considered if we should prefer this more up-to-date repository instead of the original one (column "Alive Forks Usable").
|
||||
As reported in @tab:rasta-sources, we excluded all forks, except IC3 for which we selected the fork JordanSamhi/ic3, because they always contain experimental code with no guarantee of stability.
|
||||
|
@ -185,7 +185,7 @@ For IC3, the fork seems promising: it has been updated to be usable on a recent
|
|||
We decided to keep these two versions of the tool (IC3 and IC3_fork) to compare their results.
|
||||
|
||||
Then, we self-allocated a maximum of four days for each tool to successfully read and follow the documentation, compile the tool and obtain the expected result when executing an analysis of a #MWE.
|
||||
We sent an email to the authors of each tool to confirm that we used the more suitable version of the code, that the command line we used to analyze an application is the most suitable one and, in some cases, requested some help to solve issues in the building process.
|
||||
We sent an email to the authors of each tool to confirm that we used the most suitable version of the code, that the command line we used to analyse an application is the most suitable one and, in some cases, requested some help to solve issues in the building process.
|
||||
We reported in @tab:rasta-sources the authors that answered our request and confirmed our decisions.
|
||||
|
||||
From this building phase, several observations can be made.
|
||||
|
|
|
@ -153,7 +153,7 @@ Regarding errors linked to the disk space, we observe few ratios for the excepti
|
|||
Manual inspections revealed that those errors are often a consequence of a failed apktool execution.
|
||||
|
||||
Second, the black squares indicate frequent errors that need to be investigated separately.
|
||||
In the next subsection, we manually analyzed, when possible, the code that generates this high ratio of errors and we give feedback about the possible causes and difficulties to write a bug fix.
|
||||
In the next subsection, we manually analysed, when possible, the code that generates this high ratio of errors and we give feedback about the possible causes and the difficulties of writing a bug fix.
|
||||
|
||||
=== Tool by Tool Investigation // <sec:rasta-tool-by-tool-inv>
|
||||
/*
|
||||
|
@ -211,7 +211,7 @@ Anadroid: DONE
|
|||
*/
|
||||
|
||||
#paragraph[Androguard and Androguard_dad][
|
||||
Surprisingly, while Androguard almost never fails to analyze an APK, the internal decompiler of Androguard (DAD) fails more than half of the time.
|
||||
Surprisingly, while Androguard almost never fails to analyse an APK, the internal decompiler of Androguard (DAD) fails more than half of the time.
|
||||
The analysis of the logs shows that the issue comes from the way the decompiled methods are stored: each method is stored in a file named after the method name and signature, and this file name can quickly exceed the size limit (255 characters on most file systems).
|
||||
It should be noted that Androguard_dad rarely fails on the Drebin dataset.
|
||||
This illustrates the importance of testing tools on real and up-to-date APKs: even a bad handling of filenames can influence an analysis.
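The failure mode can be reproduced with a small sketch. The naming scheme below (method name, then signature, then a `.java` suffix) is an assumption made for illustration and is not the actual code of Androguard; only the 255-character limit comes from the text above. The names `FileNameLimitDemo`, `fileNameFor` and `fits` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class FileNameLimitDemo {
    // Limit on a single file name component on most file systems.
    static final int MAX_NAME_LENGTH = 255;

    // Hypothetical naming scheme: one output file per method, named after
    // the method name and its signature, with path separators replaced.
    static String fileNameFor(String methodName, String signature) {
        return (methodName + " " + signature).replace('/', '_') + ".java";
    }

    // Checks the encoded length, since the limit is usually counted in bytes.
    static boolean fits(String fileName) {
        return fileName.getBytes(StandardCharsets.UTF_8).length <= MAX_NAME_LENGTH;
    }

    public static void main(String[] args) {
        String shortName = fileNameFor("onCreate", "(Landroid/os/Bundle;)V");
        // A method taking 20 parameters with long type names, as generated code may contain.
        String manyParams = "(" + "Lcom/example/some/deeply/nested/pkg/Callback;".repeat(20) + ")V";
        String longName = fileNameFor("dispatch", manyParams);

        System.out.println(shortName.length() + " characters, fits: " + fits(shortName));
        System.out.println(longName.length() + " characters, fits: " + fits(longName));
    }
}
```

Under this assumed scheme, a tool that checks the length beforehand could fall back to a shortened name, for instance a hash of the signature, instead of failing the whole decompilation.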
|
||||
|
|
|
@ -12,7 +12,7 @@ These benchmarks confirmed that some tools such as Amandroid and Flowdroid are l
|
|||
We confirm the hypothesis of Luo #etal that real-world applications lead to less efficient analysis than using hand-crafted test applications or old datasets~@luoTaintBenchAutomaticRealworld2022.
|
||||
In addition, even if Drebin is not hand-crafted, it is quite old and seems to present similar issues to hand-crafted datasets when used to evaluate a tool: we obtained really good results compared to the Rasta dataset -- which is more representative of real-world applications.
|
||||
|
||||
Our finding are also consistent with the numerical results of Pauck #etal that showed that #mypercent(106, 180) of DIALDroid-Bench~@bosuCollusiveDataLeak2017 real-world applications are analyzed successfully with the 6 evaluated tools~@pauckAndroidTaintAnalysis2018.
|
||||
Our findings are also consistent with the numerical results of Pauck #etal that showed that #mypercent(106, 180) of DIALDroid-Bench~@bosuCollusiveDataLeak2017 real-world applications are analysed successfully with the 6 evaluated tools~@pauckAndroidTaintAnalysis2018.
|
||||
Six years after the release of DIALDroid-Bench, we obtain a lower ratio of #mypercent(40.05, 100) for the same set of 6 tools but using the Rasta dataset of #NBTOTALSTRING applications.
|
||||
We extended this result to a set of #nbtoolsvariationsrun tools and obtained a global success rate of #resultratio.
|
||||
We confirmed that most tools require a significant amount of work to get them running~@reaves_droid_2016.
|
||||
|
|
|
@ -10,7 +10,7 @@ To mitigate this possible problem we contacted the authors of the tools to confi
|
|||
Before running the final experiment, we also ran the tools on a subset of our dataset and manually looked at the most common errors to ensure that they are not trivial errors that can be solved.
|
||||
|
||||
The timeout value and the amount of memory are arbitrarily fixed.
|
||||
To mitigate this issue, a small extract of our dataset has been analyzed with more memory/time and we checked that there was no significant difference in the results.
|
||||
To mitigate this issue, a small extract of our dataset has been analysed with more memory/time and we checked that there was no significant difference in the results.
|
||||
|
||||
Finally, the use of VirusTotal for determining if an application is a malware or not may be wrong.
|
||||
To limit the impact of errors, we used a threshold of at most 5 antiviruses (resp. no more than 0) reporting an application as being a malware (resp. goodware) for taking a decision about maliciousness (resp. benignness).
|
||||
|
|
|
@ -46,7 +46,7 @@ We present a new technique that "shadows" a class #ie embeds a class in the APK
|
|||
The goal of such an attack is to confuse them during the reversing process: at runtime the real class will be loaded from another location of the APK file or from the #Asdk, instead of the shadow version.
|
||||
This attack can be applied to regular classes of the #Asdk or to hidden classes of Android~@he_systematic_2023 @li_accessing_2016.
|
||||
We show how these attacks can confuse the tools of the reverser when he performs a static analysis.
|
||||
In order to evaluate if such attacks are already used in the wild, we analyzed #nbapk applications from 2023 that we extracted randomly from AndroZoo~@allixAndroZooCollectingMillions2016.
|
||||
In order to evaluate if such attacks are already used in the wild, we analysed #nbapk applications from 2023 that we extracted randomly from AndroZoo~@allixAndroZooCollectingMillions2016.
|
||||
Our main result is that #shadowsdk of these applications contain shadow collisions against the #SDK and #shadowhidden against hidden classes.
|
||||
Our investigations conclude that most of these collisions are not voluntary attacks, but we highlight one specific malware sample performing strong obfuscation revealed by our detection of one shadow attack.
|
||||
|
||||
|
|
|
@ -12,7 +12,7 @@ This behavior is now implemented in modern Java virtual machines.
|
|||
Later, Tozawa and Hagiya~@tozawa_formalization_2002 proposed a formalization of the Java Virtual Machine supporting dynamic class loading in order to ensure type safety.
|
||||
Those works ensure strong safety for the Java Virtual Machine, in particular when linking new classes at runtime.
|
||||
Although Android has a similar mechanism, the implementation is not shared with the JVM of Oracle.
|
||||
Additionally, in this paper, we do not focus on spoofing classes at runtime, but on confusion that occurs when using a static analyzer used by a reverser that tries to understand the code loading process offline.
|
||||
Additionally, in this paper, we do not focus on spoofing classes at runtime, but on confusion that occurs when using a static analyser used by a reverser that tries to understand the code loading process offline.
|
||||
|
||||
Contributions about Android class loading focus on using the capabilities of class loading to extend Android features or to prevent reverse engineering of Android applications.
|
||||
For instance, Zhou #etal~@zhou_dynamic_2022 extend the class loading mechanism of Android to support regular Java bytecode and Kriz and Maly~@kriz_provisioning_2015 propose a new class loader to automatically load modules of an application without user interactions.
|
||||
|
|
|
@ -251,10 +251,10 @@ As discussed earlier in the paper, the documentation can lack some classes.
|
|||
Consequently, the most reliable source is the smartphone itself.
|
||||
It should be noted that none of these methods can be generalized for all possible versions of Android, as the exact list will depend on the exact targeted device, possibly modified by the manufacturer.
|
||||
Thus, to counter shadow attacks, the static analysis tools that we evaluated need to embed multiple lists of platform classes, one for each Android version.
|
||||
Then, the best heuristic would be to use the list of platform classes that is closest to the target SDK of the analyzed application.
|
||||
Then, the best heuristic would be to use the list of platform classes that is closest to the target SDK of the analysed application.
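The heuristic described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the mapping from API level to class list, the tie-breaking rule (prefer the more recent API level) and the class names in the usage example are all hypothetical and do not come from any of the evaluated tools.

```python
from typing import Dict, FrozenSet, Iterable, List


def closest_platform_classes(
    target_sdk: int, lists_by_api: Dict[int, FrozenSet[str]]
) -> FrozenSet[str]:
    """Return the platform class list whose API level is closest to target_sdk."""
    if not lists_by_api:
        raise ValueError("no platform class list available")
    # Assumption: on a tie, prefer the more recent API level.
    best = min(lists_by_api, key=lambda api: (abs(api - target_sdk), -api))
    return lists_by_api[best]


def shadowing_classes(
    apk_classes: Iterable[str], target_sdk: int, lists_by_api: Dict[int, FrozenSet[str]]
) -> List[str]:
    """List the classes of the APK whose names collide with platform classes."""
    platform = closest_platform_classes(target_sdk, lists_by_api)
    return sorted(set(apk_classes) & platform)
```

For example, with `lists = {28: frozenset({"Landroid/app/Activity;"}), 33: frozenset({"Landroid/app/Activity;", "Lcom/example/Hidden;"})}`, calling `shadowing_classes(["Lcom/example/Hidden;", "Lcom/example/Main;"], 32, lists)` selects the API 33 list and reports only `Lcom/example/Hidden;` as a collision.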
|
||||
|
||||
Some tools like Flowdroid would require additional countermeasures: to compute the exact flow of data, Flowdroid also needs to analyze the code of platform classes.
|
||||
Flowdroid has already analyzed the SDK classes, but not the hidden classes.
|
||||
Some tools like Flowdroid would require additional countermeasures: to compute the exact flow of data, Flowdroid also needs to analyse the code of platform classes.
|
||||
Flowdroid has already analysed the SDK classes, but not the hidden classes.
|
||||
In addition to the data flow in hidden classes, Flowdroid needs a list of data sources and sinks from those classes.
|
||||
%Other analysis tools may require additional data from platform classes, which may be too difficult to obtain.
|
||||
|
||||
|
@ -287,7 +287,7 @@ Second, the attacker could use a packer to unpack code at runtime in a first pha
|
|||
The reverse engineer would have to perform a dynamic analysis, for example using a tool such as Dexhunter~@zhang2015dexhunter, to recover new DEX files that are loaded by a custom class loader.
|
||||
Then, the reverse engineer would go back to a new static analysis and could have the problem of solving shadow attacks, for example, if a class is defined multiple times in the loaded DEX files.
|
||||
|
||||
Because the interaction between shadow attacks and other obfuscation techniques often relies on a loading mechanism implemented by the developer, investigating these cases requires analyzing the Java bytecode that handles the loading.
|
||||
Because the interaction between shadow attacks and other obfuscation techniques often relies on a loading mechanism implemented by the developer, investigating these cases requires analysing the Java bytecode that handles the loading.
|
||||
This problem is left as future work.
|
||||
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
== Threat to Validity <sec:cl-ttv>
|
||||
|
||||
During the analysis of the ART internals, we made the hypothesis that its different operating modes are equivalent: we analyzed the loading process for classes stored as non-optimized `.dex` format, and not for the pre-compiled `.oat`.
|
||||
During the analysis of the ART internals, we made the hypothesis that its different operating modes are equivalent: we analysed the loading process for classes stored as non-optimized `.dex` format, and not for the pre-compiled `.oat`.
|
||||
It is a reasonable hypothesis to suppose that the two implementations have been produced from the same algorithm using two compilation workflows.
|
||||
Similarly, we assumed that the platform classes stored in `boot.art` are the same as the ones in `BOOTCLASSPATH`.
|
||||
We empirically confirmed our hypothesis on an Android Emulator, but we may have missed some edge cases.
|
||||
|
|
|
@ -1,9 +1,20 @@
|
|||
#import "../lib.typ": todo, epigraph
|
||||
#import "../lib.typ": epigraph, ART, APK, ie, highlight-block
|
||||
#import "X_var.typ": nbapk
|
||||
|
||||
= Class Loaders in the Middle: Confusing Android Static Analyzers <sec:cl>
|
||||
= Class Loaders in the Middle: Confusing Android Static Analysers <sec:cl>
|
||||
|
||||
#epigraph("Esmerelda Weatherwax, Wyrd Sisters, Terry Pratchett")[Things that try to look like things often do look more like things than things.]
|
||||
|
||||
#align(center, highlight-block(inset: 15pt, width: 75%, block(align(left)[
|
||||
The dynamic linking and loading of the different classes by the #ART is a complex task that can eventually be exploited by an attacker.
|
||||
In particular, if the developer adds a class whose name collides with the name of a class of the Android operating system or another class in the application, they may confuse a reverse engineer in charge of studying such an application.
|
||||
In this chapter, we explore the consequences of those collisions.
|
||||
We highlight three attacks that we call shadow attacks because the class implementation that a reverser would find shadows a second implementation with a higher priority.
|
||||
In particular, we show that most of the evaluated static analysis tools that a reverser would use choose the shadow implementation and output a wrong result.
|
||||
In a dataset of #nbapk applications, we also investigate whether shadow attacks are used in the wild and show that, most of the time, there is no malicious behavior behind them.
|
||||
])))
|
||||
|
||||
|
||||
#include("0_intro.typ")
|
||||
#include("1_related_work.typ")
|
||||
#include("2_classloading.typ")
|
||||
|
|
|
@ -16,7 +16,7 @@ After running the dynamic analysis on our dataset the first time we realised our
|
|||
We found that #mypercent(dyn_res.all.nb_failed_first_run, dyn_res.all.nb) of the executions failed with various errors.
|
||||
The majority of those errors were related to failures to connect to the Frida agent or start the activity from Frida.
|
||||
Some of those errors seemed to come from Frida, while others seemed related to the emulator failing to start the application.
|
||||
We found that simply relaunching the analysis for the applications that failed was the simplest way to fix those issues, and after 6 passes we went from #num(dyn_res.all.nb_failed_first_run) to #num(dyn_res.all.nb_failed) applications that could not be analyzed.
|
||||
We found that simply relaunching the analysis for the applications that failed was the simplest way to fix those issues, and after 6 passes we went from #num(dyn_res.all.nb_failed_first_run) to #num(dyn_res.all.nb_failed) applications that could not be analysed.
|
||||
The remaining errors look more related to the application itself or Android, with #num(96) errors being a failure to install the application, and #num(110) others being a null pointer exception from Frida.
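The relaunch strategy described above can be sketched as a simple retry loop. This is a minimal Python illustration, not the actual experiment harness: `run_with_retries` and the `analyse` callback are hypothetical names, and the sketch assumes the 6 passes include the first run and that a run reports success or failure as a boolean.

```python
from typing import Callable, Iterable, Set


def run_with_retries(
    apps: Iterable[str], analyse: Callable[[str], bool], max_passes: int = 6
) -> Set[str]:
    """Run `analyse` on every app, relaunching only the failures on each pass.

    Returns the set of apps that still fail after at most `max_passes` passes.
    """
    pending = set(apps)
    for _ in range(max_passes):
        if not pending:
            break
        # Keep only the applications whose analysis failed during this pass.
        pending = {app for app in pending if not analyse(app)}
    return pending
```

Transient failures (for instance a lost connection to the instrumentation agent) disappear after a few passes, while the applications left in the returned set are the ones whose errors are more likely tied to the application itself.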
|
||||
|
||||
Unfortunately, although we managed to start the applications, we can see from the list of activities visited by GroddDroid that a majority (#mypercent(dyn_res.all.z_act_visited, dyn_res.all.nb - dyn_res.all.nb_failed)) of the applications stopped before even starting one activity.
|
||||
|
|
12
papers/4_class_loader/abstract.typ
Normal file
|
@ -0,0 +1,12 @@
|
|||
#import "X_var.typ": nbapk
|
||||
|
||||
Android is the most deployed operating system for smartphones.
|
||||
Android applications are designed by external developers that embed their classes in the APK file later installed on the smartphone.
|
||||
At runtime, Android executes these classes in addition to classes provided by the operating system itself.
|
||||
The dynamic linking and loading of the different classes is a complex task that can eventually be exploited by an attacker.
|
||||
In particular, if the developer adds a class whose name collides with the name of a class of the Android operating system, he may confuse a reverser in charge of studying such an application.
|
||||
In this paper, we explore the possible collisions that can occur between classes defined multiple times at different locations i.e. multiple times in the APK file or, at the same time, in the APK and the operating system.
|
||||
We highlight three attacks that we call shadow attacks because the class implementation that a reverser would find shadows a second implementation with a higher priority.
|
||||
In particular, we show that most of the evaluated static analysis tools that a reverser would use choose the shadow implementation and output a wrong result.
|
||||
In a dataset of #nbapk applications, we also explored if shadow attacks are used in the wild and show that most of the time, there is no malicious behavior behind them.
|
||||
|