parent
d9650d0775
commit
d1dba30426
11 changed files with 159 additions and 98 deletions
|
@ -25,12 +25,10 @@ Regrettably, analysis tools mostly return results in an ad hoc format, making it
|
|||
Some tools, however, encode their results in the form of a new, augmented Android application.
|
||||
The idea is that any Android analysis tool must be able to handle an Android application in the first place, so it will have access to this new information.
|
||||
|
||||
In this section, we explore in more detail those different aspects of Android reverse engineering.
|
||||
We will begin this chapter with a presentation of the basics of the Android ecosystem.
|
||||
Readers already familiar with Android reverse engineering might want to skip to @sec:bg-probl, where we put our problem statements in perspective.
|
||||
We will then examine the state of the art related to those problem statements in @sec:bg-soa, and conclude this chapter in @sec:bg-conclusion.
|
||||
|
||||
#todo[Announce the outline]
|
||||
#todo[Short background intro on platform classes, separate from soa]
|
||||
#todo[Short intro on class loading, separate from soa]
|
||||
#todo[Clearly separate background and state of the art]
|
||||
#todo[Dedicate sections/subsections to the 3 problems]
|
||||
#todo[Summary of the problems at the end of each soa section]
|
||||
#todo[Problem statement before soa]
|
|
@ -1,14 +1,13 @@
|
|||
#import "../lib.typ": todo, num, APK, JAR, AXML, ART, SDK, JNI, NDK, DEX, XML, API, ZIP, jfl-note
|
||||
|
||||
== Android <sec:bg-android>
|
||||
=== Android <sec:bg-android>
|
||||
|
||||
Android is the smartphone operating system developed by Google.
|
||||
It is based on a Long Term Support Linux kernel, to which patches developed by the Android community are added.
|
||||
On top of the kernel, Android redeveloped many of the usual components used by Linux-based operating systems, and added new ones.
|
||||
On top of the kernel, Android redeveloped many of the usual components used by Linux-based operating systems, like the init system or the standard C library, and added new ones, like the #ART that executes the applications.
|
||||
Those changes make Android a very distinctive operating system.
|
||||
#jfl-note[][Figures to illustrate?]
|
||||
|
||||
=== Android Applications <sec:bg-android>
|
||||
==== Android Applications <sec:bg-android>
|
||||
|
||||
Applications in the Android ecosystem are distributed in the #APK format.
|
||||
#APK files are #JAR files with additional features, and #JAR files are themselves #ZIP files with additional features.
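
Because an #APK is a #ZIP archive at its core, a generic archive library is enough to list its content. The following is a minimal sketch in Python, assuming only the standard `zipfile` module; the helper names and the toy archive built in memory are illustrative and not taken from any Android tool.

```python
import io
import zipfile


def list_apk_entries(apk_bytes: bytes) -> list[str]:
    """Return the sorted names of the files stored in an APK, read as a plain ZIP."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(apk.namelist())


def build_toy_apk() -> bytes:
    """Build an in-memory archive that mimics the layout of an APK (placeholder content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"placeholder")
        apk.writestr("classes.dex", b"placeholder")
        apk.writestr("res/layout/main.xml", b"placeholder")
    return buffer.getvalue()


entries = list_apk_entries(build_toy_apk())
```

Reading the archive this way only gives access to the raw files: the manifest and the bytecode still need dedicated parsers.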
|
||||
|
@ -21,10 +20,10 @@ When ressources are present in `res/`, the file `resources.arsc` is also present
|
|||
The `assets/` folder contains the files that are used directly by the application code.
|
||||
Depending on the application and compilation process, any kind of other files and folders can be added to the application.
|
||||
|
||||
==== Signature
|
||||
===== Signature
|
||||
|
||||
Android applications are cryptographically signed to prove their authorship.
|
||||
Applications signed with same key are considered developed by the same entity.
|
||||
Applications signed with the same key are considered developed by the same entity.
|
||||
This allows applications to be updated securely, and applications can declare security permissions that restrict access to some features to applications from the same author.
|
||||
|
||||
Android has several signature schemes coexisting:
|
||||
|
@ -36,7 +35,7 @@ Android has several signature schemes coexisting:
|
|||
- The v4 signature scheme is complementary to the v2/v3 signature scheme.
|
||||
Signature data are stored in an external, `.apk.idsig` file.
|
||||
|
||||
==== Android Manifest
|
||||
===== Android Manifest
|
||||
|
||||
The Android Manifest is stored in the `AndroidManifest.xml` file, encoded in the binary #AXML format.
|
||||
The manifest declares important information about the application:
|
||||
|
@ -46,7 +45,7 @@ The manifest declare important informations about the application:
|
|||
- Intent filters to list the intents that can start or be sent to the application components.
|
||||
- Security permissions required by the application.
|
||||
|
||||
==== Code <sec:bg-android-code-format>
|
||||
===== Code <sec:bg-android-code-format>
|
||||
|
||||
An application usually contains at least a `classes.dex` file containing Dalvik bytecode.
|
||||
This is the format executed by the Android #ART.
|
||||
|
@ -60,14 +59,14 @@ Because native code is compiled for a specific architecture, `.so` files are pre
|
|||
For example, `lib/arm64-v8a/libexample.so` is the version of the `example` library compiled for the 64-bit ARM architecture.
|
||||
Because smartphones mostly use ARM processors, it is not rare to see applications that only have the ARM version of their native code.
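
The `lib/<architecture>/<library>.so` layout described above can be checked mechanically. The sketch below groups native libraries by architecture from a list of archive entry names; the function name and the sample entries are hypothetical.

```python
from collections import defaultdict


def native_libs_by_abi(entry_names: list[str]) -> dict[str, list[str]]:
    """Group the `.so` files found under `lib/<abi>/` by architecture name."""
    libs = defaultdict(list)
    for name in entry_names:
        parts = name.split("/")
        # Only keep entries of the exact form lib/<abi>/<file>.so
        if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
            libs[parts[1]].append(parts[2])
    return dict(libs)


sample_entries = [
    "classes.dex",
    "lib/arm64-v8a/libexample.so",
    "lib/armeabi-v7a/libexample.so",
]
abis = native_libs_by_abi(sample_entries)
```

An application whose result only contains ARM keys, as here, cannot run its native code on an x86 device or emulator without some form of translation.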
|
||||
|
||||
==== Resources
|
||||
===== Resources
|
||||
|
||||
Application user interfaces require many kinds of specific assets, which are stored in `res/`.
|
||||
Developing graphical interfaces for applications requires many kinds of specific assets, which are stored in `res/`.
|
||||
Those resources include bitmap images, text, layouts, etc.
|
||||
Data like layouts, colors or text are stored in binary #AXML.
|
||||
An additional file, `resources.arsc`, in a custom binary format, contains a list of the resource names, ids, and their properties.
|
||||
|
||||
==== Compilation Process
|
||||
===== Compilation Process
|
||||
|
||||
For the developer, the compilation process is handled by Android Studio and is mostly transparent.
|
||||
Behind the scenes, Android Studio relies on Gradle to orchestrate the different compilation steps:
|
||||
|
@ -97,16 +96,14 @@ Since 2021, Google requires that new applications in the Google Play app store t
|
|||
The main difference is that Google will perform the last packaging steps and generate (and sign) the application itself.
|
||||
This allows Google to generate different applications for different targets, and to avoid including unnecessary files in the application, like native code targeting the wrong architecture.
|
||||
|
||||
=== Android Runtime <sec:bg-art>
|
||||
==== Android Runtime <sec:bg-art>
|
||||
|
||||
The Android runtime environment has many specificities that set it apart from other platforms.
|
||||
A heavy emphasis is put on isolating the applications from one another, as well as from the system's critical capabilities.
|
||||
The code execution itself can be confusing at first.
|
||||
Instead of the usual linear model with a single entry point, applications have many entry points that are called by the Android framework in response to external events.
|
||||
|
||||
==== Application Architecture
|
||||
|
||||
#todo[Subsection name?]
|
||||
===== Application Architecture
|
||||
|
||||
Android applications expose their components to the Android Runtime (#ART) via classes inheriting specific classes from the Android #SDK.
|
||||
Four classes represent application components that can be used as entry points:
|
||||
|
@ -125,19 +122,50 @@ In addition to the componants declared in the manifest that act as entry points,
|
|||
The most obvious cases are in the user interface: for example, a button will call a callback method defined by the application when clicked.
|
||||
Other parts of the #API also rely on non-linear execution: for example, when an application sends an intent (see @sec:bg-sandbox), the intent sent in response is transmitted back to the application by calling another method.
|
||||
|
||||
==== Application Isolation and Interprocess Communication <sec:bg-sandbox>
|
||||
===== Application Isolation and Interprocess Communication <sec:bg-sandbox>
|
||||
|
||||
On Android, each application has its own storage folders and the application processes are isolated from other applications and the hardware interfaces.
|
||||
On Android, each application has its own storage folders and the application processes are isolated from each other and from the hardware interfaces.
|
||||
This sandboxing is done using Linux security features like group and user permissions, SELinux, and seccomp.
|
||||
The sandboxing is adjusted according to the permissions requested in the `AndroidManifest.xml` file of the applications.
|
||||
In addition, most features of the Android system can only be accessed through Binder, Android's main interprocess communication channel.
|
||||
|
||||
Binder is a component of the Android framework, external to the application, that all applications can communicate with.
|
||||
Applications can send messages to Binder, called *intents*; Binder will check if the application is allowed to send them, then forward them to the appropriate component, which can then respond with another intent.
|
||||
Applications that receive intents must declare intent filters to indicate which intents can be sent to the application, and which classes receive them.
|
||||
Intents are central to Android applications and are not used just to access Android capabilities.
|
||||
For instance, activities and services are started by receiving an intent, and it is not uncommon for an application to send intents to itself to switch activities.
|
||||
Applications can send messages to Binder, called *intents*.
|
||||
Binder will check if the application is allowed to send it, and then forward it to the appropriate component.
|
||||
This component can then respond with another intent.
|
||||
Applications must declare intent filters to indicate which intents can be sent to the application, and which classes receive them.
|
||||
Intents are central to Android applications and are not just used to access Android capabilities.
|
||||
For instance, activities and services are started by receiving an intent, and it is not uncommon for an application to send intents to itself to switch between activities.
|
||||
Intents can also be sent directly from Android to the application: when a user starts an application by tapping its icon, Android will send an intent to the class of the application that defined the intent filter for the `android.intent.action.MAIN` intent.
|
||||
One interesting feature of Binder is that intents do not need to explicitly name the targeted application and class: intents can be implicit and request an action without knowing the exact application that will perform it.
|
||||
An example of this behavior is when an application wants to open a file: an `android.intent.action.VIEW` intent is sent with the file location, and Binder will find an application capable of viewing the file.
|
||||
An example of this behaviour is when an application wants to open a file: an `android.intent.action.VIEW` intent is sent with the file location and type, and Binder will find and start an application capable of viewing this file.
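
The matching of an implicit intent against declared intent filters can be pictured with a toy model. The sketch below is a deliberate simplification and not the actual Android resolution algorithm: the `Intent` and `IntentFilter` classes, the `resolve` function, and the component names are all hypothetical, and only the action and the data type are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentFilter:
    component: str      # class that receives the intent (hypothetical names below)
    action: str         # action accepted by the component
    mime_types: tuple   # data types the component can handle


@dataclass(frozen=True)
class Intent:
    action: str
    mime_type: str


def resolve(intent: Intent, filters: list[IntentFilter]) -> list[str]:
    """Return the components whose filter accepts the implicit intent."""
    return [
        f.component
        for f in filters
        if f.action == intent.action and intent.mime_type in f.mime_types
    ]


declared_filters = [
    IntentFilter("com.example.pdf/.ViewerActivity", "android.intent.action.VIEW", ("application/pdf",)),
    IntentFilter("com.example.gallery/.ImageActivity", "android.intent.action.VIEW", ("image/png", "image/jpeg")),
    IntentFilter("com.example.gallery/.MainActivity", "android.intent.action.MAIN", ()),
]
candidates = resolve(Intent("android.intent.action.VIEW", "application/pdf"), declared_filters)
```

For a static analysis tool, the consequence is that the receiver of an implicit intent cannot be read from the sending code alone: it depends on the filters declared by every installed application.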
|
||||
|
||||
===== Platform Classes <sec:bg-platform>
|
||||
|
||||
In addition to the classes they include, Android applications have access to classes provided by Android, stored on the phone.
|
||||
Those classes are called _platform classes_.
|
||||
They are divided between #SDK classes and hidden #API.
|
||||
The #SDK classes can be seen as the Android standard library.
|
||||
They are documented by Google, and have a certain stability from version to version.
|
||||
In case of breaking changes, the changes are listed by Google as well.
|
||||
The list of #SDK classes is available at compile time in the form of an `android.jar` file to link against.
|
||||
|
||||
Conversely, hidden #API are undocumented methods used internally by the #ART.
|
||||
Still, they are loaded by the application and can be used by it.
|
||||
|
||||
===== Class Loading and Reflection
|
||||
|
||||
Class loading is the mechanism used by Android to find and select the implementation of a class when encountering a reference to it.
|
||||
Android developers mainly use it to load bytecode dynamically from a source other than the application itself (#eg a file downloaded at runtime), using `ClassLoader` objects.
|
||||
`Class` objects are then retrieved from those class loaders, using their names in the form of strings to identify them.
|
||||
Those `Class` objects can then be instantiated into objects, and `Method` objects can be used to call the methods of the instantiated objects.
|
||||
The process of manipulating `Class` and `Method` objects instead of using bytecode instructions is called reflection.
|
||||
Reflection is not limited to bytecode that has been dynamically loaded: it can be used for any class or method available to the application.
|
||||
|
||||
Because `ClassLoader` objects are only used explicitly when loading bytecode dynamically or when using reflection, it is often forgotten that the #ART uses class loaders constantly behind the scenes, allowing classes from the application and platform classes to coexist seamlessly.
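
The key property of reflection, selecting classes and methods through strings rather than through direct references in the code, is not specific to Java. The sketch below is only an analogy written in Python with the standard `importlib` module, not Android code; the helper names `load_class` and `invoke` are hypothetical and merely mirror the roles of a class loader lookup and a `Method` call.

```python
import importlib


def load_class(module_name: str, class_name: str) -> type:
    """Look a class up from its name, in the spirit of a class loader lookup."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def invoke(target: object, method_name: str, *args):
    """Call a method selected by name, in the spirit of a reflective method call."""
    method = getattr(target, method_name)
    return method(*args)


# Neither the class nor the method appears as a direct reference below:
# both are plain strings that could just as well be computed at runtime.
counter_class = load_class("collections", "Counter")
counter = counter_class("abca")
most_common = invoke(counter, "most_common", 1)
```

Since the names are ordinary string values, a tool that only follows direct references in the code will not see which class is instantiated or which method is called.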
|
||||
|
||||
|
||||
#v(2em)
|
||||
|
||||
In this subsection, we presented the most notable specificities of the Android ecosystem.
|
||||
In the next section, we will continue with the various tools available for an Android reverse engineer.
|
|
@ -1,13 +1,19 @@
|
|||
#import "../lib.typ": todo, APK, IDE, SDK, DEX, ADB, ART, eg, XML, AXML, API, jfl-note
|
||||
|
||||
== Reverse Engineering Tools <sec:bg-tools>
|
||||
=== Reverse Engineering Tools <sec:bg-tools>
|
||||
|
||||
Due to the specificities of Android, reverse engineers need tools adapted to Android.
|
||||
The development tools provided by Google can be used for basic operations.
|
||||
Apktool and Jadx are common tools used to read the content of an application, while Androguard and Soot can be used as libraries to automate analysis.
|
||||
For a more dynamic approach, Frida is a toolkit that can be used to intercept method calls and execute custom code while an application is running.
|
||||
The development tools provided by Google can be used for basic operations, but a reverse engineer will quickly need more specialized tools.
|
||||
Usually, the first step when analysing an application is to look at its content.
|
||||
Apktool and Jadx are common tools used to convert the content of an application into a readable format.
|
||||
Analysing an application this way, without running it, is called static analysis.
|
||||
For more advanced forms of static analysis, Androguard and Soot can be used as libraries to automate analyses.
|
||||
When static analysis becomes too complicated (#eg if the application uses obfuscation techniques), a reverse engineer might switch to dynamic analysis.
|
||||
This time, the application is executed and the analyst will scrutinise the behaviour of the application.
|
||||
Frida is a good option to support this dynamic analysis.
|
||||
It is a toolkit that can be used to intercept method calls and execute custom code while an application is running.
|
||||
|
||||
=== Android Studio <sec:bg-android-studio>
|
||||
==== Android Studio <sec:bg-android-studio>
|
||||
|
||||
The whole Android development ecosystem is packaged by Google in the #IDE Android Studio#footnote[https://developer.android.com/studio].
|
||||
In practice, Android Studio is a source-code editor that wraps around the different tools of the Android #SDK.
|
||||
|
@ -31,44 +37,44 @@ Among the notable tools in the #SDK, they are:
|
|||
It can also be used to perform different level of optimization of the bytecode generated.
|
||||
- `aapt`/`aapt2` (Android Asset Packaging Tool): This tool is used to build the #APK file.
|
||||
It is commonly used by other tools that repackage applications like Apktool.
|
||||
Behind the scenes, it converts #XML to binary #AXML and ensures the right files have the right compression and alignment (#eg some resource files are mapped in memory by the #ART, and thus need to be aligned and not compressed).
|
||||
Behind the scenes, it converts #XML to binary #AXML and ensures that each file has the right compression and alignment (#eg some resource files are mapped in memory by the #ART, and thus need to be aligned and not compressed).
|
||||
- `apksigner`: the tool used to sign an #APK file.
|
||||
When repackaging an application, for example with Apktool, the new application needs to be signed.
|
||||
|
||||
=== Apktool <sec:bg-apktool>
|
||||
==== Apktool <sec:bg-apktool>
|
||||
|
||||
Apktool#footnote[https://apktool.org/] is a _reengineering tool_ for Android #APK files.
|
||||
It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/baksmali#footnote[https://github.com/JesusFreke/smali] to convert the #DEX files to smali, an assembly-like language that matches the Dalvik bytecode instructions.
|
||||
The main strength of Apktool is that, after an application has been disassembled, its content can be edited and reassembled into a new #APK. #jfl-note[limits? does it always work?]
|
||||
|
||||
=== Androguard <sec:bg-androguard>
|
||||
==== Androguard <sec:bg-androguard>
|
||||
|
||||
Androguard#footnote[https://github.com/androguard/androguard]~@desnos:adnroguard:2011 is a Python library for parsing and analysing #APK files.
|
||||
#jfl-note[Its main feature is disassembling #APK files.][backend #sym.eq.not apktool?]
|
||||
Androguard#footnote[https://github.com/androguard/androguard]~@desnos:adnroguard:2011 is a Python library for parsing and disassembling #APK files.
|
||||
It can be used to automatically read Android manifests, resources, and bytecode.
|
||||
Contrary to Apktool, it can be used programmatically, without parsing text files, to analyse the application, but it cannot repackage a modified application.
|
||||
Contrary to Apktool, which generates text files, it can be used as a library to analyse the application programmatically.
|
||||
However, contrary to Apktool, it cannot repackage a modified application.
|
||||
|
||||
In addition, it can perform additional analyses, like computing a call graph or control flow graph of the application.
|
||||
We will explain what those graphs are later in @sec:bg-static.
|
||||
|
||||
In addition, it can perform additional analyses, like computing a call graph or control flow graph.
|
||||
|
||||
=== Jadx <sec:bg-jadx>
|
||||
==== Jadx <sec:bg-jadx>
|
||||
|
||||
Jadx#footnote[https://github.com/skylot/jadx] is an application decompiler.
|
||||
It converts #DEX files to Java source code.
|
||||
It is not always capable of decompiling all classes of an application, so it cannot be used to recompile a new application, but the generated code can be very helpful when reverse engineering an application.
|
||||
In addition to decompiling #DEX files, Jadx can also decode Android manifests and application resources.
|
||||
|
||||
=== Soot <sec:bg-soot>
|
||||
==== Soot <sec:bg-soot>
|
||||
|
||||
Soot#footnote[https://github.com/soot-oss/soot]~@Arzt2013 is a Java optimization framework.
|
||||
It can lift Java bytecode to other intermediate representations that can be used to perform optimizations, then converted back to bytecode.
|
||||
Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse Android applications.
|
||||
One of the best known examples of Soot usage for Android analysis is Flowdroid~@Arzt2014a, a tool that computes data flow in an application.
|
||||
Soot#footnote[https://github.com/soot-oss/soot]~@Arzt2013 was originally a Java optimization framework.
|
||||
It could lift Java bytecode to other intermediate representations that could be optimized, then converted back to bytecode.
|
||||
Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse and modify Android applications.
|
||||
One of the best known examples of Soot usage for Android analysis is Flowdroid~@Arzt2014a, a tool that computes data flow in an application.
|
||||
|
||||
A new version of Soot, SootUp#footnote[https://github.com/soot-oss/SootUp], is currently being worked on.
|
||||
Compared to Soot, it has a modernized interface and architecture, but it is not yet feature-complete and some tools like Flowdroid are still using Soot.
|
||||
|
||||
=== Frida <sec:bg-frida>
|
||||
==== Frida <sec:bg-frida>
|
||||
|
||||
Frida#footnote[https://frida.re/] is a dynamic instrumentation toolkit.
|
||||
It allows the reverse engineer to inject and run JavaScript code inside a running application.
|
||||
|
@ -85,5 +91,5 @@ Malware might implement countermeasures that avoid running malicious payload in
|
|||
|
||||
Those tools are quite useful for manual operations.
|
||||
However, considering the complexity of modern Android applications, it might take a lot of work for a reverse engineer to analyse one application.
|
||||
In the next section, we will see more advanced techniques that have been developed to analyse Android applications.
|
||||
|
||||
Different techniques have been developed to streamline the analysis.
|
||||
Next, we will see the most common of those techniques for static analysis.
|
|
@ -2,27 +2,18 @@
|
|||
#import "../lib.typ": todo, jm-note, jfl-note
|
||||
#import "@preview/diagraph:0.3.5": raw-render
|
||||
|
||||
//== Android Reverse Engineering Techniques <sec:bg-techniques>
|
||||
|
||||
//#todo[swap with tool section ?]
|
||||
|
||||
|
||||
== Static Analysis <sec:bg-static>
|
||||
|
||||
In the past fifteen years, the research community released many tools to detect or analyse malicious behaviors in applications.
|
||||
Two main approaches can be distinguished: static and dynamic analysis~@Li2017.
|
||||
Dynamic analysis requires running the application in a controlled environment to observe runtime values and/or interactions with the operating system.
|
||||
For example, an Android emulator with a patched kernel can capture these interactions, but applying the required modifications is not a trivial task.
|
||||
Such an approach is limited by the time required to execute even a limited part of the application, with no guarantee on the code coverage obtained.
|
||||
Dynamic analysis is also limited by evasion techniques that may prevent the execution of malicious parts of the code.
|
||||
As a consequence, a lot of effort has been put into static approaches. //, which is the focus of this paper.
|
||||
=== Static Analysis <sec:bg-static>
|
||||
|
||||
Static analysis programs examine an #APK file without executing it to extract information from it.
|
||||
Basic static analysis can include extracting information from the `AndroidManifest.xml` file or decompiling bytecode to Java code.
|
||||
Basic static analysis can include extracting information from the `AndroidManifest.xml` file or decompiling bytecode to Java code with tools like Apktool or Jadx.
|
||||
Unfortunately, simply reading the bytecode does not scale.
|
||||
A human analyst is needed to do so, which makes it complicated to analyse a large number of applications; even for a single application, the size and complexity of the code can quickly overwhelm the reverse engineer.
|
||||
|
||||
More advanced analyses consist in computing the control flow of an application and its data flow~@Li2017.
|
||||
|
||||
The most basic form of control-flow analysis is to build a call graph.
|
||||
Control-flow analysis is often used to mitigate this issue.
|
||||
The idea is to extract the behaviour, the flow, of the application from the bytecode, and to represent it as a graph.
|
||||
A graph representation is easier to work with than a list of instructions, and can be used for further analysis.
|
||||
Depending on the level of precision required, different types of graphs can be computed.
|
||||
The most basic of those graphs is the call graph.
|
||||
A call graph is a graph where the nodes represent the methods in the application, and the edges represent calls from one method to another.
|
||||
@fig:bg-fizzbuzz-cg-cfg b) shows the call graph of the code in @fig:bg-fizzbuzz-cg-cfg a).
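
As a data structure, a call graph can be as simple as a mapping from each method to the methods it calls. The sketch below is unrelated to the figure: the method names are hypothetical, and the traversal only illustrates one typical use of a call graph, namely finding which methods can be reached from a set of entry points.

```python
from collections import deque


def reachable_methods(call_graph: dict[str, list[str]], entry_points: list[str]) -> set[str]:
    """Breadth-first traversal of the call graph starting from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical application: each key calls the methods listed in its value.
toy_call_graph = {
    "MainActivity.onCreate": ["MainActivity.setupUi", "Tracker.start"],
    "MainActivity.setupUi": [],
    "Tracker.start": ["Tracker.collect"],
    "Tracker.collect": [],
    "Debug.dumpState": ["Tracker.collect"],
}
reached = reachable_methods(toy_call_graph, ["MainActivity.onCreate"])
```

In this toy graph, `Debug.dumpState` is never reached from the entry point, which is the kind of information an analyst can use to set code aside.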
|
||||
A more advanced control-flow analysis consists in building the control-flow graph.
|
||||
|
@ -118,20 +109,19 @@ This time, instead of methods, the nodes represent instructions, and the edges i
|
|||
supplement: [Figure],
|
||||
caption: [Source code for a simple Java method and its Call and Control Flow Graphs],
|
||||
)<fig:bg-fizzbuzz-cg-cfg>
|
||||
|
||||
Once the control-flow graph is computed, it can be used to compute data-flows.
|
||||
Data-flow analysis, also called taint-tracking, makes it possible to follow the flow of information in the application.
|
||||
By defining a list of methods and fields that can generate critical information (taint sources) and a list of methods that can consume information (taint sinks), taint-tracking makes it possible to detect potential data leaks (when a data flow links a taint source to a taint sink).
|
||||
For example, `TelephonyManager.getImei()` returns a unique, persistent device identifier.
|
||||
This can be used to identify the user, and it cannot be changed if #jfl-note[compromised][replace by: this imei is dislaxd (illegible) \ jm: ???].
|
||||
This can be used to identify the user, and it cannot be changed if compromised.
|
||||
This makes `TelephonyManager.getImei()` a good candidate as a taint source.
|
||||
On the other hand, `UrlRequest.start()` sends a request to an external server, making it a taint sink.
|
||||
If a data-flow is found linking `TelephonyManager.getImei()` to `UrlRequest.start()`, this means the application is potentially leaking critical information to an external entity, a behavior that is probably not wanted by the user.
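
Once data flows are available as a graph, searching for a leak amounts to searching for a path from a taint source to a taint sink. The sketch below assumes the data-flow graph has already been computed, which is the hard part and is not shown; the function name and the intermediate nodes are hypothetical, and only the source and the sink come from the example above.

```python
from collections import deque


def find_leak(flows: dict[str, list[str]], sources: list[str], sinks: set[str]):
    """Return one source-to-sink path in the data-flow graph, or None if there is none."""
    for source in sources:
        parents = {source: None}   # remembers how each node was reached
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node in sinks:
                path = []
                while node is not None:   # walk back to the source
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for successor in flows.get(node, ()):
                if successor not in parents:
                    parents[successor] = node
                    queue.append(successor)
    return None


# Hypothetical data-flow edges: the value produced by a node flows to its successors.
toy_flows = {
    "TelephonyManager.getImei()": ["local variable deviceId"],
    "local variable deviceId": ["StringBuilder.append()"],
    "StringBuilder.append()": ["UrlRequest.start()"],
}
leak = find_leak(toy_flows, ["TelephonyManager.getImei()"], {"UrlRequest.start()"})
```

A path found this way is only a potential leak: real tools differ mostly in how precisely they build the graph, not in how they search it.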
|
||||
Data-flow analysis is the subject of many contributions~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a.
|
||||
|
||||
#todo[Describe the different contributions in relation to the issues they tackle, be more critical]
|
||||
|
||||
Static analysis is powerful, as it makes it possible to detect unwanted behavior in an application even if the behavior does not manifest itself when running the application.
|
||||
However, static analysis tools must overcome many challenges when analysing Android applications:
|
||||
However, static analysis tools must overcome many challenges when analysing Android applications.
|
||||
/ the Java object-oriented paradigm: A call to a method can in fact correspond to a call to any method overriding the original method in subclasses.
|
||||
/ the multiplicity of entry points: Each component of an application can be an entry point for the application.
|
||||
/ the event driven architecture: Methods in the application can be called when events occur, in an unknown order.
|
||||
|
@ -142,13 +132,4 @@ Hovewer, static analysis tools must overcom many challenges when analysing Andro
|
|||
For instance, the multi-dex feature presented in @sec:bg-android-code-format was introduced in Android #SDK 21.
|
||||
Tools unaware of this feature only analyse the `classes.dex` file and will ignore all other `classes<n>.dex` files.
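
Collecting every bytecode file instead of only `classes.dex` takes little code. The sketch below selects the `classes.dex` and `classes<n>.dex` entries at the root of the archive from a list of entry names; the function name, the regular expression and the sample entries are assumptions made for illustration, and the exact naming rules accepted by the runtime are not covered.

```python
import re

# Matches classes.dex, classes2.dex, classes3.dex, ... at the root of the archive only.
DEX_NAME = re.compile(r"^classes(\d*)\.dex$")


def dex_files(entry_names: list[str]) -> list[str]:
    """Return the root-level bytecode files, ordered by their numeric suffix."""
    indexed = []
    for name in entry_names:
        match = DEX_NAME.match(name)
        if match:
            # classes.dex has no suffix and is treated as the first file.
            indexed.append((int(match.group(1) or 1), name))
    return [name for _, name in sorted(indexed)]


sample = ["classes2.dex", "AndroidManifest.xml", "classes.dex", "assets/classes3.dex", "classes10.dex"]
bytecode_files = dex_files(sample)
```

Sorting on the numeric suffix rather than on the name keeps `classes10.dex` after `classes2.dex`.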
|
||||
|
||||
A lot of those more advanced tools rely on common tools to interact with Android applications/#DEX bytecode~@Li2017.
|
||||
Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a).
|
||||
|
||||
The number of publications related to static analysis can make it difficult to find the right tool for the right task.
|
||||
Li #etal~@Li2017 published a systematic literature review for Android static analysis before May 2015.
|
||||
They analysed 92 publications and classified them by goal, method used to solve the problem and underlying technical solution for handling the bytecode when performing the static analysis.
|
||||
In particular, they listed 27 approaches with an open-source implementation available.
|
||||
Nevertheless, experiments to evaluate the reusability of the listed software were not performed.
|
||||
#jfl-note[We believe that the effort of reviewing the literature for making a comprehensive overview of available approaches should be pushed further: an existing published approach with software that cannot be used for technical reasons endangers both the reproducibility and reusability of research.][Highlight this?]
|
||||
In the next section, we will look at the work that has been done to evaluate different analysis tools.
|
||||
#todo[It would be good to emphasise dynamic code loading and reflection]
|
9
2_background/2_android_bg.typ
Normal file
|
@ -0,0 +1,9 @@
|
|||
#import "../lib.typ": todo
|
||||
|
||||
== Android Background <sec:bg-android-bg>
|
||||
|
||||
#todo[Intro]
|
||||
|
||||
#include "2_1_android.typ"
|
||||
#include "2_2_tools.typ"
|
||||
#include "2_3_static_analysis.typ"
|
11
2_background/3_problem_statements.typ
Normal file
|
@ -0,0 +1,11 @@
|
|||
#import "../lib.typ": todo
|
||||
|
||||
== PB <sec:bg-probl>
|
||||
|
||||
#todo[title for @sec:bg-probl]
|
||||
#todo[
|
||||
Reverse engineering problem statements (revisit the intro with what was said in 2.2)
|
||||
Apktool and Androguard are reused, which suggests that there may be some reuse
|
||||
classes can be loaded, and the Android code shows that class loading is in fact much more important than that
|
||||
it is well known that class loading + static analysis + reflection = no-no; every tool presents its own solution in some way
|
||||
]
|
33
2_background/4_1_static_analysis.typ
Normal file
|
@ -0,0 +1,33 @@
|
|||
#import "../lib.typ": APK, etal, ART, SDK, DEX, eg,
|
||||
#import "../lib.typ": todo, jm-note, jfl-note
|
||||
#import "@preview/diagraph:0.3.5": raw-render
|
||||
|
||||
//== Android Reverse Engineering Techniques <sec:bg-techniques>
|
||||
|
||||
//#todo[swap with tool section ?]
|
||||
|
||||
|
||||
== Static Analysis <sec:bg-soa-static>
|
||||
|
||||
In the past fifteen years, the research community released many tools to detect or analyse malicious behaviors in applications.
|
||||
Two main approaches can be distinguished: static and dynamic analysis~@Li2017.
|
||||
Dynamic analysis requires running the application in a controlled environment to observe runtime values and/or interactions with the operating system.
|
||||
For example, an Android emulator with a patched kernel can capture these interactions, but applying the required modifications is not a trivial task.
|
||||
Such an approach is limited by the time required to execute even a limited part of the application, with no guarantee on the code coverage obtained.
|
||||
Dynamic analysis is also limited by evasion techniques that may prevent the execution of malicious parts of the code.
|
||||
As a consequence, a lot of effort has been put into static approaches. //, which is the focus of this paper.
|
||||
|
||||
Data-flow analysis is the subject of many contributions~@weiAmandroidPreciseGeneral2014 @titzeAppareciumRevealingData2015 @bosuCollusiveDataLeak2017 @klieberAndroidTaintFlow2014 @DBLPconfndssGordonKPGNR15 @octeauCompositeConstantPropagation2015 @liIccTADetectingInterComponent2015, the most notable tool being Flowdroid~@Arzt2014a.
|
||||
|
||||
#todo[Describe the different contributions in relation to the issues they tackle, be more critical]
|
||||
|
||||
A lot of those more advanced tools rely on common tools to interact with Android applications/#DEX bytecode~@Li2017.
|
||||
Recurring examples of such support tools are Apktool (#eg Amandroid~@weiAmandroidPreciseGeneral2014, Blueseal~@shenInformationFlowsPermission2014, SAAF~@hoffmannSlicingDroidsProgram2013), Androguard (#eg Adagio~@gasconStructuralDetectionAndroid2013, Apparecium~@titzeAppareciumRevealingData2015, Mallodroid~@fahlWhyEveMallory2012) or Soot (#eg Blueseal~@shenInformationFlowsPermission2014, DroidSafe~@DBLPconfndssGordonKPGNR15, Flowdroid~@Arzt2014a).
|
||||
|
||||
The number of publications related to static analysis can make it difficult to find the right tool for the right task.
|
||||
Li #etal~@Li2017 published a systematic literature review for Android static analysis before May 2015.
|
||||
They analysed 92 publications and classified them by goal, method used to solve the problem and underlying technical solution for handling the bytecode when performing the static analysis.
|
||||
In particular, they listed 27 approaches with an open-source implementation available.
|
||||
Nevertheless, they did not perform experiments to evaluate the reusability of the listed software.
|
||||
#jfl-note[We believe that the effort of reviewing the literature to build a comprehensive overview of available approaches should be pushed further: a published approach whose software cannot be used for technical reasons endangers both the reproducibility and the reusability of research.][To be emphasized?]
|
||||
In the next section, we will look at the work that has been done to evaluate different analysis tools.
|
4
2_background/4_soa.typ
Normal file
|
@ -0,0 +1,4 @@
|
|||
|
||||
== State of the Art <sec:bg-soa>
|
||||
|
||||
#import("4_1_static_analysis.typ")
|
|
@ -1,18 +1,8 @@
|
|||
#import "../lib.typ": SDK, API, API, etal
|
||||
|
||||
== Platform Classes <sec:bg-platform>
|
||||
== Platform Classes <sec:bg-soa-platform>
|
||||
|
||||
In addition to the classes they include, Android applications have access to classes provided by Android.
|
||||
Those classes are called _platform classes_.
|
||||
They are divided between #SDK classes and hidden #API.
|
||||
The #SDK classes can be seen as the Android standard library.
|
||||
They are documented by Google, and have a certain stability from version to version.
|
||||
In case of breaking changes, the changes are listed by Google as well.
|
||||
The list of #SDK classes is available at compile time in the form of an `android.jar` file to link against.
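As an illustration, the set of classes exposed by such a file can be enumerated with a few lines of Python. This is only a minimal sketch: it relies on the fact that a JAR file is a ZIP archive whose compiled classes are stored as `.class` entries, and the function name `list_jar_classes` is ours, not part of any existing tool.

```python
import zipfile


def list_jar_classes(jar):
    """Return the fully qualified class names stored in a JAR file.

    `jar` is a path or a binary file-like object (a JAR is a ZIP archive).
    Hypothetical helper, written for illustration only.
    """
    suffix = ".class"
    with zipfile.ZipFile(jar) as archive:
        return sorted(
            # "android/app/Activity.class" -> "android.app.Activity"
            name[: -len(suffix)].replace("/", ".")
            for name in archive.namelist()
            if name.endswith(suffix)
        )
```

Called on the `android.jar` of a given #SDK version, this would return the names of the classes an application can link against at compile time.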
|
||||
|
||||
In contrast, hidden #API are undocumented methods used internally by Android.
|
||||
Still, they are loaded by the application and can be used by it.
|
||||
Thus, they are a potential blind spot when analysing an application.
|
||||
As we said earlier, hidden #API are undocumented methods that can be used by an application, thus making them a potential blind spot when analysing an application.
|
||||
However, not a lot of research has been done on the subject.
|
||||
Li #etal did an empirical study of the usage and evolution of hidden #API~@li_accessing_2016.
|
||||
They found that hidden #API are added and removed in every release of Android, and that they are used both by benign and malicious applications.
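To make the blind spot concrete, the sketch below splits the classes referenced by an application into three groups: those defined by the application itself, those found in the #SDK, and the remainder. It is a deliberately coarse, hypothetical heuristic and not the method used by Li #etal: it works at class granularity only, so it cannot see hidden methods of public classes, and an unresolved reference may also come from a missing library. The function and parameter names are assumptions made for the example.

```python
def split_platform_references(referenced, app_classes, sdk_classes):
    """Split referenced class names into (in_app, in_sdk, unresolved).

    Hypothetical helper: `unresolved` holds references that are defined
    neither by the application nor by the SDK, i.e. candidates for
    hidden API usage that deserve a closer look.
    """
    referenced = set(referenced)
    in_app = referenced & set(app_classes)
    in_sdk = (referenced - in_app) & set(sdk_classes)
    unresolved = referenced - in_app - in_sdk
    return in_app, in_sdk, unresolved


# Illustrative input only; these names are not taken from a real analysis.
app, sdk, unresolved = split_platform_references(
    referenced={"com.example.Main", "android.app.Activity",
                "com.android.internal.telephony.ITelephony"},
    app_classes={"com.example.Main"},
    sdk_classes={"android.app.Activity", "android.os.Bundle"},
)
```

An analysis tool that only models the classes of the first two groups would silently ignore whatever ends up in the third one.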
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
#import "../lib.typ": APK, pb1, pb2, pb3, pb1-text, pb2-text, pb3-text
|
||||
|
||||
== Conclusion
|
||||
== Conclusion <sec:bg-conclusion>
|
||||
|
||||
In this chapter, we looked at the specificities of Android and the usual tools used as a basis for reverse engineering applications.
|
||||
Many contributions have been made to static analysis, and benchmarks have been proposed to compare the different tools that resulted from those contributions.
|
||||
|
|
|
@ -4,10 +4,11 @@
|
|||
|
||||
#epigraph("Alexis \"Lex\" Murphy, Jurassic Park")[This is a Unix system. I know this.]
|
||||
|
||||
#include("0_intro.typ")
|
||||
#include("1_android.typ")
|
||||
#include("2_tools.typ")
|
||||
#include("3_static_analysis.typ")
|
||||
#include("1_intro.typ")
|
||||
#include("2_android_bg.typ")
|
||||
#include("3_problem_statements.typ")
|
||||
#include("4_soa.typ")
|
||||
|
||||
#include("4_datasets_and_benchmarking.typ")
|
||||
#include("5_platform_classes.typ")
|
||||
#include("6_classloading.typ")
|
||||
|
|