thesis/2_background/2_tools.typ
Jean-Marie Mineau 2e52599a7c
correction background
2025-08-06 00:25:42 +02:00


#import "../lib.typ": todo, APK, IDE, SDK, DEX, ADB, ART, eg, XML, AXML, API, jfl-note
== Reverse Engineering Tools <sec:bg-tools>
Because of the specificities of Android, reverse engineers need dedicated tools.
The development tools provided by Google can be used for basic operations.
Apktool and Jadx are commonly used to read the content of an application, while Androguard and Soot can be used as libraries to automate analysis.
For a more dynamic approach, Frida is a toolkit that can be used to intercept method calls and execute custom code while an application is running.
=== Android Studio <sec:bg-android-studio>
The whole Android development ecosystem is packaged by Google in the #IDE Android Studio#footnote[https://developer.android.com/studio].
In practice, Android Studio is a source-code editor that wraps around the different tools of the Android #SDK.
The #SDK tools and packages can be installed manually with the `sdkmanager` tool.
Notable tools in the #SDK include:
- `emulator`: an Android emulator.
  This tool allows running an emulated Android phone on a computer.
  Although very useful, the Android emulator has several limitations.
  For one, it cannot emulate another architecture: an x86_64 computer cannot emulate an ARM smartphone.
  This can be an issue because the majority of smartphones run on ARM processors.
  Also, for certain versions of Android, the proprietary Google Play libraries are not available on rooted emulators.
  Lastly, emulators are not designed to be stealthy and can easily be detected by an application.
  Malware can avoid detection by not running its payload on emulators.
- #ADB: a tool to send commands to an Android smartphone or emulator.
  It can be used to install applications, send instructions and events, and generally perform debugging operations.
- Platform Packages: these packages contain the data associated with a version of Android that is needed to compile an application.
  In particular, they contain the so-called `android.jar` files, which contain the list of #API available in a version of Android.
- `d8`: the main use of `d8` is to convert Java bytecode files (`.class`) to the Android #DEX format.
  It can also be used to perform different levels of optimization on the generated bytecode.
- `aapt`/`aapt2` (Android Asset Packaging Tool): this tool is used to build the #APK file.
  It is commonly used by other tools that repackage applications, like Apktool.
  Behind the scenes, it converts #XML to binary #AXML and ensures that the right files have the right compression and alignment (#eg some resource files are mapped in memory by the #ART, and thus need to be aligned and not compressed).
- `apksigner`: the tool used to sign an #APK file.
  When repackaging an application, for example with Apktool, the new application needs to be signed.
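
The compression and alignment constraints handled by `aapt2` can be observed directly, since an #APK file is a ZIP archive.
The following sketch is only an illustration, not part of the #SDK: it uses the Python standard library to report whether an entry is stored uncompressed and whether its data starts on a 4-byte boundary.
The helper names, the synthetic archive and the choice of a 4-byte boundary are assumptions made for the example.

```python
import io
import struct
import zipfile


def build_demo_apk() -> bytes:
    """Build a tiny synthetic archive standing in for an APK (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        # Stored (uncompressed), like a resource file that is mapped in memory.
        apk.writestr("resources.arsc", b"\x00" * 64,
                     compress_type=zipfile.ZIP_STORED)
        # Compressed entry, for contrast.
        apk.writestr("classes.dex", b"dex\n035\x00" * 32,
                     compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def entry_layout(apk_bytes: bytes, name: str) -> dict:
    """Report how one entry is laid out inside the archive."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        info = apk.getinfo(name)
    # A local file header has 30 fixed bytes; the last two 16-bit fields
    # give the lengths of the file name and of the extra field that follow.
    header = apk_bytes[info.header_offset:info.header_offset + 30]
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    data_offset = info.header_offset + 30 + name_len + extra_len
    return {
        "stored": info.compress_type == zipfile.ZIP_STORED,
        "data_offset": data_offset,
        # 4-byte boundary: an assumption chosen for this illustration.
        "aligned": data_offset % 4 == 0,
    }


if __name__ == "__main__":
    demo = build_demo_apk()
    for entry in ("resources.arsc", "classes.dex"):
        print(entry, entry_layout(demo, entry))
```

On a real application, `entry_layout` can be given the bytes of the #APK file instead of the synthetic archive.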
=== Apktool <sec:bg-apktool>
Apktool#footnote[https://apktool.org/] is a _reengineering tool_ for Android #APK files.
It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/baksmali#footnote[https://github.com/JesusFreke/smali] to convert the #DEX files to smali, an assembler-like language that matches the Dalvik bytecode instructions.
The main strength of Apktool is that, after an application has been disassembled, its content can be edited and reassembled into a new #APK. #jfl-note[limits? does it always work?]
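
As an illustration, a disassemble, rebuild and sign cycle could be scripted as follows.
This is a minimal sketch that only builds the command lines, assuming `apktool` and `apksigner` are available on the path; the file names and the intermediate unsigned file are hypothetical.

```python
import shlex


def repackaging_commands(apk, workdir, keystore, signed_apk):
    """Return the command lines of a disassemble / rebuild / sign cycle."""
    # Hypothetical name for the rebuilt, not yet signed, application.
    unsigned_apk = workdir + "-unsigned.apk"
    return [
        # Disassemble: extract files, decode AXML, convert DEX to smali.
        ["apktool", "d", apk, "-o", workdir],
        # Rebuild the (possibly edited) directory into a new APK.
        ["apktool", "b", workdir, "-o", unsigned_apk],
        # Sign the rebuilt APK so that it can be installed.
        ["apksigner", "sign", "--ks", keystore, "--out", signed_apk, unsigned_apk],
    ]


if __name__ == "__main__":
    for command in repackaging_commands(
        "app.apk", "app-src", "debug.jks", "app-signed.apk"
    ):
        print(shlex.join(command))
```

Each command can then be passed to `subprocess.run` with `check=True`; the manual edition of the smali or #XML files takes place between the first two commands.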
=== Androguard <sec:bg-androguard>
Androguard#footnote[https://github.com/androguard/androguard]~@desnos:adnroguard:2011 is a Python library for parsing and analysing #APK files.
#jfl-note[Its main feature is disassembling #APK files.][backend #sym.eq.not apktool?]
It can be used to automatically read Android manifests, resources, and bytecode.
Contrary to Apktool, it can be used programmatically to analyse the application, without parsing text files, but it cannot repackage a modified application.
In addition, it can perform additional analyses, like computing a call graph or a control flow graph.
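
The following sketch illustrates this programmatic use.
It assumes that Androguard is installed and relies on its `AnalyzeAPK` helper; the `load_apk` and `summarize` functions are hypothetical names introduced for the example, and the exact type of the returned call graph may depend on the Androguard version.

```python
def load_apk(path):
    """Parse an APK with Androguard (imported lazily, third-party library)."""
    from androguard.misc import AnalyzeAPK

    # AnalyzeAPK returns the APK object, the parsed DEX files and an
    # Analysis object; only the first and the last are used here.
    apk, _dex_files, analysis = AnalyzeAPK(path)
    return apk, analysis


def summarize(apk, analysis):
    """Collect a few facts about an application without any text parsing."""
    # The call graph is assumed to be a networkx graph.
    call_graph = analysis.get_call_graph()
    return {
        "package": apk.get_package(),
        "permissions": sorted(apk.get_permissions()),
        "methods": call_graph.number_of_nodes(),
        "calls": call_graph.number_of_edges(),
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize_apk.py path/to/app.apk
    if len(sys.argv) == 2:
        print(summarize(*load_apk(sys.argv[1])))
```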
=== Jadx <sec:bg-jadx>
Jadx#footnote[https://github.com/skylot/jadx] is an application decompiler.
It converts #DEX files to Java source code.
It is not always capable of decompiling all the classes of an application, so it cannot be used to recompile a new application, but the generated code can be very helpful to reverse an application.
In addition to decompiling #DEX files, Jadx can also decode Android manifests and application resources.
=== Soot <sec:bg-soot>
Soot#footnote[https://github.com/soot-oss/soot]~@Arzt2013 is a Java optimization framework.
It can lift Java bytecode to other intermediate representations that can be used to perform optimizations and then be converted back to bytecode.
Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse Android applications.
One of the best-known examples of Soot usage for Android analysis is Flowdroid~@Arzt2014a, a tool that computes data flows in an application.
A new version of Soot, SootUp#footnote[https://github.com/soot-oss/SootUp], is currently being worked on.
Compared to Soot, it has a modernized interface and architecture, but it is not yet feature-complete, and some tools like Flowdroid are still using Soot.
=== Frida <sec:bg-frida>
Frida#footnote[https://frida.re/] is a dynamic instrumentation toolkit.
It allows the reverse engineer to inject and run JavaScript code inside a running application.
To instrument an application, the Frida server must be running as root on the phone, or the Frida library must be injected inside the #APK file before installing it.
Frida defines a JavaScript wrapper around the Java Native Interface (JNI) used by native code to interact with Java classes and the Android #API.
In addition to allowing interaction with Java objects from the application and the Android #API, this wrapper provides the option to replace a method implementation with a JavaScript function (that itself can call the original method implementation if needed).
This makes Frida a powerful tool capable of collecting runtime information or modifying the behavior of an application as needed.
The main drawback of using Frida is that it is a well-known tool that is easily detected by applications.
Malware might implement countermeasures that avoid running the malicious payload in the presence of Frida.
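
To make the method replacement mechanism more concrete, the following sketch generates a small JavaScript hook and loads it through the Frida Python bindings.
It is only an illustration under several assumptions: the target method is not overloaded, the Frida runtime exposes the `Java` bridge to the script, a Frida server is reachable over USB, and the `build_hook_script` and `attach_and_hook` helpers are hypothetical names.

```python
import json

# JavaScript run inside the application: replace one method with a function
# that reports the call, then calls the original implementation.
HOOK_TEMPLATE = """
Java.perform(function () {
    var target = Java.use(%(cls)s);
    target[%(method)s].implementation = function () {
        send({ cls: %(cls)s, method: %(method)s, argc: arguments.length });
        return this[%(method)s].apply(this, arguments);
    };
});
"""


def build_hook_script(class_name: str, method_name: str) -> str:
    """Return the JavaScript source hooking class_name.method_name."""
    # json.dumps yields properly quoted JavaScript string literals.
    return HOOK_TEMPLATE % {
        "cls": json.dumps(class_name),
        "method": json.dumps(method_name),
    }


def attach_and_hook(app, class_name, method_name, on_message):
    """Attach to a running application and install the hook.

    Assumes a Frida server running on a USB-connected device.
    """
    import frida  # third-party bindings, imported lazily

    session = frida.get_usb_device().attach(app)
    script = session.create_script(build_hook_script(class_name, method_name))
    # on_message(message, data) receives what the hook passes to send().
    script.on("message", on_message)
    script.load()
    return session


if __name__ == "__main__":
    # Hypothetical class and method names, printed without attaching.
    print(build_hook_script("com.example.Crypto", "decrypt"))
```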
#v(2em)
Those tools are quite useful for manual operations.
However, considering the complexity of modern Android applications, it might take a lot of work for a reverse engineer to analyse one application.
In the next section, we will see more advanced techniques that have been developed to analyse Android applications.