
#import "../lib.typ": APK, IDE, SDK, DEX, ADB, ART, eg, XML, AXML, API, paragraph
#import "../lib.typ": jfl-note, todo

=== Reverse Engineering Tools <sec:bg-tools>

Because of the specificities of Android, reverse engineers need dedicated tools.
The development tools provided by Google can be used for basic operations, but a reverse engineer will quickly need more specialised tools.
Usually, the first step when analysing an application is to look at its content.
Apktool and Jadx are common tools used to convert the content of an application into a readable format.
Analysing an application this way, without running it, is called static analysis.
For more advanced forms of static analysis, Androguard and Soot can be used as libraries to automate analyses.
When static analysis becomes too complicated (#eg if the application uses obfuscation techniques), a reverse engineer might switch to dynamic analysis.
This time, the application is executed, and the analyst observes its behaviour.
Frida is a good option to help with this dynamic analysis.
It is a toolkit that can be used to intercept method calls and execute custom scripts while an application is running.

#paragraph[*Android Studio*][
The whole Android development ecosystem is packaged by Google in the #IDE Android Studio#footnote[https://developer.android.com/studio].
In practice, Android Studio is a source-code editor that wraps around the different tools of the Android #SDK.
The #SDK tools and packages can be installed manually with the `sdkmanager` tool.
Among the notable tools in the #SDK are:

- `emulator`: this tool allows running an emulated Android phone on a computer.
  Although very useful, the Android emulator has several limitations.
  First, it cannot emulate another architecture: an x86_64 computer cannot emulate an ARM smartphone.
  This can be an issue because a majority of smartphones run on ARM processors.
  Also, for certain versions of Android, the proprietary Google Play libraries are not available on rooted emulators.
  Lastly, emulators are not designed to be stealthy and can easily be detected by an application.
  Malware can thus avoid detection by not running its payload when it detects an emulator.
- #ADB: a tool to send commands to an Android smartphone or emulator.
  It can be used to install applications, send instructions, events, and generally perform debugging operations.
- Platform Packages: these packages contain the data associated with a version of Android that is needed to compile an application.
  In particular, they contain the so-called `android.jar` files, which contain the list of #API for a version of Android.
- `d8`: The main use of `d8` is to convert Java bytecode files (`.class`) to Android #DEX format.
  It can also be used to perform different levels of optimisation of the bytecode generated.
- `aapt`/`aapt2` (Android Asset Packaging Tool): This tool is used to build the #APK file.
  It is commonly used by other tools that repackage applications, such as Apktool.
  Behind the scenes, it converts #XML to binary #AXML and ensures that each file has the right compression and alignment (#eg some resource files are mapped in memory by the #ART, and thus need to be aligned and not compressed).
- `apksigner`: the tool used to sign an #APK file.
  When repackaging an application, for example, with Apktool, the new application needs to be signed.
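
The compression and alignment constraints mentioned above can be checked by hand, since an #APK is a ZIP archive.
The following Python sketch uses only the standard library; the function name `entry_layout` and the default 4-byte alignment are assumptions made for illustration, not values taken from the Android tooling.

```python
import struct
import zipfile

# Fixed-size part of a ZIP local file header (30 bytes, little-endian).
LOCAL_HEADER = struct.Struct("<4s5H3I2H")


def entry_layout(fp, name, alignment=4):
    """Describe how one entry is stored in an APK opened as a binary file.

    Returns (stored, data_offset, aligned). `alignment=4` is an assumed
    default for illustration only.
    """
    with zipfile.ZipFile(fp) as archive:
        info = archive.getinfo(name)
    # The central directory gives the header position; the data starts after
    # the fixed header, the file name and the optional extra field.
    fp.seek(info.header_offset)
    fields = LOCAL_HEADER.unpack(fp.read(LOCAL_HEADER.size))
    if fields[0] != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    name_len, extra_len = fields[-2:]
    data_offset = info.header_offset + LOCAL_HEADER.size + name_len + extra_len
    stored = info.compress_type == zipfile.ZIP_STORED
    return stored, data_offset, data_offset % alignment == 0
```

Called on a file opened with `open("app.apk", "rb")` (a hypothetical path) and the entry name `resources.arsc`, it reports whether that entry is stored uncompressed and whether its data starts at an aligned offset.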
]

#paragraph[*Apktool*][
Apktool#footnote[https://apktool.org/] is a _reengineering tool_ for Android #APK files.
It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/baksmali#footnote[https://github.com/JesusFreke/smali] to convert the #DEX files to smali, an assembly-like language that matches the Dalvik bytecode instructions.
The main strength of Apktool is that after disassembling an application, its content can be edited and reassembled into a new #APK.
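
As an illustration, the disassemble, edit, reassemble, and sign cycle can be scripted.
The sketch below only builds the command lines for `apktool` and `apksigner`; the directory layout, the file names, and the helper names `repackage_commands` and `run_all` are hypothetical, and both tools are assumed to be available on the `PATH`.

```python
import subprocess
from pathlib import Path


def repackage_commands(apk, workdir, keystore):
    """Build the commands for a decode / rebuild / sign cycle.

    All file and directory names are placeholders chosen for this sketch.
    """
    workdir = Path(workdir)
    decoded = workdir / "decoded"      # smali files, text XML, resources
    rebuilt = workdir / "rebuilt.apk"  # unsigned APK produced by Apktool
    signed = workdir / "signed.apk"    # installable APK
    return [
        # 1. Disassemble the application into an editable directory.
        ["apktool", "d", str(apk), "-o", str(decoded)],
        # (the analyst edits the files under `decoded` at this point)
        # 2. Reassemble the edited directory into a new APK.
        ["apktool", "b", str(decoded), "-o", str(rebuilt)],
        # 3. Sign the result; apksigner asks for the keystore password.
        ["apksigner", "sign", "--ks", str(keystore),
         "--out", str(signed), str(rebuilt)],
    ]


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

In practice, the first command would be run on its own, the decoded files edited, and only then the last two commands executed.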
]

#paragraph[*Androguard*][
Androguard#footnote[https://github.com/androguard/androguard]~@desnos:adnroguard:2011 is a Python library for parsing and disassembling #APK files.
It can be used to automatically read Android manifests, resources, and bytecode.
Contrary to Apktool, which generates text files, it can be used as a library to programmatically analyse the application.
It can also perform additional analysis, like computing a call graph or control flow graph of the application (these graphs are explained in @sec:bg-static).
However, contrary to Apktool, it cannot repackage a modified application.
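
A minimal sketch of such a programmatic analysis is given below.
It assumes that Androguard is installed and that `AnalyzeAPK` returns an #APK object, a list of #DEX objects, and an analysis object, as in recent releases; the helper names `summarise` and `summarise_file` are ours.

```python
def summarise(apk, analysis):
    """Collect a few facts from Androguard's APK and Analysis objects."""
    # Androguard returns the call graph as a networkx graph.
    call_graph = analysis.get_call_graph()
    return {
        "package": apk.get_package(),
        "permissions": sorted(apk.get_permissions()),
        "call_edges": call_graph.number_of_edges(),
    }


def summarise_file(path):
    """Parse the APK at `path` with Androguard and summarise it."""
    # Imported lazily so that `summarise` can be reused without Androguard.
    from androguard.misc import AnalyzeAPK

    apk, _dex_files, analysis = AnalyzeAPK(path)
    return summarise(apk, analysis)
```

Calling `summarise_file("app.apk")` (a hypothetical path) would return the package name, the declared permissions, and the number of edges in the call graph.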
]

#paragraph[*Jadx*][
Jadx#footnote[https://github.com/skylot/jadx] is an application decompiler.
It converts #DEX files to Java source code.
Jadx cannot always decompile every class of an application, so its output cannot reliably be recompiled into a new application, but the generated code is very helpful when reverse engineering an application.
In addition to decompiling #DEX files, Jadx can also decode Android manifests and application resources.
]

#paragraph[*Soot*][
Soot#footnote[https://github.com/soot-oss/soot]~@Arzt2013 was originally a Java optimisation framework.
It could lift Java bytecode to other intermediate representations that can be optimised, then converted back to bytecode.
Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse and modify Android applications.
One of the best-known examples of Soot usage for Android analysis is FlowDroid~@Arzt2014a, a tool that computes data flows in an application.

A new version of Soot, SootUp#footnote[https://github.com/soot-oss/SootUp], is currently being worked on.
Compared to Soot, it has a modernised interface and architecture, but it is not yet feature-complete, and some tools like FlowDroid are still using Soot.
]

#paragraph[*Frida*][
Frida#footnote[https://frida.re/] is a dynamic instrumentation toolkit.
It allows the reverse engineer to inject and run JavaScript code inside a running application.

To instrument an application, the Frida server must be running as root on the phone, or the Frida library must be injected inside the #APK file before installing it.
Frida defines a JavaScript wrapper around the Java Native Interface (JNI) used by native code to interact with Java classes and the Android #API.
In addition to allowing interaction with Java objects from the application and the Android #API, this wrapper provides the option to replace a method implementation with a JavaScript function (which can itself call the original method implementation if needed).
This makes Frida a powerful tool capable of collecting runtime information or modifying the behaviour of an application as needed.
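
The sketch below illustrates this mechanism using Frida's Python bindings to inject the JavaScript hook.
It assumes that the Frida server is running on a device connected over USB, that the hooked method has a single overload, and that the `Java` bridge is available in the script's global scope, which depends on the Frida version; the helper names and the class and method names used with them are hypothetical.

```python
import json

# JavaScript injected into the application. The %(...)s fields are filled
# with JSON-encoded strings, which are valid JavaScript string literals.
HOOK_TEMPLATE = """
Java.perform(function () {
  var cls = Java.use(%(cls)s);
  // Replace the method body; assumes the method has a single overload.
  cls[%(method)s].implementation = function () {
    send({method: %(method)s, argc: arguments.length});
    // Forward to the original implementation and return its result.
    return this[%(method)s].apply(this, arguments);
  };
});
"""


def make_hook_script(class_name, method_name):
    """Return the JavaScript source that logs calls to one Java method."""
    return HOOK_TEMPLATE % {
        "cls": json.dumps(class_name),
        "method": json.dumps(method_name),
    }


def hook_running_app(app_name, class_name, method_name, on_message):
    """Attach to a running application and install the hook.

    `on_message(message, data)` receives what the script passes to `send`.
    """
    import frida  # third-party bindings, imported lazily

    session = frida.get_usb_device().attach(app_name)
    script = session.create_script(make_hook_script(class_name, method_name))
    script.on("message", on_message)
    script.load()
    return session
```

For instance, `hook_running_app("Example", "com.example.Crypto", "decrypt", print)` would print one message each time the hypothetical `decrypt` method is called, without changing its result.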

The main drawback of using Frida is that it is a well-known tool that applications can easily detect.
Malware might implement countermeasures that avoid running malicious payloads if Frida is detected.
]

#v(2em)

These tools are useful for manual analysis.
However, considering the complexity of modern Android applications, it might take a lot of work for a reverse engineer to analyse one application.
Different techniques have been developed to streamline the analysis.
Next, we will see the most common of those techniques for static analysis.