#import "../lib.typ": todo, APK, JAR, AXML, ART, SDK, JNI, NDK, DEX, XML

== Android

Android is the smartphone operating system developed by Google.
It is based on a Long Term Support Linux kernel, to which patches developed by the Android community are added.
On top of the kernel, Android redeveloped many of the usual components used by Linux-based operating systems, and added new ones.
Those changes make Android a unique operating system.

=== Android Applications

Applications in the Android ecosystem are distributed in the #APK format.
#APK files are #JAR files with additional features, which are themselves ZIP files with additional features.
A minimal #APK file is a ZIP archive containing a file `AndroidManifest.xml`, the `META-INF/` folder containing the #JAR manifest and signature files, and an #APK Signing Block at the end of the ZIP file.
Other files are then added.
Dalvik bytecode is stored in the files `classes.dex`, `classes2.dex`, `classes3.dex`, ... and native code is stored in `lib//*.so`.
The `res/` folder contains the resources required for the user interface.
When resources are present in `res/`, the file `resources.arsc` is also present at the root of the archive.
The `assets/` folder contains the files that are used directly by the application code.
Depending on the application and the compilation process, any other kind of files and folders can be added to the application.

==== Signature

Android applications are cryptographically signed to prove their authorship.
Applications signed with the same key are considered to be developed by the same entity.
This allows applications to be updated securely, and lets an application declare security permissions that restrict access to some features to applications from the same author.

Android has several coexisting signature schemes:
- The v1 signature scheme is the #JAR signing scheme, where the signature data is stored in the `META-INF/` folder.
- The v2, v3 and v3.1 signature schemes are stored in the '#APK Signing Block' of the #APK. The v2 signature scheme was introduced in Android 7.0, and to keep backward compatibility with older versions, the v1 scheme is still used in addition to the #APK Signing Block. The Signing Block is an unindexed binary section added to the ZIP file, between the ZIP entries and the Central Directory. The signature was placed in an unindexed section of the ZIP to avoid interfering with the v1 signature scheme, which signs the files inside the archive and not the archive itself.
- The v4 signature scheme is complementary to the v2/v3 signature schemes. Its signature data is stored in an external `.apk.idsig` file.

==== Android Manifest

The Android Manifest is stored in the file `AndroidManifest.xml`, encoded in the binary #AXML format.
The manifest declares important information about the application:
- Generic information like the application name, id and icon
- The Android compatibility of the application, in the form of 3 values: the Android `min-sdk`, `target-sdk` and `max-sdk`. Those are the minimum, targeted and maximum versions of the Android SDK supported by the application
- The application components (Activity, Service, Receiver and Provider) and the classes they are associated with
- Intent filters listing the intents that can start or be sent to the application components
- Security permissions required by the application

==== Code

An application usually contains at least a `classes.dex` file containing Dalvik bytecode.
This is the format executed by the Android #ART.
It is common for an application to have more than one #DEX file, when the application needs to reference more methods than the format allows in one file.
Support for multiple #DEX files was added in the #SDK 21 version of Android, and applications that have multiple #DEX files are sometimes referred to as 'multi-dex'.
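The per-file method limit can be observed directly from the #DEX header. The following Python sketch is an illustration only, not part of any Android tool: the function name `dex_method_counts` is hypothetical, and the code assumes standard little-endian #DEX files, where the `method_ids_size` field sits at offset `0x58` of the header. Method references are indexed on 16 bits, so a single file cannot reference more than 65536 methods.

```python
import re
import struct
import zipfile

# Offset of the `method_ids_size` field in the DEX header (assumes the
# standard little-endian layout; byte-swapped files are not handled).
METHOD_IDS_SIZE_OFFSET = 0x58
DEX_HEADER_SIZE = 0x70
# Root-level bytecode files: classes.dex, classes2.dex, classes3.dex, ...
DEX_NAME = re.compile(r"classes\d*\.dex")


def dex_method_counts(apk_path):
    """Map each root-level DEX file of an APK to its number of method references.

    `apk_path` can be a path or a binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            if not DEX_NAME.fullmatch(name):
                continue  # not a root-level classesN.dex entry
            with apk.open(name) as dex:
                header = dex.read(DEX_HEADER_SIZE)
            if len(header) < DEX_HEADER_SIZE or header[:4] != b"dex\n":
                continue  # truncated file or wrong magic, skip it
            (count,) = struct.unpack_from("<I", header, METHOD_IDS_SIZE_OFFSET)
            counts[name] = count
    return counts
```

Calling `dex_method_counts("app.apk")` on a multi-dex application (the file name is a placeholder) returns one entry per #DEX file, and a value close to 65536 for one file suggests why the build had to split the bytecode.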
In addition to #DEX files, and sometimes instead of #DEX files, applications can contain `.so` ELF (Executable and Linkable Format) files in the `lib/` folder.
In the Android ecosystem, binary code is called native code.
Because native code is compiled for a specific architecture, `.so` files are present in different versions, stored in different subfolders depending on the targeted architecture.
For example, `lib/arm64-v8a/libexample.so` is the version of the `example` library compiled for a 64-bit ARM architecture.
Because smartphones mostly use ARM processors, it is not rare to see applications that only ship the ARM version of their native code.

==== Resources

Application user interfaces require many kinds of specific assets, which are stored in `res/`.
Those resources include bitmap images, text, layouts, etc.
Data like layouts, colors or text are stored in binary #AXML.
An additional file, `resources.arsc`, in a custom binary format, contains a list of the resource names, ids, and their properties.

==== Compilation Process

For the developer, the compilation process is handled by Android Studio and is mostly transparent.
Behind the scenes, Android Studio relies on Gradle to orchestrate the different compilation steps.

The source #XML files, like `AndroidManifest.xml` and the ones in `res/`, are compiled to binary #AXML by `aapt`, which also generates the resource table `resources.arsc` and an `R.java` file that defines, for each resource, a variable named after the resource and set to the id of the resource.
The `R.java` file allows the developer to refer to resources by readable names and to avoid using the resource ids, which are often automatically generated and can change from one version of the application to another.

The source code is then compiled.
The most common programming languages used for Android applications are Java and Kotlin.
Both are first compiled to Java bytecode in `.class` files using the language compiler.
To allow access to the Android API, the `.class` files are linked during compilation against an `android.jar` file that contains classes with the same signatures as the ones in the Android API for the targeted SDK.
The `.class` files are then converted to #DEX files using `d8`.
During those steps, both the original language compiler and `d8` can perform optimizations on the classes.

If the application contains native code, the original C or C++ code is compiled using the tools of the Android #NDK to target the different architectures.

`aapt` is then used once again to package all the generated #AXML, #DEX and `.so` files, as well as the other resource files, assets, `resources.arsc`, and any additional files deemed necessary, in a ZIP file.
`aapt` ensures that the generated ZIP is compatible with the requirements of Android.
For instance, `resources.arsc` will be mapped directly in memory at runtime, so it must not be compressed inside the ZIP file.

If necessary, the ZIP file is then aligned using `zipalign`.
Again, this is to ensure compatibility with Android optimizations: files like `resources.arsc` need to be aligned on 4-byte boundaries to be mapped in memory.

The last step is to sign the application using the `apksigner` utility.

Since 2021, Google requires new applications in the Google Play app store to be uploaded in a new format called Android App Bundles.
The main difference is that Google performs the last packaging steps and generates (and signs) the application itself.
This allows Google to generate different applications for different targets, and to avoid including unnecessary files in the application, like native code targeting the wrong architecture.

=== Android Runtime

#todo[NDK / JDK, java runtime, intent]