typos ch 5

Jean-Marie 'Histausse' Mineau 2025-12-21 14:39:17 +01:00
parent ca4e7703e1
commit 5497988199
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
3 changed files with 15 additions and 15 deletions

@@ -27,7 +27,7 @@ The remaining errors look more related to the application itself or Android, wit
Unfortunately, although we managed to start the applications, we can see from the list of activities visited by GroddDroid that a majority (#mypercent(dyn_res.all.z_act_visited, dyn_res.all.nb - dyn_res.all.nb_failed)) of the applications stopped before even starting one activity.
Some applications do not have any activities and are not intended to interact with a user, but those are clearly a minority and do not explain such a high number.
-We expected some issues related to the use of an emulator, like the lack of x86_64 library in the applications, or contermesures aborting the application if the emulator is detected.
+We expected some issues related to the use of an emulator, like the lack of x86_64 library in the applications, or countermesures aborting the application if an emulator is detected.
We manually looked at some applications, but did not find a notable pattern.
In some cases, the application was just broken -- for instance, an application was trying to load a native library that simply does not exist in the application.
In other cases, Frida is to blame: we found some cases where calling a method from Frida can confuse the #ART.
@@ -92,7 +92,7 @@ Once we compared the files, we found that we only collected #num(bytecode_hashes
Once we looked more in detail, we found that most of those files are advertisement libraries.
In total, we collected #num(nb_google) files containing Google ads libraries and #num(nb_facebook) files containing Facebook ads libraries.
In addition, we found #num(nb_appsflyer) files containing code that we believe to be AppsFlyer, a company that provides "measurement, analytics, engagement, and fraud protection technologies".
-The remaining #num(nb_bytecode_collected - nb_google - nb_appsflyer - nb_facebook) files were custom code from high security applications (#ie banking, social security)
+The remaining #num(nb_bytecode_collected - nb_google - nb_appsflyer - nb_facebook) files were custom code from high security applications (#ie banking, social security).
@tab:th-bytecode-hashes summarises the information we collected about the most common bytecode files.
#figure(
@@ -124,11 +124,11 @@ We did not find visible #DEX files or #APK files inside the applications, meanin
To estimate the scope of the code we made available, we use Androguard to generate the call graph of the applications, before and after the instrumentation.
@tab:th-compare-cg shows the number of edges of those call graphs.
The columns before and after show the total number of edges of the graphs, and the diff column indicates the number of new edges detected (#ie the number of edges after instrumentation minus the number of edges before).
-This number include edges from the bytecode loaded dynamically, as well as the call added to reflect reflection calls, and calls to "glue" methods (method like `Integer.intValue()` used to convert objects to scalar values, or calls to `T.check_is_Xxx_xxx(Method)` used to check if a `Method` object represent a known method).
+This number include edges from the bytecode loaded dynamically, as well as the call added to reflect reflection calls, and calls to "glue" methods (method like `Integer.intValue()` used to convert objects to scalar values, or calls to `T.check_is_Xxx_xxx(Method)` used to check if a `Method` object represents a known method).
The last column, "Added Reflection", is the list of non-glue method calls found in the call graph of the instrumented application but neither in the call graph of the original #APK, nor in the call graphs of the added bytecode files that we computed separately.
This corresponds to the calls we added to represent reflection calls.
-The first application, #lower(compared_callgraph.at(0).sha256), is noticable.
+The first application, #lower(compared_callgraph.at(0).sha256.slice(0,10) + "..."), is noticable.
The instrumented #APK has ten times more edges to its call graph than the original, and only one reflection call.
This is consistent with the behaviour of a packer: the application loads the main part of its code at runtime and switches from the bootstrap code to the loaded code with a single reflection call.
@@ -159,14 +159,14 @@ This is consistent with the behaviour of a packer: the application loads the mai
caption: [Edges added to the call graphs computed by Androguard by instrumenting the applications]
) <tab:th-compare-cg>
-Unfortunately, our implementation of the transformation is imperfect and sometimes fails, as illustrated by #lower("5D2CD1D10ABE9B1E8D93C4C339A6B4E3D75895DE1FC49E248248B5F0B05EF1CE") in @tab:th-compare-cg.
+Unfortunately, our implementation of the transformation is imperfect and sometimes fails, as illustrated by #lower("5D2CD1D10ABE9B1E8D93C4C339A6B4E3D75895DE1FC49E248248B5F0B05EF1CE".slice(0,10))... in @tab:th-compare-cg.
However, over the #num(dyn_res.all.nb - dyn_res.all.nb_failed) applications whose dynamic analysis finished in our experiment, #num(nb_patched) were patched.
-The remaining #mypercent(dyn_res.all.nb - dyn_res.all.nb_failed - nb_patched, dyn_res.all.nb - dyn_res.all.nb_failed) failed either due to some quirk in the zip format of the #APK file, because of a bug in our implementation when exceeding the method reference limit in a single #DEX file, or in the case of #lower("5D2CD1D10ABE9B1E8D93C4C339A6B4E3D75895DE1FC49E248248B5F0B05EF1CE"), because the application reused the original application classloader to load new code instead of instanciated a new classes loader (a behavior we did not expected as not possible using only the #SDK, but enabled by hidden #APIs).
+The remaining #mypercent(dyn_res.all.nb - dyn_res.all.nb_failed - nb_patched, dyn_res.all.nb - dyn_res.all.nb_failed) failed either due to some quirk in the zip format of the #APK file, because of a bug in our implementation when exceeding the method reference limit in a single #DEX file, or in the case of #lower("5D2CD1D10ABE9B1E8D93C4C339A6B4E3D75895DE1FC49E248248B5F0B05EF1CE".slice(0,10))..., because the application reused the original application classloader to load new code instead of instanciated a new class loader (a behavior we did not expect as possible using only the #SDK, but enabled by hidden #APIs).
Taking into account the failure from both dynamic analysis and the instrumentation process, we have a #mypercent(dyn_res.all.nb - nb_patched, dyn_res.all.nb) failure rate.
This is a reasonable failure rate, but we should keep in mind that it adds up to the failure rate of the other tools we want to use on the patched application.
To check the impact on the finishing rate of our instrumentation, we then run the same experiment we ran in @sec:rasta.
-We run the tools on the #APK before and after instrumentation, and compared the finishing rates in @fig:th-status-npatched-vs-patched (without taking into account #APKs we failed to patch#footnote[Due to a handling error during the experiment, the figure shows the results for #nb_patched_rasta #APKs instead of #nb_patched. \ We also ignored the tool from Wognsen #etal due to the high number of timeouts]).
+We ran the tools on the #APK before and after instrumentation, and compared the finishing rates in @fig:th-status-npatched-vs-patched (without taking into account #APKs we failed to patch#footnote[Due to a handling error during the experiment, the figure shows the results for #nb_patched_rasta #APKs instead of #nb_patched. \ We also ignored the tool from Wognsen #etal~@wognsenFormalisationAnalysisDalvik2014 due to the high number of timeouts]).
#figure({
image(
@@ -237,7 +237,7 @@ On the other hand, Saaf do not detect the issue with Apktool and pursues the ana
In this subsection, we use our approach on a unique #APK to look in more detail into the analysis of the transformed application.
We handcrafted this application for the purpose of demonstrating how this can help a reverse engineer in their work.
Accordingly, this application is quite small and contains both dynamic code loading and reflection.
-We defined a method `Utils.source()` and `Utils.sink()` to model a method that collects sensitive data and a method that exfiltrates data.
+We defined a method `Utils.source()` and `Utils.sink()` to model a method that collects sensitive data and a method that exfiltrates data respectively.
Those methods are the ones we will use with Flowdroid to track data flows.
#figure(