#import "../lib.typ": todo, SDK, num, mypercent, ART, ie
#import "X_var.typ": *

== Result

#todo[better section name for @sec:th-res]

To study the impact of our transformation on analysis tools, we reused applications from the dataset we sampled in @sec:rasta-dataset.
Because we run the applications on a recent version of Android (#SDK 34), we only took the most recent applications: the ones collected in 2023.
This represents #num(5000) applications.
Among them, we could not retrieve 43 from Androzoo, leaving us with #num(dyn_res.all.nb) applications to test.

=== The Limits of Our Dynamic Analysis

After running the dynamic analysis on our dataset for the first time, we realised that our dynamic setup was quite fragile.
We found that #mypercent(dyn_res.all.nb_failed_first_run, dyn_res.all.nb) of the executions failed with various errors.
The majority of those errors were related to failures to connect to the Frida agent or to start the activity from Frida.
Some of those errors seemed to come from Frida, while others seemed related to the emulator failing to start the application.
We found that simply relaunching the analysis for the applications that failed was the simplest way to fix those issues, and after 6 passes we went from #num(dyn_res.all.nb_failed_first_run) to #num(dyn_res.all.nb_failed) applications that could not be analyzed.
The remaining errors look more related to the application itself or to Android, with #num(96) errors being a failure to install the application, and #num(110) others being a null pointer exception from Frida.

Unfortunately, although we managed to start the applications, we can see from the list of activities visited by GroddDroid that a majority (#mypercent(dyn_res.all.z_act_visited, dyn_res.all.nb - dyn_res.all.nb_failed)) of the applications stopped before even starting one activity.
Some applications do not have an activity and are not intended to interact with a user, but those are clearly a minority and do not explain such a high number.
We expected some issues related to the use of an emulator, like the lack of x86_64 libraries in the applications, or countermeasures aborting the application if the emulator is detected.
We manually looked at some applications, but did not find a notable pattern.
In some cases, the application was just broken: for instance, the application might try to load a native library that simply does not exist in the application.
In other cases, Frida is to blame: we found some cases where calling a method from Frida can confuse the #ART.
`protected` methods need to be called from the class that defined the method or from one of its child classes, but Frida might be considered by the #ART as another class, leading to the #ART aborting the application.
#todo[jfl was supposed to test a few other app #emoji.eyes]

@tab:th-dyn-visited shows the number of applications that we analysed, whether we managed to start at least one activity, and whether we intercepted code loading or reflection.
As shown in the table, even if the application fails to start an activity, it sometimes still loads external code or uses reflection.
#figure(
  table(
    columns: 6,
    stroke: none,
    inset: 7pt,
    align: center+horizon,
    table.header(
      table.hline(),
      table.cell(colspan: 6, inset: 2pt)[],
      table.cell(rowspan: 2)[],
      table.cell(rowspan: 2)[nb apk],
      table.vline(end: 3),
      table.vline(start: 4),
      table.cell(colspan: 2, inset: (bottom: 0pt))[nb failed],
      table.vline(end: 3),
      table.vline(start: 4),
      table.cell(colspan: 2, inset: (bottom: 0pt))[activities visited],
      [1#super[st] pass], [6#super[th] pass], [0], [$>= 1$],
    ),
    table.cell(colspan: 6, inset: 2pt)[],
    table.hline(),
    table.cell(colspan: 6, inset: 2pt)[],
    [All], num(dyn_res.all.nb), num(dyn_res.all.nb_failed_first_run), num(dyn_res.all.nb_failed), num(dyn_res.all.z_act_visited), num(dyn_res.all.nz_act_visited),
    [With Reflection], num(dyn_res.reflection.nb), [], [], num(dyn_res.reflection.z_act_visited), num(dyn_res.reflection.nz_act_visited),
    [With Code Loading], num(dyn_res.code_loading.nb), [], [], num(dyn_res.code_loading.z_act_visited), num(dyn_res.code_loading.nz_act_visited),
    table.cell(colspan: 3, inset: 2pt)[],
    table.hline(),
  ),
  caption: [Summary of the dynamic exploration of the applications from the RASTA dataset collected by Androzoo in 2023]
)

The high number of applications that did not start an activity means that our results are highly biased.
Code that would be loaded, or methods that would be called by reflection, from inside activities are filtered out by the limits of our dynamic execution.
This bias must be kept in mind when reading the next subsection, which studies the bytecode that we intercepted.

=== The Bytecode Loaded by Application

We collected a total of #nb_bytecode_collected files for the #dyn_res.code_loading.nb applications that we detected loading bytecode dynamically.
#num(92) of them were loaded by a `DexClassLoader`, #num(547) were loaded by an `InMemoryDexClassLoader` and #num(1) was loaded by a `PathClassLoader`.
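As an illustration of how such collected files can be compared, here is a minimal sketch in Python. It assumes the intercepted files are stored on disk. The helper names `sha256_of` and `count_distinct_files` are hypothetical, and this is not necessarily the tooling used in this work.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large bytecode files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_distinct_files(paths):
    """Return (occurrences, digest) pairs, most frequent digest first."""
    counts = Counter(sha256_of(Path(p)) for p in paths)
    return [(n, h) for h, n in counts.most_common()]
```

The length of the returned list is the number of distinct files, and the first entry gives the number of copies of the most duplicated one.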
Interestingly, once we compared the files, we found that we only collected #num(bytecode_hashes.len()) distinct files, and that #num(bytecode_hashes.at(0).at(0)) were identical.
Once we looked in more detail, we found that most of those files are advertisement libraries.
In total, we collected #num(nb_google) files containing Google ads libraries and #num(nb_facebook) files containing Facebook ads libraries.
In addition, we found #num(nb_appsflyer) files containing code that we believe to be AppsFlyer, a company that provides "measurement, analytics, engagement, and fraud protection technologies".
The remaining #num(nb_bytecode_collected - nb_google - nb_appsflyer - nb_facebook) files were custom code from high-security applications (#ie banking, social security).
@tab:th-bytecode-hashes summarizes the information we collected about the most common bytecode files.

#figure(
  table(
    columns: 4,
    stroke: none,
    align: center+horizon,
    table.header(
      [Nb Occurrences], [SHA-256], [Content], [Format]
    ),
    table.hline(),
    ..bytecode_hashes.slice(0, 10)
      .map(
        (e) => (num(e.at(0)), [#e.at(1).slice(0, 10)...], ..e.slice(2))
      ).flatten(),
    table.cell(colspan: 4)[...],
    table.hline(),
  ),
  caption: [Most common dynamically loaded files]
)

=== Impact on Analysis Tools

#todo[Check if flowdroid improve, compare sucess rate of RASTA, show result for demo app]