This commit is contained in:
parent
5e512b585a
commit
01ce20ffda
7 changed files with 81 additions and 59 deletions
@@ -1,15 +1,20 @@
 #let ADB = link(<acr-adb>)[ADB]
 #let API = link(<acr-api>)[API]
 #let APK = link(<acr-apk>)[APK]
+#let APKs = link(<acr-apk>)[APKs]
 #let ART = link(<acr-art>)[ART]
 #let AXML = link(<acr-axml>)[AXML]
 #let DEX = link(<acr-dex>)[DEX]
+#let FR = link(<acr-fr>)[FR]
+#let HPC = link(<acr-hpc>)[HPC]
+#let MWE = link(<acr-mwe>)[MWE]
 #let OAT = link(<acr-oat>)[OAT]
 #let JAR = link(<acr-jar>)[JAR]
 #let JNI = link(<acr-jni>)[JNI]
 #let IDE = link(<acr-ide>)[IDE]
 #let NDK = link(<acr-ndk>)[NDK]
 #let SDK = link(<acr-sdk>)[SDK]
+#let SDKs = link(<acr-sdk>)[SDKs]
 #let XML = link(<acr-xml>)[XML]
 #let ZIP = link(<acr-zip>)[ZIP]
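Each of these helpers wraps `link`, so every occurrence of an acronym in the text becomes a clickable reference to its glossary entry. A minimal self-contained sketch of the pattern (the glossary wording here is illustrative, not the file's actual content):

```typst
// Helper: every use of #ADB renders "ADB" as a link to its glossary entry.
#let ADB = link(<acr-adb>)[ADB]

// A glossary entry carrying the <acr-adb> target label, mirroring how the
// real acronym table attaches <acr-art>, <acr-dex>, etc. inside the text.
#table(
  columns: 2,
  [ADB], [Android Debug Bridge, the tool used to communicate with a device or an emulator <acr-adb>],
)

// Usage in prose: renders "ADB" as a clickable link to the entry above.
Using #ADB, we install an application on a device.
```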
@@ -25,6 +30,9 @@
 ART, [Android RunTime, the runtime environment that executes an Android application. ART is the successor of the older Dalvik Virtual Machine <acr-art>],
 AXML, [Android #XML. The specific flavor of #XML used by Android. Its main specificity is that it can be compiled into a binary form inside an APK <acr-axml>],
 DEX, [Dalvik Executable, the file format of the bytecode used by Android applications <acr-dex>],
+FR, [Finishing Rate, the number of finished runs over the total number of runs of an analysis <acr-fr>],
+HPC, [High-Performance Computing, the use of supercomputers and computer clusters <acr-hpc>],
+MWE, [Minimum Working Example, in this context, a small example that can be used to check if a tool is working <acr-mwe>],
 IDE, [Integrated Development Environment, software providing tools for software development <acr-ide>],
 JAR, [Java ARchive file, the file format used to store several Java class files. Sometimes used by Android to store #DEX files instead of Java classes <acr-jar>],
 JNI, [Java Native Interface, the interface allowing native code to interact with the Java classes of the application and the Android API <acr-jni>],
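The Finishing Rate entry above is a plain ratio. As a worked illustration, a small Typst helper (hypothetical, not one defined in the repository) could compute and format it directly:

```typst
// Hypothetical helper: finishing rate as a percentage with one decimal.
#let fr(finished, total) = [#calc.round(finished / total * 100, digits: 1)%]

// 549 finished runs out of 1000 total runs renders as "54.9%".
#fr(549, 1000)
```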
@@ -35,7 +35,7 @@ Bosu #etal~@bosuCollusiveDataLeak2017 use DIALDroid to perform a threat analysis
 Similarly, Luo #etal released TaintBench~@luoTaintBenchAutomaticRealworld2022, a real-world dataset, together with recommendations for building such a dataset.
 These datasets are useful for carefully spotting missing taint flows, but contain only a few dozen applications.

-In addition to those datasets, Androzoo~@allixAndroZooCollectingMillions2016 collects applications from several application marketplaces, including the Google Play store (the official Google application store), Anzhi and AppChina (two Chinese stores), or FDroid (a store dedicated to free and open-source applications).
+In addition to those datasets, AndroZoo~@allixAndroZooCollectingMillions2016 collects applications from several application marketplaces, including the Google Play store (the official Google application store), Anzhi and AppChina (two Chinese stores), or FDroid (a store dedicated to free and open-source applications).
 Currently, AndroZoo contains more than 25 million applications that can be downloaded by researchers from the SHA256 hash of the application.
 AndroZoo also provides additional information about the applications, like the date an application was first detected by AndroZoo or the number of VirusTotal antivirus engines that flagged it as malicious.
 In addition to providing researchers with easy access to real-world applications, AndroZoo makes it a lot easier to share datasets for reproducibility: instead of sharing hundreds of #APK files, the list of SHA256 hashes is enough.
@@ -31,8 +31,7 @@ As a summary, the contributions of this paper are the following:

 The chapter is structured as follows.
 @sec:rasta-methodology presents the methodology employed to build our evaluation process and @sec:rasta-xp gives the associated experimental results.
-@sec:rasta-discussion investigates the reasons behind the observed failures of some of the tools.
-@sec:rasta-discussion discusses the limitations of this work and gives some takeaways for future contributions.
+@sec:rasta-discussion investigates the reasons behind the observed failures of some of the tools, discusses the limitations of this work, and gives some takeaways for future contributions.
 @sec:rasta-conclusion concludes the chapter.
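The `@sec:...` markers are Typst references; each resolves to a heading carrying the matching label, as the chapter files below do. A minimal sketch:

```typst
#set heading(numbering: "1.1") // required for @-references to headings

// A heading carries a label...
== Methodology <sec:rasta-methodology>

// ...and any reference to that label becomes a clickable
// "Section ..." link when the document is compiled.
@sec:rasta-methodology presents the methodology.
```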
@@ -1,9 +1,13 @@
-#import "../lib.typ": todo, etal, eg
+#import "../lib.typ": etal, eg, MWE, HPC, SDK, SDKs, APKs, DEX
+#import "../lib.typ": todo, jfl-note
 #import "X_var.typ": *
 #import "X_lib.typ": *

 == Methodology <sec:rasta-methodology>

-#todo[small intro: summary of the approach + diagram?]
+#jfl-note[Add diagram: Li etal -> [tool selection] -> drop/ - selected -> [select source version] -> [packaging] -> docker / -> singularity -> [exp]]

 === Collecting Tools

 #figure({
@@ -63,30 +67,29 @@
 )
 [
 *binaries, sources*: #nr: not relevant, #ok: available, #bad: partially available, #ko: not provided\
-*documentation*: #okk: excellent, MWE, #ok: few inconsistencies, #bad: bad quality, #ko: not available\
+*documentation*: #okk: excellent, #MWE, #ok: few inconsistencies, #bad: bad quality, #ko: not available\
 *decision*: #ok: considered; #bad: considered but not built; #ko: out of scope of the study
 ]},
 caption: [Considered tools~@Li2017: availability and usage reliability],
 ) <tab:rasta-tools>

 We collected the static analysis tools from~@Li2017, plus one additional paper encountered during our review of the state of the art (DidFail~@klieberAndroidTaintFlow2014).
-They are listed in @tab:rasta-tools, with the original release date and associated paper.
+They are listed in @tab:rasta-tools, with the original release date and associated publication.
 We intentionally limited the collected tools to the ones selected by Li #etal~@Li2017 for several reasons.
 First, not using recent tools leaves a gap of at least five years between the publication and the most recent APK files, which enables us to measure the reusability of previous contributions over a reasonable period of time.
-Second, collecting new tools would require describing these tools in depth, similarly to what was performed by Li #etal~@Li2017, which is not the primary goal of this paper.
+Second, collecting new tools would require inspecting these tools in depth, similarly to what was performed by Li #etal~@Li2017, which is not the primary goal of this chapter.
 Additionally, selection criteria such as the publication venue or the number of citations would be necessary to select a subset of tools, which would require an additional methodology.
 These possible contributions are left for future work.

 Some tools use hybrid analysis (both static and dynamic): A3E~@DBLPconfoopslaAzimN13, A5~@vidasA5AutomatedAnalysis2014, Android-app-analysis~@geneiatakisPermissionVerificationApproach2015, StaDynA~@zhauniarovichStaDynAAddressingProblem2015.
-They have been excluded from this paper.
+They have been excluded from this study.
 We manually searched for the tool repository when the website mentioned in the paper was no longer available (#eg when the repository had been migrated from Google Code to GitHub), and for each tool we searched for:

-- an optional binary version of the tool that would be usable as a fallback (if the sources cannot be compiled for any reason);
-- the source code of the tool;
-- the documentation for building and using the tool with a MWE (Minimum Working Example).
+- an optional binary version of the tool that would be usable as a fallback (if the sources cannot be compiled for any reason).
+- the source code of the tool.
+- the documentation for building and using the tool with a #MWE.

-In @tab:rasta-tools we rated the quality of these artifacts with "#ok" when available but possibly with inconsistencies, "#bad" when too many inconsistencies (inaccurate remarks about the sources, dead links or missing parts) were found, "#ko" when no documentation was found, and a double "#okk" for documentation that covers all our expectations (building process, usage, MWE).
-Results show that documentation is often missing or very poor (#eg Lotrack), which makes the rebuild process and the first analysis of a MWE very complex.
+In @tab:rasta-tools we rated the quality of these artifacts with "#ok" when available but possibly with inconsistencies, "#bad" when too many inconsistencies (inaccurate remarks about the sources, dead links or missing parts) were found, "#ko" when no documentation was found, and a double "#okk" for documentation that covers all our expectations (building process, usage, #MWE).
+Results show that documentation is often missing or very poor (#eg Lotrack), which makes the rebuild process and the first analysis of a #MWE very complex.
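The rating marks come from X_lib.typ, whose definitions are not part of this diff; a minimal sketch of what such marks could look like (names kept, bodies assumed):

```typst
// Assumed definitions: the real ones live in X_lib.typ.
#let ok = text(fill: green)[#sym.checkmark]                 // available / considered
#let okk = text(fill: green)[#sym.checkmark#sym.checkmark]  // excellent documentation
#let bad = text(fill: orange)[#sym.tilde.op]                // partial / poor quality
#let ko = text(fill: red)[#sym.times]                       // missing / out of scope
#let nr = text(fill: gray)[--]                              // not relevant

// Usage: documentation rated #okk, sources #ok, binaries #ko.
```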

 We finally excluded Choi #etal~@CHOI2014620, as their tool works on the sources of Android applications, and Poeplau #etal~@DBLPconfndssPoeplauFBKV14, which focuses on Android hardening.
@@ -177,7 +180,7 @@ For example, a fork of Aparecium contains a port for Windows 7 which does not su
 For IC3, the fork seems promising: it has been updated to be usable on a recent operating system (Ubuntu 22.04 instead of Ubuntu 12.04 for the original version) and is used as a dependency by IccTa.
 We decided to keep these two versions of the tool (IC3 and IC3_fork) to compare their results.

-Then, we self-allocated a maximum of four days per tool to successfully read and follow the documentation, compile the tool and obtain the expected result when executing an analysis of a MWE.
+Then, we self-allocated a maximum of four days per tool to successfully read and follow the documentation, compile the tool and obtain the expected result when executing an analysis of a #MWE.
 We sent an email to the authors of each tool to confirm that we used the most suitable version of the code and that the command line we used to analyze an application was the most suitable one, and, in some cases, we requested help to solve issues in the building process.
 We report in @tab:rasta-sources the authors who answered our request and confirmed our decisions.
@@ -189,32 +192,39 @@ Thus, if the documentation mentions a specific operating system, we use a Docker
 // Those libraries are only available on Ubuntu 12 or previous versions.
 //
 Most of the time, tools require additional external components to be fully functional.
-It could be resources such as the android.jar file for each version of the SDK, a database, additional libraries or tools.
+These can be resources such as the `android.jar` file for each version of the #SDK, a database, additional libraries or tools.
 Depending on the quality of the documentation, setting up those components can take hours to days.
-This is why we automated in a Dockerfile the setup of the environment in which the tool is built and run#footnote[To guarantee reproducibility we published the results, datasets, Dockerfiles and containers: https://github.com/histausse/rasta, https://zenodo.org/records/10144014, https://zenodo.org/records/10980349 and on Docker Hub as `histausse/rasta-<toolname>:icsr2024`]
+This is why we automated in a Dockerfile the setup of the environment in which the tool is built and run#footnote[
+#set list(indent: 1em) // avoid having the bullet align with the footnote numbering
+To guarantee reproducibility, we published the results, datasets, Dockerfiles and containers:
+- https://github.com/histausse/rasta .
+- https://zenodo.org/records/10144014 .
+- https://zenodo.org/records/10980349 .
+- on Docker Hub as `histausse/rasta-<toolname>:icsr2024`.
+]

 === Runtime Conditions

 #figure(
 image(
 "figs/running.svg",
-width: 80%,
+width: 100%,
 alt: "A diagram representing the methodology. The word 'Tool' is linked to a box labeled 'Docker image' by an arrow labeled 'building'. The box 'Docker image' is linked to a box labeled 'Singularity image' by an arrow labeled 'conversion'. The box 'Singularity image' is linked to a box labeled 'Execution monitoring' by a dotted arrow labeled 'Manual tests' and to an image of a server labeled 'Singularity cluster' by an arrow labeled 'deployment'. An image of three Android logos labeled 'apks' is also linked to the 'Singularity cluster' by an arrow labeled 'running the tool analysis'. The 'Singularity cluster' image is linked to the 'Execution monitoring' box by an arrow labeled 'log capture'. The 'Execution monitoring' box is linked to the words 'Exit status' by an unlabeled arrow.",
 ),
 caption: [Methodology overview],
 ) <fig:rasta-overview>

 As shown in @fig:rasta-overview, before benchmarking the tools, we built and installed them in Docker containers to facilitate reuse by other researchers.
-We converted them into Singularity containers because we had access to a Singularity cluster and because this technology is often used by the HPC community for ensuring the reproducibility of experiments.
+We converted them into Singularity containers because we had access to a Singularity cluster and because this technology is often used by the #HPC community for ensuring the reproducibility of experiments.
 //The Docker container allows a user to interact more freely with the bundled tools.
 //Then, we converted this image to a Singularity image.
 We performed manual tests using these Singularity images to check:

 - the location where the tool writes on the disk. For best performance, we expect the tools to write on a mount point backed by an SSD. Some tools may write data at unexpected locations, which required small patches from us.
-- the amount of memory allocated to the tool. We checked that the tool could run a MWE with a #ramlimit limit of RAM.
-- the network connections opened by the tool, if any. We expect the tool not to perform any network operation such as the download of Android SDKs. Thus, we prepared the required files and cached them in the images during the building phase. In a few cases, we patched the tool to disable the download of resources.
+- the amount of memory allocated to the tool. We checked that the tool could run a #MWE with a #ramlimit limit of RAM.
+- the network connections opened by the tool, if any. We expect the tool not to perform any network operation such as the download of Android #SDKs. Thus, we prepared the required files and cached them in the images during the building phase. In a few cases, we patched the tool to disable the download of resources.

-A campaign of tests consists of executing the #nbtoolsvariationsrun selected tools on all APKs of a dataset.
+A campaign of tests consists of executing the #nbtoolsvariationsrun selected tools on all #APKs of a dataset.
 The constraints applied to the clusters are:

 - No network connection is authorized, in order to limit any execution of malicious software.
@@ -268,6 +278,6 @@ If no antivirus has reported the application as malicious, we consider it as a g
 Applications in between are dropped.

 For computing the release date of an application, we contacted the authors of AndroZoo to compute the minimum date between the submission to AndroZooo and the first upload to VirusTotal.
-Such a computation is more reliable than using the DEX date, which is often obfuscated when packaging the application.
+Such a computation is more reliable than using the #DEX date, which is often obfuscated when packaging the application.

 // #todo[Transition] // no space left :-(
@@ -1,4 +1,4 @@
-#import "../lib.typ": todo, highlight, num, paragraph, SDK
+#import "../lib.typ": todo, highlight, num, paragraph, SDK, APK, DEX, FR
 #import "X_var.typ": *
 #import "X_lib.typ": *
@@ -10,12 +10,12 @@

 #todo[alt text for figure rasta-exit / rasta-exit-drebin]
 #figure(
-image("figs/exit-status-for-the-drebin-dataset.svg", width: 80%),
+image("figs/exit-status-for-the-drebin-dataset.svg", width: 100%),
 caption: [Exit status for the Drebin dataset],
 ) <fig:rasta-exit-drebin>

 #figure(
-image("figs/exit-status-for-the-rasta-dataset.svg", width: 80%),
+image("figs/exit-status-for-the-rasta-dataset.svg", width: 100%),
 caption: [Exit status for the Rasta dataset],
 ) <fig:rasta-exit>
@@ -24,7 +24,7 @@
 They represent the success/failure rate (green/orange) of the tools.
 We distinguished failure to compute a result from timeouts (blue) and crashes of our evaluation framework (in grey, probably due to out-of-memory kills of the container itself).
 Because they may be caused by a bug in our own analysis stack, exit statuses represented in grey (Other) are considered as unknown errors and not as failures of the tool.
-#todo[We discuss further errors for which we have information in the logs in @sec:rasta-failure-analysis.]
+We further discuss the errors for which we have information in the logs in @sec:rasta-failure-analysis.

 Results on the Drebin dataset show that 11 tools have a high success rate (greater than 85%).
 The other tools have poor results.
@@ -46,7 +46,8 @@ Concerning Flowdroid, our results show a very low timeout rate (#mypercent(37, N

 As a summary, the final ratio of successful analyses for the tools that we could run
 // and applications of Rasta dataset
-is #mypercent(54.9, 100). When including the two defective tools, this ratio drops to #mypercent(49.9, 100).
+is #mypercent(54.9, 100).
+When including the two defective tools, this ratio drops to #mypercent(49.9, 100).

 #highlight()[
 *RQ1 answer:*
@@ -63,7 +64,7 @@ For the tools that we could run, #resultratio of analysis are finishing successf
 [#figure(
 image(
 "figs/finishing-rate-by-year-of-java-based-tools.svg",
-width: 48%,
+width: 50%,
 alt: ""
 ),
 caption: [Java based tools],
@@ -72,7 +73,7 @@ For the tools that we could run, #resultratio of analysis are finishing successf
 [#figure(
 image(
 "figs/finishing-rate-by-year-of-non-java-based-tools.svg",
-width: 48%,
+width: 50%,
 alt: "",
 ),
 caption: [Non Java based tools],
@@ -81,7 +82,7 @@ For the tools that we could run, #resultratio of analysis are finishing successf
 ), caption: [Exit status evolution for the Rasta dataset]
 )

-To investigate the effect of application dates on the tools, we computed the date of each APK based on the minimum date between the first upload in AndroZoo and the first analysis in VirusTotal.
+To investigate the effect of application dates on the tools, we computed the date of each #APK based on the minimum date between the first upload in AndroZoo and the first analysis in VirusTotal.
 Such a computation is more reliable than using the DEX date, which is often obfuscated when packaging the application.
 Then, for the sake of clarity of our results, we separated the tools that have mainly Java source code from those that use other languages.
 Among the Java based programs, most use the Soot framework, which may correlate the obtained results.
@@ -126,7 +127,7 @@ To compare the influence of the date, #SDK version and size of applications, we
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg",
-width: 48%,
+width: 50%,
 alt: ""
 ),
 caption: [Java based tools],
@@ -135,7 +136,7 @@ To compare the influence of the date, #SDK version and size of applications, we
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg",
-width: 48%,
+width: 50%,
 alt: "",
 ),
 caption: [Non Java based tools],
@@ -144,10 +145,11 @@ To compare the influence of the date, #SDK version and size of applications, we
 ), caption: [Finishing rate by bytecode size for APKs detected in 2022]
 ) <fig:rasta-decorelation-size>

-#paragraph([Fixed application year. (#num(5000) APKs)])[
+#paragraph[Fixed application year. (#num(5000) APKs)][
 We selected the year 2022, which has a good number of representatives for each size decile in our application dataset.
-@fig:rasta-rate-evolution-java-2022} (resp. @fig:rasta-rate-evolution-non-java-2022) shows the finishing rate of the tools as a function of the bytecode size for Java based tools (resp. non Java based tools) analyzing applications of 2022.
-We can observe that all Java based tools have a finishing rate that decreases as the bytecode size grows. 50% of non Java based tools have the same behavior.
+@fig:rasta-rate-evolution-java-2022 (resp. @fig:rasta-rate-evolution-non-java-2022) shows the finishing rate of the tools as a function of the bytecode size for Java based tools (resp. non Java based tools) analyzing applications of 2022.
+We can observe that all Java based tools have a finishing rate that decreases as the bytecode size grows.
+50% of non Java based tools have the same behavior.
 ]
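The change above swaps parenthesized arguments for Typst's trailing content blocks; the two call forms are equivalent for a function taking two content arguments. A minimal sketch (with a stand-in `paragraph`, since the real helper lives in lib.typ):

```typst
// Stand-in for the project's `paragraph` helper: a bold inline title
// followed by the body text.
#let paragraph(title, body) = [*#title* #body]

// Equivalent calls: explicit arguments...
#paragraph([Fixed application year.], [We selected the year 2022.])
// ...or trailing content blocks, as used after this commit.
#paragraph[Fixed application year.][We selected the year 2022.]
```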

 #todo[Alt text for fig rasta-decorelation-size]
@@ -155,7 +157,7 @@ We can observe that all Java based tools have a finishing rate decreasing over y
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg",
-width: 48%,
+width: 50%,
 alt: ""
 ),
 caption: [Java based tools],
@@ -164,7 +166,7 @@ We can observe that all Java based tools have a finishing rate decreasing over y
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg",
-width: 48%,
+width: 50%,
 alt: "",
 ),
 caption: [Non Java based tools],
@@ -173,7 +175,7 @@ We can observe that all Java based tools have a finishing rate decreasing over y
 ), caption: [Finishing rate by discovery year with a bytecode size $in$ [4.08, 5.2] MB]
 ) <fig:rasta-decorelation-size>

-#paragraph([Fixed application bytecode size. (#num(6252) APKs)])[We selected the sixth decile (between 4.08 and 5.20 MB), which is well represented across a wide range of years.
+#paragraph[Fixed application bytecode size. (#num(6252) APKs)][We selected the sixth decile (between 4.08 and 5.20 MB), which is well represented across a wide range of years.
 @fig:rasta-rate-evolution-java-decile-year (resp. @fig:rasta-rate-evolution-non-java-decile-year) represents the finishing rate depending on the year at a fixed bytecode size.
 We observe that 9 out of 12 Java based tools have a finishing rate dropping below 20%, which is not the case for non Java based tools.
 ]
@@ -183,7 +185,7 @@ We observe that 9 tools over 12 have a finishing rate dropping below 20% for Jav
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg",
-width: 48%,
+width: 50%,
 alt: ""
 ),
 caption: [Java based tools],
@@ -192,7 +194,7 @@ We observe that 9 tools over 12 have a finishing rate dropping below 20% for Jav
 [#figure(
 image(
 "figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg",
-width: 48%,
+width: 50%,
 alt: "",
 ),
 caption: [Non Java based tools],
@@ -205,7 +207,7 @@ We performed similar experiments by varying the min #SDK and target #SDK versi
 We found that, contrary to the target #SDK, the min #SDK version has an impact on the finishing rate of Java based tools: 8 tools out of 12 are below 50% after #SDK 16.
 This is not surprising, as the min #SDK is highly correlated with the year.

-#highlight()[
+#highlight(breakable: false)[
 *RQ2 answer:*
 For the #nbtoolsselected tools that can be used partially, a global decrease of the success rate of the tools' analyses is observed over time.
 Starting at a 78% success rate, tools drop to 61% success after five years and to 45% after ten years.
@@ -253,7 +255,7 @@ sqlite> SELECT vt_detection == 0, COUNT(DISTINCT sha256) FROM apk WHERE dex_size
 #figure(
 image(
 "figs/exit-status-for-the-rasta-dataset-goodware-malware.svg",
-width: 80%,
+width: 100%,
 alt: "",
 ),
 caption: [Exit status comparing goodware and malware for the Rasta dataset],
@@ -288,32 +290,32 @@ sqlite> SELECT AVG(apk_size) FROM apk WHERE vt_detection != 0;
 #figure({
 show table: set text(size: 0.80em)
 table(
-columns: 4,
+columns: 3, //4,
 inset: (x: 0% + 5pt, y: 0% + 2pt),
 stroke: none,
 align: center+horizon,
 table.hline(),
 table.header(
-table.cell(colspan: 4, inset: 3pt)[],
+table.cell(colspan: 3/*4*/, inset: 3pt)[],
 table.cell(rowspan:2)[*Rasta part*],
 table.vline(end: 3),
 table.vline(start: 4),
-table.cell(colspan:2)[*Average size*],
-table.vline(end: 3),
-table.vline(start: 4),
-table.cell(rowspan:2)[*Average date*],
+table.cell(colspan:2)[*Average size* (MB)],
+//table.vline(end: 3),
+//table.vline(start: 4),
+//table.cell(rowspan:2)[*Average date*],
 [*APK*],
 [*DEX*],
 ),
-table.cell(colspan: 4, inset: 3pt)[],
+table.cell(colspan: 3/*4*/, inset: 3pt)[],
 table.hline(),
-table.cell(colspan: 4, inset: 3pt)[],
+table.cell(colspan: 3/*4*/, inset: 3pt)[],

-[*goodware*], num(16897989), num(6598464), [2017],
-[*malware*], num(17236860), num(4337376), [2017],
-[*total*], num(16918107), num(6464228), [2017],
+[*goodware*], num(calc.round(16.897989, digits: 1)), num(calc.round(6.598464, digits: 1)),// [2017],
+[*malware*], num(calc.round(17.236860, digits: 1)), num(calc.round(4.337376, digits: 1)),// [2017],
+[*total*], num(calc.round(16.918107, digits: 1)), num(calc.round(6.464228, digits: 1)),// [2017],

-table.cell(colspan: 4, inset: 3pt)[],
+table.cell(colspan: 3/*4*/, inset: 3pt)[],
 table.hline(),
 )},
 caption: [Average size of the goodware/malware parts of the Rasta dataset],
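The rewritten rows store the byte counts pre-divided into MB and round them at render time; `num` is presumably the project's number formatter imported from lib.typ elsewhere in this commit. Note also that every filler `table.cell(colspan: ...)` had to shrink from 4 to 3 to match the new column count. The arithmetic, as a sketch using the plain value in place of `num`:

```typst
// 16897989 bytes expressed in MB, rounded to one decimal place.
#let mb = calc.round(16897989 / 1000000, digits: 1)
#mb // renders "16.9", matching num(calc.round(16.897989, digits: 1))
```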
@@ -336,13 +338,13 @@
 table.cell(colspan:2)[*Average DEX size (MB)*],
 table.vline(end: 3),
 table.vline(start: 4),
-table.cell(colspan:2)[* Finishing Rate: FR*],
+table.cell(colspan:2)[* Finishing Rate: #FR*],
 table.vline(end: 3),
 table.vline(start: 4),
 [*Ratio Size*],
 table.vline(end: 3),
 table.vline(start: 4),
-[*Ratio FR*],
+[*Ratio #FR*],
 [Good], [Mal],
 [Good], [Mal],
 [Good/Mal], [Good/Mal],
@@ -365,7 +367,7 @@
 table.cell(colspan: 7, inset: 3pt)[],
 table.hline(),
 )},
-caption: [DEX size and Finishing Rate (FR) per decile],
+caption: [#DEX size and Finishing Rate (#FR) per decile],
 ) <tab:rasta-sizes-decile>

 We compared the finishing rate of malware and goodware applications for the evaluated tools.
@@ -133,7 +133,7 @@ Therefore, we investigated the nature of errors globally, without distinction be
 #figure(
 image(
 "figs/repartition-of-error-types-among-tools.svg",
-width: 80%,
+width: 100%,
 alt: "",
 ),
 caption: [Heatmap of the ratio of error reasons for all tools for the Rasta dataset],
@@ -1,6 +1,9 @@
 #import "@local/template-thesis-matisse:0.0.1": etal
+#import "../lib.typ": todo
 #import "X_var.typ": *

+#todo[Future work: new systematic literature review, maybe check https://ieeexplore.ieee.org/abstract/document/9118907 ?]

 == Conclusion <sec:rasta-conclusion>

 This chapter has assessed the results suggested by the literature~@luoTaintBenchAutomaticRealworld2022 @pauckAndroidTaintAnalysis2018 @reaves_droid_2016 about the reliability of static analysis tools for Android applications.