rasta
This commit is contained in:
parent
fe6dbb1d22
commit
e794c037e8
10 changed files with 377 additions and 52 deletions
|
@ -1,4 +1,4 @@
|
|||
#import "../lib.typ": todo, epigraph, eg, APK, API, highlight, jm-note,
|
||||
#import "../lib.typ": todo, epigraph, eg, APK, API, highlight-block, jm-note,
|
||||
|
||||
= Introduction <sec:intro>
|
||||
|
||||
|
@ -44,13 +44,13 @@ Android applications are becoming more complexe every years and tools that canno
|
|||
This leads us to our first problem statement:
|
||||
// Chiffrer les contrib avec des xp qui ignore les app qui font crasher les outils?
|
||||
|
||||
#highlight(breakable: false)[
|
||||
*Pb 1*: _To what extent are previously published Android analysis tools still usable today, and what factors impact their reusability?_
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb1*: _To what extent are previously published Android analysis tools still usable today, and what factors impact their reusability?_
|
||||
|
||||
Many tools have been published to analyse Android applications, but the Android ecosystem is fast evolving.
|
||||
Tools developed 5 years ago might not be usable anymore.
|
||||
We will endeavor to identify which tools are still usable today, and for the others, what causes them to no longer be an option.
|
||||
]
|
||||
] <pb-1>
|
||||
|
||||
Another issue is that Android application developers sometimes use various techniques to slow down reverse engineering.
|
||||
This process is called obfuscation.
|
||||
|
@ -69,13 +69,13 @@ However, class loading is not limited to dynamic code loading.
|
|||
In fact, the Android Runtime is constantly performing class loading to load classes from the application or from the Android platform itself.
|
||||
This blind spot in static analysis tools raises our second problem statement:
|
||||
|
||||
#highlight(breakable: false)[
|
||||
*Pb 2*: _What is the default Android class loading algorithm, and does it impact static analysis?_
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb2*: _What is the default Android class loading algorithm, and does it impact static analysis?_
|
||||
|
||||
Class loading is an operation often ignored in static analysis.
|
||||
The exact algorithm used is not well known and might not be accurately modeled by static analysis tools.
|
||||
If it is the case, discrepancies between the model of the tools and the one used by Android could be used as a base for new obfuscation techniques.
|
||||
]
|
||||
] <pb-2>
|
||||
|
||||
Reflection is another common obfuscation technique against static analysis.
|
||||
Instead of directly invoking methods, the generic `Method.invoke()` #API is used, and the method is retrieved from its name in the form of a character string.
|
||||
|
@ -83,14 +83,14 @@ Finding the value of this string can be quite difficult to determine statically,
|
|||
A reverse engineer can obtain the relevant information with dynamic analysis, but there is no standard way to make static analysis tools aware of it.
|
||||
This leads us to our last problem statement:
|
||||
|
||||
#highlight(breakable: false)[
|
||||
*Pb 3*: _Can we provide dynamic code loading and reflection data collected dynamically to any static analysis tools to improve their results?_
|
||||
#highlight-block(breakable: false)[
|
||||
*Pb3*: _Can we provide dynamic code loading and reflection data collected dynamically to any static analysis tools to improve their results?_
|
||||
|
||||
Dynamic code loading and reflection are problems most suited for dynamic analysis.
|
||||
However, static analysis tools do not have access to collected data.
|
||||
Encoding this information inside valid applications could be a way to make it universally available to any static analysis tool.
|
||||
#todo[say something about the impact that can have on tools?]
|
||||
]
|
||||
] <pb-3>
|
||||
|
||||
#[
|
||||
#set heading(numbering: none, outlined: false, bookmarked: false)
|
||||
|
|
|
@ -16,9 +16,9 @@ We rebuild the tools in their original environment and share our Docker images.#
|
|||
We evaluated the reusability of the tools by measuring the number of successful analyses of applications taken from the Drebin dataset~@Arp2014 and from a custom dataset that contains more recent applications (#NBTOTALSTRING in total).
|
||||
The observation of the success or failure of these analyses enables us to answer the following research questions:
|
||||
|
||||
/ RQ1: What Android static analysis tools that are more than 5 years old are still available and can be reused without crashing with a reasonable effort?
|
||||
/ RQ2: How the reusability of tools evolved over time, especially when analyzing applications that are more than 5 years far from the publication of the tool?
|
||||
/ RQ3: Does the reusability of tools change when analyzing goodware compared to malware?
|
||||
/ RQ1: What Android static analysis tools that are more than 5 years old are still available and can be reused without crashing with a reasonable effort? <rq-1>
|
||||
/ RQ2: How has the reusability of tools evolved over time, especially when analyzing applications that are more than 5 years away from the publication date of the tool? <rq-2>
|
||||
/ RQ3: Does the reusability of tools change when analyzing goodware compared to malware? <rq-3>
|
||||
|
||||
/*
|
||||
As a summary, the contributions of this paper are the following:
|
||||
|
@ -34,4 +34,4 @@ The chapter is structured as follows.
|
|||
@sec:rasta-failure-analysis investigates the reasons behind the observed failures of some of the tools.
|
||||
We then compare in @sec:rasta-soa-comp our results with the contributions presented in @sec:bg-eval-tools.
|
||||
In @sec:rasta-reco, we give recommendations for tool development that we drew from our experience running our experiments.
|
||||
Finally, @sec:rasta-limit list the limit of our approach, an @sec:rasta-conclusion concludes the chapter.
|
||||
Finally, @sec:rasta-limit lists the limits of our approach, @sec:rasta-futur presents further avenues that we did not have time to pursue, and @sec:rasta-conclusion concludes the chapter.
|
||||
|
|
|
@ -1,11 +1,11 @@
|
|||
#import "../lib.typ": todo, highlight, num, paragraph, SDK, APK, DEX, FR, APKs
|
||||
#import "../lib.typ": todo, highlight-block, num, paragraph, SDK, APK, DEX, FR, APKs
|
||||
#import "X_var.typ": *
|
||||
#import "X_lib.typ": *
|
||||
|
||||
== Experiments <sec:rasta-xp>
|
||||
|
||||
|
||||
=== RQ1: Re-Usability Evaluation
|
||||
=== #rq1: Re-Usability Evaluation
|
||||
|
||||
|
||||
#figure(
|
||||
|
@ -104,14 +104,14 @@ As a summary, the final ratio of successful analysis for the tools that we could
|
|||
is #mypercent(54.9, 100).
|
||||
When including the two defective tools, this ratio drops to #mypercent(49.9, 100).
|
||||
|
||||
#highlight()[
|
||||
*RQ1 answer:*
|
||||
#highlight-block()[
|
||||
*#rq1 answer:*
|
||||
On a recent dataset we consider that #resultunusable of the tools are unusable.
|
||||
For the tools that we could run, #resultratio of analyses finish successfully.
|
||||
//(those with less than 50% of successful execution and including the two tools that we were unable to build).
|
||||
]
|
||||
|
||||
=== RQ2: Size, #SDK and Date Influence
|
||||
=== #rq2: Size, #SDK and Date Influence
|
||||
|
||||
#todo[alt text for fig rasta-exit-evolution-java and rasta-exit-evolution-not-java]
|
||||
|
||||
|
@ -262,8 +262,8 @@ We performed similar experiments by variating the min #SDK and target #SDK versi
|
|||
We found that, contrary to the target #SDK, the min #SDK version has an impact on the finishing rate of Java-based tools: 8 tools out of 12 are below 50% after #SDK 16.
|
||||
It is not surprising, as the min #SDK is highly correlated to the year.
|
||||
|
||||
#highlight(breakable: false)[
|
||||
*RQ2 answer:*
|
||||
#highlight-block(breakable: false)[
|
||||
*#rq2 answer:*
|
||||
For the #nbtoolsselected tools that can be used partially, a global decrease of the success rate of tools' analysis is observed over time.
|
||||
Starting at 78% of success rate, after five years, tools have 61% of success; after ten years, 45% of success.
|
||||
The success rate varies based on the size of bytecode and #SDK version.
|
||||
|
@ -271,7 +271,7 @@ The date is also correlated with the success rate for Java based tools only.
|
|||
]
|
||||
|
||||
|
||||
=== RQ3: Malware vs Goodware <sec:rasta-mal-vs-good>
|
||||
=== #rq3: Malware vs Goodware <sec:rasta-mal-vs-good>
|
||||
|
||||
#figure({
|
||||
show table: set text(size: 0.80em)
|
||||
|
@ -446,7 +446,7 @@ It goes from 1.03 for the 2#super[nd] decile to 0.67 in the 9#super[th] decile.
|
|||
We conclude from this table that, at equal size, analyzing malware still triggers fewer errors than analyzing goodware, and that the gap between the errors generated when analyzing goodware and when analyzing malware increases with the bytecode size.
|
||||
|
||||
|
||||
#highlight()[
|
||||
*RQ3 answer:*
|
||||
#highlight-block()[
|
||||
*#rq3 answer:*
|
||||
Analyzing malware applications triggers fewer errors for static analysis tools than analyzing goodware for comparable bytecode size.
|
||||
]
|
||||
|
|
|
@ -1,19 +0,0 @@
|
|||
#import "@local/template-thesis-matisse:0.0.1": etal
|
||||
#import "../lib.typ": todo, jfl-note
|
||||
#import "X_var.typ": *
|
||||
|
||||
#todo[Futur work: new systematic literature review, maybe check https://ieeexplore.ieee.org/abstract/document/9118907 ?]
|
||||
|
||||
== Conclusion <sec:rasta-conclusion>
|
||||
|
||||
#todo[Anwser pb1]
|
||||
|
||||
This paper has assessed the suggested results of the literature~@luoTaintBenchAutomaticRealworld2022 @pauckAndroidTaintAnalysis2018 @reaves_droid_2016 about the reliability of static analysis tools for Android applications.
|
||||
With a dataset of #NBTOTALSTRING applications we established that #resultunusable of #nbtoolsselectedvariations tools are not reusable, when considering that a tool that has more than 50% of time a failure is unusable.
|
||||
In total, the analysis success rate of the tools that we could run for the entire dataset is #resultratio.
|
||||
The characteristics that have the most influence on the success rate is the bytecode size and min SDK version. Finally, we showed that malware APKs have a better finishing rate than goodware.
|
||||
|
||||
#jfl-note[In future works, we plan to investigate deeper the reported errors of the tools in order to analyze the most common types of errors, in particular for Java based tools.
|
||||
We also plan to extend this work with a selection of more recent tools performing static analysis.
|
||||
|
||||
Following Reaves #etal recommendations~@reaves_droid_2016, we publish the Docker and Singularity images we built to run our experiments alongside the Docker files. This will allow the research community to use directly the tools without the build and installation penalty.][*Developper*]
|
59
3_rasta/8_futur_works.typ
Normal file
|
@ -0,0 +1,59 @@
|
|||
#import "../lib.typ": jfl-note, etal, APKs
|
||||
#import "X_var.typ": tool_info
|
||||
#import "X_lib.typ": ok
|
||||
|
||||
== Future Work <sec:rasta-futur>
|
||||
|
||||
A first extension to this work would be to study more tools.
|
||||
We restricted ourselves to the tools listed by Li #etal, but it would be interesting to compare our results to the finishing rate of recently released tools.
|
||||
It would be interesting to see if they are better at handling large #APKs, but also to see if older applications are more challenging for them due to discontinued features.
|
||||
|
||||
Another avenue would be to define a benchmark to check the ability of tools to handle real-world applications.
|
||||
Our dataset is much too large for a simple benchmark, and is sampled to have a variety of application sizes and years of publication.
|
||||
Hence, the first step would be to sample a dataset for this benchmark.
|
||||
Current benchmark datasets focus on the accuracy of the tested tools, with difficult-to-analyse applications.
|
||||
It could be interesting to extract from our results some of the applications that most tools failed to analyse, and either use them directly or study them to craft simpler applications reproducing the same challenges.
|
||||
Such a dataset would need to be updated regularly: we saw that newer applications tend to be harder to analyse, and a frozen dataset would ignore this factor.
|
||||
|
||||
In addition to the finishing rate, it would be both interesting and useful to have reference values.
|
||||
@tab:rasta-rec-deps lists common Android-related dependencies we encountered when packaging the tools.
|
||||
We can see that each tool uses at least one of those dependencies.
|
||||
It would be reasonable to consider the best finishing ratio a tool can have to be the finishing ratio of a tool that would perform an "empty analysis" using the same dependencies.
|
||||
Considering the prevalence of those dependencies, having those theoretical limits could also guide future tool developers when choosing their dependencies.
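As a rough illustration of this reference value, the sketch below computes the finishing ratio of a hypothetical "empty analysis" for a given set of dependencies. The per-application outcomes are invented for the example, and the rule that an empty analysis finishes only when every one of its dependencies handles the application is an assumption of this sketch, not a measured result.

```python
# Hypothetical per-APK outcomes: True if the dependency alone processed the
# APK without crashing. These values are made up for illustration only.
outcomes = {
    "app_a.apk": {"soot": True, "androguard": True, "apktool": True},
    "app_b.apk": {"soot": False, "androguard": True, "apktool": True},
    "app_c.apk": {"soot": True, "androguard": True, "apktool": False},
    "app_d.apk": {"soot": False, "androguard": True, "apktool": False},
}


def empty_analysis_ratio(outcomes, dependencies):
    """Share of APKs on which every listed dependency finishes.

    Assumption: an "empty analysis" built on several dependencies finishes
    on an APK only if all of those dependencies handle that APK.
    """
    if not outcomes:
        raise ValueError("no outcomes to aggregate")
    finished = sum(
        all(per_dep[dep] for dep in dependencies)
        for per_dep in outcomes.values()
    )
    return finished / len(outcomes)
```

A tool built on both Soot and Apktool would then be compared against `empty_analysis_ratio(outcomes, ["soot", "apktool"])` rather than against a perfect finishing rate.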
|
||||
|
||||
#figure({
|
||||
//show table: set text(size: 0.80em)
|
||||
let ko = []
|
||||
table(
|
||||
columns: 4,
|
||||
inset: (x: 0% + 5pt, y: 0% + 2pt),
|
||||
stroke: none,
|
||||
align: center+horizon,
|
||||
table.hline(),
|
||||
table.header(
|
||||
table.cell(colspan: 4, inset: 3pt)[],
|
||||
[Tool],
|
||||
table.vline(end: 3),
|
||||
table.vline(start: 4),
|
||||
[Soot],
|
||||
[Androguard],
|
||||
[Apktool],
|
||||
),
|
||||
table.cell(colspan: 4, inset: 3pt)[],
|
||||
table.hline(),
|
||||
table.cell(colspan: 4, inset: 3pt)[],
|
||||
..tool_info
|
||||
.map(entry => (
|
||||
[#entry.tool_name],
|
||||
if entry.use_soot { ok } else { ko },
|
||||
if entry.use_androguard { ok } else { ko },
|
||||
if entry.use_apktool { ok } else { ko },
|
||||
)).flatten(),
|
||||
table.cell(colspan: 4, inset: 3pt)[],
|
||||
table.hline(),
|
||||
)
|
||||
},
|
||||
placement: none, // small section: floating figure makes this table go in another section
|
||||
caption: [Commonly found dependencies],
|
||||
) <tab:rasta-rec-deps>
|
||||
|
33
3_rasta/9_conclusion.typ
Normal file
|
@ -0,0 +1,33 @@
|
|||
#import "@local/template-thesis-matisse:0.0.1": etal
|
||||
#import "../lib.typ": todo, jfl-note, pb1, APKs, SDK, highlight-block
|
||||
#import "X_var.typ": *
|
||||
|
||||
== Conclusion <sec:rasta-conclusion>
|
||||
|
||||
Since the release of Android, many tools have been published in order to analyse Android applications.
|
||||
In @sec:bg, we went through contributions benchmarking and comparing some of those tools.
|
||||
Those contributions suggested that analysing real-world applications might be more of a challenge than expected.
|
||||
This led us to question the reusability of those tools (#pb1).
|
||||
|
||||
This chapter has assessed the suggested results of the literature~@luoTaintBenchAutomaticRealworld2022 @pauckAndroidTaintAnalysis2018 @reaves_droid_2016 about the reliability of static analysis tools for Android applications.
|
||||
With a dataset of #NBTOTALSTRING applications we established that #resultunusable of #nbtoolsselectedvariations tools are not reusable.
|
||||
Two of those were unusable because we did not manage to run the tools, even with the help of the authors.
|
||||
We consider the 10 other tools to be unusable because they fail to finish their analysis more than 50% of the time.
|
||||
In total, the analysis success rate of the tools that we could run for the entire dataset is #resultratio.
|
||||
The characteristics that have the most influence on the success rate are the bytecode size and the min #SDK version.
|
||||
Finally, we showed that malware #APKs generate fewer fatal errors than goodware when analysed.
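The usability criterion used above can be sketched as follows. The function name, its parameters and the sample counts are hypothetical and only serve to make the rule explicit: a tool counts as unusable either when it could not be run at all, or when it fails to finish on more than 50% of its runs.

```python
def is_unusable(nb_finished, nb_runs, could_run=True):
    """Apply the 50% rule: unusable if more than half of the runs fail.

    `could_run=False` models the tools that could not be set up at all.
    Integer arithmetic avoids rounding issues at exactly 50%.
    """
    if not could_run:
        return True
    if nb_runs <= 0:
        raise ValueError("a runnable tool needs at least one run")
    nb_failed = nb_runs - nb_finished
    return 2 * nb_failed > nb_runs


# Hypothetical results: (finished analyses, total runs, could the tool be run?)
results = {
    "tool_a": (900, 1000, True),
    "tool_b": (400, 1000, True),
    "tool_c": (0, 0, False),
}
unusable = sorted(
    name for name, (done, runs, ok) in results.items()
    if is_unusable(done, runs, ok)
)
```

With these made-up counts, `tool_b` (60% of failures) and `tool_c` (never run) are flagged, while a tool failing on exactly half of its runs would not be.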
|
||||
|
||||
Following Reaves #etal recommendations~@reaves_droid_2016, we publish the Docker and Singularity images we built to run our experiments alongside the Docker files.
|
||||
This will allow the research community to use the tools directly without the build and installation penalty.
|
||||
|
||||
#v(1.5em)
|
||||
|
||||
#align(center, highlight-block(inset: 15pt, width: 75%, breakable: false, block(align(left)[
|
||||
#pb1: _To what extent are previously published Android analysis tools still usable today, and what factors impact their reusability?_
|
||||
#v(0.75em)
|
||||
More than half the tools we selected were not usable.
|
||||
In some cases, it was due to our inability to set up the tool correctly.
|
||||
Mostly, it was due to the high failure rate when analysing real-world applications.
|
||||
Results show that large applications cause more crashes, as do applications with a higher min #SDK version.
|
||||
Goodware also appears to generate more analysis failures than malware.
|
||||
])))
|
|
@ -1,5 +1,9 @@
|
|||
#import "../lib.typ": num, mypercent
|
||||
|
||||
#let rq1 = link(<rq-1>)[*RQ1*]
|
||||
#let rq2 = link(<rq-2>)[*RQ2*]
|
||||
#let rq3 = link(<rq-3>)[*RQ3*]
|
||||
|
||||
#let NBTOTAL = 62525
|
||||
#let NBTOTALSTRING = num(NBTOTAL)
|
||||
|
||||
|
@ -35,3 +39,246 @@
|
|||
delimiter: ",",
|
||||
row-type: dictionary,
|
||||
)
|
||||
|
||||
#let tool_info = (
|
||||
(
|
||||
"tool_name": "adagio",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": true,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "amandroid",
|
||||
"use_python": false,
|
||||
"use_java": false,
|
||||
"use_scala": true,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "anadroid",
|
||||
"use_python": true,
|
||||
"use_java": true,
|
||||
"use_scala": true,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "androguard",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": true,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "androguard_dad",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": true,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "apparecium",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": true,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "blueseal",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "dialdroid",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "didfail",
|
||||
"use_python": true,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "droidsafe",
|
||||
"use_python": true,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "flowdroid",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "gator",
|
||||
"use_python": true,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "ic3",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "ic3_fork",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "iccta",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "mallodroid",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": true,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "perfchecker",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": true,
|
||||
"use_androguard": false,
|
||||
"use_apktool": false,
|
||||
),
|
||||
(
|
||||
"tool_name": "redexer",
|
||||
"use_python": false,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": true,
|
||||
"use_ruby": true,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "saaf",
|
||||
"use_python": false,
|
||||
"use_java": true,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": false,
|
||||
"use_soot": false,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
(
|
||||
"tool_name": "wognsen_et_al",
|
||||
"use_python": true,
|
||||
"use_java": false,
|
||||
"use_scala": false,
|
||||
"use_ocaml": false,
|
||||
"use_ruby": false,
|
||||
"use_prolog": true,
|
||||
"use_soot": false,
|
||||
"use_androguard": false,
|
||||
"use_apktool": true,
|
||||
),
|
||||
)
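The dependency flags of `tool_info` can be tallied to check how widespread each dependency is. The sketch below transcribes the `use_soot`, `use_androguard` and `use_apktool` flags listed above into Python; the helper names are ours and are not part of the thesis sources.

```python
DEPENDENCIES = ("soot", "androguard", "apktool")

# (use_soot, use_androguard, use_apktool), transcribed from tool_info.
tool_deps = {
    "adagio": (False, True, False),
    "amandroid": (False, False, True),
    "anadroid": (False, False, True),
    "androguard": (False, True, False),
    "androguard_dad": (False, True, False),
    "apparecium": (False, True, False),
    "blueseal": (True, False, True),
    "dialdroid": (True, False, False),
    "didfail": (True, False, False),
    "droidsafe": (True, False, True),
    "flowdroid": (True, False, False),
    "gator": (True, False, True),
    "ic3": (True, False, False),
    "ic3_fork": (True, False, False),
    "iccta": (True, False, True),
    "mallodroid": (False, True, False),
    "perfchecker": (True, False, False),
    "redexer": (False, False, True),
    "saaf": (False, False, True),
    "wognsen_et_al": (False, False, True),
}


def dependency_counts(tool_deps):
    """Number of tools using each dependency."""
    return {
        dep: sum(flags[i] for flags in tool_deps.values())
        for i, dep in enumerate(DEPENDENCIES)
    }


def tools_without_dependency(tool_deps):
    """Tools that use none of the three dependencies."""
    return [name for name, flags in tool_deps.items() if not any(flags)]
```

On these flags, the tally gives 10 tools for Soot, 5 for Androguard and 9 for Apktool, and no tool is left without at least one of the three.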
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
#import "../lib.typ": todo, epigraph, highlight
|
||||
#import "../lib.typ": todo, epigraph, highlight-block, SDK
|
||||
#import "X_var.typ": resultunusable
|
||||
|
||||
= Evaluating the Reusability of Android Static Analysis Tools <sec:rasta>
|
||||
|
@ -17,11 +17,11 @@
|
|||
// This one is fun, but wont happen XD:
|
||||
// #epigraph("T-Bug Cyberpunk 2077")[You Want Nice, Supportive? Call A Damn Helpline.]
|
||||
|
||||
#align(center, highlight(inset: 15pt, width: 75%, block(align(left)[
|
||||
#align(center, highlight-block(inset: 15pt, width: 75%, block(align(left)[
|
||||
This chapter intends to explore the robustness of past software dedicated to static analysis of Android applications.
|
||||
We pursue the community effort that identified software supporting publications that perform static analysis of mobile applications and we propose a method for evaluating the reliability of this software.
|
||||
We extensively evaluate static analysis tools on a recent dataset of Android applications including goodware and malware, that we designed to measure the influence of parameters such as the date and size of applications.
|
||||
Our results show that #resultunusable of the evaluated tools are no longer usable and that the size of the bytecode and the min SDK version have the greatest influence on the reliability of tested tools.
|
||||
Our results show that #resultunusable of the evaluated tools are no longer usable and that the size of the bytecode and the min #SDK version have the greatest influence on the reliability of tested tools.
|
||||
])))
|
||||
|
||||
|
||||
|
@ -32,4 +32,5 @@
|
|||
#include("5_soa_comp.typ")
|
||||
#include("6_recommendations.typ")
|
||||
#include("7_limitations.typ")
|
||||
#include("8_conclusion.typ")
|
||||
#include("8_futur_works.typ")
|
||||
#include("9_conclusion.typ")
|
||||
|
|
4
lib.typ
|
@ -38,3 +38,7 @@
|
|||
|
||||
#let jm-note = note.with(stroke: purple + 1pt)
|
||||
#let jfl-note = note.with(stroke: green + 1pt)
|
||||
|
||||
#let pb1 = link(<pb-1>)[*Pb1*]
|
||||
#let pb2 = link(<pb-2>)[*Pb2*]
|
||||
#let pb3 = link(<pb-3>)[*Pb3*]
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
#import "../lib.typ": todo, highlight, num, paragraph
|
||||
#import "../lib.typ": todo, highlight-block, num, paragraph
|
||||
#import "X_var.typ": *
|
||||
#import "X_lib.typ": *
|
||||
|
||||
|
@ -48,7 +48,7 @@ As a summary, the final ratio of successful analysis for the tools that we could
|
|||
// and applications of Rasta dataset
|
||||
is #mypercent(54.9, 100). When including the two defective tools, this ratio drops to #mypercent(49.9, 100).
|
||||
|
||||
#highlight()[
|
||||
#highlight-block()[
|
||||
*RQ1 answer:*
|
||||
On a recent dataset we consider that #resultunusable of the tools are unusable.
|
||||
For the tools that we could run, #resultratio of analyses finish successfully.
|
||||
|
@ -120,7 +120,7 @@ sqlite> SELECT apk1.first_seen_year, (COUNT(*) * 100) / (SELECT 20 * COUNT(*)
|
|||
```
|
||||
*/
|
||||
|
||||
#highlight()[
|
||||
#highlight-block()[
|
||||
*RQ2 answer:* For the #nbtoolsselected tools that can be used partially, a global decrease of the success rate of tools' analysis is observed over time.
|
||||
Starting at 78% of success rate, after five years, tools have 61% of success; after ten years, 45% of success.
|
||||
]
|
||||
|
@ -216,7 +216,7 @@ We performed similar experiments by variating the min SDK and target SDK version
|
|||
We found that, contrary to the target SDK, the min SDK version has an impact on the finishing rate of Java-based tools: 8 tools out of 12 are below 50% after SDK 16.
|
||||
It is not surprising, as the min SDK is highly correlated to the year.
|
||||
|
||||
#highlight()[
|
||||
#highlight-block()[
|
||||
*RQ2 answer:*
|
||||
The success rate varies based on the size of bytecode and SDK version.
|
||||
The date is also correlated with the success rate for Java based tools only.
|
||||
|
@ -388,7 +388,7 @@ We observe that the ratio for the finishing rate decreases from 1.04 to 0.73, wh
|
|||
We conclude from this table that analyzing malware triggers fewer errors than analyzing goodware.
|
||||
|
||||
|
||||
#highlight()[
|
||||
#highlight-block()[
|
||||
*RQ3 answer:*
|
||||
Analyzing malware applications triggers fewer errors for static analysis tools than analyzing goodware for comparable bytecode size.
|
||||
]
|
||||
|
|