start modifying contrib section

Jean-Marie Mineau 2025-07-19 23:01:15 +02:00
parent ad66b1293d
commit fd4d6fa239
Signed by: histausse
GPG key ID: B66AEEDA9B645AD2
49 changed files with 22629 additions and 88 deletions


@@ -254,12 +254,11 @@ Problem 2: to sample, we use the deciles of APK size, but for our
*/
// Two datasets are used in the experiments of this section.
// The first one is *Drebin*@Arp2014, from which we extracted the malware part (the 5479 samples that we could retrieve) for comparison purposes only.
// It is a well-known and very old dataset that should no longer be used because it contains temporal and spatial biases@Pendlebury2018.
// We intend to compare the rate of success on this old dataset with a more recent one.
// The second one,
We built a dataset named *Rasta* to cover all years from 2010 to 2023.
Two datasets are used in the experiments of this section.
The first one is *Drebin*@Arp2014, from which we extracted the malware part (the 5479 samples that we could retrieve) for comparison purposes only.
It is a well-known and very old dataset that should no longer be used because it contains temporal and spatial biases@Pendlebury2018.
We intend to compare the rate of success on this old dataset with a more recent one.
We built the second one, *Rasta*, to cover all years from 2010 to 2023.
This dataset is a random extract of Androzoo@allixAndroZooCollectingMillions2016, for which we balanced applications between years and size.
For each year and each inter-decile range of size in Androzoo, we extracted 500 applications, with the proportion of malware fixed at 7%.
This proportion was chosen because it is the share of malware that we observed when performing a raw extract of Androzoo.
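
The sampling described above can be illustrated with a minimal Python sketch. It is not the script used to build *Rasta*: the record schema (`year`, `size`, `is_malware`), the function names, and the choice of computing the size deciles once over the whole input are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def decile_edges(sizes):
    """Return the 9 cut points that split the sorted sizes into 10 bins."""
    ordered = sorted(sizes)
    n = len(ordered)
    return [ordered[(n * k) // 10] for k in range(1, 10)]


def size_bin(size, edges):
    """Index (0 to 9) of the inter-decile range that contains `size`."""
    return sum(size >= edge for edge in edges)


def sample_balanced(apps, per_cell=500, malware_ratio=0.07, seed=0):
    """Draw up to `per_cell` apps for each (year, size bin) cell.

    `apps` is an iterable of dicts with the hypothetical keys
    'year', 'size' and 'is_malware'. A cell that holds too few
    apps of one class contributes only what it has.
    """
    rng = random.Random(seed)  # fixed seed so the extract is reproducible
    apps = list(apps)
    edges = decile_edges([app["size"] for app in apps])

    # Group apps by (year, size bin), then by malware / goodware.
    cells = defaultdict(lambda: {True: [], False: []})
    for app in apps:
        key = (app["year"], size_bin(app["size"], edges))
        cells[key][app["is_malware"]].append(app)

    n_malware = round(per_cell * malware_ratio)  # 35 when per_cell is 500
    n_goodware = per_cell - n_malware

    selected = []
    for key in sorted(cells):
        pool = cells[key]
        selected += rng.sample(pool[True], min(n_malware, len(pool[True])))
        selected += rng.sample(pool[False], min(n_goodware, len(pool[False])))
    return selected
```

With 14 years and 10 size bins, this scheme would yield at most 14 × 10 × 500 = 70000 applications, fewer if some cells lack enough samples of one class.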