From d02129a53117742a0935df86931d3107a0abf029 Mon Sep 17 00:00:00 2001
From: Jean-Marie Mineau
Date: Fri, 4 Jul 2025 13:38:17 +0200
Subject: [PATCH 1/7] wip

---
 6_theseus/1_static_transformation.typ | 66 +++++++++++++++++++++++++--
 abstract.typ | 2 +-
 2 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/6_theseus/1_static_transformation.typ b/6_theseus/1_static_transformation.typ
index b42fa00..911fffe 100644
--- a/6_theseus/1_static_transformation.typ
+++ b/6_theseus/1_static_transformation.typ
@@ -41,7 +41,8 @@ When instanciating an object with `Object obj = cst.newInstance("Hello Void")`,
 #figure(
   ```java
   Method mth = clz.getMethod("myMethod", String.class);
-  String retData = (String) mth.invoke(obj, "an argument");
+  Object[] args = {(Object)"an argument"};
+  String retData = (String) mth.invoke(obj, args);
   ```,
   caption: [Calling a method using reflection]
 )
@@ -55,8 +56,7 @@ Similarly, some analysis tools might have trouble analysis application calling n
 
 A notable issue is that a specific reflection call can call different methods.
 @lst:th-worst-case-ref illustrate a worst case scenario where any method can be call at the same reflection call.
-In those situation, we cannot garanty that we know all the methodes that can be called (#eg the name of the method called could be retrieved from a remote server).
-
+In those situations, we cannot guarantee that we know all the methods that can be called (#eg the name of the called method could be retrieved from a remote server).
 
 #figure(
   ```java
@@ -67,6 +67,66 @@ In those situation, we cannot garanty that we know all the methodes that can be
   caption: [A reflection call that can call any method]
 )
 
+To handle those situations, instead of entirely removing the reflection call, we can modify the application code to test whether the `Method` (or `Constructor`) object matches any expected method and, if so, call that method directly.
+If the object does not match any expected method, the code can fall back to the original reflection call.
+@lst:-th-expl-cl-call-trans demonstrates this transformation on @lst:-th-expl-cl-call.
+Note that we perform the transformation at the bytecode level: the code in the listing corresponds to the output of JADX #todo[Ref to list of common tools?], reformatted for readability.
+The check is done in a separate method injected into the application, to avoid cluttering the application code too much.
+Because Java (and thus Android) uses polymorphic methods, we cannot just check the method name and its class: we must also check the whole method signature.
+We chose to limit the transformation to the specific instruction that calls `Method.invoke(..)`.
+This drastically reduces the risk of breaking the application, but leads to a lot of type casting.
+Indeed, the reflection call uses the generic `Object` class, but actual methods usually use specific classes (#eg `String`, `Context`, `Reflectee`) or scalar types (#eg `int`, `long`, `boolean`).
+This means that the method parameters and the object on which the method is called must be downcast to their actual types before calling the method, and the returned value must then be upcast back to an `Object`.
+Scalar types require special attention.
+Java (and Android) distinguish between scalar types and classes, and the two cannot be mixed: a scalar cannot be cast to an `Object`.
+However, each scalar type has an associated class that can be used when doing reflection.
+For example, the scalar type `int` is associated with the class `Integer`: the method `Integer.valueOf()` converts an `int` scalar to an `Integer` object, and the method `Integer.intValue()` converts an `Integer` object back to an `int` scalar.
+Each time the method called by reflection uses scalars, the scalar-object conversion must be made before calling it.
+Finally, because the instructions following the reflection call expect an `Object`, the return value of the method must be cast to an `Object`.
+
+This back and forth between types might confuse some analysis tools.
+This could be improved in future work by analysing the code around the reflection call.
+For example, if the result of the reflection call is immediately cast to the expected type (#eg in @lst:-th-expl-cl-call, the result is cast to a `String`), there should be no need to cast it to `Object` in between.
+Similarly, it is common for the parameter array to be generated just before the reflection call and never used again (this is due to `Method.invoke(..)` being a varargs method: the array can be generated by the compiler at compile time).
+In those cases, the parameters could be used directly, without the detour through an array.
+
+#figure(
+  ```java
+  class T {
+    static boolean check_is_reflectee_mymethod_e398(Method mth) {
+      Class[] paramTys = mth.getParameterTypes();
+      return (
+        mth.getName().equals("myMethod") &&
+        paramTys.length == 1 &&
+        paramTys[0].descriptorString().equals(
+          String.class.descriptorString()
+        ) &&
+        mth.getReturnType().descriptorString().equals(
+          String.class.descriptorString()
+        ) &&
+        mth.getDeclaringClass().descriptorString().equals(
+          Reflectee.class.descriptorString()
+        )
+      );
+    }
+  }
+
+  ...
+
+  Method mth = clz.getMethod("myMethod", String.class);
+  Object[] args = {(Object)"an argument"};
+  Object objRet;
+  if (T.check_is_reflectee_mymethod_e398(mth)) {
+    objRet = (Object) ((Reflectee) obj).myMethod((String)args[0]);
+  } else {
+    objRet = mth.invoke(obj, args);
+  }
+  String retData = (String) objRet;
+  ```,
+  caption: [@lst:-th-expl-cl-call after the de-reflection transformation]
+)
+
+
 === Code loading
 
 #todo[custom class loaders]
diff --git a/abstract.typ b/abstract.typ
index 822bb36..7b4ca40 100644
--- a/abstract.typ
+++ b/abstract.typ
@@ -1,6 +1,6 @@
 #import "@local/template-thesis-matisse:0.0.1": todo
 
-#let keywords-en = ("Android", "malware analysis", "static analysis", "class loading", "code obfuscation", todo[More Keywords])
+#let keywords-en = ("Android", "malware analysis", "static analysis", "class loading", "code obfuscation", todo[Keywords])
 
 #let keywords-fr = ("Android", "analyse de maliciels", "analyse statique", "chargement de classe", "brouillage de code")
 
From 660946119a5f752b54f8a166eeb220801eabc657 Mon Sep 17 00:00:00 2001
From: Jean-Marie Mineau
Date: Fri, 4 Jul 2025 14:24:24 +0200
Subject: [PATCH 2/7] bg+rl will be merged

---
 2_background/main.typ | 2 +-
 {4_rasta => 3_rasta}/0_intro.typ | 0
 {4_rasta => 3_rasta}/1_related_work.typ | 0
 {4_rasta => 3_rasta}/2_methodology.typ | 0
 {4_rasta => 3_rasta}/3_experiments.typ | 0
 {4_rasta => 3_rasta}/4_discussion.typ | 0
 {4_rasta => 3_rasta}/5_conclusion.typ | 0
 {4_rasta => 3_rasta}/X_lib.typ | 0
 {4_rasta => 3_rasta}/X_var.typ | 0
 {4_rasta => 3_rasta}/data/average_mem-final.csv | 0
 .../data/average_number_of_error_by_exec.csv | 0
 {4_rasta => 3_rasta}/data/average_time-final.csv | 0
 {4_rasta => 3_rasta}/data/data-final.csv | 0
 ...ased-tool-by-bytecode-size-of-apks-detected-in-2022.svg | 0
 ...pks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg | 0
 ...pks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg | 0
 ...ased-tool-by-bytecode-size-of-apks-detected-in-2022.svg | 0
...pks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg | 0 ...pks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg | 0 .../figs/exit-status-for-the-drebin-dataset.svg | 0 .../exit-status-for-the-rasta-dataset-goodware-malware.svg | 0 .../figs/exit-status-for-the-rasta-dataset.svg | 0 .../figs/finishing-rate-by-year-of-java-based-tools.svg | 0 .../finishing-rate-by-year-of-non-java-based-tools.svg | 0 .../figs/repartition-of-error-types-among-tools.svg | 0 {4_rasta => 3_rasta}/figs/running.svg | 0 {4_rasta => 3_rasta}/main.typ | 0 3_related_work/main.typ | 7 ------- {5_class_loader => 4_class_loader}/0_intro.typ | 0 {5_class_loader => 4_class_loader}/1_related_work.typ | 0 {5_class_loader => 4_class_loader}/2_classloading.typ | 0 {5_class_loader => 4_class_loader}/3_obfuscation.typ | 0 {5_class_loader => 4_class_loader}/4_in_the_wild.typ | 0 {5_class_loader => 4_class_loader}/5_ttv.typ | 0 {5_class_loader => 4_class_loader}/6_conclusion.typ | 0 {5_class_loader => 4_class_loader}/X_var.typ | 0 {5_class_loader => 4_class_loader}/data/redef_sdk_16.csv | 0 .../data/redef_sdk_7minus.csv | 0 {5_class_loader => 4_class_loader}/data/redef_sdk_8.csv | 0 {5_class_loader => 4_class_loader}/data/results_50k.csv | 0 {5_class_loader => 4_class_loader}/data/results_only.csv | 0 .../figs/architecture_SDK-crop.svg | 0 .../figs/call_graph_expected.svg | 0 {5_class_loader => 4_class_loader}/figs/call_graph_obf.svg | 0 .../figs/classloaders-crop.svg | 0 .../figs/redef_sdk_relative_min_sdk.svg | 0 {5_class_loader => 4_class_loader}/main.typ | 0 {6_theseus => 5_theseus}/1_static_transformation.typ | 0 {6_theseus => 5_theseus}/3_results.typ | 0 {6_theseus => 5_theseus}/4_ttv.typ | 0 {6_theseus => 5_theseus}/main.typ | 0 main.typ | 7 +++---- 52 files changed, 4 insertions(+), 12 deletions(-) rename {4_rasta => 3_rasta}/0_intro.typ (100%) rename {4_rasta => 3_rasta}/1_related_work.typ (100%) rename {4_rasta => 3_rasta}/2_methodology.typ (100%) rename {4_rasta => 
3_rasta}/3_experiments.typ (100%) rename {4_rasta => 3_rasta}/4_discussion.typ (100%) rename {4_rasta => 3_rasta}/5_conclusion.typ (100%) rename {4_rasta => 3_rasta}/X_lib.typ (100%) rename {4_rasta => 3_rasta}/X_var.typ (100%) rename {4_rasta => 3_rasta}/data/average_mem-final.csv (100%) rename {4_rasta => 3_rasta}/data/average_number_of_error_by_exec.csv (100%) rename {4_rasta => 3_rasta}/data/average_time-final.csv (100%) rename {4_rasta => 3_rasta}/data/data-final.csv (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg (100%) rename {4_rasta => 3_rasta}/figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg (100%) rename {4_rasta => 3_rasta}/figs/exit-status-for-the-drebin-dataset.svg (100%) rename {4_rasta => 3_rasta}/figs/exit-status-for-the-rasta-dataset-goodware-malware.svg (100%) rename {4_rasta => 3_rasta}/figs/exit-status-for-the-rasta-dataset.svg (100%) rename {4_rasta => 3_rasta}/figs/finishing-rate-by-year-of-java-based-tools.svg (100%) rename {4_rasta => 3_rasta}/figs/finishing-rate-by-year-of-non-java-based-tools.svg (100%) rename {4_rasta => 3_rasta}/figs/repartition-of-error-types-among-tools.svg (100%) rename {4_rasta => 3_rasta}/figs/running.svg (100%) rename {4_rasta => 
3_rasta}/main.typ (100%) delete mode 100644 3_related_work/main.typ rename {5_class_loader => 4_class_loader}/0_intro.typ (100%) rename {5_class_loader => 4_class_loader}/1_related_work.typ (100%) rename {5_class_loader => 4_class_loader}/2_classloading.typ (100%) rename {5_class_loader => 4_class_loader}/3_obfuscation.typ (100%) rename {5_class_loader => 4_class_loader}/4_in_the_wild.typ (100%) rename {5_class_loader => 4_class_loader}/5_ttv.typ (100%) rename {5_class_loader => 4_class_loader}/6_conclusion.typ (100%) rename {5_class_loader => 4_class_loader}/X_var.typ (100%) rename {5_class_loader => 4_class_loader}/data/redef_sdk_16.csv (100%) rename {5_class_loader => 4_class_loader}/data/redef_sdk_7minus.csv (100%) rename {5_class_loader => 4_class_loader}/data/redef_sdk_8.csv (100%) rename {5_class_loader => 4_class_loader}/data/results_50k.csv (100%) rename {5_class_loader => 4_class_loader}/data/results_only.csv (100%) rename {5_class_loader => 4_class_loader}/figs/architecture_SDK-crop.svg (100%) rename {5_class_loader => 4_class_loader}/figs/call_graph_expected.svg (100%) rename {5_class_loader => 4_class_loader}/figs/call_graph_obf.svg (100%) rename {5_class_loader => 4_class_loader}/figs/classloaders-crop.svg (100%) rename {5_class_loader => 4_class_loader}/figs/redef_sdk_relative_min_sdk.svg (100%) rename {5_class_loader => 4_class_loader}/main.typ (100%) rename {6_theseus => 5_theseus}/1_static_transformation.typ (100%) rename {6_theseus => 5_theseus}/3_results.typ (100%) rename {6_theseus => 5_theseus}/4_ttv.typ (100%) rename {6_theseus => 5_theseus}/main.typ (100%) diff --git a/2_background/main.typ b/2_background/main.typ index 063241d..00076e1 100644 --- a/2_background/main.typ +++ b/2_background/main.typ @@ -2,6 +2,6 @@ = Background -#todo[Present your field background] +#todo[Present field background and related work] #lorem(200) diff --git a/4_rasta/0_intro.typ b/3_rasta/0_intro.typ similarity index 100% rename from 4_rasta/0_intro.typ rename to 
3_rasta/0_intro.typ diff --git a/4_rasta/1_related_work.typ b/3_rasta/1_related_work.typ similarity index 100% rename from 4_rasta/1_related_work.typ rename to 3_rasta/1_related_work.typ diff --git a/4_rasta/2_methodology.typ b/3_rasta/2_methodology.typ similarity index 100% rename from 4_rasta/2_methodology.typ rename to 3_rasta/2_methodology.typ diff --git a/4_rasta/3_experiments.typ b/3_rasta/3_experiments.typ similarity index 100% rename from 4_rasta/3_experiments.typ rename to 3_rasta/3_experiments.typ diff --git a/4_rasta/4_discussion.typ b/3_rasta/4_discussion.typ similarity index 100% rename from 4_rasta/4_discussion.typ rename to 3_rasta/4_discussion.typ diff --git a/4_rasta/5_conclusion.typ b/3_rasta/5_conclusion.typ similarity index 100% rename from 4_rasta/5_conclusion.typ rename to 3_rasta/5_conclusion.typ diff --git a/4_rasta/X_lib.typ b/3_rasta/X_lib.typ similarity index 100% rename from 4_rasta/X_lib.typ rename to 3_rasta/X_lib.typ diff --git a/4_rasta/X_var.typ b/3_rasta/X_var.typ similarity index 100% rename from 4_rasta/X_var.typ rename to 3_rasta/X_var.typ diff --git a/4_rasta/data/average_mem-final.csv b/3_rasta/data/average_mem-final.csv similarity index 100% rename from 4_rasta/data/average_mem-final.csv rename to 3_rasta/data/average_mem-final.csv diff --git a/4_rasta/data/average_number_of_error_by_exec.csv b/3_rasta/data/average_number_of_error_by_exec.csv similarity index 100% rename from 4_rasta/data/average_number_of_error_by_exec.csv rename to 3_rasta/data/average_number_of_error_by_exec.csv diff --git a/4_rasta/data/average_time-final.csv b/3_rasta/data/average_time-final.csv similarity index 100% rename from 4_rasta/data/average_time-final.csv rename to 3_rasta/data/average_time-final.csv diff --git a/4_rasta/data/data-final.csv b/3_rasta/data/data-final.csv similarity index 100% rename from 4_rasta/data/data-final.csv rename to 3_rasta/data/data-final.csv diff --git 
a/4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg b/3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg similarity index 100% rename from 4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg diff --git a/4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg b/3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg similarity index 100% rename from 4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg diff --git a/4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg b/3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg similarity index 100% rename from 4_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg diff --git a/4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg b/3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg similarity index 100% rename from 
4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-bytecode-size-of-apks-detected-in-2022.svg diff --git a/4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg b/3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg similarity index 100% rename from 4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-discovery-year-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg diff --git a/4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg b/3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg similarity index 100% rename from 4_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg rename to 3_rasta/figs/decorelation/finishing-rate-of-non-java-based-tool-by-min-sdk-of-apks-with-a-bytecode-size-between-4-08-mb-and-5-2-mb.svg diff --git a/4_rasta/figs/exit-status-for-the-drebin-dataset.svg b/3_rasta/figs/exit-status-for-the-drebin-dataset.svg similarity index 100% rename from 4_rasta/figs/exit-status-for-the-drebin-dataset.svg rename to 3_rasta/figs/exit-status-for-the-drebin-dataset.svg diff --git a/4_rasta/figs/exit-status-for-the-rasta-dataset-goodware-malware.svg b/3_rasta/figs/exit-status-for-the-rasta-dataset-goodware-malware.svg similarity index 100% rename from 4_rasta/figs/exit-status-for-the-rasta-dataset-goodware-malware.svg rename to 
3_rasta/figs/exit-status-for-the-rasta-dataset-goodware-malware.svg diff --git a/4_rasta/figs/exit-status-for-the-rasta-dataset.svg b/3_rasta/figs/exit-status-for-the-rasta-dataset.svg similarity index 100% rename from 4_rasta/figs/exit-status-for-the-rasta-dataset.svg rename to 3_rasta/figs/exit-status-for-the-rasta-dataset.svg diff --git a/4_rasta/figs/finishing-rate-by-year-of-java-based-tools.svg b/3_rasta/figs/finishing-rate-by-year-of-java-based-tools.svg similarity index 100% rename from 4_rasta/figs/finishing-rate-by-year-of-java-based-tools.svg rename to 3_rasta/figs/finishing-rate-by-year-of-java-based-tools.svg diff --git a/4_rasta/figs/finishing-rate-by-year-of-non-java-based-tools.svg b/3_rasta/figs/finishing-rate-by-year-of-non-java-based-tools.svg similarity index 100% rename from 4_rasta/figs/finishing-rate-by-year-of-non-java-based-tools.svg rename to 3_rasta/figs/finishing-rate-by-year-of-non-java-based-tools.svg diff --git a/4_rasta/figs/repartition-of-error-types-among-tools.svg b/3_rasta/figs/repartition-of-error-types-among-tools.svg similarity index 100% rename from 4_rasta/figs/repartition-of-error-types-among-tools.svg rename to 3_rasta/figs/repartition-of-error-types-among-tools.svg diff --git a/4_rasta/figs/running.svg b/3_rasta/figs/running.svg similarity index 100% rename from 4_rasta/figs/running.svg rename to 3_rasta/figs/running.svg diff --git a/4_rasta/main.typ b/3_rasta/main.typ similarity index 100% rename from 4_rasta/main.typ rename to 3_rasta/main.typ diff --git a/3_related_work/main.typ b/3_related_work/main.typ deleted file mode 100644 index 1ddb4c3..0000000 --- a/3_related_work/main.typ +++ /dev/null @@ -1,7 +0,0 @@ -#import "../lib.typ": todo - -= Related Work - -#todo[Do the State of the Art] - -#lorem(200) diff --git a/5_class_loader/0_intro.typ b/4_class_loader/0_intro.typ similarity index 100% rename from 5_class_loader/0_intro.typ rename to 4_class_loader/0_intro.typ diff --git a/5_class_loader/1_related_work.typ 
b/4_class_loader/1_related_work.typ similarity index 100% rename from 5_class_loader/1_related_work.typ rename to 4_class_loader/1_related_work.typ diff --git a/5_class_loader/2_classloading.typ b/4_class_loader/2_classloading.typ similarity index 100% rename from 5_class_loader/2_classloading.typ rename to 4_class_loader/2_classloading.typ diff --git a/5_class_loader/3_obfuscation.typ b/4_class_loader/3_obfuscation.typ similarity index 100% rename from 5_class_loader/3_obfuscation.typ rename to 4_class_loader/3_obfuscation.typ diff --git a/5_class_loader/4_in_the_wild.typ b/4_class_loader/4_in_the_wild.typ similarity index 100% rename from 5_class_loader/4_in_the_wild.typ rename to 4_class_loader/4_in_the_wild.typ diff --git a/5_class_loader/5_ttv.typ b/4_class_loader/5_ttv.typ similarity index 100% rename from 5_class_loader/5_ttv.typ rename to 4_class_loader/5_ttv.typ diff --git a/5_class_loader/6_conclusion.typ b/4_class_loader/6_conclusion.typ similarity index 100% rename from 5_class_loader/6_conclusion.typ rename to 4_class_loader/6_conclusion.typ diff --git a/5_class_loader/X_var.typ b/4_class_loader/X_var.typ similarity index 100% rename from 5_class_loader/X_var.typ rename to 4_class_loader/X_var.typ diff --git a/5_class_loader/data/redef_sdk_16.csv b/4_class_loader/data/redef_sdk_16.csv similarity index 100% rename from 5_class_loader/data/redef_sdk_16.csv rename to 4_class_loader/data/redef_sdk_16.csv diff --git a/5_class_loader/data/redef_sdk_7minus.csv b/4_class_loader/data/redef_sdk_7minus.csv similarity index 100% rename from 5_class_loader/data/redef_sdk_7minus.csv rename to 4_class_loader/data/redef_sdk_7minus.csv diff --git a/5_class_loader/data/redef_sdk_8.csv b/4_class_loader/data/redef_sdk_8.csv similarity index 100% rename from 5_class_loader/data/redef_sdk_8.csv rename to 4_class_loader/data/redef_sdk_8.csv diff --git a/5_class_loader/data/results_50k.csv b/4_class_loader/data/results_50k.csv similarity index 100% rename from 
5_class_loader/data/results_50k.csv rename to 4_class_loader/data/results_50k.csv diff --git a/5_class_loader/data/results_only.csv b/4_class_loader/data/results_only.csv similarity index 100% rename from 5_class_loader/data/results_only.csv rename to 4_class_loader/data/results_only.csv diff --git a/5_class_loader/figs/architecture_SDK-crop.svg b/4_class_loader/figs/architecture_SDK-crop.svg similarity index 100% rename from 5_class_loader/figs/architecture_SDK-crop.svg rename to 4_class_loader/figs/architecture_SDK-crop.svg diff --git a/5_class_loader/figs/call_graph_expected.svg b/4_class_loader/figs/call_graph_expected.svg similarity index 100% rename from 5_class_loader/figs/call_graph_expected.svg rename to 4_class_loader/figs/call_graph_expected.svg diff --git a/5_class_loader/figs/call_graph_obf.svg b/4_class_loader/figs/call_graph_obf.svg similarity index 100% rename from 5_class_loader/figs/call_graph_obf.svg rename to 4_class_loader/figs/call_graph_obf.svg diff --git a/5_class_loader/figs/classloaders-crop.svg b/4_class_loader/figs/classloaders-crop.svg similarity index 100% rename from 5_class_loader/figs/classloaders-crop.svg rename to 4_class_loader/figs/classloaders-crop.svg diff --git a/5_class_loader/figs/redef_sdk_relative_min_sdk.svg b/4_class_loader/figs/redef_sdk_relative_min_sdk.svg similarity index 100% rename from 5_class_loader/figs/redef_sdk_relative_min_sdk.svg rename to 4_class_loader/figs/redef_sdk_relative_min_sdk.svg diff --git a/5_class_loader/main.typ b/4_class_loader/main.typ similarity index 100% rename from 5_class_loader/main.typ rename to 4_class_loader/main.typ diff --git a/6_theseus/1_static_transformation.typ b/5_theseus/1_static_transformation.typ similarity index 100% rename from 6_theseus/1_static_transformation.typ rename to 5_theseus/1_static_transformation.typ diff --git a/6_theseus/3_results.typ b/5_theseus/3_results.typ similarity index 100% rename from 6_theseus/3_results.typ rename to 5_theseus/3_results.typ diff 
--git a/6_theseus/4_ttv.typ b/5_theseus/4_ttv.typ similarity index 100% rename from 6_theseus/4_ttv.typ rename to 5_theseus/4_ttv.typ diff --git a/6_theseus/main.typ b/5_theseus/main.typ similarity index 100% rename from 6_theseus/main.typ rename to 5_theseus/main.typ diff --git a/main.typ b/main.typ index 15641e0..5dbf852 100644 --- a/main.typ +++ b/main.typ @@ -71,10 +71,9 @@ #include("1_introduction/main.typ") #include("2_background/main.typ") -#include("3_related_work/main.typ") -#include("4_rasta/main.typ") -#include("5_class_loader/main.typ") -#include("6_theseus/main.typ") +#include("3_rasta/main.typ") +#include("4_class_loader/main.typ") +#include("5_theseus/main.typ") = Conclusion From 37492d223dfb332acba1f3b924224804e9594f4a Mon Sep 17 00:00:00 2001 From: Jean-Marie Mineau Date: Fri, 4 Jul 2025 14:45:03 +0200 Subject: [PATCH 3/7] grey out lorem ipsum and add option to increase interline --- 0_preamble/acknowledgements.typ | 2 +- 0_preamble/french_summary.typ | 2 +- 2_background/main.typ | 2 +- abstract.typ | 4 ++-- main.typ | 22 +++++++++++++++++++++- 5 files changed, 26 insertions(+), 6 deletions(-) diff --git a/0_preamble/acknowledgements.typ b/0_preamble/acknowledgements.typ index 9beaac2..c1c650d 100644 --- a/0_preamble/acknowledgements.typ +++ b/0_preamble/acknowledgements.typ @@ -4,4 +4,4 @@ #todo[Acknowledge people] -#lorem(400) +#text(fill: luma(75%), lorem(400)) diff --git a/0_preamble/french_summary.typ b/0_preamble/french_summary.typ index 4231b44..4f9ada1 100644 --- a/0_preamble/french_summary.typ +++ b/0_preamble/french_summary.typ @@ -9,7 +9,7 @@ Write a "Substantial Summary" in french, at least 4 pages: https://ed-matisse.doctorat-bretagne.fr/fr/soutenance-de-these#p-151 ] -#lorem(200) +#text(fill: luma(75%), lorem(200)) /* * Vocabulaire: diff --git a/2_background/main.typ b/2_background/main.typ index 00076e1..671224c 100644 --- a/2_background/main.typ +++ b/2_background/main.typ @@ -4,4 +4,4 @@ #todo[Present field background and related 
work]
 
-#lorem(200)
+#text(fill: luma(75%), lorem(200))
diff --git a/abstract.typ b/abstract.typ
index 7b4ca40..ed4ae95 100644
--- a/abstract.typ
+++ b/abstract.typ
@@ -4,6 +4,6 @@
 
 #let keywords-fr = ("Android", "analyse de maliciels", "analyse statique", "chargement de classe", "brouillage de code")
 
-#let abstract-en = lorem(175)
+#let abstract-en = text(fill: luma(75%), lorem(175))
 
-#let abstract-fr = lorem(175)
+#let abstract-fr = text(fill: luma(75%), lorem(175))
diff --git a/main.typ b/main.typ
index 5dbf852..7fab6ce 100644
--- a/main.typ
+++ b/main.typ
@@ -13,6 +13,19 @@
 } else {
   true
 }
+#let paper_draft = if "paper" in sys.inputs {
+  assert(
+    sys.inputs.paper == "true" or sys.inputs.paper == "false",
+    message: "If --input paper= is set, must be 'true', or 'false'"
+  )
+  assert(
+    draft,
+    message: "paper can only be set if --input draft=true is set"
+  )
+  sys.inputs.paper == "true"
+} else {
+  false
+}
 
 #show: matisse-thesis.with(
   title-fr: todo[Find a title],
@@ -69,6 +82,13 @@
 #counter(page).update(1)
 
 
+// Increase line spacing when compiling to paper draft
+#show par: set par(leading: 1.5em) if paper_draft
+#show par: set par(spacing: 1.5em) if paper_draft
+// Keep the default line spacing in tables
+#show table: set par(leading: 0.65em) if paper_draft
+
+
 #include("1_introduction/main.typ")
 #include("2_background/main.typ")
 #include("3_rasta/main.typ")
@@ -79,6 +99,6 @@
 
 #todo[Conclude]
 
-#lorem(500)
+#text(fill: luma(75%), lorem(500))
 
 #bibliography("bibliography.bib")
From caa1e005e47cc08216e3df4a81b94d991a6d8584 Mon Sep 17 00:00:00 2001
From: Jean-Marie Mineau
Date: Fri, 4 Jul 2025 17:58:57 +0200
Subject: [PATCH 4/7] add collision resolution

---
 1_introduction/main.typ | 13 +++++++++++++
 2_background/main.typ | 15 +++++++++++++++
 5_theseus/1_static_transformation.typ | 26 ++++++++++++++++++++++++--
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/1_introduction/main.typ b/1_introduction/main.typ
index bbb0783..0b45ff9 100644
--- a/1_introduction/main.typ
+++ 
b/1_introduction/main.typ
@@ -4,3 +4,16 @@
 
 #todo[Write an introduction]
 
+/*
+*
+* Since the dawn of time, people have made Android apps ...
+*
+* Introduce the notion of a reverse engineer who wants to analyse an app
+*
+* Android analysis tools are problematic:
+* - results too good on easy datasets
+* - easy to fool: shadow attacks
+* - cannot handle dynamic loading and reflection
+*
+* Research question: todo
+*/
diff --git a/2_background/main.typ b/2_background/main.typ
index 671224c..1442111 100644
--- a/2_background/main.typ
+++ b/2_background/main.typ
@@ -5,3 +5,18 @@
 #todo[Present field background and related work]
 
 #text(fill: luma(75%), lorem(200))
+
+/*
+* Generic primer on Android
+* present apk tool, jadx, androguard and flowdroid
+* static analysis
+* tools with datasets that are a bit too easy
+*
+* dynamic analysis
+*
+* the reverse engineer's process
+*
+* Keep the details of class loading and reflection for the associated chapters?
+*
+* Dynamic analysis
+*/
diff --git a/5_theseus/1_static_transformation.typ b/5_theseus/1_static_transformation.typ
index 911fffe..1e3e001 100644
--- a/5_theseus/1_static_transformation.typ
+++ b/5_theseus/1_static_transformation.typ
@@ -1,5 +1,11 @@
 #import "../lib.typ": todo, APK, DEX, JAR, OAT, eg
 
+/*
+* Talk about dex lego and the paper that encodes the results of anger in jimple
+*
+*
+*/
+
 == Code Transformation
 
 #todo[Define code loading and reflection somewhere]
@@ -129,8 +135,6 @@ In those cases, the parameters could be used directly whithout the detour inside
 
 === Code loading
 
-#todo[custom class loaders]
-
 An application can dynamically import code from several format like #DEX, #APK, #JAR or #OAT, either stored in memory or in a file.
 Because it is an internal, platform dependant format, we elected to ignore the #OAT format.
 Practically, #JAR and #APK files are zip files containing #DEX files.
@@ -148,6 +152,24 @@ Specifically, to call dynamically loaded code, an application needs to use refle
 
 === Class Collisions
 
+We saw in @sec:cl-obfuscation that having several classes with the same name in the same application can be problematic.
+In @sec:th-trans-cl, we add code from another source.
+By doing so, we increase the probability of class collisions.
+When loaded dynamically, the classes are in a different class loader, and class resolution is performed at runtime, as we saw in @sec:cl-loading.
+We decided to restrict our scope to the class loaders from the Android SDK.
+In the absence of class collisions, those class loaders behave seamlessly, and adding the classes to the application preserves its behavior.
+
+When we detect a collision, we rename one of the colliding classes before injecting it into the application.
+To avoid breaking the application, we then need to rename all references to this specific class, and be careful not to modify references to the other class.
+To do so, we group the classes by the class loader defining them; then, for each colliding class name and each class loader, we check which class is actually used by the class loader.
+If the class has been renamed, we rename all references to this class in the classes defined by this class loader.
+To find the class used by a class loader, we reproduce the behavior of the different class loaders of the Android SDK.
+This is an important step: remember that the delegation process can lead to situations where the class defined by a class loader is not the class that will be loaded when querying that class loader.
+ +#todo[renaming algo] + === Pitfalls #todo[interupting try blocks: catch block might expect temporary registers to still stored the saved value] +#todo[differentiating the classloaders] +#todo[changing classloader with class collision] From d369cfb187e609d4ab8cf10ea1290cdaa9f5c985 Mon Sep 17 00:00:00 2001 From: Jean-Marie Mineau Date: Mon, 7 Jul 2025 10:33:05 +0200 Subject: [PATCH 5/7] try to avoid patological todo cases --- 5_theseus/3_results.typ | 3 ++- 5_theseus/main.typ | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/5_theseus/3_results.typ b/5_theseus/3_results.typ index c43b6fe..d312335 100644 --- a/5_theseus/3_results.typ +++ b/5_theseus/3_results.typ @@ -1,6 +1,7 @@ #import "../lib.typ": todo -== Results #todo[better section name] +== Result +#todo[better section name for @sec:th-res] === Bytecode Loaded by Application diff --git a/5_theseus/main.typ b/5_theseus/main.typ index c2d7bea..8125c9d 100644 --- a/5_theseus/main.typ +++ b/5_theseus/main.typ @@ -1,6 +1,8 @@ #import "../lib.typ": todo -= #todo[theseus chapter title] += Theseus + +#todo[theseus chapter title for @sec:th] #include("1_static_transformation.typ") #include("3_results.typ") From 65baae4d0dd1fa59f7996c43711d5af62d0f714d Mon Sep 17 00:00:00 2001 From: Jean-Marie Mineau Date: Mon, 7 Jul 2025 12:38:53 +0200 Subject: [PATCH 6/7] wip soa tools --- 0_preamble/notations.typ | 14 ++++++++++- 2_background/X_android.typ | 5 ++++ 2_background/X_tools.typ | 51 ++++++++++++++++++++++++++++++++++++++ 2_background/main.typ | 2 ++ 4 files changed, 71 insertions(+), 1 deletion(-) create mode 100644 2_background/X_android.typ create mode 100644 2_background/X_tools.typ diff --git a/0_preamble/notations.typ b/0_preamble/notations.typ index cbd2990..c51cfb9 100644 --- a/0_preamble/notations.typ +++ b/0_preamble/notations.typ @@ -1,7 +1,13 @@ +#let ADB = link()[ADB] #let APK = link()[APK] +#let ART = link()[ART] +#let AXML = link()[AXML] #let DEX = link()[DEX] #let OAT = link()[OAT] #let
JAR = link()[JAR] +#let IDE = link()[IDE] +#let SDK = link()[SDK] +#let XML = link()[XML] #let notation_table = align(center, table( columns: 2, @@ -9,8 +15,14 @@ table.header( [Acronyms], [Meanings], ), + ADB, [Android Debug Bridge, a tool to connect to an Android emulator or smartphone to run commands, start applications, send events and perform other operations for testing and debugging purposes ], APK, [Android Package, the file format used to install application on Android. The APK format is an extention of the #JAR format ], + ART, [Android RunTime, the runtime environment that executes an Android application. The ART is the successor of the older Dalvik Virtual Machine ], + AXML, [Android #XML. The specific flavor of #XML used by Android. The main specificity of AXML is that it can be compiled into a binary version inside an APK ], DEX, [Dalvik Executable, the file format for the bytecode used for applicatiobs by Android ], + IDE, [Integrated Development Environment, software providing tools for software development ], JAR, [Java ARchive file, the file format used to store several java class files. Sometimes used by Android to store #DEX files instead of java classes ], - OAT, [Of Ahead Time, an ahead of time compiled format for #DEX files ] + OAT, [Of Ahead Time, an ahead of time compiled format for #DEX files ], + SDK, [Software Development Kit, a set of tools for developing software targeting a specific platform.
In the context of Android, the version of the SDK can be associated with a version of Android, and application compatibility is defined in terms of compatible SDK versions ], + XML, [eXtensible Markup Language, a language to store data ], )) diff --git a/2_background/X_android.typ b/2_background/X_android.typ new file mode 100644 index 0000000..9c7f6b0 --- /dev/null +++ b/2_background/X_android.typ @@ -0,0 +1,5 @@ +#import "../lib.typ": todo + +== Android + +#todo[Present the Android environment] diff --git a/2_background/X_tools.typ b/2_background/X_tools.typ new file mode 100644 index 0000000..22334c9 --- /dev/null +++ b/2_background/X_tools.typ @@ -0,0 +1,51 @@ +#import "../lib.typ": todo, APK, IDE, SDK, DEX, ADB, ART, eg, XML, AXML + +== Android Reverse Engineering Tools + +Due to the specificities of Android, the usual tools for reverse engineering are not enough. +#todo[blabla intro in @sec:bg-tools] + +#todo[References in @sec:bg-tools] + +=== Android Studio + +The whole Android development ecosystem is packaged by Google in the #IDE Android Studio. +In practice, Android Studio is a source-code editor that wraps around the different tools of the Android #SDK. +The #SDK tools and packages can be installed manually with the `sdkmanager` tool. +Among the notable tools in the #SDK, there are: + +- `emulator`: an Android emulator. + This tool makes it possible to run an emulated Android phone on a computer. + Although very useful, the Android emulator has several limitations. + For one, it cannot emulate another architecture. + An x86_64 computer cannot emulate an ARM smartphone. + This can be an issue because a majority of smartphones run on ARM processors. + Also, for certain versions of Android, the proprietary Google Play libraries are not available on rooted emulators. + Lastly, emulators are not designed to be stealthy and can easily be detected by an application. + Malware can avoid detection by not running its payload on emulators.
+- #ADB: a tool to send commands to an Android smartphone or emulator. + It can be used to install applications, send instructions, events, and generally perform debugging operations. +- Platform Packages: these packages contain data associated with a version of Android needed to compile an application. + In particular, they contain the so-called `android.jar` files. +- `d8`: The main use of `d8` is to convert Java bytecode files (`.class`) to the Android #DEX format. + It can also be used to perform different levels of optimization on the generated bytecode. +- `aapt`/`aapt2` (Android Asset Packaging Tool): this tool is used to build the #APK file. + Behind the scenes, it converts #XML to binary #AXML and ensures the right files have the right compression and alignment (#eg some resource files are mapped in memory by the #ART, and thus need to be aligned and not compressed). +- `apksigner`: the tool used to sign an #APK file. + +=== Apktool + +Apktool is a *reengineering tool* for Android #APK files. +It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/baksmali to convert the #DEX files to smali, an assembler-like language that matches the Dalvik bytecode instructions. +The main strength of Apktool is that after having disassembled an application, the content of the application can be edited and reassembled into a new #APK. + +=== Androguard + +Androguard is a Python library for parsing and analysing #APK files.
+ +=== Jadx + +=== Soot + +=== Frida + diff --git a/2_background/main.typ b/2_background/main.typ index 1442111..49fecc7 100644 --- a/2_background/main.typ +++ b/2_background/main.typ @@ -4,6 +4,8 @@ #todo[Present field background and related work] +#include("X_android.typ") +#include("X_tools.typ") #text(fill: luma(75%), lorem(200)) /* From f4163d8c91e352ad61f5cf2d55ffa4d54bd1afd9 Mon Sep 17 00:00:00 2001 From: Jean-Marie Mineau Date: Tue, 8 Jul 2025 15:52:37 +0200 Subject: [PATCH 7/7] wip --- 2_background/X_tools.typ | 35 +++++++++++++++++++++++++++++------ 2_background/main.typ | 5 +++-- 2 files changed, 32 insertions(+), 8 deletions(-) diff --git a/2_background/X_tools.typ b/2_background/X_tools.typ index 22334c9..22853de 100644 --- a/2_background/X_tools.typ +++ b/2_background/X_tools.typ @@ -5,11 +5,9 @@ Due to the specificities of Android, the usual tools for reverse engineering are not enough. #todo[blabla intro in @sec:bg-tools] -#todo[References in @sec:bg-tools] - === Android Studio -The whole Android developement ecosystem is packaged by Google in the #IDE Android Studio. +The whole Android development ecosystem is packaged by Google in the #IDE Android Studio#footnote[https://developer.android.com/studio]. In practice, Android Studio is a source-code editor that wrap arround the different tools of the android #SDK. The #SDK tools and packages can be installed manually with the `sdkmanager` tool. Among the notable tools in the #SDK, they are: @@ -35,17 +33,42 @@ Among the notable tools in the #SDK, they are: === Apktool -Apktool is a *reengineering tool* for Android #APK files. -It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/backsmali to convert the #DEX files to smali, an assembler-like langage that match the Dalvik bytecode instructions. +Apktool#footnote[https://apktool.org/] is a _reengineering tool_ for Android #APK files.
+It can be used to disassemble an application: it will extract the files from the #APK file, convert the binary #AXML to text #XML, and use smali/baksmali#footnote[https://github.com/JesusFreke/smali] to convert the #DEX files to smali, an assembler-like language that matches the Dalvik bytecode instructions. The main strenght of Apktool is that after having disassemble an application, the content of the application can be edited and reassemble into a new #APK. === Androguard -Androguard is a python library for parsing and analysing #APK files. +#todo[ref to androguard paper] + +Androguard#footnote[https://github.com/androguard/androguard] is a Python library for parsing and analysing #APK files. +Its main feature is disassembling #APK files. +It can be used to automatically read Android manifests, resources, and bytecode. +Unlike Apktool, it can be used programmatically, without parsing text files, to analyse the application, but it cannot repackage a modified application. + +In addition, it can perform additional analyses, such as computing a call graph or a control flow graph. === Jadx +Jadx#footnote[https://github.com/skylot/jadx] is an application decompiler. +It converts #DEX files to Java source code. +It is not always capable of decompiling all classes of an application, so it cannot be used to recompile a new application, but the generated code can be very helpful when reverse engineering an application. +In addition to decompiling #DEX files, Jadx can also decode Android manifests and application resources. + === Soot +#todo[soot ref] + +Soot#footnote[https://github.com/soot-oss/soot] is a Java optimization framework. +It can lift Java bytecode to other intermediate representations that can be used to perform optimizations and then be converted back to bytecode. +Because Dalvik bytecode and Java bytecode are equivalent, support for Android was added to Soot, and Soot features are now leveraged to analyse Android applications.
+One of the best-known examples of Soot usage for Android analysis is FlowDroid #todo[ref], a tool that computes data flows in an application. + +A new version of Soot, SootUp#footnote[https://github.com/soot-oss/SootUp], is currently being developed. +Compared to Soot, it has a modernized interface and architecture, but it is not yet feature-complete and some tools like FlowDroid are still using Soot. + === Frida +Frida#footnote[https://frida.re/] is a dynamic instrumentation toolkit. + + diff --git a/2_background/main.typ b/2_background/main.typ index 49fecc7..2360b17 100644 --- a/2_background/main.typ +++ b/2_background/main.typ @@ -1,12 +1,13 @@ -#import "../lib.typ": todo +#import "../lib.typ": todo, epigraph = Background +#epigraph("Alexis \"Lex\" Murphy, Jurassic Park")[This is a Unix system. I know this.] + #todo[Present field background and related work] #include("X_android.typ") #include("X_tools.typ") -#text(fill: luma(75%), lorem(200)) /* * Generic course on Android