parent
98cf4fbf6a
commit
ba7130160e
1 changed files with 37 additions and 27 deletions
|
@ -1,11 +1,5 @@
|
|||
#import "../lib.typ": todo, APK, DEX, JAR, OAT, eg, ART, paragraph, jm-note, jfl-note
|
||||
|
||||
/*
|
||||
* Talk about dex lego and the paper that encodes the results of anger in jimple
|
||||
* argggg https://dl.acm.org/doi/10.1145/2931037.2931044 is verrryyyyy close
|
||||
*
|
||||
*/
|
||||
|
||||
== Code Transformation <sec:th-trans>
|
||||
|
||||
#todo[Define code loading and reflection somewhere]
|
||||
|
@ -16,10 +10,11 @@ In this section, we will see how we can transform the application code to make d
|
|||
|
||||
=== Transforming Reflection <sec:th-trans-ref>
|
||||
|
||||
In Android, reflection can be used to do two things: instantiate a class, or call a method.
|
||||
Either way, reflection starts by retrieving the `Class` object representing the class to use.
|
||||
This class is usually retrieved using a `ClassLoader` object, but can also be retrieved directly from the classloader of the class defining the calling method.
|
||||
// elaborate? const-class dalvik instruction / MyClass.class in java?
|
||||
|
||||
In Android, reflection makes it possible to instantiate a class or call a method without this class or method appearing in the bytecode.
|
||||
Instead, the bytecode uses the generic classes `Class`, `Method` and `Constructor`, which can represent any existing class, method or constructor.
|
||||
Reflection often starts by retrieving the `Class` object representing the class to use.
|
||||
This class is usually retrieved using a `ClassLoader` object (though there are other ways to get it).
|
||||
Once the class is retrieved, it can be instantiated using the deprecated method `Class.newInstance()`, as shown in @lst:-th-expl-cl-new-instance, or a specific method can be retrieved.
|
||||
The current approach to instantiate a class is to retrieve the specific `Constructor` object, then call `Constructor.newInstance(..)`, as in @lst:-th-expl-cl-cnstr.
|
||||
Similarly, to call a method, the `Method` object must be retrieved, then called using `Method.invoke(..)`, as shown in @lst:-th-expl-cl-call.
|
||||
|
@ -53,18 +48,17 @@ When instanciating an object with `Object obj = cst.newInstance("Hello Void")`,
|
|||
caption: [Calling a method using reflection]
|
||||
) <lst:-th-expl-cl-call>
|
||||
|
||||
To allow static analysis tools to analyse an application that uses reflection, we want to replace the reflection call by the bytecode that actually calls the method.
|
||||
One of the main reasons to use reflection is to access classes that are neither present in the application bytecode nor platform classes.
|
||||
Indeed, the application will crash if the #ART encounters a reference to a class that cannot be found by the current classloader.
|
||||
This is often the case when dealing with classes from bytecode loaded dynamically.
|
||||
|
||||
#jfl-note[
|
||||
One of the main reasons to use reflection is to access classes that do not come from the application.
|
||||
Although reflection allows the bytecode to use classes that do not exist in the application, the application will crash at runtime if the classes are not found in the current classloader.
|
||||
Similarly, some analysis tools might have trouble analysing applications that call non-existing classes.
|
||||
@sec:th-trans-cl deals with the issue of adding dynamically loaded bytecode to the application.
|
||||
][#underline[unclear]]
|
||||
To allow static analysis tools to analyse an application that uses reflection, we want to replace the reflection call by the bytecode that actually calls the method.
|
||||
In @sec:th-trans-cl, we deal with the issue of dynamic code loading so that the classes used are in fact present in the application.
|
||||
|
||||
A notable issue is that a single reflection call can invoke different methods.
|
||||
@lst:th-worst-case-ref illustrates a worst-case scenario where any method can be called from the same reflection call.
|
||||
In those situations, #jfl-note[we cannot guarantee that we know all the methods][explain (are we going to collect the names on a best-effort basis?) Explain what we mean by "accessing a class that is not in the APK": if the app crashes, what is the point?] that can be called (#eg the name of the method called could be retrieved from a remote server).
|
||||
In those situations, we cannot guarantee that we know all the methods that can be called (#eg the name of the method called could be retrieved from a remote server).
|
||||
In addition, the method we propose in @sec:th-dyn is a best-effort approach to collect reflection data: like any dynamic analysis, it is limited by its code coverage.
|
||||
|
||||
#figure(
|
||||
```java
|
||||
|
@ -76,13 +70,22 @@ In those situation, #jfl-note[we cannot garanty that we know all the methods][ex
|
|||
) <lst:th-worst-case-ref>
|
||||
|
||||
To handle those situations, instead of entirely removing the reflection call, we can modify the application code to test whether the `Method` (or `Constructor`) object matches any expected method and, if so, call the method directly.
|
||||
If the object does not match any expected method, #jfl-note[the code can fall back to the original reflection call.][like DroidRA? \ hmm, to be verified]
|
||||
#jfl-note[@lst:-th-expl-cl-call-trans demonstrates this transformation on @lst:-th-expl-cl-call.][Explain @lst:-th-expl-cl-call-trans (important lines)]
|
||||
If the object does not match any expected method, the code can fall back to the original reflection call.
|
||||
DroidRA~@li_droidra_2016 has a similar solution, except that reflective calls are always evaluated, and the static equivalent follows just after, guarded by an opaque predicate that is always false at runtime.
|
||||
@lst:-th-expl-cl-call-trans demonstrates this transformation on @lst:-th-expl-cl-call:
|
||||
at line 25, the `Method` object `mth` is checked using a method we generated and injected in the application (defined at line 2 in the listing).
|
||||
This method checks whether the method name (line 5), its parameters (lines 6-9), its return type (lines 10-11) and its declaring class (lines 13-14) match the expected method.
|
||||
If it is the case, the method is used directly (line 26) after casting the arguments and associated object into the types/classes we just checked.
|
||||
If the check at line 25 does not pass, the original reflective call is made (line 28).
|
||||
If we were to expect other possible methods to be called in addition to `myMethod`, we would add `else if` blocks between lines 26 and 27, with other check methods reflecting each potential method call.
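
The `else if` chain described above can be sketched as follows.
This is only an illustration under stated assumptions, not the code generated by the transformation: `Greeter`, `Shouter`, `check` and `call` are hypothetical names, and a single parameterised `check` stands in for the per-method check methods of @lst:-th-expl-cl-call-trans, assuming every expected method takes one `String` and returns a `String`.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class Dereflect {
    // Hypothetical application classes, used only for illustration.
    public static class Greeter {
        public String hello(String name) { return "hello " + name; }
    }
    public static class Shouter {
        public String shout(String text) { return text.toUpperCase(); }
    }

    // Does mth denote the method `String owner.name(String)`?
    public static boolean check(Method mth, Class<?> owner, String name) {
        return mth.getName().equals(name)
            && Arrays.equals(mth.getParameterTypes(), new Class<?>[] { String.class })
            && mth.getReturnType() == String.class
            && mth.getDeclaringClass() == owner;
    }

    // Stand-in for one transformed `mth.invoke(obj, args)` call site.
    public static Object call(Method mth, Object obj, Object[] args) throws Exception {
        if (check(mth, Greeter.class, "hello")) {
            // Direct call, visible to static analysis.
            return ((Greeter) obj).hello((String) args[0]);
        } else if (check(mth, Shouter.class, "shout")) {
            // Second expected method, added as an extra branch.
            return ((Shouter) obj).shout((String) args[0]);
        } else {
            // Unexpected method: fall back to the original reflective call.
            return mth.invoke(obj, args);
        }
    }

    public static void main(String[] unused) throws Exception {
        Method hello = Greeter.class.getMethod("hello", String.class);
        Method length = String.class.getMethod("length");
        System.out.println(call(hello, new Greeter(), new Object[] { "void" }));
        System.out.println(call(length, "abc", new Object[0]));
    }
}
```

The last branch is what keeps the application working when the collected reflection data is incomplete: a method that was never observed still goes through `Method.invoke(..)`.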
|
||||
/*
|
||||
#jfl-note[It should be noted that we do the transformation at the bytecode level; the code in the listing corresponds to the output of JADX][
|
||||
I would have made a separate section on "how these transformations are done in practice";
|
||||
it is more pedagogical to describe the transformations without bytecode, then have a subsection that discusses
|
||||
the ways to modify the bytecode (Soot, Apktool, etc.) and explains their limits, then say how you make your modifications
|
||||
] #todo[Ref to list of common tools?] reformatted for readability.
|
||||
*/
|
||||
|
||||
The method check is done in a separate method injected inside the application to avoid cluttering the application too much.
|
||||
Because Java (and thus Android) allows polymorphic methods, we cannot check only the method name and its class: we must also check the whole method signature.
|
||||
We chose to limit the transformation to the specific instruction that calls `Method.invoke(..)`.
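
The need to compare the whole signature can be illustrated with two overloads of the same method.
The sketch below is an assumption-laden example rather than code taken from an application: `Target`, `looseCheck` and `strictCheck` are hypothetical names.
A check on the name and declaring class alone accepts both overloads, whereas a check that also compares parameter types and return type accepts only the expected one.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class OverloadCheck {
    // Hypothetical class declaring two overloads of `myMethod`.
    public static class Target {
        public String myMethod(String a) { return "String:" + a; }
        public String myMethod(Integer a) { return "Integer:" + a; }
    }

    // Name and declaring class only: cannot tell the overloads apart.
    public static boolean looseCheck(Method m) {
        return m.getName().equals("myMethod")
            && m.getDeclaringClass() == Target.class;
    }

    // Whole signature: name, parameter types, return type, declaring class.
    public static boolean strictCheck(Method m) {
        return m.getName().equals("myMethod")
            && Arrays.equals(m.getParameterTypes(), new Class<?>[] { String.class })
            && m.getReturnType() == String.class
            && m.getDeclaringClass() == Target.class;
    }

    public static void main(String[] args) throws Exception {
        Method onString = Target.class.getMethod("myMethod", String.class);
        Method onInteger = Target.class.getMethod("myMethod", Integer.class);
        System.out.println(looseCheck(onString) + " " + looseCheck(onInteger));   // true true
        System.out.println(strictCheck(onString) + " " + strictCheck(onInteger)); // true false
    }
}
```

With the loose check, the direct call to `myMethod(String)` would be attempted on an `Integer` argument and the cast would fail at runtime.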
|
||||
|
@ -134,12 +137,12 @@ In those cases, the parameters could be used directly whithout the detour inside
|
|||
objRet = mth.invoke(obj, args);
|
||||
}
|
||||
String retData = (String) objRet;
|
||||
``` + todo[Add lines],
|
||||
```,
|
||||
caption: [@lst:-th-expl-cl-call after the de-reflection transformation]
|
||||
) <lst:-th-expl-cl-call-trans>
|
||||
|
||||
|
||||
=== Transforming Code Loading <sec:th-trans-cl>
|
||||
=== Transforming Code Loading (or Not) <sec:th-trans-cl>
|
||||
|
||||
#jfl-note[Here I expected to read how the code that loads code is transformed, but I am told about multi-dex]
|
||||
|
||||
|
@ -148,8 +151,13 @@ Because it is an internal, platform dependant format, we elected to ignore the #
|
|||
Practically, #JAR and #APK files are zip files containing #DEX files.
|
||||
This means that we only need to find a way to integrate #DEX files into the application.
|
||||
|
||||
We elected to simply add the #DEX files to the application using the multi-dex feature introduced in SDK 21 and now used by all applications, as shown in @fig:th-inserting-dex. #jfl-note[already discussed in @sec:cl]
|
||||
This gives static analysis tools access to the dynamically loaded code.
|
||||
|
||||
We saw in @sec:cl the class loading model of Android.
|
||||
When doing dynamic code loading, an application defines a new `ClassLoader` that handles the new bytecode, and starts accessing its classes using reflection.
|
||||
We also saw in @sec:cl that Android now uses the multi-dex format, allowing it to handle any number of #DEX files in one classloader.
|
||||
Therefore, the simplest way to give static analysis tools access to the dynamically loaded code is to add the #DEX files to the application.
|
||||
This should not impact the classloading model as long as there is no class collision (we will explore this in @sec:th-class-collision) and as long as the original application did not try to access inaccessible classes.
|
||||
#jm-note[explain? maybe ref to section limitation]
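
In practice, adding a #DEX file to an #APK mostly comes down to choosing its entry name.
The sketch below assumes the usual multi-dex naming convention (`classes.dex`, then `classes2.dex`, `classes3.dex`, and so on, without gaps) and computes the name for the next file from the list of entries already in the archive.
`DexNaming`, `dexIndex` and `nextDexName` are hypothetical names, and copying the bytes into the zip and re-signing the result are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DexNaming {
    // Top-level multi-dex entries: classes.dex, classes2.dex, classes3.dex, ...
    private static final Pattern DEX = Pattern.compile("classes(\\d{0,5})\\.dex");

    // classes.dex -> 1, classesN.dex -> N, any other entry -> -1.
    public static int dexIndex(String entryName) {
        Matcher m = DEX.matcher(entryName);
        if (!m.matches()) {
            return -1;
        }
        return m.group(1).isEmpty() ? 1 : Integer.parseInt(m.group(1));
    }

    // Entry name to use for the next DEX file inserted in the archive.
    public static String nextDexName(List<String> apkEntries) {
        int max = 0;
        for (String entry : apkEntries) {
            max = Math.max(max, dexIndex(entry));
        }
        return max == 0 ? "classes.dex" : "classes" + (max + 1) + ".dex";
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
            "AndroidManifest.xml", "classes.dex", "classes2.dex", "res/layout/main.xml");
        System.out.println(nextDexName(entries)); // classes3.dex
    }
}
```

Entries outside the archive root, such as `assets/classes.dex`, do not match the pattern and are ignored, which is consistent with only root-level files being treated as application bytecode.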
|
||||
|
||||
#figure(
|
||||
image(
|
||||
|
@ -160,16 +168,18 @@ This gives access to the dynamically loaded code to static analysis tool.
|
|||
caption: [Inserting #DEX files inside an #APK]
|
||||
) <fig:th-inserting-dex>
|
||||
|
||||
We decided to leave untouched the original code that loads the bytecode.
|
||||
In the end, we decided to *not* modify the original code that loads the bytecode.
|
||||
Statically, we already added the bytecode loaded dynamically, and most tools already ignore dynamic code loading.
|
||||
At runtime, although the bytecode is already present in the application, the application will still dynamically load the code.
|
||||
This ensures that the application keeps working as intended even if the transformations we applied are incomplete.
|
||||
Specifically, to call dynamically loaded code, an application needs to use reflection, and we saw in @sec:th-trans-ref that we need to keep reflecton calls, and in order to keep reflection calls, we need the classloader created when loading bytecode.
|
||||
Specifically, an application needs to use reflection to call dynamically loaded code. We saw in @sec:th-trans-ref that we need to keep the reflection calls, and keeping them requires the classloader created when the bytecode is loaded.
|
||||
|
||||
=== Class Collisions <sec:th-class-collision>
|
||||
|
||||
We saw in @sec:cl/*-obfuscation*/ that having several classes with the same name in the same application can be problematic.
|
||||
In @sec:th-trans-cl, we are adding new code.
|
||||
#jfl-note[By doing so, we increase the probability of having class collisions.][A small collision example would be useful: it is hard to understand where the collision comes from, since we are the ones adding classes]
|
||||
By doing so, we increase the probability of having class collisions:
|
||||
the developer may have reused a helper class in both the dynamically loaded bytecode and the application, or an obfuscation process may have renamed classes without checking for intersections between the two sources of bytecode.
|
||||
When loaded dynamically, the classes are in a different classloader, and class resolution is performed at runtime, as we saw in @sec:cl-loading.
|
||||
We decided to restrict our scope to the class loaders provided by the Android SDK.
|
||||
In the absence of class collisions, those class loaders behave seamlessly, and adding the classes to the application preserves its behavior.
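
As a small illustration of where a collision can come from, the sketch below compares the class names defined by the application with those defined by a dynamically loaded #DEX file and reports the names present in both.
It assumes the two sets of names have already been extracted from the bytecode; `CollisionCheck`, `collisions` and the class names used are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class CollisionCheck {
    // Names of the classes defined by both sources of bytecode.
    public static Set<String> collisions(Set<String> appClasses, Set<String> loadedClasses) {
        Set<String> common = new TreeSet<>(appClasses);
        common.retainAll(loadedClasses);
        return common;
    }

    public static void main(String[] args) {
        // The same helper class was reused in the application and in the loaded code.
        Set<String> app = new HashSet<>(Arrays.asList(
            "Lcom/example/Main;", "Lcom/example/Utils;"));
        Set<String> loaded = new HashSet<>(Arrays.asList(
            "Lcom/example/Plugin;", "Lcom/example/Utils;"));
        System.out.println(collisions(app, loaded)); // [Lcom/example/Utils;]
    }
}
```

An empty result corresponds to the collision-free case, in which adding the classes to the application is expected to preserve its behavior.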
|
||||
|
|