|| Date: 19-01-24 ||
|| Tag: write-up ||

On Frida & Code Instrumentation

The Problem

Imagine there is an application we would like to understand. Take PicoCTF’s quack-me binary as an example. Running it yields this:

[0] % ./main
You have now entered the Duck Web, and you're in for a honkin' good time.
Can you figure out my trick?
hello
That's all folks.

The binary is basically asking for a key. The simplest way to move forward is to statically analyze the function responsible for displaying the flag:

│       │   ; CODE XREF from sym.do_magic (0x8048711)
│      ┌──> 0x080486bd      8b45e8         mov eax, dword [local_18h]       ; Jump is taken since size > 0
│      ╎│   0x080486c0      0558880408     add eax, obj.sekrutBuffer       ; 0x8048858 ; ")\x06\x16O+50\x1eQ\x1b[\x14K\b]+S\x10TQCM\T]"
│      ╎│   0x080486c5      0fb608         movzx ecx, byte [eax]
│      ╎│   0x080486c8      8b55e8         mov edx, dword [local_18h]
│      ╎│   0x080486cb      8b45ec         mov eax, dword [s]
│      ╎│   0x080486ce      01d0           add eax, edx
│      ╎│   0x080486d0      0fb600         movzx eax, byte [eax]
│      ╎│   0x080486d3      31c8           xor eax, ecx
│      ╎│   0x080486d5      8845e3         mov byte [local_1dh], al
│      ╎│   0x080486d8      8b1538a00408   mov edx, dword obj.greetingMessage       ; [0x804a038:4]=0x80487f0 str.You_have_now_entered_the_Duck_Web__a
│      ╎│   0x080486de      8b45e8         mov eax, dword [local_18h]
│      ╎│   0x080486e1      01d0           add eax, edx
│      ╎│   0x080486e3      0fb600         movzx eax, byte [eax]
│      ╎│   0x080486e6      3a45e3         cmp al, byte [local_1dh]
│     ┌───< 0x080486e9      7504           jne 0x80486ef               ;[3]   ; likely ; (if local_1ch == 0x19, we win)
│     │╎│   0x080486eb      8345e401       add dword [local_1ch], 1       ; local_1ch increments everytime local_1dh == al
│     │╎│   ; CODE XREF from sym.do_magic (0x80486e9)
│     └───> 0x080486ef      837de419       cmp dword [local_1ch], 0x19       ; if local_1ch == 0x19, we win
│     ┌───< 0x080486f3      7512           jne 0x8048707               ;[4]   ; likely
│     │╎│   0x080486f5      83ec0c         sub esp, 0xc
│     │╎│   0x080486f8      68ab880408     push str.You_are_winner       ; 0x80488ab ; "You are winner!" ; const char *s
│     │╎│   0x080486fd      e86efdffff     call sym.imp.puts           ;[5]   ; int puts(const char *s)
│     │╎│   0x08048702      83c410         add esp, 0x10
│    ┌────< 0x08048705      eb0c           jmp 0x8048713               ;[6]
│    ││╎│   ; CODE XREF from sym.do_magic (0x80486f3)
│    │└───> 0x08048707      8345e801       add dword [local_18h], 1
│    │ ╎│   ; CODE XREF from sym.do_magic (0x80486bb)
│    │ ╎└─> 0x0804870b      8b45e8         mov eax, dword [local_18h]
│    │ ╎    0x0804870e      3b45f0         cmp eax, dword [size]
│    │ └──< 0x08048711      7caa           jl 0x80486bd                ;[7]   ; unlikely ; (Jump is taken since size > 0)
│    └────> 0x08048713      c9             leave
└           0x08048714      c3             ret

The above output is from Radare2. There are quite a few jumps and loops in there, but I can see the "You are winner!" string in the comments. Its address is pushed onto the stack and then puts is invoked, which makes me think I’m on the right track.

How to move forward from there? I would attach a debugger to figure out two things:

How this function works
How the registers get manipulated

After a bit of debugging, we figure out that each byte of the input is XOR-ed with the matching byte of the constant obj.sekrutBuffer and then compared against the greeting message, and we build the solution on that finding. Reaching the same finding by tracing the code statically would have been a lot more complicated.
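Since XOR is its own inverse, that finding is enough to recover the key. Below is a minimal sketch, assuming the bytes of obj.sekrutBuffer are exactly the ones shown in the Radare2 comment (with "\T" read as a literal backslash followed by T) and that the greeting is the first line the binary prints. recoverKey is a name I made up for illustration.

```javascript
'use strict';

// Bytes of obj.sekrutBuffer, transcribed from the Radare2 comment above.
// Assumption: "\T" in that comment is a literal backslash (0x5c) followed by 'T'.
const sekrutBuffer = [
    0x29, 0x06, 0x16, 0x4f, 0x2b, 0x35, 0x30, 0x1e, 0x51, 0x1b,
    0x5b, 0x14, 0x4b, 0x08, 0x5d, 0x2b, 0x53, 0x10, 0x54, 0x51,
    0x43, 0x4d, 0x5c, 0x54, 0x5d
];

// The greeting the binary prints; only the first 0x19 (25) bytes matter.
const greeting = "You have now entered the Duck Web, and you're in for a honkin' good time.";

// The check is input[i] ^ sekrutBuffer[i] == greeting[i],
// so input[i] = sekrutBuffer[i] ^ greeting[i].
function recoverKey(secret, plain) {
    return secret
        .map((byte, i) => String.fromCharCode(byte ^ plain.charCodeAt(i)))
        .join('');
}

console.log(recoverKey(sekrutBuffer, greeting));
```

The counter in the loop has to reach 0x19, which matches the 25 bytes in the buffer, so the recovered string is the whole key.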

Debuggers allow the analyst to enter God Mode and see practically everything. The only limit is one’s knowledge, not the tooling.

Let’s jump to the mobile world and take another example: you’ve got an Android app that uses some funky encryption. You disassemble it with apktool or a similar tool, go through the static analysis phase, and find that the function you’re analyzing calls a getKey() function that calculates the key based on some weird measurements. How would one proceed with tracing this code?

For Android, there are some options: on the Java level, we can debug the code over JDWP. On the native level, any ptrace-based debugger (LLDB, GDB, r2, etc.) works just fine. It’s not as simple as running gdb --args app.apk though. You’d need quite a hefty setup. For the Java level, you’d need to:

Disassemble the app
Set the debuggable flag to true
Rebuild the app
Decompile the app to get the Java sources (with CFR decompiler or jadx)
Set up an IDE, like Android Studio, and port the decompiled Java code to it
Set up the testing device to have that app in the “Wait for debugger” list of apps
Set breakpoints on the getKey() function
Run the app

You’d basically have to do this for every single app. Modifying the return values of a function is possible through Android Studio (or even plain old JDWP), but it takes too much time to bootstrap, and whatever changes you make cannot be saved. The native-level setup is even worse, since you’d have to attach gdbserver to the running process and analyze it from there.

Reversing and tracing decompiled sources is hard, and complaining won’t help. However, if there were an easier way to answer the two questions I used the debugger for in the first place, namely:

How this function works
How the registers get manipulated

I would opt for it.

Frida

Frida is a dynamic instrumentation framework. It claims a portion of the running process’s memory for its agent and sets up a bi-directional communication channel with it. The Frida user can then run JavaScript snippets that execute inside the process’s memory, as the process itself.

Think of it as a cross between strace and Greasemonkey.

Furthermore, Frida can inline-hook the process’s functions in order to observe them, modify their behavior, or even completely replace a function’s implementation with another.

As we will see, the use cases for Frida are quite amazing:

Memory dumps
Fault injection
Tracing/manipulating methods in runtime

Quick Example

I’ll skip the installation step. It’s on the website.

Say you want to know where Spotify’s Linux application saves its config file.

frida-trace -i "open*" -f /usr/share/spotify/spotify
...
...
...
   616 ms  open(pathname="/home/cheese/.config/spotify/prefs.tmp", flags=0x241)
           /* TID 0x6836 */
   620 ms  open(pathname="/home/cheese/.cache/spotify/Browser/Local Storage/leveldb/LOG", flags=0x241)
   621 ms  open(pathname="/home/cheese/.cache/spotify/Browser/Local Storage/leveldb/LOCK", flags=0x2)
   621 ms  open(pathname="/home/cheese/.cache/spotify/Browser/Local Storage/leveldb/CURRENT", flags=0x0)
   621 ms  open(pathname="/home/cheese/.cache/spotify/Browser/Local Storage/leveldb/MANIFEST-000001", flags=0x0
...

Found it. The frida-trace tool is just a quick wrapper that writes the JavaScript code based on the input and the man pages, and then spins up Frida.
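The flags values in the trace are raw bitmasks, so it helps to decode them before drawing conclusions about which file is being written. Here is a small sketch that does that in plain JavaScript, assuming the Linux x86/x86-64 constants from fcntl.h (other platforms use different values). decodeOpenFlags is a helper name of my own, not part of frida-trace.

```javascript
'use strict';

// Linux x86/x86-64 values from <fcntl.h>; an assumption about the traced machine.
const ACCESS_MODES = ['O_RDONLY', 'O_WRONLY', 'O_RDWR'];
const FLAG_BITS = { O_CREAT: 0x40, O_EXCL: 0x80, O_TRUNC: 0x200, O_APPEND: 0x400 };

// Turn an open() flags bitmask into a readable "A|B|C" string.
function decodeOpenFlags(flags) {
    // The two lowest bits hold the access mode; 3 is not a valid mode.
    const names = [ACCESS_MODES[flags & 0x3] || 'O_ACCMODE(invalid)'];
    for (const [name, bit] of Object.entries(FLAG_BITS)) {
        if (flags & bit) {
            names.push(name);
        }
    }
    return names.join('|');
}

// The three distinct flags values seen in the trace above.
for (const flags of [0x241, 0x2, 0x0]) {
    console.log('0x' + flags.toString(16) + ' = ' + decodeOpenFlags(flags));
}
```

With those constants, 0x241 decodes to O_WRONLY|O_CREAT|O_TRUNC, which fits prefs.tmp being written from scratch, while the leveldb files are merely opened for reading or read/write.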

Let’s take the example of an Android application and include the bootstrapping steps.

Use Case #1: Tracing Android Java Functions

Bootstrap
Script
Utility traceMethod() function

In this example, we’ll trace an Android Java method. Refer to this link. I’ll just post the MainActivity class here:

1  public class MainActivity extends AppCompatActivity {
2  
3      @Override
4      protected void onCreate(Bundle savedInstanceState) {
5          super.onCreate(savedInstanceState);
6          setContentView(R.layout.activity_main);
7          Toolbar toolbar = (Toolbar) findViewById(R.id.toolbar);
8          setSupportActionBar(toolbar);
9  
10         // onClickListener for the only button in the app
11         FloatingActionButton fab = (FloatingActionButton) findViewById(R.id.fab);
12         fab.setOnClickListener(new View.OnClickListener() {
13             @Override
14             public void onClick(View view) {
15                 network_handler.sendPOSTRequest("{\"data\": \"aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d\", \"key\": \"" + getSecretKey() + "\"}");
16             }
17         });
18     }
19 
20     // The method we want to trace
21     public String getSecretKey() {
22         return "My-Super-secret-Key";
23     }
24 }

The app has just one button that, when clicked, runs the listener set up on lines 11 -> 17. network_handler sends a POST request containing some encrypted data along with the secret key. We’d like to extract that secret key.

Note: The process of bootstrapping Frida on an Android device is explained here. You’ll basically need to push the frida-server executable to your rooted device/emulator and run it.

Here’s what the extraction script (I call it agent.js) looks like:

// agent.js
'use strict';

Java.perform(function () {
    let Activity = Java.use("com.adjust.androidjniexample.MainActivity");
    Activity.getSecretKey.implementation = function () {
        send("getSecretKey() got called! Let's call the original implementation");

        let retval = this.getSecretKey.apply(this, arguments);
        console.log("\nretval: " + retval);
        return retval;
    };
});

We’ll run it with the Frida CLI while the app is running: frida --enable-jit -U com.my.app -l agent.js (com.my.app stands in for the target app).

The secret key will be printed on the screen. Done. That was much faster than working with JDWP.
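The same pattern generalizes into a small helper. What follows is one possible shape for a traceMethod() utility, offered as a sketch rather than a canonical implementation: it hooks every overload of a method, logs the arguments, calls the original the same way agent.js does, and logs the return value. The guard at the bottom is my own addition so the file only hooks when it is loaded inside Frida with a Java runtime available.

```javascript
'use strict';

// Hook every overload of className.methodName and log calls and return values.
function traceMethod(className, methodName) {
    const klass = Java.use(className);
    klass[methodName].overloads.forEach(function (overload) {
        overload.implementation = function () {
            const args = Array.prototype.slice.call(arguments);
            console.log(className + '.' + methodName + '(' + args.join(', ') + ')');

            // Same trick as in agent.js: from inside the replacement,
            // this call reaches the original implementation.
            const retval = this[methodName].apply(this, arguments);
            console.log('  => ' + retval);
            return retval;
        };
    });
}

// Only hook when running inside Frida against a process with a Java VM.
if (typeof Java !== 'undefined' && Java.available) {
    Java.perform(function () {
        traceMethod('com.adjust.androidjniexample.MainActivity', 'getSecretKey');
    });
}
```

Loading it works the same way as agent.js. Adding more traceMethod() calls inside Java.perform() traces more methods without repeating the boilerplate.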

One can also trace Android native functions without worrying about ASLR, since function addresses are resolved from the loaded modules at runtime. Check out Module.findExportByName() in Frida’s JavaScript API.
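A minimal sketch of such a native trace follows. It uses Module.findExportByName() as named above, in the static form that the Frida versions contemporary with this post provide. The library and function names (libnative-lib.so, getKey) are placeholders I made up, not taken from a real app, and reading arg0 as a pointer is an assumption about the function's signature.

```javascript
'use strict';

// Hypothetical target: replace with a real library and exported symbol.
const LIB_NAME = 'libnative-lib.so';
const FUNC_NAME = 'getKey';

// Attach to an exported native function and log entry and exit.
// Returns false if the export cannot be resolved in the running process.
function traceNative(libName, funcName) {
    // Resolved at runtime from the loaded module, so ASLR does not matter.
    const address = Module.findExportByName(libName, funcName);
    if (address === null) {
        console.log(funcName + ' not found in ' + libName);
        return false;
    }

    Interceptor.attach(address, {
        onEnter: function (args) {
            console.log(funcName + '() called, arg0=' + args[0]);
        },
        onLeave: function (retval) {
            console.log(funcName + '() returned ' + retval);
        }
    });
    return true;
}

// Only hook when running inside Frida.
if (typeof Interceptor !== 'undefined') {
    traceNative(LIB_NAME, FUNC_NAME);
}
```

Unlike Interceptor.replace(), Interceptor.attach() leaves the original function in place and only observes it, which is usually what you want for a first look.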

Note: The --enable-jit parameter is not necessary, but it makes Frida use the V8 JS engine instead of Duktape. This allows me to use the latest ECMAScript niceties.

Use case #2: Fault Injection

Fault injection is any technique used to verify the fault tolerance of hardware or software. Even coffee makers have fault injection tests. Faults are usually injected through:

Pins of integrated circuits
Bursts of EMI (Electromagnetic interference)
Altered voltage levels
Or Frida :)

Below is a snippet that makes connect() fail with errno set to ECONNREFUSED whenever a connection to port 80 or 443 is attempted. More info here.

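'use strict';

// Linux values. Darwin/BSD differs (AF_INET6 = 30, ECONNREFUSED = 61), and its
// sockaddr starts with a length byte, so the family sits at offset 1 there.
const AF_INET = 2;
const ECONNREFUSED = 111;
const funcName = "connect";

function main() {
    const connect = new NativeFunction(Module.findExportByName(null, funcName), "int", ["int", "pointer", "int"]);
    // A regular function rather than an arrow function, so that `this` is the
    // invocation context Frida provides, which exposes errno.
    Interceptor.replace(connect, new NativeCallback(function (socket, address, addressLen) {
        // struct sockaddr_in: sin_family is a 16-bit value at offset 0
        const family = Memory.readU16(address);
        if (family !== AF_INET) {
            // Not IPv4: pass the call through untouched
            return connect(socket, address, addressLen);
        }

        // sin_port is stored in network byte order (big-endian) at offset 2
        const port = (Memory.readU8(address.add(2)) << 8) | Memory.readU8(address.add(3));

        // sin_addr: four bytes starting at offset 4
        let ip = '';
        for (let offset = 4; offset != 8; offset++) {
            if (ip.length > 0)
                ip += '.';
            ip += Memory.readU8(address.add(offset));
        }

        console.log(`connect() family=${family} ip=${ip} port=${port}`);
        if (port === 80 || port === 443) {
            console.log("Blocking...");
            this.errno = ECONNREFUSED;
            return -1;
        } else {
            console.log("Accepting...");
            return connect(socket, address, addressLen);
        }
    }, "int", ["int", "pointer", "int"]));
}

main();

console.log("ready");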

Credit for the sockaddr parsing logic goes to the creator of Frida, Mr. Ole André Vadla Ravnås, who showed it during R2Con18.
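The sockaddr parsing can also be tried outside of Frida on a plain byte array, which makes the offsets easier to reason about. The sketch below assumes the Linux struct sockaddr_in layout on a little-endian machine: a 16-bit family at offset 0, a big-endian port at offset 2, and the IPv4 address at offset 4. On BSD-derived systems offset 0 holds a length byte and the family sits at offset 1. parseSockaddrIn is a name I made up for illustration.

```javascript
'use strict';

const AF_INET = 2;

// Parse a Linux struct sockaddr_in held in a Uint8Array.
// Returns null when the address family is not IPv4.
function parseSockaddrIn(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

    // sin_family: host byte order, assumed little-endian here
    const family = view.getUint16(0, true);
    if (family !== AF_INET) {
        return null;
    }

    // sin_port: network byte order (big-endian)
    const port = view.getUint16(2, false);

    // sin_addr: four bytes, printed in dotted-quad notation
    const ip = Array.from(bytes.subarray(4, 8)).join('.');

    return { family: family, port: port, ip: ip };
}

// Hand-built sample: AF_INET, port 443 (0x01bb), address 127.0.0.1, 8 bytes of padding.
const sample = Uint8Array.from([0x02, 0x00, 0x01, 0xbb, 127, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]);
console.log(parseSockaddrIn(sample));
```

Inside a Frida agent the same offsets apply, with the bytes read from the address pointer instead of a Uint8Array.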

Extra Links

Pin is a very good instrumentation framework by Intel - Link
Android’s NDK team has quite a bit of documentation about tracing native functions for debugging purposes. It might not be very useful for a reverser, but it’s definitely worth the time if you’re debugging your own Android native functions. It’s basically running perf under the hood, from what I understand - Link - Link 2
Speaking of Perf! - Link
It’s also worth mentioning that the best and most up-to-date reference on Frida is the JavaScript API page and Mr. Ole’s releases page which is actually very readable and not cluttered. I learned about the --enable-jit trick from there.