Featured image of post Using Koi for Symbolic Execution to Solve CrackMes and Find Vulnerabilities

Using Koi for Symbolic Execution to Solve CrackMes and Find Vulnerabilities

Koi is a C++ extension to the Triton dynamic binary analysis library that provides convenience features for symbolic execution. By hooking functions and instructions, the heap and stack can be dynamically tracked for a large range of cyber related tasks.

Backgroud

After a long lull without using symbolic execution, I decided to polish my skills by poking around with some binaries to see what I could do. My usual tool for this would be the Binary Analysis Platform (BAP), but it is hardly maintained these days. The suggested developer environment becomes more and more difficult to setup as its dependencies fall out of support.

This left me looking for a new “Goldilocks” binary symbolic executor; a tool that wraps boilerplate code but doesn’t over-automate everything, provides granular logging but only when it’s requested, and makes it easy to identify branch conditions. The tool most people would use is Angr, but it falls on the over-automation end. Another option I had explored in the past was Triton, but it suffers from the opposite problem, where only the most basic functionality is ready out of the box.

This led to my development of Koi, a C++ extension to the Triton that provides convenience features for symbolic execution.

Koi Capabilities

The biggest hurdle to getting started with Triton is that it does not provide a way to map an executable file into memory. Koi handles this by implementing a minimally viable mapper, which currently works on ELF files.

In addition to mapping the file to memory, the state of the heap and stack are dynamically tracked during symbolic execution.

Swimmer

Koi utilizes a Swimmer to perform symbolic exploration on a memory pool from a defined source to a defined target. All satisfiable paths of code are explored until a maximum depth has been hit, one instruction has been executed too many times, or the target is reached. The constraints to follow each branch point are tracked so that the swimmer can provide a model on how to reach the target.

To define a Swimmer for a file is as simple as creating an object.

#include <Koi/swimmer.h>

const size_t MAIN = 0x1010d0;
const size_t TARGET = 0x1012dd;

int main(int argc, char *argv[]) {
    Swimmer swimmer = Swimmer("/path/to/my/elfFile");
    swimmer.setPc(MAIN);
    swimmer.explore(TARGET);
}

Hooks

While exploring, instruction hooks can be used to execute your code after the instruction has been executed. This allows for inspection and modification of the Swimmer state during exploration.

Alternatively, function hooks will override calls in the binary code to execute your code instead. This is ideal for implementing library functions or skipping functions to stub known side effects.

#include <Koi/swimmer.h>
#include <Koi/bait.h>

const size_t MAIN = 0x1010d0;
const size_t TARGET = 0x1012dd;

void getReturnValue(Swimmer *s, triton::arch::Instruction insn) {
    auto rax = s->registers.x86_rax;
    triton::uint64 ptr = triton::uint64(s->getConcreteRegisterValue(rax));

    std::cout << "ptr addr  = 0x" << std::hex << key_ptr << std::dec << std::endl;
    std::cout << "ptr value = "   << s->readString(key_ptr) << std::endl;
}

int main(int argc, char *argv[]) {
    Swimmer swimmer = Swimmer("/path/to/my/elfFile");
    swimmer.setPc(MAIN);
    
    swimmer.hookFunction(0x10134a, koi_fgets);
    swimmer.hookInstruction(0x10134d, getReturnValue);
    
    swimmer.explore(TARGET);
}

Bait

To alleviate some workload, pre-developed function hooks–called bait–are provided. These are intended to perform the original purpose of the function while respecting any concrete values or symbolic constraints. It is most beneficial to call bait functions after performing your own hook. See the example in Hooks for how the fgets() bait was used.

Injection

To override an instruction, instruction injection can be used. This executes the new instruction and ignores the original instruction. This is a way to test patching without commiting it to a binary file.

A varient of instruction injection that goes beyond patching is jump condition injection. This allows for a more robust model to be created when there are branch points that are constrained by the return value of a function. This is a difficult to grapple functionality, but extrememly powerful. Several examples can be found in the examples directoruy of the Koi project, which are too long to include here.

Dead End

Sometimes a path exists that is satisfiable, but is of no interest. In this case, Swimmers that reach this known address can be killed to speed up exploration.

Creating a Key Generator

To make the most out of Koi, a CrackMe that requires a semi-complex key generator was selected. For this, I chose Switching CrackMe by cat_puzzler. The difficulty level is relatively low, but one comment makes mention of Z3, which is exactly what Koi uses for symbolic solving.

Static Analysis

To write the C++ code for Koi, static analysis must be used.

First, the start and target addresses must be chosen. The main function is an easy choice to begin symbolic exploration, allowing us to skip _start and __libc_start_main, which have parameters that are more complicated than main. As for the target, the printing of the string “The password was correct” can be found at 0x10141A, making it an obvious target. Some challenges include red herrings, but that is not the case here.

Image 1: Symbolic execution target that prints a positive message.

Next, several dead ends can immediately be found. The libc function exit is clearly a dead end, as well as one function whose only purpose is to print a failure message and call exit(). Likewise, there are two functions that check some condition: check_id_xor and check_id_sum. On failure, -1 is returned. Therefore, if the return of -1 is reached, exploration can be cut short, since it is assumed that we want these checks to pass.

Image 2: One dead end, where an error message is printed and execution is exited.

Since this challenge requires a parameter to be passed as a program argument in argv, this value needs to be set in the C++ Koi code as well. This is as simple as choosing a dummy address for argv and a dummy string that it points to. The following will seed the program such that argv = 0xBA5EBA11 and argv[1] = 0xDEADBEEF. Later, however, this memory region will be symbolized since it contains the key.

swimmer.setConcreteRegisterValue(swimmer.registers.x86_rsi, 0xBA5EBA11);
swimmer.setConcreteMemoryAreaValue(0xBA5EBA11 + 8, { 0xEF, 0xBE, 0xAD, 0xDE });

It is trivial to find that argv[1] is later copied to a global variable at 0x104040. The aforementioned checks are performed on this copied value, not argv[1]. As luck would have it, there is a bait function that can be used to stub strcpy, making it an extremely easy function hook to add.

There are also calls to strlen, but a simple function hook can be added to stub some desired length. Through static reverse engineering, the shortest valid key length is 5 characters, so let’s do 6 to show off.

Constraint Building

Through static reverse engineering, four constraints were discovered:

  • The return of check_id_xor should be non-negative.
  • The return of check_id_sum should be non-negative.
  • The key must contain at least five characters (we have chosen six).
  • Though not previously mentioned, the key must contain only numeric characters (excluding the last null byte).

Fortunately, the process of symbolic exploration should automatically handle the calls to check_id_xor and check_id_sum, since the dead ends have been defined for failing paths. This means that there is no need to understand the key validation at all, making the process to create a key generator signficantly easier.

Key Length

Starting with the key length, the location in the disassembled code performs a jump into an error if the length is less than five. So, after symbolizing the global variable that stores the key, the first five bytes of the key are constrained to 0x00 (null). Then, all constraints are chained using a logical OR. Lastly, the constraint chain is injected at the jump address. In other words, if byte[0] is null, or byte[1] is null, etc. up to byte[4], then the jump to an error is performed. We want to the jump to fail, so our logic sounds backwards, but is ultimately correct.

Image 3: Location of the desired failed jump condition.

With the minimum length handled, the actual length of the key must be handled as well. This one is more simple than the previous, since it is known that we only want to the last byte to be null. So, after creating that constraint, it is injected as a jump condition that acts as a way of enforcing the stubbed length of 6.

Image 4: Location of the desired successed jumped condition.

All Numeric ASCII Characters

All characters in a valid key are strictly 0-9, or [0x30, 0x39] in ASCII. At this phase, the input is already symbolized, so we only need to add a constraint to each byte to clamp it in that range. The constraints are the chained together using a logical AND since all bytes must meet that criteria.

Image 5: Location of the numeric ASCII check in decompiled code.

Unlike the key length requirements, there is no need to inject a jump condition here. Instead, the value of RAX and RBX can be made equal, which will successfully avoid the call to error.

Image 6: The corresponding disassembled code that checks for RAX and RBX equality.

Results

With all the preperation done, the final Koi test can be used to symbolically explore the challenge. This provides three keys for use in the challenge, which successfully print the desired positive message.

Image 7: Successful output of the positive message using three generated keys

The source code for this example is available on my GitHub.

Finding Memory Violations

While entertaining, CrackMes and CTFs are usually in a different realm than vulnerability discovery. A weakness is not always as immediately obvious as a message that says “Correct Key Entered!” or “Try Again.” Instead, a deeper understanding must be had to spot what is wrong and where. Symbolic exploration is a useful tool in automation, but still requires some up front static analysis to get everything started.

Input File

Take the following, trivial code as an example for a memory violation known as a “double free.”

#include <stdlib.h>
int main(int argc, char *argv[]) {
    void *buf = malloc(4);
    free(buf);
    free(buf);
    return 0;
}

This is an exceedingly simple program, and most static analyzers will pick up on the mistake. However, in practice, symbolic execution will be able to identify double free weaknesses spanning across a wider surface than usual static anaylsis tools.

Koi Test

By using the Koi bait for malloc (koi_malloc), it is simple to hook function calls to malloc to add new buffers to the dynamically tracked heap as symbolic memory.

Likewise, a koi_free bait exists, but does not check for a double free weakness. However, in only one function call, it can be determined if an address is currently tracked as the head of the buffer in the dynamically tracked heap. So, by creating a custom function hook for free which calls koi_free, we can have the best of both worlds: checking for a double free while also letting Koi do the heavy lifting for the actual memory management.

Image 8: Output of a successfully detected double free weakness.

The source code for this example is available on my Github along with more challenging memory violation weaknesses, including use after free, string overflow, and invalid pointers.

Built with Hugo
Theme Stack designed by Jimmy