CONFidence CTF 2015: Reversing 400 "Deobfuscate Me" writeup

A delegation of the H4x0rPsch0rr team spent a weekend in the awesome city of Kraków and participated in the even more awesome CONFidence CTF organized by the DragonSector team (thanks for the invitation!). Out of the numerous challenges that were opened up during the CTF, no team was able to solve a reversing task called “Deobfuscate Me” which was worth 400 points.

After the competition task authors presented their sample solutions which, in case of this task, consisted of incrementally measuring differences in the execution traces of the challenge depending on the user input. However, we wanted to show that it was also possible to leverage symbolic execution in combination with a SMT solver to solve the challenge. Our (somewhat tedious) approach is described in this writeup.

The challenge is a classical reversing task which simply asks the user for input, performs several checks on it and finally outputs whether the input was correct. Of course, the correct input would then be the flag.

julian@TPT440p ~/CTF/2015_05_CONFidence/rev400 % ./DeobfuscateMe.elf
Flag: DrgnS{I_have_no_idea_what_to_put_here}
Nope :(

As you can see from the dummy input the flag format was known a priori, a fact that decreased the time needed to solve several tasks substantially. The same applies to this challenge, as we will see later.

Firing up Hex-Rays yields the following straight-forward code for main:

int main(int argc, const char **argv, const char **envp)
{
    char uinput[64];

    printf("Flag: ");
    scanf("%63s", uinput);
    if (check(uinput) == 0)
        puts("Nice, that was the flag!");
    else
        puts("Nope :(");
    return 0;
}

Removing a first layer of obfuscation

The interesting part is the function check, where the binary verifies the user input. We quickly associate the name of the task with what we are presented with by check:

.text:080484E0 ; =============== S U B R O U T I N E ===============
.text:080484E0
.text:080484E0
.text:080484E0                 public _check
.text:080484E0 _check          proc far                ; CODE XREF: main+3C
.text:080484E0
.text:080484E0 arg_41B496AE    = dword ptr  41B496B6h
.text:080484E0
.text:080484E0                 xchg    edx, ebx
.text:080484E2                 xor     eax, 0FFFFFFFFh
.text:080484E5                 xchg    ebx, ecx
.text:080484E7                 jmp     loc_804EC31
<% --- snip --- %>
.text:0804865E loc_804865E:                            ; CODE XREF: _check+5D34
.text:0804865E                 or      edx, 0A869CBC9h
.text:08048664                 xchg    ebx, edx
.text:08048666                 jmp     loc_805ACEE
<% --- snip --- %>
.text:0804E211 loc_804E211:                            ; CODE XREF: _check+6754
.text:0804E211                 shl     edx, 1Bh
.text:0804E214                 jmp     loc_804865E
<% --- snip --- %>
.text:0804EC31 loc_804EC31:                            ; CODE XREF: _check+7
.text:0804EC31                 xchg    ecx, edx
.text:0804EC33                 xchg    eax, edx
.text:0804EC34                 jmp     loc_804E211
<% --- snip --- %>
.text:0805ACEE loc_805ACEE:                            ; CODE XREF: _check+186
.text:0805ACEE                 sub     ebx, ebp
.text:0805ACF0                 jmp     loc_805724C

The code is heavily scattered. Assuming that the chain of jmp instructions continues in this pattern and that all code snippets are position independent (i.e. do not reference constants or strings relative to their own position), this is nothing a small python script and Capstone cannot deal with:

#!/usr/bin/python2.7
from capstone import *
from capstone.x86 import *
md = Cs(CS_ARCH_X86, CS_MODE_32)
md.detail = True

c = open('DeobfuscateMe.elf', 'rb').read()

# virtual address base of the binary
vdiff = 0x08048000

# file offset of the target function
rip = 0x080484E0 - vdiff

# this buffer will store the function excluding the jumps
buf = b''

d = 0
op = [ None, None, None ]

while True:
    # disassemble one instruction
    i = md.disasm(c[rip:rip+16], rip + vdiff)
    i = next(i)

    # increase rip
    rip += i.size
    for x in range(len(i.operands)):
        op[x] = i.operands[x]

    # Follow jump, omit output to buf
    if i.id == X86_INS_JMP:
        rip = op[0].imm - vdiff
        continue
    # We're done, break
    elif i.id == X86_INS_RET:
        buf += i.bytes
        print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))
        break

    # Any other instruction is appended to the output buf
    buf += i.bytes
    print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))

# dump jmp-free version to file
open('deobfuscated.raw', 'wb').write(buf)

By quickly patching the deobfuscated blob into the original binary using dd and executing the patched binary we see that apparently we didn’t break it too badly:

julian@TPT440p ~/CTF/2015_05_CONFidence/rev400 % ./Deobfuscated.elf
Flag: DrgnS{I_still_have_no_clue}
Nope :(

40k instructions till complete failure

Okay, let’s reinvestigate the check function in IDA:

.text:080484E0                 public _check
.text:080484E0 _check          proc near               ; CODE XREF: main+3Cp
.text:080484E0
.text:080484E0 arg_4           = dword ptr  8
.text:080484E0
.text:080484E0                 xchg    edx, ebx
.text:080484E2                 xor     eax, 0FFFFFFFFh
.text:080484E5                 xchg    ebx, ecx
.text:080484E7                 xchg    ecx, edx
.text:080484E9                 xchg    eax, edx
.text:080484EA                 shl     edx, 1Bh
.text:080484ED                 or      edx, 0A869CBC9h
.text:080484F3                 xchg    ebx, edx
.text:080484F5                 sub     ebx, ebp
.text:080484F7                 add     ecx, eax
.text:080484F9                 sub     ebx, ebp
.text:080484FB                 xor     ebx, ebx
.text:080484FD                 add     ebx, eax
.text:080484FF                 xchg    ebx, edx
.text:08048501                 xchg    eax, ecx
.text:08048502                 xchg    edx, ebx
.text:08048504                 neg     ebx
.text:08048506                 add     eax, ebx
.text:08048508                 xchg    edx, ebx
.text:0804850A                 xchg    eax, ebx
.text:0804850B                 xor     edx, edx
.text:0804850D                 add     edx, 0FFBB4586h
.text:08048513                 sub     edx, 81622B0Fh
.text:08048519                 add     edx, 2276E2FAh
.text:0804851F                 xchg    eax, ecx
.text:08048520                 xor     edx, edx
.text:08048522                 xchg    ebx, ecx
.text:08048524                 xchg    eax, ebx
.text:08048525                 xchg    eax, ecx
.text:08048526                 xchg    ebx, ecx
.text:08048528                 sub     edx, 0D51EAFBFh
.text:0804852E                 xchg    edx, ecx
.text:08048530                 xchg    eax, edx
.text:08048531                 add     ecx, 0C6538058h
.text:08048537                 xchg    ebx, ecx
.text:08048539                 xchg    eax, ecx
.text:0804853A                 xchg    edx, ecx
.text:0804853C                 rol     ebx, 7
.text:0804853F                 xchg    eax, ecx
.text:08048540                 xchg    eax, ebx
.text:08048541                 xchg    edx, ecx
.text:08048543                 xchg    eax, edx
.text:08048544                 xchg    eax, ecx
.text:08048545                 push    ebp
.text:08048546                 xchg    eax, ebx
.text:08048547                 xchg    eax, ebx
.text:08048548                 xor     ebp, 0FFFFFFFFh
.text:0804854B                 xor     ebp, 0FFFFFFFFh
.text:0804854E                 add     ebp, 0A10FA35Fh
.text:08048554                 xchg    ebx, edx
.text:08048556                 add     ebp, 421A5649h
.text:0804855C                 sub     ebp, 6411B5D9h
.text:08048562                 add     ebp, 5F1EE010h
.text:08048568                 sub     ebp, 6CE7BCF1h
.text:0804856E                 add     ebp, 7BF9A346h
.text:08048574                 add     ebp, 1AE87B85h
<% --- snip --- %>
.text:08052103                 xchg    eax, ecx
.text:08052104                 xchg    ebx, edx
.text:08052106                 xchg    eax, ecx
.text:08052107                 mov     esi, edx
.text:08052109                 xchg    ebx, ecx
.text:0805210B                 neg     esi
.text:0805210D                 mov     ebp, esi
.text:0805210F                 neg     ebp
.text:08052111                 add     ebx, ebp
.text:08052113                 xchg    edx, ebx
.text:08052115                 xchg    ecx, ebx
.text:08052117                 xchg    eax, edx
.text:08052118                 xchg    eax, ebx
.text:08052119                 xchg    edx, ecx
.text:0805211B                 xchg    eax, edx
.text:0805211C                 xchg    eax, ecx
.text:0805211D                 xchg    eax, ebx
.text:0805211E                 xchg    edx, ebx
.text:08052120                 xchg    eax, edx
.text:08052121                 xchg    edx, ebx
.text:08052123                 mov     ebp, esi
.text:08052125                 and     ebp, 72EDEC3Ch
.text:0805212B                 xor     esi, 72EDEC3Ch
.text:08052131                 xor     esi, ebp
.text:08052133                 xor     edi, 0FFFFFFFFh
.text:08052136                 xchg    eax, ecx
.text:08052137                 xchg    edx, ebx
.text:08052139                 xchg    ebx, ecx
.text:0805213B                 xchg    ecx, ebx
.text:0805213D                 pop     ebp
.text:0805213E                 sub     edi, ecx
.text:08052140                 xchg    ecx, ebx
.text:08052142                 add     edi, ebx
.text:08052144                 retn
.text:08052144 _check          endp ; sp-analysis failed
.text:08052144
.text:08052144 ; -------------------------------------------------------

Aaaaargh. Even though we’ve seemingly successfully removed the jumps the code still is close to impossible to read. We see tons of xchg instructions, many of which are garbage (the actual user input resides on the stack, the function nevertheless starts to immediately read undefined registers), and lots of arithmetic operations using cryptic constants. The return value should in theory be written to eax in the function epilogue but, again, the xchg instructions make it impossible to track the final value back through the remaining 40k of machine code.

But hey, decompilers are quite advanced these days, so Hex-Rays’ liveliness analysis plus the built-in constant folding mechanism should remove most of the garbage automatically, right?

Well, to make things even worse, the small sp-analysis failed message at the bottom of the function tells us that there will be no decompiler to call for rescue as IDA hadn’t been able to reconstruct the stack frame of the function. Quickly inspecting the function yields tons of statements that assign arbitrary constants to both the ebp and the esp registers. Considering the 400 points of the challenge I’d allege that this is intentional to prevent an easy “decompiler-only” solution. At this point we considered rewriting all instructions that use either ebp or esp to use some other operand but as there is no spare register (we’re operating in 32bit mode and all e(abcd)x|e(sd)i registers are used by the calculation) a simple renaming of the offending instruction’s operands is also not feasible. Possibly replacing each instruction that uses ebp or esp by an opcode sequence that pushes one of the used registers and then uses fixed memory locations instead of ebp and esp would have worked but then simply relying on the hidden features of the Hex-Rays black box seemed like a pretty long shot. This was the point where we moved to a different challenge during the competition. Now we’re back to fulfill our duty.

Symbolic Execution and SMT solvers to the rescue

One property of the assembly code is extremely useful: The function is completely linear, i.e. after patching out (see above) the unconditional jmp instructions there are no instructions left that could alter the control flow of the check function in any way. This led to the following idea: Due to the linearity of the target function it can be seen as one giant basic block. As basic blocks are branch-free, the operations that the instructions contained by the block perform on the registers can be translated to one formula per register. Remember that the function needs to return 0 in eax for the main procedure to print the success message. Translating the whole block to a formula, it should be possible to solve the equation eax == 0 for the user-supplied input variables residing on the stack. Therefore, we need to take two steps to calculate the solution:

  • Translate the function to a formula using symbolic execution
  • Solve the resulting equation for eax to be zero using a SMT solver

That having said, we found OpenREIL on top of which Dmytro Oleksiuk wrote a simple symbolic execution engine which he used in combination with z3 to solve a crackme. Adopting Dmytro’s approach we came up with the following python script that uses OpenREIL and z3 to solve the challenge (continuation of the python script above):

from pyopenreil.REIL import *
from pyopenreil.utils import asm
from pyopenreil.symbolic import *
import pyopenreil.VM as VM

import struct
import z3

reader = ReaderRaw(ARCH_X86, buf)
tr = CodeStorageTranslator(reader)
bb = tr.get_bb(0)

cpu = Cpu(ARCH_X86)
abi = VM.Abi(cpu, tr, no_reset = True)

ret = 0x41414141
indata = 0x13370000

# IMPORTANT: Patch in the first 4 bytes of the (known) flag format to make the
# solution unique!
cpu.mem.store(indata, U32, Val(struct.unpack("<I", b'Drgn')[0], U32))
# Mark all other input dwords as symbolic
for i in range(1, 16):
    cpu.mem.store(indata + i * 4, U32, Val(exp = SymVal('FLAG_{}'.format(i), U32)))

# create stack with pointer to symbolic arguments for check()
stack = abi.pushargs([Val(indata)])

# dummy return address
stack.push(Val(ret))

# initialize emulator's registers (not needed but speeds up the execution)
cpu.reg('esp', Val(stack.top, U32))
cpu.reg('eax', Val(0, U32))
cpu.reg('ebp', Val(0, U32))
cpu.reg('ebx', Val(0, U32))
cpu.reg('ecx', Val(0, U32))
cpu.reg('edx', Val(0, U32))
cpu.reg('edi', Val(0, U32))
cpu.reg('esi', Val(0, U32))

# run until stop
try: cpu.run(tr, 0, stop_at = [0 + len(buf) - 1])
#except VM.CpuStop as e:
# solve in any case, even if an error occurs
except:

    print 'STOP at', hex(cpu.reg('eip').get_val())

    # get Z3 expressions list for current CPU state
    state = cpu.to_z3()

    # in theory we only need to solve for 'eax' but if execution ends
    # prematurely we at least obtain some parts of the flag
    # solving the equation isn't the most expensive operation in this script,
    # anyway
    for reg in ['eax', 'ebx', 'ecx' , 'edx', 'esi', 'edi', 'ebp']:
        print(reg)

        # create SMT solver
        solver = z3.Solver()

        # add constraint for return value
        solver.add(0 == cpu.reg(reg).val.to_z3(state, U32))
        check = solver.check()
        if check == z3.sat:
            model = solver.model()
            for s in sorted(list(model), key = lambda x: int(x.name()[len('FLAG_'):])):
                print(s, hex(model[s].as_long()))
            sys.stdout.write('Drgn')
            for s in sorted(list(model), key = lambda x: int(x.name()[len('FLAG_'):])):
                # prefix solution with the bytes we chose for FLAG_0
                try:
                    sys.stdout.write(struct.pack("<L", model[s].as_long()).decode('ascii'))
                except: pass

            print('')

And indeed, running the script finally yields the flag (notice the execution time, though):

STOP at 0xa1f0L
eax
(FLAG_1, '0x30797b53')
(FLAG_2, '0x68735f75')
(FLAG_3, '0x646c7530')
(FLAG_4, '0x7634685f')
(FLAG_5, '0x54695f33')
(FLAG_6, '0x3130735f')
(FLAG_7, '0x5f646576')
(FLAG_8, '0x685f7962')
(FLAG_9, '0x21646e34')
(FLAG_10, '0x7d2121')
DrgnS{y0u_sh0uld_h4v3_iT_s01ved_by_h4nd!!!}
ebx
ecx
edx

esi
edi
ebp
./solve.py  6748,65s user 3,27s system 99% cpu 1:52:35,37 total

Feed it into the original binary to confirm:

julian@TPT440p ~/CTF/2015_05_CONFidence/rev400 % ./DeobfuscateMe.elf
Flag: DrgnS{y0u_sh0uld_h4v3_iT_s01ved_by_h4nd!!!}
Nice, that was the flag!