Boston Key Party 2015: Reversing 300 "Community College" / "fredkin"

Examining the unknown

Community College
I can’t follow what’s going on here. Can you? : 300

As always with CTF challanges, let’s run file on what we’re given:

file fredkin fredkin: ELF 32-bit LSB executable, MIPS, MIPS-II version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=2204905041c5ddc8a13ab816a6421daf0870b6a0, stripped

MIPS? Ooooh, well … Firing up IDA and loading the binary we’re greeted by a relatively complex main function calling several other functions (who’d have guessed this?). The following is the prologue:

# int __cdecl main(int argc, const char **argv, const char **envp)
.globl main

addiu   $sp, -0x70
sw      $ra, 0x70+saved_ra($sp)
sw      $fp, 0x70+saved_fp($sp)
sw      $s0, 0x70+saved_s0($sp)
move    $fp, $sp
li      $gp, 0x4235C0
sw      $gp, 0x70+gp_base($sp)
sw      $a0, 0x70+argc($fp)
sw      $a1, 0x70+argv($fp)
lw      $v1, 0x70+argc($fp)
li      $v0, 2
beq     $v1, $v0, loc_404A14
or      $at, $zero

We immediately are able to spot several ABI concepts we’re familiar with from other machine architectures: The $sp register is a stack pointer which gets decremented at the beginning to make space for local variables. After saving some registers on the stack, the frame pointer gets initialized to point at the top of the current stack frame to be able to reference local variables relatively to a fixed address (rather than having to worry about the current value of $sp). Nothing fancy here. The sixth instruction loads the $gp register with an offset which is later used to point into the global offset table where all externally imported functions reside. The value 0x4235c0 thus will be important once we worry about determining what library functions are called by the program. The offset is then stored in a variable gp_base on the stack for later use. The next part shows part of the calling convention used by gcc on MIPS: Function arguments are passed in the $aX registers, which are immediately saved in the local stack frame by the callee to make them persist nested function calls. We can see how argc and argv (always the first and second the second argument of main) are written into the stack frame and how argc is checked being equal to 2. If equality holds, the processor jumps to loc_404a14 omming the assembly block that informs us about the proper usage of the program (./fredkin flag).

lw      $v0, 0x70+argv($fp)
addiu   $v0, 4
lw      $v0, 0($v0)
move    $a0, $v0
lw      $v0, -0x7D54($gp)
move    $t9, $v0
jalr    $t9              # strlen
or      $at, $zero
lw      $gp, 0x70+gp_base($fp)
sw      $v0, 0x70+len($fp)
sw      $zero, 0x70+i($fp)
j       loc_404A7C
or      $at, $zero

This block loads argv[1] into $a0 and passes it as an argument to strlen. How do I know about the latter? Looking up the symbol at virtual address 0x4235c0-0x7d54 (gp_base - offset used by the third lw instruction) gives us the name of the library function. We’ll use this technique from now on to resolve the names of all calls to libraries. After strlen returns, the result is stored on the stack and a counter is initialized to zero using the constant zero register $zero.

One slight eye-catching oddity: Notice the senseless or instructions behind each jump? These are delay slots, instructions that are executed without “seeing” the effects of the preceeding jump instructions and being skipped upon return. Weird to still encounter them in 2k15.

0x404A48        loc_404A48:
0x404A48        lw      $v0, 0x70+argv($fp)
0x404A4C        addiu   $v0, 4
0x404A50        lw      $v1, 0($v0)
0x404A54        lw      $v0, 0x70+i($fp)
0x404A58        addu    $v0, $v1, $v0
0x404A5C        lb      $v0, 0($v0)
0x404A60        move    $a0, $v0
0x404A64        jal     decompose_into_bits
0x404A68        or      $at, $zero
0x404A6C        lw      $gp, 0x70+gp_base($fp)
0x404A70        lw      $v0, 0x70+i($fp)
0x404A74        addiu   $v0, 1
0x404A78        sw      $v0, 0x70+i($fp)
0x404A7C        loc_404A7C:
0x404A7C        lw      $v1, 0x70+i($fp)
0x404A80        lw      $v0, 0x70+len($fp)
0x404A84        sltu    $v0, $v1, $v0
0x404A88        bnez    $v0, loc_404A48
0x404A8C        or      $at, $zero

What we can see here is the code calling decompose_into_bitlist with each character of argv[1] as argument until the loop counter is equal to the aforementioned value of strlen(argv[1]). Of course the name of the function was not present in the original binary, but the function works as you’d expect and decomposes the argument into a C++ list of bits (highest bit is MSB) represented as 1 and 0 integers. We’ll therefore omit the details of the decompose_into_bitlist function. The pointer to the first element of the resulting list is stored at a location we will refer to as uinput_bitlist (0x41b8d0). Afterwards execution continues at address 0x404A90.

0x404A90        addiu   $v0, $fp, 0x70+wire_vector
0x404A94        move    $a0, $v0
0x404A98        lw      $v0, -0x7FB4($gp)
0x404A9C        move    $t9, $v0
0x404AA0        jalr    $t9              # std::vector<Wires,std::allocator<Wires>>::vector(void)
0x404AA4        or      $at, $zero
0x404AA8        lw      $gp, 0x70+gp_base($fp)
0x404AAC        sw      $zero, 0x70+j($fp)
0x404AB0        j       loc_404C50
0x404AB4        or      $at, $zero

These lines initialize a second loop counter j and also allocate a C++ vector object which should contain elements of type Wire. Interesting. The following loop body seems huge but still not too complicated. We will break it down into smaller chunks in our analysis:

0x404AB8        loc_404AB8:
0x404AB8        lui     $v0, 0x42
0x404ABC        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404AC0        lw      $v0, -0x7F8C($gp)
0x404AC4        move    $t9, $v0
0x404AC8        jalr    $t9              # std::list<int,std::allocator<int>>::size(void)
0x404ACC        or      $at, $zero
0x404AD0        lw      $gp, 0x70+gp_base($fp)
0x404AD4        sltiu   $v0, 3
0x404AD8        andi    $v0, 0xFF
0x404ADC        beqz    $v0, loc_404B38
0x404AE0        or      $at, $zero

This checks if the list has at least 3 elements left. If this is true, it jumps to loc_404B38 which we’ll cover in a second. Let’s examine the case in which the size of the list is less or equal to 2 elements.

0x404AE4        addiu   $v0, $fp, 0x70+new_wire
0x404AE8        move    $a0, $v0
0x404AEC        move    $a1, $zero
0x404AF0        move    $a2, $zero
0x404AF4        move    $a3, $zero
0x404AF8        lw      $v0, -0x7FD0($gp)
0x404AFC        move    $t9, $v0
0x404B00        jalr    $t9              # Wires::Wires(int,int,int)
0x404B04        or      $at, $zero
0x404B08        lw      $gp, 0x70+gp_base($fp)
0x404B0C        addiu   $v1, $fp, 0x70+wire_vector
0x404B10        addiu   $v0, $fp, 0x70+new_wire
0x404B14        move    $a0, $v1
0x404B18        move    $a1, $v0
0x404B1C        lw      $v0, -0x7FA4($gp)
0x404B20        move    $t9, $v0
0x404B24        jalr    $t9              # std::vector<Wires,std::allocator<Wires>>::push_back(Wires const&)
0x404B28        or      $at, $zero
0x404B2C        lw      $gp, 0x70+gp_base($fp)
0x404B30        j       loc_404C44
0x404B34        or      $at, $zero

The code calls the constructor of the class Wires with three zero arguments and appends the resulting object to the vector we named wire_vector. Similarly, if the remaining number of elements in uinput_bits is sufficient (i.e. > 2), the code pops three elements from the front of the list and uses them as arguments to the constructor of the Wires class:

0x404B38        loc_404B38:
0x404B38        lui     $v0, 0x42
0x404B3C        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404B40        lw      $v0, -0x7FC4($gp)
0x404B44        move    $t9, $v0
0x404B48        jalr    $t9              # std::list<int,std::allocator<int>>::front(void)
0x404B4C        or      $at, $zero
0x404B50        lw      $gp, 0x70+gp_base($fp)
0x404B54        lw      $v0, 0($v0)
0x404B58        sw      $v0, 0x70+bit1($fp)
0x404B5C        lui     $v0, 0x42
0x404B60        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404B64        lw      $v0, -0x7FBC($gp)
0x404B68        move    $t9, $v0
0x404B6C        jalr    $t9              # std::list<int,std::allocator<int>>::pop_front(void)
0x404B70        or      $at, $zero
0x404B74        lw      $gp, 0x70+gp_base($fp)
0x404B78        lui     $v0, 0x42
0x404B7C        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404B80        lw      $v0, -0x7FC4($gp)
0x404B84        move    $t9, $v0
0x404B88        jalr    $t9              # std::list<int,std::allocator<int>>::front(void)
0x404B8C        or      $at, $zero
0x404B90        lw      $gp, 0x70+gp_base($fp)
0x404B94        lw      $v0, 0($v0)
0x404B98        sw      $v0, 0x70+bit2($fp)
0x404B9C        lui     $v0, 0x42
0x404BA0        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404BA4        lw      $v0, -0x7FBC($gp)
0x404BA8        move    $t9, $v0
0x404BAC        jalr    $t9              # std::list<int,std::allocator<int>>::pop_front(void)
0x404BB0        or      $at, $zero
0x404BB4        lw      $gp, 0x70+gp_base($fp)
0x404BB8        lui     $v0, 0x42
0x404BBC        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404BC0        lw      $v0, -0x7FC4($gp)  # std::list<int,std::allocator<int>>::front(void)
0x404BC4        move    $t9, $v0
0x404BC8        jalr    $t9
0x404BCC        or      $at, $zero
0x404BD0        lw      $gp, 0x70+gp_base($fp)
0x404BD4        lw      $v0, 0($v0)
0x404BD8        sw      $v0, 0x70+bit3($fp)
0x404BDC        lui     $v0, 0x42
0x404BE0        addiu   $a0, $v0, (uinput_bitlist - 0x420000)
0x404BE4        lw      $v0, -0x7FBC($gp)
0x404BE8        move    $t9, $v0
0x404BEC        jalr    $t9              # std::list<int,std::allocator<int>>::pop_front(void)
0x404BF0        or      $at, $zero
0x404BF4        lw      $gp, 0x70+gp_base($fp)
0x404BF8        addiu   $v0, $fp, 0x70+var_34
0x404BFC        move    $a0, $v0
0x404C00        lw      $a1, 0x70+bit1($fp)
0x404C04        lw      $a2, 0x70+bit2($fp)
0x404C08        lw      $a3, 0x70+bit3($fp)
0x404C0C        lw      $v0, -0x7FD0($gp)
0x404C10        move    $t9, $v0
0x404C14        jalr    $t9              # Wires::Wires(int,int,int)
0x404C18        or      $at, $zero
0x404C1C        lw      $gp, 0x70+gp_base($fp)
0x404C20        addiu   $v1, $fp, 0x70+wire_vector
0x404C24        addiu   $v0, $fp, 0x70+var_34
0x404C28        move    $a0, $v1
0x404C2C        move    $a1, $v0
0x404C30        lw      $v0, -0x7FA4($gp)
0x404C34        move    $t9, $v0
0x404C38        jalr    $t9              # std::vector<Wires,std::allocator<Wires>>::push_back(Wires const&)
0x404C3C        or      $at, $zero
0x404C40        lw      $gp, 0x70+gp_base($fp)
0x404C44 loc_404
0x404C44        lw      $v0, 0x70+j($fp)
0x404C48        addiu   $v0, 1
0x404C4C        sw      $v0, 0x70+j($fp)
0x404C50 loc_404
0x404C50        lw      $v0, 0x70+j($fp)
0x404C54        sltiu   $v0, 0x38
0x404C58        bnez    $v0, loc_404AB8
0x404C5C        or      $at, $zero

The aforementioned process of turning uinput_bits into a vector of Wires objects is repeated until there are 0x38 elements in the vector. This most likely hints us towards the length of the flag: For the vector to reach length 56 (assuming no padding), we need to feed the binary 168 bits (21 bytes) of data. Anything beyond that limit will simply be ignored.