In this task, we’re given a core dump of an x86-64
ELF file that has crashed due to a SIGSEGV
. Our job is to investigate on the reason of the SIGSEGV
and reconstruct the actual program flow calculating a flag.
The reason of the segfault is quite obvious, as gdb
tells us:
RAX 0x555555554e9a ◂— inc dword ptr [rax]
RBX 0x555555554e9d ◂— leave
RCX 0x7fffffffec70 ◂— 0x6c69665f73696874 ('this_fil')
RDX 0x7fffffffec94 ◂— 0xdc7ab200791dc20e
RDI 0x7fffffffec70 ◂— 0x6c69665f73696874 ('this_fil')
RSI 0x7fffffffec90 ◂— 0x791dc20eefa46567
R8 0x7ffff7fd7740 ◂— 0x7ffff7fd7740
R9 0x13
R10 0x7ffff7fd7740 ◂— 0x7ffff7fd7740
R11 0x246
R12 0x5555555549d0 ◂— xor ebp, ebp
R13 0x7fffffffed90 ◂— 0x5
R14 0x0
R15 0x0
RBP 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— push r15
RSP 0x7fffffffec38 —▸ 0x5555555552aa ◂— mov eax, 0
RIP 0x555555554e9a ◂— inc dword ptr [rax]
[────────────────────────DISASM───────────────────────────]
► 0x555555554e9a inc dword ptr [rax]
Currently, rax
points to non-writable memory within the program’s executable segment causing the inc
instruction to fail. Clearly, the execution flow should not end up here.
Fortunately, IDA supports coredumps, and so we can start our analysis right away. After a bit of digging, we recognize the main
function of the program at address 0x555555554FF7
. Interestingly, Hex-Rays refuses to decompile the function, as the stack frame is not cleaned up properly at the (alleged) function end. Furthermore, the pointers of imported glibc
functions in the GOT section point to invalid memory. Nevertheless, common sense lets us reconstruct them by statically looking at the function call sequences, the parameters and the contents of the strtab
:
/tmp/askjaneaboutthewhale/ps-sec
/tmp/askjaneaboutthewhale/ps-sec
/tmp/askjaneaboutthewhale/ps-sec
/lib/x86_64-linux-gnu/libdl-2.26.so
/lib/x86_64-linux-gnu/libdl-2.26.so
/lib/x86_64-linux-gnu/libc-2.26.so
/lib/x86_64-linux-gnu/libc-2.26.so
/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
/lib/x86_64-linux-gnu/ld-2.26.so
/lib/x86_64-linux-gnu/ld-2.26.so
/lib/x86_64-linux-gnu/ld-2.26.so
ELF
/lib64/ld-linux-x86-64.so.2
GNU
GNU
!V4
libcrypto.so.1.0.0
_ITM_deregisterTMCloneTable
__gmon_start__
_ITM_registerTMCloneTable
MD5
libc.so.6
strcpy
exit
sprintf
fopen
strncpy
__stack_chk_fail
fgets
strlen
fseek
fclose
malloc
fwrite
atoi
__cxa_finalize
strcmp
__libc_start_main
OPENSSL_1.0.0
GLIBC_2.4
GLIBC_2.2.5
Thusly, after cleaning up the binary and declaring function arguments to IDA and Hex-Rays, the PLT looks like the following
load:00005555555548C0 printf proc near ; CODE XREF: load_noise+8Fp
load:00005555555548C0 ; get_alignment+31p
load:00005555555548C0 ; get_alignment+6Bp
load:00005555555548C0 ; get_alignment+A3p
load:00005555555548C0 ; get_alignment+D7p
load:00005555555548C0 ; calculate_flag+68p
load:00005555555548C0 ; calculate_flag+B0p
load:00005555555548C0 ; main+2Bp
load:00005555555548C0 ; main+42p
load:00005555555548C0 ; main+76p
load:00005555555548C0 ; main+291p
load:00005555555548C0 000 jmp cs:qword_555555755F58
load:00005555555548C0 printf endp
load:00005555555548C0
load:00005555555548C6
load:00005555555548C6 ; =============== S U B R O U T I N E =======================================
load:00005555555548C6
load:00005555555548C6
load:00005555555548C6 sub_5555555548C6 proc near
load:00005555555548C6
load:00005555555548C6 ; FUNCTION CHUNK AT load:00005555555548B0 SIZE 0000000C BYTES
load:00005555555548C6
load:00005555555548C6 000 push 0
load:00005555555548CB 008 jmp loc_5555555548B0
load:00005555555548CB sub_5555555548C6 endp
load:00005555555548CB
load:00005555555548D0
load:00005555555548D0 ; =============== S U B R O U T I N E =======================================
load:00005555555548D0
load:00005555555548D0 ; Attributes: thunk
load:00005555555548D0
load:00005555555548D0 ; int __fastcall fseek(void *file, void *a2, int a3)
load:00005555555548D0 fseek proc near ; CODE XREF: load_noise+49p
load:00005555555548D0 000 jmp cs:qword_555555755F60
load:00005555555548D0 fseek endp
load:00005555555548D0
load:00005555555548D6
load:00005555555548D6 ; =============== S U B R O U T I N E =======================================
load:00005555555548D6
load:00005555555548D6
load:00005555555548D6 sub_5555555548D6 proc near
load:00005555555548D6 000 push 1
load:00005555555548DB 008 jmp loc_5555555548B0
load:00005555555548DB sub_5555555548D6 endp
load:00005555555548DB
load:00005555555548E0
load:00005555555548E0 ; =============== S U B R O U T I N E =======================================
load:00005555555548E0
load:00005555555548E0 ; Attributes: noreturn thunk
load:00005555555548E0
load:00005555555548E0 exit proc near ; CODE XREF: main+4Cp
load:00005555555548E0 ; main+80p
load:00005555555548E0 000 jmp cs:qword_555555755F68
load:00005555555548E0 exit endp
load:00005555555548E0
load:00005555555548E6
load:00005555555548E6 ; =============== S U B R O U T I N E =======================================
load:00005555555548E6
load:00005555555548E6
load:00005555555548E6 sub_5555555548E6 proc near
load:00005555555548E6 000 push 2
load:00005555555548EB 008 jmp loc_5555555548B0
load:00005555555548EB sub_5555555548E6 endp
load:00005555555548EB
load:00005555555548F0
load:00005555555548F0 ; =============== S U B R O U T I N E =======================================
load:00005555555548F0
load:00005555555548F0 ; Attributes: thunk
load:00005555555548F0
load:00005555555548F0 ; int __fastcall MD5(void *a1, void *a2, void *a3)
load:00005555555548F0 MD5 proc near ; CODE XREF: MD5_hexlify+4Cp
load:00005555555548F0 000 jmp cs:qword_555555755F70
load:00005555555548F0 MD5 endp
load:00005555555548F0
load:00005555555548F6
load:00005555555548F6 ; =============== S U B R O U T I N E =======================================
load:00005555555548F6
load:00005555555548F6
load:00005555555548F6 sub_5555555548F6 proc near
load:00005555555548F6 000 push 3
load:00005555555548FB 008 jmp loc_5555555548B0
load:00005555555548FB sub_5555555548F6 endp
load:00005555555548FB
load:0000555555554900
load:0000555555554900 ; =============== S U B R O U T I N E =======================================
load:0000555555554900
load:0000555555554900 ; Attributes: thunk
load:0000555555554900
load:0000555555554900 ; void *__fastcall malloc(_QWORD)
load:0000555555554900 malloc proc near ; CODE XREF: load_noise+53p
load:0000555555554900 000 jmp cs:qword_555555755F78
load:0000555555554900 malloc endp
load:0000555555554900
load:0000555555554906
load:0000555555554906 ; =============== S U B R O U T I N E =======================================
load:0000555555554906
load:0000555555554906
load:0000555555554906 sub_555555554906 proc near
load:0000555555554906 000 push 4
load:000055555555490B 008 jmp loc_5555555548B0
load:000055555555490B sub_555555554906 endp
load:000055555555490B
load:0000555555554910
load:0000555555554910 ; =============== S U B R O U T I N E =======================================
load:0000555555554910
load:0000555555554910 ; Attributes: thunk
load:0000555555554910
load:0000555555554910 ; FILE *__fastcall fopen(const char *name, const char *mode)
load:0000555555554910 fopen proc near ; CODE XREF: load_noise+22p
load:0000555555554910 ; calculate_flag+CBp
load:0000555555554910 000 jmp cs:qword_555555755F80
load:0000555555554910 fopen endp
load:0000555555554910
load:0000555555554916
load:0000555555554916 ; =============== S U B R O U T I N E =======================================
load:0000555555554916
load:0000555555554916
load:0000555555554916 sub_555555554916 proc near
load:0000555555554916 000 push 5
load:000055555555491B 008 jmp loc_5555555548B0
load:000055555555491B sub_555555554916 endp
load:000055555555491B
load:0000555555554920
load:0000555555554920 ; =============== S U B R O U T I N E =======================================
load:0000555555554920
load:0000555555554920 ; Attributes: thunk
load:0000555555554920
load:0000555555554920 ; char *__cdecl fgets(char *s, int size, FILE *stream)
load:0000555555554920 fgets proc near ; CODE XREF: load_noise+72p
load:0000555555554920 000 jmp cs:qword_555555755F88
load:0000555555554920 fgets endp
load:0000555555554920
load:0000555555554926
load:0000555555554926 ; =============== S U B R O U T I N E =======================================
load:0000555555554926
load:0000555555554926
load:0000555555554926 sub_555555554926 proc near
load:0000555555554926 000 push 6
load:000055555555492B 008 jmp loc_5555555548B0
load:000055555555492B sub_555555554926 endp
load:000055555555492B
load:0000555555554930
load:0000555555554930 ; =============== S U B R O U T I N E =======================================
load:0000555555554930
load:0000555555554930 ; Attributes: thunk
load:0000555555554930
load:0000555555554930 ; size_t __fastcall strlen(_QWORD)
load:0000555555554930 strlen proc near ; CODE XREF: MD5_hexlify+36p
load:0000555555554930 ; calculate_flag+F8p
load:0000555555554930 ; main+5Fp
load:0000555555554930 ; main+F6p
load:0000555555554930 ; main+137p
load:0000555555554930 000 jmp cs:qword_555555755F90
load:0000555555554930 strlen endp
load:0000555555554930
load:0000555555554936
load:0000555555554936 ; =============== S U B R O U T I N E =======================================
load:0000555555554936
load:0000555555554936
load:0000555555554936 sub_555555554936 proc near
load:0000555555554936 000 push 7
load:000055555555493B 008 jmp loc_5555555548B0
load:000055555555493B sub_555555554936 endp
load:000055555555493B
load:0000555555554940
load:0000555555554940 ; =============== S U B R O U T I N E =======================================
load:0000555555554940
load:0000555555554940 ; Attributes: thunk
load:0000555555554940
load:0000555555554940 ; int sprintf(char *str, const char *format, ...)
load:0000555555554940 sprintf proc near ; CODE XREF: MD5_hexlify+87p
load:0000555555554940 ; eat_cpu+B2p
load:0000555555554940 000 jmp cs:qword_555555755F98
load:0000555555554940 sprintf endp
load:0000555555554940
load:0000555555554946
load:0000555555554946 ; =============== S U B R O U T I N E =======================================
load:0000555555554946
load:0000555555554946
load:0000555555554946 sub_555555554946 proc near
load:0000555555554946 000 push 8
load:000055555555494B 008 jmp loc_5555555548B0
load:000055555555494B sub_555555554946 endp
load:000055555555494B
load:0000555555554950
load:0000555555554950 ; =============== S U B R O U T I N E =======================================
load:0000555555554950
load:0000555555554950 ; Attributes: thunk
load:0000555555554950
load:0000555555554950 ; unsigned __int8 *__cdecl atoi(void *a1)
load:0000555555554950 atoi proc near ; CODE XREF: main+A1p
load:0000555555554950 ; main+B7p
load:0000555555554950 000 jmp cs:qword_555555755FA0
load:0000555555554950 atoi endp
load:0000555555554950
load:0000555555554956
load:0000555555554956 ; =============== S U B R O U T I N E =======================================
load:0000555555554956
load:0000555555554956
load:0000555555554956 sub_555555554956 proc near
load:0000555555554956 000 push 9
load:000055555555495B 008 jmp loc_5555555548B0
load:000055555555495B sub_555555554956 endp
load:000055555555495B
load:0000555555554960
load:0000555555554960 ; =============== S U B R O U T I N E =======================================
load:0000555555554960
load:0000555555554960 ; Attributes: noreturn thunk
load:0000555555554960
load:0000555555554960 stack_chk_fail proc near ; CODE XREF: MD5_hexlify+A6p
load:0000555555554960 ; eat_cpu+ECp
load:0000555555554960 ; calculate_flag+151p
load:0000555555554960 ; main+2C7p
load:0000555555554960 000 jmp cs:qword_555555755FA8
load:0000555555554960 stack_chk_fail endp
load:0000555555554960
load:0000555555554966
load:0000555555554966 ; =============== S U B R O U T I N E =======================================
load:0000555555554966
load:0000555555554966
load:0000555555554966 sub_555555554966 proc near
load:0000555555554966 000 push 0Ah
load:000055555555496B 008 jmp loc_5555555548B0
load:000055555555496B sub_555555554966 endp
load:000055555555496B
load:0000555555554970
load:0000555555554970 ; =============== S U B R O U T I N E =======================================
load:0000555555554970
load:0000555555554970 ; Attributes: thunk
load:0000555555554970
load:0000555555554970 ; char *__fastcall strstr(const char *haystack, const char *needle)
load:0000555555554970 strstr proc near ; CODE XREF: calculate_flag+94p
load:0000555555554970 000 jmp cs:qword_555555755FB0
load:0000555555554970 strstr endp
load:0000555555554970
load:0000555555554976
load:0000555555554976 ; =============== S U B R O U T I N E =======================================
load:0000555555554976
load:0000555555554976
load:0000555555554976 sub_555555554976 proc near
load:0000555555554976 000 push 0Bh
load:000055555555497B 008 jmp loc_5555555548B0
load:000055555555497B sub_555555554976 endp
load:000055555555497B
load:0000555555554980
load:0000555555554980 ; =============== S U B R O U T I N E =======================================
load:0000555555554980
load:0000555555554980 ; Attributes: thunk
load:0000555555554980
load:0000555555554980 ; void *__cdecl strcpy(void *a1, char *a2)
load:0000555555554980 strcpy proc near ; CODE XREF: main+207p
load:0000555555554980 000 jmp cs:qword_555555755FB8
load:0000555555554980 strcpy endp
load:0000555555554980
load:0000555555554986
load:0000555555554986 ; =============== S U B R O U T I N E =======================================
load:0000555555554986
load:0000555555554986
load:0000555555554986 sub_555555554986 proc near
load:0000555555554986 000 push 0Ch
load:000055555555498B 008 jmp loc_5555555548B0
load:000055555555498B sub_555555554986 endp
load:000055555555498B
load:0000555555554990
load:0000555555554990 ; =============== S U B R O U T I N E =======================================
load:0000555555554990
load:0000555555554990 ; Attributes: thunk
load:0000555555554990
load:0000555555554990 ; void __fastcall fclose(void *a1)
load:0000555555554990 fclose proc near ; CODE XREF: calculate_flag+13Cp
load:0000555555554990 000 jmp cs:qword_555555755FC0
load:0000555555554990 fclose endp
load:0000555555554990
load:0000555555554996
load:0000555555554996 ; =============== S U B R O U T I N E =======================================
load:0000555555554996
load:0000555555554996
load:0000555555554996 sub_555555554996 proc near
load:0000555555554996 000 push 0Dh
load:000055555555499B 008 jmp loc_5555555548B0
load:000055555555499B sub_555555554996 endp
load:000055555555499B
load:00005555555549A0
load:00005555555549A0 ; =============== S U B R O U T I N E =======================================
load:00005555555549A0
load:00005555555549A0 ; Attributes: thunk
load:00005555555549A0
load:00005555555549A0 ; void __fastcall strncat(void *a1, void *a2, int a3)
load:00005555555549A0 strncat proc near ; CODE XREF: eat_cpu+D7p
load:00005555555549A0 000 jmp cs:qword_555555755FC8
load:00005555555549A0 strncat endp
load:00005555555549A0
load:00005555555549A6
load:00005555555549A6 ; =============== S U B R O U T I N E =======================================
load:00005555555549A6
load:00005555555549A6
load:00005555555549A6 sub_5555555549A6 proc near
load:00005555555549A6 000 push 0Eh
load:00005555555549AB 008 jmp loc_5555555548B0
load:00005555555549AB sub_5555555549A6 endp
load:00005555555549AB
load:00005555555549B0
load:00005555555549B0 ; =============== S U B R O U T I N E =======================================
load:00005555555549B0
load:00005555555549B0 ; Attributes: thunk
load:00005555555549B0
load:00005555555549B0 fwrite proc near ; CODE XREF: calculate_flag+ECp
load:00005555555549B0 ; calculate_flag+113p
load:00005555555549B0 ; calculate_flag+130p
load:00005555555549B0 000 jmp cs:qword_555555755FD0
load:00005555555549B0 fwrite endp
load:00005555555549B0
load:00005555555549B6
load:00005555555549B6 ; =============== S U B R O U T I N E =======================================
load:00005555555549B6
load:00005555555549B6
load:00005555555549B6 sub_5555555549B6 proc near
load:00005555555549B6 000 push 0Fh
load:00005555555549BB 008 jmp loc_5555555548B0
load:00005555555549BB sub_5555555549B6 endp
load:00005555555549BB
load:00005555555549C0
load:00005555555549C0 ; =============== S U B R O U T I N E =======================================
load:00005555555549C0
load:00005555555549C0 ; Attributes: thunk
load:00005555555549C0
load:00005555555549C0 sub_5555555549C0 proc near ; CODE XREF: sub_555555554A90+1Ep
load:00005555555549C0 000 jmp cs:qword_555555755FF8
load:00005555555549C0 sub_5555555549C0 endp
load:00005555555549C0
load:00005555555549C0 ; ---------------------------------------------------------------------------
Okay, great. Now we can start looking into the binary in more detail. All relevant functions apart from main
can be fed into Hex-Rays and are rather concise:
// virtual address 0x555555554ADA
unsigned int __fastcall lcg(unsigned int *a1)
{
*a1 = 214013 * *a1 + 2531011;
return (*a1 >> 16) & 0x7FFF;
}
// virtual address 0x555555554B0A
signed __int64 __fastcall jump_forward_lcg(unsigned int *lcgstate)
{
__int64 v1; // rdx@1
signed __int64 result; // rax@1
__int64 vars0; // [sp+8h] [bp+0h]@0
v1 = (unsigned __int8)lcg(lcgstate);
result = vars0 + 8;
*(_QWORD *)(vars0 + 8) += v1;
return result;
}
// virtual address 0x555555554B3B
void __fastcall load_noise(__int64 vault, unsigned int *state0, __int64 state1)
{
FILE *fp; // ST28_8@1
unsigned int v5; // eax@1
fp = fopen((const char *)vault, mode);
v5 = lcg(state0);
fseek(fp, (void *)(signed int)v5, 0);
magic_flag_contents = (char *)malloc(130LL);
fgets(magic_flag_contents, 128, fp);
jump_forward_lcg((unsigned int *)state1);
printf("The noise has been loaded\n", 128LL);
}
What we can see here is that load_noise
, the first function not from the standard library called by main reads in 128 bytes from a file vault.txt
specified on the command line and stores it into a malloc
ed buffer of size 130. The lcg
function is a Linear Congruential Generator with 32 bit internal state producing 15-bit pseudo random numbers (hence the challenge name pseudo secure) which is then used as a seek offset within the vault file. Fortunately, the core dump contains the heap and therefore the contents of the magic_flag_contents
buffer. We also found the FILE
struct corresponding to vault.txt
on the heap, including the first 4096 bytes of its content. From this we learned that the LCG must have returned 0, as the seek offset observable from the FILE
struct is currently 128, meaning it was a 0 prior to the fgets
.
The function jump_forward_lcg
is delicate: It changes the value of the saved return address on the stack by adding a 8-bit value obtained from the LCG earlier. This sounds already a bit awkward and might be related to the SIGSEGV
we saw earlier.
Even worse, the initial LCG state is obtained from argv[1]
and argv[2]
, which unfortunately have been censored to XXXXXXXXXX
and therefore removed from the coredump. We therefore are a bit stuck because we cannot follow the control flow when leaving the function load_noise
as an arbitrary, unknown offset has been added to the return pointer. Darn.
Following the control flow from the entry point of the binary seems to be a dead end for now. Hence we started looking at the current program state in the coredump. Here, we notice that the address contained on the top of the stack points to 0x0x5555555552aa
: An instruction directly behind an indirect call. The code sequence around the indirect call looks like follows:
load:0000555555555241 078 call lcg
load:0000555555555246 078 mov ecx, eax
load:0000555555555248 078 mov edx, 7E07E07Fh
load:000055555555524D 078 mov eax, ecx
load:000055555555524F 078 imul edx
load:0000555555555251 078 sar edx, 5
load:0000555555555254 078 mov eax, ecx
load:0000555555555256 078 sar eax, 1Fh
load:0000555555555259 078 sub edx, eax
load:000055555555525B 078 mov eax, edx
load:000055555555525D 078 mov edx, eax
load:000055555555525F 078 shl edx, 6
load:0000555555555262 078 add edx, eax
load:0000555555555264 078 mov eax, ecx
load:0000555555555266 078 sub eax, edx
load:0000555555555268 078 cdqe
load:000055555555526A 078 lea rdx, [rax+1Ch]
load:000055555555526E 078 lea rax, alignment_and_noise
load:0000555555555275 078 add rax, rdx
load:0000555555555278 078 mov [rbp+var_48], rax
load:000055555555527C 078 lea rdi, aHoldYourBreath ; "Hold your breath..\n"
load:0000555555555283 078 mov eax, 0
load:0000555555555288 078 call printf
load:000055555555528D 078 lea rax, [rbp+flagfile]
load:0000555555555291 078 lea rdx, [rax+24h]
load:0000555555555295 078 lea rax, [rbp+flagfile]
load:0000555555555299 078 lea rsi, [rax+20h]
load:000055555555529D 078 lea rcx, [rbp+flagfile]
load:00005555555552A1 078 mov rax, [rbp+var_48]
load:00005555555552A5 078 mov rdi, rcx
load:00005555555552A8 078 call rax
load:00005555555552AA
load:00005555555552AA loc_5555555552AA: ; DATA XREF: load:00007FFFFFFFEC38
load:00005555555552AA 078 mov eax, 0
This is very interesting. What we can see here is that the LCG output is added to a function pointer (which we named alignment_and_noise
). Even though Hex-Rays is not available, this is about the equivalent functionality of the assembly piece above:
addr = (char *)alignment_and_noise + (int)lcg(&state0) % 65 + 0x1C;
printf("Hold your breath..\n");
((void (*)(char *, unsigned int *, int *))addr)(flagfile, &state0, &state1);
Assuming that the LCG output was wrong, we might very well end up at a wrong address and crash the program. The alignment_and_noise
function is at address 0x555555554E51
, the program crashed at 0x555555554e9a
. Subtracting the two values yields 0x49
. Even more importantly, the internal state of the LCG (state0
) is passed as an argument to the function, which means that in the core dump, at the time of the crash, the second and third parameter (rsi
and rdx
) must hold the current internal states of the LCG!
A quick calculation confirms this hypothesis. From gdb
we obtain
pwndbg> x/1xw $rsi
0x7fffffffec90: 0xefa46567
pwndbg> x/1xw $rdx
0x7fffffffec94: 0x791dc20e
where the jump distance depends on the value returned by the LCG’s last invocation (0xefa46567 >> 16 & 0x7fff = 0x6fa4
) modulo 65 plus 0x1c
. And indeed 0x6fa % 65 + 0x1c = 0x49
!
This means we’ve successfully reconstructed the (wrong) jump target from the LCG state. Copying the LCG implementation into a python script and doing a 32 bit brute force search also tells us that the state preceding the current LCG state was 0x00745874
(as 214013 * 0x00745874 + 2531011 & 0xffffffff = 0xefa46567
, the current value of state0
). This value immediately looks suspicious for the trained eye of a reverse engineer: It’s printable ASCII with a NULL termination reading tXt\x00
. A quick view into the disassembly prior to the LCG invocation tells us that the code performs roughly the following steps:
strcpy(flagfile, argv[4])
char *tmp = &flagfile[strlen(flagfile)];
*(int *)tmp = 'tXt.';
tmp[4] = 0;
The value of argv[4]
is this_file_will_contain_the_flag
, according to gdb
, and, looking further up in the code to find the stack setup, we can see that the flagfile
buf is only 32 bytes in length, and is directly followed by state0
. So the program crash occurred because the poor user entered a file name not small enough to fit into the buffer. Then the program forcefully appended ".txt\x00"
to the output file name, effectively overwriting the value of state0
leading to the LCG output being wrong eventually causing the program to crash. Phew.
Well. Now we know better. But this is terrible, as we now don’t know the actual internal state of the LCG prior to being overwritten. Poking around in the dump in IDA we finally find the function which looked as if it should be the correct target of the indirect jump: At address 0x555555554E9F
we find a function that first zeroes out some array on the stack and then calls printf("This may take a while..\n");
. We quickly rewrote this function (and all the called subfunctions) in C:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <openssl/md5.h>
unsigned char flag[0x100] =
{
0x0B, 0x7A, 0x64, 0x33, 0x42, 0xE0, 0x9C, 0x1D, 0xCC, 0x2E,
0x68, 0xCB, 0x23, 0x46, 0x66, 0xB8, 0x15, 0xAE, 0x77, 0xC3,
0x63, 0xE5, 0x51, 0x70, 0xA5, 0x0D, 0xE1, 0xF6, 0xE5, 0xC1,
0x32, 0xF3, 0xF9, 0xB7, 0x7F, 0x35, 0xF3, 0xD0, 0x5A, 0x96,
0xFB, 0x34, 0x46, 0x14, 0x76, 0x63, 0x53, 0x5C, 0x4B, 0x7C,
0xA8, 0x7F, 0x24, 0xDA, 0xB5, 0xC4, 0x63, 0x25, 0xF3, 0xAA,
0x12, 0xCA, 0x43, 0x63, 0xE1, 0x2B, 0x8E, 0x1A, 0x3E, 0xDB,
0xC6, 0xDA, 0x49, 0x64, 0x76, 0x16, 0x66, 0xC6, 0x37, 0x91,
0xBE, 0x91, 0xF4, 0xA3, 0x2C, 0xC0, 0xD8, 0xCC, 0xF3, 0x2A,
0xFC, 0xB2, 0xDC, 0x17, 0x57, 0xA0, 0x38, 0xEF, 0x5A, 0x3F,
0x5E, 0x63, 0xB2, 0x59, 0xB8, 0x13, 0x9C, 0x2F, 0x23, 0xA3,
0x59, 0xF4, 0x13, 0x97, 0x39, 0xCF, 0xBC, 0xA9, 0x15, 0xC5,
0x07, 0x7E, 0x83, 0x43, 0x72, 0xDA, 0x0C, 0x7A, 0x60, 0xB5,
0xA9, 0x64, 0x2E, 0x2C, 0xF9, 0x9F, 0x6A, 0x56, 0xAD, 0xC8,
0x36, 0x82, 0x18, 0xA1, 0xCE, 0x2C, 0x86, 0x4A, 0x8F, 0xFB,
0xAA, 0x8B, 0xD8, 0x70, 0xAF, 0xED, 0x82, 0x42, 0x40, 0xF9,
0x42, 0x9D, 0x88, 0x81, 0x83, 0x30, 0x97, 0x7B, 0x47, 0xCD,
0xA8, 0x9B, 0xEF, 0x40, 0x6E, 0x53, 0x2B, 0xC7, 0x61, 0x64
};
unsigned int lcg(unsigned int *state)
{
*state = 214013 * *state + 2531011;
return (*state >> 16) & 0x7FFF;
}
void MD5_hexlify(unsigned char *in, char *out)
{
size_t len = 0;
unsigned int i = 0;
unsigned char md[0x10] = { 0 };
MD5(in, strlen(in), md);
for (i = 0; i <= 15; i++)
sprintf(&out[2 * i], "%02x", md[i]);
}
void eat_cpu(char *res, unsigned int *s0)
{
unsigned int i = 0;
unsigned char hash[0x30] = { 0 };
flag[127] = 0;
printf("%d\n", strlen(flag));
MD5_hexlify(flag, hash);
for (i = 0; i <= 99999999; ++i) {
MD5_hexlify(hash, hash);
sprintf(&hash[0x20], "-%.8x", lcg(s0));
}
strncat(res, hash, 0x63);
}
int main(int argc, char **argv)
{
char result[0x100] = { 0 };
if (argc != 2) {
puts("Missing argument.");
return -1;
}
unsigned int s0 = strtoul(argv[1], NULL, 0);
eat_cpu(result, &s0);
printf("Flag:{");
printf("%s", result);
printf("}\n");
fflush(stdout);
}
This seems to invoke the MD5
hash function and the LCG with state0
one houndred million times, and encloses the result with Flag:{
and }
. (btw, OOO, why this inconsistency? Wouldn’t this have been trivial to change to OOO{
}
?)
So our task now becomes clear: Reconstruct the overwritten state0
variable, and do it precisely (even on 2018 hardware the hashing takes about 2 minutes).
The LCG has $2^{32}$ internal states. Clearly, doing a bruteforce search is infeasible in the given time frame. So we have to look for constraints that we know the LCG state musst satisfy.
Coming back to the indirect jump, assuming we want to hit the flag calculation function, we know that the last output of the LCG must be 0x555555554E9F-0x555555554E51 = 0x4e
. This reduces the search space by a factor of $2^8$ there are still too many values to try.
state1
LCG state state1
governs the behavior of the jump_forward_lcg
function, and has not been overwritten. Therefore, we were able to reconstruct the LCG sequence for state1
: 0x00002a45 -> 0x8a2f73f4 -> 0x584112e7 -> 0x791dc20e
meaning that whenever jump_forward_lcg
was called, it skipped 0x2f -> 0x41 -> 0x1d
(in forward order) bytes in the outer function. To our surprise, this actually is consistent with the assembly contained in the binary, with the jumps being
0x55555555515E -> 0x55555555518D (0x2f)
skipping the bogus return from main
0x5555555551A8 -> 0x5555555551E9 (0x41)
skipping some bogus part saying “this is the flag”
0x555555554E80 -> 0x555555554E9D (0x1d)
skipping around a magic constant saying 0xff0011
( = fool?)
Nopping out these ranges, for the first time made Hex-Rays decompile the main
function! Back then, we knew we were on track! :)
int __cdecl main(int argc, const char **argv, const char **envp)
{
char *v3; // rax@13
char *v4; // ST28_8@13
int result; // eax@13
__int64 v6; // rbx@13
const char **argv_cpy; // [sp+0h] [bp-70h]@1
int i; // [sp+1Ch] [bp-54h]@7
int j; // [sp+1Ch] [bp-54h]@10
char flagfile[32]; // [sp+30h] [bp-40h]@13
unsigned int state0; // [sp+50h] [bp-20h]@7
int state1; // [sp+54h] [bp-1Ch]@7
__int64 v13; // [sp+58h] [bp-18h]@1
v13 = *MK_FP(__FS__, 40LL);
printf("Thanks for choosing Ps Security\n", argv, argv);
if ( argc <= 4 )
{
printf("Not enough parameters\n");
exit(1LL);
}
if ( strlen(argv_cpy[4]) > 0x1F )
{
printf("Filename too long\n");
exit(1LL);
}
state0 = 0;
state1 = 0;
state0 = (unsigned __int64)atoi((void *)argv_cpy[1]);
state1 = (unsigned __int64)atoi((void *)argv_cpy[2]);
for ( i = 0; i < strlen(argv_cpy[1]); ++i )
argv_cpy[1][i] = 'x';
for ( j = 0; j < strlen(argv_cpy[2]); ++j )
argv_cpy[2][j] = 'x';
load_noise((__int64)argv_cpy[3], &state0, (__int64)&state1);
alignment_and_noise((__int64)&state0, (__int64)&state1);
strcpy(flagfile, argv[4])
v3 = &flagfile[strlen(flagfile)];
*(_DWORD *)v3 = 'tXt.';
v3[4] = 0;
v4 = (char *)alignment_and_noise + (signed int)lcg(&state0) % 65 + 0x1C;// 0x00745874 -> 0xefa46567
printf("Hold your breath..\n");
((void (__fastcall *)(char *, unsigned int *, int *))v4)(flagfile, &state0, &state1);
result = 0;
v6 = *MK_FP(__FS__, 40LL) ^ v13;
return result;
}
state0
Back to our main problem of finding state0
we now have at least the precise original control flow the program took prior to crashing. The main
function calls the alignment_and_noise
function which in turn calls a get_alignment
function (0x555555554BD2
) which invokes the LCG many times until it returns zero:
void __cdecl get_alignment(__int64 a1, __int64 a2)
{
unsigned int *s1; // [sp+0h] [bp-20h]@1
unsigned int *s0; // [sp+8h] [bp-18h]@1
int v4; // [sp+14h] [bp-Ch]@1
int v5; // [sp+18h] [bp-8h]@1
v4 = 0;
v5 = 0;
printf("Waiting for the proper alignment:\n ", a2, a1);
while ( lcg(s0) )
{
if ( !(++v4 % 50) )
{
printf(".");
if ( !(++v5 % 50) )
printf("\n ");
}
}
jump_forward_lcg(s1); // 0x584112e7 -> 0x791dc20e
printf("\n");
}
Remember how the LCG output with state0
was used as a seek offset earlier? As explained earlier we know that this was a zero, and here the LCG is called on state0
until it returns 0 again. This means that we additionally know that there must be two zero outputs with a particular distance. If we only knew that distance …
Looking again at the get_alignment
function we realize that it actually gives away the magnitude of the number of LCG invocations! For every 50 LCG calls it prints one “.” to stdout. For every 50 * 50 LCG calls it writes out a newline character. If we only had stdout
output …
And it turns out we are lucky again! Reconstructing glibc
’s FILE
structs we find the following:
0000555555757260 48 6F 6C 64 20 79 6F 75 72 20 62 72 65 61 74 68 Hold your breath
0000555555757270 2E 2E 0A 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E ................
0000555555757280 2E 2E 0A 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E 2E ................
0000555555757290 2E 2E 2E 2E 2E 2E 0A 00 00 00 00 00 00 00 00 00 ................
This output could only be produced if the LCG was called 50 * 50 times, producing four spaces, 50 dots and a newline (in total 55 characters), after being overwritten by four spaces, 30 dots and a newline, which is in turn overwritten by the Hold your breath..
announcement of the binary directly before the indirect jump. Unfortunately to us, we still have to many possible answers to this, as we cannot tell how many times the first (50 * 50) linewrap was triggered. What we derive from the observations above is that the distance separating two zero LCG return values has to follow the formula 50 * x + 30. If we only could precisely determine this x …
And it turns out we are lucky once more! After staring at the fragments of the function frames on the stack contained in the coredump
00:0000│ 0x7fffffffebb8 —▸ 0x5555557588b0 ◂— 0x1d9ce04233647a0b
01:0008│ 0x7fffffffebc0 —▸ 0x7fffffffec94 ◂— 0xdc7ab200791dc20e
02:0010│ 0x7fffffffebc8 —▸ 0x7fffffffebe0 —▸ 0x7fffffffec10 —▸ 0x7fffffffec30 —▸ 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— ...
03:0018│ 0x7fffffffebd0 —▸ 0x555555554b22 ◂— movzx eax, al
04:0020│ 0x7fffffffebd8 —▸ 0x7fffffffec94 ◂— 0xdc7ab200791dc20e
05:0028│ 0x7fffffffebe0 —▸ 0x7fffffffec10 —▸ 0x7fffffffec30 —▸ 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— ...
06:0030│ 0x7fffffffebe8 —▸ 0x555555554cae ◂— nop
07:0038│ 0x7fffffffebf0 —▸ 0x7fffffffec94 ◂— 0xdc7ab200791dc20e
08:0040│ 0x7fffffffebf8 —▸ 0x7fffffffec90 ◂— 0x791dc20eefa46567
09:0048│ 0x7fffffffec00 ◂— 0x55554b22 /* '"KUU' */
0a:0050│ 0x7fffffffec08 ◂— 0x1000000e6
0b:0058│ 0x7fffffffec10 —▸ 0x7fffffffec30 —▸ 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— push r15
0c:0060│ 0x7fffffffec18 —▸ 0x555555554e9d ◂— leave
0d:0068│ 0x7fffffffec20 —▸ 0x7fffffffec94 ◂— 0xdc7ab200791dc20e
0e:0070│ 0x7fffffffec28 —▸ 0x7fffffffec90 ◂— 0x791dc20eefa46567
0f:0078│ 0x7fffffffec30 —▸ 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— push r15
10:0080│ rsp 0x7fffffffec38 —▸ 0x5555555552aa ◂— mov eax, 0
11:0088│ 0x7fffffffec40 —▸ 0x7fffffffed98 —▸ 0x7fffffffef43 ◂— 0x6b73612f706d742f ('/tmp/ask')
12:0090│ 0x7fffffffec48 ◂— 0x500f0b5ff
13:0098│ 0x7fffffffec50 ◂— 0xc2
14:00a0│ 0x7fffffffec58 ◂— 0xaffffec86
15:00a8│ 0x7fffffffec60 ◂— 0x1
16:00b0│ 0x7fffffffec68 —▸ 0x555555554e9a ◂— inc dword ptr [rax]
17:00b8│ rcx rdi 0x7fffffffec70 ◂— 0x6c69665f73696874 ('this_fil')
18:00c0│ 0x7fffffffec78 ◂— 0x635f6c6c69775f65 ('e_will_c')
19:00c8│ 0x7fffffffec80 ◂— 0x745f6e6961746e6f ('ontain_t')
1a:00d0│ 0x7fffffffec88 ◂— 0x2e67616c665f6568 ('he_flag.')
1b:00d8│ rsi rdx-4 0x7fffffffec90 ◂— 0x791dc20eefa46567
1c:00e0│ 0x7fffffffec98 ◂— 0x853f88dcdc7ab200
1d:00e8│ 0x7fffffffeca0 —▸ 0x7fffffffed90 ◂— 0x5
1e:00f0│ 0x7fffffffeca8 ◂— 0x0
1f:00f8│ rbp 0x7fffffffecb0 —▸ 0x5555555552d0 ◂— push r15
20:0100│ 0x7fffffffecb8 ◂— 0x7ffff75d51c1
21:0108│ 0x7fffffffecc0 —▸ 0x7fffffffedc8 —▸ 0x7fffffffefa4 ◂— 'COLUMNS=136'
22:0110│ 0x7fffffffecc8 —▸ 0x7fffffffed98 —▸ 0x7fffffffef43 ◂— 0x6b73612f706d742f ('/tmp/ask')
23:0118│ 0x7fffffffecd0 ◂— 0x500000000
24:0120│ 0x7fffffffecd8 —▸ 0x555555554ff7 ◂— push rbp
25:0128│ 0x7fffffffece0 ◂— 0x0
26:0130│ 0x7fffffffece8 ◂— 0x7ef90cea2d1b8aca
27:0138│ 0x7fffffffecf0 —▸ 0x5555555549d0 ◂— xor ebp, ebp
we spotted the answer to the million dollar question clandestinely hiding at offset 0x7fffffffec08
: 0xe6 = 230 = 50 * 4 + 30
. This is the value v5
contained in the frame of the get_alignment
function. We therefore have a second, strong constraint on the state0
variable!
To summarize, we now look for two zero outputs of the LCG separated by at least 230 * 50 and at most 231 * 50 (otherwise there would have been another dot on stdout) steps. Additionally, the value after the second zero must satisfy (((((nxt >> 16) & 0x7fff) % 65) + 0x1c) == 0x4e)
. With this knowledge, our mathemagician wrote a Sage script
#!/usr/bin/env sage
R = Zmod(2**32)
a = R(214013)
b = R(2531011)
good = list(range(2**16)) + list(range(2**31, 2**31+2**16))
for n in range(230*50, 231*50+1):
A = a**n
B = sum(a**i*b for i in range(n))
f = lambda x: int(A*x+B)
t = lambda v: (int(v)>>16)&0x7fff
for x0 in good:
xn = f(x0)
nxt = int(a*xn+b)
if t(xn) == 0 and t(nxt) % 65 + 0x1c == 0x4e:
print '{:5d} {:08x} {:08x} {:08x}'.format(n, x0, xn, nxt)
producing the outputs 0x1c6158d4
, 0x374b2e90
, and 0x6ea5b54d
. Feeding the first value into the flag calculation function from above yields the flag Flag:{d589ec6f60145e9d6cdb750f74d24ccd-00001a5f}
.