0CTF Quals 2019: "babysponge" writeup

This challenge consisted of finding a collision for Keccak with a very small capacity parameter.

The only really relevant part of the task’s source code is:

        return CompactFIPS202.Keccak(1552, 48, bytearray(msg), 0x06, 32)

When given two distinct msgs such that this call returns the same hash value, the server will hand out the flag.

We immediately see that the capacity parameter is set to only 48 bits, which implies finding an inner collision for the sponge function takes only about 224 calls to Keccak-f.

The Keccak function processes a given message m0m1 consisting of two 194-byte blocks m0,m1 by initializing a 200-byte state s0 with all zeroes, then computing s1:=f(s0(m0006)) where 006 denotes a byte string of six null bytes. Here f is Keccak-f, an invertible transformation of 200-byte strings whose specifics don’t matter for this challenge. The next absorption step is similar; it sets s2:=f(s1(m1006)).

If one manages to find two different such messages m0m1 and m0m1 such that s2=s2, then the rest of the computations in the hash will also be identical and we found a collision. Here, m0m1 and m0m1 are called an inner collision. The evident strategy to search for m0m1 and m0m1 is to take random m0 until the last 6 bytes of s1 collide; let m0,m0 be two inputs that make this happen, hence s1[6:]=s1[6:]. Then m1 and m1 may be set to the first 194 bytes of s1 resp. s1, such that the byte strings thrown into f during the computation of s2 become equal: Letting cap:=s1[6:]=s1[6:], we get s2=f(s1(s1[:194]006))=f(00194s1[6:])=f(00194cap)s2=f(s1(s1[:194]006))=f(00194s1[6:])=f(00194cap)

We wrote a C++ program to search for an inner collision efficiently, using keccak-tiny.c for the keccakf() subroutine:

#include <iostream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <unordered_map>
#include <thread>
#include <mutex>

const constexpr size_t c = 6, r = 194;
typedef std::array<char,200> state_t;

extern void keccakf(void *);   /* keccak-tiny.c */

static state_t randomize()
{
    state_t st {0};
    static std::ifstream urandom("/dev/urandom", std::ios::binary);
    if (!urandom.read(&st[0], r)) throw;
    return st;
}

static void dump(state_t const& st)
{
    std::cout << std::hex << std::setfill('0');
    for (auto c: st) std::cout << std::setw(2) << (c & 0xff);
    std::cout << std::endl;
}

std::mutex mtx;
void search()
{
    std::unordered_map<std::string,state_t> tab;
    state_t st = randomize();
    while (tab.size() < (1 << 22)) {    /* ~ 1 GB */
        std::fill(st.begin() + r, st.end(), 0);
        state_t in = st;
        keccakf(&st);
        std::string cap(st.begin() + r, st.end());
        if (tab.find(cap) != tab.end()) {
            std::lock_guard lck(mtx);
            dump(tab[cap]);
            dump(in);
            exit(0);
        }
        tab[cap] = in;
    }
}

int main()
{
    std::vector<std::thread> ts;
    while (true) {
        while (ts.size() < 4)
            ts.push_back(std::thread(search));
        ts[0].join();
        ts.erase(ts.begin());
    }
}

Running this program spits out an inner collision after a few minutes:

$ g++ -std=c++17 -O3 -march=native -lpthread collide.cpp keccak-tiny.c -o collide
$ ./collide | tee inner_collision.txt
cac84ac37940f1428012ee634adb7ed13bbeca22955da80f42ec4aa80afec33ff03bbea91a1295994a0b66a82e1921e996453157a9eaec4d3c0b87eebedace88e1a53bb2081e1ade65302ebad24683b2a5bc34a23e49b29380bbe189e6eaf52ebdcbe876b24a8e0e69f66171178861ddd44e52104c23b033e9ebbaecabfbb29f2e6322cec5e0e4f2478ca38f8b8d40f8592d7201b7bd0c66777ac8d36d0405d8bd7d6bb34d89734bbe8428f876dc91ca7acdf2c6d6ce314495fbdc7d08396f73d7bf000000000000
f3933c0281c60c2be9959b60f477c58c9cbfb8c4d00aaeabaef4c7b278bdba7eb420ff69393db34c522327810334e61941cb821243a76c34efd3258bcca0004bd41e7cc5d3c8f2607ab11c98ec50c84054054e2edf392ebd1fb4bac0e0915299727d54d2b267b51c4694d165d6004d06b4eb65bf02584d65a7660382983cd9033335ffbf617d570c67dfebcbc5a55e2170ca793a34ad53b57581de052a9ae8f5224aa04b87c10e7d0211cd445a540f13f54487e4bccac2188ae5af5f0cb41002f798000000000000

The two outputs are hex-encoded values of m0006 and m0006 such that the values of s1 and s1 collide on the last 6 bytes. Hence all that remains to do (after solving the proof of work of course…) is to compute the corresponding m1,m1 and send the resulting hash collision to the server:

#!/usr/bin/env python3
import socket, re, hashlib, string, itertools

sock = socket.socket()
sock.connect(('111.186.63.14', 10001))

if 1:
    s = b'';
    while b'Give me XXX' not in s:
        tmp = sock.recv(0x100); assert tmp; s += tmp
    m = re.search(r'XXXX.([^)]+)\) == ([0-9a-f]+)', s.decode())
    suffix, d = m.groups()
    alph = string.ascii_letters + string.digits
    for s in map(''.join, itertools.product(*(alph for _ in range(4)))):
        if hashlib.sha256((s + suffix).encode()).hexdigest() == d:
            sock.sendall(s.encode() + b'\n')
            assert b'first message' in sock.recv(0x100)
            break
    else: assert False

from CompactFIPS202 import KeccakF1600

m0, m1 = map(bytes.fromhex, open('inner_collision.txt').read().strip().split('\n'))
assert m0[-6:] == m1[-6:] == bytes(6)

h0, h1 = map(KeccakF1600, (m0, m1))

m0 = m0[:194] + h0[:194]
m1 = m1[:194] + h1[:194]

sock.sendall('{}\n'.format(m0.hex()).encode())
assert b'second message' in sock.recv(0x100)

sock.sendall('{}\n'.format(m1.hex()).encode())

import telnetlib
tel = telnetlib.Telnet()
tel.sock = sock
tel.interact()

Running this script yields the flag:

$ ./pwn.py
flag{I_wAs_th3_sh4d0w_Of_the_waXwing_sLAin__By_the_fAlse_@4zure9_in_the_window_pan3}
*** Connection closed by remote host ***
$