0CTF Quals 2019: "babysponge" writeup

This challenge consisted of finding a collision for Keccak with a very small capacity parameter.

The only really relevant part of the task’s source code is:

        return CompactFIPS202.Keccak(1552, 48, bytearray(msg), 0x06, 32)

When given two distinct msgs such that this call returns the same hash value, the server will hand out the flag.

We immediately see that the capacity parameter is set to only 48 bits, which implies finding an inner collision for the sponge function takes only about $2^{24}$ calls to Keccak-f.

The Keccak function processes a given message $m_0\Vert m_1$ consisting of two 194-byte blocks $m_0,m_1$ by initializing a 200-byte state $s_0$ with all zeroes, then computing \[ s_1 := f(s_0\oplus (m_0\Vert \mathtt{00}^6)) \] where $\mathtt{00}^6$ denotes a byte string of six null bytes. Here $f$ is Keccak-$f$, an invertible transformation of 200-byte strings whose specifics don’t matter for this challenge. The next absorption step is similar; it sets \[ s_2 := f(s_1\oplus (m_1\Vert \mathtt{00}^6)) \,\text. \]

If one manages to find two different such messages $m_0\Vert m_1$ and $m_0’\Vert m_1’$ such that $s_2=s_2’$, then the rest of the computations in the hash will also be identical and we found a collision. Here, $m_0\Vert m_1$ and $m_0’\Vert m_1’$ are called an inner collision. The evident strategy to search for $m_0\Vert m_1$ and $m_0’\Vert m_1’$ is to take random $m_0$ until the last 6 bytes of $s_1$ collide; let $m_0,m_0’$ be two inputs that make this happen, hence $s_1[-6{:}]=s_1’[-6{:}]$. Then $m_1$ and $m_1’$ may be set to the first 194 bytes of $s_1$ resp. $s_1’$, such that the byte strings thrown into $f$ during the computation of $s_2$ become equal: Letting $\mathit{cap}:=s_1[-6{:}]=s_1’[-6{:}]$, we get \[ s_2 = f(s_1 \oplus (s_1[{:}194] \Vert \mathtt{00}^6)) = f(\mathtt{00}^{194}\Vert s_1[-6{:}]) = f(\mathtt{00}^{194}\Vert\mathit{cap}) \]\[ s_2’ = f(s_1’ \oplus (s_1’[{:}194] \Vert \mathtt{00}^6)) = f(\mathtt{00}^{194}\Vert s_1’[-6{:}]) = f(\mathtt{00}^{194}\Vert\mathit{cap}) \]

We wrote a C++ program to search for an inner collision efficiently, using keccak-tiny.c for the keccakf() subroutine:

#include <iostream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <unordered_map>
#include <thread>
#include <mutex>

const constexpr size_t c = 6, r = 194;
typedef std::array<char,200> state_t;

extern void keccakf(void *);   /* keccak-tiny.c */

static state_t randomize()
    state_t st {0};
    static std::ifstream urandom("/dev/urandom", std::ios::binary);
    if (!urandom.read(&st[0], r)) throw;
    return st;

static void dump(state_t const& st)
    std::cout << std::hex << std::setfill('0');
    for (auto c: st) std::cout << std::setw(2) << (c & 0xff);
    std::cout << std::endl;

std::mutex mtx;
void search()
    std::unordered_map<std::string,state_t> tab;
    state_t st = randomize();
    while (tab.size() < (1 << 22)) {    /* ~ 1 GB */
        std::fill(st.begin() + r, st.end(), 0);
        state_t in = st;
        std::string cap(st.begin() + r, st.end());
        if (tab.find(cap) != tab.end()) {
            std::lock_guard lck(mtx);
        tab[cap] = in;

int main()
    std::vector<std::thread> ts;
    while (true) {
        while (ts.size() < 4)

Running this program spits out an inner collision after a few minutes:

$ g++ -std=c++17 -O3 -march=native -lpthread collide.cpp keccak-tiny.c -o collide
$ ./collide | tee inner_collision.txt

The two outputs are hex-encoded values of $m_0\Vert\mathtt{00}^6$ and $m_0’\Vert\mathtt{00}^6$ such that the values of $s_1$ and $s_1’$ collide on the last 6 bytes. Hence all that remains to do (after solving the proof of work of course…) is to compute the corresponding $m_1,m_1’$ and send the resulting hash collision to the server:

#!/usr/bin/env python3
import socket, re, hashlib, string, itertools

sock = socket.socket()
sock.connect(('', 10001))

if 1:
    s = b'';
    while b'Give me XXX' not in s:
        tmp = sock.recv(0x100); assert tmp; s += tmp
    m = re.search(r'XXXX.([^)]+)\) == ([0-9a-f]+)', s.decode())
    suffix, d = m.groups()
    alph = string.ascii_letters + string.digits
    for s in map(''.join, itertools.product(*(alph for _ in range(4)))):
        if hashlib.sha256((s + suffix).encode()).hexdigest() == d:
            sock.sendall(s.encode() + b'\n')
            assert b'first message' in sock.recv(0x100)
    else: assert False

from CompactFIPS202 import KeccakF1600

m0, m1 = map(bytes.fromhex, open('inner_collision.txt').read().strip().split('\n'))
assert m0[-6:] == m1[-6:] == bytes(6)

h0, h1 = map(KeccakF1600, (m0, m1))

m0 = m0[:194] + h0[:194]
m1 = m1[:194] + h1[:194]

assert b'second message' in sock.recv(0x100)


import telnetlib
tel = telnetlib.Telnet()
tel.sock = sock

Running this script yields the flag:

$ ./pwn.py
*** Connection closed by remote host ***