hxp CTF 2022: sequoa writeup

The goal of the cryptography challenge sequoa from our recent CTF was to break the post‑quantum signature scheme SEQUOA introduced very recently in ePrint 2023/318; the paper only appeared online about a week prior to the CTF. For the challenge I wrote a simple C++ program, compiled with the reference implementation of the scheme, which outputs a public key, reads in a signature, and grants the flag if the (necessarily forged) signature can be verified correctly. There were 12 teams who solved the challenge during the CTF.

Overview

SEQUOA is the latest incarnation of a series of proposed post‑quantum DLP replacements whose common hope is to land in the sweet spot where just enough structure is available to easily build analogues of all the usual "nice" classical DLP-based constructions, while lacking some part of the structure required to run Shor's quantum algorithm. (It is not known whether any such construction exists.)

The scheme is described pretty clearly in the paper, but I will summarize the key aspects relevant for the attack here for your convenience:

  • Fixed parameters: Prime $n$ defining the polynomial quotient ring $R = \mathbb F_2[X]/(X^n-1)$, an element $g\in R$ of multiplicative order $q=2^{(n-1)/2}-1$.
  • Private key: Tuple $(a_1,a_2)$ of positive integers.
  • Public key: Ring element $y = g^{a_1} \cdot T(g^{a_2})$, where $T\colon\,R\to R$ is the involution $f(X)\mapsto f(1/X)$.
    (In the specification this involution is described as being given by transposing the associated circulant matrix, or as an explicit mapping of a polynomial in terms of its coefficients, but it is readily verified that this map is identical to evaluation at $1/X$.)

The rest of the construction doesn't matter too much as we will recover the private key from the public key, and one can simply run the legitimate signing algorithm afterwards. (The scheme is essentially Schnorr signatures with exponentiation $h^b$ replaced by the operation $h^{(b_1,b_2)} := h^{b_1}\cdot T(h^{b_2})$.)

Choices of $n$ used in the reference implementation range from $167$ to $863$, with claimed security levels from $150$ bits classically up to NISTPQC post‑quantum security level $5$. The challenge works with the $n=359$ instantiation (claimed $320$‑bit classical security), as it is small enough to run the complete attack within a couple of minutes on a laptop.

How to hack

So, let's figure out how to recover a private key that leads to a specific given public key. There are two core observations that go into this:

  • Applying $T$ commutes with exponentiation in $R$: Since $T$ is a ring automorphism (check this!), we have $T(h^b) = T(h)^b$ for any $h$ and $b$.
  • The ring $R$ has projection homomorphisms to finite fields $\mathbb F_{2^k}$, defined by reduction modulo degree‑$k$ irreducible divisors of $X^n-1$.

Note that the first observation alone is sufficient to show that the scheme reduces to a discrete‑logarithm problem, and hence definitely isn't post‑quantum secure: The public key $y$ equals $g^{a_1}\cdot T(g)^{a_2}$, where $g$ and $T(g)$ are publicly known. A basis of the period lattice $\Lambda$ of the map \[ (\mathbb Z^3,+) \longrightarrow R ,\quad (u,v,w) \longmapsto g^u\cdot T(g)^v \cdot y^w \] can be computed in quantum polynomial time using Shor (more precisely, Boneh–Lipton or Kitaev), and finding a vector in $\Lambda$ of the form $(a_1,a_2,-1)$ recovers a valid secret key $(a_1,a_2)$. Not having access to a quantum computer (yet), the second observation helps moving the DLP to finite fields, where efficient implementations of subexponential DLP algorithms are available in standard computer algebra systems.

For the particular choices of $n$ used in the reference implementation, the polynomial $X^n-1$ splits into a product of the form $(X-1) \cdot f \cdot \mathrm{rev}(f)$, where $\deg(f) = (n-1)/2 =: k$ and $\mathrm{rev}(f)$ is the "reverse polynomial" of $f$; concretely it is given by $\mathrm{rev}(f) = f(1/X)\cdot X^{\deg(f)}$. This factorization yields two projection maps \[ \pi_1\colon\; R \to \mathbb F_2[X]/f =: F_1 \cong \mathbb F_{2^k} \,\text, \\ \pi_2\colon\; R\to \mathbb F_2[X]/\mathrm{rev}(f) =: F_2 \cong \mathbb F_{2^k} \,\text. \] (Actually, $\pi_2$ is equal to $\iota\circ\pi_1\circ T$ where $\iota\colon\; F_1\to F_2$ is the isomorphism given by $X\mapsto \pi_2(1/X)$, but we don't really need this fact.)

With these projections, the key‑recovery attack is fairly straightforward:

  • Solve DLP in $F_1\cong\mathbb F_{2^k}$ to find $c_1$ such that $\pi_1(T(g)) = \pi_1(g)^{c_1}$.
  • Solve DLP in $F_2\cong\mathbb F_{2^k}$ to find $c_2$ such that $\pi_2(T(g)) = \pi_2(g)^{c_2}$. (Actually, this is redundant as $c_2$ equals $c_1^{-1}\bmod q$, but computing it again does not do much harm either.)
  • Solve DLP in $F_1\cong\mathbb F_{2^k}$ to find $u_1$ such that $\pi_1(y) = \pi_1(g)^{u_1}$.
  • Solve DLP in $F_2\cong\mathbb F_{2^k}$ to find $u_2$ such that $\pi_2(y) = \pi_2(g)^{u_2}$.
  • Piece it all together: We know that \[ \pi_1(y) = \pi_1\big(g^{a_1}\cdot T(g)^{a_2}) = \pi_1(g)^{a_1 + c_1 a_2} \,\text, \\ \pi_2(y) = \pi_2\big(g^{a_1}\cdot T(g)^{a_2}) = \pi_2(g)^{a_1 + c_2 a_2} \,\text. \] Therefore, we have to solve the system \[ u_1 = a_1 + c_1a_2 \\ u_2 = a_1 + c_2a_2 \] for $(a_1,a_2)$ modulo $q$. This is linear algebra.

Also note that all the DLPs can be computed in parallel.

Attack script (Python with SageMath library)

#!/usr/bin/env python3
import socket, sys
from Crypto.Hash import SHAKE256
from sage.all import *

gen = [(b>>i)&1 for b in (36,4) for i in range(8)]  # from bufPi in sign.cpp

sock = socket.socket()
sock.settimeout(5)
sock.connect((sys.argv[1], int(sys.argv[2])))

buf = b''
while buf.count(b'\n') < 4:
    tmp = sock.recv(0x100)
    assert tmp
    buf += tmp
pubbytes = bytes.fromhex(buf.decode().strip().split('\n')[-1])

ns = [167, 263, 359, 479, 647, 863]
n, = {n for n in ns if (n+7)//8 == len(pubbytes)}

pub = [(b>>i)&1 for b in pubbytes for i in range(8)][:n]

################################################################

P,V = GF(2)['V'].objgen()
R,v = P.quotient(V**n - 1).objgen()
aut = lambda el: el.lift()(~v)

q = 2**(n//2) - 1
qbyts = (q.bit_length() + 7) // 8

subexp = lambda x,y: (x[0]-y[0], x[1]-y[1])
mulexp = lambda x,y: (x[0]*y[0]+x[1]*y[1], x[0]*y[1]+x[1]*y[0])
exp2d = lambda g, x: g**(x[0]%q) * aut(g)**(x[1]%q)

def H(data):
    k = SHAKE256.new(data).read(2*qbyts)
    k1 = int.from_bytes(k[:qbyts], 'little')
    k2 = int.from_bytes(k[qbyts:], 'little')
    return k1 % q, k2 % q

def poly2byts(f):
    assert len(list(f)) <= n
    bs = [0]*((n+7)//8)
    for i in range(n):
        bs[i//8] |= int(f[i]) << i%8
    return bytes(bs)

def sign(priv, pub, msg):
    k = (42,42)  # this line of code is dedicated to Sony
    r = exp2d(g, k)
    e = H(poly2byts(r) + poly2byts(pub) + msg)
    s = subexp(k, mulexp(priv, e))
    return s, e

################################################################

def _ffmap(f):
    assert f.is_irreducible()
    F,w = GF(2).extension(modulus=f, name='w').objgen()
    assert f(w) == 0
    return lambda el: el.lift()(w)

ffmaps = []
for f,m in (V**n - 1).factor():
    assert m == 1
    ffmaps.append(_ffmap(f))

################################################################

def batchlog(xys):
    def aux(x,y):
        try:
            return x.log(y)
        except ValueError:
            return None
    parlog = parallel(os.cpu_count()+1)(aux)
    ret = [None] * len(xys)
    for (xy,_),r in parlog(xys):
        for i,xy2 in enumerate(xys):
            if xy == xy2:
                if r is None:
                    print(f'\x1b[33mcould not solve logarithm\x1b[0m #{i}')
                    exit(1)
                ret[i] = r
    assert None not in ret
    return ret

g = R(gen)
y = R(pub)
gT = aut(g)

assert gT.matrix() == g.matrix().transpose()

ffmaps = [ff for ff in ffmaps if ff(g).parent().cardinality() == q+1]
ff1, ff2 = ffmaps

assert g**q == 1
assert ff1(g).multiplicative_order() == q
assert ff2(g).multiplicative_order() == q

iso = ff1(g).parent().hom([ff2(1/v)])
r = R.random_element()
assert iso(ff1(aut(r))) == ff2(r)

c1,c2, x1,x2 = batchlog([
        (ff1(gT), ff1(g)),
        (ff2(gT), ff2(g)),
        (ff1(y), ff1(g)),
        (ff2(y), ff2(g)),
    ])
assert c1 * c2 % q == 1

assert ff1(exp2d(g, (x1, 0))) == ff1(y)
assert ff2(exp2d(g, (x2, 0))) == ff2(y)
assert ff1(exp2d(g, (0, x1*c2))) == ff1(y)
assert ff2(exp2d(g, (0, x2*c1))) == ff2(y)

assert ff1(exp2d(g, (777, (x1-777)*c2))) == ff1(y)
assert ff2(exp2d(g, (777, (x2-777)*c1))) == ff2(y)

S,A = Zmod(q)['A'].objgen()
eq = (x1-A)*c2 - (x2-A)*c1
a = ZZ(*eq.roots(multiplicities=False))

assert ff1(exp2d(g, (a, (x1-a)*c2))) == ff1(y)
assert ff2(exp2d(g, (a, (x2-a)*c1))) == ff2(y)

assert (x1-a)*c2 % q == (x2-a)*c1 % q
x = a, (x2-a)*c1
assert ff1(exp2d(g, x)) == ff1(y)
assert ff2(exp2d(g, x)) == ff2(y)

msg = b'hello world'
sig = sign(x, pub, msg)
sigbytes = b''.join(int(x%q).to_bytes(qbyts,'little') for x in sum(sig,()))
sock.sendall(f'{(msg + sigbytes).hex()}\n'.encode())

s = b''
while True:
    tmp = sock.recv(0x100)
    if not tmp:
        break
    s += tmp
print(s.decode(errors='replace').strip())

Output:

$ time ./pwn.py 94.130.27.91 7777
hxp{Th4nKx_4_th3_phUn_Ch4l1En9e}

real	3m3.043s
user	11m58.520s
sys	0m0.418s

I also tried running this against the next larger parameter set $n=479$: Everything works the same, but the DLPs get more expensive (≈ 1 hour each).