hxp | OMH CTF 2021: falsch

Sun, 16 May 2021 ~ yyyyyyy

OMH CTF 2021: falsch

misc (500 points, 1 solve)

In this (very cool!) challenge, the objective was to prove false in a small homebrew automated theorem prover written in the LISP dialect Racket.

The prover incorporated many important checks (such as only allowing tail recursion) that are required for soundness, but missed (at least) one crucial thing: variable scoping isn’t a thing.

In short, the technique used in the proof of false (or rather #f in the syntax of Racket) is a combination of two little problems:

One can define a function (u) satisfying (u) = y without ever specifying what y is.
Later, y can be used as a local variable in theorem statements, and the prover makes no distinction between the two variables called y.

So, basically, we can reason about an undefined thing called y, and the lack of variable scoping allows us to prove that y equals arbitrary other things:

(defun (u) y)

(defthm (oopsie y)
  (equal? (u) y)
)
; ...
(qed)

The definition and theorem above are the core of the proof of #f. We defined a function (u) and obtained the theorem

    ∀y. (u) = y

which is very obviously unsound nonsense. Getting from there to a proof of #f is not hard; for example, we can easily derive

    #t = (u) = #f

as an intermediate result and deduce #f from that. Here’s the full proof:

#lang falsch

(defun (u) y)
                                ; <current goal>
(defthm (oopsie y)
  (equal? (u) y)
)                               ; (equal? (u) y)
(-> (1) (u))                    ; (equal? y y)
(-> () (equal-same y))          ; #t
(qed)

(defthm (woopsie)
  (equal? #t #f)
)                               ; (equal? #t #f)
(<- (1) (oopsie #t))            ; (equal? (u) #f)
(<- (2) (oopsie #f))            ; (equal? (u) (u))
(-> () (equal-same (u)))        ; #t
(qed)

(get-flag)                      ; #f
(<- () (if-true #f #t))         ; (if #t #f #t)
(-> (Q) (woopsie))              ; (if #f #f #t)
(-> () (if-false #f #t))        ; #t
(qed)

Sending this proof to the server yields the (amusing) flag:

Please send your proof and terminate it with an EOF line.
Thank you. Let's see...
p4{formal_poof_(n.)_the_sound_your_work_makes_when_your_proof_checker_is_inconsistent}