hxp CTF 2022: tex_based_adventure writeup

This challenge was a text-based adventure game implemented in LaTeX3 (maybe known to some as expl3). The goal is simple: to get the flag, win the game.

Some LaTeX3 fundamentals

LaTeX3 is much more supportive of you doing arbitrary programming than the plain TeX/LaTeX primitives that we had before. I won’t go into too many details here (there is comprehensive documentation of the APIs), but here’s what you can expect:

  • _ is now a valid character in commands (and in fact part of the recommended style, so that the code you write only works within \ExplSyntaxOn / \ExplSyntaxOff). You may need to fix your syntax highlighter to work with this (a quick fix in VIM is :syn match texStatement "\\[a-zA-Z_]\+"). To put a space somewhere, you need to use ~ - normal whitespace is just a separator now!

  • Functions generally come with an argument specifier that tells you what kind of argument they expect. Read the docs for the full explanation, here is a quick overview: N is for a single token (e.g. \thing), n is for multiple tokens (e.g. {hello \world}), and w is for anything super-weird where you specify the arguments yourself in \def-like fashion. Then, there are different ways of expanding the arguments that are passed to a function: V expands a variable using the \*_use:N for the correct type, c expands a string to a control sequence (i.e. a function) like TeX \csname/\endcsname, and e and x trigger full expansion (a bit like \edef-ing something to the argument inside a TeX macro). e and x are different (in particular, x is usually faster, but e is expandable recursively, which is very useful). You can generate variants of the base functions to use those expansions, or you can use \exp_args:....

  • There are a number of different general-purpose data types. I won’t go into them all, but I might bring them up as they become relevant. Mostly, they are different ways of working on (lists of) strings (either handled as TeX tokens (tl), individual characters (str), or comma- (clist) or otherwise delimited (seq) lists. There’s also some basic math support, but only up to $2^{31}$, and some more fancy data types (arrays of integers up to $2^{30}$ (yes, a different limit) and floating point numbers, as well as “property lists” (that are really dictionaries), etc.)

  • In general, you can have global and local variable state. Local variables live inside the current \group_begin: / \group_end: scope (but that really is the lifetime of the current value, in theory, nothing stops you from using e.g. \tl_gset:N on \l_tmpa_tl). with local variable scope. Variables don’t have an argument specifier.

  • You can read and write files dynamically (look for ior/iow in the documentation), and there is some support for randomness.

Step 1: Lots of weird characters

Initially, here’s what we can see:

  • There’s some setup with \catcode in the shebang line - that just makes the shebang work so you can’t accidentally use something like xelatex on this file (there are some PDFLaTeX primitives way down in the code, and I don’t want to see those error messages).

  • At the end is a large blob of data that looks like base64 (but has commas inbetween) inside \begin{world} / \end{world}.

  • Before that is a small chunk of LaTeX3 code followed by what looks like random characters

Of course, there is some self-decrypting going on here - but the formula is in plain sight. The code just reads the current file (\g_file_curr_name_str.tex), grabs the random-looking characters, and then constructs new code by transforming each character according to $f(x) = 32 + ((x + 71) % 96)$ (this was chosen specifically so we don’t get any unexpected spaces that might mess with the parsing). Then, \tl_rescan:nn (or rather, a newly-created variant called \tl_rescan:nV) is used to re-scan the transformed text as LaTeX3 code.

Step 2: Obfuscation

Let’s look at the decrypted code now. Clearly, there is some obfuscation going on (most of the functions defined there are named something incomprehensible), but most of the strings and all the builtin functions are still readable. A little bit of code formatting goes a long way - most auto-formatting tools horribly break when they encounter LaTeX3, but good old regex will do the job in a pinch.

Starting from the end, we can see relatively quickly that the world environment does the actual heavy lifting. Given that there is only one place where the passed-in data is used, \hxp_YWCf:Nn must be the code that loads the data into \hxp_GVBG.

On the other hand, the “Game loaded in … seconds” message happens before we actually play the game, so \hxp_VCSn:NN\hxp_GVBG\hxp_tMtH must be responsible for the actual gameplay (and indeed, it uses the \hxp_GVBG variable into which we loaded the world data earlier). We can ignore \hxp_tMtH here: width=16cm makes it clear that this is just the graphics settings - and indeed, opening the output PDF (e.g. after dying in the game) reveals an exactly 16cm wide image.


Oh, and the error messages indicate that there is some anti-tampering protection. Since we’re not actually doing this dynamically, we can ignore this for now.

Step 3: World loading

Even though the meanings of some of these parts will only become clear later on, I’ll deal with the format of the world data first. It is quite obvious that this is a comma-separated list of base64-encoded values. The functions called in \hxp_YWCf:Nn split it at the first comma (\hxp_sXFL:n/\hxp_VoCK:n).

The first entry (which is also significantly longer than all the others) is clearly treated separately here, with all the other entries stored in the game data under the XbZo key.

The first of the remaining entries (i.e. the second entry in the original list) is also processed in \hxp_ysFk:Nn, and we can see that there is code that increments an index and loads the next entry from the XbZo key in \hxp_UFaA:NN - this is the list of levels that the player character travels through. If you don’t figure this out here, don’t worry - things will become more clear when we look at the gameplay.

All that happens at this stage is that the base64 chunks that are processed are split up into smaller pieces and stored in the game state. If you base64-decode the data in those chunks (and look at the indices in the parsing function), you’ll be able figure out a few things about the game.

  • The first chunk contains the world data. It starts with a permutation of the numbers between 1 and 48 (EdWg, this will be relevant later, but you don’t need to figure this out here). Then it has a little bit of extra data, followed by all the labels that you can see while playing the game (e.g. “You are in a dark room” - the rest of the text is hardcoded, this ends up in bxFX). Finally, there is a big chunk of additional data. You can see find the text BvtK and vzIn both in the challenge code and the data here. After BvtK, you can clearly see RGB pixel data (this is the “YOU ARE DEAD” image shown above). The data after vzIn looks random, and we will get back to it later (spoiler: it is only referenced after the message Please~wait~for~your~reward!):

Random-looking data `vzIn`

  • The other chunks are all similar: They contain data at the start, and then additional LaTeX3 code. We’ll find out what those do later.

At this point, you could try to find out dynamically which parts of the input map to which key in the game state, or just reverse the parsing code by hand.

Step 4: Gameplay

The gameplay loop is simple, and the messages hint at what is happening. At the start of each round, the game first does most of the output handling:

  • First, the game tries to describe the current room (You~are~in~...). This should tell you where the current room is stored in the game state (MrzK, along with some other value).

  • Then, the game handles enemy attacks (In~front~of~you~is~...). Remember BvtK, the “YOU ARE DEAD” image? It’s referenced here also. Clearly, the list of enemies is in qwsI.

  • The You~hear~... messages are emitted if there is at least one enemy in a neighboring room.

  • If there are any items (NHmX) in the current room, it prints those too.

  • Then, the current inventory (rzVB, via \hxp_UvRB:Nn) is printed.

  • Finally, the game looks at where you can place items (vpAZ). I called those “slots” in the un-obfuscated implementation, so this is what we’ll use here.

Next comes the actual input handling. The command that you enter (from \ior_get_term:nN in \hxp_xhOH:nn - this is the only place where terminal input is read, and you can also find the > prompt there) is matched against a set of regexes (see \hxp_txMa:NN for movement, \hxp_PrHP:Nnn to pick up items, and \hxp_PfFq:Nnn for item placement). You can move forward, backward, left, and right at any given point, and pick up and place items (you can also give up, but that doesn’t make a lot of sense).

After the input handling, the code updates the enemy positions. The handling here is somewhat complicated, because it involves a check for a special case where the player and an enemy should encounter each other in a hallway (the enemy then simply doesn’t move, so the player and enemy end up in the same room, and the game will end on next turn). If you dig into it, the handling for move forward in the player code is very similar to the enemy movement - in fact, one of the values in both the player and the enemy data indicates the current direction to move in (as a number between 1 and 6).

Investigating the handlers for each of the actions you can take also shows that the code in the world data handles item placement - and dispatches to the code that loads the next entry from list of levels. Clearly, placing items correctly means that you advance through the levels, which sounds like a good thing to strive for.

Step 5: The world of tex_based_adventure

Of course, this begs the question: if we have four directions to go in at any given point, why are there six movement directions? And what is up with that permutation of 1 to 48 in the world data (which is used when computing the movements)?

First, notice that the handlers for turning left, right, or around change the value that is stored alongside the current room in the game state, then dispatch to the same code that runs for moving forward. This means that that value is the direction that we are moving in (and indeed, again, it is the current movement direction). To figure out more information, we need to reverse the actual movement handler (\hxp_iIXC:N). Clearly, it will take the current direction and somehow compute the next room to move to, then update the location in the game state:


The \hxp_BXzj:N function is never defined, so we can ignore the last line (it’s a hook left in from challenge development in case we wanted to add more insanity, but it’s probably good that we didn’t). \hxp_lsdS:Nn also appears in the item pickup and placement handlers; here, it appends the movement direction to an entry in the game state. In this way, the game can keep track of all the actions we have performed so far.

More interesting is the remaining bit of code. We can understand from the output code that we looked at earlier that \hxp_pUCt:N gives us the current direction, and \hxp_LRXj:N gives us the current room that the player is in. So what is \hxp_BdnG:N? It takes the current room and runs it through the permutation stored in the game state to get another value. Then, that value and the current direction are used to look up another value via \hxp_sXKM before inverting the permutation.

Decoding the base36 strings that make up \hxp_sXKM (or just dumping it after loading) shows us partial permutations of numbers between 1 and 54 (not 48 as before), with blocks of ff interspersed.

What is going on here?

Remember that the permutation (EdWg) comes from the initial world setup. It is updated only in one place: from \hxp_UFaA:NN via \hxp_POIe:NN, \hxp_POIe:NNn, \hxp_CXUO:NN, and \hxp_LDLX:Nn. This, in turn, is called from the placement hooks - so this happens when we move up a level. So how can the permutation change? Closer investigation shows that the second argument to \hxp_UFaA:NN is yet another permutation (with 00s being ignored entirely and the value staying the same). There are only six different possible values. Convert some subset of those to numbers and apply your favorite internet search engine ot it:

Sidney wants to know your location | Search results for "14, 15, 16, 43, 45, 48, 42, 47, 41, 44, 46"

The permutations that are applied to the game state are moves on a Rubik’s cube (specifically, these ones, but don’t actually try to use that for your solution, you won’t be happy).

So what is the EdWg permutation all about? It shuffles 1..48, and leaves the remaining 6 possible values (that we know exist from the longer permutations in \hxp_sXKM) alone. On a Rubik’s cube, there are 48 edges and corners that can move, and the six face centers that cannot move (at least under the operations given). The logical conclusion: it transforms between a position on the cube (in absolute terms) and the “tile” that is currently at that position.

Back to the directions: now that we know we are dealing with a cube, there are indeed six possible movement directions (clockwise and counterclockwise around each axis of the cube), and movement aroud the axis that is perpendicular to the face of the cube the player is currently on is invalid (the ff entries in \hxp_sXKM!).

Step 6: “Speed”cubing

There are many ways to solve a Rubik’s cube, so clearly just somehow solving it is not unique and probably not enough. There are two observations here that help us solve this uniquely.

Enemy placements

In each level, there are six item slots that each trigger a different rotation (which one doesn’t actually matter here; you can see that there are six from the hook code that is included in the level data). When the next level is loaded, there will be an enemy at all of these locations (leading to insta-death) except for one. This that in each cube state, there is only one possible next cube rotation - now, you just need to move to that slot while picking up all the items needed for the remainder of the levels.

Turn limit

Observe that the main game loop has a limit on the number of rounds that you can play. Eventually, the ceiling will cave in and the game will end. Indeed, you should strive for the shortest possible solution on each level (only in this way can you actually make it in time for the turn limit).

Then, it is just a question of collecting all items that you need to place in later levels and avoiding the enemies while moving to the target slot. A breadth-first search (over time steps, position, and the collected items) should yield a unique shortest path through a level that does not collide with the enemies. Submitting the correct commands to the challenge will then let you win the game.

Step 7: Where flag?

Remember the Please~wait~for~your~reward! victory message from earlier? The code that follows loads the vzIn image from the world data using the \hxp_rsKl:Nnn function - but the BvtK “YOU ARE DEAD” image is loaded using \hxp_rsKl:Nn (note the additional third argument).

Both end up dispatching to \hxp_rsKl:nn (which does the actual image rendering using PDFLaTeX internals that I will not go into here), but the Nnn version actually loads vzIn_NHNv - which shows up before that in \hxp_RmAA:neeVe (into which we also pass the original image data, loaded from \hxp_BoZH).

So, what does this do? The vzIn image looks like random data, so it is very likely some form of decryption algorithm.

\hxp_RmAA:nnnnn splits the data into 16-byte chunks using some ugly \def-style argument parsing, then passes it into \hxp_vMNr:nnn. Classic reversing shows us what happens here: For each chunk, the (1-based) index of the chunk is encoded as a 4 byte big-endian integer. The third argument to \hxp_rsKl:Nnn is concatenated to this, then processed by \hxp_BpLC:e (we’ll get to this in a minute), and then XOR’d with the chunk data to yield a new image.

What does \hxp_BpLC:e do? Unfortunately, it’s just MD5. \hxp__KXc: yields the typical MD5 starting state, there is Merkle-Damgård padding in \hxp_im_G:n, you can see the four different implementations of $F$ in \hxp_sfWu:nn, and the \hxp_JhWf integer array contains all the values from $s$.

This means our decryption routine looks like this:

import hashlib
def decrypt(image_data, key):
    for chunk_index, chunk_start in enumerate(range(0, len(image_data), 16), start=1):
        chunk_key = hashlib.md5(chunk_index.to_bytes(4, 'big') + key).digest()
        yield bytes(i ^ k for i, k in zip(image_data[chunk_start:chunk_start + 16], chunk_key))

Finally, where does the key come from? The third argument to \hxp_rsKl:Nnn in this case is \hxp_BpLC:e{\prop_item:Nn#1{kyHu}}, so the MD5 hash of the kyHu entry of the game state - which is the event history that \hxp_lsdS:Nn appends to from the action handlers (see step 5).

But of course, you don’t actually need to do any of this last step - once you solve the cube and the per-level maze, you can just paste the steps into the game and wait for it to spit out the final PDF: