Rating:

# Compiler Hell
## Analysis

We poked around a bit, and we roughly went through the following train of thoughts:

* Defining the `S4` macro with different values, dictates the output of the code. Looks like the author intended to indicate that $0 \le S4 < 2^{16}$ based on the initial `#if` statements.
* There are lots of `#define`s and `#ifdef`s that define the flow of the preprocessing. Also, there are lots of `#include`s that include the same file.
* There is so little actual code that is not preprocessor directives:
```
$ cat code.c | grep -v "^#"
static const unsigned char O[] = {
};
int main() { return !fwrite(O, 58, 1, stdout); }
0,
-~
```
* But how does that even result in anything other than trivial outputs? Apparently the compiler has a flag to show emitted code right after pre-processing phase:
```
gcc code.c -E | grep -v "^#" | grep -v "^$"
```
The `grep`s are there to filter out empty lines and comments.
* Oh, hey, `-~0 = 1` and `-~-~0 = 2` and `-~-~-~0 = 3` and so on! It's making sense now.

So to wrap up: the code recursively includes itself, macros get defined and undefined along the way, and that effects the number of `-~`s that are emitted which in turn indicates the character that is being constructed. Also emission of the `0,` means we are moving to the next character. We are expected to find the appropriate value of `S4` such that beginning with just that macro defined, results in a code that outputs the flag.

## Base implementation

We could try to brute force our way by compiling the code over and over again, each time defining `S4` with a different value. We didn't try that and don't think it would've worked either since the compilation process seemed too slow to work.

So what we did was we wrote an interpreter to predict the output of compiling the code with a specific value of `S4`, then iterated through every value of `S4` to find a string that contains the word `CTF`.

This is not faster than compiling the code directly, but it lets us apply optimizations going forward.

## Optimization 1: Memoize

Compilers are, hopefully, deterministic beings. Meaning that their initial state dictates their final state. In this case, **result** (here: emitted code + defined macros) of `include`ing only depends on the macros that are defined right before the file is included.

We can memoize the result of `include`ing the file, given the current state of defined macros, and avoid re-processing the include if we face a similar macro state.

I guess this optimization alone could be enough to get the flag in a couple of hours, but we wanted to be in the first 3 to solve this to get the bonus points. So, we went even further:

## Optimization 2: Prune
We expect the final string to be human readable. Let's just stop after we spot the first non-humanreadable character in the resulting output. This makes the search tree so lightweight that the whole search space may be explored in 10 minutes or so.

## Flag
`#define S4 34979` results in:

```
S4CTF{Th1nK_oF_tHi5_En7rY_aZ_coD3_0bFuSc4t3r_wIth_a_cOdE}
```

One question remains though: How did the authors write the code in the first place?