Rating:

tl;dr We exploit a weakness in the hash function to revert it for a known data string. (The weakness does **not** require in-depth understanding of the hashing, see step 2)

*The exploit idea is quite simple, but I find it rather complex to explain it correctly. Sorry if the write-up is a bit confusing.*

Between nanothorpe and minithorpe, 3 small changes happened:
1. Deletion of the URL decoding -> we cannot send the 0x80 character that is part of the padding
2. Argumentparsing changed -> now, if we provide several identical parameters, the first will be used instead of the last
3. the expiry parameter is appendend instead of prepended to our message

Those changes turn the length attack unfeasable. Now it is time to take a look at the hash function in _compress()

Our exploit will do the following steps:
1. Let the server sign a message consisting of 2 Blocks in total (we know everything except the first 32 bytes in the first block, we put an own expiry= in the first block, and cmd=ls in the second block)
2. Using the final state of the octothorpe class and the second block, we compute octothorpe's state after compressing the first block, essentially deleting the second block (with the cmd=ls parameter)
3. We can now compress our own second block appending our own cmd: `cmd=cat+%2Fflag_%2A.txt`

##### Step 1
We know the secret is 32 Bytes (check the Docker file), we add an `expiry=` parameter with a time far in the future.
We add some unused parameters for filling and finish the first message with `cmd=ls`. The total message is 128 Bytes - 32 Bytes (the secret added by the server) - 9 Bytes (added by octothorpe to finalize the hashing). (The exact length does not matter)
We let the server sign this message and extract the query_string (which includes the server's expiry= as well) and the signature (final state of octothorpe)

##### Step 2
Let's investigate the hash function:
* We have a 16 byte state, and a 64 byte block as the input.
* There are 20 Rounds. In each round, every byte of the block is used in some magic and modifies the state at index `j-1` and `j+1`. There has to be something fishy about this.
* If we want to revert the hashing, we should do it round-by-round, so we look at the rounds seperatly
* If you take a closer look what is "constant" and what is unknown in the hashing, we can rewrite the inner loop as:
```
for i in range(64):
j = 'independent of input'
self._state[j-1] = self._state[j-1] ^ magic-depending-on-(state[j])
self._state[j+1] = self._state[j+1] ^ magic-depending-on-(state[j])
```
* We wanted to know, which byte in the 16 byte state contribute to which byte in the resulting state. Therefor, we just run the program and collected this information. So, in each iteration, we know `i` and `j` (both do not depend on the state or input, so we can do this with abitrary values). And we know that the state's bytes at index `j-1` and `j+1` are affected by the original values at these positions and the input value at index `i`. From this, we can compute the *depends on*-information. It looks like this for the last round:
```
0 depends on 15, 1, 15, 1, 15, 1, 15, 1,
1 depends on 0, 2, 0, 2, 0, 2, 0, 2,
2 depends on 3, 1, 3, 1, 3, 1, 3, 1,
3 depends on 2, 4, 2, 4, 2, 4, 2, 4,
4 depends on 3, 5, 3, 5, 3, 5, 3, 5,
5 depends on 6, 4, 6, 4, 6, 4, 6, 4,
6 depends on 5, 7, 5, 7, 5, 7, 5, 7,
7 depends on 6, 8, 6, 8, 6, 8, 6, 8,
8 depends on 9, 7, 9, 7, 9, 7, 9, 7,
9 depends on 8, 10, 8, 10, 8, 10, 8, 10,
10 depends on 9, 11, 9, 11, 9, 11, 9, 11,
11 depends on 12, 10, 12, 10, 12, 10, 12, 10,
12 depends on 11, 13, 11, 13, 11, 13, 11, 13,
13 depends on 12, 14, 12, 14, 12, 14, 12, 14,
14 depends on 15, 13, 15, 13, 15, 13, 15, 13,
15 depends on 0, 14, 0, 14, 0, 14, 0, 14,
```
Weeee. So, only the values of the 1st and 15th (and the 0th byte itself) in the old state affect the 0th byte in the state after this final round (numbers appear several times, because several values for `i` use the same state bytes). And this is the worst case already. What does this mean? The bytes are fairly independent, allowing us to bruteforce them step by step instead of all at once.

Let's say, the (unknown) state before the final round is `[s_0, s_1, s_2, ..., s_15]` and the final state is `[t_0, t_1, t_2, ..., t_15]` (known from the signature). So, `t_0` depends on the values of `s_1`, `s_15` and `s_0` for a fixed input block. We also consider `block` to be known (if the message is at least 2 blocks long, we know the last block).

So, basically, the value `t_0 = s_0 ^ magic(s_1) ^ magic(s_15)`. There are `256*256` possibilities for `s_1` and `s_15` and for all those possibilities, we can simulate the hashing and compute `magic(s_1) ^ magic(s_15)`, because we know `sbox`, `block`, `shift`, ... Also knowning `t_0`, we can now deduce `s_0`. So, for each pair `(s_1, s_15)`, we also know `s_0`. We now have about `256*256` triples to continue with (but this number will now grow much).

Now let's go to to `t_1`. It depends on `s_0`, `s_2` and `s_ 1`. We already fixed `s_1` and computed `s_0`, so only bruteforce 256 possibilities for `s_2` and check which one gives the exptected value of `t_1`. In most cases there is one value that fits, so we do this for each of the triples and now have around the same amount of tuples with 4 bytes fixed in each of them. We continue this up to `t_15`. Looking at `t_14` and `t_15`, you can see that they depend on `s_15` and `s_0`, two values that are already fixed, so when we get there, we just need to check whether the hashing computes the expected values of `t_14` and `t_15`. And hereby the number of around `256*256` tuples decreases to 1 valid tuple. This tuple (fixed values for all 16 bytes `s_i`) is a valid state for `s` that will result in `t` after applying the hashing using the given input `block`. We undid round 20 and can now do the same for round 19, round 18, ... and all the way back until we undid round 1 giving us the initial state of octothorpe after hashing the first block, but before hashing the second block.

*Notes*:
* some cases, there are 2 or more input states valid (we encountered rare cases of 4 or even 8 states), so we had to undo previous rounds for all of those states, but most of them lead to a dead end (no input state could lead to the expected output and could thus be dropped). In the end, we got 7 states where only one was the actual state used by octothorpe on the server. But we can try 7 possibilities.
* We implemented this algorithm in C++ to be faster. It took about 1 minute to compute.
* The code is really bad c++, because I did not want to think about clean solutions, so here is just a pseudo code variant for inverting a single round:
```
function invert_one_round(round, state_after, block):
possible_states = [[-1] * 16] # 1 state, nothing known yet
for idx in range(16):
for ps in possible_states:
d1, d2 = get-depends-on(idx) # e.g. idx=0 -> d1 = 15, d2 = 1
for v1 in range(256) and v2 in range(256): # (or if we already fixed d1 or d2 in ps, use this one)
ps[d1] = v1, ps[d2] = v2
tmp = state_after[idx] # this is the exptected state (= t_idx)
for each depedency of idx: # a dependency contains the index inside of block and the state used to compute magic()
tmp = tmp ^ magic(block_idx, j, jj, round) # magic() is this line: self.sbox[block[i] ^ rol(state[j], self.shift[jj], 8)], all values are known / fixed here
# now tmp is the value of s_idx
if ps[idx] == -1:
ps[idx] = tmp # fix this one
possible_pre_states.add(ps)
elif ps[idx] != tmp:
# error, continue with next possible_state
else:
# consistent
possible_pre_states.add(ps)
possible_states = possible_pre_states # for next round
return possible_states # all states lead to after_state, when applying the hashing using input block in the given round
```

##### Step 3
We setup octothorpe with the state after the first block, add our own second block containing the command we want to execute and hash this second block. This message can be sent to the `/api/run` endpoint just like in nanothorpe. As we can have multiple possible state, we just try all of them (all tests gave about 5-10 states).