Tags: crypto md5 collision 

Rating: 5.0

In this challenge, the user must send a base64 encoded "magic word" that will allow them to get the flag. Two files are provided:

- **door.c**, the source code of the binary checking that the magic word is given.
- **gatekeeper.py**, a Python script checking that only previously authenticated users can access the door, with a system of hash storage.

### The door

The code of the door is the following:

```c
#define MAGIC_WORD "sp3akfr1end4nd3nt3r"
int main() {
char input[255];
char flag[255];

scanf("%254s", input);
printf("You said: %s\n", input);

if (strcmp(input, MAGIC_WORD) == 0)
{
int fd = open("./flag", O_RDONLY);
if (fd == -1)
{
printf("Something went wrong! Thats a bug, please report!\n");
return 1;
}

read(fd, flag, 254);
printf("Flag: %s\n", flag);
}
else
{
printf("Nope :/\n");
}

return 0;
}
```

The magic word is now known, **sp3akfr1end4nd3nt3r**.

Plus, the comparison is done with the function `strcmp`. Thus, any given input starting with **sp3akfr1end4nd3nt3r\0...** will match the comparison. This allows the user to give the magic word, followed by some garbage, without caring about the door check.

### The gate keeper

The gate keeper is composed of an infinite loop as main function, in addition to two functions, `generateSecretHash` (generating a weird hash with the user's input) and `passToGate` (transmitting the input to the door).

#### The main function

Its code is as follows:

```python
# Two ways to get to the C program :
# 1. The input hash is in the goodHashes list
# 2. The magic word is not in the input
while True:
try:
currentInput = input("Give me your input:")

bytesInput = base64.b64decode(currentInput)
print("Doing magic, stand by")
hashed = generateSecretHash(bytesInput)

if hashed in goodHashes:
print(passToGate(bytesInput))
else:
if b"sp3akfr1end4nd3nt3r" in bytesInput:
print("Everybody knows the magic words. I can't hear it anymore! Go away! *smash*")
exit(0)
else:
goodHashes[hashed] = bytesInput
print(passToGate(bytesInput))
```

It takes the base64 encoded user's input and generates a strange hash with the `generateSecretHash` function.

If the hash has already been computed, the input is directly transfered to the gate. Otherwise, it checks that the input does not contain the magic word. If so, the hash is added to the list of previous hashes and the input is transfered to the door.

The aim is quite clear now and is done in two steps:

1. Add a first hash to the list with an input without the magic word in it
2. Use a collision attack to generate another input with the magic word in it, and send it then.

#### The `generateSecretHash` function

Its code is the following:

```python
# Actually, finding a collision on MD5 would be sufficient to get same hashes
def generateSecretHash(byteInput):
md5 = hashlib.md5()
sha1 = hashlib.sha1()
sha256 = hashlib.sha384()
blake2b = hashlib.blake2b()
md5.update(byteInput) # Here is the only location where the input is used
sha1.update(md5.digest())
md5.update(sha1.digest())
#then it performs lots of cryptograpic operations
for i in range(0, 2938):
sha256.update(md5.digest())
for k in range(-8222, 1827, 2):
sha1.update(sha256.digest())
sha256.update(sha1.digest())
for j in range(20, 384):
blake2b.update(sha256.digest())
return blake2b.hexdigest()
```

It performs a lot of cryptographic operations, which makes it quite unpredictable for a given entry.

However, a weakness can be highlighted: the input is used in the first operation, which is an MD5 computation. It means that, if one can find an MD5 collision satisfying our requirements (one input starting with **sp3akfr1end4nd3nt3r\x00** and one not), for these two inputs, the first MD5 operation will return the same result and, thus, the secret hash wil also be the same.

### The MD5 collision

By doing some research on the internet, the program named *HashClash* immediatly seemed to be the solution to perform such an attack.

However, I wasted a lot of time trying to use the Chosen Prefix Collision attack of the program, which would have let me chose two prefixes matching my desires to generate two files with the same MD5 digest. The big inconvenient of this method being the time of execution, even with a good GPU and 8 cores, the program ran for several hours without getting any results.

I then came across a presentation of Ange Albertini and Marc Stevens for the Pass-The-Salt 2019 conference (available [here](https://2019.pass-the-salt.org/files/slides/01-KILL_MD5.pdf)) and the corresponding [Github Repository](https://github.com/corkami/collisions) introducing, in particular, the UniColl attack, an *Identical prefix* attack, in which, for a given prefix, two small collision files are created with a few differences on the first 64-bytes block. Especially, with a bit of luck, one byte of the prefix can be changed for one of the files.

That is what happened when I ran the program `script/poc_no.sh` of the *HashClash* repository on the prefix **sp3akfr1end4nd3nt3r\x00**. The generated files are [collision1.bin](https://github.com/dspiricate/writeups/blob/main/ALLES/Doors%20of%20Durin/collision1.bin?raw=true) and [collision2.bin](https://github.com/dspiricate/writeups/blob/main/ALLES/Doors%20of%20Durin/collision2.bin?raw=true).

### The attack

The execution of the attack is shown in the original writeup.

Original writeup (https://github.com/dspiricate/writeups/tree/main/ALLES/Doors%20of%20Durin).