Rating: 5.0
## Solution
This problem is about finding input that will provide a __hash collision__ with the hash of the admin account.
### Analyzing the hash
We analyze the hash function and we see that the input is made up of _blocks of 16 bytes_ that is processed with AES.
```python
def digest(self, user, password):
cipher = AES.new(self.key, AES.MODE_ECB)
q = 0
data = self._pkcs7pad(user + password)
for i in xrange(0, len(data), self.bs):
block = data[i:i + self.bs]
q ^= int(cipher.encrypt(block).encode("hex"), 0x10)
return q
```
And from the prompt we note that:
> I've learnt from one-oh-one that h(message | key) is
secure... I think
Now it is important not three facts here:
1. The mode used in AES is __ECB__
2. The blocks are just __XOR'ed__ with each other
3. The key is appended in the message before hashing. H(message | key)
This gives us with several nice properties. Since the AES is in ECB mode, then the order of the blocks do not really matter and we are sure that a particular block will have a unique hash value.
```python
a = '0'*16 # bytes per block
b = '1'*16
a_hash = get_hash(a)
b_hash = get_hash(b)
ab_hash = get_hash(a+b)
ba_hash = get_hash(b+a)
assert ab_hash == ba_hash
assert (a_hash ^ b_hash) == ab_hash # This will fail ... why?
```
The last statement will fail because the _key_ is appended to the message before each hashing.
```python
get_hash(a) == H(a) ^ H(key)
get_hash(b) == H(b) ^ H(key)
get_hash(a+b) == H(a) ^ H(b) ^ H(key)
```
So to simplify things, __we first get the hash of the key__.
```python
get_hash(a+a) == H(a) ^ H(a) ^ H(key)
get_hash(a+a) == (H(a) ^ H(a)) ^ H(key)
get_hash(a+a) == H(key)
```
And now we can get the actual hashes and reliably determine the resultant hash of whatever input we want.
```Python
key_hash = get_hash(a+a)
a_hash = get_hash(a) ^ key_hash
b_hash = get_hash(b) ^ key_hash
ab_hash = get_hash(a+b) ^ key_hash
assert (a_hash ^ b_hash) == ab_hash
```
### Finding a collision
Our main goal would be given several inputs which we know the corresponding hash, find a combination of these input that will result to our desired hash.
```python
h1 = H(x1) # get_hash(x1) ^ key_value
h2 = H(x2)
h3 = H(x3)
...
```
Find some subset of h's such that
```python
h1 + h2 + ... + hn == (0x57ef36c6071b024df40fd6e5f47857d8^key_value) # Admin Hash
```
Such that
```python
get_hash(x1+x2+...+xn) == 0x57ef36c6071b024df40fd6e5f47857d8
```
We represent this problem in a system of linear equation in the mod 2
For example
```
x1 = 101
x2 = 001
x3 = 110
desired = 010
```
The last bit is a result of x1 to x3...
```
0 = 1*x1 + 1*x2 + 0*x3
1 = 0*x1 + 0*x2 + 1*x3
0 = 1*x1 + 0*x2 + 1*x3
```
And we can represent this as a matrix and use classical linear algebra techniques to solve it
```
[ 1 1 0 ] [ x1 ] [ 0 ]
[ 0 0 1 ] x [ x2 ] = [ 1 ]
[ 1 0 1 ] [ x3 ] [ 0 ]
```
So finding a collision is:
1. Generate 128 payload-hash pairs
2. Use the 128 hashes to construct a 128x128 matrix
3. Solve for the linear combination that results to the desired hash in mod 2
__For implementation details please see the link__