Tags: pwntools python3 hash crypto xor 

Rating:

# GoogleCTF 2018: dogestore

* **Category:** crypto
* **Points:** 267
* **Description:**

> Secret Cloud Storage System: This is a new system to store your end-to-end
> encrypted secrets. Now with SHA3 integrity checks!
>
> `nc dogestore.ctfcompetition.com 1337`
>
> [\[Attachment\]](https://storage.googleapis.com/gctf-2018-attachments/d9a4d2232d40353fa6160ccf0c6f9a51f0afa75d65dc3c835230e00773f01730)

## Writeup

As with the last two years, Google hosted another Capture the Flag competition.
This time the challenges were quite hard, but also lots of fun to solve. We,
"LosFuzzys", teamed up with the Viennese CTF team "We\_0wn\_You" and played
together under the name "KuK Hofhackerei".

We didn't solve this challenge in time, but our exploit did finish executing a
few hours after the CTF closed.

## The Challenge

At first we tried looking at the attached files. There were two files:

- a `fragments.rs` Rust source file
- an `encrypted_secret` binary file

The objective was to recover the plaintext contents of the `encrypted_secret`
file.

We can also send an encrypted file to the server and have it decrypt and decode
the file. It will then send back the base64-encoded SHA3 hash of the decoded
data.

Here are the contents of the Rust file:

```rust
const FLAG_SIZE: usize = 56;
const FLAG_DATA_SIZE: usize = FLAG_SIZE * 2;

#[derive(Debug, Copy, Clone)]
struct Unit {
letter: u8,
size: u8,
}

fn deserialize(data: &Vec<u8>) -> Vec<Unit> {
let mut secret = Vec::new();
for (letter, size) in data.iter().tuples() {
secret.push(Unit {
letter: *letter,
size: *size,
});
}
secret
}

fn decode(data: &Vec<Unit>) -> Vec<u8> {
let mut res = Vec::new();
for &Unit { letter, size } in data.iter() {
res.extend(vec![letter; size as usize + 1].iter())
}
res
}

fn decrypt(data: &Vec<u8>) -> Vec<u8> {
key = get_key();
iv = get_iv();
openssl::symm::decrypt(
openssl::symm::Cipher::aes_256_ctr(),
&key,
Some(&iv),
data
).unwrap()
}

fn store(data: &Vec<u8>) -> String {
assert!(
data.len() == FLAG_DATA_SIZE,
"Wrong data size ({} vs {})",
data.len(),
FLAG_DATA_SIZE
);
let decrypted = decrypt(data);
let secret = deserialize(&decrypted);
let expanded = decode(&secret);
base64::encode(&compute_sha3(&expanded)[..])
}
```

## The Code

It seems this code is part of the server. The `store(data)` function seems to be the
main entry point where the sent data is processed.

After a bit of experimenting with the service, we found out that the `data`
variable is the raw input we send to the server.

However, we noticed that the server reads *110* bytes, not the *112* bytes
stated in the `FLAG_DATA_SIZE` constant.

Once there, it gets processed in multiple steps:

- `decrypt(data)`:
our data is decrypted with AES-256 in CTR mode with an unknown key and nonce
- `deserialize(decrypted)`:
an array of `Unit`s is constructed from the decrypted data, representing a
byte and a count (how often the byte should be repeated). This is a kind of
[run-length-encoding](https://en.wikipedia.org/wiki/Run-length_encoding)
- `decode(secret)`:
the run-length-encoding is decoded, but a count of zero here means one. In
fact, every count is increased by one
- `compute_sha3(expanded)`:
the decoded data is hashed with the SHA3 hashing algorithm. The exact
algorithm used here doesn't matter much, the important thing to note is that
it's a one-way function, and we can only test if the hash changed or not.
- `base64::encode(hash)`:
afterwards, the hash digest is base64 encoded and sent back over the wire.

## The Vulnerability

Looking at the steps, it doesn't look that bad, but after a little bit of trying
around, it seems that *the nonce used for decryption is constant* and reused
every time we send something to the server. This is a glaring vulnerability, as
a nonce should only ever be used once.

![AES CTR mode](https://upload.wikimedia.org/wikipedia/commons/3/3c/CTR_decryption_2.svg)

(*Image by Gwenda (PNG version), WhiteTimberwolf (SVG version) [Public domain],
via Wikimedia Commons*)

As you can see, the ciphertext is XOR-ed with an XOR-pad generated by encrypting
the nonce and counter with the key. This XOR-pad is completely independent from
the ciphertext and plaintext, and thus constant.

Also, the ciphertext is merely XOR-ed with that XOR-pad, so every bit-flip we
introduce in the ciphertext is exactly replicated in the decrypted plaintext at
exactly that offset. Moreover, decrypting only null-bytes yields the XOR-pad.

This alone is not enough to expose the key, as we only get a hash of the decoded
key. However, we can abuse a property of the run-length-encoding to produce
hash-collisions.

Here is an example:

```
letter: 'A'
|
| count: 6 times 6 times
| | decode ------ SHA3 + base64
0x41 0x05 0x41 0x02 =========> AAAAAAAAA ==> 9 times ================> PRZqCLkb9WfZtPBvGBAbw4knLYl482IOxMiXwUWURqw=
| | ---
| count: 3 times 3 times ^
| |
letter: 'A' |
same hash! |
letter: 'A' |
| |
| count: 5 times 5 times v
| | decode ----- SHA3 + base64
0x41 0x04 0x41 0x03 =========> AAAAAAAAA ==> 9 times ================> PRZqCLkb9WfZtPBvGBAbw4knLYl482IOxMiXwUWURqw=
| | ----
| count: 4 times 4 times
|
letter: 'A'
```

The reason this works is that during decoding, adjacent letter blocks are just
concatenated, and if two letter blocks consist only of the same letter, after
concatenation, it's impossible to distinguish how many of the letters came from
which block.

Also, it doesn't matter what the actual count is: Simply decreasing one count
and increasing the other should not change the length of the decoded text, and
if the letters are the same, the hash should stay the same as well.

## The Attack

Sadly, we cannot reliably decrease or increase bytes in the decrypted data,
only flip bits.

If we flip the lowest-order bit, two things can happen:

- the value *increases* by one if the bit was unset aka the value was *even*
- the value *decreases* by one if the bit was set aka the value was *odd*

This means that if we flip both counts' lowest-order bits the length either
doesn't change at all (which is what we want) if the lowest-order bits *differ*
(i.e. one is even and the other is odd) or the length increases or decreases by
two if the lowest-order bits are the same (i.e. both are even or both are odd).

From that, we can formulate the following attack to recover the XOR between two
adjacent `Unit`s:

### Recovering the Letter XOR

```
guessed letter XOR
|
sent data: 0x00 0x00 guess 0x00
decrypted: let1 cnt1 decr cnt2 => base1_hash

sent data: 0x00 0x01 guess 0x01
decrypted: let1 cnt1 decr cnt2 => test1_hash
^ ^
flipped lowest-order bit

guessed letter XOR flipped lowest-order bit
| v
sent data: 0x00 0x00 guess 0x01
decrypted: let1 cnt1 decr cnt2 => base2_hash

sent data: 0x00 0x01 guess 0x00
decrypted: let1 cnt1 decr cnt2 => test2_hash
^
flipped lowest-order bit
```

In this attack, `let1`, `cnt1`, `let2`, and `cnt2` represent bytes from the
decryption XOR-pad, interpreted as a letter or count when sending an (almost)
all zeroes ciphertext.

`decr` is the XOR between `guess` and `let2`.

If we guessed the XOR between `let1` and `let2` correctly, `decr` will be equal
to `let1`, and the vulnerable situation described above will occur.
Consequently, if `cnt1` and `cnt2` differ in their lowest-order bit, then
`base1_hash` and `test1_hash` will be equal. If they are both even or both odd,
then `base2_hash` and `test2_hash` will be equal.

If we guessed incorrectly, neither of the pairs will be equal.

This means that in as little as `256*4` (worst-case) attempts, we can recover
the XOR between two adjacent letter bytes. (adjacent in this context means
located in adjacent `Unit`s)

For recovering the XOR between count bytes, a few more hashes are necessary:

### Recovering the Count XOR

```
recovered letter XOR
|
sent data: 0x00 0x00 lxor 0x00
decrypted: let1 cnt1 let1 cnt2 => base1_hash

sent data: 0x00 pow lxor pow
decrypted: let1 cnt1 let1 cnt2 => test_bit_hash
^ ^
flipped nth bit
```

Here, `lxor` is the recovered letter XOR from above and `pow` is the nth power
of two.

If `cnt1` and `cnt2` differ in their nth bit, then `test_bit_hash` will be the
same as `base1_hash`. We don't have to request `base1_hash` since it is the same
as in the last iteration of the above attack for guessing the letter XOR.

Therefore, 7 more attempts are required to recover the XOR between two adjacent
count bytes.

### Chaining the attack

To attack different parts of the decryption XOR-pad the attack payload can be
prepended and appended with anything (we chose zeroes) to move the payload to
overlap the attacked two units.

After attacking each pair, the XOR between two letter and count bytes
respectively is known. Knowledge of the preceding and following letter and
count XOR values is not needed, so the attack can be trivially distributed and
parallelized. The limiting factor is the speed of the attacked server.

These are the recovered letter and count XOR values:

```
>> letter xors
[
191, 119, 132, 188, 171, 242, 33, 15, 50, 0, 32, 130, 110, 51, 57, 36, 108,
223, 132, 48, 58, 47, 190, 144, 54, 115, 250, 91, 13, 16, 25, 193, 178, 26,
115, 140, 231, 65, 99, 180, 221, 121, 92, 206, 16, 64, 152, 181, 231, 228,
136, 149, 177, 237
]
>> count xors
[
73, 144, 186, 217, 118, 36, 95, 211, 27, 42, 53, 201, 159, 255, 22, 105, 94,
172, 128, 119, 170, 28, 157, 183, 114, 107, 49, 248, 147, 241, 153, 129, 199,
37, 51, 84, 250, 35, 235, 225, 197, 205, 41, 230, 129, 107, 248, 35, 197,
142, 58, 204, 52, 21
]
```

### Offline Decryption

After recovering all letter and count XOR values the only missing pieces are the
first letter byte and the first count byte. When these are known, any payload
encrypted with the key can be decrypted.

Fortunately, we know that the decrypted secret must contain the text `CTF{`,
since a flag must be in there somewhere.

Using this, we first guess the first letter key byte (256 possibilities) and
ignore the run-length count (always take each letter once). For each possible
decryption that contains `CTF{` we then guess the first count byte (256
possibilities), looking for decryptions that contain `CTF{` even after
run-length-decoding (as opposed to e.g. `CCCCCTTFF{{{{{{{`).

Finally, we got out the decrypted data:

```
b'HFHFHHHDHDHDDDDDDSSSSSSSAAAAaAAAAAACTF{{{SADASDSDCTF{LLLLLLLLL___EEEEE____RRRRRRRRRRR_OYYYYYYYYYY_JEEEEEEENKKKINNSSS}ASDDDDDDDCTF{{{{{\n'
```

and the flag:
`CTF{LLLLLLLLL___EEEEE____RRRRRRRRRRR_OYYYYYYYYYY_JEEEEEEENKKKINNSSS}`

The correct decryption XOR-pad was this:

```
[
174, 22, 17, 95, 102, 207, 226, 117, 94, 172, 245, 218, 7, 254, 38, 161, 41,
114, 27, 105, 27, 67, 59, 118, 185, 191, 215, 32, 228, 223, 221, 201, 249,
160, 149, 254, 74, 82, 206, 210, 254, 165, 196, 15, 235, 19, 85, 142, 197,
57, 243, 75, 128, 32, 122, 17, 33, 233, 44, 122, 60, 139, 37, 18, 228, 147,
86, 84, 76, 113, 63, 66, 179, 22, 84, 236, 21, 207, 118, 36, 194, 197, 31, 0,
102, 205, 58, 228, 244, 2, 228, 131, 164, 232, 60, 16, 137, 51, 110, 246,
138, 120, 2, 66, 151, 142, 38, 186, 203, 175
]
```

## The Script

This is the cleaned-up exploit script we used to recover the flag:

```python
from pwn import *
import base64

FLAG_SIZE = 55
FLAG_DATA_SIZE = FLAG_SIZE * 2

bord = lambda code: bytes([code])

class Unit:
"""
Represents a letter with run length encoding count
"""
def __init__(self, letter, size):
self.letter = letter
self.size = size
def __repr__(self):
return "<{}*{}>".format(bytes([self.letter]), self.size + 1)

def crypt(data, key):
"""
Encrypts the data with the specified xorpad key
(xorpad generated by AES_256_CTR)
"""
return bytes(data[i] ^ key[i] for i in range(FLAG_DATA_SIZE))

def deserialize(data):
"""
Construct Unit instances from bytestream
Every second byte is a repeat length
"""
units = []
for i in range(0, len(data), 2):
units.append(Unit(data[i], data[i + 1]))
return units

def decode(units):
"""
Decode run length encoding
(zero length decodes as one)
"""
res = b""
for unit in units:
res += bytes([unit.letter]) * (unit.size + 1)
return res

def decode_without_runlength(units):
"""
Decode without applying run length encoding
Useful for checking flag plausibility
"""
res = b""
for unit in units:
res += bytes([unit.letter])
return res

def decrypt_and_hash(data):
"""
Connect to the remote and decrypt and hash some chosen ciphertext
"""
r = remote("dogestore.ctfcompetition.com", 1337)
r.send(data)
b64 = r.readall()
return base64.b64decode(b64)

with open("encrypted_secret", "rb") as f:
secret = f.read()

letter_xors = [None] * (FLAG_SIZE - 1)
count_xors = [None] * (FLAG_SIZE - 1)

def xor_payload(offset, l1, c1, l2, c2):
"""
Generate a payload used for finding the XOR difference between neighboring
letters and run length counts
"""
return (
b"\0" * offset +
bord(l1) + bord(c1) + bord(l2) + bord(c2) +
b"\0" * (FLAG_DATA_SIZE - (offset + 4))
)

def find_xor(offset):
"""
Infer the XOR difference between two letters and run length counts in the raw
AES_256_CTR xorpad
(This is where the real magic happens)

Runs in worst case 256*4 + 7 decrypt/hash iterations
"""
letter_xor = None
count_xor = None
for letter_xor_try in range(256):
print(".", end = "")
base_hash = decrypt_and_hash(xor_payload(offset, 0, 0, letter_xor_try, 0))
test_hash = decrypt_and_hash(xor_payload(offset, 0, 1, letter_xor_try, 1))
if base_hash == test_hash:
letter_xor = letter_xor_try
count_xor = 1
break
base2_hash = decrypt_and_hash(xor_payload(offset, 0, 0, letter_xor_try, 1))
test2_hash = decrypt_and_hash(xor_payload(offset, 0, 1, letter_xor_try, 0))
if base2_hash == test2_hash:
letter_xor = letter_xor_try
count_xor = 0
break
if letter_xor == None:
print()
return None, None
for bit in range(1, 8):
print(".", end = "")
power = 1 << bit
test_bit_hash = decrypt_and_hash(xor_payload(offset, 0, power, letter_xor, power))
if test_bit_hash == base_hash:
count_xor |= power
print()
return letter_xor, count_xor

def find_all_xor():
"""
Find all XOR differences in the AES_256_CTR xorpad

Narrows down the key space to 256*256 possibilities
"""
for i in range(0, FLAG_SIZE - 1):
print(">> {}".format(i))
print(">> letter xors")
print(letter_xors)
print(">> count xors")
print(count_xors)
letter_xor, count_xor = find_xor(2 * i)

letter_xors[i] = letter_xor
count_xors[i] = count_xor

def find_plausible():
"""
Find plausible values for the first xorpad letter and first run length count
and prints potential flags when it finds them

Runs completely offline
"""
key = [0] * FLAG_DATA_SIZE
for start_letter_xor in range(256):
key[0] = start_letter_xor
for i, xor in enumerate(letter_xors):
key[2 * (i + 1)] = key[2 * i] ^ xor

decr = crypt(secret, key)
units = deserialize(decr)
deco_wr = decode_without_runlength(units)
if b"CTF{" in deco_wr:
print(">> plausible letter xor: {:02x}".format(start_letter_xor))
for start_count_xor in range(256):
key[1] = start_count_xor
for i, xor in enumerate(count_xors):
key[2 * (i + 1) + 1] = key[2 * i + 1] ^ xor

decr = crypt(secret, key)
units = deserialize(decr)
deco = decode(units)
if b"CTF{" in deco:
print(">> plausible count xor: {:02x}".format(start_count_xor))
print(">> key")
print(key)
print(deco)

context.log_level = "ERROR"
find_all_xor()
print(">> letter xors")
print(letter_xors)
print(">> count xors")
print(count_xors)
find_plausible()
```

Original writeup (https://losfuzzys.github.io/writeup/2018/06/26/googlectf-dogestore/).