Tags: prng crypto xor

Rating:

# Probably Really Nice Goodies from Santa

Just reading the title, we can see a certain acronym looming: PRNG, or Pseudo-Random Number Generator. Inside the provided .zip file are two files, flag.enc and Probably Really Nice Goodies from Santa.py. The python file is relatively concise:

import os

class PRNG():
def __init__(self):
self.seed = self.getseed()
self.iv = int(bin(self.seed)[2:].zfill(64)[0:32], 2)
self.key = int(bin(self.seed)[2:].zfill(64)[32:64], 2)
self.aux = 0

def parity(self,x):
x ^= x >> 16
x ^= x >> 8
x ^= x>> 4
x ^= x>> 2
x ^= x>> 1
return x & 1

def getseed(self):
return int(os.urandom(12).encode('hex'), 16)

def LFSR(self):
return self.iv >> 1 | (self.parity(self.iv&self.key) << 32)

def next(self):
self.aux, self.iv = self.iv, self.LFSR()

def next_byte(self):
self.next()
x ^= x >> 16
x ^= x >> 8
return (x & 255)

def encrypt(s):
o=''
for x in s:
o += chr(ord(x) ^ p.next_byte())
return o.encode('hex')

p=PRNG()

with open('flag.enc','w') as f:
f.write(encrypt(flag))

There are three basic parts to this program: the PRNG class which seems to involve a lot of bitwise arithmetic, the encrypt method which seems to encrypt a string, and the bits of code that define the flag and that uses these definitions. Let's start at the bottom and work our way up.
There are three basic parts to this program: the PRNG class which seems to involve a lot of bitwise arithmetic, the encrypt method which seems to encrypt a string, and the bits of code that define the flag and that uses these definitions. Let's start at the bottom and work our way up.

...

p=PRNG()

with open('flag.enc','w') as f:
f.write(encrypt(flag))

flag is a string variable which is simply the contents of some flag.txt. We don't have flag.txt, so it's up to us to reverse engineer the value in the flag variable. p is clearly an instance of PRNG, presumable a Pseudo-Random Number Generator. The file flag.enc is opened, then the encrypt(s) is called with the flag and written to flag.enc. If we can find a way reverse the encrypt function, we can apply it to the flag.enc file we were provided to find the original key.

def encrypt(s):
o=''
for x in s:
o += chr(ord(x) ^ p.next_byte())
return o.encode('hex')

This encrypt method is pretty concise. It just loops through every character x in a string, calls p.next_byte(), and XORs them together. The XOR function (expressed in Python as the ^ operator) is reversible: if we apply encrypt(s) to the ciphertext, we will get the original message back. All that remains is to restore the original state of the PRNG.

class PRNG():
def __init__(self):
self.seed = self.getseed()
self.iv = int(bin(self.seed)[2:].zfill(64)[0:32], 2)
self.key = int(bin(self.seed)[2:].zfill(64)[32:64], 2)
self.aux = 0

def parity(self,x):
x ^= x >> 16
x ^= x >> 8
x ^= x>> 4
x ^= x>> 2
x ^= x>> 1
return x & 1

def getseed(self):
return int(os.urandom(12).encode('hex'), 16)

def LFSR(self):
return self.iv >> 1 | (self.parity(self.iv&self.key) << 32)

def next(self):
self.aux, self.iv = self.iv, self.LFSR()

def next_byte(self):
self.next()
x ^= x >> 16
x ^= x >> 8
return (x & 255)

This is the heart of the challenge. The first line of the constructor sets self.seed = self.getseed(), which uses os.urandom(12) to generate a random 96 bit integer. The constructor then unpacks self.seed into 3 32 bit chunks, which are put into self.iv, self.key, and self.mask. To my knowledge, there is no way for us to guess the random number provided by os.urandom, which uses the Operating System's random number generator. This means we have no information whatsoever about the initial values of the IV (Initial Value), KEY, and MASK. As far as I can tell, self.aux is irrelevant.

Let's continue reading with next_byte(). First x is set to the XOR of IV and Mask. self.next() is then called, which presumably changes the state of the PRNG so that the next call to next_byte() returns a different value. The last few lines basically XOR different bits of X with itself, like this example with a random x:

x is a 32 bit integer

1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 1 1
| 8 bits | 8 bits | 8 bits | 8 bits |

x ^= x >> 16
which is the same as:
x = x ^ (x >> 16)

x
1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 1 1
| | | | |
x >> 16
| | 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0
| | | | |
x ^ x >> 16
1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 1 0 0 1 1 1 0 0 1
| irrelevant | irrelevant | | |

x ^= x >> 8
x
1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 1 0 0 1 1 1 0 0 1
| irrelevant | irrelevant | | |
x >> 8
1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 1
| | irrelevant | irrelevant | |
x ^ x >> 8
1 1 0 1 1 1 0 0 1 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0
| irrelevant | irrelevant | irrelevant | |

I have been marking bytes as irrelevant, because the last step is to AND x with 255, which has a convenient value in binary:

return x & 255
x
1 1 0 1 1 1 0 0 1 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0
| | | | |
255
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
| | | | |
x & 255
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0
| | | | |

Essentially, x is folded into itself to reduce it from a 32 bit int to an 8 bit byte. I explicitly went through this process because it will be done again in parity(x).

OK, back to the next() function:

def parity(self,x):
x ^= x >> 16
x ^= x >> 8
x ^= x>> 4
x ^= x>> 2
x ^= x>> 1
return x & 1

...

def LFSR(self):
return self.iv >> 1 | (self.parity(self.iv&self.key) << 32)

def next(self):
self.aux, self.iv = self.iv, self.LFSR()

next() simply sets self.iv to LFSR() (and also sets self.aux, but this appears to be irrelevant). LFSR() ([Linear-Feedback Shift Register](https://en.wikipedia.org/wiki/Linear-feedback_shift_register)) does some interesting bitwise arithmetic. parity(x) is used to employ a similar folding algorithm to the one used above to collapse (IV AND KEY) into a single bit. IV is then shifted to the right one bit, then that parity bit is inserted on the far left of the IV. The resulting value is the new IV.

PRNG has a lot of technical things going on, so lets sum up what we know about it. Only the IV ever actually changes; KEY and MASK both remain constant. Mask XOR IV is used to garble up the IV before it is used as the next random byte. KEY AND IV is used to garble up the IV before the parity bit is calculated, and IV is shifted to the right by one bit to make room for the parity bit.

This is pretty difficult. We need to determine three different random 32 bit integers before we have any chance of decrypting our flag. Once we have the initial state of the PRNG, we will be set. Unfortunately, the only information we have about the initial state is the encrypted flag:

ab38abdef046216128f8ea76ccfcd38a4a8649802e95f817a2fc945dc04a966d502ef1e31d0a2d

Fortunately, we do know one more thing: the first six characters of the flag. All X-Mas CTF flags are formatted like this: X-MAS{...} (which can be hex encoded into 582d4d41537b...7d). Considering only the very first byte, before any of the shifting occurs, we know that:

0x58 XOR collapse_to_8_bits(IV XOR MASK) = 0xab

Because of how XOR's algebraic properties:

X XOR X = 0
X XOR 0 = X
X XOR Y = Y XOR X

We also know that:

collapse_to_8_bits(IV XOR MASK) = 0x58 XOR 0xab = 0xf3

I am going to handwave my hands around some algebra and assert that:

This allows us to reduce the 32 bit value of MASK to an 8 bit value, MASK8: MASK8 = collapse_to_8_bits(MASK). This ultimately gives us:

It is now sufficient to figure out what MASK8 is instead of the full value of MASK. Let's start by XORing the first 6 bytes of the ciphertext with the first 6 bytes of the flag to get the first 6 bytes produced by the PRNG:

ab38abdef046 ^ 582d4d41537b = f315e69fa33d

Now let's write out those 6 bytes in a table:

f3 1 1 1 1 0 0 1 1 collapse_to_8_bits(IV) XOR MASK8
15 0 0 0 1 0 1 0 1 collapse_to_8_bits(P1 + IV >> 1) XOR MASK8
e6 1 1 1 0 0 1 1 0 collapse_to_8_bits(P2 + P1 + IV >> 2) XOR MASK8
9f 1 0 0 1 1 1 1 1 collapse_to_8_bits(P3 + P2 + P1 + IV >> 3) XOR MASK8
a3 1 0 1 0 0 0 1 1 collapse_to_8_bits(P4 + P3 + P2 + P1 + IV >> 4) XOR MASK8
3d 0 0 1 1 1 1 0 1 collapse_to_8_bits(P5 + P4 + P3 + P2 + P1 + IV >> 5) XOR MASK8

For now we're going to do our best to ignore the parity bits. Because of how X is folded, each new parity bit and IV shift only invalidates the leftmost bit in the table. All the others are strictly dependent on the previous values of IV. I will remove these corrupted bits and add some ASCII formatting to indicate how the bits of IV propogate to the right:

f3 1 1 1 1 0 0 1 1 IV8 XOR MASK8
\ \ \ \ \ \ \
15 0 0 1 0 1 0 1 IV8 >> 1 XOR MASK8
\ \ \ \ \ \
e6 1 0 0 1 1 0 IV8 >> 2 XOR MASK8
\ \ \ \ \
9f 1 1 1 1 1 IV8 >> 3 XOR MASK8
\ \ \ \
a3 0 0 1 1 IV8 >> 4 XOR MASK8
\ \ \
3d 1 0 1 IV8 >> 5 XOR MASK8

Note that the same MASK8 is applied in each row. The only differentiating factor is how far the IV8 (collapse_to_8_bits(IV)) was shifted to the right. This means that each diagonal line represents a single bit of the IV8, while each column is a bit of the MASK8. The value in each cell is IV XOR MASK.
We will make an inital assumption that the rightmost (least significant) bit of the MASK8 is 1 (MASK8 = 1). Looking at the second row in the above table, we can see that IV8 ^ MASK8 = 1, which means IV8 = 0. Now going back to the first row in the table, we can see that that IV8 ^ MASK8 = 1, which means MASK8 = 1. We can continue this process back to determine the complete values of both IV8 and MASK8:

b7 1 0 1 1 0 1 1 1 MASK8

44 0 1 0 0 0 1 0 0 IV8

f3 1 1 1 1 0 0 1 1 IV8 XOR MASK8
\ \ \ \ \ \ \
15 0 0 1 0 1 0 1 IV8 >> 1 XOR MASK8

That's awesome! We now know all of the mask that matters, and we have enough of the Initial Value to decrypt the first few values of the flag. However, what about the assumption that the first bit of the mask was 1? What happens if it's 0? We can go ahead and try that:

48 0 1 0 0 1 0 0 0 MASK8

bb 1 0 1 1 1 0 1 1 IV8

f3 1 1 1 1 0 0 1 1 IV8 XOR MASK8
\ \ \ \ \ \ \
15 0 0 1 0 1 0 1 IV8 >> 1 XOR MASK8

It's simply the inverse of the other values we found. Interestingly enough, because the IV and the mask are being XOR'd together, we can safely invert them and still get the same results. We will go with the first result from here one out, but it is just as valid to assume that the first bit is 0.

Now, how to determine the value of KEY so we can calculate the parity bits? In short, I don't think it's possible. We know what IV8 is, but we need determine the original IV to combine it properly with they KEY. There are 8 different combinations of bits IV that shuffle into each bit of IV8, which means there are 64 different possible IVs. For each of these IVs, 31 bits in the KEY are completely arbitrary; we can choose the last bit in the KEY such that parity(IV & KEY) = known_value. And we only know 6 parity values, one for each time we applied next() and shifted the IV to the right. There are simply too many possibilities and too little information to solve this in a tractable way.

Fortunately, we can actually determine the parity bit at each phase without knowing the KEY. The parity bit is always inserted at the highest bit of the IV, so only the highest bit of IV8 is affected. Interestingly, ASCII characters all have decimal values between 32 and 127, or 00100000 and 01111111. This means that the highest bit in each character of the flag is ALWAYS 0. Therefore, for every byte, we have:

PARITY_BIT ^ 0 = CIPHERTEXT
PARITY_BIT = CIPHERTEXT

The only reason we needed to know the real value of IV is to calculate the parity using KEY. Now that we have this shortcut, we can just use the 8-bit byte values for IV and MASK. We will need to change the algorithm so that IV is not collapsed at all. We will also modify PRNG to accept the next byte of ciphertext when calculating next_byte(), and modify parity(x) to this:

def parity(self, c):
return (c >> 7) & 1

My final solution code is included in goodies/goodies_solve.py. Running that will give you the flag:

X-MAS{S4n7a_4lw4ys_g1ve5_n1c3_pr3s3n7s}

Original writeup (https://github.com/dchiquit/xmasctf-2018/tree/master).