Tags: crypto padding-oracle 

Rating:

This problem is a basic CBC padding oracle attack, as described by Vaudenay in EUROCRYPT 2002, famously used against ASP.NET in 2010, and more recently against Steam in 2016. The problem occurs when a system provides a means for any encrypted data to be decrypted, when:

  1.  The algorithm is a block cipher such as AES 
  2.  That cipher is being used in Cipher Block Chaining (CBC) mode
  3.  The system indicates whether the decrypted data is correctly or incorrectly padded

Block ciphers encrypt data in chunks called "blocks". The size of any block is dependent on the algorithm; for instance, AES has a 16-byte block size. Block ciphers can't encrypt or decrypt any data that isn't exactly the block size of the cipher. In order to encrypt and decrypt larger and smaller messages, padding and a block cipher mode are needed. A block cipher mode is a method of dealing with multiple blocks, and padding fills in incomplete blocks in such a way that it can be removed without accidentally removing some plaintext. One of the more common block cipher modes is called CBC mode.

In CBC mode, flipping any bit from 1 to 0 or vice versa in any block of ciphertext before decrypting effectively randomizes the decryption of that block. However, it also causes the bit in the same position in the next block to be flipped upon decryption! This is so because the decryption of any block in CBC mode is done in two steps: In the first, the block is decrypted using the block cipher algorithm (in this case, AES) and then XORed with the ciphertext of the previous block. Knowing this, if we are willing and able to sacrifice the meaning of any block, we can reliably flip the bits of the next block. It stands to reason, then, that if we were to change any byte in a block of ciphertext to all of its 256 possible values and then decrypt each version, the corresponding byte in the next block would also be changed to each of its 256 possible values upon decryption. It also stands to reason that if we know the value of the plaintext byte we're changing through modifications to the ciphertext at any point, we can change it to whatever value we want by figuring out which bits to flip from its current value to get the value we want and flipping them in the ciphertext byte being modified. We can also determine the original plaintext byte from the known plaintext byte by applying the same bit flips we made to the ciphertext byte to the known plaintext byte: naturally, this will reverse the changes we made to the byte to get it to our known byte value.

So, we know we can make arbitrary changes to bits, and learn the original value for any byte if we can learn its state through modification. How, then, does a padding oracle give us these things? In PKCS#7 padding, the type of padding used in this challenge, the ASCII value of the bytes we add to the end of the data to get a complete block is equal to the number of bytes needed. If we need only one byte, we pad with \x01. If we need 6, we pad with \x06\x06\x06\x06\x06\x06, and so on, up to \x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10 (0x10 = decimal 16), a full block of padding, in the case that we already have a plaintext that splits evenly into blocks. We know, then, that changing the last byte of any message whose length is a multiple of the block size to \x01 results in a correctly padded message under PKCS#7. If we have a padding oracle, we can change the last byte of the last block to every possible byte value using the technique described above and learn which messages have valid padding. If the original value of the byte we're changing is \x01, none of our edited versions should produce a message with valid padding, except in the case that the original message ends with something like \x02\x01 or \x03\x03\x01. We can rule out this case by changing the second to last byte to confirm that the last byte is \x01. Otherwise, there should be exactly one message besides the original (if the original is padded correctly) which is padded correctly. We can then learn the original value of the last byte by calculating \x01 XOR Cn XOR Cn' where Cn is the value of the ciphertext byte we're flipping before our changes and Cn' is the altered version. By cutting and pasting any block from any ciphertext onto the end of the ciphertext we send to the padding oracle, we can learn the last byte of any block.

But what about the other bytes? Remember that when we know the value of a byte in any position, we can change it to any other value. Let's say, then, that we've already learned the last byte of a block. We can set the value of the last byte to \x02 and then repeat the same process as before, changing the second to last byte instead of the last, iterating through all possible byte values until the padding is correct. which should only happen once the value of the second to last byte is \x02. This allows us to learn the second to last byte of any block. We then set the last two bytes to \x03\x03 and flip the third to last byte, and so on and so forth until we have learned the original values of the entire block. We can now cut and paste each block of a message onto the end of a ciphertext, and learn its decrypted value as described above.

There's only one thing missing: What about the first block of a message? There's no previous block of ciphertext to XOR with! This is where what's known as an IV, or Initialization Vector, is used. In the case that we have control of the IV, we can modify that to change the bytes of the first block. What's really great about this is that changing the IV essentially results in free bit flips, since the IV is never decrypted (it isn't ciphertext, after all) and as such no block randomization occurs. If we know or can guess the IV, we can calculate the original value of the first block. As it turns out, the IV is appended to the front of the ciphertext for the ciphertext we're given as part of the challenge. Using the IV, we can recover the first block. Assuming that the "matrix ID" we're given is the encrypted flag, we set to work writing a solver to decrypt the flag.

The Cryptanalib library included with FeatherDuster (https://github.com/nccgroup/featherduster) has functionality for exploiting this type of bug, and there is a FeatherDuster module which builds a basic padding oracle attack script for use in attacking such vulnerable systems as described above. The effort needed above and beyond what FeatherDuster provides is to write a function which interacts with the oracle, providing it ciphertext in the way it prefers and returning True to indicate good padding or False to indicate bad padding. There are a few minor gotchas, namely that without the header Content-Type: application/x-www-form-urlencoded the application will not recognize the POST body correctly and if the Base64 string contains any unencoded plus signs they will be interpreted as spaces and Base64 decoding will fail. The following is a Python script which uses python-requests and Cryptanalib to exploit the padding oracle flaw and return the decrypted data:

<p>
<span># Generated by FeatherDuster
import cryptanalib as ca
import requests
import string
import sys

if len(sys.argv) != 2:
   print '[*] Padding oracle attack script'
   print '[*] Usage: %s <span>' % sys.argv[0]
   exit()

def padding_oracle(ciphertext):
   ciphertext = ciphertext.encode('base64')
   ciphertext = string.replace(ciphertext,'\n','')
   ciphertext = string.replace(ciphertext,'+','%2B')
   ciphertext = string.replace(ciphertext,'=','%3D')

   headers = {'Content-Type': 'application/x-www-form-urlencoded'}
   r = requests.post('http://crypto.chal.csaw.io:8001/', data='matrix-id='+ciphertext, headers=headers)

   if r.content.find('Caught exception during AES') != -1:
      return False
   else:
      return True

ca.padding_oracle_decrypt(padding_oracle=padding_oracle, ciphertext=sys.argv[1].decode('hex'), block_size=16, padding_type='pkcs7', iv='00000000000000000000000000000000'.decode('hex'), verbose=True, hollywood=True)
</p>

Upon decryption with the padding oracle, the flag (flag{what_if_i_told_you_you_solved_the_challenge}) is returned. You can speed this up very slightly by setting hollywood=False to disable the hollywood hacker movie style output, but what fun is that? :) </span></span>