Difficulty Level: TSA-approved lock
The Setup: You know a thing or two about encryption. You’ve read about the One Time Pad. You know that 256-bit keys are plenty long enough – in fact, you’ve even seen some crypto nerds argue that even 128 bits is plenty.
The Algorithm: You generate a 256-bit key with a cryptographically strong random number generator. You encrypt messages by splitting them into 32-byte chunks (that’s 256 bits), and XORing them with the key. Decrypting works the same way.
It’s fast. The output looks pretty indecipherable. You ship it.
Problem 1: Patterns in the ciphertext
Let’s suppose this algorithm is used to encrypt bitmap images, including a picture of this handsome guy:
After encryption, most any image viewer will be unable to open the file. In other words, the encryption works great! Well, not so fast. The BMP file format has certain requirements for the first few bytes of the file (e.g. it must start with “BM”, otherwise it’s not a valid BMP file). With a few tweaks to the beginning of the encrypted file, we get a very different picture:
The encrypted file still has patterns left over from the original image, so the encryption didn’t really work.
You might be wondering why this happened. The problem is that each 256-bit block of the file is encrypted using the same process. If two blocks are the same in the original file, they will also be the same in the encrypted file. Modern encryption algorithms (like AES-GCM) take special care to make sure that each block is encrypted differently.
Problem 2: It’s all broken anyway
That’s not good enough, though. Let’s step up our game.
We know that a BMP file must start with ‘B’ (0x42 in hex), regardless of the image. We also know that this byte will be encrypted using the first byte of the key. This means that we can write an equation for the first byte of the encrypted file:
first_ciphertext_byte = first_key_byte XOR 'B'
With a bit of algebra, we get:
first_ciphertext_byte XOR 'B' = (first_key_byte XOR 'B') XOR 'B'
first_ciphertext_byte XOR 'B' = first_key_byte
In other words, if we take the first byte of the encrypted file and XOR it with ‘B’, we get the first byte of the encryption key! This is disastrous, since we can use this knowledge to decrypt the first byte of every block in the file.
Of course, we can do the same thing with the next byte (‘M’), which gives us the second byte of the key. Only 30 more bytes to go. The next 4 bytes are a size field. Encryption doesn’t hide a file’s size, so we can easily figure out what these bytes are. We can continue like this until we’ve recovered all or most of the key, at which point we can crack the rest using a for loop.
Want to try this yourself? Using this algorithm, I encrypted an image with a random 16-byte (128-bit) key, which you can download here. Open it in a hex editor, and follow along with a BMP file format reference to figure out the key.