# Obscure File Format

We get an archive with three files: `a`, `k` and `l`. Running `file`
to see what they are doesn't tell us anything about the first two, but
`l` is a Python script:

```
$ file a k l
a: data
k: data
l: Python script, ASCII text executable, with very long lines
```

If we open `l`, we will see something like the following:

```
#! /usr/bin/env python3
import zlib;
exec(zlib.decompress(b"lots of data"))
```

If we replace `exec` with `print` to see what the script executes, we
get another wrapper of the same shape:

```
#! /usr/bin/env python3
import binascii
exec(binascii.unhexlify(b"even more data"))
```

Again, we replace `exec` with `print`, and unsurprisingly run into a
similar-looking script... Several rounds of this later, we get the
actual code -- an obfuscated Python script (see
[here](https://de298.user.srcf.net/writeups/insa/obscure-obfuscated.py)).
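Doing the `exec`-to-`print` swap by hand gets tedious, so the rounds can be automated. The following is a minimal sketch rather than what I used during the CTF: the name `peel` is made up, and it assumes every layer consists only of imports followed by a single `exec(<expression>)`. It still evaluates expressions taken from the file, so it should only be run on input you are prepared to execute.

```python
import ast
import binascii
import zlib


def peel(source):
    """Strip nested `exec(<expr>)` wrappers and return the innermost source."""
    while True:
        tree = ast.parse(source)
        *prelude, last = tree.body or [None]
        # A wrapper layer is: zero or more imports, then exec(<one argument>).
        is_wrapper = (
            isinstance(last, ast.Expr)
            and isinstance(last.value, ast.Call)
            and isinstance(last.value.func, ast.Name)
            and last.value.func.id == "exec"
            and len(last.value.args) == 1
            and all(isinstance(s, (ast.Import, ast.ImportFrom)) for s in prelude)
        )
        if not is_wrapper:
            return source
        namespace = {}
        # Run only the imports, then evaluate (not execute) exec's argument.
        imports = ast.Module(body=prelude, type_ignores=[])
        exec(compile(imports, "<layer>", "exec"), namespace)
        argument = ast.Expression(body=last.value.args[0])
        payload = eval(compile(argument, "<layer>", "eval"), namespace)
        source = payload.decode() if isinstance(payload, bytes) else payload


# Synthetic two-layer example shaped like the wrappers shown above.
inner = "print('the real program')\n"
layer = "import binascii\nexec(binascii.unhexlify(%r))\n" % binascii.hexlify(
    inner.encode()
)
outer = "#! /usr/bin/env python3\nimport zlib;\nexec(zlib.decompress(%r))\n" % (
    zlib.compress(layer.encode())
)
recovered = peel(outer)
```

The loop stops as soon as the last statement is no longer a bare `exec` call, so the final program is returned as text and never run.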

After replacing all the obfuscated imports and function aliases with
the original names (pick your favourite automation method here --
Emacs macros worked well for me), we are left with a slightly clearer,
but not yet readable file. There isn't much to do except read the
file, guess what the variables are meant to be from their usage and
deobfuscate it slowly. My deobfuscated version can be found
[here](https://de298.user.srcf.net/writeups/insa/obscure-clean.py) --
some names probably don't match what the author intended, but it's
enough to get an idea of what the program does.

The program is an archiver that does the following:

* reads all files under a directory,
* encrypts each of them and saves the keys to a keystore file, and
* writes the encrypted data to an archive file.

The two files can be recognised based on their headers: `L0LKSTR\0`
for the keystore, and `L0LARCH\0` for the archive. We can now
recognise that the other two files we have are the keystore (`k`) and
archive (`a`).
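The header check is easy to script. This is a small sketch using the two magic values above; the function name `identify` and the labels it returns are my own choices.

```python
# Magic values from the deobfuscated archiver; each is 8 bytes long.
MAGICS = {
    b"L0LKSTR\0": "keystore",
    b"L0LARCH\0": "archive",
}


def identify(data):
    """Classify a file by its first 8 bytes."""
    return MAGICS.get(bytes(data[:8]), "unknown")


# Typical use on the challenge files:
#   with open("k", "rb") as f:
#       kind = identify(f.read(8))
```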

All that remains is to reconstruct the encryption algorithm and file
formats, parse them and extract the files. The encryption key consists
of an AES key and IV, and a permutation `p`. The algorithm is:

* pad the data and compress (using `zlib`),
* pad and encrypt using AES in CBC mode, and
* divide the output into 128-byte blocks and shuffle them according to
  the permutation.
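The block shuffle is the only non-standard step, so here is a sketch of it and its inverse. It rests on assumptions: the permutation is a list of (from, to) pairs, both values are block indices rather than byte offsets, and the ciphertext length is a multiple of 128. The function names are hypothetical. The AES-CBC decryption and `zlib` decompression that come after unshuffling are left out because AES needs a third-party library.

```python
BLOCK = 128  # block size used by the shuffle step


def shuffle(data, pairs, block=BLOCK):
    """Move block `src` of the input to block `dst` of the output."""
    if len(data) % block:
        raise ValueError("length is not a multiple of the block size")
    out = bytearray(len(data))
    for src, dst in pairs:
        out[dst * block:(dst + 1) * block] = data[src * block:(src + 1) * block]
    return bytes(out)


def unshuffle(data, pairs, block=BLOCK):
    """Invert shuffle(): move block `dst` of the input back to block `src`."""
    if len(data) % block:
        raise ValueError("length is not a multiple of the block size")
    out = bytearray(len(data))
    for src, dst in pairs:
        out[src * block:(src + 1) * block] = data[dst * block:(dst + 1) * block]
    return bytes(out)
```

If the decrypted output looks like noise, the direction of the pairs is the first thing to check: swapping `shuffle` and `unshuffle` covers the opposite interpretation.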

The keystore format is:

* header (`L0LKSTR\0`),
* number of entries (32-bit integer),
* for each entry:
    * UUID (16 bytes),
    * key and IV (16 bytes each), in reversed byte order,
    * length of the permutation `p` (32-bit), and
    * list of pairs (from, to), showing which index is moved to which
      offset in the output.
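A parser for this layout could look roughly like the following. It is a sketch, not my CTF parser, and it guesses several details the list above leaves open: integers are taken to be little-endian, the permutation length is taken to count pairs, and each pair element is taken to be a 32-bit integer. `parse_keystore` is a made-up name.

```python
import struct

KEYSTORE_MAGIC = b"L0LKSTR\0"


def parse_keystore(buf):
    """Return {uuid_bytes: (key, iv, pairs)} from a keystore blob."""
    if buf[:8] != KEYSTORE_MAGIC:
        raise ValueError("not a keystore")
    pos = 8
    (count,) = struct.unpack_from("<I", buf, pos)  # assumed little-endian
    pos += 4
    entries = {}
    for _ in range(count):
        file_id = buf[pos:pos + 16]
        pos += 16
        # Key and IV are stored byte-reversed, so flip them back.
        key = buf[pos:pos + 16][::-1]
        pos += 16
        iv = buf[pos:pos + 16][::-1]
        pos += 16
        (n_pairs,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        # Assumption: 2 * n_pairs 32-bit integers, stored as from, to, from, to...
        flat = struct.unpack_from("<%dI" % (2 * n_pairs), buf, pos)
        pos += 8 * n_pairs
        entries[file_id] = (key, iv, list(zip(flat[0::2], flat[1::2])))
    return entries


# Synthetic one-entry keystore for a quick sanity check.
sample_keystore = (
    KEYSTORE_MAGIC
    + struct.pack("<I", 1)
    + bytes(range(16))               # UUID
    + bytes(range(16, 32))[::-1]     # key, reversed on disk
    + bytes(range(32, 48))[::-1]     # IV, reversed on disk
    + struct.pack("<I", 2)
    + struct.pack("<4I", 0, 1, 1, 0)
)
parsed_keystore = parse_keystore(sample_keystore)
```

Keying the result by UUID makes the later lookup from archive entries a plain dictionary access.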

The archive format is similar:

* header (`L0LARCH\0`),
* number of entries,
* for each entry:
    * length of path,
    * path (null-terminated string),
    * UUID,
    * metadata (),
    * length of encrypted data, and
    * encrypted data.
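The archive can be walked the same way. Again this is only a sketch with loudly stated guesses: all counts and lengths are assumed to be 32-bit little-endian integers, the path length is assumed to include the terminating NUL, and the UUID is assumed to be 16 bytes as in the keystore. The size of the metadata field is not given above, so it is passed in as a hypothetical `meta_size` parameter and has to be read off the deobfuscated source.

```python
import struct

ARCHIVE_MAGIC = b"L0LARCH\0"


def parse_archive(buf, meta_size):
    """Return a list of (path, uuid_bytes, metadata, encrypted_data)."""
    if buf[:8] != ARCHIVE_MAGIC:
        raise ValueError("not an archive")
    pos = 8
    (count,) = struct.unpack_from("<I", buf, pos)  # assumed 32-bit LE
    pos += 4
    entries = []
    for _ in range(count):
        (path_len,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        # Assumption: path_len counts the trailing NUL, which is stripped here.
        path = buf[pos:pos + path_len].rstrip(b"\0").decode()
        pos += path_len
        file_id = buf[pos:pos + 16]
        pos += 16
        metadata = buf[pos:pos + meta_size]  # opaque; size supplied by caller
        pos += meta_size
        (data_len,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        data = buf[pos:pos + data_len]
        pos += data_len
        entries.append((path, file_id, metadata, data))
    return entries


# Synthetic one-entry archive with a 4-byte placeholder metadata field.
sample_path = b"dir/flag.txt\0"
sample_archive = (
    ARCHIVE_MAGIC
    + struct.pack("<I", 1)
    + struct.pack("<I", len(sample_path))
    + sample_path
    + bytes(range(16))
    + b"META"
    + struct.pack("<I", 5)
    + b"\x01\x02\x03\x04\x05"
)
parsed_archive = parse_archive(sample_archive, meta_size=4)
```

Each entry's UUID is the lookup key into the parsed keystore, which yields the key, IV and permutation needed to decrypt that entry's data.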

After parsing the two files, we can use the UUIDs to match the files
with keys, decrypt and output them (my parser is
[here](https://de298.user.srcf.net/writeups/insa/obscure-decode.py)^[Note:
written during the CTF -- not exactly clean code.]). We get several
"filler" files and a file containing the flag.