Rating: 5.0

> Written by **lighthouse64**
>
> Your friend was trying to send you the stuff that he did for your group project, but the data mysteriously got [corrupted](https://static.tjctf.org/dc3b12c1155ea3c09800ebec00f8d31498af64fe6f76e0c75a552ff2c75cc762_corrupted_project). He said his computer got infected before he was able to send it. Regardless of what happened, you need to fix it or you won't be able to complete the project.

This chall is actually pretty simple but the zip archive's messed up structure works kind of as a misdirection. The zip structure tells where the archived file starts and ends.

Here is a very nice zip file reference documentation:
https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html

The bytes in the range 001A:001D store the archived filename's length and some extra fields after it:

00000000: 2e4b 0304 1e03 8200 3f12 339f b250 ea54 .K......?.3..P.T
00000010: 882e 7d0a 0000 f40a 0000 1100 1c00 7072 ..}...........pr
00000020: 6f6a 6563 742d 6669 6c65 732e 7a69 7055 oject-files.zipU
00000030: 5409 0003 f220 c35e ea24 c35e 7578 0b00 T.... .^.$.^ux..
00000040: 0104 e803 0000 04e8 0300 0042 5a65 390c ...........BZe9.

filename length: 0x0011 = 17 bytes
extra field length: 0x001C = 28 bytes

So the archived file starts at 004B:

0000004b: 425a 6539 0c2a 12f9 7be9 4d9e 3789 0001 BZe9.*..{.M.7...

... and it's a bzip2 archive file header. We have it's beginning now let's find it's end.

A zip archive has two structures at it's end, the Central Directory Record (CDR) and the CDR End. Just before the CDR is the end of the archive's packed data. A zip file signature and it's internal headers all start with 0x504B ("PK") and has 4 bytes in total, being:

0x504B0304: local file header (one for each file in the archive)
0x504B0102: CDR
0x504B0506: CDR End

As there's only one local file header things get easier. We get all the bytes from 004B (75 bytes) to the ones before 0x504B0102, which is at 0x0AC7.

$ dd bs=1 skip=75 count=2759 if=dc3b12c1155ea3c09800ebec00f8d31498af64fe6f76e0c75a552ff2c75cc762_corrupted_project of=project-files.bz2

The bzip2 documentation is very scarce and our best shot is in the Wikipedia:
https://en.wikipedia.org/wiki/Bzip2#File_format

This is what matters here:

.magic:16 = 'BZ' signature/magic number
.version:8 = 'h' for Bzip2 ('H'uffman coding), '0' for Bzip1 (deprecated)
.hundred_k_blocksize:8 = '1'..'9' block-size 100 kB-900 kB (uncompressed)
.compressed_magic:48 = 0x314159265359 (BCD (pi))
.crc:32 = checksum for this block

Going each header part at a time, without touching CRC complicated stuff, it goes from this:

00000000: 425a 6539 0c2a 12f9 7be9 4d9e 3789 0001 BZe9.*..{.M.7...

to this:

00000000: 425a 6839 3141 5926 5359 4d9e 3789 0001 BZh91AY&SYM.7...

In challenges like this it's very useful if you create a file yourself to check how a valid file is structured and it's values.

Ok, so now we can open the project-files.bz2 archive, extract the `project-files` file and extract its content too. It is 3 PNG files, one of which (project-notes-1.png) is messed up. When this file is open in a hex editor, we get a cry for help:

$ xxd project-notes-1.png | head
00000000: 4845 4c50 0d0a 1a0a 0000 000d 4948 4452 HELP........IHDR

... and we replace it with the correct 0x89504E47 value:

$ xxd project-notes-1.png | head
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452 .PNG........IHDR

Here's the flag:
> tjctf{plz_dont_procrastinate}

Original writeup (https://github.com/0x8Layer/CTF-Writeups/blob/master/TJCTF/corrupted%20project.md).