Tags: crypto 

Rating:

# Tale of Two Cities

-----

Found the original file on Project Gutenberg: https://www.gutenberg.org/files/98/98-0.txt

Time to use a diff tool to see where the challenge file differs from the original: https://www.diffchecker.com/
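The same comparison can be scripted. Below is a minimal sketch using Python's standard `difflib`; the helper name `inserted_chars` and the two sample strings are made up for illustration, and it assumes the challenge file only adds characters to the original text.

```python
import difflib


def inserted_chars(original: str, modified: str) -> str:
    """Collect every run of characters that appears only in `modified`."""
    matcher = difflib.SequenceMatcher(None, original, modified, autojunk=False)
    extra = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both mean modified[j1:j2] is new text.
        if tag in ("insert", "replace"):
            extra.append(modified[j1:j2])
    return "".join(extra)


# Toy example: two CJK symbols hidden inside an otherwise identical sentence.
original = "It was the best of times"
modified = "It was \u343ethe best\u343b of times"
print(inserted_chars(original, modified))
```

On a file the size of a whole book, a character-level `SequenceMatcher` is likely to be slow, so comparing line by line (or just using the web tool) is probably more practical.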

After diffing the two files, we end up with the following string:

㐾�㐻㐌㐟㐀㐏㑖㐄㐓㐀㐴㐀㐄㐻㐉㐴㐷㐻㐾㐇㑎㑟Offset: 0x3400

The "Offset: 0x3400" refers to Unicode code point U+3400, the start of the CJK Unified Ideographs Extension A block where these symbols are located:
https://unicode-table.com/en/#cjk-compatibility

Remove "�" and "Offset: 0x3400" to get:
㐾㐻㐌㐟㐀㐏㑖㐄㐓㐀㐴㐀㐄㐻㐉㐴㐷㐻㐾㐇㑎㑟

Let's look at the UTF-8 bytes of these characters, in hex:
E3 90 BE E3 90 BB E3 90 8C E3 90 9F E3 90 80 E3 90 8F E3 91 96 E3 90 84 E3 90 93 E3 90 80 E3 90 B4 E3 90 80 E3 90 84 E3 90 BB E3 90 89 E3 90 B4 E3 90 B7 E3 90 BB E3 90 BE E3 90 87 E3 91 8E E3 91 9F
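For reference, a hex dump like this can be produced in a couple of lines of Python. This is only a sketch: `utf8_hex` is a hypothetical helper name, and the sample covers just the first three symbols, written as escapes so the snippet does not depend on the editor's encoding.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of `text` as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


# First three symbols of the extracted string: U+343E, U+343B, U+340C.
print(utf8_hex("\u343e\u343b\u340c"))  # E3 90 BE E3 90 BB E3 90 8C
```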

The bytes look very repetitive, so let's treat each group of 3 bytes as one number and subtract E3 90 00 from it:
BE BB 8C 9F 80 8F 196 84 93 80 B4 80 84 BB 89 B4 B7 BB BE 87 18E 19F

Now let's convert to decimal:
190 187 140 159 128 143 406 132 147 128 180 128 132 187 137 180 183 187 190 135 398 415

Let's take the lowest value (128) and subtract it from all of the numbers:
62 59 12 31 0 15 278 4 19 0 52 0 4 59 9 52 55 59 62 7 270 287
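As a cross-check, the numbers can also be derived from code points instead of raw bytes. The sketch below decodes the hex dump above as UTF-8 and subtracts the 0x3400 offset from each code point. The small values come out the same, but the three large ones become 86, 78 and 95 rather than 278, 270 and 287: UTF-8 continuation bytes carry only six payload bits each, so reading the three bytes as one integer overstates any symbol whose second byte is 91. The guessing approach that follows is not affected, since it only relies on the relative order of the numbers.

```python
# UTF-8 bytes of the 22 extracted symbols, copied from the hex dump above.
HEX_BYTES = (
    "E3 90 BE E3 90 BB E3 90 8C E3 90 9F E3 90 80 E3 90 8F E3 91 96 "
    "E3 90 84 E3 90 93 E3 90 80 E3 90 B4 E3 90 80 E3 90 84 E3 90 BB "
    "E3 90 89 E3 90 B4 E3 90 B7 E3 90 BB E3 90 BE E3 90 87 E3 91 8E "
    "E3 91 9F"
)
OFFSET = 0x3400  # from the "Offset: 0x3400" marker in the diff output

symbols = bytes.fromhex(HEX_BYTES).decode("utf-8")
offsets = [ord(ch) - OFFSET for ch in symbols]
print(offsets)
# [62, 59, 12, 31, 0, 15, 86, 4, 19, 0, 52, 0, 4, 59, 9, 52, 55, 59, 62, 7, 78, 95]
```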

Here, I assumed that the sequence was crackable as a monoalphabetic substitution cipher.
I also assumed that the numbers preserve alphabetical order (smaller = beginning of the alphabet, larger = end of the alphabet).

My thought process for cracking the sequence was as follows, starting from the flag prefix utflag{ and the closing }:

62 59 12 31 0 15 278 4 19 0 52 0 4 59 9 52 55 59 62 7 270 287
u t f l a g { _ _ a _ a _ t _ _ _ t u _ _ }

0 = a
4 = guess: c
u t f l a g { c _ a _ a c t _ _ _ t u _ _ }

guess: c h a r a c t e r
u t f l a g { c h a r a c t e r _ t u _ _ }

52 = r
55 = guess: s
u t f l a g { c h a r a c t e r s t u _ _ }

4 = c
7 = guess: d
u t f l a g { c h a r a c t e r s t u d _ }

guess: s t u d y
u t f l a g { c h a r a c t e r s t u d y }

And there you have it!
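The guessing loop above is easy to replay in code. This is a rough sketch, not the tooling actually used during the CTF: `render` and `is_order_preserving` are hypothetical helper names, and the numbers are the ones from the walkthrough.

```python
VALUES = [62, 59, 12, 31, 0, 15, 278, 4, 19, 0, 52, 0, 4, 59, 9, 52, 55, 59,
          62, 7, 270, 287]


def render(values, mapping):
    """Show the sequence with known symbols filled in and '_' for unknowns."""
    return " ".join(mapping.get(v, "_") for v in values)


def is_order_preserving(mapping):
    """Check the ordering assumption: smaller numbers map to earlier characters."""
    letters = [mapping[v] for v in sorted(mapping)]
    return letters == sorted(letters)


# Known from the flag format.
known = {62: "u", 59: "t", 12: "f", 31: "l", 0: "a", 15: "g", 278: "{", 287: "}"}
print(render(VALUES, known))  # u t f l a g { _ _ a _ a _ t _ _ _ t u _ _ }

# Guess "character" and confirm the guess still respects the ordering.
known.update({4: "c", 19: "h", 52: "r", 9: "e"})
print(render(VALUES, known))  # u t f l a g { c h a r a c t e r _ t u _ _ }
print(is_order_preserving(known))  # True
```

A guess that breaks the ordering (for example mapping 7 to a letter before "c") would make `is_order_preserving` return `False`, which is a quick way to discard it.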

-----

Note: the intended solution was to use the provided hint, which refers to OEIS sequence A000788: https://oeis.org/A000788
Then apply the following formula: encoded number = index of letter + A000788(index of letter)
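Here is a minimal sketch of that intended route. A000788(n) is the total number of 1 bits in the binary expansions of 0..n. Two things in the code are assumptions rather than something the hint states: the index is taken as the character's distance from 'a' in ASCII (which puts '{' at 26 and '}' at 28), and the input values are code point minus 0x3400 for each symbol, so the three largest are 86, 78 and 95 instead of the byte-derived 278, 270 and 287.

```python
def a000788(n: int) -> int:
    """OEIS A000788: total number of 1 bits in the binary expansions of 0..n."""
    return sum(bin(k).count("1") for k in range(n + 1))


def encode_index(i: int) -> int:
    """Formula from the note: encoded number = index + A000788(index)."""
    return i + a000788(i)


# Assumption: index = ord(ch) - ord('a'), covering 'a'..'z', '{', '|', '}'.
ALPHABET = [chr(ord("a") + i) for i in range(29)]
DECODE = {encode_index(i): ch for i, ch in enumerate(ALPHABET)}

# Assumption: code point minus 0x3400 for each of the 22 symbols.
values = [62, 59, 12, 31, 0, 15, 86, 4, 19, 0, 52, 0, 4, 59, 9, 52, 55, 59,
          62, 7, 78, 95]

flag = "".join(DECODE[v] for v in values)
print(flag)  # utflag{characterstudy}
```

Every value lands exactly on an entry of the lookup table, which suggests the indexing assumption is the right one.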