Tags: reverse_engineering 

Rating:

Reverse Engineering - BadType

We are given the Windows binary which wants to have flag as its first argument:

> badtype.exe
Usage: badtype.exe <flag>

Let's give it something random:

> badtype.exe DrgnS{x}

A window with the text

BAD FLAG!:\

appears. Let's have a look in IDA:

.text:000000014000252C                 cmp     ecx, 2
.text:000000014000252F                 jz      short length_ok

It wants a single argument, that's what we already figured.

.text:000000014000254A length_ok:
.text:000000014000254A                 mov     rcx, [rdx+8]
.text:000000014000254E                 mov     [rsp+110h+arg_18], rdi
.text:0000000140002556                 call    processFlag

There is a function, to which we pass argv[1]. The code that follows creates a font from a static variable and then does the usual dance: RegisterClassExW, CreateWindowEx, etc. Window function displays the "bad flag" message - nothing notable here. So let's inspect processFlag:

.text:0000000140001FE0 strlen_flag:
.text:0000000140001FE0                 inc     rax
.text:0000000140001FE3                 cmp     byte ptr [rcx+rax], 0
.text:0000000140001FE7                 jnz     short strlen_flag
.text:0000000140001FE9                 cmp     rax, 50
.text:0000000140001FED                 jnz     done

Flag must be 50 characters long.

.text:0000000140001FF3                 movups  xmm0, cs:g_magic1
.text:0000000140001FFA                 movups  xmmword ptr [rbp+57h+magic0], xmm0
.text:0000000140001FFE                 movups  xmm1, cs:g_magic2
.text:0000000140002005                 movups  xmmword ptr [rbp+57h+magic1], xmm1
.text:0000000140002009                 movups  xmm0, cs:g_magic3
.text:0000000140002010                 movups  [rbp+57h+magic2], xmm0
.text:0000000140002014                 movzx   eax, cs:g_magicword
.text:000000014000201B                 mov     [rbp+57h+magicword], ax
.text:000000014000201F                 movzx   eax, cs:g_magicbyte
.text:0000000140002026                 mov     [rbp+57h+magicbyte], al

The code copies 50 magic bytes from a static variable to stack.

.text:0000000140002050 init_bitpairs:
.text:0000000140002050                 lea     rax, [rbp+57h+magic0]
.text:0000000140002054                 add     rax, rsi
.text:0000000140002057                 movzx   ebx, byte ptr [r14+rax]
.text:000000014000205C                 xor     bl, [rax]

The entire flag is then XORed with these bytes.

.text:000000014000205E                 movzx   eax, bl
.text:0000000140002061                 shr     al, 6
.text:000000014000208C                 movzx   eax, bl
.text:000000014000208F                 shr     al, 4
.text:0000000140002092                 and     al, 3
.text:00000001400020BC                 movzx   eax, bl
.text:00000001400020BF                 shr     al, 2
.text:00000001400020C2                 and     al, 3
.text:00000001400020EC                 and     bl, 3

Each XORed byte is then split into 4 bit pairs, which are placed in a vector.

.text:0000000140002149                 mov     r8d, RESOURCE_FONT_SIZE
.text:000000014000214F                 lea     rdx, g_font_resource
.text:0000000140002156                 lea     rcx, [rbp+57h+magic0]
.text:000000014000215A                 call    std_string_operator_eq

The static font variable (the one that will be used in main) is copied into a string. The memory which used to hold magic bytes now contains this string - this is going to be somewhat confusing, but oh well.

.text:00007FF7C58D2160                 lea     rdx, [rbp+57h+magic0]
.text:00007FF7C58D2164                 lea     rcx, [rbp+57h+tables]
.text:00007FF7C58D2168                 call    parse_font

The string is passed to some function - let's skip it for now.

.text:00007FF7C58D21A3 find_cff:
.text:00007FF7C58D21A3                 cmp     [rax+table_info.tag], cff_tag
.text:00007FF7C58D21A9                 jz      cff_tag_ok

parse_font appears to produce an array, and the loop is looking for an entry containing CFF tag.

.text:00007FF7C58D22A2 scan_cff:
.text:00007FF7C58D22A2                 lea     rcx, [rax+rdx]
.text:00007FF7C58D22A6                 movzx   eax, word ptr [rcx]
.text:00007FF7C58D22A9                 cmp     ax, word ptr cs:g_cff_pattern
.text:00007FF7C58D22B0                 jnz     short next_cff

In addition to a tag, each array entry contains a string. The code looks for occurrences of \x1C\x41\x41 in this string.

.text:00007FF7C58D22CF corrupt_1:
.text:00007FF7C58D22CF                 mov     byte ptr [rax+rdx+1], 0
.text:00007FF7C58D22E9                 movzx   eax, byte ptr [r8]
.text:00007FF7C58D22ED                 mov     [rcx+rdx+2], al

It then replaces the second byte with 0 and the third byte with the value of the next bit pair. There are exactly 200 bit pairs and \x1C\x41\x41 occurrences.

.text:00007FF7C58D231E                 lea     rdx, [rbp+57h+magic0]
.text:00007FF7C58D2322                 lea     rcx, [rbp+57h+tables]
.text:00007FF7C58D2326                 call    reassemble_font

The string and the array are then passed to another function. Let's leave it be for now.

.text:00007FF7C58D2347                 mov     r8, [rbp+57h+magic1]
.text:00007FF7C58D234B                 lea     rcx, g_font_resource
.text:00007FF7C58D2352                 call    memmove

Finally, the string is copied back to the static variable. Setting a hardware watchpoint on argv[1] indicates that processFlag is the only function where it's read. So, the code uses the flag to modify the font used to show us the message, which means we need to find a flag value that modifies the font in a way that the displayed message changes. For that we need to understand what these \x1C\x41\x41 and \x1C\x00\x00-\x03 thingies are.

What we have is OTTO magic at the beginning of the font. This is solid googling material, and indeed, we immediately learn that this is an OpenType font. Let's ask Google what is CFF: turns out that's Compact Font Format. Let's have a look at the specs:

OpenType files consist of a number of tables, CFF being one of them. Dumping the original and the modified static font variable values, splitting them into individual tables and comparing those tables shows that only CFF table is changed. Good, no suprises. This means we can now focus on CFF spec.

Reading CFF spec reveals that fonts are drawn using a stack-based virtual machine, which is documented in the Charstring spec. Googling for "python opentype cff" points to fonttools project that we can use to dump the bytecode. Doing this in a straightforward manner does not work, because it tries to also run the bytecode, which contains a couple fancy unsupported instructions, but we can still reuse the lower-level infrastructure and copy decompilation parts of higher-level infrastructure into our script, while avoiding running the decompiled code.

Let's dump the charstring:

CharStrings[1]
    [0x1] push 1
    [0x2] push 0
    [0x4] put
    [0x5] push 0
    [0x6] push 1
    [0x8] put
    [0x9] push 0
    [0xa] push 2
    [0xc] put
    [0xd] push 1
    [0xe] push 3
    [0x10] put
    [0x11] push 2
    [0x12] push 4
    [0x14] put
    [0x15] push 3
    [0x16] push 5
    [0x18] put
    [0x1b] push 0
    [0x1c] push -96
    [0x1d] callsubr
    [0x20] push 0
    [0x21] push -96
    [0x22] callsubr
    [0x25] push 0
    [0x26] push -96
    [0x27] callsubr
    [0x2a] push 0
    [0x2b] push -96
    [0x3fe] push 0
    [0x3ff] push -96
    [0x400] callsubr
    [0x401] push 0
    [0x403] get
    [0x404] push 41
    [0x406] eq
    [0x407] push 1
    [0x409] get
    [0x40a] push 40
    [0x40c] eq
    [0x40e] add
    [0x40f] push 2
    [0x411] eq
    [0x413] not
    [0x414] push -99
    [0x416] add
    [0x417] callsubr
    [0x418] endchar

Instructions 0x1-0x18 initialize the transient array (let's call it t) with [1, 0, 0, 1, 2, 3]. We then see a ton of calls to subr11 (actual subroutine numbers are computed by adding a bias - in this case 107, to callsubr arguments) - it's their arguments which are encoded as \x1C\x00\x00-\x03. So the code passes each bit pair to subr11.

At the end subr9 is normally called, but if t[0] == 41 and t[1] == 40, then 1 is added to a subroutine number and the code calls subr8 instead. This must be the goal here: find which values to pass to subr11 so that the desired t[0] and t[1] values are produced. Let's dump subr11:

    localSubrs[11]
        [0x1] push 0
        [0x2] push 1
        [0x3] push 0
        [0x4] push 3
        [0x6] index
        [0x8] ifelse
        [0x9] push 0
        [0xa] push 1
        [0xb] push 3
        [0xd] index
        [0xe] push 3
        [0x10] ifelse
        [0x12] add
        [0x14] not
        [0x15] push -98
        [0x17] add
        [0x18] callsubr

If bitpair is out of bounds, subr9 is called. It seems to be a bad thing - let's keep in mind that we should avoid subr9 being ever called. If the check succeeds, however, then subr10 is called. Dumping it reveals that it contains a single return instruction - a good thing apparently.

        [0x19] push 0
        [0x1a] push 1
        [0x1b] push 0
        [0x1c] push 0
        [0x1e] get
        [0x20] ifelse
        [0x21] push 0
        [0x22] push 1
        [0x23] push 0
        [0x24] push 1
        [0x26] get
        [0x28] ifelse
        [0x29] push 0
        [0x2a] push 1
        [0x2b] push 0
        [0x2d] get
        [0x2e] push 42
        [0x30] ifelse
        [0x31] push 0
        [0x32] push 1
        [0x33] push 1
        [0x35] get
        [0x36] push 40
        [0x38] ifelse
        [0x3a] add
        [0x3c] add
        [0x3e] add
        [0x40] not
        [0x41] push -98
        [0x43] add
        [0x44] callsubr

Ditto t[0] and t[1].

        [0x45] push 1
        [0x47] get
        [0x48] push -91
        [0x4a] add
        [0x4b] callsubr

Here we call a subroutine with index t[1] + 16. Since we know that 0 <= t[1] <= 40 we can assume that this would call subr16 - subr56. These subroutines check whether t[0] belongs to a certain set, and if yes, call the dreaded subr9. Each of them has a unique associated set, otherwise they are all the same. For example:

    localSubrs[17]
        [0x1] push 0
        [0x2] push 0
        [0x4] get
        [0x5] push 0
        [0x7] eq
        [0x9] add
        [0xa] push 0
        [0xc] get
        [0xd] push 4
        [0xf] eq
        [0x11] add
        [0x12] push 0
        [0x14] get
        [0x15] push 12
        [0x17] eq
        [0x19] add
        [0x1a] push 0
        [0x1c] get
        [0x1d] push 24
        [0x1f] eq
        [0x21] add
        [0x22] push 0
        [0x24] get
        [0x25] push 38
        [0x27] eq
        [0x29] add
        [0x2a] push 0
        [0x2c] get
        [0x2d] push 42
        [0x2f] eq
        [0x31] add
        [0x33] not
        [0x34] push -98
        [0x36] add
        [0x37] callsubr
        [0x38] return

This basically means that if t[1] == 1, then t[0] cannot be one of [0, 4, 12, 24, 38, 42]. Let's continue subr11 analysis.

        [0x4c] push 2
        [0x4e] add
        [0x50] get
        [0x51] push -95
        [0x53] add
        [0x54] callsubr

This calls subroutine t[bitpair + 2] + 12. Since t[2:6] are in range 0-3, this will call subr12-subr15. They are responsible for incrementing and decrementing t[0] and t[1], for example:

    localSubrs[12]
        [0x1] push 0
        [0x3] get
        [0x4] push 1
        [0x6] add
        [0x7] push 0
        [0x9] put
        [0xa] return

We can see that subr12 increments t[0]. Let's finish subr11.

        [0x55] push -50
        [0x56] callsubr
        [0x57] return

At the end it calls subr57.

    localSubrs[57]
        [0x1] push 2
        [0x3] get
        [0x4] push 3
        [0x6] get
        [0x7] push 4
        [0x9] get
        [0xa] push 5
        [0xc] get
        [0xd] push 4
        [0xe] push 1
        [0x10] roll
        [0x11] push 5
        [0x13] put
        [0x14] push 4
        [0x16] put
        [0x17] push 3
        [0x19] put
        [0x1a] push 2
        [0x1c] put
        [0x1d] return

This one rotates t[2:6], for example, if it's [0, 1, 2, 3], then it will become [3, 0, 2, 1]. Ok, where to go from here?

It's useful to think about t[0] and t[1] as x and y coordinates subr12-subr15 allow us to walk right, down, left and up respectively. subr16-subr56 define where we can't go, that is, walls. We start at (0, 0) and need to reach (41, 40).

First, let's extract the labyrinth map. Doing this manually by analyzing subr16-subr56 is boring, so let's use symbolic execution with z3. It's enough to have just one symbolic variable t[0] and simulate each subr individually. There are about 10 instruction types, so the simulator code fits on a single screen. When we reach callsubr, we need to find all values of t[0] that satisfy subr == 9 - they will be our walls.

Second, let's convert the wall information to a networkx graph and find the shortest path from (0, 0) to (41, 40).

Third, let's convert the path to bitpairs. We already have the simulator, so let's reuse it to run the whole charstring. Whenever we find ourselves calling subr11, let's replace the actual argument value with the one computed based on the following observations. Let's recap that movement happens using t[bitpair + 2] + 12 and t[2:6] rotate each turn. The latter circumstance does not complicate life, because we simulate everything anyway and thus know the current t value when we need to make a decision. So the task is to compute bitpair based on direction of the current step and current value of t, and this is trivial.

Finally, we need to squash bitpairs into bytes and xor them with magic bytes. This reveals the flag!

Original writeup (https://github.com/mephi42/ctf/tree/master/2019.09.21-Teaser_Dragon_CTF_2019/reverse_engineering-BadType).