Tags: hash smt 

Rating: 5.0

## Challenge

> Look at that cute fox over there!!!
> http://www.talesshop.com/?page=product&query=fox
> Please Answer My Question!

Executing the program only outputs this rather useless message:

Welcome To Kornewbie CTF, CTF Challengers!
Before You Solve This Problem, Please Watch This Video.
Video Link : https://www.youtube.com/watch?v=jofNR_WkoCE
Please Answer My Question, What Does Fox Say?

## Binary analysis

The binary contains multiple code paths in the `main` function which are
selected at random. However, only one of those code paths is relevant, because
it's the only code path which checks for a flag *and* prints the
`KorNewbie{%s}` string:

/* somewhere in main */
buf[0] = 0;
*(_DWORD *)&buf[1] = 0;
*(_DWORD *)&buf[5] = 0;
*(_DWORD *)&buf[9] = 0;
*(_DWORD *)&buf[13] = 0;
scanf("%s", buf);
if(strlen(buf) == 16 && compute_hash(buf, 0x7C35D9A3) == 0xF92AC34) {
if(buf[3] == '_' && buf[7] == '_' && buf[12] == '_') {
if(buf[0] >= 'A'
&& buf[0] <= 'Z'
&& buf[1] >= 'A'
&& buf[1] <= 'Z'
&& buf[2] >= 'A'
&& buf[2] <= 'Z'
&& buf[4] >= 'A'
&& buf[4] <= 'Z'
&& buf[5] >= 'A'
&& buf[5] <= 'Z'
&& buf[6] >= 'A'
&& buf[6] <= 'Z'
&& buf[8] >= 'A'
&& buf[8] <= 'Z'
&& buf[9] >= 'A'
&& buf[9] <= 'Z'
&& buf[10] >= 'A'
&& buf[10] <= 'Z'
&& buf[11] >= 'A'
&& buf[11] <= 'Z'
&& buf[13] >= 'A'
&& buf[13] <= 'Z'
&& buf[14] >= 'A'
&& buf[14] <= 'Z'
&& buf[15] >= 'A'
&& buf[15] <= 'Z') {
printf("Good! Check The Flag!\n");
printf("KorNewbie{%s}\n", buf);

There is an input for a single word (`scanf`). Afterwards, the length is
checked (16 characters) and a hash is computed. If the hash matches the
expected hash, the structure of the string is checked:

- only uppercase letters
- `_` at positions 3, 7, and 12

Therefore, the string we are looking for is supposed to have this structure:

The hash function itself looks like this:

int compute_hash(const char *str, int magic)
unsigned int i;
int hash = 0;

for(i = 0; i < strlen(str); i++)
hash = magic ^ ((str[i] | 0x20) + __ROR4__(hash, 8));
return crt_check(1, hash);

(The `crt_check` function in the hash function just performs some runtime
checks and then returns the second argument, this function is not interesting
for us.)

This function computes a hash using the magic constant. Interestingly enough,
this hash function was taken from a [real world malware sample](https://www.dailysecu.com/bbs/download.php?table=bbs_10&savefilename=bbs_10_2979_1.pdf&filename=%EC%83%88%EB%A1%9C%EC%9A%B4%20%EC%A4%91%EA%B5%AD%20%EC%82%AC%EC%9D%B4%EB%B2%84%20%EA%B3%B5%EA%B2%A9%20%EA%B7%B8%EB%A3%B9%20APT41%20%EC%84%B8%EB%B6%80%20%EB%82%B4%EC%9A%A9%20%EB%B3%B4%EA%B3%A0%EC%84%9C-%ED%8C%8C%EC%9D%B4%EC%96%B4%EC%95%84%EC%9D%B4%202019.pdf).
Unfortunately the published analysis of that malware sample does not contain a
useful analysis of the hash function itself.

## Basic Idea

The only way to solve this challenge is by breaking the hash function. A simple
brute force attack is way too slow and even with a dictionary the search space
is still too big. The solution is to use SMT with a powerful SMT solver, which
deals with all the complexity for us. We only have to encode the problem in

## Encode the problem in SMT

Encoding the problem in SMT is rather simple: just unroll the loop of the hash
algorithm and add the constraints on the input from the main function. We use
the theory of bit vectors to encode our 32bit variables and operations.

### Definitions

First define constants for the lower and upper limit of the input:

(declare-const lower (_ BitVec 32))
(assert (= lower #x00000041)) ; A
(declare-const upper (_ BitVec 32))
(assert (= upper #x0000005A)) ; Z

We also need the magic constant from the main function as well as the constant
that is or'd to the input:

(declare-const magic (_ BitVec 32))
(assert (= magic #x7C35D9A3))

(declare-const cons (_ BitVec 32))
(assert (= cons #x00000020))

### Loop Unrolling: First Iteration

Now we need to translate the hash function itself. In SMT everything has to be
unrolled, and variables from the original program have to be in SSA form. This
means we need one set of variables for every iteration of the hash function.

Let's start with the first iteration. The `hash` variable is initialized with
`0`. We call this `out0`:

(declare-const out0 (_ BitVec 32))
(assert (= out0 #x00000000))

Now apply the constraints from the main function on the first character of the
input (only upper case letters):

(declare-const i0 (_ BitVec 32))
(assert (and (bvule i0 upper) (bvuge i0 lower)))

Then, we define a temporary variable `t0 = str[0] | 0x20`:

(declare-const t0 (_ BitVec 32))
(assert (= t0 (bvor i0 cons)))

This variable is XOR'd with the magic constant, which results in the `hash`
value after the first iteration. Since the previous hash value is `0`, we can
ignore the ROT part here:

(declare-const out1 (_ BitVec 32))
(assert (= out1 (bvxor t0 magic)))

### Loop Unrolling: Second Iteration

Now we have to unroll the second iteration. Again, we start with the
constraints on the input:

(declare-const i1 (_ BitVec 32))
(assert (and (bvule i1 upper) (bvuge i1 lower)))

Then, we define the temporary variable `it1 = str[1] | 0x20`:

(declare-const it1 (_ BitVec 32))
(assert (= it1 (bvor i1 cons)))

This time we have to add the rotated old hash value:

(declare-const rot1 (_ BitVec 32))
(assert (= rot1 ((_ rotate_right 8) out1)))
(declare-const prexor1 (_ BitVec 32))
(assert (= prexor1 (bvadd it1 rot1)))

The XOR with the magic constants results in the `hash` value after the second iteration:

(declare-const out2 (_ BitVec 32))
(assert (= out2 (bvxor prexor1 magic)))

### Loop Unrolling: Third .. Sixteenth Iteration

Unrolling the remaining loop iterations works exactly in the same way. However,
there is one special case with the checks for `_` characters. At those
positions, the input is asserted to be `_` instead of the constraint for upper
case letters:

(declare-const i3 (_ BitVec 32))
(assert (= i3 #x0000005f)) ; _

### Check the Hash

After unrolling all loop iterations, the final hash value `out16` is checked by
comparing it to our expected hash:

(assert (= out16 #x0F92AC34))

## Compute the Flag

A solution can now be computed by calling an SMT solver:
boolector -m -x fox.smt2

The SMT solver now checks if there is a "satisfying assignment", which means it
tries to find an input which produces the hash we're interested in. It then outputs the variable assignment, and we get this result:


Unfortunately, we cannot submit this flag because the online system tells us it's "incorrect".

## Guessing the Flag

So … we computed a valid flag, but we can't submit it because the system tells
us it's invalid. This is where the real fun starts. We now have to *guess* the

We can choose letters and e.g. specify that the word `FOX` has to appear.
Unfortunately we can still get lots of "incorrect" flags, even if we force
words. After a few hours of playing this "guessing game", I guessed that the
flag might start with `THE_FOX_SAYS_`, since the program asks us `What Does Fox

Encoding such a constraint is rather simple, just force the input to specific values:

; guess the first few words
(assert (and (= i0 #x00000054) (= i1 #x00000048) (= i2 #x00000045))) ; THE
(assert (and (= i4 #x00000046) (= i5 #x0000004F) (= i6 #x00000058))) ; FOX
(assert (and (= i8 #x00000053) (= i9 #x00000041) (= i10 #x00000059) (= i11 #x00000053))) ; SAYS

Now, with *this* guess we finally get our flag that can be submitted:


[Final SMT file](https://www.sigflag.at/assets/posts/foxyfox/fox.smt2)

## Unique Solution

Challenges without unique solutions suck. After we already guessed the
solution, the challenge was "fixed" with the following change:

ref[0] = 0;
ref[1] = 0x1C;
ref[2] = 0x11;
ref[3] = 0xB;
ref[4] = 0x12;
ref[5] = 0x1B;
ref[6] = 0xC;
ref[7] = 0xB;
ref[8] = 7;
ref[9] = 0x15;
ref[10] = 0xD;
ref[11] = 7;
ref[12] = 0xB;
ref[13] = 7;
ref[14] = 1;
ref[15] = 0x15;

if(...) { /* this is the long if from before */
ok = 1;
for(i = 0; i < strlen(buf); i++) {
if((buf[i] ^ buf[0]) != ref[i])
ok = 0;
if(ok == 1) {
printf("Good! Check The Flag!\n");
printf("KorNewbie{%s}\n", buf);

This change is supposed to make the solution unique.

In SMT, this contstraint looks like this:

(assert (= (bvxor i0 i0) #x00000000))
(assert (= (bvxor i1 i0) #x0000001C))
(assert (= (bvxor i2 i0) #x00000011))
(assert (= (bvxor i3 i0) #x0000000B))
(assert (= (bvxor i4 i0) #x00000012))
(assert (= (bvxor i5 i0) #x0000001B))
(assert (= (bvxor i6 i0) #x0000000C))
(assert (= (bvxor i7 i0) #x0000000B))
(assert (= (bvxor i8 i0) #x00000007))
(assert (= (bvxor i9 i0) #x00000015))
(assert (= (bvxor i10 i0) #x0000000D))
(assert (= (bvxor i11 i0) #x00000007))
(assert (= (bvxor i12 i0) #x0000000B))
(assert (= (bvxor i13 i0) #x00000007))
(assert (= (bvxor i14 i0) #x00000001))
(assert (= (bvxor i15 i0) #x00000015))

If we ask the SMT solver for a model, we directly get the "correct" flag.

To convince ourselves that the solution is now really unique, we can check if
there is another solution by "blacklisting" the previous solution:

; blacklist 'THE_FOX_SAYS_SUA'
(assert (not (and
(= i0 #x00000054)
(= i1 #x00000048)
(= i2 #x00000045)
(= i3 #x0000005f)
(= i4 #x00000046)
(= i5 #x0000004f)
(= i6 #x00000058)
(= i7 #x0000005f)
(= i8 #x00000053)
(= i9 #x00000041)
(= i10 #x00000059)
(= i11 #x00000053)
(= i12 #x0000005f)
(= i13 #x00000053)
(= i14 #x00000055)
(= i15 #x00000041)

Now the SMT solver tells us `unsat`, which means there is indeed only one solution.

## Useful Links

- [boolector](https://boolector.github.io/): efficient SMT solver
- [smtlib2](http://smtlib.cs.uiowa.edu/): description of the smtlib2 language

Original writeup (https://www.sigflag.at/blog/2019/writeup-newbiectf2019-foxyfox/).