Tags: reversing z3 smt

Rating: 5.0

# Intro

You Shall Not Pass

This rather strange binary needs a particular string as its input. Yet we have no idea what this string is. Do you?

This is one of the reverse challenges in VolgaCTF 2018.
The challenge is 200 points. What is interesting about it is:
* has 40 threads that check the input
* it is written in C++
* every thread does a mathematical calculation that is checked against a constant

You can't find more suitable challenge for using a SMT solver on.
My choice is z3 with its Python binding. But before the solving lets do the
application analysis.

# Analysis

The app is a 64 bit ELF binary:

$file ysnp ysnp: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=efd7438ad3cc4c32c0ce7838e6f72d005b0ad18e, stripped Using radare2 there are two interesting functions to analyse: * main * sub.__cxa_guard_acquire_4e0 For the sake of convenience I usually dump all the interesting stuff in separate .asm files. It is easier to inspect the assembly in Sublime Text 3 for me. It let me search/highlight more freely: r2> aaa r2> pdf @ main > asm/main.asm r2> pdf @ sub.__cxa_guard_acquire_4e0 > sub.__cxa_guard_acquire_4e0.asm I spent some time there to grasp the whole scheme. I immediately noticed the launch of the 40 threads in sub.__cxa_guard_acquire_4e0. This code is repeated 40 times and it is easy to spot the start address of the thread functions in esi: nasm | ::| 0x00406509 be60204000 mov esi, 0x402060 ; "AVAUATUSH\x83\xec H\x8b\x1dmq " | ::| 0x0040650e bf90926000 mov edi, 0x609290 | ::| 0x00406513 48c705c22c20. mov qword [0x006091e0], 0x609200 ; [0x6091e0:8]=0 | ::| 0x0040651e 48c744241010. mov qword [rsp + local_10h], 0x609110 ; [0x609110:8]=1 | ::| 0x00406527 48c704247092. mov qword [rsp], 0x609270 ; [0x609270:8]=0 | ::| 0x0040652f e83c0d0000 call sub.pthread_create_270  What I did is to search all the addresses and dump the assembly in separate files: r2> pd 108 @ 0x402060 > asm.th/01-2060.asm r2> pd 108 @ 0x403250 > asm.th/02-3250.asm r2> pd 108 @ 0x403c90 > asm.th/03-3c90.asm r2> pd 108 @ 0x405740 > asm.th/04-5740.asm r2> pd 108 @ 0x403470 > asm.th/05-3470.asm r2> pd 108 @ 0x402640 > asm.th/06-2640.asm r2> pd 108 @ 0x401490 > asm.th/07-1490.asm r2> pd 108 @ 0x403680 > asm.th/08-3680.asm r2> pd 108 @ 0x4042a0 > asm.th/09-42a0.asm r2> pd 108 @ 0x402840 > asm.th/10-2840.asm r2> pd 108 @ 0x405b80 > asm.th/11-5b80.asm r2> pd 108 @ 0x402e40 > asm.th/12-2e40.asm r2> pd 108 @ 0x405fc0 > asm.th/13-5fc0.asm r2> pd 108 @ 0x401a60 > asm.th/14-1a60.asm r2> pd 108 @ 0x405da0 > asm.th/15-5da0.asm r2> pd 108 @ 0x4061d0 > asm.th/16-61d0.asm r2> pd 108 @ 0x4040b0 > asm.th/17-40b0.asm r2> pd 108 @ 0x403050 > asm.th/18-3050.asm r2> pd 108 @ 0x402a40 > asm.th/19-2a40.asm r2> pd 108 @ 0x4048d0 > asm.th/20-48d0.asm r2> pd 108 @ 0x405540 > asm.th/21-5540.asm r2> pd 108 @ 0x401870 > asm.th/22-1870.asm r2> pd 108 @ 0x404ef0 > asm.th/23-4ef0.asm r2> pd 108 @ 0x405310 > asm.th/24-5310.asm r2> pd 108 @ 0x401260 > asm.th/25-1260.asm r2> pd 108 @ 0x403890 > asm.th/26-3890.asm r2> pd 108 @ 0x402c50 > asm.th/27-2c50.asm r2> pd 108 @ 0x403ea0 > asm.th/28-3ea0.asm r2> pd 108 @ 0x405100 > asm.th/29-5100.asm r2> pd 108 @ 0x4046b0 > asm.th/30-46b0.asm r2> pd 108 @ 0x403a80 > asm.th/31-3a80.asm r2> pd 108 @ 0x401e60 > asm.th/32-1e60.asm r2> pd 108 @ 0x401c70 > asm.th/33-1c70.asm r2> pd 108 @ 0x404cd0 > asm.th/34-4cd0.asm r2> pd 108 @ 0x404ad0 > asm.th/35-4ad0.asm r2> pd 108 @ 0x4044a0 > asm.th/36-44a0.asm r2> pd 108 @ 0x402260 > asm.th/37-2260.asm r2> pd 108 @ 0x402440 > asm.th/38-2440.asm r2> pd 108 @ 0x405960 > asm.th/39-5960.asm r2> pd 108 @ 0x401670 > asm.th/40-1670.asm Radare could not recognize the functions at these addresses so I had to guess their lengths. Looking at the code a similar structure emerges in all threads: nasm ; Boiler plate code .. ; ... ; Beginning of calculations. Note the use of 'r14'. As I discover later - it holds the input in an integer array ...-----> 0x004020f5 498b06 mov rax, qword [r14] :::|||| 0x004020f8 488b88f00000. mov rcx, qword [rax + 0xf0] ; [0xf0:8]=-1 ; 240 :::|||| 0x004020ff 488b7008 mov rsi, qword [rax + 8] ; [0x8:8]=-1 ; 8 :::|||| 0x00402103 488b90980000. mov rdx, qword [rax + 0x98] ; [0x98:8]=-1 ; 152 ; ... ; ... ; Some calculations ; ... ; ... ; The comparison :::|||| 0x004021c1 483d77021400 cmp rax, 0x140277 ; ... ; Exit code  I did some dynamic analysis in GDB also: break *(0x402060+0x98) break *(0x403250+0x98) ... break *(0x401670+0x98) run qwertyuiopasdfghjklzxcvbnm0123456789_=+- I saw that r14 holds the input (hexdump of qword ptr [r14/rax]): pwndbg> hexdump$rax
+0000 0x61fcf0 71 00 00 00 00 00 00 00 77 00 00 00 00 00 00 00 │q...│....│w...│....│
+0010 0x61fd00 65 00 00 00 00 00 00 00 72 00 00 00 00 00 00 00 │e...│....│r...│....│
+0020 0x61fd10 74 00 00 00 00 00 00 00 79 00 00 00 00 00 00 00 │t...│....│y...│....│
+0030 0x61fd20 75 00 00 00 00 00 00 00 69 00 00 00 00 00 00 00 │u...│....│i...│....│
pwndbg>
+0040 0x61fd30 6f 00 00 00 00 00 00 00 70 00 00 00 00 00 00 00 │o...│....│p...│....│
+0050 0x61fd40 61 00 00 00 00 00 00 00 73 00 00 00 00 00 00 00 │a...│....│s...│....│
+0060 0x61fd50 64 00 00 00 00 00 00 00 66 00 00 00 00 00 00 00 │d...│....│f...│....│
+0070 0x61fd60 67 00 00 00 00 00 00 00 68 00 00 00 00 00 00 00 │g...│....│h...│....│
pwndbg>
+0080 0x61fd70 6a 00 00 00 00 00 00 00 6b 00 00 00 00 00 00 00 │j...│....│k...│....│
+0090 0x61fd80 6c 00 00 00 00 00 00 00 7a 00 00 00 00 00 00 00 │l...│....│z...│....│
+00a0 0x61fd90 78 00 00 00 00 00 00 00 63 00 00 00 00 00 00 00 │x...│....│c...│....│
+00b0 0x61fda0 76 00 00 00 00 00 00 00 62 00 00 00 00 00 00 00 │v...│....│b...│....│
pwndbg>
+00c0 0x61fdb0 6e 00 00 00 00 00 00 00 6d 00 00 00 00 00 00 00 │n...│....│m...│....│
+00d0 0x61fdc0 30 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 │0...│....│1...│....│
+00e0 0x61fdd0 32 00 00 00 00 00 00 00 33 00 00 00 00 00 00 00 │2...│....│3...│....│
+00f0 0x61fde0 34 00 00 00 00 00 00 00 35 00 00 00 00 00 00 00 │4...│....│5...│....│
pwndbg>
+0100 0x61fdf0 36 00 00 00 00 00 00 00 37 00 00 00 00 00 00 00 │6...│....│7...│....│
+0110 0x61fe00 38 00 00 00 00 00 00 00 39 00 00 00 00 00 00 00 │8...│....│9...│....│
+0120 0x61fe10 5f 00 00 00 00 00 00 00 3d 00 00 00 00 00 00 00 │_...│....│=...│....│
+0130 0x61fe20 2b 00 00 00 00 00 00 00 2d 00 00 00 00 00 00 00 │+...│....│-...│....│

Then I cleaned up the code and gathered it all in [all-asm.json](https://github.com/mr6r4y/re-write-ups/blob/master/crackmes/volgactf-2018/you-shall-not-pass/all-asm.json)

By searching all offsets in the snippets of all threads:

nasm
mov rcx, qword [rax + 0xf0]
mov rsi, qword [rax + 8]
mov rdx, qword [rax + 0x98]


I came up with this:

python
>>> indexes = [0xf0, 8, 0x98, ..., 0x10, 0xa8, 0x58, 0xc8]
>>> len(set(indexes))
3: 44
>>> max(set(indexes))
4: 352
>>> 352/8
5: 44


So the length of the input is 45 characters long. Expecting the input to have the format of a flag (VolgaCTF{...})
I came up with the following start input in GDB:

pwndbg> run "VolgaCTF{qwertyuiopasdfghjklzxcvbnmQWERTYUIO}"

The analysis part finishes with:
* having length and format of the input/flag
* isolated asm code in [all-asm.json](https://github.com/mr6r4y/re-write-ups/blob/master/crackmes/volgactf-2018/you-shall-not-pass/all-asm.json)

What has left is to parse the code and cram into z3.

# Z3 Script

The script has the following main goals:
* parse the limited operations in all-asm.json
* extract a constraint for each thread
* cram all into solve(..)

Some tricky stuff include:
* usage of BitVecVal and BitVec to simulate machine registers
* use python globals (simulate registers) as processor state
* craft valid python statements from assembly code that can be exec-ed (lazy to think of more sophisticated parser/interpreter)

## analysis.py
python
#!/usr/bin/env python

import json
import re
import struct
import binascii
from z3 import *

TABLE = [
BitVecVal(0x56, 64),
BitVecVal(0x6f, 64),
BitVecVal(0x6c, 64),
BitVecVal(0x67, 64),
BitVecVal(0x61, 64),
BitVecVal(0x43, 64),
BitVecVal(0x54, 64),
BitVecVal(0x46, 64),
BitVecVal(0x7b, 64),
]

TABLE = TABLE + [BitVec("x_%i" % i, 64) for i in range(9, 44)] + [BitVecVal(0x7d, 64)]

# TABLE = [BitVec("x_%i" % i, 64) for i in range(45)]

rax = None
rbx = None
rcx = None
rdx = None
rsi = None
rdi = None
r8 = None
r9 = None
r10 = None
r11 = None
r12 = None
r13 = None
r14 = TABLE

def reset():
global rax, rbx, rcx, rdx, rsi, rdi, r8, r9, r10, r11, r12, r13, r14
rax = None
rbx = None
rcx = None
rdx = None
rsi = None
rdi = None
r8 = None
r9 = None
r10 = None
r11 = None
r12 = None
r13 = None
r14 = TABLE

def asm_mov(a1, a2):
return "%s = %s" % (a1, a2)

def asm_lea(a1, a2):
return "%s = %s" % (a1, a2)

def asm_shl(a1, a2):
return "%s = %s << %s" % (a1, a1, a2)

if a2.startswith("0x"):
a2 = "BitVecVal(%s, 64)" % a2
return "%s = %s + %s" % (a1, a1, a2)

def asm_sub(a1, a2):
if a2.startswith("0x"):
a2 = "BitVecVal(%s, 64)" % a2
return "%s = %s - %s" % (a1, a1, a2)

def asm_imul(a1, a2, a3=None):
if a3 is None:
if a2.startswith("0x"):
a2 = "BitVecVal(%s, 64)" % a2
return "%s = %s * %s" % (a1, a1, a2)
else:
if a3.startswith("0xffff"):
a3 = str(struct.unpack(">q", binascii.unhexlify(a3[2:]))[0])
a3 = "BitVecVal(%s, 64)" % a3
return "%s = %s * %s" % (a1, a2, a3)

def asm_neg(a1):
return "%s = -%s" % (a1, a1)

def asm_cmp(a1, a2):
if a2.startswith("0xffff"):
a2 = str(struct.unpack(">q", binascii.unhexlify(a2[2:]))[0])
return "%s == BitVecVal(%s, 64)" % (a1, a2)

def parse_assembly(s):
def _get_instr(s):
ins = re.search("^[a-z]+", s).group()
return ins

def _get_args(s):
raw_args = re.search("^[a-z]+ (.*)", s).group(1)
raw_args = [i.strip() for i in raw_args.split(",")]

args_expr = []
for a in raw_args:
if a in ["rax", "rbx", "rcx", "rdx", "rsi", "rdi",
"r8", "r9", "r10", "r11", "r12", "r13", "r14"]:
args_expr.append(a)
else:
r = re.search("^qword \$(.+)\$$", a) if r: r = r.group(1) r = [i.strip() for i in r.split("+")] if len(r) == 1: if r[0] == "r14": args_expr.append("%s" % r[0]) else: args_expr.append("%s[0]" % r[0]) else: r1, r2 = r try: e = int(r2) / 8 except ValueError: e = int(r2, 16) / 8 args_expr.append("%s[%i]" % (r1, e)) else: args_expr.append(a) return args_expr ins = _get_instr(s) args = _get_args(s) instructions = { "mov": asm_mov, "lea": asm_lea, "shl": asm_shl, "add": asm_add, "sub": asm_sub, "imul": asm_imul, "neg": asm_neg, "cmp": asm_cmp } c = instructions[ins](*args) return ins, c def eval_block(b): global rax, rbx, rcx, rdx, rsi, rdi, r8, r9, r10, r11, r12, r13, r14 for i in b: ins, c = parse_assembly(i) if ins == "cmp": print c return eval(c) else: print c exec c def main(): allasm = json.load(open("all-asm.json", "r")) constraints = [] for n, b in enumerate(allasm): print "Block No: %i" % n print "="*40 c = eval_block(b) constraints.append(c) reset() print "-"*40 print solve(*constraints) if __name__ == '__main__': main()  At the end I get:$ python analysis.py
...
...
----------------------------------------
[x_19 = 48,
x_38 = 101,
x_11 = 36,
x_20 = 117,
x_26 = 101,
x_17 = 95,
x_15 = 115,
x_39 = 95,
x_9 = 68,
x_34 = 117,
x_32 = 121,
x_29 = 110,
x_28 = 97,
x_43 = 101,
x_33 = 48,
x_27 = 95,
x_40 = 115,
x_13 = 117,
x_42 = 102,
x_35 = 95,
x_25 = 100,
x_18 = 121,
x_41 = 64,
x_31 = 95,
x_14 = 105,
x_37 = 114,
x_10 = 49,
x_12 = 103,
x_36 = 64,
x_30 = 100,
x_22 = 95,
x_24 = 111,
x_23 = 99,
x_21 = 114,
x_16 = 51]
None

and I then converted this to:

python
x = [i for i in "VolgaCTF{qwertyuiopasdfghjklzxcvbnmQWERTYUIO}"]
x[19] = chr(48)
x[38] = chr(101)
x[11] = chr(36)
x[20] = chr(117)
x[26] = chr(101)
x[17] = chr(95)
x[15] = chr(115)
x[39] = chr(95)
x[9 ]= chr(68)
x[34] = chr(117)
x[32] = chr(121)
x[29] = chr(110)
x[28] = chr(97)
x[43] = chr(101)
x[33] = chr(48)
x[27] = chr(95)
x[40] = chr(115)
x[13] = chr(117)
x[42] = chr(102)
x[35] = chr(95)
x[25] = chr(100)
x[18] = chr(121)
x[41] = chr(64)
x[31] = chr(95)
x[14] = chr(105)
x[37] = chr(114)
x[10] = chr(49)
x[12] = chr(103)
x[36] = chr(64)
x[30] = chr(100)
x[22] = chr(95)
x[24] = chr(111)
x[23] = chr(99)
x[21] = chr(114)
x[16] = chr(51)
print "".join(x)


VolgaCTF{[email protected][email protected]}

Original writeup (https://github.com/mr6r4y/write-ups/tree/master/challenges/volgactf-2018%40you-shall-not-pass).