> We found a strange binary, claiming to use a custom "Medea" instruction set. We found a spec for it at [https://github.com/Kantaja/MedeaCTF](https://github.com/Kantaja/MedeaCTF), can you help us solve this?

1. Write a disassembler for this "Medea" architecture. There's not much to say here; it's just a matter of implementing the provided spec. Ours is at [https://github.com/ba-sing-sec/MedeaVM](https://github.com/ba-sing-sec/MedeaVM). (I've polished this up a bit since the CTF, so it now fully supports loading compressed images and running a VM. During the CTF we used a Python script to convert the compressed image to an uncompressed one and implemented only some of the opcodes in the VM. The VM turned out to be useless anyway since the provided image was broken, but the disassembler was useful.)
2. Disassemble the code memory from the given image and reverse-engineer it.
I've annotated the disassembly below:

0x0001 icpy 0x003e,rtrgt y = get_input()
0x0004 call nil,nil,nil
0x0006 pop ry
0x0008 icpy 0x000c,rz z = 12
0x000b cmp ry,rz if y != z:
0x000d icpy 0x0017,rtrgt
0x0010 jequ
0x0011 icpy 0x005a,rtrgt incorrect_length()
0x0014 call nil,nil,nil
0x0016 halt halt()
do:
0x0017 cpy rz,rtrgt offset = z
0x0019 push rz push(z)
0x001b rcpf+ smain(0x0001),rx x = smain[1 + offset]
0x001e rcpf+ sin(0x0001),rz z = sin[1 + offset]
0x0021 cmpl rz z = ~z
0x0023 xor rx,rz x = x ^ z
0x0025 icpy 0x00ff,rz z = 0xff
0x0028 and rx,rz x = x & z
0x002a writ rx putchar(x)
0x002c pop rz z = pop()
0x002e inc rz z++
0x0030 icpy 0x0000,ry y = 0
0x0033 cmp rx,ry while x != y
0x0035 icpy 0x003d,rtrgt
0x0038 jequ
0x0039 icpy 0x0017,rtrgt
0x003c jump
0x003d halt halt()

def get_input():
0x003e icpy 0x000a,ry y = '\n'
0x0041 icpy 0x0000,rz z = 0
while True:
0x0044 inc rz z++
0x0046 read rx x = getchar()
0x0048 cmp rx,ry if x == y:
0x004a icpy 0x0057,rtrgt break
0x004d jequ
0x004e cpy rz,rtrgt offset = z
0x0050 rcpt+ nil,smain(0x0001) smain[1 + offset] = 0
0x0053 icpy 0x0044,rtrgt
0x0056 jump
0x0057 cpy rz,rtrgt return z
0x0059 rtrn

def incorrect_length():
# prints "Incorrect length!"
0x005a icpy 0x6e49,rx x = 'In'
0x005d icpy 0x00a2,rtrgt put_2_chars(x)
0x0060 call rx,nil,nil
0x0062 icpy 0x6f63,rx x = 'co'
0x0065 icpy 0x00a2,rtrgt put_2_chars(x)
0x0068 call rx,nil,nil
0x006a icpy 0x7272,rx [etc]
0x006d icpy 0x00a2,rtrgt
0x0070 call rx,nil,nil
0x0072 icpy 0x6365,rx
0x0075 icpy 0x00a2,rtrgt
0x0078 call rx,nil,nil
0x007a icpy 0x2074,rx
0x007d icpy 0x00a2,rtrgt
0x0080 call rx,nil,nil
0x0082 icpy 0x656c,rx
0x0085 icpy 0x00a2,rtrgt
0x0088 call rx,nil,nil
0x008a icpy 0x676e,rx
0x008d icpy 0x00a2,rtrgt
0x0090 call rx,nil,nil
0x0092 icpy 0x6874,rx
0x0095 icpy 0x00a2,rtrgt
0x0098 call rx,nil,nil
0x009a icpy 0x0a21,rx
0x009d icpy 0x00a2,rtrgt
0x00a0 call rx,nil,nil

def put_2_chars(x):
0x00a2 writ rx putchar(x & 0xff)
0x00a4 bswp rx byteswap(x)
0x00a6 writ rx putchar(x & 0xff)
0x00a8 rtrv return
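
As a sanity check on the incorrect_length() annotation, the immediates loaded into rx can be decoded by hand. The sketch below is a Python model, not Medea code: it assumes, as the put_2_chars annotation suggests, that writ emits the low byte of the register and bswp swaps the two bytes of the 16-bit word. The name `WORDS` is just for illustration.

```python
# Immediates loaded into rx by incorrect_length(), in listing order.
WORDS = [0x6e49, 0x6f63, 0x7272, 0x6365, 0x2074, 0x656c, 0x676e, 0x6874, 0x0a21]


def put_2_chars(word):
    """Model of put_2_chars: emit the low byte, then (after bswp) the high byte."""
    low = word & 0xFF
    high = (word >> 8) & 0xFF
    return chr(low) + chr(high)


message = "".join(put_2_chars(w) for w in WORDS)
print(repr(message))  # 'Incorrect length!\n'
```

Each word holds two characters with the first character in the low byte, which is why 0x6e49 reads as 'In' rather than 'nI'.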


There are two obvious bugs, but it's easy enough to figure out what's going on. The program reads a line of input, combines it with the contents of the read-only memory sin, and prints the result.

First, the instruction at 0x0050 is wrong:

0x0050 rcpt+ nil,smain(0x0001) smain[1 + offset] = 0


The first argument is invalid (shown in my disassembly as nil). If we look at the machine code, the instruction word is 0b0011001001000010. Its first (most significant) four bits indicate that the first argument is a register argument and the second is an address in smain.

The first argument word (describing register arguments) is 0b0000000100000000. This is invalid. The spec says that the lowest 4 bits of the register argument word are reserved, and then the register indices take 4 bits each, packed to the right so that the next 4 bits above the reserved bits describe the last register argument. In this case those bits are all 0, but the second register argument index (which is unused since the instruction only takes one register argument, per the argument flags in the instruction word) indicates rx. From context, we can tell that it probably should be rx, since rx at this point contains the character read as input, which we want to store in memory so it can be processed later.
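
The field layout described above can be checked with a few lines of Python. This is only a sketch of the layout as paraphrased in this writeup (4 reserved low bits, then 4-bit register indices with the last argument lowest), not a replacement for the spec; `register_fields` is a hypothetical helper name.

```python
def register_fields(arg_word):
    """Split a 16-bit register argument word into its three 4-bit index
    fields, starting with the field just above the reserved low nibble
    (the one that describes the last register argument)."""
    return [(arg_word >> shift) & 0xF for shift in (4, 8, 12)]


instr_word = 0b0011001001000010  # instruction word at 0x0050
arg_word = 0b0000000100000000    # its register argument word

arg_flags = instr_word >> 12     # top four bits: argument type flags
reserved = arg_word & 0xF        # low four bits: reserved
fields = register_fields(arg_word)

print(bin(arg_flags))    # 0b11
print(reserved, fields)  # 0 [0, 1, 0]
```

With a single register argument only the first field should be consulted, and it is 0 (hence nil). The index 1 sits one field too high, which is consistent with the guess that rx was intended.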

The second bug is that rz is set to 12 at address 0x0008 (to check the length of the input). Then rz is used as a counter while processing the input. From context it seems that it should count from 0, but it is never cleared and still has the value 12 at that point.

> These bugs were fixed by the challenge author, but the fixed image was not published until after the CTF (although some teams apparently received the fixed image privately from the challenge author during the CTF).

Accounting for those bugs, we can figure out that the program reads 12 characters, then outputs (input[n] ^ ~sin[n]) & 0xff for each character of the input.
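
A useful property of this per-byte operation is that it is its own inverse: applying it to the output with the same sin bytes gives back the input. The sketch below models the intended behaviour in Python; `transform` and the sample key bytes are illustrative, not taken from the image.

```python
def transform(data, key):
    """Per-byte model of the main loop: complement the key byte,
    XOR it with the data byte, and keep the low 8 bits."""
    return bytes((d ^ ~k) & 0xFF for d, k in zip(data, key))


sample_key = bytes([0x10, 0x80, 0xFF, 0x00])  # arbitrary example bytes
sample_in = b"test"

encoded = transform(sample_in, sample_key)
decoded = transform(encoded, sample_key)
print(decoded)  # b'test'
```

This means known output characters can be pushed back through the same formula to recover the corresponding input characters.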

3. Dump the read-only memory sin from the image (starting at address 0x0001):

0xbc 0xac 0xaf 0xbf 0xa8 0xb6 0x8f 0xfa 0x9c 0xae 0xff 0xb6


We compute the input which would produce the known characters of the flag format ractf{?????} and get 123412?????4. From this we can guess the input is 123412341234, and compute the output: ractf{C1Rc3}.
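
The last step can be reproduced with a short Python script. It is a sketch of the arithmetic rather than the exact script we used during the CTF: `SIN` holds the twelve bytes dumped above, and `transform` is a hypothetical helper modelling the fixed program.

```python
# Twelve bytes of sin, starting at address 0x0001 (from the dump above).
SIN = bytes([0xbc, 0xac, 0xaf, 0xbf, 0xa8, 0xb6,
             0x8f, 0xfa, 0x9c, 0xae, 0xff, 0xb6])


def transform(data, key):
    """Model of the fixed program: (input[n] ^ ~sin[n]) & 0xff per byte."""
    return bytes((d ^ ~k) & 0xFF for d, k in zip(data, key))


# Work backwards from the known characters of the flag format.
known = "ractf{?????}"
recovered = "".join(
    "?" if c == "?" else chr((ord(c) ^ ~k) & 0xFF)
    for c, k in zip(known, SIN)
)
print(recovered)  # 123412?????4

# Guess the repeating pattern and run it forwards.
flag = transform(b"123412341234", SIN).decode("ascii")
print(flag)  # ractf{C1Rc3}
```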