Tags: disassembly instructionset cpu emulator
Rating:
# Summary
Provided with the 'tcemu' challenge was an executable ('cpu') and a binary file ('flag'). The CPU is an emulator for a (presumably custom) architecture, and the binary file presumably contained the flag somewhere.
# Process
1. Ran the 'flag' code on the 'cpu' emulator executable, it outputs "I put the flag here somewhere". Good start, this should make it easier to start to figure out what the CPU is doing since we have some known behaviour we can try to match up with the 'flag' binary.
2. Ran 'strings' on the 'cpu' binary to try to get some info on it, there's a bunch of references to a stack and some error messages to do with certain instructions, so this gives an idea of what kind of instructions we might be dealing with (mov, xor, add, store, push, pop, jmp, prnt) and that it's a stack-based system.
3. Hexdumped the 'flag' binary to try to find the instruction set width, incorrectly guessed 9 bytes at first because it's almost entirely sets of 2-3 instructions in a sequence. Figured out by randomly changing bytes here and there that 0x05 is an invalid instruction (error message gets emitted), so by changing various bytes to this to identify which bytes were instruction and which were data I eventually figured out the instructions are actually 3 bytes wide (ish, it turned out later). Everything started to make more sense once I figured this part out.
4. Using the instruction set width and the info from strings, figured out that most of the code is PUSHing a letter onto the stack (0x04), then popping the letter from the stack and printing it (0x08, which I'll call PRNT). The "I" in "I put the flag here somewhere" isn't actually in the place in the code that you'd expect (at least in plain text) though, whereas all the other letters are. I initially thought the 0x00 x3 after each PUSH was a NOOP, but after more testing it turned out to be part of the argument to PUSH.
5. Investigated where the "I" came from, at first I thought it was a jump somewhere that printed it, but eventually figured out that the code for the "I" seemed to PUSH a number, do something else, then PRNT. The argument for the "something else" instruction plus the number originally pushed added up to the ASCII value for "I", so I figured the mystery instruction must be ADD (0x10), which ADDs the immediate value to the top item on the stack.
6. Threw a tiny "disassembler" (if you can call it that) together in ruby that just understood the instructions I understood so far, and it turned out that's all the 'flag' binary used! The other instructions supported by the CPU aren't involved in this challenge.
7. Disassembling the 'flag' binary, it showed that after the message it displayed, it actually kept PUSHing more things to the stack and ADDing to them, but there were no more PRNT calls, so everything was invisible.
8. Added PRNT calls after each PUSH/ADD pair, which resulted in the flag being output!
# Instruction Set
Instruction set in the end (from looking at the 'cpu' binary I know there's more, but the 'flag' binary only uses these ones). The instruction set is variable width, the first byte is instruction, then either 2 more bytes or 5 more bytes (3 or 6 total bytes per instruction), it looks like to me?
* 0x04 - PUSH - 5 byte arg - Push argument to the stack.
* 0x08 - PRNT - 2 byte arg - Display information on the terminal, second byte of argument specifies what kind of information (0=print contents of register 0, 1=pop the top item off the stack and display it. Only 1 is used in 'flag' binary)
* 0x10 - ADD - 5 byte arg - Add the argument to the top item on the stack. The result replaces the top item on the stack.
# Lessons Learned
* When looking at a new/mystery architecture, start with small word sizes and work up. I wasted a bunch of time while I was thinking it ran on 9-byte words, and things made a lot more sense when I started assuming 3-byte. 3-byte also turned out to be not entirely correct, but everything started getting unlocked once I made that assumption at least. It's easier to glue two lines of words together in your head than it is to split up longer words.
* I spent too much time trying to reverse the 'cpu' executable, when reversing the instruction set based on the 'flag' binary turned out to be much more straightforward. Coming up with a couple potential strategies at first and flipping to an alternate one when I wasn't making much progress using the first one was a good idea here, and I'll try to keep this in mind in the future.
* I need better options for CLI hex editing, that slowed me down a bit on this one. I eventually wrote a tiny bash script that used xxd to dump the binary to a hex file 3 bytes wide, edit it in vim, then use xxd -r to reassemble it into binary. Every time I needed to add an instruction I had to renumber all the following addresses though, which was a huge pain. Something to sort out for next time.
# Note
After thinking about it more and reading other writeups, it's clear that the stack isn't actually involved here, what I thought was PUSH is actually STORE to register, and what I was interpreting as pop/print is just 'print register'.
Not entirely sure where my idea that a stack was involved in this flag came from, but these things happen when you're under time pressure and your understanding is evolving as you go. I'm going to leave this writeup as-is as it was my understanding at the time, and worked well enough to get the flag!