Rating:

# Signals

Caterogy: Reverse engineering
Description: *This program is sending me mixed signals...*

# Basic Analysis

In this challenge we get a 64-bit Linux executable

```cmd
D:\CTF\UIUCTF\signals>file signals
signals; ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 3.2.0, not stripped
```

# Analyzing the binary

Opening the file in binary ninja

We see something interesting
```asm
000012a0 lea rdi, [rel code]
000012a7 call make_executable
```
We see a call to `make_executable` with `code` as the parameter

Disassembling the make_executable function, one instruction caught my attention
```asm
00001248 call mprotect
```

So make_executable is calling `mprotect` with `code`. mprotect is used to change the permissions of some area of the memory, i.e. we can make some read-only data to executable too.

My guess was that it is making the segment executable where `code` variable is stored.

Analyzing the main function more
```asm
000012f5 mov rax, qword [rbp-0x20 {argv}]
000012f9 mov rax, qword [rax+0x8]
000012fd mov rcx, rax
00001300 lea rdx, [rel code]
00001307 mov eax, 0x0
0000130c call rdx {code}
```

We see that it loads argv[1] (our input) into rcx.

And then it loads the address of code variable into rdx, and then jumps there.

Jumping to `code` variable, binary ninja doesn't show it as a function, we have to press `P` to make function at that address.

And indeed it is some valid code

```asm
code:
00004020 lea rax, [rel data_4435]
00004027 mov dl, byte [rcx]
00004029 xor rbx, rbx {0x0}

0000402c xor byte [rax+rbx], dl
0000402f inc rbx
00004032 cmp rbx, 0x1d
00004036 jne 0x402c

00004038 inc rcx
0000403b jmp rax {data_4435}
```

In this code, it first loads some data `data_4435` into rax. That rel means that it is relative to RIP. Binary ninja shows it like that, the actual instruction looks like `LEA RAX, [RIP+0x15d5]`.

Then it loads a byte from rcx into dl, remember that the input that we pass (argv[1]) was stored in the rcx register.

Then it clears out rbx register and use that register as a counter for the loop. In this loop, it loads a byte from the `data_4435` and xors it with our input, i.e. it is decrypting it and it jumps to that decrypted code.

The high level representation of that code would look like
```C
rcx = &argv[1]; //our input
rax = &data_4435;
for(rbx=0; rbx<0x1d; ++rbx){
rax = *rax ^ *rcx;
rax++;
}
rcx++;
rax();
```

Also notice that it increments rcx at the end. Which means that now rcx is pointing to second character of our input.

As we know that the flag format is `uiuctf{...}`, so the first character has to be `u` or maybe `U`.

So we can write a simple python code to decrypt this

```py
def decrypt(data, val):
return b''.join([chr(i ^ val).encode('charmap') for i in data])

with open('signals', 'rb') as fp:
dat = fp.read()

base = 0x1000
start = 0x0004435
end = start + 0x1d

key = ord`('u')

code = decrypt(dat[start:end+1], key)

open('layer1', 'wb').write(
dat[:start] +
code +
dat[end:]
)

```

Opening the `layer1` file in binary ninja and jumping to code

```asm
00004435 lea rax, [rel data_5a11]
0000443c mov dl, byte [rcx]
0000443e xor rbx, rbx {0x0}

00004441 xor byte [rax+rbx], dl
00004444 inc rbx
00004447 cmp rbx, 0x1d
0000444b jne 0x4441

0000444d inc rcx
00004450 jmp rax {data_5a11}
```

Indeed, it decrypted some more meaningful code. One thing to note here is that the decrypted code looks similar to the code that we saw earlier, only the data that is to be decrypted is different `data_5a11`.

So now it will decrypt the code with second letter of our input, i.e. `i`.

We can just run our [layer1 code]() with some changes into the offset and key, it will decrypt the second layer.

```asm
LEA RAX, [RIP-0x9f9]
MOV DL, [RCX]
XOR RBX, RBX
XOR [CR4:RAX+RBX], DL
INC RBX
CMP RBX, 0x1d
JNZ 0x100c
INC RCX
JMP RAX
```

And again the similar code is decrypted.

**Lets assume that it will be decrypting the same code over and over again for the input characters.**

So we just need to find the characters that will decrypt the code which looks similar to our previous code.

But doing this character by character and then opening it in a disassembler takes too much time, we need to somehow automate this.

## Possible Approaches

1. Symbolic execution: tools like `angr` or `manticore`can be used for this approach but these are automatic solvers, and it sounds boring to just run a simple script to solve a crackme.

2. Dynamic analysis: this would require us to run the program and count the number of instructions executed. For the correct input, it should execute the highest number of instructions.

3. Static analysis: using a python code to decrypt layer by layer based on the correct input.

## Dynamic Analysis

As the program is decrypting itself and if the characters are correct, it executes more instructions, so for every character we execute more instructions than before.

So we can just use the `perf` tool which is capable of counting the number of instructions executed.

I got this idea from [this](https://dustri.org/b/defeating-the-recons-movfuscator-crackme.html) write-up. According to the write-up, if we the following command, it will give us the count of the instructions.

```bash
perf stat -x, -e instructions:u ./signals
```

But for some reason, it was giving me the error
`The instructions event is not supported.`

If this would have worked, we could just change the program name in the code that was in that write-up, and it would have given us the flag.

## Static Analysis

**Lets assume that it will be decrypting the same code over and over again for the input characters.**

Then we just need the offsets of the data to be decrypted and we can just compare the parts of the code to the previous one.

Lets start writing the code.

1. Lets just define our decrypt function and read the binary

```py
def decrypt(d, val):
return b''.join([chr(i ^ val).encode('charmap') for i in d])

with open('signals', 'rb') as fp:
dat = fp.read()

```

2. Getting some offsets from the file

```py
code_addr = 0x4435
base = 0x1000
```

The `code_addr` is the address of the data which is to be decrypted.
The address at which the binary is loaded is stored at `base` variable. To get the base address, just run the following command

```cmd
D:\CTF\UIUCTF 2021\signals>objdump -d signals

signals: file format elf64-x86-64

Disassembly of section .init:

0000000000001000 <_init>:
```

The first address (0x1000) is the base address.

3. Then we can just bruteforce it char by char. But how will we get the disassembly. There are three methods I can think of:

a) **objdump**: to get the disassembly, but objdump just decodes the text section, and our code is not in the text section of the binary

b) **ndisasm**: we can get the disassembly using the following command
```bash
ndisasm -b64 file
```

It requires us to save the code in a file, and then pass it.

c) **distorm3**: [distorm3](https://github.com/gdabah/distorm/) is a python libraby which is capable of disassembling code. I used this approach

Now continuing writing the script

```py
import distorm3
def decrypt_inst(data):
return [i[2] for i in distorm3.Decode(0x1000, data, distorm3.Decode64Bits)]

flag = '' # Will store the flag
while 1:
for ch in range(32, 127): # Printable range
address = code_addr - base
code = decrypt(dat[address:address+0x1d], ch)
inst = decrypt_inst(code)
if 'JMP RAX' in inst and 'LEA' in inst[0]: # if character found
flag += chr(ch)
print(flag)
off = inst[0].split('LEA RAX, [RIP')[1].replace('0x','').replace(']', '') # getting address of next data
code_addr = code_addr + int(off, 16)
break
if 126 == ch: # If character not found
exit("ERROR")
```

Code looks fine lets run it

```cmd
D:\CTF\UIUCTF 2021\signals>python script.py
u
ERROR
```

It prints the first character correctly, but then it errors out

So the problem was that, it was getting the code_addr wrong, it was getting the address 0x5a0a instead of 0x5a11.

It is beacuse the first instruction `00004435 lea rax, [rel data_5a11]` is 7 bytes long, so we have to add 7 to the code_addr

The [final script](https://github.com/DaBaddest/CTF-Writeups/blob/master/UIUCTF%202021/signals/script.py) should be

```py
import distorm3
def decrypt_inst(data):
return [i[2] for i in distorm3.Decode(0x1000, data, distorm3.Decode64Bits)]

def decrypt(d, val):
return b''.join([chr(i ^ val).encode('charmap') for i in d])

with open('signals', 'rb') as fp:
dat = fp.read()

base = 0x1000
code_addr = 0x4435
flag = ''
while 1:
for ch in range(32, 127):
address = code_addr - base
code = decrypt(dat[address:address+0x1d], ch)
inst = decrypt_inst(code)
if 'JMP RAX' in inst and 'LEA' in inst[0]:
flag += chr(ch)
print(flag)
off = inst[0].split('LEA RAX, [RIP')[1].replace('0x','').replace(']', '')
code_addr = code_addr + int(off, 16) + 7
break
if 126 == ch:
exit("ERROR")

```

Running the script

```cmd
D:\CTF\UIUCTF 2021\signals>python script.py
u
ui
uiu
uiuc
uiuct
uiuctf
uiuctf{
uiuctf{a
uiuctf{an
uiuctf{ano
uiuctf{anot
uiuctf{anoth
uiuctf{anothe
uiuctf{another
uiuctf{another_
uiuctf{another_c
uiuctf{another_ct
uiuctf{another_ctf
uiuctf{another_ctf_
uiuctf{another_ctf_a
uiuctf{another_ctf_an
uiuctf{another_ctf_ano
uiuctf{another_ctf_anot
uiuctf{another_ctf_anoth
uiuctf{another_ctf_anothe
uiuctf{another_ctf_another
uiuctf{another_ctf_another_
uiuctf{another_ctf_another_f
uiuctf{another_ctf_another_fl
uiuctf{another_ctf_another_fla
uiuctf{another_ctf_another_flag
uiuctf{another_ctf_another_flag_
uiuctf{another_ctf_another_flag_c
uiuctf{another_ctf_another_flag_ch
uiuctf{another_ctf_another_flag_che
uiuctf{another_ctf_another_flag_chec
uiuctf{another_ctf_another_flag_check
uiuctf{another_ctf_another_flag_checke
uiuctf{another_ctf_another_flag_checker
ERROR
```

It errors out but almost prints the flag.

As we know that the flag format is `uiuctf{...}`, so we can just add the `}` and try the flag

And indeed it is the correct flag

```
uiuctf{another_ctf_another_flag_checker}
```

Original writeup (https://github.com/DaBaddest/CTF-Writeups/tree/master/UIUCTF%202021/signals).