Tags: disassembly jail python3 

Rating: 5.0

# Description

Category: Misc

Difficulty: Easy

Author: explo1t

Description: Run the secret function you must! Hrrmmm. A flag maybe you will get.

# Overview

This challenge takes place in a remote restricted Python shell a.k.a. a Python jail. Usually the goal is to escape the jail, i.e. to shell out and find the flag in the filesystem.

But this challenge description says that one *may* get a flag by running a secret function inside the jail. It turns out that another challenge called **Pyjail Escape** takes place inside this same jail, and its goal is to escape. So one can say this challenge is a bit easier than an actual jail escape.

# Reconnaissance

After connecting to the challenge server, there is the following Python shell:

```python
The flag is stored super secure in the function ALLES !
>>> a =
```

So the secret function name is `ALLES`. Call it:

```python
>>> a = ALLES()
name 'alles' is not defined
```

The input characters are converted to lowercase. At this point it's a good idea to try to find other restrictions of this jail.

## Character set

Some characters are prohibited:

```python
>>> a = w
w
Denied
```

But others are OK:

```python
>>> a = e
name 'e' is not defined
```

There are 95 printable ASCII characters, so it doesn't take long to try all of them. List of all allowed characters:

```python
["1", "2", "3", "7", "9", "0", "\"", "(", ")", "'", "+", ".", "a", "c", "d", "e", "g", "i", "l", "n", "o", "p", "r", "s", "t", "v", "[", "]", "_"]
```

Uppercase counterparts of allowed lowercase characters are not prohibited, but are converted to lowercase.
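Since every payload has to be assembled from this small alphabet, it can help to check candidates locally before sending them. A minimal sketch, assuming the allowed list above is complete and that the jail lowercases input before filtering; `ALLOWED` and `is_typeable` are names made up for this illustration:

```python
# Allowed characters as enumerated above.
ALLOWED = set('123790"()\'+.acdegilnoprstv[]_')

def is_typeable(payload):
    """Return True if every character of the payload passes the filter."""
    # Assumption: uppercase input is lowercased first, as observed with ALLES().
    return all(ch in ALLOWED for ch in payload.lower())

print(is_typeable("eval"))          # True
print(is_typeable("upper"))         # False, 'u' is prohibited
print(is_typeable("__builtins__"))  # False, 'b' and 'u' are prohibited
```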

## Built-ins

Lots of built-in functions are removed:

```python
>>> a = ord()
name 'ord' is not defined
```

At this point it is not possible to inspect `__builtins__` to see which functions weren't removed, because the name `__builtins__` contains the prohibited characters `b` and `u`. Using the [Built-in Functions table](https://docs.python.org/3/library/functions.html), it is possible to check manually that the readily available built-ins are `repr`, `str`, `print`, `eval` and `all`. There might be others whose names cannot be typed, but that cannot be checked at this point because of the character set restrictions.
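The manual check can be narrowed down locally: only built-in names spelled entirely with allowed characters are worth trying on the server. A sketch, assuming the local Python's built-ins roughly match the server's; `ALLOWED` and `candidates` are illustrative names:

```python
import builtins

# Allowed characters as enumerated in the previous section.
ALLOWED = set('123790"()\'+.acdegilnoprstv[]_')

# Public built-in names that can be typed at all. Whether the jail still
# defines them has to be tested by hand on the server.
candidates = sorted(
    name for name in dir(builtins)
    if not name.startswith("_") and set(name) <= ALLOWED
)
print(candidates)
```

Names such as `ord` show up as candidates here even though the jail has removed them, so the list only limits what needs to be tried.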

## Using eval()

Out of that list, `eval` immediately catches attention. `eval` is used to evaluate a string as a Python expression and return the result.

Since uppercase letters cannot be entered directly, the name `ALLES` has to be built at runtime and then resolved with `eval`. This works locally:

```python
>>> def ALLES(): ...
...
>>> eval(eval("\"alles\".upper()"))
<function ALLES at 0x105725a60>
```

Unfortunately this doesn't work inside the jail, since `'upper'` contains the forbidden character `u`.

One way around this restriction is to find the string `'upper'` in the list of all attributes and methods of the `str` type.

```python
>>> a = print("".__dir__())
['__repr__', '__hash__', '__str__', '__getattribute__', '__lt__', '__le__', '__eq__', '__ne__', '__gt__', '__ge__', '__iter__', '__mod__', '__rmod__', '__len__', '__getitem__', '__add__', '__mul__', '__rmul__', '__contains__', '__new__', 'encode', 'replace', 'split', 'rsplit', 'join', 'capitalize', 'casefold', 'title', 'center', 'count', 'expandtabs', 'find', 'partition', 'index', 'ljust', 'lower', 'lstrip', 'rfind', 'rindex', 'rjust', 'rstrip', 'rpartition', 'splitlines', 'strip', 'swapcase', 'translate', 'upper', 'startswith', 'endswith', 'islower', 'isupper', 'istitle', 'isspace', 'isdecimal', 'isdigit', 'isnumeric', 'isalpha', 'isalnum', 'isidentifier', 'isprintable', 'zfill', 'format', 'format_map', '__format__', 'maketrans', '__sizeof__', '__getnewargs__', '__doc__', '__setattr__', '__delattr__', '__init__', '__reduce_ex__', '__reduce__', '__subclasshook__', '__init_subclass__', '__dir__', '__class__']
>>> a = print("".__dir__()[20+20+2+2+2])
upper
```

The strange indexing is needed because `'upper'` sits at index 46, and the digits `4` and `6` are prohibited in the jail.
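Finding such a sum by hand is easy for one index, but it can also be automated. A small sketch (not part of the original solution; `ALLOWED_DIGITS` and `typeable_sum` are hypothetical names) that greedily splits a number into terms written only with allowed digits:

```python
# Digits that the jail accepts, taken from the allowed character list.
ALLOWED_DIGITS = set("123790")

def typeable_sum(n):
    """Express a non-negative integer as a '+'-joined sum of typeable numbers."""
    if n == 0:
        return "0"
    terms = []
    while n > 0:
        # Largest number <= n whose decimal digits are all allowed.
        # 1 is allowed, so this search always finds something.
        term = next(k for k in range(n, 0, -1) if set(str(k)) <= ALLOWED_DIGITS)
        terms.append(str(term))
        n -= term
    return "+".join(terms)

print(typeable_sum(46))  # 39+7
```

Any expression that evaluates to 46 works equally well; `20+20+2+2+2` and `39+7` are interchangeable.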

With `'upper'` available as a string, it is finally possible to obtain the `ALLES` function object:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a)
<function ALLES at 0x7f13d1aa0378>
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a())
No flag for you!
```
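The double `eval` can be replayed locally to see why it works. The sketch below is only a guess at the jail's behaviour (lowercasing, then evaluating in some namespace); `run_in_jail`, `namespace` and the stand-in `ALLES` are hypothetical. The position of `'upper'` in `"".__dir__()` depends on the Python version, so it is looked up instead of hard-coded:

```python
def ALLES(flag=None):
    # Stand-in for the server's function; the real body is unknown at this point.
    return "No flag for you!"

# Namespace the payload is evaluated in (assumption about the jail's internals).
namespace = {"ALLES": ALLES}

def run_in_jail(payload):
    # Assumption: input is lowercased before being evaluated.
    return eval(payload.lower(), namespace)

# Index of 'upper' on the local interpreter (46 on the challenge server).
idx = "".__dir__().index("upper")
payload = "eval(eval('\"alles\".'+\"\".__dir__()[" + str(idx) + "]+'()'))"

# Inner eval: '"alles".upper()' -> 'ALLES'; outer eval: 'ALLES' -> the function.
print(run_in_jail(payload) is ALLES)  # True
```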

There's more work to be done...

# Examining ALLES.\_\_code\_\_

`ALLES` doesn't give away the flag when called with no arguments. One way to figure out how `ALLES` works and how to get the flag is to try passing different arguments to the function and observe its behaviour.

Another way is to look at the `code` object:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a.__code__)

>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a.__code__.__dir__())
['__repr__', '__hash__', '__getattribute__', '__lt__', '__le__', '__eq__', '__ne__', '__gt__', '__ge__', '__new__', '__sizeof__', 'co_argcount', 'co_kwonlyargcount', 'co_nlocals', 'co_stacksize', 'co_flags', 'co_code', 'co_consts', 'co_names', 'co_varnames', 'co_freevars', 'co_cellvars', 'co_filename', 'co_name', 'co_firstlineno', 'co_lnotab', '__doc__', '__str__', '__setattr__', '__delattr__', '__init__', '__reduce_ex__', '__reduce__', '__subclasshook__', '__init_subclass__', '__format__', '__dir__', '__class__']
```

There are lots of attributes that are useful in reverse engineering a Python function.

## Constants

This is definitely the first attribute to check. A naive implementation of `ALLES` would store the flag as a string inside the function. This means the string would be stored as a constant in the function bytecode.

### Definition

`co_consts` - tuple of constants used in the bytecode (see documentation of [inspect](https://docs.python.org/3/library/inspect.html) for reference)

### How to get

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a.__code__.co_consts)
(None, 'p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N', 'No flag for you!')
```

The string `'p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N'` is not the flag, since the flag format is `ALLES{...}`.

## Arguments

### Definition

`co_varnames` - tuple of names of arguments and local variables

### How to get

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(eval("a.__code__."+a.__code__.__dir__()[19]))
('flag',)
```
#### Explanation

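`a.__code__.__dir__()[19]` is `"co_varnames"`. The attribute name cannot be typed directly because it contains the prohibited character `m`, so it is taken from the `__dir__()` listing and resolved with `eval`.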

## Global names

### Definition

`co_names` - tuple of names other than arguments and function locals, i.e. the global names and attributes the function refers to

### How to get

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(eval("a.__code__."+a.__code__.__dir__()[17+1]))
('string_xor',)
```

#### Explanation

`a.__code__.__dir__()[17+1]` is `"co_names"`. The index 18 is written as `17+1` because the digit `8` is prohibited.
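To get a feel for what ends up in which tuple, the same attributes can be inspected on a throwaway function locally. This is only an illustration; `demo` and `helper` are made-up names and have nothing to do with the challenge code:

```python
def helper(a, b): ...

def demo(secret):
    tmp = helper("some constant", secret)
    return tmp

code = demo.__code__
print(code.co_varnames)  # ('secret', 'tmp'): the argument first, then locals
print(code.co_names)     # ('helper',): global names the body refers to
print(code.co_consts)    # contains 'some constant'; exact contents vary by version
```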

## Bytecode

### Definition

`co_code` - raw compiled bytecode (a `bytes` object in Python 3)

### How to get

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a.__code__.co_code)
b'|\x00r\x0et\x00d\x01|\x00\x83\x02S\x00d\x02S\x00d\x00S\x00'
```

## Disassembly

The function bytecode can be disassembled locally using the [dis](https://docs.python.org/3/library/dis.html#opcode-CALL_FUNCTION) module:

```python
>>> import dis
>>> dis.dis(x=b'|\x00r\x0et\x00d\x01|\x00\x83\x02S\x00d\x02S\x00d\x00S\x00')
      0 LOAD_FAST                0 (0)
      2 POP_JUMP_IF_FALSE       14
      4 LOAD_GLOBAL              0 (0)
      6 LOAD_CONST               1 (1)
      8 LOAD_FAST                0 (0)
     10 CALL_FUNCTION            2
     12 RETURN_VALUE
>>   14 LOAD_CONST               2 (2)
     16 RETURN_VALUE
     18 LOAD_CONST               0 (0)
     20 RETURN_VALUE
```

### Instruction format

Python bytecode instruction disassembly follows this format: `offset opname arg (argval)`

`offset` - start index of operation within bytecode sequence

`opname` - human readable name for operation

`arg` - numeric argument to operation (if any), otherwise None

`argval` - resolved arg value (if known), otherwise same as arg (*this is the case here*)

`>>` at offset 14 indicates that the instruction is a jump target.

### Disassembly breakdown

| OP | Description |
|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
| `0 LOAD_FAST 0 (0)` | Pushes a reference to the local `co_varnames[0] = 'flag'` onto the stack. |
| `2 POP_JUMP_IF_FALSE 14` | If TOS (top of stack) is false, sets the bytecode counter to target. TOS is popped. |
| `4 LOAD_GLOBAL 0 (0)`    | Loads the global named `co_names[0] = 'string_xor'` onto the stack.                                                                                       |
| `6 LOAD_CONST 1 (1)` | Pushes `co_consts[1] = 'p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N'` onto the stack. |
| `8 LOAD_FAST 0 (0)` | Pushes a reference to the local `co_varnames[0] = 'flag'` onto the stack. |
| `10 CALL_FUNCTION 2` | Calls a callable object with positional arguments. `arg` indicates the number of positional arguments. The top of the stack contains positional arguments. |
| `12 RETURN_VALUE` | Returns with TOS to the caller of the function. |
| `>> 14 LOAD_CONST 2 (2)` | Pushes `co_consts[2] = 'No flag for you!'` onto the stack. |
| `16 RETURN_VALUE` | Returns with TOS to the caller of the function. |
| `18 LOAD_CONST 0 (0)` | Pushes `co_consts[0] = None` onto the stack. |
| `20 RETURN_VALUE` | Returns with TOS to the caller of the function. |
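The breakdown above can also be reproduced mechanically, which avoids depending on the local interpreter's opcode table (opcode numbering changes between Python versions). A minimal sketch, assuming the 2-byte instruction layout and the opcode numbers implied by the disassembly in this writeup; `OPNAMES`, `LOOKUP` and `decode` are illustrative names:

```python
# Opcode numbers for the six instructions seen above (other versions may differ).
OPNAMES = {
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    114: "POP_JUMP_IF_FALSE",
    116: "LOAD_GLOBAL",
    124: "LOAD_FAST",
    131: "CALL_FUNCTION",
}

# Values leaked from the server in the previous sections.
co_code = b'|\x00r\x0et\x00d\x01|\x00\x83\x02S\x00d\x02S\x00d\x00S\x00'
co_consts = (None, 'p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N', 'No flag for you!')
co_varnames = ('flag',)
co_names = ('string_xor',)

# Which tuple each opcode's argument indexes into.
LOOKUP = {"LOAD_CONST": co_consts, "LOAD_FAST": co_varnames, "LOAD_GLOBAL": co_names}

def decode(code):
    """Split 2-byte instructions into (offset, opname, arg, argval) rows."""
    rows = []
    for offset in range(0, len(code), 2):
        opname, arg = OPNAMES[code[offset]], code[offset + 1]
        table = LOOKUP.get(opname)
        argval = table[arg] if table is not None else arg
        rows.append((offset, opname, arg, argval))
    return rows

for row in decode(co_code):
    print(row)
```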

### Decompilation

The disassembly can be decompiled to this Python source code:

```python
def string_xor(x, y): ...

def ALLES(flag):
    if flag:
        return string_xor('p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N', flag)
    else:
        return 'No flag for you!'
    return  # unreachable; corresponds to the trailing instructions at offsets 18-20
```

### Sanity check

Now that the dependence between the input `flag` and the output of `ALLES` is clearer, it's best to do a sanity check.

`flag` is falsy:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a())
No flag for you!
```

`flag` is truthy:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a("0000"))
@OOB
```

It is now clear that `string_xor` returns a string of the same length as the input to `ALLES`.

### string_xor decompilation

`string_xor` is a global function so it is possible to get its code object and disassemble it.

Instead, given that `string_xor` has a pretty descriptive name, it might be a better idea to save some time and guess how the function works by observing its behavior.

Passing an `int`:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a(1))
zip argument #2 must support iteration
```

Passing a `list`:

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a([1]))
ord() expected string of length 1, but int found
```

It looks like `string_xor(x, y)` expects string inputs. It converts each character of `x` and `y` to an `int` using `ord()`, XORs the two values, passes the result to `chr()` and uses that to construct an output string.

Possible decompilation:

```python
def string_xor(x, y):
    ret = ''
    for i, j in zip(x, y):
        ret += chr(ord(i) ^ ord(j))
    return ret
```

### Another sanity check

Trying this locally:

```python
>>> def string_xor(x, y):
...     ret = ''
...     for i, j in zip(x, y):
...         ret += chr(ord(i) ^ ord(j))
...     return ret
...
>>> def ALLES(flag):
...     if flag:
...         return string_xor('p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N', flag)
...     else:
...         return 'No flag for you!'
...     return
...
>>> ALLES('0000')
'@OOB'
```

Same output as seen on the challenge server.

## Guessing the correct flag

The flag format is known to be `ALLES{...}`.

Therefore, the first 6 characters that need to be passed to `ALLES` to get the flag are the following (checking this locally):

```python
>>> ALLES('ALLES{')
'133713'
```

It is now obvious Wh47 h42 70 8E d0NE 70 Ge7 7he fl4G...

```python
>>> a = eval(eval('"alles".'+"".__dir__()[20+20+2+2+2]+'()'))
>>> a = print(a('1337133713371337133713'))
ALLES{3sc4ped_y0u_aR3}
```
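The reasoning behind that input: XOR is its own inverse, so XORing the stored constant with the known prefix `ALLES{` reveals the start of the expected input, and the visible `1337` pattern suggests how to extend it. A local sketch using the guessed `string_xor` from above; `CONST` and `key` are illustrative names, and the repeating key is a guess that the server output confirms:

```python
from itertools import cycle, islice

# Constant taken from ALLES.__code__.co_consts[1].
CONST = 'p\x7f\x7frbH\x00DR\x07CRUlJ\x07DlRe\x02N'

def string_xor(x, y):
    # Guessed behaviour of the server's helper: character-wise XOR.
    return ''.join(chr(ord(i) ^ ord(j)) for i, j in zip(x, y))

# Known plaintext prefix XOR constant gives the start of the required input.
key_prefix = string_xor(CONST, 'ALLES{')
print(key_prefix)  # 133713

# Assumption: the input keeps repeating '1337' for the full length of the constant.
key = ''.join(islice(cycle('1337'), len(CONST)))
print(string_xor(CONST, key))  # ALLES{3sc4ped_y0u_aR3}
```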

Original writeup (https://mmxmb.com/posts/alles-ctf-2020-writeup-pyjail-atricks/).