Rating:

[Link to original writeup](https://wrecktheline.com/writeups/imaginary-2021/#foliage)

# Foliage (8 solves, 400 points)
by JaGoTu

## Description
```
Description
Welcome to the woods. Consider yourself lost with only one way out. It's dangerous out here, take this binary

Attachments
* foliage

nc chal.imaginaryctf.org 42013

Author
Robin_Jadoul

Points
400
```

## Reversing

We are given an **executable ELF file**:

```
$ file foliage
foliage: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=639edeae48962dc684b788202f5a326e8b5142f8, for GNU/Linux 3.2.0, not stripped
```

Let's throw it straight into IDA. After a bit of renaming we get the following important functions:

```C
int __cdecl main(int argc, const char **argv, const char **envp)
{
int ready_answer; // [rsp+8h] [rbp-48h] BYREF
int v5; // [rsp+Ch] [rbp-44h] BYREF
int i; // [rsp+10h] [rbp-40h]
int j; // [rsp+14h] [rbp-3Ch]
__int64 rand1; // [rsp+18h] [rbp-38h] BYREF
__int64 rand2; // [rsp+20h] [rbp-30h] BYREF
__int64 rand2_stored; // [rsp+28h] [rbp-28h] BYREF
tree_ele *nodes; // [rsp+30h] [rbp-20h]
tree_ele *v12; // [rsp+38h] [rbp-18h]
tree_ele *v13; // [rsp+40h] [rbp-10h]
unsigned __int64 v14; // [rsp+48h] [rbp-8h]

v14 = __readfsqword(0x28u);
alarm(0x1Eu);
init_rnd(&rand1, &rand2);
printf("Your seed: %lu\n", rand1);
fflush(stdout);
nodes = calloc(500001uLL, sizeof(tree_ele));
init_tree(nodes, &rand1, &rand2);
printf("Ready? ");
fflush(stdout);
__isoc99_scanf("%d", &ready_answer);
if ( ready_answer != 42 )
exit(1);
rand2_stored = rand2;
for ( i = 0; i < 50000; ++i )
{
do
{
do
v13 = subtrees[choice(500001u, &rand2)];
while ( !v13->left );
}
while ( !v13->right );
printf("%d %d\n", v13->left->important_ptr->node_id, v13->right->important_ptr->node_id);
}
printf("Answers? ");
fflush(stdout);
for ( j = 0; j < 50000; ++j )
{
do
{
do
v12 = subtrees[choice(500001u, &rand2_stored)];
while ( !v12->left );
}
while ( !v12->right );
__isoc99_scanf("%d", &v5;;
if ( v12->node_id != v5 )
{
puts("Wrong!");
return 1;
}
}
print_flag();
return 0;
}

int __fastcall init_rnd(uint64_t *rand1, uint64_t *rand2)
{
FILE *stream; // [rsp+18h] [rbp-8h]

stream = fopen("/dev/urandom", "r");
if ( !stream )
{
fwrite("Could not open /dev/urandom, please contact an operator.\n", 1uLL, 0x39uLL, stderr);
exit(1);
}
fread(rand1, 8uLL, 1uLL, stream);
fread(rand2, 8uLL, 1uLL, stream);
return fclose(stream);
}

void __fastcall init_tree(tree_ele *tree, __int64 *rand1, __int64 *rand2)
{
tree_ele *v3; // rax
int i1; // eax
int i2; // eax
int i; // [rsp+2Ch] [rbp-24h]
signed int j; // [rsp+30h] [rbp-20h]
int built_offset; // [rsp+34h] [rbp-1Ch]
tree_ele *cur_subtree; // [rsp+38h] [rbp-18h]

for ( i = 0; i < 500001; ++i )
{
subtrees[i] = tree;
subtrees[i]->node_id = i;
v3 = subtrees[i];
v3->right = 0LL;
subtrees[i]->left = v3->right;
subtrees[i]->important_ptr = subtrees[i];
++tree;
}
j = 250001;
built_offset = 0;
while ( j <= 500000 )
{
cur_subtree = subtrees[j];
i1 = choice(j - built_offset, rand1);
swap(i1 + built_offset, built_offset);
i2 = choice(j - built_offset - 1, rand1);
swap(built_offset + 1 + i2, built_offset + 1);
cur_subtree->left = subtrees[built_offset];
cur_subtree->right = subtrees[built_offset + 1];
cur_subtree->important_ptr = subtrees[decide(built_offset, built_offset + 1, j++, rand2)]->important_ptr;
built_offset += 2;
}
}

unsigned __int64 __fastcall choice(unsigned int max, _QWORD *rng_state)
{
return lcg(rng_state) % max;
}

__int64 __fastcall decide(unsigned int a1, unsigned int a2, unsigned int a3, _QWORD *rndstate)
{
unsigned __int64 v4; // rax

v4 = lcg(rndstate) % 3uLL;
if ( v4 == 2 )
return a3;
if ( v4 > 2 )
{
fwrite("This cannot happen, what??\n", 1uLL, 0x1BuLL, stderr);
exit(1);
}
if ( v4 )
return a2;
else
return a1;
}

__int64 __fastcall lcg(_QWORD *rng_state)
{
*rng_state = 0x5851F42D4C957F2DLL * *rng_state + 1;
return *rng_state;
}

```

To sum it up:

1. Two random 64-bit values are read from `/dev/urandom`. We'll call them rand1 and rand2.
2. The binary gives us rand1.
3. Both rand1 and rand2 are used to create 500001 nodes and connect them into binary trees, more details later.
4. Rand2-based randomness is used to get 50000 random nodes that have both a left and right son, and we are given `node->left->important_ptr->node_id` and `node->right->important_ptr->node_id`.
5. For each of the 50000 questions we need to answer with `node->node_id`.
6. We get flag.

So to be able to answer the questions, we need to figure out how the tree is generated and what the meaning behind `important_ptr` is.

## Building the tree

Let's work through the `init_tree` function now. First, all 500001 nodes are initialized such that they have no children and `important_ptr` points to self. The important part is the while loop:

```C
j = 250001;
built_offset = 0;
while ( j <= 500000 )
{
cur_subtree = subtrees[j];
i1 = choice(j - built_offset, rand1);
swap(i1 + built_offset, built_offset);
i2 = choice(j - built_offset - 1, rand1);
swap(built_offset + 1 + i2, built_offset + 1);
cur_subtree->left = subtrees[built_offset];
cur_subtree->right = subtrees[built_offset + 1];
cur_subtree->important_ptr = subtrees[decide(built_offset, built_offset + 1, j++, rand2)]->important_ptr;
built_offset += 2;
}
```

As we want to built the same tree locally and we only know rand1, it's also important to note that rand2 is only used for setting the value of `important_ptr`.

What the algorithm does is it iteratively walks through the array starting at 250001, picks up two random nodes before the current one that dont have a parent yet and puts them as children of the current one. It puts the already parented nodes at the beginning array, moving `built_offset` as to not consider them anymore:

![array schema](https://wrecktheline.com/writeups/imaginary-2021/foliage_array.drawio.svg)

So, what about the `important_ptr`? Each node "inherits" one of the `important_ptr` of itself (which points to itself), or of its left son, or of its right son. As this choice is made based on rand2, we can't predict it, as only rand1 is leaked. However, if we for example take the `important_ptr` of our left son, that son also had to point either to itself or inherit from one of its sons. Eventually, you have to reach a node that has no sons, and that one was never made a parent and as such has to still be pointing to itself, as that's what it was inited to. **The result is, `node->important_ptr` always points to a random node in the subtree implied by `node`** (either the node itself or any of its descendants).

## Answering the questions

Back to our questions then. We know that we're looking for a node with a specific `node->left->important_ptr->node_id` and `node->right->important_ptr->node_id`. That means we're looking for a node such that the first `node_id` is in its left subtree and the second `node_id` is in its right subtree. Only one such node can exist though, and it's the Lowest common ancestor (LCA) of the two nodes, which is a known graph theory problem: https://en.wikipedia.org/wiki/Lowest_common_ancestor.

As 50000 doesn't like a very huge number, we hope we'll just get away with the simple way of finding LCA, that is by just finding the first intersection of the paths from v and w to the root.

So, to answer the questions, we read the rand1, use it to built the same tree, however unlike the given implementation **we also remember a pointer to the parent**, as otherwise finding the path to root would be very painful.

When we did the first implementation, it was too slow. Originally, we used the following implementation of `get_node(i)`:

```py
def get_node(i):
i = int(i)
for node in nodes:
if node.index == i:
return node
```

So we replaced it with something a bit more clever:

```py
nodict = [None] * 500001

for node in nodes:
nodict[node.index] = node

def get_node(i):
i = int(i)
return nodict[i]
```

Second improvement we did was we added caching for the path to root calculations. We're not sure if it was necessary, but it was a simple change. After these changes we get the following solver:

```py
from pwn import *
#io = process('./foliage')
io = remote('chal.imaginaryctf.org', 42013)
io.recvuntil(b'seed: ')
seeded = int(io.recvline())
seed0 = [seeded]

def lcg(seed):
seed[0] = ((0x5851F42D4C957F2D * seed[0]) + 1) & 0xFFFFFFFFFFFFFFFF
return seed[0]

def choice(max, seed):
return lcg(seed) % max

def swap(nodes, i, j):
tmp = nodes[i]
nodes[i] = nodes[j]
nodes[j] = tmp

class Node():
def __init__(self, i):
self.left = None
self.right = None
self.index = i
self.parent = None
self.path = None

def path_up(self):
if self.path is None:
if self.parent is None:
self.path = [self.index]
else:
self.path = [self.index] + self.parent.path_up()

return self.path

nodes = []

for i in range(500001):
nodes.append(Node(i))

built_offset = 0
for j in range(250001, 500001):
i1 = choice(j-built_offset, seed0)
swap(nodes, i1+built_offset, built_offset)
i2 = choice(j-built_offset-1, seed0)
swap(nodes, i2+built_offset+1, built_offset+1)
nodes[j].left = nodes[built_offset]
nodes[j].left.parent = nodes[j]
nodes[j].right = nodes[built_offset+1]
nodes[j].right.parent = nodes[j]
built_offset += 2

nodict = [None] * 500001

for node in nodes:
nodict[node.index] = node

print("built!")

def get_node(i):
i = int(i)
return nodict[i]

def get_up(n):
res = []
while n is not None:
res.append(n.index)
n = n.parent
return res

def find(a, b):
aa = get_node(a).path_up()
bb = get_node(b).path_up()

for x in aa:
if x in bb:
return x
return -1

io.sendlineafter(b'Ready?', b'42')
tasks = io.recvuntil('Answers? ').strip().decode().split('\n')[:-1]

answer = b''

for task in tasks:
(a, b) = task.split(' ')
res = find(a, b)
answer += str(res).encode() + b'\n'
print("done, sending answer")

io.send(answer)

io.interactive()
```

```
$ python3 foliage.py
[+] Opening connection to chal.imaginaryctf.org on port 42013: Done
built!
/home/jagotu/ctf/imaginary/foliage.py:94: BytesWarning: Text is not bytes; assuming ASCII, no guarantees. See https://docs.pwntools.com/#bytes
tasks = io.recvuntil('Answers? ').strip().decode().split('\n')[:-1]
done, sending answer
[*] Switching to interactive mode
ictf{I_can_f1n4lly_see_the_forest_through_the_tr33s}

[*] Got EOF while reading in interactive
$
```

Original writeup (https://wrecktheline.com/writeups/imaginary-2021/#foliage).