**Tags:** misc algorithms

Rating:

# [K3RN3L CTF 2021] BogoAttack

## tl;dr

Find the order of a permutation of size $10^4$ stored in an array

with an oracle that is able to get the contents

of a subset of indices of the array but randomly shuffles the contents before returning.

There is a limit of $15$ queries.

Solve by a divide and conquer/parallel binary search algorithm.

## Description

misc/BogoAttack; 26 solves, 446 points

Challenge author: `DrDoctor`

Someone attacced by Bogo! I must seek revenge. Now is the time to attacc back!

[main.py](https://flu.xxx/static/chall/lwsr_0c872acfc0b66f185a4968ac3198e067.zi://ctf.k3rn3l4rmy.com/kernelctf-distribution-challs/bogo-attack/main.py)

## First impressions of the problem

This problem was actually first given as `Bogo Solve` where the query limit was

accidentally not enforced.

I didn't notice and solved this problem (and later only modified the port

of the server in my solve script).

I'll walk through my first thoughts on the problem assuming the limit was

actually enforced.

We're given the following python script that's running on the server:

```python

import random

NUMS = list(range(10**4))

random.shuffle(NUMS)

tries = 15

while True:

try:

n = int(input('Enter (1) to steal and (2) to guess: '))

if n == 1:

if tries==0:

print('You ran out of tries. Bye!')

break

l = map(int,input('Enter numbers to steal: ').split(' '))

output = []

for i in l:

assert 0<= i < len(NUMS)

output.append(NUMS[i])

random.shuffle(output)

print('Stolen:',output)

tries-=1

elif n == 2:

l = list(map(int,input('What is the list: ').split(' ')))

if l == NUMS:

print(open('flag.txt','r').read())

break

else:

print('NOPE')

break

else:

print('Not a choice.')

except:

print('Error. Nice Try...')

```

I got pretty excited when I saw the question, as a former competitive programmer (or maybe I still am one?) and a computer science theory student.

(So excited that I probably spent ten times longer writing this writeup than

actually solving and coding a solution.)

I immediately recognize this as an interactive competitive programming question.

(This might have seen this exact one on [Codeforces](https://codeforces.com/) or

[AtCoder](https://atcoder.jp/) but interactive problems are pretty rare, and my memory is fuzzy)

The question can be summarized as follows:

> Given a permutation of size $10^4$ in an array and access to the array

> via an oracle

> that is able to get the contents of a subset of indices of the array

> but randomly shuffles the contents before returning.

> Find the contents in at most $15$ queries.

First we note that $15$ is more or less $\log_2(10^4)$,

so we want to make logarithmically many queries.

This suggests some sort of divide and conquer solution.

But what exactly are we dividing here?

## A Divide and Conquer approach

Let's think about what we can accomplish with one query.

We can split the array down the middle and query all the indices

in the first half as pictured below. What does this give us?

![dc1](https://davidzheng.web.illinois.edu//assets/images/k3rn3lctf2021/bogoattack/DC1.png)

The server would tell us which elements are in the first half,

which (by simple deduction) would tell us the rest of the elements

are in the second half.

Now we can treat these two halves of the array as two seperate problems

in and of themselves and do the same thing.

![dc2](https://davidzheng.web.illinois.edu//assets/images/k3rn3lctf2021/bogoattack/DC2.png)

For each of these subproblems we can repeat again, and

continue until we know exactly where every element is!

![dc3](https://davidzheng.web.illinois.edu//assets/images/k3rn3lctf2021/bogoattack/DC3.png)

However, naively this would give us a lot of queries,

in particular, if we let $Q(n)$ denote the number of queries

needed to solve the problem on an array of size $n$,

we essentially found the following recurrence:

$ Q(n) = 2 Q(n/2) + 1 $

Unfortuantely this solves to $Q(n) = n$, which is no better than

querying each position individually!

We need one more idea to help us out.

What if we send the queries for all our subproblems of the

same size simultaneously?

Since the elements involved in each subproblem form a partition of our

original elements, it doesn't matter that we get the elements in a random order,

we already **know** which elements are from each subproblem.

This means we can solve the problem with the recurrence of:

$ Q(n) = Q(n/2) + 1 = \lfloor \log_2 n \rfloor$

However, coding a solution like this seems complicated, how do we maintain

all these subproblems?

## Another way of looking at things

Let's take a step back and look at what we're learning from each we make.

For simplicity, let's actually assume that we are working with a permutation

of size $2^{14}$ elements ($16384$)

instead of $10^4$ elements. We'll see why this makes things easier in a bit.

Let's look at the first query to a problem:

![dc1](https://davidzheng.web.illinois.edu//assets/images/k3rn3lctf2021/bogoattack/DC1.png)

Querying for which elements are in the first half of the array is

essentially looking at what elements have the index of the first bit

be $0$. The rest of the numbers have first bit $1$.

So a query can learn the most significant bit of the **positions** of all the numbers

in the list!

In fact, there was nothing special about chooosing the first half,

the positions with most siginficant bit $0$.

We could just as easily

have chosen every position with a $0$ in the $k$th bit for some $1\le k \le 14$

and learn that bit for every element in the permutation!

So this suggests another algorithm, for each bit, learn the $k$th bit of

every element for every $k$.

If you examine this new algorithm closely, this would make the same queries

as the divide and conquer algorithm we had before!

This is a fairly common phenomenon, when doing binary divide and conquer,

we can instead view it in terms of the bits of the number and work with those for a

much simpler to code algorithm

(this forms the basis of things like [segment trees](https://codeforces.com/blog/entry/18051)).

We can view this as a form of parallel binary search, for every element of

the permutation, we are finding

its position in the list via binary search.

Cleverly, we're able to do this for all

elements at once!

This is what I ended up coding:

```python

from pwn import *

def read_until(s, delim=b'='):

delim = bytes(delim, "ascii")

buf = b''

while not buf.endswith(delim):

buf += s.recv(1)

print("[+] READING: ", buf)

return buf

sock = connect("ctf.k3rn3l4rmy.com", 2247)

NUMS = [0]*(10**4)

POS = [0]*(10**4)

for i in range(14):

inp = read_until(sock, ':')

sock.sendline(b'1')

inp = read_until(sock, ':')

output = ""

for j in range(10**4):

if j>>i&1:

output += str(j) + " "

sock.sendline(output[:-1])

inp = read_until(sock, '[')

inp = read_until(sock, ']')[:-1].split(b', ')

nums = [int(x) for x in inp]

for x in nums:

POS[x]+=1<

Original writeup (https://davidzheng.web.illinois.edu/2021/11/14/k3rn3lctf-bogo-attack.html).