CTFtime.org / TSG CTF 2020 / BlindCoder / Writeup

# TSGCTF 2020 BlindCoder Writeup

We can count up the number of test cases that satisfies a certain condition `f(N)` by sending the following function to the server: `f(N) ? (N % 10) : 10`.

The simplest approach is to run the binary search recursively until all the test cases are identified. However, once a test case is isolated (i.e. found a range that contains exactly one test case), binary search query returns only 1-bit information: whether a given range contains the test case or not. This simple approach cannot identify all test cases in 1000 queries. In order to identify all test cases in up to 1000 queries, about 1.5-bit information is required for each query.

The server returns a *number* of test cases, not a *binary* value. So think about trying to utilize it.

Instead of a single range $[l,r)$, We can query a union of ranges $[l1,r1)\cup[l2,r2)$. In this case, `f(N)` is given by `(N >= l1 && N < r1) || (N >= l2 && N < r2)`. Think about the case that we have two disjoint ranges $[l1,r1)$, $[l2,r2)$ and each range contains exactly one test case. We divide these ranges into four ranges, namely $[l1,m1)$, $[m1,r1)$, $[l2,m2)$, $[m2,r2)$. Note that each range is not empty and $[l1,r1)=[l1,m1)\cup[m1,r1)$, $[l2,r2)=[l2,m2)\cup[m2,r2)$. Then, there are four cases to occur:

* $[l1,m1)$ and $[l2,m2)$ contains test cases: **case LL**
* $[m1,r1)$ and $[l2,m2)$ contains test cases: **case UL**
* $[l1,m1)$ and $[m2,r2)$ contains test cases: **case LU**
* $[m1,r1)$ and $[m2,r2)$ contains test cases: **case UU**

Now query for the range $[l1,m1)\cup[l2,m2)$. If the returned value is two, then both $[l1,m1)$ and $[l2,m2)$ contains test cases and thus **case LL** is identified. For the same reason, **case UU** is identified if the returned value is zero. In these cases, **we obtain 2-bit information** for a single query.

If the returned value is one, **case UL** or **case LU** is happening but we do not know which is going. In this case, we obtain 1-bit information. When we next query for the range $[l1,m1)\cup[m2,r2)$, the returned value must be zero or two. If the returned value is zero, we missed one test case by exchanging $[l2,m2)$ and $[m2,r2)$, and thus **case UL** is identified. If the returned value is two, we got one test case by exchanging $[l2,m2)$ and $[m2,r2)$, and thus **case LU** is identified. However, this is inefficient because we know that the server will not return one. So, lets test for another range simultaneously. Here $[l3,r3)=[l3,m3)\cup[m3, r3)$ is another range to be searched. This range also contains exactly one test case. Now we query for the triple union: $[l1,m1)\cup[m2,r2)\cup[l3,m3)$. And there are four cases to occur.

* $[l1,m1)$, $[m2,r2)$, and $[l3,m3)$ contains test cases: **case LUL**
* $[l1,m1)$, $[m2,r2)$, and $[m3,r3)$ contains test cases: **case LUU**
* $[m1,r1)$, $[l2,m2)$, and $[l3,m3)$ contains test cases: **case ULL**
* $[m1,r1)$, $[l2,m2)$, and $[m3,r3)$ contains test cases: **case ULU**

Checking the returned value, we can identify these four cases at once; we obtain 2-bit information! As a result, we obtain total 3-bit information by two queries.

Considering all the above, we obtain 5/3=1.66666... bit information for a query on average! This is enough to identify all the test cases in 1000 queries.

`TSGCTF{ichi_zero_zero_zero_zero_zero_zero..._Counting!!}` <3

BlindCoder

Comments

Sign in with