Tags: onnx rev 

# TSGCTF2020 ONNXRev Writeup

Netron is not useful in this problem.

First of all, we need a way to edit the ONNX file in order to exploit its internal neural network. ONNX files are serialized as protobuf messages, so we can use a generic approach for editing protobuf files: the module `google.protobuf.json_format` supports conversion to and from JSON.

to JSON:
```:
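import onnx
import google.protobuf.json_format as json_format

# Load the model and dump the whole protobuf message as JSON text.
model = onnx.load("problem.onnx")
model_json = json_format.MessageToJson(model)  # avoid shadowing the built-in `str`
print(model_json)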
```

from JSON:
```:
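import onnx
import google.protobuf.json_format as json_format

with open("modified.json") as f:
    model_json = f.read()

# Parse the edited JSON into a fresh ModelProto and save it as an ONNX file.
model = json_format.Parse(model_json, onnx.ModelProto())
onnx.save(model, "problem_mod.onnx")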
```

Now we can look into the internal graph of `problem.onnx`. Reading it carefully, we can see what the model actually does. It is something like the following pseudocode:

```:
coeff_matrix[41][41] = **embedded**
target_coeffs[41] = **embedded**
ok = true
for i from 0 until 41:
    accum = 0
    for j from 0 until 41:
        chara = trim_image_at_nth_char(input, j)
        id = run_nn_to_get_char_id(chara)
        accum += id * coeff_matrix[i][mod(i - j, 41)]
    ok = ok && accum == target_coeffs[i]

return ok ? "Correct" : "Wrong"
```

It's just a matrix multiplication! By extracting the matrix elements (don't forget to shear `coeff_matrix`) and solving the resulting linear system, we can recover the return value of `run_nn_to_get_char_id` for each character. Here they are:

```
71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90, 52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7, 14, 52, 40, 90, 77

T = 71
S = 54
G = 32
C = 39
T = 71
F = 5
{ = 91
? = 17
? = 40
? = 12
? = 33
? = 22
? = 31
? = 27
? = 22
? = 9
? = 22
? = 90
? = 52
? = 12
? = 73
? = 22
? = 56
? = 65
? = 22
? = 51
? = 42
? = 43
? = 75
? = 9
? = 40
? = 86
? = 22
? = 10
? = 22
? = 7
? = 14
? = 52
? = 40
? = 90
} = 77
```
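The recovery step can be sketched as follows. This is only an illustration under stated assumptions: the real 41x41 coefficients are embedded in the model and are not reproduced here, so the code uses a made-up 3x3 `coeff` and made-up `secret_ids`, and it assumes the coefficients are exact integers. The helper names `unshear` and `solve_exact` are hypothetical. `unshear` builds the matrix `A[i][j] = coeff_matrix[i][mod(i - j, n)]` that the pseudocode effectively multiplies by, and `solve_exact` solves `A x = target_coeffs` over the rationals.

```python
from fractions import Fraction

def unshear(coeff_matrix):
    """Build A with A[i][j] = coeff_matrix[i][(i - j) % n]."""
    n = len(coeff_matrix)
    return [[coeff_matrix[i][(i - j) % n] for j in range(n)] for i in range(n)]

def solve_exact(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with exact fractions.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(a)
    # Augmented matrix [a | b] with rational entries.
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = 1 / rows[col][col]
        rows[col] = [v * inv for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# Toy data standing in for the embedded constants (NOT the real values).
coeff = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
secret_ids = [71, 54, 32]
n = len(coeff)

# What the model would compute as target_coeffs for the correct input.
target = [sum(secret_ids[j] * coeff[i][(i - j) % n] for j in range(n)) for i in range(n)]

recovered = [int(x) for x in solve_exact(unshear(coeff), target)]
print(recovered)  # [71, 54, 32]
```

Exact fractions are used instead of floating point so that the recovered ids come out as integers without any rounding step.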

These are **not** character codes; they are just the class ids that the neural network assigns to the characters. What is worse, the mapping is scrambled. So the only way to restore the flag is to actually run the neural network on each ASCII character.

Now we need to edit the model. We modified the JSON so that the model outputs the class id of the last input character:

```:
coeff_matrix[41][41] = **embedded**
target_coeffs[41] = **embedded**
for i from 0 until 41:
    accum = 0
    for j from 0 until 41:
        chara = trim_image_at_nth_char(input, j)
        id = run_nn_to_get_char_id(chara)
        accum = id
    result = accum

return result
```
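Collecting the table then amounts to one run of the modified model per candidate character. Below is a minimal sketch of that loop. The name `build_class_table` and the `classify_last_char` callable are hypothetical: in practice the callable would render the character into the input image and run `problem_mod.onnx`, which is not shown here, so the demo plugs in a stand-in function instead.

```python
def build_class_table(classify_last_char, alphabet=None):
    """Map each class id to the list of characters classified as that id.

    `classify_last_char` is a hypothetical callable: character -> class id.
    Keeping a list per id makes collisions (two characters, one id) visible.
    """
    if alphabet is None:
        # Printable ASCII from '!' (0x21) through '}' (0x7d).
        alphabet = [chr(code) for code in range(0x21, 0x7E)]
    table = {}
    for ch in alphabet:
        table.setdefault(classify_last_char(ch), []).append(ch)
    return table

# Stand-in classifier for demonstration only (NOT the real network).
demo = build_class_table(lambda ch: ord(ch) // 2, alphabet="abcd")
print(demo)  # {48: ['a'], 49: ['b', 'c'], 50: ['d']}
```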

Running this model once for each printable ASCII character, we finally get the class id table:

```
87 = !
64 = "
50 = #
0 = $
11 = %
92 = &
37 = '
23 = (
6 = )
18 = *
53 = +
15 = ,
34 = -
19 = .
66 = /
56 = 0
31 = 1
24 = 2
44 = 3
9 = 4
42 = 5
84 = 6
7 = 7
69 = 8
45 = 9
21 = :
21 = ;
63 = <
62 = =
47 = >
1 = ?
83 = @
25 = A
79 = B
39 = C
61 = D
89 = E
5 = F
32 = G
78 = H
10 = I
72 = J
70 = K
75 = L
4 = M
12 = N
17 = O
58 = P
60 = Q
68 = R
54 = S
71 = T
85 = U
57 = V
8 = W
74 = X
46 = Y
41 = Z
55 = [
76 = \
35 = ]
38 = ^
22 = _
88 = `
30 = a
81 = b
82 = c
73 = d
51 = e
65 = f
86 = g
14 = h
52 = i
16 = j
90 = k
28 = l
80 = m
40 = n
43 = o
2 = p
36 = q
48 = r
27 = s
26 = t
49 = u
67 = v
29 = w
33 = x
3 = y
59 = z
91 = {
20 = |
77 = }
```

Now it's easy to restore the flag: `TSGCTF{OnNx_1s_4_kiNd_0f_e5oL4ng_I_7hink}`
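The final lookup can be sketched as below. `CLASS_IDS` is the table above written in ASCII order starting at `!`, and `FLAG_IDS` is the id sequence recovered from the matrix; the variable names are only illustrative. Note that `:` and `;` both map to class id 21 in the table, so the inverse mapping is ambiguous there, but the flag does not use that id.

```python
# Class ids for '!' (0x21) through '}' (0x7d), in ASCII order, from the table.
CLASS_IDS = [
    87, 64, 50, 0, 11, 92, 37, 23, 6, 18, 53, 15, 34, 19, 66,   # ! .. /
    56, 31, 24, 44, 9, 42, 84, 7, 69, 45,                       # 0 .. 9
    21, 21, 63, 62, 47, 1, 83,                                  # : .. @
    25, 79, 39, 61, 89, 5, 32, 78, 10, 72, 70, 75, 4,           # A .. M
    12, 17, 58, 60, 68, 54, 71, 85, 57, 8, 74, 46, 41,          # N .. Z
    55, 76, 35, 38, 22, 88,                                     # [ .. `
    30, 81, 82, 73, 51, 65, 86, 14, 52, 16, 90, 28, 80,         # a .. m
    40, 43, 2, 36, 48, 27, 26, 49, 67, 29, 33, 3, 59,           # n .. z
    91, 20, 77,                                                 # { | }
]

# Class ids of the 41 flag characters, recovered from the linear system.
FLAG_IDS = [
    71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90,
    52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7,
    14, 52, 40, 90, 77,
]

# Invert the table; on a collision (id 21) the first character is kept.
id_to_char = {}
for offset, class_id in enumerate(CLASS_IDS):
    id_to_char.setdefault(class_id, chr(ord("!") + offset))

flag = "".join(id_to_char[class_id] for class_id in FLAG_IDS)
print(flag)  # TSGCTF{OnNx_1s_4_kiNd_0f_e5oL4ng_I_7hink}
```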