Tags: onnx rev

# TSGCTF2020 ONNXRev Writeup

Netron is not useful in this problem.

First of all, we need a way to edit the ONNX file in order to exploit its internal neural network. ONNX files are stored in protobuf format, so we can use a generic method for editing protobuf messages: the module google.protobuf.json_format supports conversion to and from JSON.

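to JSON:

    import onnx
    from google.protobuf import json_format

    # Load the challenge model and dump the whole protobuf message as JSON text.
    model = onnx.load("model.onnx")
    text = json_format.MessageToJson(model)
    print(text)
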

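from JSON:

    import onnx
    from google.protobuf import json_format

    with open("modified.json") as f:
        text = f.read()

    # Parse the edited JSON back into a fresh ModelProto and save it.
    model = json_format.Parse(text, onnx.ModelProto())
    onnx.save(model, "problem_mod.onnx")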
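
Reading a JSON dump of this size by hand is tedious, so a quick overview of the operators can help. The following is a minimal sketch using only the standard library. It assumes the dump uses the default lowerCamelCase field names that MessageToJson emits (for example `opType`) and that nested subgraphs, if any, sit in a node attribute's `g` field. The helper names `iter_nodes` and `summarize_nodes`, and the tiny `example` graph, are made up for illustration and are not taken from the challenge model.

```python
import json
from collections import Counter


def iter_nodes(graph):
    """Yield every node of a JSON-dumped graph, descending into subgraphs."""
    for node in graph.get("node", []):
        yield node
        for attr in node.get("attribute", []):
            # Control-flow operators keep their body graph in the "g" field.
            if "g" in attr:
                yield from iter_nodes(attr["g"])


def summarize_nodes(model_json):
    """Count how often each operator type appears in the dumped model."""
    return Counter(n.get("opType", "?") for n in iter_nodes(model_json.get("graph", {})))


# Hand-written stand-in for a real dump (hypothetical, for illustration only).
example = json.loads("""
{
  "graph": {
    "node": [
      {
        "opType": "Loop",
        "attribute": [
          {"name": "body", "type": "GRAPH",
           "g": {"node": [{"opType": "MatMul"}, {"opType": "Add"}]}}
        ]
      }
    ]
  }
}
""")

print(summarize_nodes(example))
```

For the real model, the dictionary would come from `json.loads` applied to the text dumped in the "to JSON" step.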


Now we can look into the internal graphs of model.onnx. Reading carefully, we can see what the model actually does. It's something like the following pseudo code:

    coeff_matrix[41][41] = **embedded**
    target_coeffs[41] = **embedded**
    ok = true
    for i from 0 until 41:
        accum = 0
        for j from 0 until 41:
            chara = trim_image_at_nth_char(input, j)
            id = run_nn_to_get_char_id(chara)
            accum += id * coeff_matrix[i][mod(i - j, 41)]
        ok = ok && accum == target_coeffs[i]

    return ok ? "Correct" : "Wrong"


It's just a matrix multiplication! By extracting the matrix elements (don't forget to shear coeff_matrix, since its column index is mod(i - j, 41)), we can solve for the return value of run_nn_to_get_char_id at each character position. Here is the result:


71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90, 52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7, 14, 52, 40, 90, 77

T = 71
S = 54
G = 32
C = 39
T = 71
F = 5
{ = 91
? = 17
? = 40
? = 12
? = 33
? = 22
? = 31
? = 27
? = 22
? = 9
? = 22
? = 90
? = 52
? = 12
? = 73
? = 22
? = 56
? = 65
? = 22
? = 51
? = 42
? = 43
? = 75
? = 9
? = 40
? = 86
? = 22
? = 10
? = 22
? = 7
? = 14
? = 52
? = 40
? = 90
} = 77
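
The recovery step can be sketched with NumPy as below. This is an illustration under assumptions, not the script actually used: the function names are hypothetical, the real 41x41 coefficients are replaced by a small random stand-in matrix, and the system is assumed to be invertible so that `np.linalg.solve` applies. The shear follows the index `mod(i - j, n)` from the pseudocode.

```python
import numpy as np


def unshear(coeff_matrix):
    """Build A with A[i, j] = coeff_matrix[i, (i - j) % n]."""
    c = np.asarray(coeff_matrix)
    n = c.shape[0]
    i, j = np.indices((n, n))
    return c[i, (i - j) % n]


def forward(coeff_matrix, ids):
    """Literal transcription of the accumulation loop in the pseudocode."""
    n = len(ids)
    return [
        sum(ids[j] * coeff_matrix[i][(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def recover_class_ids(coeff_matrix, target_coeffs):
    """Solve A @ ids = target_coeffs and round to the nearest integers."""
    a = unshear(coeff_matrix).astype(float)
    x = np.linalg.solve(a, np.asarray(target_coeffs, dtype=float))
    return np.rint(x).astype(int)


# Stand-in data: a small random matrix, made diagonally dominant after
# unshearing (column 0 ends up on the diagonal) so it is surely invertible.
rng = np.random.default_rng(0)
n = 5
demo_matrix = rng.integers(1, 10, size=(n, n))
demo_matrix[:, 0] += 100
demo_ids = [71, 54, 32, 39, 71]
demo_target = forward(demo_matrix, demo_ids)

print(recover_class_ids(demo_matrix, demo_target))
```

With the embedded coeff_matrix and target_coeffs extracted from the JSON dump, the same call would be expected to yield the 41 class ids listed above.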


These are **not** character codes; they are just the class ids into which the neural network classifies characters. What is worse, they are scrambled. So the only way to restore the flag is to actually run the neural network on each ASCII character.

Now we need to edit the model. We modified the JSON so that the model outputs the class id of the last input character:

    coeff_matrix[41][41] = **embedded**
    target_coeffs[41] = **embedded**
    for i from 0 until 41:
        accum = 0
        for j from 0 until 41:
            chara = trim_image_at_nth_char(input, j)
            id = run_nn_to_get_char_id(chara)
            accum = id
        result = accum

    return result


Running this model several times, we finally get the class id table:


87 = !
64 = "
50 = #
0 = $
11 = %
92 = &
37 = '
23 = (
6 = )
18 = *
53 = +
15 = ,
34 = -
19 = .
66 = /
56 = 0
31 = 1
24 = 2
44 = 3
9 = 4
42 = 5
84 = 6
7 = 7
69 = 8
45 = 9
21 = :
21 = ;
63 = <
62 = =
47 = >
1 = ?
83 = @
25 = A
79 = B
39 = C
61 = D
89 = E
5 = F
32 = G
78 = H
10 = I
72 = J
70 = K
75 = L
4 = M
12 = N
17 = O
58 = P
60 = Q
68 = R
54 = S
71 = T
85 = U
57 = V
8 = W
74 = X
46 = Y
41 = Z
55 = [
76 = \
35 = ]
38 = ^
22 = _
88 = `
30 = a
81 = b
82 = c
73 = d
51 = e
65 = f
86 = g
14 = h
52 = i
16 = j
90 = k
28 = l
80 = m
40 = n
43 = o
2 = p
36 = q
48 = r
27 = s
26 = t
49 = u
67 = v
29 = w
33 = x
3 = y
59 = z
91 = {
20 = |
77 = }


Now it's easy to restore the flag: `TSGCTF{OnNx_1s_4_kiNd_0f_e5oL4ng_I_7hink}`
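
The final lookup can be sketched in a few lines of Python. The names `CLASS_IDS`, `ID_TO_CHAR` and `RECOVERED` are illustrative. The table is transcribed in ASCII order from `!` to `}`, and the entry for class 88 is assumed to be the backtick, based on its position between `_` and `a`.

```python
# Class ids in ASCII order, from '!' (0x21) through '}' (0x7D).
CLASS_IDS = [
    87, 64, 50, 0, 11, 92, 37, 23, 6, 18, 53, 15, 34, 19, 66,  # ! through /
    56, 31, 24, 44, 9, 42, 84, 7, 69, 45,                      # 0 through 9
    21, 21, 63, 62, 47, 1, 83,                                 # : through @
    25, 79, 39, 61, 89, 5, 32, 78, 10, 72, 70, 75, 4,          # A through M
    12, 17, 58, 60, 68, 54, 71, 85, 57, 8, 74, 46, 41,         # N through Z
    55, 76, 35, 38, 22, 88,                                    # [ through backtick
    30, 81, 82, 73, 51, 65, 86, 14, 52, 16, 90, 28, 80,        # a through m
    40, 43, 2, 36, 48, 27, 26, 49, 67, 29, 33, 3, 59,          # n through z
    91, 20, 77,                                                # { | }
]

# Invert the table. Class 21 is listed for both ':' and ';', so the later
# entry wins here; the flag does not use that id.
ID_TO_CHAR = {cid: chr(0x21 + k) for k, cid in enumerate(CLASS_IDS)}

# Class ids recovered from the matrix equation, one per flag character.
RECOVERED = [
    71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90,
    52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7,
    14, 52, 40, 90, 77,
]

flag = "".join(ID_TO_CHAR[cid] for cid in RECOVERED)
print(flag)
```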