TSGCTF2020 ONNXRev Writeup

Netron is not useful in this problem.

First of all, we need a way to edit the ONNX file in order to exploit its internal neural network. ONNX files are stored in protobuf format, so we can edit them like any other protobuf message. The module google.protobuf.json_format supports conversion to and from JSON.

to JSON:

import onnx
import google.protobuf.json_format as json_format

model = onnx.load("problem.onnx")
# Avoid naming the variable `str`, which would shadow the built-in type
json_text = json_format.MessageToJson(model)
print(json_text)

from JSON:

import onnx
import google.protobuf.json_format as json_format

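with open("modified.json") as f:
    json_text = f.read()

# Parse into a fresh, empty ModelProto; `model` is not defined yet at this point
model = json_format.Parse(json_text, onnx.ModelProto())
onnx.save(model, "problem_mod.onnx")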
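Before reading the JSON dump by hand, it can help to list which operators it contains, including those inside nested subgraphs. The following is only a minimal sketch: it assumes the JSON was produced by MessageToJson with default settings (camelCase keys such as opType, with subgraphs stored under an attribute's g or graphs field), and the sample document is a hand-written stand-in, not the real problem.onnx. The helper name count_ops is hypothetical.

```python
import json
from collections import Counter


def count_ops(graph, counts=None):
    """Recursively count operator types in a JSON-converted ONNX graph dict."""
    if counts is None:
        counts = Counter()
    for node in graph.get("node", []):
        counts[node.get("opType", "?")] += 1
        for attr in node.get("attribute", []):
            # Control-flow operators keep their bodies as graph-valued attributes.
            if "g" in attr:
                count_ops(attr["g"], counts)
            for sub in attr.get("graphs", []):
                count_ops(sub, counts)
    return counts


# Hand-written stand-in for the JSON dump (NOT the real challenge model).
sample = json.loads("""
{"graph": {"node": [
  {"opType": "Loop", "attribute": [
    {"name": "body", "type": "GRAPH",
     "g": {"node": [{"opType": "MatMul"}, {"opType": "Add"}]}}
  ]},
  {"opType": "Equal"}
]}}
""")

counts = count_ops(sample["graph"])
for op, n in sorted(counts.items()):
    print(op, n)
```

On the real dump, the same function would be called on the "graph" entry of the parsed JSON text.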

Now we can look into the internal graphs of problem.onnx. Reading carefully, we can see what the model actually does. It's something like the following pseudocode:

coeff_matrix[41][41] = **embedded**
target_coeffs[41] = **embedded**
ok = true
for i from 0 until 41:
    accum = 0
    for j from 0 until 41:
        chara = trim_image_at_nth_char(input, j)
        id = run_nn_to_get_char_id(chara)
        accum += id * coeff_matrix[i][mod(i - j, 41)]
    ok = ok && accum == target_coeffs[i]

return ok ? "Correct" : "Wrong"

It's just a matrix multiplication! By extracting the matrix elements (don't forget to shear coeff_matrix, since its column index is cyclically shifted by the row index), we can work out the value run_nn_to_get_char_id must return for each character. The result:

71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90, 52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7, 14, 52, 40, 90, 77

T = 71
S = 54
G = 32
C = 39
T = 71
F = 5
{ = 91
? = 17
? = 40
? = 12
? = 33
? = 22
? = 31
? = 27
? = 22
? = 9
? = 22
? = 90
? = 52
? = 12
? = 73
? = 22
? = 56
? = 65
? = 22
? = 51
? = 42
? = 43
? = 75
? = 9
? = 40
? = 86
? = 22
? = 10
? = 22
? = 7
? = 14
? = 52
? = 40
? = 90
} = 77
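The recovery step above amounts to solving a linear system. The writeup does not say how the system was solved, so the following is only a sketch under assumptions: it uses exact Gauss-Jordan elimination over rationals, follows the index convention of the pseudocode (coeff_matrix[i][mod(i - j, n)]), and runs on a made-up 3x3 example instead of the embedded 41x41 values. The function names shear and solve are hypothetical.

```python
from fractions import Fraction


def shear(coeff_matrix):
    """Undo the cyclic shift: M[i][j] = coeff_matrix[i][(i - j) % n]."""
    n = len(coeff_matrix)
    return [[coeff_matrix[i][(i - j) % n] for j in range(n)] for i in range(n)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly with Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [v * inv for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Synthetic data (NOT the challenge's embedded values).
coeff_matrix = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
secret_ids = [71, 54, 32]

# What the model would compare against: accum for each row i.
n = len(coeff_matrix)
target_coeffs = [
    sum(secret_ids[j] * coeff_matrix[i][(i - j) % n] for j in range(n))
    for i in range(n)
]

recovered = [int(x) for x in solve(shear(coeff_matrix), target_coeffs)]
print(recovered)
```

With the real model, coeff_matrix and target_coeffs would be the two embedded tensors read out of the JSON dump.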

The ids recovered from the matrix are not character codes; they are the class ids that the neural network assigns to characters. Worse, the ids are scrambled. So the only way to restore the flag is to actually run the neural network on each ASCII character.

Now we need to edit the model. We modified the JSON so that the model outputs the class id of the last input character:

coeff_matrix[41][41] = **embedded**
target_coeffs[41] = **embedded**
for i from 0 until 41:
    accum = 0
    for j from 0 until 41:
        chara = trim_image_at_nth_char(input, j)
        id = run_nn_to_get_char_id(chara)
        accum = id
    result = accum

return result

Running this model repeatedly, with each ASCII character in turn as the last input character, we finally get the class id table:

87 = !
64 = "
50 = #
0  = $
11 = %
92 = &
37 = '
23 = (
6  = )
18 = *
53 = +
15 = ,
34 = -
19 = .
66 = /
56 = 0
31 = 1
24 = 2
44 = 3
9  = 4
42 = 5
84 = 6
7  = 7
69 = 8
45 = 9
21 = :
21 = ;
63 = <
62 = =
47 = >
1  = ?
83 = @
25 = A
79 = B
39 = C
61 = D
89 = E
5  = F
32 = G
78 = H
10 = I
72 = J
70 = K
75 = L
4  = M
12 = N
17 = O
58 = P
60 = Q
68 = R
54 = S
71 = T
85 = U
57 = V
8  = W
74 = X
46 = Y
41 = Z
55 = [
76 = \
35 = ]
38 = ^
22 = _
88 = `
30 = a
81 = b
82 = c
73 = d
51 = e
65 = f
86 = g
14 = h
52 = i
16 = j
90 = k
28 = l
80 = m
40 = n
43 = o
2  = p
36 = q
48 = r
27 = s
26 = t
49 = u
67 = v
29 = w
33 = x
3  = y
59 = z
91 = {
20 = |
77 = }

Now it's easy to restore the flag: TSGCTF{OnNx_1s_4_kiNd_0f_e5oL4ng_I_7hink}
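The final lookup can be sketched as follows. The class ids are copied from the table above in ASCII order ('!' through '}'), and the 41 target ids are the values recovered from the matrix. The variable names are my own, and the way the duplicate id 21 (listed for both ':' and ';') resolves is just a side effect of this sketch; neither character occurs in the flag.

```python
# Class ids for '!' (0x21) through '}' (0x7d), in ASCII order, from the table.
CLASS_IDS = [
    87, 64, 50, 0, 11, 92, 37, 23, 6, 18, 53, 15, 34, 19, 66,   # ! " # ... /
    56, 31, 24, 44, 9, 42, 84, 7, 69, 45,                       # 0 .. 9
    21, 21, 63, 62, 47, 1, 83,                                  # : ; < = > ? @
    25, 79, 39, 61, 89, 5, 32, 78, 10, 72, 70, 75, 4,           # A .. M
    12, 17, 58, 60, 68, 54, 71, 85, 57, 8, 74, 46, 41,          # N .. Z
    55, 76, 35, 38, 22, 88,                                     # [ \ ] ^ _ `
    30, 81, 82, 73, 51, 65, 86, 14, 52, 16, 90, 28, 80,         # a .. m
    40, 43, 2, 36, 48, 27, 26, 49, 67, 29, 33, 3, 59,           # n .. z
    91, 20, 77,                                                 # { | }
]

# Ids recovered from the matrix, one per flag position.
TARGET_IDS = [
    71, 54, 32, 39, 71, 5, 91, 17, 40, 12, 33, 22, 31, 27, 22, 9, 22, 90,
    52, 12, 73, 22, 56, 65, 22, 51, 42, 43, 75, 9, 40, 86, 22, 10, 22, 7,
    14, 52, 40, 90, 77,
]

# Invert the table: class id -> character. For the duplicated id 21 the
# later entry (';') wins, which does not matter for this flag.
id_to_char = {
    class_id: chr(code)
    for code, class_id in zip(range(ord("!"), ord("}") + 1), CLASS_IDS)
}

flag = "".join(id_to_char[i] for i in TARGET_IDS)
print(flag)  # TSGCTF{OnNx_1s_4_kiNd_0f_e5oL4ng_I_7hink}
```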
