Tags: machinelearning machine_learning adversarial 


This is an adversarial machine learning challenge!
We applied the Fast Gradient Sign Method (FGSM) to compute the adversarial point.
Here is the code; for more details just go [HERE](https://zenhack.it/writeups/UTCTF2019/facesafe/)!
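
As a rough summary of what the loop in the code computes (the notation here is mine and is only a sketch): with $f$ the network, $e_t$ the one-hot vector of the target class $t = 4$, and step size $\epsilon = 1$, each iteration does

$$
x_{k+1} = \mathrm{clip}_{[0,255]}\Big(x_k - \epsilon \,\mathrm{sign}\big(\nabla_x L(f(x_k), e_t)\big)\Big),
\qquad
L(y, e_t) = \frac{1}{10}\sum_{i=0}^{9} \big(y_i - e_{t,i}\big)^2 .
$$

The minus sign is there because we want to decrease the loss towards the target. Repeating the step until the predicted class is $t$ makes this an iterative variant of FGSM.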
UTCTF 2019
Can you get the secret? http://facesafe.xyz
import keras
import numpy as np

from keras.models import load_model
from keras import backend as K
from PIL import Image

# The class we want to obtain (I have just counted from 0 to 9). No magic here!
TARGET = 4

# The strength of the perturbation
# ! VERY IMPORTANT: it is an integer because we want to produce images in RGB, not greyscale!
eps = 1

# Produce encoding of the output for class 4 (canonical base, e_4)
target = np.zeros(10)
target[TARGET] = 1

# Load the model using Keras; the standard way of saving a network's weights is the HDF5 format.
model = load_model('model.model')

# Produce np array from image, using PIL (one of the thousand ways for loading an image)
img = Image.open('img2.png')
data = np.asarray(img, dtype="int32")

print(np.argmax(model.predict(np.array([data]))[0]),' should be 4 BUT NOT NOW!')

# We need the function that encapsulates the gradient of the loss wrt the input.
# ! MOST IMPORTANT: the loss function is the main actor here. It defines what we want to search.
# In this case, we want the distance between the prediction and the target label 4.
# Hence, we use the loss written below.
session = K.get_session()
d_model_d_x = K.gradients( keras.losses.mean_squared_error(target, model.output), model.input)

x0 = data
conf = model.predict(np.array([x0]))[0]

# The attack may last forever?
# YES! But I tried with a black image and it converges.
# You should put here a fixed number of iterations...
while np.argmax(conf) != TARGET:

    # Thank you Keras + Tensorflow!
    # That [0][0] is just ugly, but it is needed to obtain the value as an array.
    eval_grad = session.run(d_model_d_x, feed_dict={model.input: np.array([x0])})[0][0]

    # Compute the perturbation!
    # This is the Fast Gradient Sign Method attack: eps scales the sign of the gradient.
    fsgm = eps * np.sign(eval_grad)

    # The gradient always points to maximum ascent direction, but we need to minimize.
    # Hence, we swap the sign of the gradient.
    x0 = x0 - fsgm

    # Here we need to bound the editing. No negative values in images!
    # So we clip all the negative values to 0.
    # We also clip all values above 255.
    x0[x0 < 0] = 0
    x0[x0 > 255] = 255
    conf = model.predict(np.array([x0]))[0]
    print("Confidence of target class {}: {:.3f}%\nPredicted class: {}\nConfidence of predicted class: {:.3f}%\n----".format(TARGET, conf[TARGET]*100, np.argmax(conf), conf[np.argmax(conf)]*100))

# If we obtained the evasion, we just save the new image
i = Image.fromarray(x0.astype('uint8'), 'RGB')
i.save('adversarial.png')  # the output filename here is an assumption
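
The script above needs the challenge's model file to run, so here is a minimal self-contained NumPy sketch of the same loop that can be tried without it. Everything in it is an assumption made for illustration: the toy linear softmax classifier `W` (feature k simply votes for class k), the helpers `predict`, `loss_gradient` and `targeted_sign_attack`, the starting vector `x0`, and the `max_iters` cap are not part of the challenge. It keeps the MSE loss against the one-hot target, the integer step `eps = 1` and the clipping to [0, 255], and it adds the fixed iteration limit that the comments suggest.

```python
import numpy as np

NUM_CLASSES = 10
TARGET = 4

# Hypothetical stand-in for the real network: a linear softmax classifier
# in which feature k votes for class k. This is NOT the challenge model.
W = 0.05 * np.eye(NUM_CLASSES)


def predict(x):
    """Return softmax class probabilities for a single input vector."""
    z = W @ x
    z = z - z.max()  # shift the logits so the exponentials stay stable
    e = np.exp(z)
    return e / e.sum()


def loss_gradient(x, target_onehot):
    """Analytic gradient of mean((softmax(W x) - target)^2) wrt x."""
    p = predict(x)
    dloss_dp = 2.0 * (p - target_onehot) / NUM_CLASSES
    # Chain rule through the softmax: J^T v = p * (v - <p, v>)
    dloss_dz = p * (dloss_dp - np.dot(p, dloss_dp))
    return W.T @ dloss_dz


def targeted_sign_attack(x, target, eps=1, max_iters=500):
    """Step against the sign of the gradient until `target` is predicted."""
    onehot = np.zeros(NUM_CLASSES)
    onehot[target] = 1
    x = x.astype("int32").copy()
    steps = 0
    while np.argmax(predict(x)) != target and steps < max_iters:
        # Minus sign: we descend the loss towards the target class.
        x = x - eps * np.sign(loss_gradient(x, onehot)).astype("int32")
        # Keep every value a valid 8-bit pixel intensity.
        x = np.clip(x, 0, 255)
        steps += 1
    return x, steps


# Toy "image": class 7 dominates at the start.
x0 = np.full(NUM_CLASSES, 10, dtype="int32")
x0[7] = 201
x0[TARGET] = 30

adv, steps = targeted_sign_attack(x0, TARGET)
print("predicted class:", np.argmax(predict(adv)), "after", steps, "steps")
```

Capping the loop with `max_iters` means a bad starting point cannot make the attack run forever; the caller only has to check the predicted class of the returned array to see whether the evasion actually succeeded.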


Original writeup (https://zenhack.it/writeups/UTCTF2019/facesafe/).