Tags: ocr hash image misc

Rating:

## myopia

The code uses the [abonander/img_hash](https://github.com/abonander/img_hash) library to compute the perceptual hash of an image. To get the flag, we need to find two images with the same perceptual hash `ERsrE6nTHhI=` (Base64-encoded), one showing the text "sudo please" and the other showing the text "give me the flag". The text is OCR'd by the tesseract library.
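As a quick sanity check (this decoding step is our own, not part of the challenge code), the Base64 target decodes to 8 bytes, i.e. a 64-bit perceptual hash:

```python
import base64

# The target perceptual hash from the challenge, Base64-encoded.
target = "ERsrE6nTHhI="

raw = base64.b64decode(target)
print(len(raw), raw.hex())  # 8 bytes, i.e. a 64-bit hash
```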

Since the given image img1.png with the text "sudo please" already has the target perceptual hash, what remains is to generate a second image that the OCR library recognizes as containing the text "give me the flag" while still having the same perceptual hash under img_hash. We didn't think it was a good idea to try to trick the OCR library, since tesseract's codebase is large and widely used, so we started by looking at the img_hash library instead.

The first thing the img_hash library does is convert the image to grayscale. This step is handled by the image library. Looking at its source code, it performs a standard luminance calculation. However, we found that while the image library supports reading PNG files with an alpha channel (RGBA), it ignores the alpha value during the grayscale conversion. As a result, even near-transparent pixels contribute fully to the resulting grayscale image. We guessed that the tesseract library, by contrast, would handle transparency correctly, i.e. ignore near-transparent pixels.
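The key observation can be illustrated with a minimal sketch (the luma coefficients below are the common ITU-R 601 weights, used here for illustration; the image crate's exact coefficients may differ): a grayscale conversion that only reads the RGB channels assigns a nearly invisible pixel the same gray value as a fully opaque one.

```python
def to_gray(r, g, b, a=255):
    # Luminance computed from RGB only -- the alpha parameter is never
    # consulted, mirroring the alpha-ignoring behavior described above.
    return round(0.299 * r + 0.587 * g + 0.114 * b)

opaque_white = to_gray(255, 255, 255, a=255)
nearly_transparent_white = to_gray(255, 255, 255, a=1)
print(opaque_white, nearly_transparent_white)  # identical: 255 255
```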

We generated an image with the same pixel values as img1.png, but with every pixel's alpha set to 1. We then drew opaque black text reading "give me the flag" on top of a black area of the image. Viewed directly on a light background, only the text is visible; but after grayscale conversion (which ignores alpha), the result is identical to img1.png. As we guessed, this worked very well.
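The construction above can be sketched with Pillow (a sketch under assumptions: the solid-black placeholder stands in for img1.png, and the text position and output filename are made up, not taken from our actual solve script):

```python
from PIL import Image, ImageDraw

# Stand-in for img1.png -- in the real attack, load the challenge
# image here instead of this solid-black placeholder.
base = Image.new("RGB", (256, 128), (0, 0, 0))

# Keep the RGB values but set every pixel's alpha to 1: the grayscale
# conversion (which ignores alpha) still sees img1's pixels, while a
# normal viewer shows an almost fully transparent image.
r, g, b = base.split()
alpha = Image.new("L", base.size, 1)
out = Image.merge("RGBA", (r, g, b, alpha))

# Opaque black text over a black region: invisible to the grayscale
# conversion (black on black), but the only visible content for OCR.
draw = ImageDraw.Draw(out)
draw.text((10, 50), "give me the flag", fill=(0, 0, 0, 255))

out.save("img2.png")
```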