Tags: logic rev

Rating: 5.0

# Smart Solver solved without SMT and Z3
Well the heading is intentional since I did not know about SMT and Z3 when I solved this problem during the CTF, and solved it logically. So read this if you are interested in a logical solution to the problem.

# Overview

The binary that we are provided uses a lot of instructions to compare each character of a string with all other characters of the string, if any of the test fails it just jumps to a location that exits. So we have to find that string which passes all tests.

# Observations
- The string length is 73
- The operations compare the numeric value of characters with each other so it defines a lexographic order for a character with all other characters of the string.

# Idea

Given a string with 73 characters s, Consider a 73 X 73 2-d array a. Define a[i][j] in the following manner -:

if(s[i] == s[j]):
a[i][j] = -1
elif(s[i] < s[j]):
a[i][j] = 0
else:
a[i][j] = 1


Now note that we can obtain the entire 2-d array a from the binary if we are able to parse the instructions in the disassembled code of the binary. This is because each character is compared with a number of other characters using either the jbe or jae instruction and if it is not compared with some character then it is safe to assume that both of them are equal, since no other possibility exists.

## Code to parse the binary and get the 2-d array

### Some debugging aids which can be safely ignored
The code includes some try-except and assert statements to verify that what we are doing is right. The variable ct is the error counter. The code throws one error at an instruction before the comparisons are even started so ct is not allowed to be greater than 1.


import re
import json
import sys
a=[[-1 for i in range(73)] for i in range(73)]

def dump_data():
# Objective a[i][j] = 0 if s[i] < s[j] , 1 if(s[i] > s[j]), -1 if (s[i] == s[j])

regex = '.*movzx e([ad])x,BYTE PTR $rbp-(0x[0-9a-f]*)$.*' # regex to match src and dest
jmp_reg = '.*j([ab])e.*' # regex to match operator (less or greater)

src,dst=-1,-1
ct=0
with open("disassembly",'r') as f:
# print(line)
try:
reg, pos = re.match(regex, line).groups()
pos = int(pos, 16) - 0xd8

if(reg == 'd'):
assert src == -1
src = pos
print("Setting src to ",pos)
elif(reg == 'a'):
assert dst == -1
dst = pos
print("Setting dst to ", pos)

except Exception as e:

if isinstance(e, AssertionError):
ct += 1
print(src, dst)
print(line)
if(ct>1):
sys.exit(-1)
try:
oper = re.match(jmp_reg, line).groups()
assert src != -1 and dst != -1 and a[src][dst] == -1
if oper == 'b':
print("oper is b")
a[src][dst] = 0
else:
print("Oper is a")
a[src][dst] = 1
src, dst = -1, -1
except Exception as e:
if isinstance(e,AssertionError):
ct += 1
print(src, dst)
print(line)
if(ct > 1):
sys.exit(-1)

open("out", 'w').write(json.dumps(a))


# Reversing the 2-d array to obtain flag
So now we have a. How to obtain the string s from this 2-d array. So the idea is this, we can first group indices that map to the same character i.e indexes x and y belong to the same group if s[x][y] == -1 == s[y][x].

**Note:** We do not need to ensure the second condition since that is a problem constraint and it cannot fail otherwise the data would be wrong. So what I mean is that for data to be correct the following invariant needs to be satisfied

s[i][j] = ~s[j][i] if (s[i][j] != -1) else s[i][j] == -1 == s[j][i]


- Once we have grouped the characters we know that characters at indices in the same group are same.
- Also note that suppose for an index i we can compute less_ct[i] = count({j: s[i][j] == 0}), gt_ct[i] = count({j: s[i][j] == 1}) and eq_ct[i] = count({j: a[i][j] == 0}) . less_ct[i] is a count of the number of characters which are lexographically **larger** than s[i].
- Now suppose character indices i1, i2, i3 belong to a group G i.e they map to same character. So they will have the same values of these quantities since the character comparison will yield equivalent results. So the variables less_ct, gt_ct are indeed a characterstic of the group and not of a particular index.
- Now suppose a group is mapped to the character 'b'. How will its less_ct compare to the group that maps to the character 'c'? It is obvious that the number of characters that are lexographically larger than 'c' are less than the number of characters that are lexographically larger than 'b' since the latter set will also include the characters that are equal to 'c'.
- So it is convincing enough that if we sort these groups on the basis of less_ct and start assigning them the characters from 'a' we could complete the entire string.

**Note:** We have made an assumption that all characters from 'a' to 'z' are being used. And this assumption only holds true if the number of groups that we are able to form are >= 26. In our case we got 28 groups i.e. all 26 alpahabets and '{' and '}'

## Code to obtain string from 2-d array

**Note:** It seemed like that we need to reverse the flag after obtaining the entire string to have the format flag{...}

def rev():
ingroup = [False for i in range(73)]

groups=[]

for i in range(73):
if(ingroup[i]):
continue
ele = data[i]
eq_ct,less_ct,gt_ct=0,0,0
same_indexes = []
for j in range(73):
b=ele[j]
if b==-1:
eq_ct+=1
same_indexes.append(j)
ingroup[j] = True
elif (b==0):
less_ct+=1
else:
gt_ct+=1
groups.append({"eq_ct":eq_ct,"less_ct":less_ct,"gt_ct":gt_ct,"same_indexes":same_indexes})
groups.sort(key=lambda x: x['less_ct'])
flag=['' for i in range(73)]
for i in range(len(groups)):
c = chr(ord('a')+i) if(i!=len(groups)-1) else '}'
group = groups[i]
for index in group['same_indexes']:
flag[index] = c
flag="".join(flag)[::-1]
print(flag)


# Result
flag{thequickbrownfoxjumpedoverthelazydogandlearnedhowtoautomateanalysis}