Rating:

# Summary

In this challenge we are given some obfuscated Python code that ultimately emulates a custom machine with memory encoded in base 3. The flag can be found in memory. Note: To find the files referred to here, please see [this repository](https://gitlab.com/shalaamum/ctf-writeups/-/blob/master/FE-CTF%202022/snake-jazz/).

# Obfuscation layer

We are given two Python files, runme.py and magic.py. Running the former we are prompted to enter a key, and after inputting a teststring the program responds with sadpanda.jpg.

$./runme.py . :. :. .:' ,:: .:' ;:' :: ;:' : .:' . :. _________________________ : _ _ _ _ _ _ _ _ _ _ _ _ : ,---:".".".".".".".".".".".".": : ,'"::.:.:.:.:.:.:.:.:.:.:.::' .. :-===-===-===-===-===-:' .-._: : -.__. ,' ,--------"-------------'--------. "--.__ __.--"' ""-------------""' Please enter key: testkey sadpanda.jpg  Presumably the program checks whether we have input the correct flag. So let us now look at runme.py. The only content of this file after import magic is a very long expression that begins with _+___+---+---+-__+++-+_-_+_++_++++-+-_+_++_-__-_-+++-++---, continuing on in the same fashion. While + and - are unary and binary arithmetic operations in Python, _, as well as __ and ___ etc. are usually not defined, so this must come from the magic module, so let us have a look at that now. magic.py contains first the definition of a class X, which we will turn to later, followed by the following code that gets executed on import. python for i in range(1, 100): setattr(sys.modules['__main__'],'_'*i,X(3**i))  We can thus see that e.g. _ used in runme.py refers to an object of type X that was initialized as X(3**1), and __ is X(3**2), etc. Let us now look at the definition of X. Looking at the __init__ function we see that an object of X holds as data three integers, with member names a, b, and c: python class X(object): def __init__(x,a=0,b=0,c=0): x.a=a x.b=b or ~-a x.c=c  Apart from the __init__ function we find definitions for various unary and binary arithmetic or boolean operations, as an example we have here the __invert__ function (which gets called for an expression of the form ~x, if the type of x is X): python def __invert__(x): x.c-=x.c return X(x.a,x.b,x.c)  Just like this one also the other ones return a new object of type X with a, b, and c depending on the arguments, but with no side effects. The last member function of class X is __del__, which makes up the bulk of magic.py, and which gets called on deletion of the object. Only in __del__ input is taken in from the user, but looking through the function we can see that no new object of type X gets constructed in it, and neither does anything else happen (such as changing global variables) that would lead to code running after termination of a call to __del__ to depend on what happens during the call. This implies that input from the user can only ever impact what happens in the one call to __del__ in which that input was read in. While evaluating the long expression from runme.py, many objects of X will be created and deleted, however as __del__ begins with python def __del__(x): if not x.c: return  some of those calls will have no effect. So let us see what calls with which arguments actually pass this test, by adding something like the following on the very next line. python print(f'Call to __del__, object to reproduce:\nX(a=0x{x.a:x}, b=0x{x.b:x}, c=0x{x.c:x})')  The reason we print out a, b, and c in hexadecimal is that the numbers are quite large, and Python would complain when trying to parse them later if we used a decimal representation. The files magic2.py and runme2.py are obtained by making the change above. Executing runme2.py we obtain only a single printout announcing a call to __del__ that passed the x.c condition:  Call to __del__, object to reproduce: X(a=0xf5252907aa10ab6b52973f93fed61d01e6971a55867bd6951a6ed5648135e04d7e273fee7c571d49ea0bbd16990761aa51cb33051a16ecf621b04a666ffb0f898a02a6909d475d40504d5eb1724f519f3657dbec9c52e765bdb9c999945c4e45ddcbbfa7823f90901c884697f2d4be68ec1002a304446ba391cc6f4c1440ec86d927b41196005422de0800c89b35076b2783f261406cf7e31c58ac64d7e87b14a7ab5898b258179bdfd50cf75962f47d6302f9b58e501a78bcfcda3e9bed740c72be0306c62da2a6f7451a4bf94b8b5d276984ebe8a41356097dee1e2efd3f334b76bb3e688072d8ea2cb6a4bc729759ef7183de43a58c99bba04110220d3f30365185419d85adab0330fc64563e33632b03ac396024417bb7d34f17341daa528ea42b0f9d2e3aa3f01da0ce89c5a6b33472bf304621675fea0b049ff995bb5047f2bdde4a3e4139c00768eb1f1cd560def7956e46b94d3280772dfd86d0af33b485dccbffd7a931e85b422166dda0eb2cbd08ab956da9132823acc6e4cd23376118fe5e0906f758d37a1c6c77525f45c4028371cf42fb2e9f3cb8e2d979d84400b6605aef8bf8dd123fc8f1006d222ec7063214d655f6037d3373422df5d4b37c5dd31feb7bbf6f8aaa870e23220979be5dc5ccde592cff25e03fe5f35e31b1504f5baac5b8c5f1203e252d677e85dd10a39c65872124dabf57132001f78d8f4f8fdf10721b408d33a784a7fdf4b13be594ef48483b68b46ac893187bb27a0ac1aa55462cccf6f916fa52fd19372a24090218a43c8f4e8a4860792806dc968f1bc21b5c8ce417a24932079d536d811433e45e0db40cc0af718eeb08a17f25e0fb5227223c4577b9e7fb377340da03f30904585e404e1c54d5d7a48a0f457f2a339b9dbc2bf5f5a6d096bd0719e8e58b62d6cb1bd5668037cdca13605a3ef2b4ac0a48554fe7fa3a5481c3bdfb897ea3bbf45b8be86a06dbeb100cf7d922d3a00e5d8408d72ecb1414f3fa3e7699e02bade1a086385d862ca587ec92840e6df1f306bcf31aa721d8aaa84d41c1e102665ba735450abdfb7c584aaf452a103877054d50607824f157b813821afcaf341e1262d4228039deccd9c1bb8f4db72bdcf57a68543c9257959bd6e72371469c804b1df2c96d7c9ed65cd59391855df0318a6180967a40df1556e09246e7139537d2d02b8cb90c8e73f3a2176186aca1a38ce0d2635b6bf20725d09ac719521ac157fb8800511744ffb3d24daee3973da0da8edba159d9e3e4d1b178075eb6947f07fdc3d4888c1058d588dd5d468d21875557fe8ad65abd10c9e1852f9324302f9bbc8783f303be0eabca1549dfc2286d4bce502b9f3b8cd7f02f1b8ef16b8aaded71a18a3bb28dd437cf5b68ab49cbe55d996b115a54eddd8f1291b98d1cd277640824d476d384eb1d8a2c03746a685821c11f8d77ce45f721be893bcc4b7494b56391a97c6883d05e5e2cb2049fbea98cc2d228955a513faa00e0d0fcb034ee2f2b6248992d2e9d853621793000b5719c9166e71d222650dcba1442ba3808da13df5f7547b1a94bfa2abafbe682a3216e038f1091eda9c2877e9bc0a46bb2c06877bc40620da516e00c38142d45eb69d07235477cef8aef038f1570fbd638d17a851c63e81ff34f6b8b2238d3329de77fa77a2aa37fd78732aca902be47f8524c97a7132476cf94482c1052127711928f3c7a04a0ede860b06338c219c8e6406b7215a351baa8ed664bd5be785245ffb57bf45b72e4177c834084c9182ded559fc4930035d1ebe39d6d05ef138a2a763c2b41601120f676a4e3e26a9af94cba7280580633661c7e7222067f84df7b8ddade5f5a19f2d15509ba4469e44261e268e0d9e850b99758296fbea7bdd6522e715d42c258a6d878691994a86a35ad8db, b=0xa738fbf877afd0ba31d02a27bc32ade21db41e0c2dddd279e7046f8ab2850129842b55704bdc96fd8fc5b00503cded660076ca97af0870f3999a34a5776a84318bb349c6508a32ebb806bc7ad6ea251733349a6c81eebf874fa647f77c0bf882d4aec2fc71d6f17b261ae4c4caf80606296e6681ed2dd964289a862423db6617f60b87b5b3d3d62dd700544fe5b3e2515382f74cbba936c5fba8ff816221d3175e1b616637d8d72f43091593c82fd87932b0da6aa3199eca65e91896c29e3cb49ac86aa31f98273b05f19fce26deca7c7bb786048256903ae1b5a2f9b7c389a4acb14f349798d3aafcfd59dc8502a9aad21cb69d5122141a88ff8e213d064f7e56f23afc51627555bfd26cd544c653554ae51aaeec39398d0c3a2585d1aaa1caf26c51dec1cb5f7464fe870553cc455316f6bd51abe6e31c566f97922ea3e4072c76aab3811db49d5ef38b8c525429c19b86d71f8258919736b85cb6bfa8bb89a6687fd38c95395de0f2fc5f1c199b0ab0e8dd20e1ba178be7449b7ca72186cc1a91408942e7d44ebdca3657d71cb196830ed4ba8b7295eade9b5c28a4e7e3878ffaf95e489429ec328109f6a1985a8697c2433ed811001344c6ef6eb693f644b051adaa363e5b3026723325c44ddee6a5ce3736bfaddaeafcfe9f0cccf9bf34450ee273bb056821e65c30d2fcff1c4ad3a10c29986931d218fed36be54fdd8415902447045475de5c1597b813caebf63c51ef265f387a31662f6e3f56fe8d632d2842aa1619e2f26973cee12cf1a62af52b632112c08d349effe78d96a4fef7960c7a5b6966cff1aaf3ea3f837f327f43d619e377e7494b6eefc119c8f86ea9507af6ebbf4eb4c37428223ef67afda9b633568928783d6e57a4d498cd8835032c873fe942b5144cb1d40d1b955d1c72817dc8a4de65348a7dcd5682636b3478b3cc0f98d2cc83e8e93ec0e0462c5350dfb1839cc9f96f239b33fc698e72aa3b1571458b15538b37eb15a1775813bc18635a912f9bf124ad9c428c88747fe59eff7e7ec4b50db42196351b86cb45ff838982f56b5bc60d5afba65b2df159b1c61499413ea9bc8937d2a7f8e48f003abafdf01e55f5cabc73b761ece120ac06dc2a1d2eb1aad14757e56cdaf17baad6a2c9d07e10c9f939cc20af7d72fcb257cc65ec8252b1b666fd514b8e746adccba16784d817bf014a9f9c81d2a02ba2695fc7efe5bcea6081ef8fd28d0b29d10b72531f0715439677b6f045c363fab66dcfe0a562e19d91cbfd208ccce7f264913b1b1beeb6e41a3c96396db110fb6a6a295a60341276f2caeb955124b4a4b4e87d6ec692bc18d28563ad62d0ef0187663ebd1cb7cf30eb13f960939c14c17d17bd9f4f3602c34f52243653519c8393704506465d7e9163412059245a2db9b418556164c19287b461ee4d58f5b722df4adc1b1365f79f57d5a16cdeacac35fc46c8ccc45fd2f04d13e3f79f7f16ac5d1ba915c4ad16fa1ee69a3ff86d4e0e745531e9339db60f253c3462bf326ebb989e46a10358f8648cf0e928bdc5a4ccb352daf93d0bb9ca973a579069ec8df57cbf28ad62b47b5642194e43955184caa278b0595a2252f6c0d53f92a16697f8a167ac62cedcb848fc82eb18f562767b9d31b9d477174f8b1fd03c1182a0577265a1db83abd5f7b4472a353e2b1fd3e81563b76d0be67e73cb9fb72150532c1d1f2dad3fe16b42dc842f0051bb70c7a7ee3a869aae7135dc5100a179776572eb088f84f37073159c85c37ca7add3be1f511674502924930406991c0262e3cae062589eced1e6073fcb9173bc1d348b004cd8e31959fca97b7053f47c6e331120fa2f572a217e55461a7d7e13f75b0eff7e74fa4ae3204e8445e046aadc3f382b698aa8d5e4b275676b3ab7cd45bdfd42e799033, c=0x3)  Instead of the long expression in runme2.py we could instead just construct and delete the object indicated above. If we do this, then all the arithmetic and boolean unary and binary operations for X are not going to be used anymore, so they have become irrelevant and can be discarded. Similarly for the definition of _ and so on. We thus arrive at magic3.py and runme3.py, where we additionally renamed __del__ to run. # Analysis of the emulator The following is the beginning of the run function in magic3.py we now have to analyze. python def run(x): if not x.c: return y=[0]*9 while 3**y[8]<x.a: z=x.b//3**y[8]%3**7 y[8]+=7 a=z//3**4 b=z//9%9 c=z%9 d=c+x.b//3**y[8]%3**7*3*3 if a==0: os._exit(0) elif a==1: y[8]+=7 y[b]=d  After that there are a bunch more elif branches depending on the value of a. So we initialize 9 integer variables in the list y to 0, and then there is a loop where we first calculate some values for z, a, b, c, and d, and then there is a big case distinction based on what the value of a is. Searching through the code we can quickly see that x.a and x.c are only used in the first couple of lines; x.c is only used in the if not x.c: return check at the start and x.a only in the exit condition of the loop. So the behavior of this function must mostly be encoded in the value of x.b. The following are the lines in the run function in which x.b is accessed or changed: python z=x.b//3**y[8]%3**7 d=c+x.b//3**y[8]%3**7*3*3 y[b]=x.b//3**y[c]%3**(3*3) x.b+=y[b]*3**y[c]-x.b//3**y[c]%3**9*3**y[c]  We can make those lines a little more readable (perhaps after looking up [order of precedence for Python operations](https://docs.python.org/3/reference/expressions.html#operator-precedence)): python z = ( x.b // (3**y[8]) ) % (3**7) d = c + ( ( x.b // 3**y[8] ) % ((3**7)) ) * 3 * 3 y[b] = ( x.b // (3**y[c]) ) % (3**(3*3)) x.b += ( y[b] * (3**y[c]) ) - ( ( x.b // (3**y[c]) ) % 3**9 ) * (3**y[c])  We see that on the right hand side x.b is always first divided by a power of 3 that depends on some other variable, and then the result is taken modulo a fixed power of 3. This has a very natural interpretation if we think of x.b as a number in base 3. If we index the trits (like bits, but in ternary, i.e. base 3) of x.b starting with 0 for the least significant one, then the first line can be interpreted as extracting trits y[8] through y[8] + 6 from x.b, in the sense that the resulting value has as least significant trit the one that occurs at index y[8] in x.b. Similarly for the third line, though in this case rather than 7 trits we extract 9 trits. In the second line 7 trits are extracted, and the result is shifted up by two trits. Finally, the last line changes the value of x.b. With the interpretation so far in mind we can rewrite this line as follows. python x.b = ( x.b - ( ( x.b // (3**y[c]) ) % 3**9 ) * (3**y[c]) ) \ + ( y[b] * (3**y[c]) )  What happens is that with ( x.b // (3**y[c]) ) % 3**9 we first extract 9 trits beginning from the one indexed by y[c]. The multiplication by 3**y[c] shifts that number up so that the we now obtain a copy of x.b in which all trits except the 9 ones starting from the one indexed by y[c] have been set to 0. Subtracting this from x.b thus results in a copy of x.b in which the 9 trits starting with the one indexed by y[c] have been cleared. Finally, as long as y[b] is a nonnegative integer smaller than 3**9, adding y[b] * (3**y[c]) sets those 9 trits to be the least significant 9 trits of y[b]. We can thus conclude that x.b acts as the memory of the emulated ternary machine. The smallest units that are read or written seem to be both 7 and 9 trits long, which is a bit unusual. Accordingly, memory is addressed at the lowest level, the trit (i.e. in the first line above we have 3**y[8] rather than 3**(y[8]*7)). # Getting the flag If the program encoded in x.b checks whether our input is the flag, then the easiest way this might happen would be to just compare with the correct flag that is stored in memory (perhaps after decryption). In that case we can find it just by printing out the content of the memory (perhaps at the right moment). But it could also be that the correct flag in memory is encrypted or hashed in some way and our input will be compared after being encrypted or hashed as well. In the latter case we might have to identify what the individual instructions do and disassemble/debug the program. But as a first check we can try to print out memory access by replacing the four lines that use x.b above with a new function that also prints out the memory accessed. We thus add the following to the definition of class X: python def mem_get(self, index, width): value = (self.b // (3**index)) % (3**width) output = f'Memory at {index} of width {width} accessed, value is 0x{value:02x}' if value < 127: output += f'="{chr(value)}"' print(output) return value def mem_set(self, index, width, value): if value >= 3**width: print(f'Trying to write {value} at {index} with only width {width}') raise Exception self.b = (self.b - ((self.b // (3**index)) % (3**width))*(3**index)) + value*(3**index) output = f'Memory at {index} of width {width} set to 0x{value:02x}' if value < 127: output += f'="{chr(value)}"' print(output)  and replace e.g. z=x.b//3**y[8]%3**7 with z = x.mem_get(y[8], 7). This change is implemented in magic4.py and runme4.py. The output has very many lines. Perhaps we should reduce this a bit to see something. The case distinction being made in the loop depends on a . The first three lines at the top of the loop are as follows. python z = x.mem_get(y[8], 7) y[8]+=7 a=z//3**4  Thus a is given by the lowest 4 trits of z, which is given by 7 trits of memory indexed by y[8]. Furthermore, y[8] is increased by 7. This suggests that z encodes a instruction and y[8] is the instruction pointer. We are thus not really interested in memory access at the instruction pointer, so lets us make the printout of memory access optional and let us filter out these accesses. Furthermore, let us filter out memory access before input is first taken from the user. This is done in magic5.py and runme5.py. Going through the output, we can actually see characters f, l, a, g, { occurring in that order, the first one being the following line.  Memory at 6333 of width 9 accessed, value is 0x66="f"  Thus it looks like we might be able to extract the flag from memory directly. So let us now print out memory starting at 6333 with width 9. We add the following function to class X: python def print_memory_string(self, index, width): while True: value = self.mem_get(index, width, debug=False) if value < 127: print(chr(value), end='') else: break index += width print()  and call this with x.print_memory_string(6333, 9). This is implemented in magic6.py and runme6.py. If we directly call this, we unfortunately do not obtain the flag, initially the value at address 6333 is 0x1411 rather than 0x66. So perhaps the flag is first decrypted in memory. To circumvent this problem we print the memory *after* having run the program. In order to do this we need to make one more change: when a == 0, execution of the emulated program halts, by calling os._exit(0). We change this to break to end the loop instead. With these changes we now obtain the following output: $ ./runme6.py
.
:.
:.
.:' ,::
.:' ;:'
:: ;:'
: .:'
. :.
_________________________
: _ _ _ _ _ _ _ _ _ _ _ _ :
,---:".".".".".".".".".".".".":
: ,'"::.:.:.:.:.:.:.:.:.:.:.::'
.. :-===-===-===-===-===-:'
.-._: :
-.__. ,'
,--------"-------------'--------.
"--.__ __.--"'
""-------------""'


The correct flag is indeed flag{it's, it's a device Morty!}.