Tags: quine
# Challenge 15: Self-Replicating Toy
This challenge requires writing a program in the given language that prints its own source code. Such a program is commonly known as a quine.
## Assemblium
Assemblium uses two stacks, a data stack and a code stack. When an instruction is popped off the top of the code stack, it is executed as follows:
Instruction | Result
------------|-------
``0x00``-``0x7f`` | Pushes instruction to data stack
``0x80`` | XOR top value in data stack with ``0x80``
``0x81`` | If top value on data stack is ``0x00``, replace with ``0xff``, else with ``0x00``
``0x82`` | Replace top 2 values on data stack with bitwise AND
``0x83`` | Replace top 2 values on data stack with bitwise OR
``0x84`` | Replace top 2 values on data stack with bitwise XOR
``0x90`` | Swaps top 2 values on data stack
``0x91`` | Duplicates top value on data stack
``0xa0`` | Pops top value on data stack as index, pops from data stack until ``0xa1`` and assigns to indexed function
``0xb0`` | Pops top value on data stack and outputs
``0xc0``-``0xdf`` | Pushes specified function to code stack
``0xe0``-``0xff`` | Pops top value on data stack, pushes specified function to code stack if not ``0x00``
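
To experiment with the instruction table above, a small interpreter can be sketched in Python. This is not the challenge's official implementation. It assumes that the program is loaded with its first byte on top of the code stack, that the ``0xa1`` marker is discarded when a function is defined, that the first byte popped during a definition is the first one executed when the function is called, and that unlisted opcodes and undefined functions do nothing. The name `run` and the `max_steps` guard are hypothetical.

```python
def run(program: bytes, max_steps: int = 100_000) -> bytes:
    """Run an Assemblium program and return the bytes it outputs."""
    # The end of each list is the top of the stack (assumption: the first
    # program byte starts on top of the code stack).
    code = list(reversed(program))
    data = []
    funcs = {}
    out = bytearray()
    steps = 0
    while code and steps < max_steps:
        steps += 1
        op = code.pop()
        if op <= 0x7F:                      # push the instruction itself
            data.append(op)
        elif op == 0x80:                    # XOR top value with 0x80
            data[-1] ^= 0x80
        elif op == 0x81:                    # 0x00 -> 0xff, anything else -> 0x00
            data[-1] = 0xFF if data[-1] == 0x00 else 0x00
        elif op in (0x82, 0x83, 0x84):      # bitwise AND / OR / XOR
            a, b = data.pop(), data.pop()
            data.append({0x82: a & b, 0x83: a | b, 0x84: a ^ b}[op])
        elif op == 0x90:                    # swap top two values
            data[-1], data[-2] = data[-2], data[-1]
        elif op == 0x91:                    # duplicate top value
            data.append(data[-1])
        elif op == 0xA0:                    # define a function
            index = data.pop()
            body = []
            while (value := data.pop()) != 0xA1:
                body.append(value)          # first popped = first executed (assumption)
            funcs[index] = body
        elif op == 0xB0:                    # output top value
            out.append(data.pop())
        elif 0xC0 <= op <= 0xDF:            # unconditional call
            code.extend(reversed(funcs.get(op - 0xC0, [])))
        elif 0xE0 <= op <= 0xFF:            # call if popped value is not 0x00
            if data.pop() != 0x00:
                code.extend(reversed(funcs.get(op - 0xE0, [])))
        # Any other opcode is ignored (assumption).
    return bytes(out)


# Define function 0 as "61 b0" (print "a"), then call it.
print(run(bytes.fromhex("21 80 30 80 61 00 a0 c0")))
```

The step limit is only there so that a runaway recursive function stops eventually instead of looping forever.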
## Writing a quine
A common approach to writing a quine is to have the program declare a string containing the rest of the program. The rest of the program then writes everything that comes before the string, prints the string itself as a literal, and finally prints the string's contents as code (since the string contains that code).
An example in Python would be as follows:
```
s = "print('s = \"' + s + '\"')\nprint(s)"
print('s = \"' + s + '\"')
print(s)
```
This has a problem though, in that it creates a situation where quotation marks need to be escaped, but any escape characters also need to be escaped, et cetera.
To solve this, we encode the string (here with Base64) and only decode it when we print it as code:
```
from base64 import b64decode
s = b"cHJpbnQoImZyb20gYmFzZTY0IGltcG9ydCBiNjRkZWNvZGUiKQpwcmludCgicyA9IGJcIiIgKyBzLmRlY29kZSgidXRmLTgiKSArICJcIiIpCnByaW50KGI2NGRlY29kZShzKS5kZWNvZGUoInV0Zi04Iikp"
print("from base64 import b64decode")
print("s = b\"" + s.decode("utf-8") + "\"")
print(b64decode(s).decode("utf-8"))
```
While not the shortest quine ever, this approach does work.
## Writing a quine in Assemblium
From here on, code is written as hex instruction bytes; anything after a hash (``#``) is a comment.
### Functions
First of all, we need to be able to define functions. The problem is that all code assigned to a function must come from the data stack, but only values smaller than ``0x80`` can be pushed onto the data stack directly. To get around this, we push each such instruction XORed with ``0x80`` and, as soon as it is on the data stack, XOR it back using the ``0x80`` instruction.
A function definition might look like this:
```
# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 61 b0 (prints a)
30 80 61
# define as function 0
00 a0
```
It can then be called with:
```
c0
```
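
The definition pattern above is mechanical, so it can be generated. The following Python sketch builds the bytes that define a function from the code it should run. The helper name `define_function` is hypothetical and not part of the challenge; it only reproduces the steps described above (marker, reversed body with high bytes XORed, index, ``a0``).

```python
def define_function(index: int, body: bytes) -> bytes:
    """Return the Assemblium bytes that define function `index` as `body`."""
    out = bytearray([0x21, 0x80])            # 0x21 ^ 0x80 = 0xa1, the end marker
    for op in reversed(body):                # body goes on the data stack in reverse
        if op >= 0x80:
            out += bytes([op ^ 0x80, 0x80])  # push the low form, then XOR it back
        else:
            out.append(op)                   # small values can be pushed directly
    out += bytes([index, 0xA0])              # push the index, then define
    return bytes(out)


# The "prints a" function from the text: code to run is 61 b0.
print(define_function(0, bytes.fromhex("61 b0")).hex(" "))
# → 21 80 30 80 61 00 a0
```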
### Strings
Our code uses the following structure. Function ``00`` holds the string data (all bytes less than ``0x80``). Calling it copies that data onto the code stack, from where it is pushed, reversed, onto the data stack to be printed by other functions. First, we need a function that prints the string data as-is.
The function is recursive, calling itself until all the data is printed. To know where to stop, we need an end marker. Since the string may contain null bytes, we use ``0x7e`` instead.
The function, ``01``, is as follows:
```
# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: b0 91 70 0e 83 84 e1 (print string data until 0x7e)
61 80 04 80 03 80 0e 70 11 80 30 80
# define as function 1
01 a0
```
In pseudocode:
```
define function 01:
    print top byte on data stack
    duplicate next byte
    put 0x7e on data stack by OR'ing 70 and 0e to avoid an unintentional special character
    XOR duplicated byte with 0x7e
    if not equal (result is nonzero), run function 01
```
### Executable data
Now, since our program will contain executable data (``0x80`` or larger), we need some way to transform that into string data to be stored in a function, and to transform it back to the original bytes on the data stack to be printed. We do this by XORing each such byte with ``0x80``. Since we can't store the ``0x80`` instruction itself in the string, we store ``0x7f`` instead, signifying that the next byte needs to be XORed with ``0x80`` before being printed as code.
This means we need a second string-printing function that prints the same string but XORs the marked bytes back, producing the actual Assemblium code.
This function, ``02``, is as follows:
```
# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 91 70 0f 83 84 81 e3 b0 91 70 0e 83 84 e2 (print unencoded data using 0x7f until 0x7e)
62 80 04 80 03 80 0e 70 11 80 30 80 63 80 01 80 04 80 03 80 70 0f 11 80
# define as function 2
02 a0
```
This also depends on another function ``03``:
```
# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 91 84 e3 80 (drop the 0x7f marker, then XOR next byte on data stack with 0x80)
00 80 63 80 04 80 11 80
# define as function 3
03 a0
```
In pseudocode:
```
define function 02:
    duplicate next byte
    put 0x7f on data stack by OR'ing 70 and 0f to avoid an unintentional special character
    XOR duplicated byte with 0x7f
    if equal, run function 03 (drop the 0x7f marker and XOR the next byte on data stack with 0x80)
    print top byte on data stack
    duplicate next byte
    put 0x7e on data stack by OR'ing 70 and 0e to avoid an unintentional special character
    XOR duplicated byte with 0x7e
    if not equal (result is nonzero), run function 02
```
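
What functions ``02`` and ``03`` compute together can be modelled in a few lines of Python. This is only a high-level sketch of the decoding rule described above (``0x7f`` marks a byte to XOR with ``0x80``, ``0x7e`` ends the string), not a translation of the stack code; the name `decode` is hypothetical.

```python
def decode(string_data: bytes) -> bytes:
    """Turn escaped string data back into the real code bytes."""
    out = bytearray()
    it = iter(string_data)
    for b in it:
        if b == 0x7E:              # end marker: stop printing
            break
        if b == 0x7F:              # escape marker: the next byte is a code byte
            b = next(it) ^ 0x80
        out.append(b)
    return bytes(out)


# "00 7f 20 21 7f 00" followed by the end marker decodes to "00 a0 21 80".
print(decode(bytes.fromhex("00 7f 20 21 7f 00 7e")).hex(" "))
# → 00 a0 21 80
```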
### Putting it all together
Finally, we will be running this code:
```
# print header (putting a1 on data stack for string function)
21 b0 00 80 b0
# print string data by putting 0x7e on data stack, putting string on data stack and running function 01
70 0e 83
c0 c1
# print unencoded real data by putting 0x7e on data stack, putting string on data stack and running function 02
70 0e 83
c0 c2
```
Thus, our final code is as follows:
```
#00: copy string data to code stack
# double reversed string data, replace 0x8_ with 0x7f 0x0_
# include all code after string line
21 80
# !!! INSERT PROCESSED CODE STRING FOLLOWING THIS LINE HERE !!!
00 a0
#01: print normal string data until 0x7e
# b0 91 70 0e 83 84 e1
21 80
61 80 04 80 03 80 0e 70 11 80 30 80
01 a0
#02: print real data using 0x7f until 0x7e
# 91 70 0f 83 84 81 e3 b0 91 70 0e 83 84 e2
21 80
62 80 04 80 03 80 0e 70 11 80 30 80 63 80 01 80 04 80 03 80 70 0f 11 80
02 a0
#03: drop the 0x7f marker, then XOR next byte on data stack with 0x80
# 91 84 e3 80
21 80
00 80 63 80 04 80 11 80
03 a0
# print header
21 b0 00 80 b0
# print string data
70 0e 83
c0 c1
# print real code
70 0e 83
c0 c2
```
Now, we simply take all the bytes following the string data line, replace every code byte (``0x80`` or larger) with ``0x7f`` followed by that byte XORed with ``0x80``, and put the result on the string data line.
String data:
```
00 7f 20 21 7f 00 61 7f 00 04 7f 00 03 7f 00 0e 70 11 7f 00 30 7f 00 01 7f 20 21 7f 00 62 7f 00 04 7f 00 03 7f 00 0e 70 11 7f 00 30 7f 00 63 7f 00 01 7f 00 04 7f 00 03 7f 00 70 0f 11 7f 00 02 7f 20 21 7f 00 00 7f 00 63 7f 00 04 7f 00 11 7f 00 03 7f 20 21 7f 30 00 7f 00 7f 30 70 0e 7f 03 7f 40 7f 41 70 0e 7f 03 7f 40 7f 42
```
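
The string data above does not have to be produced by hand. A minimal Python sketch of the encoding step, assuming the code bytes after the string line are available as a `bytes` object (the name `encode` is hypothetical):

```python
def encode(code: bytes) -> bytes:
    """Escape code bytes so that every resulting byte is below 0x80."""
    out = bytearray()
    for b in code:
        if b >= 0x80:
            out += bytes([0x7F, b ^ 0x80])   # marker, then the byte XORed with 0x80
        else:
            out.append(b)                    # small bytes are stored unchanged
    return bytes(out)


# First bytes after the string line: 00 a0 21 80 61 80
print(encode(bytes.fromhex("00 a0 21 80 61 80")).hex(" "))
# → 00 7f 20 21 7f 00 61 7f 00
```

This only works because the code never contains a literal ``0x7e`` or ``0x7f``, which is why both values are built at runtime from ``70`` and ``0e`` or ``0f``.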
And that is it! A successful quine in Assemblium.