Tags: ml
Rating: 4.5
We are given a .ckpt file, a common format for trained model checkpoints. Let's try to load it with torch and inspect its contents:
```python
>>> import torch
>>> model = torch.load('step-000029999.ckpt', map_location=torch.device('cpu'))
>>> print(model)
{'step': 29999, 'pipeline': OrderedDict([('datamanager.train_camera_optimizer.pose_adjustment', tensor([[-7.7659e-04, -3.3166e-04, -8.3249e-04,  4.0183e-04, -1.1744e-04,
...
...
```
So, we have a dictionary with the keys `step`, `pipeline`, `optimizers`, and `scalers`.
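To map out the rest of the checkpoint without dumping full tensors to the terminal, a small recursive helper works well. This is a sketch of my own; `summarize` is not part of torch or nerfstudio:

```python
def summarize(obj, prefix=""):
    """Recursively list every key in a checkpoint-style nested dict,
    showing tensor shapes where available."""
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines += summarize(value, f"{prefix}{key}.")
    elif hasattr(obj, "shape"):  # torch.Tensor, numpy array, ...
        lines.append(f"{prefix[:-1]}: shape {tuple(obj.shape)}")
    else:
        lines.append(f"{prefix[:-1]}: {type(obj).__name__}")
    return lines

# Usage (assumed file name from above):
#   ckpt = torch.load('step-000029999.ckpt', map_location='cpu')
#   for line in summarize(ckpt):
#       print(line)
```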
Looking up the first key in the pipeline (`datamanager.train_camera_optimizer.pose_adjustment`) on Google, we find a GitHub repo that talks about NeRFs.
After scouting that repo, we find that they use nerfstudio to run some experiments using NeRF. So our checkpoint must be from the nerfstudio trainer.
I've followed the installation instructions to install nerfstudio in my WSL2 environment and tested it by following their example in the docs:
```shell
ns-download-data dnerf
ns-train nerfacto --data data/dnerf/lego dnerfdata
```
After confirming that the training works, I added the `--steps-per-save 1` parameter so it saves the model immediately after starting training.
This way, we get a `config.yml` and a model checkpoint in the `outputs` folder.
Now I delete the model checkpoint we've just created, and move over the checkpoint from the challenge.
Trying to run `ns-viewer --load-config {outputs/.../config.yml}` to load our implanted checkpoint fails with an error similar to the one in that GitHub repo we found earlier. After some more reading in that repo, we find that the shapes of the model depend on the dataset size.
Comparing the shapes in the error against those of the `lego` dataset I used as a dummy, I find the correct dataset size for our model: 224 train images and 24 validation images.
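If the camera optimizer stores one 6-DoF pose correction per training camera (which is my reading of nerfstudio's internals, so treat it as an assumption), the train image count can even be read straight off the checkpoint, without provoking the shape-mismatch error at all:

```python
def train_image_count(ckpt):
    """Infer the number of train images from a loaded checkpoint dict.

    Assumption: nerfstudio's camera optimizer keeps one 6-DoF pose
    correction per training camera, i.e. a (num_train_images, 6) tensor.
    """
    pose = ckpt["pipeline"]["datamanager.train_camera_optimizer.pose_adjustment"]
    return pose.shape[0]

# Usage (assumed):
#   ckpt = torch.load('step-000029999.ckpt', map_location='cpu')
#   train_image_count(ckpt)
```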
Then I went down a hacky path: patching the library with hardcoded numbers. That proved harder than I thought, so in the end I just decided to make a dummy dataset that matched the size of the original dataset the model was trained on.
I made 224 image copies in the `train` folder and 24 in the `val` folder. Next, I wrote a simple Python script that renamed those images and added them to the `transforms_train.json` and `transforms_val.json` files with some dummy camera positions:
```python
from pathlib import Path
import os
import json

# Rename the copied images to the r_<index>.png naming scheme the dataset uses.
for i, image_path in enumerate(sorted(Path('./train').glob('*.png'))):
    os.rename(image_path, image_path.parent / f'r_{i}.png')
for i, image_path in enumerate(sorted(Path('./val').glob('*.png'))):
    os.rename(image_path, image_path.parent / f'r_{i}.png')

train = json.load(open('transforms_train.json'))
val = json.load(open('transforms_val.json'))

# Reuse the first existing frame as a template for the dummy camera poses.
train_frame = train["frames"][0]
val_frame = val["frames"][0]
train["frames"] = []
val["frames"] = []

for i in range(224):
    train_frame['file_path'] = f'./train/r_{i}'
    train["frames"].append(train_frame.copy())
for i in range(24):
    val_frame['file_path'] = f'./val/r_{i}'
    val["frames"].append(val_frame.copy())

json.dump(train, open('transforms_train.json', 'w'))
json.dump(val, open('transforms_val.json', 'w'))
```
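Before rerunning the viewer, a quick sanity check confirms the forged transforms files reference the expected number of distinct frames. This helper is my own addition, not part of nerfstudio:

```python
import json

def check_transforms(path, expected):
    """Verify a transforms file lists `expected` frames with unique file paths."""
    with open(path) as f:
        frames = json.load(f)["frames"]
    paths = [frame["file_path"] for frame in frames]
    assert len(paths) == expected, f"{path}: got {len(paths)} frames"
    assert len(set(paths)) == expected, f"{path}: duplicate file_path entries"

# Usage:
#   check_transforms('transforms_train.json', 224)
#   check_transforms('transforms_val.json', 24)
```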
Now that we have the correct dataset size, we can run `ns-viewer --load-config {outputs/.../config.yml}` again, and this time it successfully loads the model. After navigating to the viewer and looking around, we find the flag:
NOTE: The flag shown in the image is the flag for the other challenge (Beheeyem's Password); the task author accidentally uploaded the wrong model file. The file was reuploaded shortly after.