A reimplementation of mip-NeRF in PyTorch.
Not exactly 1-to-1 with the official repo: we organized the code to our own liking (mostly in how the datasets are structured, plus hyperparameter changes so the code runs on a consumer-level graphics card), made it more modular, and removed some repetitive code, but it achieves the same results.
- Can use spherical or spiral poses to generate videos for all 3 datasets
  - Spherical: `video.mp4`
  - Spiral: `spiral.mp4`
- Depth and normals video renderings
  - Depth: `depth.mp4`
  - Normals: `normals.mp4`
- Can extract meshes: `mesh_.mp4`, `mesh.mp4`
In the future we plan on implementing/changing:
- Factoring out more repetitive/redundant code, optimizing GPU memory usage and rays-per-second throughput
- Clean up and expand mesh extraction code
- Zoomed poses for multicam dataset
- Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields support
- NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis support
- Create a conda environment using `mipNeRF.yml`
- Get the training data
  - Run `bash scripts/download_data.sh` to download all 3 datasets: LLFF, Blender, and Multicam
  - Or run the bash script corresponding to a single dataset:
    - `bash scripts/download_llff.sh` to download LLFF
    - `bash scripts/download_blender.sh` to download Blender
    - `bash scripts/download_multicam.sh` to download Multicam (note this will also download the Blender dataset since it's derived from it)
- Optionally change config parameters: you can change the default parameters in `config.py` or specify them with command line arguments (a hypothetical example follows this list)
  - The default config is set up to run on a high-end consumer-level graphics card (~8-12GB)
- Run `python train.py` to train
  - Run `python -m tensorboard.main --logdir=log` to start the tensorboard
- Run `python visualize.py` to render a video from the trained model
- Run `python extract_mesh.py` to extract a mesh from the trained model
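To give a feel for how configuration works, here is a minimal sketch of config-style defaults with command-line overrides. The parameter names and values below are hypothetical illustrations, not the actual options in `config.py`; check that file for the real names and defaults.

```python
# Hypothetical sketch only -- the real parameter names and defaults live in config.py.
from argparse import ArgumentParser

def get_config():
    parser = ArgumentParser()
    # Dataset / logging (names are illustrative)
    parser.add_argument("--dataset_name", type=str, default="blender")
    parser.add_argument("--log_dir", type=str, default="log")
    # Optimization, roughly in the spirit of the mip-NeRF paper
    parser.add_argument("--lr_init", type=float, default=5e-4)
    parser.add_argument("--lr_final", type=float, default=5e-6)
    parser.add_argument("--max_steps", type=int, default=200_000)
    # Lowering the ray batch size is the main lever for fitting an ~8-12GB GPU
    parser.add_argument("--batch_size", type=int, default=2048)
    return parser.parse_args()

# e.g. `python train.py --batch_size 1024` would override a default (flag name hypothetical)
```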
I explain the specifics of the code in more detail here, but here is a basic rundown.
- `config.py`: Specifies hyperparameters.
- `datasets.py`: Base generic `Dataset` class + 3 default dataset implementations.
  - `NeRFDataset`: Base class that all datasets should inherit from.
  - `Multicam`: Used for multicam data as in the original mip-NeRF paper.
  - `Blender`: Used for the synthetic dataset as in the original NeRF.
  - `LLFF`: Used for the LLFF dataset as in the original NeRF.
- `loss.py`: mip-NeRF loss, pretty much just MSE, but also calculates PSNR (see the sketch after this list).
- `model.py`: mip-NeRF model, not as modular as the way the original authors wrote it, but easier to understand its structure when laid out verbatim like this.
- `pose_utils.py`: Various functions used to generate poses.
- `ray_utils.py`: Various functions involving the rays that the model uses as input; most are used within the forward function of the model (see the integrated positional encoding sketch after this list).
- `scheduler.py`: mip-NeRF learning rate scheduler (see the sketch after this list).
- `train.py`: Trains a mip-NeRF model.
- `visualize.py`: Creates the videos using a trained mip-NeRF.
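The piece of `ray_utils.py`/`model.py` that most distinguishes mip-NeRF from NeRF is the integrated positional encoding (IPE): each ray segment is modeled as a Gaussian over a conical frustum, and the encoding is the expectation of the usual sin/cos features under that Gaussian (Eq. 14 of the mip-NeRF paper). Below is a minimal sketch, assuming you already have each frustum's mean and diagonal covariance; the function name and signature are mine, not necessarily the repo's.

```python
import torch

def integrated_pos_enc(mean, diag_cov, min_deg=0, max_deg=16):
    """Expected sin/cos features of a Gaussian with the given mean and diagonal covariance.

    mean, diag_cov: [..., 3] tensors describing each conical-frustum Gaussian.
    Returns [..., 2 * 3 * (max_deg - min_deg)] features: higher frequencies are
    attenuated according to the variance, which is what gives mip-NeRF its anti-aliasing.
    """
    scales = 2.0 ** torch.arange(min_deg, max_deg, dtype=mean.dtype, device=mean.device)  # [L]
    scaled_mean = mean[..., None, :] * scales[:, None]                                    # [..., L, 3]
    scaled_var = diag_cov[..., None, :] * scales[:, None] ** 2                            # [..., L, 3]
    scaled_mean = scaled_mean.reshape(*mean.shape[:-1], -1)                               # [..., L*3]
    scaled_var = scaled_var.reshape(*diag_cov.shape[:-1], -1)
    # E[sin(x)] = sin(mu) * exp(-0.5 * var), and likewise for cos.
    damping = torch.exp(-0.5 * scaled_var)
    return torch.cat([torch.sin(scaled_mean) * damping,
                      torch.cos(scaled_mean) * damping], dim=-1)
```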
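Similarly, here is a rough sketch of what `loss.py` and `scheduler.py` compute, assuming the standard mip-NeRF recipe: a coarse MSE term down-weighted (0.1 in the paper) plus a fine MSE term, PSNR derived from the MSE, and a log-linear learning-rate decay with a short warmup. Names, defaults, and the warmup shape are illustrative; the repo's exact implementation may differ.

```python
import math
import torch

def mipnerf_loss(rgb_coarse, rgb_fine, target, coarse_weight=0.1):
    """Weighted coarse + fine MSE; also returns both PSNRs (the results below average them)."""
    mse_coarse = torch.mean((rgb_coarse - target) ** 2)
    mse_fine = torch.mean((rgb_fine - target) ** 2)
    loss = coarse_weight * mse_coarse + mse_fine
    psnr_coarse = -10.0 * torch.log10(mse_coarse)
    psnr_fine = -10.0 * torch.log10(mse_fine)
    return loss, psnr_coarse, psnr_fine

def lr_at_step(step, max_steps, lr_init=5e-4, lr_final=5e-6, warmup_steps=2500):
    """Log-linear interpolation from lr_init to lr_final, with a simple linear warmup."""
    t = min(max(step / max_steps, 0.0), 1.0)
    lr = math.exp((1.0 - t) * math.log(lr_init) + t * math.log(lr_final))
    if warmup_steps > 0:
        lr *= min(step / warmup_steps, 1.0)
    return lr
```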
Here's a summary of how NeRF and mip-NeRF work that I wrote while originally writing this.
All PSNRs are average PSNR (coarse + fine).
- Video: `video_.mp4`
- Depth: `depth_.mp4`
- Normals: `normals_.mp4`

- Video: `video.mp4`
- Depth: `depth.mp4`
- Normals: `normals.mp4`

- Video: `video_.mp4`
- Depth: `depth_.mp4`
- Normals: `normals.mp4`
- Thanks to Nina for helping with the code
- Original NeRF Code in Tensorflow
- NeRF Project Page
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Original mip-NeRF Code in JAX
- mip-NeRF Project Page
- Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
- nerf_pl