
MELANCHOLY828/mixvoxels

 
 


MixVoxels: Mixed Neural Voxels for Fast Multi-view Video Synthesis

PyTorch implementation of the paper: Mixed Neural Voxels for Fast Multi-view Video Synthesis.


We present MixVoxels, a representation for dynamic scenes that trains quickly and renders with competitive quality. MixVoxels represents a 4D dynamic scene as a mixture of static and dynamic voxels and processes the two kinds with different networks. The computation for static voxels can therefore be handled by a lightweight model, which substantially reduces the overall amount of computation, especially for everyday dynamic scenes dominated by a static background. As a result, with 15 minutes of training on 300-frame input videos, MixVoxels achieves better PSNR than previous methods.
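The static/dynamic split described above can be illustrated with a toy NumPy sketch. It is not the repository's model: the function name `mixed_voxel_outputs`, the two linear "heads", and all shapes are assumptions chosen only to show the routing idea, where static voxels are evaluated once and reused for every frame while dynamic voxels get a per-frame output.

```python
import numpy as np

def mixed_voxel_outputs(features, is_dynamic, w_static, w_dynamic, n_frames):
    """Route voxels to a cheap static head or a heavier dynamic head.

    features:   (N, C) per-voxel features (hypothetical)
    is_dynamic: (N,) boolean mask, True for dynamic voxels
    w_static:   (C, 3) time-invariant head (stand-in for a lightweight model)
    w_dynamic:  (C, n_frames * 3) time-varying head (stand-in for a larger model)
    returns:    (N, n_frames, 3) colour per voxel per frame
    """
    n = features.shape[0]
    out = np.empty((n, n_frames, 3), dtype=features.dtype)

    # Static voxels: one evaluation, broadcast across all frames.
    static_rgb = features[~is_dynamic] @ w_static
    out[~is_dynamic] = static_rgb[:, None, :]

    # Dynamic voxels: a separate output for every frame.
    dynamic_rgb = features[is_dynamic] @ w_dynamic
    out[is_dynamic] = dynamic_rgb.reshape(-1, n_frames, 3)
    return out

rng = np.random.default_rng(0)
n_voxels, n_channels, n_frames = 5, 4, 6
features = rng.standard_normal((n_voxels, n_channels))
is_dynamic = np.array([False, False, True, False, True])
w_static = rng.standard_normal((n_channels, 3))
w_dynamic = rng.standard_normal((n_channels, n_frames * 3))
out = mixed_voxel_outputs(features, is_dynamic, w_static, w_dynamic, n_frames)
```

In this sketch the static head costs one matrix product per voxel regardless of the number of frames, which is the kind of saving the paragraph attributes to scenes with a mostly static background.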

Installation

Install environment:

conda create -n mixvoxels python=3.8
conda activate mixvoxels
pip install torch torchvision
pip install tqdm scikit-image opencv-python configargparse lpips imageio-ffmpeg kornia tensorboard pyfvvdp

Dataset

  1. Download the Plenoptic Video Dataset
  2. Unzip to your directory DATADIR and run the following command:
python tools/prepare_video.py ${DATADIR}

Training

To train a dynamic scene, run one of the following commands. To train a different scene, set DATA to that scene's name:

DATA=coffee_martini # [coffee_martini|cut_roasted_beef|cook_spinach|flame_salmon|flame_steak|sear_steak]
# MixVoxels-T
python train.py --config configs/schedule5000/${DATA}_5000.txt --render_path 0
# MixVoxels-S
python train.py --config configs/schedule7500/${DATA}_7500.txt --render_path 0
# MixVoxels-M
python train.py --config configs/schedule12500/${DATA}_12500.txt --render_path 0
# MixVoxels-L
python train.py --config configs/schedule25000/${DATA}_25000.txt --render_path 0
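The naming pattern of the config files used by these commands can be captured in a small helper. This is only a convenience sketch: `SCHEDULES`, `SCENES`, and `train_command` are hypothetical names, and the mapping is read off the four commands listed here rather than taken from the repository's code.

```python
# Iteration count for each model size, as it appears in the config paths.
SCHEDULES = {"T": 5000, "S": 7500, "M": 12500, "L": 25000}

# Scene names accepted for DATA.
SCENES = (
    "coffee_martini", "cut_roasted_beef", "cook_spinach",
    "flame_salmon", "flame_steak", "sear_steak",
)

def train_command(scene, variant, render_path=0):
    """Build the argument list for training MixVoxels-<variant> on a scene."""
    if scene not in SCENES:
        raise ValueError(f"unknown scene: {scene}")
    if variant not in SCHEDULES:
        raise ValueError(f"unknown variant: {variant}")
    iters = SCHEDULES[variant]
    config = f"configs/schedule{iters}/{scene}_{iters}.txt"
    return ["python", "train.py", "--config", config,
            "--render_path", str(render_path)]

cmd = train_command("coffee_martini", "T")
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch training.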

Please note that the first time you run a training command, it pre-processes the dataset before training. This includes downsampling the frames by a factor of 2 (to the standard 1K resolution), computing the standard deviation (std) of each video, and saving the results to disk. The pre-processing takes about 2 hours but is only required on the first run. Once it finishes, the command automatically proceeds to train the scene.
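The two pre-processing steps can be sketched in NumPy as follows. This is a rough illustration under assumptions: the repository's actual resizing filter is not stated here, so 2x2 block averaging is used as a stand-in, and "std of each video" is interpreted as a per-pixel standard deviation over time. The function names are hypothetical.

```python
import numpy as np

def downsample_by_2(frames):
    """Halve the spatial resolution by averaging each 2x2 block.

    frames: (T, H, W, C) float array; odd trailing rows/columns are cropped.
    """
    t, h, w, c = frames.shape
    h2, w2 = h // 2, w // 2
    cropped = frames[:, :h2 * 2, :w2 * 2]
    return cropped.reshape(t, h2, 2, w2, 2, c).mean(axis=(2, 4))

def temporal_std(frames):
    """Per-pixel standard deviation over time, averaged over channels -> (H, W)."""
    return frames.std(axis=0).mean(axis=-1)

# Toy video: 4 frames of 4x6 pixels, static except for the top-left 2x2 block.
frames = np.zeros((4, 4, 6, 3))
frames[:, 0:2, 0:2, :] = np.arange(4.0)[:, None, None, None]

small = downsample_by_2(frames)
std_map = temporal_std(small)
```

In the toy video only the top-left pixel of `std_map` is non-zero, so a map like this separates pixels that change over time from a static background.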

We provide trained models:

| scene | PSNR: MixVoxels-T (15min) | PSNR: MixVoxels-M (40min) | download: MixVoxels-T (15min) | download: MixVoxels-M (40min) |
|---|---|---|---|---|
| coffee-martini | 28.1339 | 29.0186 | link | link |
| flame-salmon | 28.7982 | 29.2620 | link | link |
| cook-spinach | 31.4499 | 31.6433 | link | link |
| cut-roasted-beef | 32.4078 | 32.2800 | link | link |
| flame-steak | 31.6508 | 31.3052 | link | link |
| sear-steak | 31.8203 | 31.2136 | link | link |
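For reference, PSNR is defined as 10·log10(MAX² / MSE), with values in dB. The sketch below computes it for a single image pair with pixel values in [0, 1]. How the repository aggregates PSNR over frames and views to obtain the numbers above is not stated here, so this is the standard definition only, not a reproduction of the evaluation script.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    mse = np.mean((np.asarray(pred) - np.asarray(target)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A uniform error of 0.1 gives MSE = 0.01, i.e. 20 dB.
value = psnr(np.zeros((2, 2)), np.full((2, 2), 0.1))
```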

Rendering and Generating Spirals

The following command generates 120 novel-view videos. Alternatively, set render_path to 1 in the training command above.

python train.py --config your_config --render_only 1 --render_path 1 --ckpt log/your_config/your_config.ckpt

Generating spirals:

python tools/make_spiral.py --video_path log/your_config/img_path_all/ --target log/your_config/spirals --target_video log/your_config/spirals.mp4
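The paths in the two commands above follow one layout under `log/`. A small helper can make that layout explicit. It assumes that `your_config` stands for the experiment name used as the log directory; the helper names `render_artifacts` and `spiral_command` are hypothetical, while the flags are the ones shown in the commands above.

```python
def render_artifacts(expname, log_root="log"):
    """Collect the paths used by the rendering and spiral commands."""
    base = f"{log_root}/{expname}"
    return {
        "ckpt": f"{base}/{expname}.ckpt",
        "frames": f"{base}/img_path_all/",
        "spirals": f"{base}/spirals",
        "spiral_video": f"{base}/spirals.mp4",
    }

def spiral_command(expname):
    """Build the argument list for tools/make_spiral.py."""
    paths = render_artifacts(expname)
    return ["python", "tools/make_spiral.py",
            "--video_path", paths["frames"],
            "--target", paths["spirals"],
            "--target_video", paths["spiral_video"]]

paths = render_artifacts("your_config")
cmd = spiral_command("your_config")
```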

Acknowledgements

The code is based on TensoRF. Many thanks to the authors.
