This paper proposes a method for improving the performance of semantic segmentation models using an implicit representation called Plenoxels. The method uses Plenoxels as an auxiliary input, combining the implicit representation with RGB information to enhance model performance. Additionally, this work proposes an automated pipeline for generating novel synthetic views for tail-class sampling using the Plenoxels radiance field. The results show that the combination of rendered images and spherical harmonics effectively captures the information present in camera images, and that incorporating projected spherical harmonics can enhance the performance of semantic segmentation models over using color-only rendered images. However, generating novel views for training yielded no improvement, due to the segmentation model's small size and the drop in render quality when deviating from the original camera trajectory.
- Dataset link: [PeRFception-ScanNet](https://huggingface.co/datasets/YWjimmy/PeRFception-ScanNet)
- Install git-lfs on your system
- Enable git-lfs for your account:
```bash
git lfs install
```
- Clone the repo:
```bash
git clone https://huggingface.co/datasets/YWjimmy/PeRFception-ScanNet
```
- Run an LFS fetch to download the dataset:
```bash
git lfs fetch --all
```
Follow the steps described by the ScanNet authors (see the references below) to request access to ScanNet. The following parts of each scan are needed:
- `pose/` - directory containing the camera extrinsics needed to render PeRFception and the ground-truth annotations
- `intrinsics/` - directory containing the camera intrinsics needed for rendering
- `*_vh_clean_2.labels.ply` - the ground-truth annotated mesh, needed for rendering the annotations
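Once access is granted, the expected layout is roughly the sketch below (scene names are placeholders; only the pieces listed above are required):

```
ScanNet/
|- scans/
   |- scene0000_00/
      |- intrinsics/
      |- pose/
      |- scene0000_00_vh_clean_2.labels.ply
```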
- Clone this repository:
```bash
git clone git@github.com:m13ammed/Semantic-Segmentation-using-Implicit-Representation.git
```
CUDA 11.3 is needed for rendering PeRFception.
- Install Conda
- Create a conda virtual environment from the requirements.txt:
```bash
conda create --name semseg --file requirements.txt
```
- Activate the environment:
```bash
conda activate semseg
```
- Install Plenoxels:
```bash
pip install .
```
The next steps assume your folder structure is as follows:
```
|
|- Semantic-Segmentation-using-Implicit-Representation/
|- PeRFception-ScanNet/
|- ScanNet/
```
- Open the `configs/` directory.
- Cycle through the `.gin` files and update the following entries (a hedged example follows this list):
  - Modify `datadir` entries to the location of PeRFception.
  - Modify `scannet_dir` entries to the location of ScanNet. Do not forget the trailing `/scans/`.
  - Modify `gt_seg_root` and `output_segmentation` entries to the planned location where the annotation images will be stored.
  - Modify `pose_dir` entries to the planned location where the novel view segmentations will be stored.
  - Modify the `logs_dir` entries to the planned location where the model training logs will be stored.
  - Modify the `include` to the absolute location of the `base.gin` found in `configs/`.
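For orientation, an edited `.gin` file might contain entries along these lines. All paths here are placeholders, and the exact binding names should be kept exactly as they already appear in each file; only the values change:

```gin
# Placeholder paths -- adjust to your own directory layout.
include "/home/user/Semantic-Segmentation-using-Implicit-Representation/configs/base.gin"

datadir = "/home/user/PeRFception-ScanNet"
scannet_dir = "/home/user/ScanNet/scans/"
gt_seg_root = "/home/user/renders/annotations"
output_segmentation = "/home/user/renders/annotations"
pose_dir = "/home/user/renders/novel_poses"
logs_dir = "/home/user/logs"
```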
Bash files are heavily used to automate several common processes. They can be found in the base directory and in the `scripts/` directory.
After following Steps 6 and 7 above, run:
```bash
cd scripts/
chmod +x run.sh
./run.sh
```
This will render out the PeRFception dataset.
After following Steps 6 and 7 above, run:
```bash
cd scripts/
chmod +x render_seg.sh
./render_seg.sh
```
This will render out the ground-truth annotations from ScanNet.
This step assumes that you rendered out the annotations in the previous step.
- Load up `Export-GT-Stats` and set the threshold value to 25% of your segmentation output size.
- Run through the blocks until you save the output dictionary as a numpy array.

For your convenience, an analyzed dictionary is attached in `analysis/`.
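The thresholding step amounts to counting, per view, how many pixels each class covers. A minimal sketch of the idea is below; the file layout, image format, and dictionary structure are assumptions, not the notebook's actual code:

```python
import numpy as np
from pathlib import Path
from PIL import Image

seg_dir = Path("renders/annotations/scene0000_00")  # hypothetical annotation output path
height, width = 480, 640                            # hypothetical segmentation output size
threshold = 0.25 * height * width                   # 25% of the segmentation output size

stats = {}
for seg_path in sorted(seg_dir.glob("*.png")):      # assumes one single-channel label map per view
    labels = np.array(Image.open(seg_path))
    ids, counts = np.unique(labels, return_counts=True)
    for cls, count in zip(ids, counts):
        if count >= threshold:                      # the class covers enough of this view
            stats.setdefault(int(cls), []).append(seg_path.name)

np.save("analysis/class_stats.npy", stats)          # load later with allow_pickle=True
```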
- Open `generate_novel_views.py`, found in the base of the folder, and modify the variables to your directory structure (example values in the sketch after this list):
  - Modify the `analysis_dict_path` variable to the location of the analysis `.npy` from the previous step.
  - Modify `new_poses_dir` to the location where you want to save the generated camera extrinsics.
  - Modify the `least_4_classes` list to include the class IDs that you want to generate novel views for.
- Run:
```bash
python generate_novel_views.py
```
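The variable names above come from the script itself; every value below is a placeholder for your own layout and class choices:

```python
analysis_dict_path = "analysis/class_stats.npy"   # the analysis .npy from the previous step
new_poses_dir = "/home/user/renders/novel_poses"  # where the generated extrinsics are written
least_4_classes = [4, 7, 24, 33]                  # hypothetical tail-class IDs to generate views for
```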
After following the previous step, you now have a folder full of scenes, each containing a list of novel camera extrinsics. The next step is filtering out the redundant views.
- Open `clean_render_novel_views.py`, found in the base directory.
- Modify the `ginc` array to the absolute path of the ginc file (an example follows this list).
- Modify the corresponding `render_novel_seg.gin` in `configs/`, setting `pose_dir` to the folder you used in the last step and `output_segmentation` to the directory in which you plan to save the novel view segmentations.
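For reference, the `ginc` array might look like the line below; the path is a placeholder for your own checkout location, and the list form is an assumption:

```python
ginc = ["/home/user/Semantic-Segmentation-using-Implicit-Representation/configs/render_novel_seg.gin"]
```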
After finishing the last step you will have a folder with the filtered rendered annotations. You can simply loop through the generated annotations and remove the poses for which no annotations were rendered, as in the sketch below.
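A minimal sketch of that cleanup loop; the paths, the image format, and the "empty view" criterion (all pixels zero) are assumptions:

```python
import numpy as np
from pathlib import Path
from PIL import Image

novel_seg_dir = Path("renders/novel_annotations/scene0000_00")  # hypothetical output_segmentation
novel_pose_dir = Path("renders/novel_poses/scene0000_00")       # hypothetical pose_dir

for seg_path in sorted(novel_seg_dir.glob("*.png")):
    labels = np.array(Image.open(seg_path))
    if not labels.any():                       # nothing was rendered into this view
        seg_path.unlink()                      # drop the empty annotation...
        pose_path = novel_pose_dir / (seg_path.stem + ".txt")
        if pose_path.exists():
            pose_path.unlink()                 # ...and its corresponding pose
```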
- Please note that you need to modify the corresponding bash files to match your directory arrangement, as well as the corresponding ginc files.
- Please note that (+ Novel Views) assumes you have generated and rendered novel views and merged the ground truth and the rendered novel views into a new directory, for both PeRFception and the annotations.
- To run the model using Spherical Harmonics + RGB:
```bash
chmod +x sh.sh
./sh.sh
```
- To run the model using Spherical Harmonics + RGB (+ Novel Views):
```bash
chmod +x sh_novel.sh
./sh_novel.sh
```
- To run the model using RGB only:
```bash
chmod +x rgb.sh
./rgb.sh
```
- To run the model using RGB (+ Novel Views):
```bash
chmod +x rgb_novel.sh
./rgb_novel.sh
```
- Modify any of the training ginc files with the full experiment name.
- A checkpoint for testing is provided in `analysis/sh_model/`:
```bash
python run_enet.py --mode Test --load_ckpt 420.ckpt --ginc configs/enet_exp/test.gin
```
- Results will be logged to TensorBoard.
To help with debugging, the team created some Blender scripts that use BPY to quickly visualize, debug, and export camera extrinsics.
- Navigate to the `blender_scripts/` directory.
- Open Blender and import the mesh `.ply` of a scene.
- Import `ReadPose.py`.
- Modify `intrinsics_dir` to point to an `intrinsic_color.txt`.
- Modify `pose_dir` to point to the `pose` folder of the scene.
- To view all the poses in the folder:
```python
view_all_poses(read_k, pose_dir)
```
- To revert the process and export from Blender a camera extrinsic that follows the ScanNet conventions, create a camera, then use the following to get the pose and save it as a txt file with numpy:
```python
import numpy as np  # the output filename below is a placeholder

ext_cam = bpy.data.objects['new_extrinsics']
pose = extract_gt_extrinsics(ext_cam)
np.savetxt('new_extrinsics.txt', pose)
```
- To generate a dome around a target point (a standalone sketch of the idea follows this list):
```python
create_dome(INITIAL_CAMERA_EXTRINSIC, MAX_ANGLE, TARGET_PT_IN_3D, INTERVALS, INTRINSICS_NP)
```
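As a standalone illustration of the dome idea, the numpy sketch below samples camera positions on an arc around a 3D target point, all looking at the target, and writes 4x4 extrinsics as `.txt` files. It is not the repo's `create_dome`: the function names, the single rotation axis, and the +z viewing convention are all simplifying assumptions.

```python
import numpy as np

def look_at(cam_pos, target, up=np.array([0.0, 0.0, 1.0])):
    """Build a 4x4 camera-to-world matrix looking from cam_pos toward target.

    Assumes +z is the viewing direction; degenerate when the view direction
    is parallel to `up`.
    """
    forward = target - cam_pos
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    c2w = np.eye(4)
    c2w[:3, 0] = right
    c2w[:3, 1] = true_up
    c2w[:3, 2] = forward
    c2w[:3, 3] = cam_pos
    return c2w

def dome_poses(initial_c2w, max_angle_deg, target, intervals):
    """Swing the initial camera position around the target in equal angular steps."""
    offset = initial_c2w[:3, 3] - target          # vector from the target to the camera
    poses = []
    for angle in np.linspace(-max_angle_deg, max_angle_deg, intervals):
        t = np.deg2rad(angle)
        rot_z = np.array([[np.cos(t), -np.sin(t), 0.0],
                          [np.sin(t),  np.cos(t), 0.0],
                          [0.0,        0.0,       1.0]])
        poses.append(look_at(target + rot_z @ offset, target))
    return poses

if __name__ == "__main__":
    initial = np.eye(4)                 # placeholder: load a real pose/*.txt instead
    target = np.array([1.0, 0.0, 0.0])  # placeholder target point in the scene
    for i, pose in enumerate(dome_poses(initial, 30.0, target, 5)):
        np.savetxt(f"novel_pose_{i}.txt", pose)
```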
```
|
|- analysis         # Contains the saved analysis np mentioned in the class analysis
|- blender_scripts  # Helper scripts to use Blender for debugging
|- configs          # Configuration files for ginc and the sets of scenes used for training, validation, and testing
|- dataloader       # PeRFception and ScanNet dataset loaders
|- enet             # Segmentation model logic
|- lib              # Plenoxel-specific CUDA library files
|- model            # PeRFception model reconstruction
|- scripts          # Scripts for rendering
|- static           # Resources for the README
|- utils            # Utility scripts used in several locations, e.g. exporting images and the custom segmentation shader
```
Figure: Comparison of the results achieved using ScanNet camera images, rendered RGB images, rendered RGB+SH images, rendered RGB (+ Novel Views), and rendered SH+RGB (+ Novel Views).

Figure: Sample segmentation results showing the segmentation of a sofa and a bed. Middle: ground-truth segmentation. Right: our model.
- Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Chris Choy, Anima Anandkumar, Minsu Cho, and Jaesik Park. PeRFception: Perception using radiance fields. GitHub
- Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. Website
- Krishna Murthy. ENet-ScanNet. GitHub
- Alex Yu, Sara Fridovich-Keil, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. GitHub