X-Ray: A Sequential 3D Representation for Generation.

Introduction

We introduce X-Ray, a novel sequential 3D representation inspired by the penetrability of X-ray scans. X-Ray transforms a 3D object into a series of surface frames at different layers, making it suitable for generating 3D models from images. Our method casts rays from the camera center and records geometric and texture details, including depth, normal, and color, at every surface each ray intersects. This process condenses the whole 3D object into a multi-frame video format, which motivates the use of a network architecture similar to those in video diffusion models. Because only surface information is stored, the representation stays efficient. We demonstrate the practicality and adaptability of X-Ray by synthesizing the complete visible and hidden surfaces of a 3D object from a single input image, which paves the way for new 3D representation research and practical applications.
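To make the layered idea concrete, the following is a minimal toy sketch, not the authors' implementation. It casts one ray per pixel from a camera center through two concentric spheres and stores every surface crossing, nearest first, as a stack of hit and depth frames. The helper names `sphere_hits` and `cast_xray`, the scene, and the camera setup are all assumptions made for illustration; normals and colors are omitted.

```python
import numpy as np

def sphere_hits(origin, direction, radius):
    """Positive ray parameters where a unit-direction ray meets a sphere centred at 0."""
    # Solve |o + t d|^2 = r^2 for t, assuming |d| = 1.
    b = np.dot(origin, direction)
    c = np.dot(origin, origin) - radius ** 2
    disc = b * b - c
    if disc < 0:
        return []
    root = np.sqrt(disc)
    return [t for t in (-b - root, -b + root) if t > 0]

def cast_xray(radii, resolution=32, num_layers=8, camera_z=-3.0):
    """Build hit and depth frames of shape (num_layers, H, W) for concentric spheres."""
    hit = np.zeros((num_layers, resolution, resolution), dtype=bool)
    depth = np.zeros((num_layers, resolution, resolution), dtype=np.float32)
    origin = np.array([0.0, 0.0, camera_z])
    # Pixel centres on an image plane one unit in front of the camera, spanning [-0.5, 0.5].
    coords = (np.arange(resolution) + 0.5) / resolution - 0.5
    for row, y in enumerate(coords):
        for col, x in enumerate(coords):
            direction = np.array([x, y, 1.0])
            direction /= np.linalg.norm(direction)
            # Collect every surface crossing along this ray, nearest first.
            ts = sorted(t for r in radii for t in sphere_hits(origin, direction, r))
            for layer, t in enumerate(ts[:num_layers]):
                hit[layer, row, col] = True
                depth[layer, row, col] = t
    return hit, depth

hit, depth = cast_xray(radii=[1.0, 0.5])
counts = hit.sum(axis=(1, 2))  # number of pixels that have a surface in each layer
print(counts)
```

In this toy scene a ray through the image center crosses four surfaces (outer front, inner front, inner back, outer back), so the first four layers are populated there and deeper layers stay empty. That sparsity is what keeps a surface-only representation compact.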

The example of X-Ray.

The overview of 3D synthesis via X-Ray.

Getting Started

Installation

$ conda create -n xray python=3.10
$ conda activate xray
$ pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
$ pip install -U xformers==v0.0.23.post1 --index-url https://download.pytorch.org/whl/cu118
$ pip install -r requirements.txt

Download the dataset from Hugging Face.

$ cat 0*.zip > Objaverse_XRay.zip
$ unzip Objaverse_XRay.zip
$ ln -s /path/to/Objaverse_XRay Data/Objaverse_XRay

Preprocess your own dataset: render images and obtain X-Rays.

Render the mesh to obtain the image and camera parameters.

$ cd preprocess/get_image
$ bash custom/render_mesh.sh

Obtain the X-Ray representation.

$ cd preprocess/get_xray
$ python get_xray.py

Load an X-Ray from a .npz file.

from scipy.sparse import csr_matrix
import numpy as np

def load_xray(xray_path):
    # The .npz file stores the X-Ray as the components of a SciPy CSR sparse matrix.
    loaded_data = np.load(xray_path)
    loaded_sparse_matrix = csr_matrix((loaded_data['data'], loaded_data['indices'], loaded_data['indptr']), shape=loaded_data['shape'])
    # Restore the dense layout: 16 layers, 1 + 3 + 3 channels, 256 x 256 pixels.
    original_shape = (16, 1+3+3, 256, 256)
    restored_array = loaded_sparse_matrix.toarray().reshape(original_shape)
    return restored_array

xray = load_xray('example/dataset/xrays/0a0bc2921e5246a28732bf5584c251d1/000.npz')
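For your own data, a matching writer may be useful. The sketch below is an assumption, not code from this repository: `save_xray` is a hypothetical helper that stores a dense array under the same `data`, `indices`, `indptr`, and `shape` keys the loader reads, and it flattens the 4D array to 2D as (layers * channels, H * W), which is only one possible layout. The loader here takes the target shape as a parameter so the round trip can be checked on a small array.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix

def save_xray(xray, xray_path):
    """Hypothetical counterpart to load_xray: store a dense X-Ray as CSR components."""
    # Assumption: flatten (layers, channels, H, W) to (layers * channels, H * W).
    flat = xray.reshape(xray.shape[0] * xray.shape[1], -1)
    sparse = csr_matrix(flat)
    np.savez_compressed(xray_path, data=sparse.data, indices=sparse.indices,
                        indptr=sparse.indptr, shape=sparse.shape)

def load_xray(xray_path, original_shape=(16, 1 + 3 + 3, 256, 256)):
    """Same logic as the loader above, with the dense shape made configurable."""
    loaded = np.load(xray_path)
    sparse = csr_matrix((loaded['data'], loaded['indices'], loaded['indptr']),
                        shape=tuple(loaded['shape']))
    return sparse.toarray().reshape(original_shape)

# Round trip on a small, mostly empty array.
demo = np.zeros((4, 7, 32, 32), dtype=np.float32)
demo[0, 0, 10:20, 10:20] = 1.5
path = os.path.join(tempfile.mkdtemp(), '000.npz')
save_xray(demo, path)
restored = load_xray(path, original_shape=demo.shape)
print(np.array_equal(demo, restored))
```

Because most layers of an X-Ray are empty, the CSR form stores only the non-zero entries, which is presumably why the dataset ships in this format.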

A minimal dataset is located in ./example/dataset
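A short sketch for enumerating the X-Ray files in such a dataset follows. It assumes the layout implied by the example path above, `<dataset_root>/xrays/<object id>/<view>.npz`; the helper name `list_xray_files` is hypothetical and the real dataset may be organized differently.

```python
from pathlib import Path

def list_xray_files(dataset_root):
    """Group .npz paths by object id under <dataset_root>/xrays (assumed layout)."""
    groups = {}
    for path in sorted(Path(dataset_root, 'xrays').glob('*/*.npz')):
        groups.setdefault(path.parent.name, []).append(path)
    return groups

# Returns an empty dict if the directory does not exist.
files = list_xray_files('example/dataset')
for object_id, paths in files.items():
    print(object_id, len(paths))
```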

Training

Train Diffusion Model

$ bash scripts/train_diffusion.sh

Train Upsampler

$ bash scripts/train_upsampler.sh

Evaluation

$ python evaluate_diffusion.py --exp_diffusion Objaverse_XRay --date_root Data/Objaverse_XRay
$ python evaluate_upsampler.py --exp_diffusion Objaverse_XRay --exp_upsampler Objaverse_XRay_upsampler

TODO list

Release paper details.
Release the dataset.
Release the training and testing source code.
Release the preprocessing code.
Release the pre-trained model.
Release the gradio demo.

Authors

Tao Hu et al.

Acknowledgement

The model builds on Diffusers and Stability AI.
The source code is mainly based on SVD Xtend, which can train Stable Video Diffusion from scratch.

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{
hu2024xray,
title={X-Ray: A Sequential 3D Representation For Generation},
author={Tao Hu and Wenhang Ge and Yuyang Zhao and Gim Hee Lee},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=36tMV15dPO}
}
