Training and Inference

Environment Setup

To set up the environment needed for SceneVerse, run the following installation commands:

$ conda create -n sceneverse python=3.9
$ conda activate sceneverse
$ pip install -r requirements.txt
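If the installation succeeds, a quick import check should run without errors (a minimal sanity check, assuming PyTorch is among the requirements):

# Sanity check: verify PyTorch installed and CUDA is visible
$ python -c "import torch; print(torch.cuda.is_available())"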

SceneVerse also depends on an efficient implementation of PointNet2, which is located in modules. Remember to install it with:

$ cd modules/third_party/pointnet2
$ python setup.py install
$ cd ../../..
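You can check that the extension built and installed correctly with a quick import (the module name below is an assumption based on the package directory; confirm it against the output of setup.py install):

# Module name assumed from the package directory; verify against the setup.py output
$ python -c "import pointnet2"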

Model Configurations

1. Experiment Setup

We provide all experiment configurations in configs/final; the experiment setting is described in the comment at the top of each experiment file. To use the configuration files correctly, you need to change the following fields so that paths load correctly:

  • base_dir: save path for model checkpoints, configurations, and logs.
  • logger.entity: we used W&B for logging experiments, change it to your corresponding account.
  • data.{DATASET}_family_base: path to {DATASET}-related data.
  • model.vision.args.path: path to the pre-trained object encoder (PointNet++).
  • model.vision.args.lang_path: deprecated; essentially text embeddings of the 607 classes in ScanNet.

You can walk through configs/final/all_pretrain.yaml and compare it with the other files to see how we controlled the data and objectives used in training. A sketch of the fields above is given below.
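For orientation, these fields sit in the YAML roughly as follows. This is a minimal sketch with placeholder values and assumed nesting; check configs/final/all_pretrain.yaml for the authoritative layout:

# Sketch only: placeholder values, nesting inferred from the field names above
base_dir: /path/to/outputs              # checkpoints, configurations, and logs are saved here
logger:
  entity: your_wandb_entity             # your W&B account
data:
  scannet_family_base: /path/to/scannet # one {DATASET}_family_base field per dataset (name assumed)
model:
  vision:
    args:
      path: /path/to/pointnetpp.pth     # pre-trained PointNet++ object encoder
      lang_path: /path/to/text_embeds   # deprecated; 607 ScanNet class embeddings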

Experiments

1. Training and Inference

This codebase leverages the Hugging Face Accelerate and Facebook Submitit packages for efficient model training on multi-node clusters. We provide a launcher file launch.py that supports three ways of launching experiments:

# Launching using submitit on a SLURM cluster (e.g. 10 hour 1 node 4 GPU experiment with config file $CONFIG)
$ python launch.py --mode submitit --time 10 --qos $QOS --partition $PARTITION --mem_per_gpu 80 \
                   --gpu_per_node 4 --config $CONFIG note=$NOTE name=$EXP_NAME
                   
# Launching using accelerate on a multi-GPU instance
$ python launch.py --mode accelerate --gpu_per_node 4 --num_nodes 1 --config $CONFIG note=$NOTE name=$EXP_NAME

Basically, launch.py sets up the process(es) that run the main entry point run.py in a multi-GPU setting. You can directly overwrite fields of the configuration file $CONFIG by appending key=value pairs after all command-line arguments (e.g., name=$EXP_NAME solver.epochs=400 dataloader.batchsize=4), as in the example below.
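For instance, the following launch overrides the experiment name, epoch count, and batch size from the command line (illustrative values; the field names are taken from the example above):

# Command-line overrides appended after the launcher arguments (illustrative values)
$ python launch.py --mode accelerate --gpu_per_node 4 --num_nodes 1 --config $CONFIG \
                   name=$EXP_NAME solver.epochs=400 dataloader.batchsize=4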

For testing and inference, remember to set up the testing data correctly in each configuration file and switch the mode field of the configuration to test (i.e., mode=test).

2. Debugging

If you want to debug your code without an additional job launcher, you can also run run.py directly. For example, you can launch a single-GPU debugging run with

# Single card direct run for debugging purposes
$ python run.py --config-path ${PROJ_PATH}/configs/final/ --config-name ${EXP_CONFIG_NAME}.yaml \
                num_gpu=1 hydra.run.dir=. hydra.output_subdir=null hydra/job_logging=disabled hydra/hydra_logging=disabled \
                debug.flag=True debug.debug_size=1 dataloader.batchsize=2 debug.hard_debug=True name=Debug_test

Checkpoints

We provide all available checkpoints under the same data directory, in a folder named Checkpoints. Detailed descriptions of each checkpoint are given in the table below:

| Setting | Description | Corresponding Experiment | Checkpoint based on experiment setting |
|---|---|---|---|
| pre-trained | GPS model pre-trained on SceneVerse | 3D-VL grounding (Tab. 2) | Model |
| scratch | GPS model trained on datasets from scratch | 3D-VL grounding (Tab. 2), SceneVerse-val (Tab. 3) | ScanRefer, Sr3D, Nr3D, SceneVerse-val |
| fine-tuned | GPS model fine-tuned on datasets with grounding heads | 3D-VL grounding (Tab. 2) | ScanRefer, Sr3D, Nr3D |
| zero-shot | GPS model trained on SceneVerse without data from ScanNet and MultiScan | Zero-shot Transfer (Tab. 3) | Model |
| zero-shot text | GPS | Zero-shot Transfer (Tab. 3) | ScanNet, SceneVerse-val |
| text-ablation | Ablations on the type of language used during pre-training | Ablation on Text (Tab. 7) | Template only, Template+LLM |
| scene-ablation | Ablations on the use of synthetic scenes during pre-training | Ablation on Scene (Tab. 8) | Real only, S3D only, ProcTHOR only |
| model-ablation | Ablations on the use of losses during pre-training | Ablation on Model Design (Tab. 9) | Refer only, Refer+Obj-lvl, w/o Scene-lvl |

To properly use the pre-trained checkpoints, set the pretrain_ckpt_path key in the configs:

# Directly testing the checkpoint
$ python launch.py --mode submitit --qos $QOS --partition $PARTITION --mem_per_gpu 80 \
                   --gpu_per_node 4 --config $CONFIG note=$NOTE name=$EXP_NAME mode=test \
                   pretrain_ckpt_path=$PRETRAIN_CKPT

# Fine-tuning with pre-trained checkpoint
$ python launch.py --mode submitit --qos $QOS --partition $PARTITION --mem_per_gpu 80 \
                   --gpu_per_node 4 --config $CONFIG note=$NOTE name=$EXP_NAME \
                   pretrain_ckpt_path=$PRETRAIN_CKPT

For fine-tuning the pre-trained checkpoint on downstream datasets, use the fine-tuning config files provided under configs/final/finetune, as sketched below.
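For example, a fine-tuning launch might look like the following ($FINETUNE_CONFIG is a hypothetical placeholder; substitute the actual file under configs/final/finetune for your target dataset):

# Fine-tuning on a downstream dataset ($FINETUNE_CONFIG is a placeholder filename)
$ python launch.py --mode submitit --qos $QOS --partition $PARTITION --mem_per_gpu 80 \
                   --gpu_per_node 4 --config configs/final/finetune/$FINETUNE_CONFIG \
                   note=$NOTE name=$EXP_NAME pretrain_ckpt_path=$PRETRAIN_CKPT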