[ICLR2024] HEAL: An Extensible Framework for Open Heterogeneous Collaborative Perception
This repo is also a unified and integrated multi-agent collaborative perception framework for LiDAR-based, camera-based, and heterogeneous settings!
Through powerful code integration, you can access 4 datasets, the latest collaborative perception methods, and multiple modalities here. This is the most complete collaborative perception framework available.
OpenReview | ArXiv | Zhihu
- Modality Support
  - LiDAR
  - Camera
  - LiDAR + Camera
- Heterogeneity Support
  - Sensor Data Heterogeneity: The same scene provides multiple kinds of LiDAR data (16/32/64-line) and camera data (with/without depth sensor).
  - Modality Heterogeneity: You can assign a different sensor modality to each agent in any way you like!
  - Model Heterogeneity: You can assign a different model encoder (together with a modality) to each agent in any way you like!
- Dataset Support
  - OPV2V
  - V2XSet
  - V2X-Sim 2.0
  - DAIR-V2X-C
- Detector Support
  - PointPillars (LiDAR)
  - SECOND (LiDAR)
  - Pixor (LiDAR)
  - VoxelNet (LiDAR)
  - Lift-Splat-Shoot (Camera)
- Multiple collaborative perception methods

Download the datasets you plan to use:
- OPV2V: Please refer to this repo.
- OPV2V-H: We store our data on the Hugging Face Hub. Please refer to the Downloading datasets tutorial for usage.
- V2XSet: Please refer to this repo.
- V2X-Sim 2.0: Download the data from this page. Also download the pickle files from Google Drive.
- DAIR-V2X-C: Download the data from this page. We use complemented annotations, so please also follow the instructions on this page.
Note that you only need to download the datasets you are interested in. OPV2V and DAIR-V2X-C are heavily used in this repo, so it is recommended that you download and try them first.
Create a dataset folder under HEAL and put your data there. Make the naming and structure consistent with the following:
HEAL/dataset
.
├── my_dair_v2x
│   ├── v2x_c
│   ├── v2x_i
│   └── v2x_v
├── OPV2V
│   ├── additional
│   ├── test
│   ├── train
│   └── validate
├── OPV2V_Hetero
│   ├── test
│   ├── train
│   └── validate
├── V2XSET
│   ├── test
│   ├── train
│   └── validate
├── v2xsim2-complete
│   ├── lidarseg
│   ├── maps
│   ├── sweeps
│   └── v1.0-mini
└── v2xsim2_info
    ├── v2xsim_infos_test.pkl
    ├── v2xsim_infos_train.pkl
    └── v2xsim_infos_val.pkl
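Before training, it can help to confirm that the folder layout matches the tree above. The following is a minimal sketch, not a tool shipped with the repository: `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical names, and only the sub-folders listed in the tree are checked (the pickle files under v2xsim2_info are left out for brevity).

```python
from pathlib import Path

# Expected sub-folders, transcribed from the tree above (assumption: only
# directories are checked; the v2xsim2_info pickle files are not).
EXPECTED_DIRS = {
    "my_dair_v2x": ["v2x_c", "v2x_i", "v2x_v"],
    "OPV2V": ["additional", "test", "train", "validate"],
    "OPV2V_Hetero": ["test", "train", "validate"],
    "V2XSET": ["test", "train", "validate"],
    "v2xsim2-complete": ["lidarseg", "maps", "sweeps", "v1.0-mini"],
}


def missing_dataset_dirs(dataset_root, datasets=None):
    """Return the expected sub-folders that are absent under dataset_root.

    `datasets` limits the check to the datasets you actually downloaded;
    by default every entry of EXPECTED_DIRS is checked.
    """
    root = Path(dataset_root)
    names = list(EXPECTED_DIRS) if datasets is None else list(datasets)
    missing = []
    for name in names:
        for sub in EXPECTED_DIRS[name]:
            if not (root / name / sub).is_dir():
                missing.append(str(Path(name) / sub))
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the HEAL/ folder after downloading OPV2V.
    for path in missing_dataset_dirs("dataset", datasets=["OPV2V"]):
        print("missing:", path)
```

Passing `datasets=[...]` keeps the check useful when only some of the datasets have been downloaded.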
conda create -n heal python=3.8
conda activate heal
# install pytorch into the activated heal environment. Cudatoolkit 11.3 was tested in our experiments; 11.6 is pinned below.
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
# install dependency
pip install -r requirements.txt
# install this project. It's OK if EasyInstallDeprecationWarning shows up.
python setup.py develop
We use spconv 1.2.1 or spconv 2.x to generate voxel features.
To install spconv 1.2.1, please follow the guide in https://github.com/traveller59/spconv/tree/v1.2.1. You can also get a detailed installation guide in CoAlign Installation Doc.
To install spconv 2.x, check the table for the installation command that matches your setup. For example, with cudatoolkit 11.6, run
pip install spconv-cu116 # match your cudatoolkit version
Install the CUDA version of the bounding-box NMS calculation:
python opencood/utils/setup.py build_ext --inplace
pip install git+https://github.com/klintan/pypcd.git
Install the dependencies for fpv-rcnn.
cd HEAL
python opencood/pcdet_utils/setup.py build_ext --inplace
To align with the agent-type assignment used in our experiments, please make a copy of the assignment files under the logs folder:
# in HEAL directory
mkdir opencood/logs
cp -r opencood/modality_assign opencood/logs/heter_modality_assign
These training and testing instructions apply to all end-to-end training methods. Note that HEAL requires a collaboration base to be constructed before other agent types are aligned to it; see the next section for HEAL training. If you want to train a collaborative perception model based on Pyramid Fusion, the following approach still applies.
We use YAML files to configure all training parameters. To train your own model from scratch or from a saved checkpoint, run the following command:
python opencood/tools/train.py -y ${CONFIG_FILE} [--model_dir ${CHECKPOINT_FOLDER}]
Arguments Explanation:
- -y or hypes_yaml: the path of the training configuration file, e.g. opencood/hypes_yaml/opv2v/LiDAROnly/lidar_fcooper.yaml, meaning you want to train a FCooper model. We elaborate each entry of the yaml in the exemplar config file opencood/hypes_yaml/exemplar.yaml.
- model_dir (optional): the path of the checkpoints. This is used to fine-tune or continue training. When model_dir is given, the trainer will discard the hypes_yaml and load the config.yaml in the checkpoint folder. In this case, ${CONFIG_FILE} can be None.
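The precedence between the two arguments can be summarized in a few lines. This is a sketch of the rule as described here, not the trainer's actual code; `resolve_config_path` is a hypothetical helper.

```python
import os


def resolve_config_path(hypes_yaml=None, model_dir=None):
    """Pick the YAML file a run would read, following the rule described above.

    Hypothetical helper, not part of the repository: if model_dir is given,
    its config.yaml wins and hypes_yaml is ignored; otherwise hypes_yaml
    must point to a configuration file.
    """
    if model_dir:
        return os.path.join(model_dir, "config.yaml")
    if hypes_yaml in (None, "None"):
        raise ValueError("either a config file or a model_dir is required")
    return hypes_yaml
```

For example, `resolve_config_path("None", "opencood/logs/my_run")` points at the `config.yaml` inside that folder, which is why `-y None` is acceptable when resuming.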
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --use_env opencood/tools/train_ddp.py -y ${CONFIG_FILE} [--model_dir ${CHECKPOINT_FOLDER}]
--nproc_per_node indicates the number of GPUs you will use.
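When launching from a script, the process count and the visible devices should stay in sync. A minimal sketch, assuming you build the command yourself (`ddp_launch` is a hypothetical helper; the flags are the ones shown in the command above):

```python
def ddp_launch(gpu_ids, config_file, model_dir=None):
    """Build the environment and argument list for the DDP command above.

    The process count is derived from the GPU list, so the two cannot
    drift apart. Hypothetical helper, not part of the repository.
    """
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    cmd = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",
        "--use_env",
        "opencood/tools/train_ddp.py",
        "-y", str(config_file),
    ]
    if model_dir:
        cmd += ["--model_dir", str(model_dir)]
    return env, cmd
```

The returned pair can be handed to `subprocess.run(cmd, env={**os.environ, **env})` from the HEAL/ folder.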
python opencood/tools/inference.py --model_dir ${CHECKPOINT_FOLDER} [--fusion_method intermediate]
- inference.py has more optional args; you can inspect the file for details.
- [--fusion_method intermediate]: the default fusion method is intermediate fusion. According to your fusion strategy in training, the available fusion_method can be:
  - single: only ego agent's detection, only ego's gt box. [only for late fusion dataset]
  - no: only ego agent's detection, all agents' fused gt box. [only for late fusion dataset]
  - late: late fusion detection from all agents, all agents' fused gt box. [only for late fusion dataset]
  - early: early fusion detection from all agents, all agents' fused gt box. [only for early fusion dataset]
  - intermediate: intermediate fusion detection from all agents, all agents' fused gt box. [only for intermediate fusion dataset]
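The compatibility rule in the list above can be written as a small lookup. This is a sketch for illustration only; `FUSION_METHOD_DATASET` and `check_fusion_method` are hypothetical names, and inference.py may validate its arguments differently.

```python
# fusion_method -> kind of dataset (fusion strategy used in training) it
# applies to, transcribed from the list above.
FUSION_METHOD_DATASET = {
    "single": "late",
    "no": "late",
    "late": "late",
    "early": "early",
    "intermediate": "intermediate",
}


def check_fusion_method(fusion_method, trained_fusion):
    """Return fusion_method if it is valid for a model trained with trained_fusion."""
    if fusion_method not in FUSION_METHOD_DATASET:
        raise ValueError(f"unknown fusion_method: {fusion_method!r}")
    required = FUSION_METHOD_DATASET[fusion_method]
    if required != trained_fusion:
        raise ValueError(
            f"{fusion_method!r} needs a {required} fusion dataset, "
            f"but the model was trained with {trained_fusion} fusion"
        )
    return fusion_method
```

For instance, `single`, `no`, and `late` are all accepted for a late fusion model, while `early` is rejected.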
We introduced identifiers such as m1, m2, ... to indicate the modalities and models that an agent will use.
However, yaml files without identifiers like m1 (if you are familiar with the CoAlign repository) still work in this repository. For example, PointPillar Early Fusion.
Note that the weight key names differ between checkpoints trained with and without identifiers. For example, training with the m1 identifier gives some parameter names prefixes like encoder_m1., backbone_m1, etc. But since the model structures are the same, you can convert them using the rename_model_dict_keys function in opencood/utils/model_utils.py.
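To make the key difference concrete, here is a minimal sketch of such a renaming on a plain dictionary. It is not the repository's `rename_model_dict_keys` (whose signature is not shown here); `add_modality_suffix` is a hypothetical function, and the module names `encoder` and `backbone` are taken from the prefixes mentioned above.

```python
from collections import OrderedDict


def add_modality_suffix(state_dict, modality="m1", modules=("encoder", "backbone")):
    """Rename 'encoder.xxx' to 'encoder_m1.xxx' for the listed top-level modules.

    Keys that belong to other modules are copied unchanged, and the
    original key order is preserved.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        head, sep, tail = key.partition(".")
        if sep and head in modules:
            key = f"{head}_{modality}.{tail}"
        renamed[key] = value
    return renamed


if __name__ == "__main__":
    weights = {"encoder.conv.weight": 1, "backbone.bn.bias": 2, "head.cls.weight": 3}
    print(list(add_modality_suffix(weights)))
```

With a real checkpoint, the input would be the dictionary returned by `torch.load`, and the result would be passed to `load_state_dict`.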
- The identifiers like m1, m2 in opv2v_4modality.json are used to assign an agent type to each agent in the scene. With this assignment, we ensure the validation scenarios for all methods are consistent and fixed. To generate these json files, you can refer to heter_utils.py.
- The identifiers like m1, m2 in ${METHOD}.yaml are used to specify the sensor configuration and detection model used by this agent type (like m2 in the case of camera_pyramid.yaml).
In ${METHOD}.yaml, there is also a concept of mapping_dict. It maps the given agent type of opv2v_4modality.json to the agent type in the current experiment. As you can see, camera_pyramid.yaml is a homogeneous collaborative perception setting, so the type of all agents should be the same, which can be referred to by m2.
Note that mapping_dict does not take effect during training. To introduce more data augmentation, each agent is instead randomly assigned an agent type that exists in the yaml.
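The two behaviours of mapping_dict can be sketched as follows. This is an illustration under assumptions, not the repository's code: `assign_agent_types` is a hypothetical function, and the JSON is assumed to reduce to a plain mapping from agent id to agent type for one scene.

```python
import random


def assign_agent_types(scene_assignment, mapping_dict, available_types,
                       training, rng=None):
    """Return {agent_id: agent_type} for one scene.

    scene_assignment: agent id -> type from the fixed JSON (e.g. "m3").
    mapping_dict: type in the JSON -> type used in the current experiment.
    available_types: agent types defined in the experiment's yaml.
    training: if True, mapping_dict is ignored and every agent gets a
    random type from available_types (extra data augmentation).
    """
    if training:
        rng = rng or random.Random()
        return {agent: rng.choice(list(available_types))
                for agent in scene_assignment}
    return {agent: mapping_dict[agent_type]
            for agent, agent_type in scene_assignment.items()}


if __name__ == "__main__":
    scene = {"agent_a": "m1", "agent_b": "m3"}
    # Homogeneous camera setting: every type in the JSON maps to m2.
    homogeneous = {"m1": "m2", "m2": "m2", "m3": "m2", "m4": "m2"}
    print(assign_agent_types(scene, homogeneous, ["m2"], training=False))
```

At validation time the result is fixed by the JSON and the mapping, which is what keeps the scenarios comparable across methods.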
HEAL first trains a collaboration base and then aligns new agent types to this base. Following our paper, we select LiDAR w/ PointPillars as our collaboration base.
Suppose you are now in the HEAL/ folder. If this is your first training attempt, execute mkdir opencood/logs. Then run:
mkdir opencood/logs/HEAL_m1_based
mkdir opencood/logs/HEAL_m1_based/stage1
mkdir opencood/logs/HEAL_m1_based/stage1/m1_base
cp opencood/hypes_yaml/opv2v/MoreModality/HEAL/stage1/m1_pyramid.yaml opencood/logs/HEAL_m1_based/stage1/m1_base/config.yaml
python opencood/tools/train.py -y None --model_dir opencood/logs/HEAL_m1_based/stage1/m1_base # you can also use DDP training
After training the collaboration base, you will usually have a best-validation checkpoint, for example "net_epoch_bestval_at23.pth". We then load the Pyramid Fusion parameters from "net_epoch_bestval_at23.pth" and keep them fixed while training the new agent types.
mkdir opencood/logs/HEAL_m1_based/stage2
mkdir opencood/logs/HEAL_m1_based/stage2/m2_alignto_m1
mkdir opencood/logs/HEAL_m1_based/stage2/m3_alignto_m1
mkdir opencood/logs/HEAL_m1_based/stage2/m4_alignto_m1
# your bestval checkpoint!
cp opencood/logs/HEAL_m1_based/stage1/m1_base/net_epoch_bestval_at23.pth opencood/logs/HEAL_m1_based/stage2/net_epoch1.pth
# a relative link target is resolved from the folder that holds the link
ln -s ../net_epoch1.pth opencood/logs/HEAL_m1_based/stage2/m2_alignto_m1/net_epoch1.pth
ln -s ../net_epoch1.pth opencood/logs/HEAL_m1_based/stage2/m3_alignto_m1/net_epoch1.pth
ln -s ../net_epoch1.pth opencood/logs/HEAL_m1_based/stage2/m4_alignto_m1/net_epoch1.pth
cp opencood/hypes_yaml/opv2v/MoreModality/HEAL/stage2/m2_single_pyramid.yaml opencood/logs/HEAL_m1_based/stage2/m2_alignto_m1/config.yaml
cp opencood/hypes_yaml/opv2v/MoreModality/HEAL/stage2/m3_single_pyramid.yaml opencood/logs/HEAL_m1_based/stage2/m3_alignto_m1/config.yaml
cp opencood/hypes_yaml/opv2v/MoreModality/HEAL/stage2/m4_single_pyramid.yaml opencood/logs/HEAL_m1_based/stage2/m4_alignto_m1/config.yaml
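The same stage-2 preparation can be scripted. The sketch below mirrors the shell steps above under stated assumptions: `prepare_stage2` is a hypothetical helper, and each link is created with a target relative to its own folder so that it resolves regardless of the working directory.

```python
import shutil
from pathlib import Path


def prepare_stage2(run_root, base_checkpoint, stage2_configs):
    """Create stage2/<name>/ folders that share one copy of the base checkpoint.

    run_root: e.g. opencood/logs/HEAL_m1_based
    base_checkpoint: best-validation checkpoint of the collaboration base
    stage2_configs: {folder name: path of the YAML to copy as config.yaml}
    """
    stage2 = Path(run_root) / "stage2"
    stage2.mkdir(parents=True, exist_ok=True)
    shared = stage2 / "net_epoch1.pth"
    shutil.copyfile(base_checkpoint, shared)
    for name, yaml_path in stage2_configs.items():
        folder = stage2 / name
        folder.mkdir(exist_ok=True)
        link = folder / shared.name
        if not link.exists() and not link.is_symlink():
            # The target is relative to the folder that holds the link.
            link.symlink_to(Path("..") / shared.name)
        shutil.copyfile(yaml_path, folder / "config.yaml")
    return stage2
```

A call would pass the three `m*_single_pyramid.yaml` files keyed by `m2_alignto_m1`, `m3_alignto_m1`, and `m4_alignto_m1`.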
Then you can train the new agent types without collaboration. These models can be trained in parallel.
# you can also use DDP training
python opencood/tools/train.py -y None --model_dir opencood/logs/HEAL_m1_based/stage2/m2_alignto_m1
python opencood/tools/train.py -y None --model_dir opencood/logs/HEAL_m1_based/stage2/m3_alignto_m1
python opencood/tools/train.py -y None --model_dir opencood/logs/HEAL_m1_based/stage2/m4_alignto_m1
mkdir opencood/logs/HEAL_m1_based/final_infer/ # create a log folder for final infer.
cp opencood/hypes_yaml/opv2v/MoreModality/HEAL/final_infer/m1m2m3m4.yaml opencood/logs/HEAL_m1_based/final_infer/config.yaml
python opencood/tools/heal_tools.py merge_final \
opencood/logs/HEAL_m1_based/stage2/m2_alignto_m1 \
opencood/logs/HEAL_m1_based/stage2/m3_alignto_m1 \
opencood/logs/HEAL_m1_based/stage2/m4_alignto_m1 \
opencood/logs/HEAL_m1_based/stage1/m1_base \
opencood/logs/HEAL_m1_based/final_infer
python opencood/tools/heal_tools.py merge_final will automatically search for the best checkpoint in each folder and merge them together. The collaboration base's folder (m1 here) should be put second to last, while the output folder should be put last.
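The argument order and the checkpoint search can be illustrated with a short sketch. How heal_tools.py actually selects a checkpoint is not shown here, so this assumes the file name pattern of "net_epoch_bestval_at23.pth" and takes the highest epoch number; both function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming scheme, based on the example "net_epoch_bestval_at23.pth".
BESTVAL_PATTERN = re.compile(r"net_epoch_bestval_at(\d+)\.pth")


def find_bestval_checkpoint(folder):
    """Return the best-validation checkpoint with the highest epoch, or None."""
    best = None
    for path in Path(folder).iterdir():
        match = BESTVAL_PATTERN.fullmatch(path.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return None if best is None else best[1]


def split_merge_args(folders):
    """Split the positional folders into (aligned folders, base, output).

    The collaboration base comes second to last and the output folder last.
    """
    if len(folders) < 3:
        raise ValueError("need at least one aligned folder, the base and the output")
    *aligned, base, output = folders
    return aligned, base, output
```

For the command above, `split_merge_args` would return the three `m*_alignto_m1` folders, then `m1_base`, then `final_infer`.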
To validate HEAL's performance in the open heterogeneous setting, i.e., gradually adding new agent types into the scene, we use opencood/tools/inference_heter_in_order.py.
python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_m1_based/final_infer
This will overwrite many parameters in config.yaml, including mapping_dict and comm_range, and will gradually add m1, m2, m3, m4 agents into the scene. The ground truth will always be max_cav's fused gt boxes.
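The evaluation order can be pictured as a cumulative schedule of agent types. This is only a sketch of the idea of adding types one at a time; `open_heterogeneous_schedule` is a hypothetical name, and the script's handling of agents whose type has not been added yet is not covered.

```python
def open_heterogeneous_schedule(agent_types=("m1", "m2", "m3", "m4")):
    """Yield the agent types present in the scene at each evaluation step."""
    for count in range(1, len(agent_types) + 1):
        yield list(agent_types[:count])


if __name__ == "__main__":
    for step, types in enumerate(open_heterogeneous_schedule(), start=1):
        print(f"step {step}: {'+'.join(types)}")
```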
Take the DAIR-V2X dataset as an example, which consists of one vehicle and one Road-side Unit (RSU). We first trained Pyramid Fusion using the vehicle's and the RSU's data as the collaboration base. Subsequently, we distributed the vehicle's raw data and the Pyramid Fusion weights to various companies, allowing them to train their respective models locally.
We store our checkpoint files in HEAL's Hugging Face Hub.
Update: Those checkpoints have a faulty input channel number for SECOND-related models, but you can still run them with spconv 1.2.1 (because spconv 1.2.1 has no sanity check). The performance should degrade, but it still looks reasonable. More discussion can be found in Issue 20.
If you want to compare with HEAL's model and you use spconv 1.2.1, you can still load from the checkpoint. To develop your model, please do not use these checkpoints.
@inproceedings{
lu2024an,
title={An Extensible Framework for Open Heterogeneous Collaborative Perception},
author={Lu, Yifan and Hu, Yue and Zhong, Yiqi and Wang, Dequan and Chen, Siheng and Wang, Yanfeng},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
}