Commit fe25f7a

Merge pull request #2867 from open-mmlab/dev-1.x
Bump version to 1.4.0
2 parents: 5c0613b + 0ef13b8

File tree: 80 files changed (+7291, -1522 lines)


.circleci/test.yml (+5, -3)

```diff
@@ -85,7 +85,7 @@ jobs:
         type: string
       cuda:
         type: enum
-        enum: ["11.1", "11.7"]
+        enum: ["10.2", "11.7"]
       cudnn:
         type: integer
         default: 8
@@ -173,7 +173,8 @@ workflows:
           torch: 1.8.1
           # Use double quotation mark to explicitly specify its type
           # as string instead of number
-          cuda: "11.1"
+          cuda: "10.2"
+          cudnn: 7
           requires:
             - hold
       - build_cuda:
@@ -190,7 +191,8 @@ workflows:
       - build_cuda:
           name: minimum_version_gpu
           torch: 1.8.1
-          cuda: "11.1"
+          cuda: "10.2"
+          cudnn: 7
           filters:
             branches:
               only:
```

.gitignore (+1)

```diff
@@ -134,3 +134,4 @@ data/sunrgbd/OFFICIAL_SUNRGBD/
 # Waymo evaluation
 mmdet3d/evaluation/functional/waymo_utils/compute_detection_metrics_main
 mmdet3d/evaluation/functional/waymo_utils/compute_detection_let_metrics_main
+mmdet3d/evaluation/functional/waymo_utils/compute_segmentation_metrics_main
```

README.md (+8, -2)

```diff
@@ -104,9 +104,15 @@ Like [MMDetection](https://github.com/open-mmlab/mmdetection) and [MMCV](https:/
 
 ### Highlight
 
-**We have renamed the branch `1.1` to `main` and switched the default branch from `master` to `main`. We encourage users to migrate to the latest version, though it comes with some cost. Please refer to [Migration Guide](docs/en/migration.md) for more details.**
+In version 1.4, MMDetection3D refactors the Waymo dataset and accelerates the preprocessing, training/testing setup, and evaluation of the Waymo dataset. It also extends support for camera-based 3D object detection models, such as monocular and BEV models, on Waymo. A detailed description of the Waymo data information is provided [here](https://mmdetection3d.readthedocs.io/en/latest/advanced_guides/datasets/waymo.html).
 
-We have constructed a comprehensive LiDAR semantic segmentation benchmark on SemanticKITTI, including Cylinder3D, MinkUNet and SPVCNN methods. Notably, the improved MinkUNetv2 can achieve 70.3 mIoU on the validation set of SemanticKITTI. We have also supported the training of BEVFusion and an occupancy prediction method, TPVFormer, in our `projects`. More new features about 3D perception are on the way. Please stay tuned!
+In addition, version 1.4 provides [Waymo-mini](https://download.openmmlab.com/mmdetection3d/data/waymo_mmdet3d_after_1x4/waymo_mini.tar.gz) to help community users get started with Waymo and use it for quick iterative development.
+
+**v1.4.0** was released in 8/1/2024:
+
+- Support the training of [DSVT](https://arxiv.org/abs/2301.06051) in `projects`
+- Support [NeRF-Det](https://arxiv.org/abs/2307.14620) in `projects`
+- Refactor the Waymo dataset
 
 **v1.3.0** was released in 18/10/2023:
```

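As a quick sanity check after picking up this release, the installed version can be inspected directly. This is a minimal sketch; it only assumes `mmdet3d>=1.4.0` is installed (e.g. via `pip install -U mmdet3d` or `mim install mmdet3d`):

```python
# Print the installed MMDetection3D version; it should read 1.4.0 for this release.
import mmdet3d

print(mmdet3d.__version__)
```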
README_zh-CN.md (+8, -4)

```diff
@@ -104,9 +104,15 @@ MMDetection3D is an open-source object detection toolbox based on PyTorch, the next-generat
 
 ### Highlight
 
-**We have renamed the `1.1` branch to `main` and switched the default branch from `master` to `main`. We encourage users to migrate to the latest version; please refer to the [Migration Guide](docs/en/migration.md) for more details.**
+In version 1.4, MMDetection3D refactors the Waymo dataset, accelerating the preprocessing, training/testing startup, and evaluation of the Waymo dataset. It also extends support on Waymo for camera-based 3D object detection models, such as monocular and BEV models. A detailed walkthrough of the Waymo data information is provided [here](https://mmdetection3d.readthedocs.io/en/latest/advanced_guides/datasets/waymo.html).
 
-We have built a comprehensive point cloud semantic segmentation benchmark on SemanticKITTI, including the Cylinder3D, MinkUNet, and SPVCNN methods. Among them, the improved MinkUNetv2 reaches 70.3 mIoU on the validation set. We have also supported training BEVFusion and the new 3D occupancy prediction network TPVFormer in `projects`. More new features for 3D perception are on the way. Stay tuned!
+In addition, version 1.4 provides [Waymo-mini](https://download.openmmlab.com/mmdetection3d/data/waymo_mmdet3d_after_1x4/waymo_mini.tar.gz) to help community users get started with Waymo and use it for quick iterative development.
+
+**v1.4.0** was released on 2024.1.8:
+
+- Support the training of [DSVT](https://arxiv.org/abs/2301.06051) in `projects`
+- Support [NeRF-Det](https://arxiv.org/abs/2307.14620) in `projects`
+- Refactor the Waymo dataset
 
 **v1.3.0** was released on 2023.10.18:
 
@@ -171,8 +177,6 @@ MMDetection3D is an open-source object detection toolbox based on PyTorch, the next-generat
 
 ## Benchmarks and Model Zoo
 
-## Benchmarks and Model Zoo
-
 Test results and models can be found in the [Model Zoo](docs/zh_cn/model_zoo.md).
 
 <div align="center">
```
New file (+184 lines; the file path is hidden in this truncated view)

```python
# dataset settings
# D3 in the config name means the whole dataset is divided into 3 folds
# We only use one fold for efficient experiments
dataset_type = 'WaymoDataset'
data_root = 'data/waymo/kitti_format/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
metainfo = dict(classes=class_names)
input_modality = dict(use_lidar=False, use_camera=True)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (does not support LMDB or Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/waymo/kitti_format/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(type='LoadImageFromFileMono3D', backend_args=backend_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox=True,
        with_label=True,
        with_attr_label=False,
        with_bbox_3d=True,
        with_label_3d=True,
        with_bbox_depth=True),
    # base shape (1248, 832), scale (0.95, 1.05)
    dict(
        type='RandomResize3D',
        scale=(1248, 832),
        ratio_range=(0.95, 1.05),
        # ratio_range=(1., 1.),
        interpolation='nearest',
        keep_ratio=True,
    ),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='Pack3DDetInputs',
        keys=[
            'img', 'gt_bboxes', 'gt_bboxes_labels', 'gt_bboxes_3d',
            'gt_labels_3d', 'centers_2d', 'depths'
        ]),
]

test_pipeline = [
    dict(type='LoadImageFromFileMono3D', backend_args=backend_args),
    dict(
        type='RandomResize3D',
        scale=(1248, 832),
        ratio_range=(1., 1.),
        interpolation='nearest',
        keep_ratio=True),
    dict(
        type='Pack3DDetInputs',
        keys=['img'],
        meta_keys=[
            'box_type_3d', 'img_shape', 'cam2img', 'scale_factor',
            'sample_idx', 'context_name', 'timestamp', 'lidar2cam'
        ]),
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(type='LoadImageFromFileMono3D', backend_args=backend_args),
    dict(
        type='RandomResize3D',
        scale=(1248, 832),
        ratio_range=(1., 1.),
        interpolation='nearest',
        keep_ratio=True),
    dict(
        type='Pack3DDetInputs',
        keys=['img'],
        meta_keys=[
            'box_type_3d', 'img_shape', 'cam2img', 'scale_factor',
            'sample_idx', 'context_name', 'timestamp', 'lidar2cam'
        ]),
]

train_dataloader = dict(
    batch_size=3,
    num_workers=3,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='waymo_infos_train.pkl',
        data_prefix=dict(
            pts='training/velodyne',
            CAM_FRONT='training/image_0',
            CAM_FRONT_LEFT='training/image_1',
            CAM_FRONT_RIGHT='training/image_2',
            CAM_SIDE_LEFT='training/image_3',
            CAM_SIDE_RIGHT='training/image_4'),
        pipeline=train_pipeline,
        modality=input_modality,
        test_mode=False,
        metainfo=metainfo,
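        # cam_sync_instances presumably selects the camera-synchronized
        # instance annotations (objects visible in the camera images)
        # instead of the full set of LiDAR-labeled objects.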
        cam_sync_instances=True,
        # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
        # and box_type_3d='Depth' in sunrgbd and scannet dataset.
        box_type_3d='Camera',
        load_type='fov_image_based',
        # load one frame every three frames
        load_interval=3,
        backend_args=backend_args))

val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(
            pts='training/velodyne',
            CAM_FRONT='training/image_0',
            CAM_FRONT_LEFT='training/image_1',
            CAM_FRONT_RIGHT='training/image_2',
            CAM_SIDE_LEFT='training/image_3',
            CAM_SIDE_RIGHT='training/image_4'),
        ann_file='waymo_infos_val.pkl',
        pipeline=eval_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        cam_sync_instances=True,
        # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
        # and box_type_3d='Depth' in sunrgbd and scannet dataset.
        box_type_3d='Camera',
        load_type='fov_image_based',
        load_eval_anns=False,
        backend_args=backend_args))

test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(
            pts='training/velodyne',
            CAM_FRONT='training/image_0',
            CAM_FRONT_LEFT='training/image_1',
            CAM_FRONT_RIGHT='training/image_2',
            CAM_SIDE_LEFT='training/image_3',
            CAM_SIDE_RIGHT='training/image_4'),
        ann_file='waymo_infos_val.pkl',
        pipeline=eval_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        cam_sync_instances=True,
        # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
        # and box_type_3d='Depth' in sunrgbd and scannet dataset.
        box_type_3d='Camera',
        load_type='fov_image_based',
        backend_args=backend_args))

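# 'LET_mAP' refers to Waymo's longitudinal-error-tolerant metric
# (LET-3D-AP), introduced for camera-only 3D detection where depth errors
# dominate; 'fov_gt.bin' presumably holds ground truth restricted to the
# camera field of view, matching load_type='fov_image_based' above.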
val_evaluator = dict(
    type='WaymoMetric',
    waymo_bin_file='./data/waymo/waymo_format/fov_gt.bin',
    metric='LET_mAP',
    load_type='fov_image_based',
    result_prefix='./pgd_fov_pred')
test_evaluator = val_evaluator

vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='Det3DLocalVisualizer', vis_backends=vis_backends, name='visualizer')
```