MSG3D

Abstract

Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.

Citation

@inproceedings{liu2020disentangling,
  title={Disentangling and unifying graph convolutions for skeleton-based action recognition},
  author={Liu, Ziyu and Zhang, Hongwen and Chen, Zhenghao and Wang, Zhiyong and Ouyang, Wanli},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={143--152},
  year={2020}
}

Model Zoo

We release numerous checkpoints trained with various modalities, annotations on NTURGB+D and NTURGB+D 120. The accuracy of each modality links to the weight file.

Dataset	Annotation	GPUs	Joint Top1	Bone Top1	Joint Motion Top1	Bone-Motion Top1	Two-Stream Top1	Four Stream Top1
NTURGB+D XSub	Official 3D Skeleton	8	joint_config: 89.6	bone_config: 89.3	joint_motion_config: 87.7	bone_motion_config: 86.7	91.0	91.7
NTURGB+D XSub	HRNet 2D Skeleton	8	joint_config: 92.7	bone_config: 92.8	joint_motion_config: 89.8	bone_motion_config: 90.2	93.8	94.1
NTURGB+D XView	Official 3D Skeleton	8	joint_config: 95.9	bone_config: 95.0	joint_motion_config: 94.0	bone_motion_config: 92.4	96.4	96.9
NTURGB+D XView	HRNet 2D Skeleton	8	joint_config: 97.1	bone_config: 97.1	joint_motion_config: 95.9	bone_motion_config: 95.1	97.9	98.3
NTURGB+D 120 XSub	Official 3D Skeleton	8	joint_config: 84.0	bone_config: 85.3	joint_motion_config: 82.2	bone_motion_config: 81.5	86.9	87.8
NTURGB+D 120 XSub	HRNet 2D Skeleton	8	joint_config: 85.5	bone_config: 85.0	joint_motion_config: 82.6	bone_motion_config: 82.9	86.7	87.4
NTURGB+D 120 XSet	Official 3D Skeleton	8	joint_config: 86.0	bone_config: 87.3	joint_motion_config: 82.9	bone_motion_config: 83.2	88.9	89.6
NTURGB+D 120 XSet	HRNet 2D Skeleton	8	joint_config: 88.2	bone_config: 88.9	joint_motion_config: 86.6	bone_motion_config: 86.5	90.0	90.9

Note

We use the linear-scaling learning rate (Initial LR ∝ Batch Size). If you change the training batch size, remember to change the initial LR proportionally.
For Two-Stream results, we adopt the 1 (Joint):1 (Bone) fusion. For Four-Stream results, we adopt the 2 (Joint):2 (Bone):1 (Joint Motion):1 (Bone Motion) fusion.

Training & Testing

You can use the following command to train a model.

bash tools/dist_train.sh ${CONFIG_FILE} ${NUM_GPUS} [optional arguments]
# For example: train MSG3D on NTURGB+D XSub (3D skeleton, Joint Modality) with 8 GPUs, with validation, and test the last and the best (with best validation metric) checkpoint.
bash tools/dist_train.sh configs/msg3d/msg3d_pyskl_ntu60_xsub_3dkp/j.py 8 --validate --test-last --test-best

You can use the following command to test a model.

bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${NUM_GPUS} [optional arguments]
# For example: test MSG3D on NTURGB+D XSub (3D skeleton, Joint Modality) with metrics `top_k_accuracy`, and dump the result to `result.pkl`.
bash tools/dist_test.sh configs/msg3d/msg3d_pyskl_ntu60_xsub_3dkp/j.py checkpoints/SOME_CHECKPOINT.pth 8 --eval top_k_accuracy --out result.pkl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

MSG3D

Abstract

Citation

Model Zoo

Training & Testing

Files

README.md

Latest commit

History

README.md

File metadata and controls

MSG3D

Abstract

Citation

Model Zoo

Training & Testing