简体中文 | English

# KeyPoint Detection Models

## Introduction

- The keypoint detection models in PaddleDetection closely follow the state-of-the-art algorithms, including both Top-Down and Bottom-Up methods, which can meet the different needs of users.

## Model Zoo

### COCO Dataset

| Model | Input Size | AP (COCO val) | Model Download | Config File |
| :--- | :---: | :---: | :---: | :---: |
| HigherHRNet-w32 | 512 | 67.1 | higherhrnet_hrnet_w32_512.pdparams | config |
| HigherHRNet-w32 | 640 | 68.3 | higherhrnet_hrnet_w32_640.pdparams | config |
| HigherHRNet-w32+SWAHR | 512 | 68.9 | higherhrnet_hrnet_w32_512_swahr.pdparams | config |
| HRNet-w32 | 256x192 | 76.9 | hrnet_w32_256x192.pdparams | config |
| HRNet-w32 | 384x288 | 77.8 | hrnet_w32_384x288.pdparams | config |
| HRNet-w32+DarkPose | 256x192 | 78.0 | dark_hrnet_w32_256x192.pdparams | config |
| HRNet-w32+DarkPose | 384x288 | 78.3 | dark_hrnet_w32_384x288.pdparams | config |
| WiderNaiveHRNet-18 | 256x192 | 67.6 (+DARK 68.4) | wider_naive_hrnet_18_256x192_coco.pdparams | config |
| LiteHRNet-18 | 256x192 | 66.5 | lite_hrnet_18_256x192_coco.pdparams | config |
| LiteHRNet-18 | 384x288 | 69.7 | lite_hrnet_18_384x288_coco.pdparams | config |
| LiteHRNet-30 | 256x192 | 69.4 | lite_hrnet_30_256x192_coco.pdparams | config |
| LiteHRNet-30 | 384x288 | 72.5 | lite_hrnet_30_384x288_coco.pdparams | config |

Note: The AP results of Top-Down models are based on ground-truth bounding boxes.

### MPII Dataset

| Model | Input Size | PCKh (Mean) | PCKh (Mean@0.1) | Model Download | Config File |
| :--- | :---: | :---: | :---: | :---: | :---: |
| HRNet-w32 | 256x256 | 90.6 | 38.5 | hrnet_w32_256x256_mpii.pdparams | config |

We also release PP-TinyPose, a real-time keypoint detection model optimized for mobile devices. You are welcome to try it out.

## Getting Started

### 1. Environment Installation

Please refer to the PaddleDetection Installation Guide to install PaddlePaddle and PaddleDetection correctly.

### 2. Dataset Preparation

Currently, the KeyPoint Detection Models support COCO and MPII. Please refer to Keypoint Dataset Preparation to prepare the dataset; a typical COCO layout after preparation is sketched below.
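An illustrative sketch (the exact paths are set by the dataset_dir, image_dir, and anno_path fields in each config file):

```
dataset/coco/
├── annotations/
│   ├── person_keypoints_train2017.json
│   └── person_keypoints_val2017.json
├── train2017/
└── val2017/
```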

For a description of the config files, please refer to the Keypoint Config Guide.

- Note that, when testing with detected bounding boxes in the Top-Down method, you should first obtain bbox.json from a detection model. You can directly download the detected results for COCO val2017 (from a detector with a human AP of 56.4 on the COCO val2017 dataset), put it at the root path (PaddleDetection/), and set use_gt_bbox: False in the config file, as sketched below.
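The excerpt below is a minimal, illustrative sketch of where bbox_file and use_gt_bbox sit in the EvalDataset section of a Top-Down COCO config; other fields are elided, and the exact keys may differ between config files:

```yaml
# Illustrative EvalDataset excerpt for Top-Down COCO evaluation (fields elided)
EvalDataset:
  !KeypointTopDownCocoDataset
    image_dir: val2017
    anno_path: annotations/person_keypoints_val2017.json
    dataset_dir: dataset/coco
    bbox_file: bbox.json      # the downloaded detection results
    use_gt_bbox: False        # evaluate with detected boxes instead of ground truth
```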

### 3. Training and Testing

**Training on a single GPU**

```shell
#COCO DataSet
CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml

#MPII DataSet
CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
```

**Training on multiple GPUs**

```shell
#COCO DataSet
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml

#MPII DataSet
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
```

**Evaluation**

```shell
#COCO DataSet
CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml

#MPII DataSet
CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml

#If you only need the prediction results, set --save_prediction_only; the results are saved at output/keypoints_results.json by default.
CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml --save_prediction_only
```
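To sanity-check the saved file, here is a minimal Python sketch; it assumes the default output path and a COCO-style keypoint result list, and the field names may differ between versions:

```python
import json

# Load the predictions written by --save_prediction_only (default path).
with open("output/keypoints_results.json") as f:
    results = json.load(f)

# Assuming COCO-style keypoint results: a list of dicts, each carrying
# image_id, category_id, a flat keypoints list (x, y, score per joint),
# and an overall confidence score.
print(f"number of predictions: {len(results)}")
first = results[0]
print(sorted(first.keys()))  # inspect the actual schema of one entry
```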

**Inference**

Note: Top-Down models only support inference on a cropped image containing a single person. If you want to run inference on an image with several people, please see "joint inference by detection and keypoint", or choose a Bottom-Up model instead.

```shell
CUDA_VISIBLE_DEVICES=0 python3 tools/infer.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=./output/higherhrnet_hrnet_w32_512/model_final.pdparams --infer_dir=../images/ --draw_threshold=0.5 --save_txt=True
```

**Deploy Inference**

```shell
#export models
python tools/export_model.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=output/higherhrnet_hrnet_w32_512/model_final.pdparams
```
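The exported inference model is written to output_inference/<config_name>/; for this config it would typically contain the files sketched below (shown for orientation; exact names may vary slightly between versions):

```
output_inference/higherhrnet_hrnet_w32_512/
├── infer_cfg.yml      # preprocessing and label configuration for deployment
├── model.pdmodel      # inference network structure
└── model.pdiparams    # model weights
```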

```shell
#deploy inference
#keypoint inference for a single Top-Down/Bottom-Up model. In this mode, Top-Down models only support inference on a cropped single-person image.
python deploy/python/keypoint_infer.py --model_dir=output_inference/higherhrnet_hrnet_w32_512/ --image_file=./demo/000000014439_640x640.jpg --device=gpu --threshold=0.5
python deploy/python/keypoint_infer.py --model_dir=output_inference/hrnet_w32_384x288/ --image_file=./demo/hrnet_demo.jpg --device=gpu --threshold=0.5

#joint inference by detection and keypoint for Top-Down models
python deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/ppyolo_r50vd_dcn_2x_coco/ --keypoint_model_dir=output_inference/hrnet_w32_384x288/ --video_file=../video/xxx.mp4 --device=gpu
```

**Joint inference with the Multi-Object Tracking model FairMOT**

```shell
#export FairMOT model
python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams

#joint inference with the Multi-Object Tracking model FairMOT
python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=gpu
```

Note: To export the MOT model, please refer to Here.

## Benchmark

We provide benchmarks in different runtime environments for your reference when choosing models. See Keypoint Inference Benchmark for details.

## Reference

```
@inproceedings{cheng2020bottom,
  title={HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation},
  author={Bowen Cheng and Bin Xiao and Jingdong Wang and Honghui Shi and Thomas S. Huang and Lei Zhang},
  booktitle={CVPR},
  year={2020}
}

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{wang2019deep,
  title={Deep High-Resolution Representation Learning for Visual Recognition},
  author={Wang, Jingdong and Sun, Ke and Cheng, Tianheng and Jiang, Borui and Deng, Chaorui and Zhao, Yang and Liu, Dong and Mu, Yadong and Tan, Mingkui and Wang, Xinggang and Liu, Wenyu and Xiao, Bin},
  journal={TPAMI},
  year={2019}
}

@InProceedings{Zhang_2020_CVPR,
    author = {Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
    title = {Distribution-Aware Coordinate Representation for Human Pose Estimation},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

@inproceedings{Yulitehrnet21,
  title={Lite-HRNet: A Lightweight High-Resolution Network},
  author={Yu, Changqian and Xiao, Bin and Gao, Changxin and Yuan, Lu and Zhang, Lei and Sang, Nong and Wang, Jingdong},
  booktitle={CVPR},
  year={2021}
}
```