diff --git a/README.md b/README.md
index ee061436..6ef68291 100644
--- a/README.md
+++ b/README.md
@@ -22,34 +22,27 @@ pip install -r requirements.txt
 #### Human3.6M
 1. Download and preprocess the dataset by following the instructions in [mvn/datasets/human36m_preprocessing/README.md](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/mvn/datasets/human36m_preprocessing/README.md).
-2. Place the preprocessed dataset to `data/human36m`. If you don't want to store the dataset in the directory with code, just create a soft symbolic link: `ln -s {PATH_TO_HUMAN36M_DATASET} ./data/human36m`.
-3. Download pretrained backbone's weights from [here](https://drive.google.com/open?id=1TGHBfa9LsFPVS5CH6Qkcy5Jr2QsJdPEa) and place them here: `data/pretrained/human36m/pose_resnet_4.5_pixels_human36m.pth` (ResNet-152 trained on COCO dataset and finetuned jointly on MPII and Human3.6M).
-4. If you want to train Volumetric model, you need rough estimations of the 3D skeleton both for train and val splits. You have two options:
-  - Rough 3D skeletons can be estimated by Algebraic model and placed to `data/precalculated_results/human36m/results_train.pkl` and `data/precalculated_results/human36m/results_val.pkl` respectively.
-  - Other option is to use the ground truth (GT) estimate of the 3D skeleton by setting `use_gt_pelvis: true` in a config file. Here you don't need any precalculated results, but such training mode overestimates the resulting accuracy, because pelvis is always perfectly defined.
+2. Place the preprocessed dataset in `./data/human36m`. If you don't want to store the dataset in the directory with the code, just create a symbolic link: `ln -s {PATH_TO_HUMAN36M_DATASET} ./data/human36m`.
+3. Download the pretrained backbone weights from [here](https://drive.google.com/open?id=1TGHBfa9LsFPVS5CH6Qkcy5Jr2QsJdPEa) and place them at `./data/pretrained/human36m/pose_resnet_4.5_pixels_human36m.pth` (a ResNet-152 trained on the COCO dataset and finetuned jointly on MPII and Human3.6M).
+4. If you want to train the Volumetric model, you need rough estimates of the 3D skeleton for both the train and val splits. In the paper we estimate 3D skeletons with the Algebraic model. You can use the [pretrained](#model-zoo) Algebraic model to produce predictions, or just take the [precalculated 3D skeletons](#model-zoo).
-#### CMU Panoptic
-*Will be added soon*
-
-## Train
-Every experiment is defined by `.config` files. Configs with experiments from the paper can be found in `experiments` directory (results can be found below):
+## Model zoo
+In this section we collect pretrained models and configs. All **pretrained weights** and **precalculated 3D skeletons** can be downloaded from [Google Drive](https://drive.google.com/open?id=1TGHBfa9LsFPVS5CH6Qkcy5Jr2QsJdPEa) and placed in the `./data` directory, so that the eval configs work out of the box (no additional path setup is needed).
 **Human3.6M:**
- 1. Algebraic w/o confidences — [experiments/human36m/train/human36m_alg_no_conf.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_alg_no_conf.yaml)
- 2. Algebraic w/ confidences — [experiments/human36m/train/human36m_alg.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_alg.yaml)
- 3. Volumetric (softmax aggregation) — [experiments/human36m/train/human36m_vol_softmax.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_vol_softmax.yaml)
- 4. Volumetric (softmax aggregation, GT pelvis) — [experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml)
-
- **CMU Panoptic**
-
- *Will be added soon*
+| Model                | Train config | Eval config | Weights | Precalculated results | MPJPE (relative to pelvis), mm |
+|----------------------|:-------------|:------------|:-------:|:---------------------:|-------------------------------:|
+| Algebraic            | [train/human36m_alg.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_alg.yaml) | [eval/human36m_alg.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/eval/human36m_alg.yaml) | [link](https://drive.google.com/file/d/1HAqMwH94kCfTs9jUHiuCB7vt94rMvxWe/view?usp=sharing) | [link](https://drive.google.com/drive/folders/1LCzMQswdn4UM9fbRYOZb3FmMZ7pZFyIP?usp=sharing) | 22.4 |
+| Volumetric (softmax) | [train/human36m_vol_softmax.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/train/human36m_vol_softmax.yaml) | [eval/human36m_vol_softmax.yaml](https://github.com/karfly/learnable-triangulation-pytorch/blob/master/experiments/human36m/eval/human36m_vol_softmax.yaml) | [link](https://drive.google.com/file/d/1r6Ut3oMKPxhyxRh3PZ05taaXwekhJWqj/view?usp=sharing) | — | **20.5** |
+## Train
+Every experiment is defined by a `.yaml` config file. Configs with experiments from the paper can be found in the `./experiments` directory (see [model zoo](#model-zoo)).
 #### Single-GPU
-To train a Volumetric model with softmax aggregation and GT-estimated pelvises using **1 GPU**, run:
+To train a Volumetric model with softmax aggregation using **1 GPU**, run:
 ```bash
 python3 train.py \
-  --config experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml \
+  --config experiments/human36m/train/human36m_vol_softmax.yaml \
   --logdir ./logs
 ```
@@ -58,11 +51,11 @@ The training will start with the config file specified by `--config`, and logs (
 #### Multi-GPU (*in testing*)
 Multi-GPU training is implemented with PyTorch's [DistributedDataParallel](https://pytorch.org/docs/stable/nn.html#distributeddataparallel). It can be used both for single-machine and multi-machine (cluster) training. To run the processes use the PyTorch [launch utility](https://github.com/pytorch/pytorch/blob/master/torch/distributed/launch.py).
-To train a Volumetric model with softmax aggregation and GT-estimated pelvises using **2 GPUs on single machine**, run:
+To train a Volumetric model with softmax aggregation using **2 GPUs on a single machine**, run:
 ```bash
 python3 -m torch.distributed.launch --nproc_per_node=2 --master_port=2345 \
   train.py \
-  --config experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml \
+  --config experiments/human36m/train/human36m_vol_softmax.yaml \
   --logdir ./logs
 ```
@@ -86,7 +79,7 @@ Run:
 ```bash
 python3 train.py \
   --eval --eval_dataset val \
-  --config experiments/human36m/eval/human36m_vol_softmax.yaml \
+  --config experiments/human36m/eval/human36m_vol_softmax.yaml \
   --logdir ./logs
 ```
 Argument `--eval_dataset` can be `val` or `train`. Results can be seen in `logs` directory or in the tensorboard.
@@ -111,8 +104,8 @@ MPJPE relative to pelvis:
 | Kadkhodamohammadi & Padoy [\[5\]](#references) | 49.1 |
 | [Qiu et al.](https://github.com/microsoft/multiview-human-pose-estimation-pytorch) [\[9\]](#references) | 26.2 |
 | RANSAC (our implementation) | 27.4 |
-| **Ours, algebraic** | 22.6 |
-| **Ours, volumetric** | **20.8** |
+| **Ours, algebraic** | 22.4 |
+| **Ours, volumetric** | **20.5** |
 MPJPE absolute (scenes with invalid ground-truth annotations are excluded):
@@ -190,6 +183,7 @@ Volumetric triangulation additionally improves accuracy, drastically reducing th
 - [Ivan Bulygin](https://github.com/blufzzz)
 # News
+**18 Oct 2019:** Pretrained models (algebraic and volumetric) for Human3.6M are released.
 **8 Oct 2019:** Code is released!
 # References
diff --git a/experiments/human36m/eval/human36m_alg.yaml b/experiments/human36m/eval/human36m_alg.yaml
new file mode 100644
index 00000000..9bb74fc7
--- /dev/null
+++ b/experiments/human36m/eval/human36m_alg.yaml
@@ -0,0 +1,74 @@
+title: "human36m_alg"
+kind: "human36m"
+vis_freq: 1000
+vis_n_elements: 10
+
+image_shape: [384, 384]
+
+opt:
+  criterion: "MSESmooth"
+  mse_smooth_threshold: 400
+
+  n_objects_per_epoch: 15000
+  n_epochs: 9999
+
+  batch_size: 8
+  val_batch_size: 100
+
+  lr: 0.00001
+
+  scale_keypoints_3d: 0.1
+
+model:
+  name: "alg"
+
+  init_weights: true
+  checkpoint: "./data/pretrained/human36m/human36m_alg_10-04-2019/checkpoints/0060/weights.pth"
+
+
+  use_confidences: true
+  heatmap_multiplier: 100.0
+  heatmap_softmax: true
+
+  backbone:
+    name: "resnet152"
+    style: "simple"
+
+    init_weights: true
+    checkpoint: "./data/pretrained/human36m/pose_resnet_4.5_pixels_human36m.pth"
+
+    num_joints: 17
+    num_layers: 152
+
+dataset:
+  kind: "human36m"
+
+  train:
+    h36m_root: "./data/human36m/processed"
+    labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
+    with_damaged_actions: true
+    undistort_images: true
+
+    scale_bbox: 1.0
+
+    shuffle: true
+    randomize_n_views: false
+    min_n_views: null
+    max_n_views: null
+    num_workers: 8
+
+  val:
+    h36m_root: "./data/human36m/processed"
+    labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
+    with_damaged_actions: true
+    undistort_images: true
+
+    scale_bbox: 1.0
+
+    shuffle: false
+    randomize_n_views: false
+    min_n_views: null
+    max_n_views: null
+    num_workers: 8
+
+    retain_every_n_frames_in_test: 1
diff --git a/experiments/human36m/train/debug.yaml b/experiments/human36m/eval/human36m_ransac.yaml
similarity index 64%
rename from experiments/human36m/train/debug.yaml
rename to experiments/human36m/eval/human36m_ransac.yaml
index b8696b78..e3b20882 100644
--- a/experiments/human36m/train/debug.yaml
+++ b/experiments/human36m/eval/human36m_ransac.yaml
@@ -1,4 +1,4 @@
-title: "debug"
+title: "human36m_ransac"
 kind: "human36m"
 vis_freq: 1000
 vis_n_elements: 10
@@ -6,41 +6,28 @@ vis_n_elements: 10
 image_shape: [384, 384]
 opt:
-  criterion: "MAE"
+  criterion: "MSESmooth"
+  mse_smooth_threshold: 400
-  use_volumetric_ce_loss: true
-  volumetric_ce_loss_weight: 0.01
-
-  n_objects_per_epoch: 50
+  n_objects_per_epoch: 15000
   n_epochs: 9999
-  batch_size: 5
-  val_batch_size: 10
+  batch_size: 8
+  val_batch_size: 100
-  lr: 0.0001
-  process_features_lr: 0.001
-  volume_net_lr: 0.001
+  lr: 0.00001
   scale_keypoints_3d: 0.1
 model:
-  name: "vol"
-  kind: "mpii"
-  volume_aggregation_method: "softmax"
+  name: "ransac"
   init_weights: false
   checkpoint: ""
-  use_gt_pelvis: false
-
-  cuboid_side: 2500.0
-
-  volume_size: 64
-  volume_multiplier: 1.0
-  volume_softmax: true
-
-  heatmap_softmax: true
+  direct_optimization: true
   heatmap_multiplier: 100.0
+  heatmap_softmax: true
   backbone:
     name: "resnet152"
@@ -58,8 +45,6 @@ dataset:
   train:
     h36m_root: "./data/human36m/processed"
    labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
-    pred_results_path: "./data/precalculated_results/human36m/results_train.pkl"
-
     with_damaged_actions: true
     undistort_images: true
@@ -74,8 +59,6 @@ dataset:
   val:
     h36m_root: "./data/human36m/processed"
     labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
-    pred_results_path: "./data/precalculated_results/human36m/results_val.pkl"
-
     with_damaged_actions: true
     undistort_images: true
@@ -87,4 +70,4 @@ dataset:
     max_n_views: null
     num_workers: 8
-    retain_every_n_frames_in_test: 30
+    retain_every_n_frames_in_test: 1
diff --git a/experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml b/experiments/human36m/eval/human36m_vol_softmax.yaml
similarity index 75%
rename from experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml
rename to experiments/human36m/eval/human36m_vol_softmax.yaml
index d2b7dbd8..eac00656 100644
--- a/experiments/human36m/train/human36m_vol_softmax_gtpelvis.yaml
+++ b/experiments/human36m/eval/human36m_vol_softmax.yaml
@@ -1,4 +1,4 @@
-title: "human36m_vol_softmax_gtpelvis"
+title: "human36m_vol_softmax"
 kind: "human36m"
 vis_freq: 1000
 vis_n_elements: 10
@@ -15,7 +15,7 @@ opt:
   n_epochs: 9999
   batch_size: 5
-  val_batch_size: 10
+  val_batch_size: 20
   lr: 0.0001
   process_features_lr: 0.001
@@ -28,10 +28,10 @@ model:
   kind: "mpii"
   volume_aggregation_method: "softmax"
-  init_weights: false
-  checkpoint: ""
+  init_weights: true
+  checkpoint: "./data/pretrained/human36m/human36m_vol_softmax_10-08-2019/checkpoints/0040/weights.pth"
-  use_gt_pelvis: true
+  use_gt_pelvis: false
   cuboid_side: 2500.0
@@ -58,6 +58,7 @@ dataset:
   train:
     h36m_root: "./data/human36m/processed"
     labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
+    pred_results_path: "./data/pretrained/human36m/human36m_alg_10-04-2019/checkpoints/0060/results/train.pkl"
     with_damaged_actions: true
     undistort_images: true
@@ -73,6 +74,7 @@ dataset:
   val:
     h36m_root: "./data/human36m/processed"
     labels_path: "./data/human36m/extra/human36m-multiview-labels-GTbboxes.npy"
+    pred_results_path: "./data/pretrained/human36m/human36m_alg_10-04-2019/checkpoints/0060/results/val.pkl"
     with_damaged_actions: true
     undistort_images: true
@@ -85,4 +87,4 @@ dataset:
     max_n_views: null
     num_workers: 8
-    retain_every_n_frames_in_test: 30
+    retain_every_n_frames_in_test: 1
diff --git a/experiments/human36m/train/human36m_alg.yaml b/experiments/human36m/train/human36m_alg.yaml
index a1224142..34c8f5e9 100644
--- a/experiments/human36m/train/human36m_alg.yaml
+++ b/experiments/human36m/train/human36m_alg.yaml
@@ -9,7 +9,7 @@ opt:
   criterion: "MSESmooth"
   mse_smooth_threshold: 400
-  n_objects_per_epoch: 10000
+  n_objects_per_epoch: 15000
   n_epochs: 9999
   batch_size: 8
diff --git a/mvn/datasets/human36m.py b/mvn/datasets/human36m.py
index 46adb161..89065695 100644
--- a/mvn/datasets/human36m.py
+++ b/mvn/datasets/human36m.py
@@ -180,10 +180,8 @@ def __getitem__(self, idx):
         # save sample's index
         sample['indexes'] = idx
-        try:
+        if self.keypoints_3d_pred is not None:
             sample['pred_keypoints_3d'] = self.keypoints_3d_pred[idx]
-        except AttributeError:
-            pass
         sample.default_factory = None
         return sample
@@ -270,4 +268,4 @@ def evaluate(self, keypoints_3d_predicted, split_by_subject=False, transfer_cmu_
             'per_pose_error_relative': self.evaluate_using_per_pose_error(per_pose_error_relative, split_by_subject)
         }
-        return result['per_pose_error']['Average']['Average'], result
+        return result['per_pose_error_relative']['Average']['Average'], result
diff --git a/mvn/models/pose_resnet.py b/mvn/models/pose_resnet.py
index 0c530ffc..3e2d420b 100644
--- a/mvn/models/pose_resnet.py
+++ b/mvn/models/pose_resnet.py
@@ -372,6 +372,6 @@ def get_pose_net(config, device='cuda:0'):
         print("Parameters [{}] were not inited".format(not_inited_params))
     model.load_state_dict(new_pretrained_state_dict, strict=False)
-    print("Successfully loaded pretrained weights")
+    print("Successfully loaded pretrained weights for backbone")
     return model
diff --git a/train.py b/train.py
index 6d7a7267..6fe62f62 100644
--- a/train.py
+++ b/train.py
@@ -7,6 +7,7 @@
 from collections import defaultdict
 from itertools import islice
 import pickle
+import copy
 import numpy as np
 import cv2
@@ -406,7 +407,12 @@ def main(args):
     if config.model.init_weights:
         state_dict = torch.load(config.model.checkpoint)
-        model.load_state_dict(state_dict, strict=False)
+        for key in list(state_dict.keys()):
+            new_key = key.replace("module.", "")
+            state_dict[new_key] = state_dict.pop(key)
+
+        model.load_state_dict(state_dict, strict=True)
+        print("Successfully loaded pretrained weights for whole model")
     # criterion
     criterion_class = {
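A note on the `train.py` hunk above: checkpoints saved from a model wrapped in `nn.DataParallel` or `DistributedDataParallel` store every parameter under a `module.`-prefixed key, which is why the loader now renames keys before calling `load_state_dict(..., strict=True)`. Below is a minimal, self-contained sketch of the same idea; the helper name is ours (not part of the repo), and it strips only a leading `module.` prefix, a slightly more conservative variant of the `key.replace("module.", "")` used in the diff.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from each key.

    Checkpoints saved from a model wrapped in nn.DataParallel or
    DistributedDataParallel keep parameters as 'module.<name>', so a strict
    load into the unwrapped model fails until the prefix is stripped.
    """
    return OrderedDict(
        (key[len("module."):] if key.startswith("module.") else key, value)
        for key, value in state_dict.items()
    )


# Usage sketch (the checkpoint path and `model` object are placeholders):
# state_dict = torch.load("checkpoint.pth", map_location="cpu")
# model.load_state_dict(strip_module_prefix(state_dict), strict=True)
```

With the prefix gone, `strict=True` makes genuinely missing or unexpected keys fail loudly instead of being silently skipped, which is the behaviour the updated `train.py` relies on.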