Human pose forecasting, i.e., predicting the locations of human body keypoints given a sequence of observed ones, is a challenging task due to the uncertainty in human pose dynamics. Many approaches have been proposed to solve it, including Long Short-Term Memories (LSTMs) and Variational AutoEncoders (VAEs). Yet, they do not effectively predict human motions when both global trajectory and local pose movements are present. We propose to learn a representation that decouples the global trajectory and local pose forecasting tasks. We also show that it is better to stop the prediction when the uncertainty in human motion increases. Our forecasting model outperforms all methods published on the pose forecasting benchmark to date by over 20%. The code is available in this repository.
This is the official code for the paper "Learning Decoupled Representations for Human Pose Forecasting", accepted and published at ICCVW 2021.
Here, you will find the Poster, the Slides and the Video.
- Repository Structure
- Proposed Method
- Results
- Installation
- Dataset
- Training/Testing
- Tested Environments
pose-prediction                  : Project repository
├── 3dpw
│   ├── somof_data_3dpw          : SoMoF dataset for 3DPW.
│   ├── train.py                 : Script for training.
│   ├── valid.py                 : Script for validating on the validation set.
│   ├── test.py                  : Script for testing on the test set.
│   ├── DataLoader.py            : Data loader for the train and validation sets.
│   ├── DataLoader_test.py       : Data loader for the test set.
│   ├── model.py                 : Implementation of the network.
│   ├── utils.py                 : Helper functions.
│   └── viz.py                   : Script for visualization.
└── posetrack
    ├── somof_data_posetrack     : SoMoF dataset for PoseTrack.
    ├── train.py                 : Script for training.
    ├── valid.py                 : Script for validating on the validation set.
    ├── test.py                  : Script for testing.
    ├── DataLoader.py            : Data loader for the train and validation sets.
    ├── DataLoader_test.py       : Data loader for the test set.
    ├── model.py                 : Implementation of the network.
    └── utils.py                 : Helper functions.
We decouple pose forecasting into a global trajectory forecasting task and a local pose forecasting task, as shown below:
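As a rough illustration of the idea (a minimal sketch, not the paper's exact formulation), the snippet below separates each pose into a global trajectory, taken here as a hypothetical root joint, and root-relative local poses; the two streams can then be forecast independently and recombined:

```python
import numpy as np

def decouple(poses, root=0):
    """Split a pose sequence into a global trajectory and local poses.

    poses: (T, J, D) array of T frames, J joints, D coordinates.
    root:  index of a hypothetical reference joint (e.g. the hip);
           the paper's actual choice of reference point may differ.
    """
    global_traj = poses[:, root, :]               # (T, D) world-frame trajectory
    local_pose = poses - global_traj[:, None, :]  # (T, J, D) root-relative joints
    return global_traj, local_pose

def recouple(global_traj, local_pose):
    """Recombine the two forecast streams into world-frame poses."""
    return local_pose + global_traj[:, None, :]
```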
We show the observed (left) and predicted (right) poses for two different scenarios. From top to bottom, the rows correspond to DeRPoF w/o early stop and w/o decoupling. Only the pose of every other frame is shown.
Start by cloning this repository:
git clone https://github.com/vita-epfl/decoupled-pose-prediction
cd decoupled-pose-prediction
Create a virtual environment:
virtualenv myenv
source myenv/bin/activate
And install the dependencies:
pip install -r requirements.txt
- We use the preprocessed datasets of the SoMoF challenge. For convenience, the 3DPW and PoseTrack datasets are provided in the 3dpw/ and posetrack/ directories, respectively (see the loading sketch below).
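The SoMoF files are plain JSON arrays, so a minimal loading sketch might look like the following. The file names and array shapes here are assumptions based on the SoMoF release; verify them against the contents of somof_data_3dpw/:

```python
import json
import numpy as np

# Assumed file names; check somof_data_3dpw/ for the exact ones.
with open("somof_data_3dpw/3dpw_train_in.json") as f:
    obs = np.array(json.load(f))  # observed poses, e.g. (n_samples, n_persons, t_obs, n_joints * 3)
with open("somof_data_3dpw/3dpw_train_out.json") as f:
    fut = np.array(json.load(f))  # future poses to predict

print(obs.shape, fut.shape)
```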
To train the model on 3DPW:
cd 3dpw/
python train.py [--hidden_dim hidden_dim --latent_dim latent_dim --embedding_dim embedding_dim --dropout dropout --lr lr --n_epochs n_epochs --batch_size batch_size --loader_shuffle loader_shuffle --load_checkpoint load_checkpoint --dev dev]
where the above options are (see the parser sketch after this list):
- hidden_dim (int): hidden state dimension (default: 64)
- latent_dim (int): the dimension of the latent space (default: 32)
- embedding_dim (int): the embedding dimension (default: 8)
- dropout (float): dropout (default: 0.2)
- lr (float): initial learning rate (default: 0.001)
- n_epochs (int): the number of epochs (default: 100)
- batch_size (int): the size of the batch (default: 60)
- loader_shuffle (bool): if the data should be shuffled (default: True)
- load_checkpoint (bool): to start from previous saved parameters (default: False)
- dev (str): device, either 'cpu' or 'cuda' (default: 'cpu')
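These options map onto a standard argparse interface; below is a minimal sketch of the documented flags (the actual parser lives in train.py and may differ in detail):

```python
import argparse

def str2bool(s):
    # argparse's bare type=bool treats any non-empty string as True,
    # so parse boolean flags explicitly.
    return str(s).lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="Train the pose forecasting model")
parser.add_argument("--hidden_dim", type=int, default=64)
parser.add_argument("--latent_dim", type=int, default=32)
parser.add_argument("--embedding_dim", type=int, default=8)
parser.add_argument("--dropout", type=float, default=0.2)
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--n_epochs", type=int, default=100)
parser.add_argument("--batch_size", type=int, default=60)
parser.add_argument("--loader_shuffle", type=str2bool, default=True)
parser.add_argument("--load_checkpoint", type=str2bool, default=False)
parser.add_argument("--dev", type=str, default="cpu", choices=["cpu", "cuda"])
args = parser.parse_args()
```

For example, `python train.py --hidden_dim 128 --n_epochs 200 --dev cuda` would train a larger model on GPU.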
To validate the model, run:
python valid.py [--hidden_dim hidden_dim --latent_dim latent_dim --embedding_dim embedding_dim --dropout dropout --dev dev]
The output reports the VIM and VAM values and writes some visualizations to the visualization folder.
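VIM (Visibility-Ignored Metric) and VAM (Visibility-Aware Metric) are the SoMoF challenge metrics. As a simplified stand-in for the official implementation (use the SoMoF evaluation code for reported numbers), a VIM-style error is essentially the mean L2 distance between predicted and ground-truth flattened poses at each prediction horizon:

```python
import numpy as np

def vim_style_error(pred, gt):
    """Per-frame L2 distance between flattened poses, averaged over samples.

    pred, gt: (n_samples, t_pred, n_joints * dim) arrays. A simplified
    VIM-style metric; the official SoMoF evaluation also handles
    visibility masks (VAM) and reports selected horizons.
    """
    return np.linalg.norm(pred - gt, axis=-1).mean(axis=0)  # (t_pred,) error per horizon
```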
Test the trained network by running the command:
python test.py [--hidden_dim hidden_dim --latent_dim latent_dim --embedding_dim embedding_dim --dropout dropout --dev dev]
where the options are the same as for training.
- Ubuntu 18.04, CUDA 10.1
The base of the code is borrowed from bounding box prediction.
@InProceedings{parsaeifard2021decoupled,
author = {Parsaeifard, Behnam and Saadatnejad, Saeed and Liu, Yuejiang and Mordan, Taylor and Alahi, Alexandre},
title = {Learning Decoupled Representations for Human Pose Forecasting},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {October},
year = {2021},
pages = {2294-2303}
}