Code repository for the paper:
Appearance Consensus Driven Self-Supervised Human Mesh Recovery
Jogendra N Kundu*, Mugalodi Rakesh*, Varun Jampani, Rahul M V, R. Venkatesh Babu
ECCV 2020
[paper] [project page]
Clone the repo or download it as a zip from the GitHub GUI.
git clone https://github.com/rakeshramesha/SS_Human_Mesh.git
We have tested the full pipeline on Linux with python2, so we suggest creating a python2 virtual environment and installing the required pip packages as follows:
mkdir ./ss_h_mesh_venv
python -m virtualenv ./ss_h_mesh_venv
source ./ss_h_mesh_venv/bin/activate
pip install -U pip
pip install -r requirements.txt
The following external packages are required to run the full pipeline:
- Dirt-renderer: Install it from the original repo here or from our fork here. Installation instructions can be found on the respective repo pages.
- SMPL model: Download the neutral SMPL model from here and place it in the assets folder.
cp <path_to_smplify>/code/models/basicModel_neutral_lbs_10_207_0_v1.0.0.pkl assets/neutral_smpl.pkl
Download the pre-trained model weights from here, extract them, and place them in the weights folder. Check that the weights path matches the path in config.py.
tar -xvf <path_to_downloaded_file> -C ./weights/
Images should be properly cropped: the person bounding box should be centered in the image and scaled so that its longer dimension is roughly 180px-200px. A single, unoccluded person with the full body visible (not truncated) yields the best overlays and coloured mesh.
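The cropping rule above can be sketched as a small helper. This is an illustration only, not code from this repo: the function name `crop_window` and the `out_size=224` default are assumptions, so check the repo's own configuration for the actual network input size. Given a person bounding box, it returns the resize factor that brings the longer bbox side to about 190px, plus the square source window centered on the person.

```python
from __future__ import division, print_function


def crop_window(bbox, target_long_side=190.0, out_size=224):
    """Return (scale, (x0, y0, x1, y1)) for a square, person-centered crop.

    bbox: (x_min, y_min, x_max, y_max) in pixels of the original image.
    target_long_side: desired length of the longer bbox side after resizing
        (the text above suggests roughly 180-200 px).
    out_size: side of the final square crop in pixels. ASSUMPTION: 224 is a
        placeholder, not a value taken from this repo.
    """
    x_min, y_min, x_max, y_max = bbox
    long_side = max(x_max - x_min, y_max - y_min)
    if long_side <= 0:
        raise ValueError("bbox must have a positive extent")

    # Factor by which the image must be resized so the bbox reaches the target.
    scale = target_long_side / long_side

    # Bounding box center, which becomes the center of the crop.
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Half side of the square window, measured in original-image pixels.
    half = out_size / scale / 2.0
    return scale, (cx - half, cy - half, cx + half, cy + half)


scale, window = crop_window((100, 50, 300, 450))
print("resize factor:", scale)
print("source window (may extend past the image; pad if so):", window)
```

Cropping the returned window and resizing it to `out_size` leaves the person centered with the longer bbox side near the target length. The window can extend beyond the image borders, in which case the image needs padding first.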
There are 4 ways to run our inference code:
- Bounding box as a JSON file along with the image. The bbox is used internally to obtain a proper crop of the image.
python demo.py --img_path <path_to_img> --bbox <path_to_bbox_json>
- OpenPose/CenterTrack detection JSON file along with the image. The 2D joint (J2D) detections are used to obtain a proper crop of the image.
python demo.py --img_path <path_to_img> --j2d_det <path_to_j2d_json>
- Direct single-image inference. Note: A proper crop (as mentioned above) is assumed.
python demo.py --img_path <path_to_img>
- Direct webcam inference. The person is assumed to be at the center of the feed. All renderings are performed in real time, including the colored mesh and mesh overlays. Note: Although we provide video inference code, we highly recommend using a person detector to feed properly cropped images to the network. Also note that our model is not trained on video data, so it might exhibit flickering artifacts.
python demo.py --webcam <cam_id>
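For readers curious how 2D joint detections can yield a crop, here is a minimal sketch that derives a person bounding box from an OpenPose-style detection. It assumes the standard OpenPose JSON layout (`people[i]["pose_keypoints_2d"]` as a flat list of x, y, confidence triples). The helper name `bbox_from_openpose` and the `min_conf` threshold are hypothetical, and the exact JSON layout that demo.py expects, as well as how it handles CenterTrack output, is not stated here.

```python
from __future__ import division, print_function


def bbox_from_openpose(detection, min_conf=0.1):
    """Return [x_min, y_min, x_max, y_max] for the first detected person.

    detection: dict parsed from an OpenPose-style JSON file.
    min_conf: keypoints at or below this confidence are ignored
        (hypothetical threshold, chosen for illustration).
    Returns None when no person or no confident keypoint is found.
    """
    people = detection.get("people", [])
    if not people:
        return None

    flat = people[0]["pose_keypoints_2d"]  # [x0, y0, c0, x1, y1, c1, ...]
    xs, ys = [], []
    for i in range(0, len(flat) - 2, 3):
        x, y, conf = flat[i], flat[i + 1], flat[i + 2]
        if conf > min_conf:
            xs.append(x)
            ys.append(y)

    if not xs:
        return None
    return [min(xs), min(ys), max(xs), max(ys)]


# Tiny in-memory example; a real file would be read with json.load().
sample = {"people": [{"pose_keypoints_2d": [120.0, 80.0, 0.9,
                                            200.0, 310.0, 0.8,
                                            0.0, 0.0, 0.0]}]}
print(bbox_from_openpose(sample))
```

Joint detections rarely cover the top of the head or the tips of the hands and feet, so a box obtained this way is usually enlarged by a margin before cropping.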
If you find our work helpful in your research, please cite the following paper:
@Inproceedings{kundu_human_mesh,
Title = {Appearance Consensus Driven Self-Supervised Human Mesh Recovery},
Author = {Kundu, Jogendra Nath and Rakesh, Mugalodi and Jampani, Varun and Venkatesh, Rahul M and Babu, R. Venkatesh},
Booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
Year = {2020}
}