H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation
Yanjie Ze · Yuyao Liu* · Ruizhe Shi* · Jiaxin Qin · Zhecheng Yuan · Jiashun Wang · Xiaolong Wang · Huazhe Xu
Project Page | arXiv | Twitter
H-InDex is a visual reinforcement learning framework that leverages hand-informed representations to learn dexterous manipulation skills efficiently. H-InDex consists of three stages: pre-training, offline adaptation, and reinforcement learning. This repo provides all three stages, together with the pre-trained checkpoint and the adapted checkpoints.
We also encourage users to apply our pre-trained representations directly to their own downstream tasks.
To benchmark our method, we also provide several strong baselines in this repo, including VC-1, MVP, R3M, and RRL.
Enjoy Dexterity!
See INSTALL.md for installation instructions. It also lists solutions to some common errors.
Feel free to post an issue if you have any questions.
We use wandb to log the training process. Remember to set up your wandb account before training by running wandb login. You can also disable wandb by setting use_wandb=0 in our script.
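The logging switch described above can be pictured with a small sketch. This is not the repo's code: the helper names wandb_mode and init_logging and the project name are hypothetical, and how the scripts actually consume use_wandb is an assumption. The only library call used is wandb.init with its mode argument.

```python
from typing import Any, Optional


def wandb_mode(use_wandb: int) -> str:
    """Map a 0/1 use_wandb flag to a wandb mode string (hypothetical helper)."""
    return "online" if int(use_wandb) else "disabled"


def init_logging(use_wandb: int, project: str = "hindex-demo") -> Optional[Any]:
    """Start a wandb run when logging is enabled, otherwise return None.

    The project name is a placeholder, not one used by the repo.
    """
    mode = wandb_mode(use_wandb)
    if mode == "disabled":
        # Skip wandb entirely, so it does not need to be installed or logged in.
        return None
    import wandb  # imported lazily; requires a prior `wandb login`

    return wandb.init(project=project, mode=mode)


if __name__ == "__main__":
    print(wandb_mode(0))
    print(init_logging(0))
```

With use_wandb=0 the sketch never imports wandb, so a run can proceed on a machine where no account has been configured.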
Given a task name task_name, you can run the following pipeline.
- Stage 1: Human Hand Pretraining.
  - Download the pre-trained hand representation from FrankMocap with this command:
    wget https://dl.fbaipublicfiles.com/eft/fairmocap_data/hand_module/checkpoints_best/pose_shape_best.pth -O archive/frankmocap_hand.pth --no-check-certificate
- Stage 2: Offline Adaptation.
  - First, download the initial model weights for Stage 2 from here and put them under stage2_adapt/.
  - Second, generate the image dataset for offline adaptation. See scripts/adroit/gen_img_dataset.sh or scripts/dexmv/gen_img_dataset.sh for details. An example:
    bash scripts/adroit/gen_img_dataset.sh hammer
  - Third, adapt the affine transformation in the pre-trained model. See scripts/train_stage2.sh for details. An example:
    bash scripts/train_stage2.sh hammer-v0
  - For convenience, we also provide the adapted checkpoints for all tasks. You can download them from here and put them under the archive/ folder.
- Stage 3: Reinforcement Learning.
  - Train RL agents with the pre-trained representations. See scripts/adroit/train.sh or scripts/dexmv/train.sh for details. An example:
    bash scripts/adroit/train.sh hammer hindex test 0 0
    The arguments are, in order: task name, representation name, experiment name, seed, and GPU id.
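The Stage 2 and Stage 3 commands above can also be assembled programmatically, for instance to sweep over seeds. The following is a minimal sketch rather than part of the repo: the helper names are hypothetical, it covers only the Adroit tasks, and it assumes the Stage 2 training script takes the task name with a -v0 suffix for every Adroit task, as in the hammer-v0 example.

```python
import shlex
from typing import Iterable, List

ADROIT_TASKS = ("pen", "door", "hammer")


def stage2_commands(task: str) -> List[List[str]]:
    """Commands for dataset generation and offline adaptation (Adroit only)."""
    if task not in ADROIT_TASKS:
        raise ValueError(f"unknown Adroit task: {task!r}")
    return [
        ["bash", "scripts/adroit/gen_img_dataset.sh", task],
        # Assumption: every Adroit task uses the "-v0" suffix here.
        ["bash", "scripts/train_stage2.sh", f"{task}-v0"],
    ]


def stage3_command(task: str, representation: str = "hindex",
                   exp_name: str = "test", seed: int = 0,
                   gpu_id: int = 0) -> List[str]:
    """Argument order: task, representation, experiment name, seed, GPU id."""
    return ["bash", "scripts/adroit/train.sh", task, representation,
            exp_name, str(seed), str(gpu_id)]


def pipeline(task: str, seeds: Iterable[int]) -> List[List[str]]:
    """Stage 2 once, then one Stage 3 training command per seed."""
    return stage2_commands(task) + [stage3_command(task, seed=s) for s in seeds]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in pipeline("hammer", seeds=[0, 1, 2]):
        print(shlex.join(cmd))
```

To execute the commands rather than print them, each list could be passed to subprocess.run(cmd, check=True) from the repo root.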
We provide 12 dexterous manipulation tasks in total:
- Adroit (3): pen, door, hammer
- DexMV (9): pour, place_inside, relocate-mug, relocate-foam_brick, relocate-large_clamp, relocate-mustard_bottle, relocate-potted_meat_can, relocate-sugar_box, relocate-tomato_soup_can
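Because the training and dataset scripts live in per-benchmark folders, it can help to look up which benchmark a task belongs to. The sketch below is illustrative only: the TASKS table copies the names listed above, while suite_of and train_script are hypothetical helpers, and it is an assumption that the scripts accept the task names exactly as written here.

```python
from typing import Dict, List

# Task names copied from the list above, grouped by benchmark.
TASKS: Dict[str, List[str]] = {
    "adroit": ["pen", "door", "hammer"],
    "dexmv": [
        "pour",
        "place_inside",
        "relocate-mug",
        "relocate-foam_brick",
        "relocate-large_clamp",
        "relocate-mustard_bottle",
        "relocate-potted_meat_can",
        "relocate-sugar_box",
        "relocate-tomato_soup_can",
    ],
}


def suite_of(task: str) -> str:
    """Return the benchmark ("adroit" or "dexmv") that contains the task."""
    for suite, names in TASKS.items():
        if task in names:
            return suite
    raise KeyError(f"unknown task: {task!r}")


def train_script(task: str) -> str:
    """Path of the Stage 3 training script for the task's benchmark."""
    return f"scripts/{suite_of(task)}/train.sh"


if __name__ == "__main__":
    print(sum(len(names) for names in TASKS.values()))
    print(train_script("relocate-mug"))
```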
Our work builds on many open-source projects. The algorithms are mainly built upon RRL and TTP. The simulation environments are from DAPG and DexMV. The pre-trained hand representation is from FrankMocap. The baselines are from RRL, MVP, R3M, and VC-1. We thank all of these authors for open-sourcing their code and for their great contributions to the community.
H-InDex is licensed under the MIT license. See the LICENSE file for details.
If you find our work useful, please consider citing:
@article{Ze2023HInDex,
title={H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation},
author={Yanjie Ze and Yuyao Liu and Ruizhe Shi and Jiaxin Qin and Zhecheng Yuan and Jiashun Wang and Xiaolong Wang and Huazhe Xu},
journal={NeurIPS},
year={2023},
}