Yangyi Huang* · Hongwei Yi* · Yuliang Xiu* · Tingting Liao · Jiaxiang Tang · Deng Cai · Justus Thies
* Equal contribution
*(Teaser video: `teaser.mp4`)*
TeCH treats image-based reconstruction as a conditional generation task, conditioning on both the input image and text descriptions derived from it. It reconstructs "lifelike" 3D clothed humans, where "lifelike" refers to 1) detailed full-body geometry, including facial features and clothing wrinkles, in both frontal and unseen regions, and 2) a high-quality texture with consistent color and intricate patterns.
Please follow the Installation Instructions to set up all the required packages.
We provide a run script at `scripts/run.sh`. Before getting started, you need to set the environment variables `CUDA_HOME` and `REPLICATE_API_TOKEN` (get your token here) in the script.
After that, you can use TeCH to create a highly detailed textured mesh of a clothed human from a single image, for example:
```bash
sh scripts/run.sh input/examples/name.img exp/examples/name
```
The results will be saved in the experiment folder `exp/examples/name`, and the textured mesh will be saved as `exp/examples/name/obj/name_texture.obj`.
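Once the run finishes, a quick way to confirm the outputs are in place (using the example paths above):

```bash
# List the generated mesh and texture files for this subject.
ls -lh exp/examples/name/obj/
```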
Note that in "Step 3", the current DreamBooth implementation requires 2×32 GB of GPU memory, while 1×32 GB is sufficient for the other steps. The entire training process for a single subject takes ~3 hours on our V100 GPUs.
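If you are unsure whether your machine meets these requirements, you can check per-GPU memory with the standard NVIDIA driver tool:

```bash
# Report each GPU's name plus its total and currently free memory.
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
```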
- Release of the evaluation protocol and results data for comparison (on the CAPE & THuman2.0 datasets).
- Switch to the diffusers implementation of DreamBooth to save training memory.
- Further improvements to efficiency and robustness.
```bibtex
@inproceedings{huang2024tech,
  title={{TeCH: Text-guided Reconstruction of Lifelike Clothed Humans}},
  author={Huang, Yangyi and Yi, Hongwei and Xiu, Yuliang and Liao, Tingting and Tang, Jiaxiang and Cai, Deng and Thies, Justus},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2024}
}
```
Kudos to all of our amazing contributors! TeCH thrives through open-source. In that spirit, we welcome all kinds of contributions from the community.
This code and model are available only for non-commercial research purposes, as defined in the LICENSE (i.e., the MIT License). Note that to use TeCH you must register for SMPL-X and agree to its license, which is not the MIT License; see https://github.com/vchoutas/smplx/blob/main/LICENSE.
This implementation is mainly built upon Stable Dreamfusion, ECON, DreamBooth-Stable-Diffusion, and Salesforce's BLIP API on Replicate.