- [2024/04/23] Initial release.
NOTE: PEGASUS was tested on Ubuntu 20.04 with CUDA 11.8. All experiments were conducted on eight RTX A6000 GPUs. Training can be significantly slower in environments that do not support multi-GPU setups.
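Before training, it may help to confirm that PyTorch can actually see your GPUs. A minimal sanity check (our addition, not part of the repository) might look like this:

```python
# Minimal environment sanity check (not part of the PEGASUS codebase):
# confirms CUDA is usable and reports how many GPUs PyTorch can see.
import torch

print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())  # the paper's experiments used eight RTX A6000s
```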
We need many modified open-source modules, so please create a directory for them (e.g., $HOME/GitHub):
mkdir -p $HOME/GitHub/
cd $HOME/GitHub/
git clone https://github.com/snuvclab/pegasus.git
cd ./scripts
sudo chmod a+x ./install_conda.sh
./install_conda.sh
A Dockerfile will be provided soon.
NOTE: The preprocessing steps and the files that need to be downloaded depend heavily on the preprocessing instructions of IMavatar.
- Download the FLAME pkl and sam_vit_h_4b8939.pth. Please register on the FLAME website first.
cd ./scripts
./download_data.sh
- Download deca_model.tar and put it into
./preprocess/DECA/data
- Download modnet_webcam_portrait_matting.ckpt and put it into
./preprocess/MODNet/pretrained/
- Download 79999_iter.pth and put it into
./preprocess/face-parsing.PyTorch/res/cp/
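Once the downloads are done, a small optional check (our addition, assuming the paths above and that you run it from the repository root) can confirm each checkpoint landed in the right place:

```python
# Optional check (ours, not part of the repo): confirm each downloaded
# checkpoint is where the preprocessing code expects it. Run from the repo root.
from pathlib import Path

expected = [
    "preprocess/DECA/data/deca_model.tar",
    "preprocess/MODNet/pretrained/modnet_webcam_portrait_matting.ckpt",
    "preprocess/face-parsing.PyTorch/res/cp/79999_iter.pth",
]
for path in expected:
    print(("OK      " if Path(path).exists() else "MISSING ") + path)
```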
Currently, we cannot release the pretrained DB avatar and the monocular video database. We recommend the following sources instead:
- Weird: We highly recommend this YouTube channel. Most of our $V^{db}$ content is sourced from it.
- Syuka World: Highly recommended for a diverse range of hat datasets.
- CelebV-HQ: We did not use this dataset for our paper, but it contains high-quality monocular videos. We are concerned that it lacks a variety of head poses, so please choose cautiously.
There are certain conditions for generating the synthetic database:
- We recommend using at least 100 processed monocular videos to generate a synthetic database.
- Exclude any frames with occlusions from the videos. We leverage FrankMocap to detect hands and YOLOv5 to detect objects; see the sketch after this list.
- All of the videos should be cropped to $512\times512$. We plan to release preprocessing code that automatically crops and excludes noisy frames.
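Until that preprocessing code is released, here is a minimal sketch of the filtering idea. It is our illustration, not the repository's code: the input/output paths and confidence threshold are assumptions, treating any confident non-person YOLOv5 detection as an occlusion is a heuristic of ours, and FrankMocap-based hand filtering is omitted.

```python
# Sketch (ours): center-crop frames to 512x512 and drop frames where YOLOv5
# reports a confident non-person detection, as a rough occlusion filter.
import os

import cv2
import torch

VIDEO_IN = "data/datasets/original_db/filename.mp4"  # hypothetical input path

# Load a small YOLOv5 model from Torch Hub (assumption: the small model suffices).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

def center_crop_512(frame):
    """Center-crop a frame to 512x512 (assumes both sides are >= 512)."""
    h, w = frame.shape[:2]
    top, left = (h - 512) // 2, (w - 512) // 2
    return frame[top:top + 512, left:left + 512]

os.makedirs("frames", exist_ok=True)  # hypothetical output layout
cap = cv2.VideoCapture(VIDEO_IN)
kept, frame_idx = 0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    crop = center_crop_512(frame)
    # YOLOv5 expects RGB; OpenCV decodes to BGR.
    det = model(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)).pandas().xyxy[0]
    # Heuristic (ours): any confident non-person detection counts as an occlusion.
    occluded = ((det["name"] != "person") & (det["confidence"] > 0.5)).any()
    if not occluded:
        cv2.imwrite(f"frames/{frame_idx:05d}.png", crop)
        kept += 1
    frame_idx += 1
cap.release()
print(f"kept {kept} occlusion-free 512x512 frames")
```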
NOTE: We largely follow IMavatar's structure for datasets and training checkpoints.
mkdir -p ./data
mkdir -p ./data/datasets
mkdir -p ./data/experiments
- Set the video file name as filename.mp4
- Save the video to ./data/datasets/original_db/filename.mp4
- Please run the script below to create the monocular video database $V^{db}$. Be sure to edit the preferences at the top of the script.
sudo chmod a+x ./preprocess/*.sh
./preprocess/1_initial_original_db.sh
- Generate the DB avatar using $V^{db}$. This script includes rendering.
sudo chmod a+x ./run/db_avatar.sh
./run/db_avatar.sh
- Generate the synthetic database.
./preprocess/2_synthesis_eyebrows.sh
./preprocess/2_synthesis_eyes.sh
./preprocess/2_synthesis_hair.sh
./preprocess/2_synthesis_hat.sh
./preprocess/2_synthesis_mouth.sh
./preprocess/2_synthesis_nose.sh
./preprocess/3_source.sh
./run/train.sh
We plan to release the pretrained model soon.
./run/test.sh
The code is available only for non-commercial research purposes.
Our project is built on PointAvatar. We sincerely thank the authors of the following for their amazing work and code:
- PointAvatar
- IMavatar
- Face Parsing
- DECA
- FLAME
If you find our code useful, please cite our paper:
@InProceedings{Cha_2024_CVPR,
    author    = {Cha, Hyunsoo and Kim, Byungjun and Joo, Hanbyul},
    title     = {PEGASUS: Personalized Generative 3D Avatars with Composable Attributes},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1072-1081}
}