| title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned | license |
|---|---|---|---|---|---|---|---|---|
| Compressed Wav2Lip | 🌟 | indigo | pink | gradio | 4.13.0 | app.py | true | apache-2.0 |
Official codebase for Accelerating Speech-Driven Talking Face Generation with 28× Compressed Wav2Lip.
- Presented at ICCV'23 Demo Track; On-Device Intelligence Workshop @ MLSys'23; NVIDIA GTC 2023 Poster.
git clone https://github.com/Nota-NetsPresso/nota-wav2lip.git
cd nota-wav2lip
docker compose run --service-ports --name nota-compressed-wav2lip compressed-wav2lip bash
Alternatively, set up the environment manually with conda instead of Docker:
git clone https://github.com/Nota-NetsPresso/nota-wav2lip.git
cd nota-wav2lip
apt-get update
apt-get install ffmpeg libsm6 libxext6 tmux git -y
conda create -n nota-wav2lip python=3.9
conda activate nota-wav2lip
pip install -r requirements.txt
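Before launching anything, it can help to confirm that the system packages installed above are actually visible from the active environment. The following is a minimal sketch, not part of the repository: the helper name `check_prerequisites` is hypothetical, and the tool list simply mirrors the command-line programs named in the `apt-get install` line.

```python
import shutil
import sys


def check_prerequisites(tools=("ffmpeg", "git", "tmux")):
    """Return a dict mapping each tool name to True if it is found on PATH.

    Hypothetical helper; the default tool names are taken from the
    apt-get install command above.
    """
    return {tool: shutil.which(tool) is not None for tool in tools}


# Report which tools are visible, plus the interpreter version
# (the conda environment above is created with python=3.9).
for tool, found in check_prerequisites().items():
    print(f"{tool}: {'found' if found else 'MISSING'}")
print(f"python: {sys.version_info.major}.{sys.version_info.minor}")
```

A `MISSING` entry usually means the `apt-get install` step was skipped or was run in a different container or shell.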
Use the script below to run the nota-ai/compressed-wav2lip demo. The models and sample data are downloaded automatically.
bash app.sh
(1) Download YouTube videos in the LRS3-TED label text file and preprocess them properly.
- Download `lrs3_v0.4_txt.zip` from this link.
- Unzip the file and make a folder structure: `./data/lrs3_v0.4_txt/lrs3_v0.4/test`
- Run `bash download.sh`
- Run `bash preprocess.sh`
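The unzip step above can be sketched in Python as follows. This is an illustration only: the function name `extract_lrs3_labels` is hypothetical, and it assumes the archive contains a top-level `lrs3_v0.4/` folder so that extracting into `./data/lrs3_v0.4_txt` yields the folder structure listed above. If the archive is laid out differently, the check at the end raises an error instead of silently continuing.

```python
import zipfile
from pathlib import Path


def extract_lrs3_labels(zip_path, data_root="./data/lrs3_v0.4_txt"):
    """Extract the label archive into data_root and return the test folder.

    Assumption: the archive has a top-level lrs3_v0.4/ directory.
    """
    data_root = Path(data_root)
    data_root.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_root)
    # Verify the layout that the later steps expect.
    test_dir = data_root / "lrs3_v0.4" / "test"
    if not test_dir.is_dir():
        raise FileNotFoundError(f"expected folder not found: {test_dir}")
    return test_dir
```

Calling `extract_lrs3_labels("lrs3_v0.4_txt.zip")` from the repository root would then return the path `data/lrs3_v0.4_txt/lrs3_v0.4/test` if the assumption about the archive layout holds.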
(2) Run the script to compare the original Wav2Lip with Nota's compressed version.
bash inference.sh
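The scripts in steps (1) and (2) are meant to run in sequence, each depending on the output of the previous one. A small driver like the one below could chain them and stop at the first failure. It is a sketch under assumptions: `run_pipeline` and its `runner` parameter are hypothetical names, and the script list only repeats the three scripts named above in the order they are listed.

```python
import subprocess

# Script names are taken from the steps above, in the order they are listed.
PIPELINE = ("download.sh", "preprocess.sh", "inference.sh")


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script with bash, in order, stopping at the first failure.

    Returns the list of scripts that completed successfully.
    """
    completed = []
    for script in scripts:
        # check=True makes subprocess.run raise CalledProcessError
        # when the script exits with a non-zero status.
        runner(["bash", script], check=True)
        completed.append(script)
    return completed
```

Run `run_pipeline()` from the repository root after the label files are in place. The `runner` argument exists only so the sequencing can be exercised without executing the real scripts.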
- All rights related to this repository and the compressed models are reserved by Nota Inc.
- The intended use is strictly limited to research and non-commercial projects.
- To obtain compression code and assistance, kindly contact Nota AI ([email protected]). These are provided as part of our business solutions.
- For Q&A about this repo, use this board: Nota-NetsPresso/discussions
- We thank the NVIDIA Applied Research Accelerator Program for supporting this research.
- We thank Wav2Lip and LRS3-TED for facilitating the development of the original Wav2Lip.
@article{kim2023unified,
title={A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation},
author={Kim, Bo-Kyeong and Kang, Jaemin and Seo, Daeun and Park, Hancheol and Choi, Shinkook and Song, Hyoung-Kyu and Kim, Hyungshin and Lim, Sungsu},
journal={MLSys Workshop on On-Device Intelligence (ODIW)},
year={2023},
url={https://arxiv.org/abs/2304.00471}
}