An unofficial minimal package for using BigVGAN at inference time
pip install bigvganinference
or install from source:
git clone https://github.com/thunn/BigVGANInference.git
cd BigVGANInference
poetry install
Loading the model is as simple as:
import librosa
from bigvganinference import BigVGANInference
# --- model loading ---
# model is loaded, set to eval and weight norm is removed
model = BigVGANInference.from_pretrained(
'nvidia/BigVGAN-V2-44KHZ-128BAND-512X', use_cuda_kernel=False
)
# loading from a local directory is also supported
model = BigVGANInference.from_pretrained(
"path/to/local/model", use_cuda_kernel=False
)
# --- usage example ---
path_to_audio = "path/to/audio.wav"
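wav, sr = librosa.load(path_to_audio, sr=model.h.sampling_rate, mono=True)
# wav is a 1-D np.ndarray of shape [T_time]; the original BigVGAN example converts it
# to a FloatTensor of shape [B(1), T_time] first: torch.FloatTensor(wav).unsqueeze(0)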
# get mel spectrogram using bigvgan's implementation
# mel: [B(1), MEL_BANDS, T_frame]
mel = model.get_mel_spectrogram(wav)
# generate waveform from mel
# note: torch.inference_mode() is used internally
# output_audio: [B(1), 1, T_time]
output_audio = model(mel)
# get numpy array
output_audio_np = output_audio.squeeze(0).cpu().numpy()
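To listen to the result, the float waveform usually has to be written out as 16-bit PCM. The sketch below is one way to do that with only the Python standard library; it assumes the samples lie roughly in [-1, 1], and `float_to_pcm16` and `save_wav` are hypothetical helper names that are not part of this package.

```python
import sys
import wave
from array import array


def float_to_pcm16(samples):
    """Convert floats to signed 16-bit integers, clipping to [-1.0, 1.0] first."""
    pcm = array("h")
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        pcm.append(int(clipped * 32767))
    return pcm


def save_wav(path, samples, sample_rate):
    """Write a mono 16-bit PCM WAV file from a flat sequence of float samples."""
    pcm = float_to_pcm16(samples)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
```

With the variables from the usage example, a call could look like `save_wav("output.wav", output_audio_np.reshape(-1).tolist(), model.h.sampling_rate)`. Flattening with `reshape(-1)` drops the leading channel dimension so the helper receives a plain list of samples.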
See the example for a full usage example.
This is an unofficial implementation based on the original BigVGAN repository.
This project is licensed under the MIT License. See the LICENSE file for details.