
inference failed #10

Open
wwbnjsace opened this issue Jan 17, 2025 · 9 comments

Comments

@wwbnjsace

After preparing the checkpoint and the environment, I ran `bash quick_start.sh` and got this error:

SEED EVERYTHING TO 0
Seed set to 0
Add-ons: []
Traceback (most recent call last):
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/infer.py", line 39, in infer
val_dataset = AudioDataset(
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/data/dataset.py", line 78, in init
self.build_dsp()
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/data/dataset.py", line 283, in build_dsp
self.STFT = Audio.stft.TacotronSTFT(
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/audio/stft.py", line 144, in init
self.stft_fn = STFT(filter_length, hop_length, win_length)
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/audio/stft.py", line 42, in init
fft_window = pad_center(fft_window, filter_length)
TypeError: pad_center() takes 1 positional argument but 2 were given
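For context, this `TypeError` typically comes from librosa ≥ 0.10, where `librosa.util.pad_center` made `size` keyword-only, so the old positional call `pad_center(fft_window, filter_length)` fails exactly like this. A minimal runnable sketch of the behavior, using a numpy stand-in (`pad_center_new` is a hypothetical name, not the librosa function):

```python
import numpy as np

def pad_center_new(data, *, size, axis=-1):
    """Stand-in mimicking librosa>=0.10's keyword-only `size` argument."""
    n = data.shape[axis]
    lpad = (size - n) // 2
    widths = [(0, 0)] * data.ndim
    widths[axis] = (lpad, size - n - lpad)
    return np.pad(data, widths)

fft_window = np.hanning(4)
try:
    pad_center_new(fft_window, 8)  # old positional call: same TypeError as above
    positional_ok = True
except TypeError:
    positional_ok = False

padded = pad_center_new(fft_window, size=8)  # keyword call: works
print(positional_ok, padded.shape)
```

Assuming that is the cause, the one-line fix in `stft.py` would be `fft_window = pad_center(fft_window, size=filter_length)`; alternatively, pinning `librosa<0.10` keeps the old call working.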

@kily-wmz
Collaborator

Thank you for your interest in our work! Could you provide more detailed runtime parameters and code? It would be even better if you could provide more complete error messages.

@kily-wmz
Collaborator

Perhaps you could check the versions of torch, torchaudio, and librosa.
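To check those versions quickly, a small sketch that also handles packages that are not installed:

```python
import importlib

versions = {}
for pkg in ("torch", "torchaudio", "librosa"):
    try:
        versions[pkg] = getattr(importlib.import_module(pkg), "__version__", "unknown")
    except ImportError:
        versions[pkg] = "not installed"
print(versions)
```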

@wwbnjsace
Author

Now there is a new error. I just want to generate a song from my prompt:

Waveform inference save path: ./log/latent_diffusion/quick_start/quick_start/infer_01-20-13:53_cfg_scale_3_ddim_100_n_cand_3
Plotting: Switched to EMA weights
Non-fatal Warning [dataset.py]: The wav path " " is not find in the metadata. Use empty waveform instead. This is normal in the inference process.
Warning: CLMP model normally should use text for evaluation
Plotting: Restored training weights
Traceback (most recent call last):
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1939, in generate_sample
z, c = self.get_input(
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1274, in get_input
melody_npy = np.load("MMGen/melody.npy")
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/numpy/lib/npyio.py", line 405, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'MMGen/melody.npy'

@kily-wmz
Collaborator

Thank you for your question! You need to modify the path for loading the melody_npy in ddpm.py, specifically: melody_npy = np.load("your_path/melody.npy"). You can download the corresponding weight files at https://huggingface.co/ManzhenWei/MG2/tree/main. I will add this step to the readme.
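As a sketch of that change (the `load_melody` helper and `melody_path` variable are hypothetical names, not part of the repo; the real `melody.npy` comes from the HuggingFace link above), a guarded loader fails with a clear message instead of a bare `FileNotFoundError`; the demo runs against a temporary file so it works anywhere:

```python
import os
import tempfile
import numpy as np

def load_melody(path):
    """Load melody.npy from a configurable path instead of a hard-coded one."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"melody.npy not found at '{path}'; download it from the MG2 "
            "HuggingFace repo and edit the path in ddpm.py to point at it."
        )
    return np.load(path)

# Demo against a temporary file:
with tempfile.TemporaryDirectory() as d:
    melody_path = os.path.join(d, "melody.npy")
    np.save(melody_path, np.zeros((4, 8), dtype=np.float32))
    melody_npy = load_melody(melody_path)
    print(melody_npy.shape)
```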

@wwbnjsace
Author

Thanks for your reply. Now there is a new error:

Waveform inference save path: ./log/latent_diffusion/quick_start/quick_start/infer_01-20-14:53_cfg_scale_3_ddim_100_n_cand_3
Plotting: Switched to EMA weights
Non-fatal Warning [dataset.py]: The wav path " " is not find in the metadata. Use empty waveform instead. This is normal in the inference process.
Warning: CLMP model normally should use text for evaluation
Use ddim sampler
Data shape for DDIM sampling is (3, 8, 256, 16), eta 1.0
Running DDIM Sampling with 100 timesteps
DDIM Sampler: 0%| | 0/100 [00:00<?, ?it/s]The shape of UNet input is torch.Size([3, 8, 256, 16])
DDIM Sampler: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:06<00:00, 15.21it/s]
/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
INFO: clmp model calculate the audio embedding as condition
/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torchaudio/functional/functional.py:1466: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
resampled = torch.nn.functional.conv1d(waveform[:, None], kernel, stride=orig_freq)
Warning: while calculating CLAP score (not fatal), Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor
Plotting: Restored training weights
Traceback (most recent call last):
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 2022, in generate_sample
self.save_waveform(waveform, waveform_save_path, name=fnames)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1808, in save_waveform
truncated_basename = self.truncate_filename(os.path.basename(name[i]), max_filename_length)
IndexError: list index out of range


@kily-wmz
Collaborator

The issue occurred in the save_waveform method. I have already fixed the bug in this part. Please try it again to see if there are any problems. If you still encounter errors, feel free to ask at any time.

@wwbnjsace
Author

wwbnjsace commented Jan 21, 2025

The same error is still there:

Plotting: Restored training weights
Traceback (most recent call last):
File "/music/Awesome-Music-Generation-main/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "music/Awesome-Music-Generation-main/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/klai/anaconda3/envs/MMGen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/music/Awesome-Music-Generation-main/MMGen_train/modules/latent_diffusion/ddpm.py", line 2049, in generate_sample
self.save_waveform(waveform, waveform_save_path, name=fnames)
File "/music/Awesome-Music-Generation-main/MMGen_train/modules/latent_diffusion/ddpm.py", line 1804, in save_waveform
truncated_basename = self.truncate_filename(os.path.basename(name[i]), max_filename_length)
IndexError: list index out of range

I also think the `save_waveform` code has a problem:

def save_waveform(self, waveform, savepath, name="outwav"):
    print(waveform.shape)  # e.g. (3, 1, 163872): one entry per candidate
    print(name)            # e.g. a 1-element list derived from the prompt
    for i in range(waveform.shape[0]):
        if type(name) is str:
            max_filename_length = 25
            truncated_name = self.truncate_filename(name, max_filename_length)
            path = os.path.join(
                savepath, "%s_%s_%s.wav" % (self.global_step, i, truncated_name)
            )
        elif type(name) is list:
            max_filename_length = 25
            # Fails here when len(name) < waveform.shape[0]: name has one
            # entry per prompt, but the loop runs once per candidate.
            truncated_basename = self.truncate_filename(
                os.path.basename(name[i]), max_filename_length
            )
            if ".wav" in truncated_basename:
                truncated_basename = truncated_basename.split(".")[0]
            path = os.path.join(
                savepath,
                "%s.wav" % truncated_basename,
            )
        else:
            raise NotImplementedError
        todo_waveform = waveform[i, 0]
        todo_waveform = (
            todo_waveform / np.max(np.abs(todo_waveform))
        ) * 0.8  # Normalize the energy of the generation output

        sf.write(path, todo_waveform, samplerate=self.sampling_rate)

The prints output:

(3, 1, 163872)
['A_modern_synthesizer_creating_futuristic_soundscapes.']

Why is `waveform.shape[0]` 3, so that the loop raises "IndexError: list index out of range"? Does it output three audio pieces? And why is `name` a list?
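For what it's worth, the 3 matches the `n_cand_3` in the save path and the DDIM data shape `(3, 8, 256, 16)` from the log: three candidate waveforms are generated for the single prompt, while `name` holds only the one prompt-derived filename, so `name[i]` fails for `i >= 1`. One possible guard, sketched below (the `candidate_filenames` helper is hypothetical, not the maintainers' actual fix):

```python
def candidate_filenames(names, n_candidates):
    """Expand a short name list to one filename per candidate waveform,
    suffixing the candidate index so the files don't overwrite each other."""
    if len(names) < n_candidates:
        root = names[0].rsplit(".wav", 1)[0]
        return [f"{root}_cand{i}.wav" for i in range(n_candidates)]
    return names

names = candidate_filenames(
    ["A_modern_synthesizer_creating_futuristic_soundscapes."], 3
)
print(names)
```

With this, `save_waveform` could index `names[i]` safely for every candidate instead of running off the end of the one-element list.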

@wwbnjsace
Author

Any reply? Why is `waveform.shape[0]` 3, so that it generates three audio pieces?
