
inference failed #10

Open
wwbnjsace opened this issue Jan 17, 2025 · 9 comments

Comments

@wwbnjsace

After preparing the checkpoint and the environment, I ran `bash quick_start.sh` and got this error:

SEED EVERYTHING TO 0
Seed set to 0
Add-ons: []
Traceback (most recent call last):
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/infer.py", line 39, in infer
val_dataset = AudioDataset(
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/data/dataset.py", line 78, in init
self.build_dsp()
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/data/dataset.py", line 283, in build_dsp
self.STFT = Audio.stft.TacotronSTFT(
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/audio/stft.py", line 144, in init
self.stft_fn = STFT(filter_length, hop_length, win_length)
File "/data/tts/music/Awesome-Music-Generation/MMGen_train/utilities/audio/stft.py", line 42, in init
fft_window = pad_center(fft_window, filter_length)
TypeError: pad_center() takes 1 positional argument but 2 were given
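For context, this `TypeError` typically comes from librosa ≥ 0.10, where `librosa.util.pad_center` made `size` keyword-only, so the old positional call `pad_center(fft_window, filter_length)` fails exactly like this. A minimal runnable sketch of the behavior, using a numpy stand-in (`pad_center_new` is a hypothetical name, not the librosa function):

```python
import numpy as np

def pad_center_new(data, *, size, axis=-1):
    """Stand-in mimicking librosa>=0.10's keyword-only `size` argument."""
    n = data.shape[axis]
    lpad = (size - n) // 2
    widths = [(0, 0)] * data.ndim
    widths[axis] = (lpad, size - n - lpad)
    return np.pad(data, widths)

fft_window = np.hanning(4)
try:
    pad_center_new(fft_window, 8)  # old positional call: same TypeError as above
    positional_ok = True
except TypeError:
    positional_ok = False

padded = pad_center_new(fft_window, size=8)  # keyword call: works
print(positional_ok, padded.shape)
```

Assuming that is the cause, the one-line fix in `stft.py` would be `fft_window = pad_center(fft_window, size=filter_length)`; alternatively, pinning `librosa<0.10` keeps the old call working.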

@kily-wmz
Collaborator

Thank you for your interest in our work! Could you provide more detailed runtime parameters and code? It would be even better if you could provide more complete error messages.

@kily-wmz
Collaborator

Perhaps you could check the versions of torch, torchaudio, and librosa.
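To check those versions quickly, a small sketch that also handles packages that are not installed:

```python
import importlib

versions = {}
for pkg in ("torch", "torchaudio", "librosa"):
    try:
        versions[pkg] = getattr(importlib.import_module(pkg), "__version__", "unknown")
    except ImportError:
        versions[pkg] = "not installed"
print(versions)
```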

@wwbnjsace
Author

Now there is a new error. I just want to generate a song from my prompt:

Waveform inference save path: ./log/latent_diffusion/quick_start/quick_start/infer_01-20-13:53_cfg_scale_3_ddim_100_n_cand_3
Plotting: Switched to EMA weights
Non-fatal Warning [dataset.py]: The wav path " " is not find in the metadata. Use empty waveform instead. This is normal in the inference process.
Warning: CLMP model normally should use text for evaluation
Plotting: Restored training weights
Traceback (most recent call last):
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1939, in generate_sample
z, c = self.get_input(
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1274, in get_input
melody_npy = np.load("MMGen/melody.npy")
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/numpy/lib/npyio.py", line 405, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'MMGen/melody.npy'

@kily-wmz
Collaborator

Thank you for your question! You need to modify the path for loading the melody_npy in ddpm.py, specifically: melody_npy = np.load("your_path/melody.npy"). You can download the corresponding weight files at https://huggingface.co/ManzhenWei/MG2/tree/main. I will add this step to the readme.
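As a sketch of that change (the `load_melody` helper and `melody_path` variable are hypothetical names, not part of the repo; the real `melody.npy` comes from the HuggingFace link above), a guarded loader fails with a clear message instead of a bare `FileNotFoundError`; the demo runs against a temporary file so it works anywhere:

```python
import os
import tempfile
import numpy as np

def load_melody(path):
    """Load melody.npy from a configurable path instead of a hard-coded one."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"melody.npy not found at '{path}'; download it from the MG2 "
            "HuggingFace repo and edit the path in ddpm.py to point at it."
        )
    return np.load(path)

# Demo against a temporary file:
with tempfile.TemporaryDirectory() as d:
    melody_path = os.path.join(d, "melody.npy")
    np.save(melody_path, np.zeros((4, 8), dtype=np.float32))
    melody_npy = load_melody(melody_path)
    print(melody_npy.shape)
```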

@wwbnjsace
Author

Thanks for your reply. Now there is a new error:

Waveform inference save path: ./log/latent_diffusion/quick_start/quick_start/infer_01-20-14:53_cfg_scale_3_ddim_100_n_cand_3
Plotting: Switched to EMA weights
Non-fatal Warning [dataset.py]: The wav path " " is not find in the metadata. Use empty waveform instead. This is normal in the inference process.
Warning: CLMP model normally should use text for evaluation
Use ddim sampler
Data shape for DDIM sampling is (3, 8, 256, 16), eta 1.0
Running DDIM Sampling with 100 timesteps
DDIM Sampler: 0%| | 0/100 [00:00<?, ?it/s]The shape of UNet input is torch.Size([3, 8, 256, 16])
DDIM Sampler: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:06<00:00, 15.21it/s]
/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
INFO: clmp model calculate the audio embedding as condition
/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torchaudio/functional/functional.py:1466: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
resampled = torch.nn.functional.conv1d(waveform[:, None], kernel, stride=orig_freq)
Warning: while calculating CLAP score (not fatal), Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor
Plotting: Restored training weights
Traceback (most recent call last):
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/data0/anaconda3/envs/MMGen_quickstart/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 2022, in generate_sample
self.save_waveform(waveform, waveform_save_path, name=fnames)
File "/data3/projects/ns/Awesome-Music-Generation/MMGen_train/modules/latent_diffusion/ddpm.py", line 1808, in save_waveform
truncated_basename = self.truncate_filename(os.path.basename(name[i]), max_filename_length)
IndexError: list index out of range


@kily-wmz
Collaborator

The issue occurred in the save_waveform method. I have already fixed the bug in this part. Please try it again to see if there are any problems. If you still encounter errors, feel free to ask at any time.

@wwbnjsace
Author

wwbnjsace commented Jan 21, 2025

The same error is still there:

Plotting: Restored training weights
Traceback (most recent call last):
File "/music/Awesome-Music-Generation-main/MMGen_train/infer.py", line 141, in
infer(dataset_json, config_yaml, config_yaml_path, exp_group_name, exp_name)
File "music/Awesome-Music-Generation-main/MMGen_train/infer.py", line 91, in infer
latent_diffusion.generate_sample(
File "/klai/anaconda3/envs/MMGen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/music/Awesome-Music-Generation-main/MMGen_train/modules/latent_diffusion/ddpm.py", line 2049, in generate_sample
self.save_waveform(waveform, waveform_save_path, name=fnames)
File "/music/Awesome-Music-Generation-main/MMGen_train/modules/latent_diffusion/ddpm.py", line 1804, in save_waveform
truncated_basename = self.truncate_filename(os.path.basename(name[i]), max_filename_length)
IndexError: list index out of range

I also think the `save_waveform` code has a problem:

def save_waveform(self, waveform, savepath, name="outwav"):
    print(waveform.shape)  # e.g. (3, 1, 163872): one entry per candidate
    print(name)            # e.g. a 1-element list derived from the prompt
    for i in range(waveform.shape[0]):
        if type(name) is str:
            max_filename_length = 25
            truncated_name = self.truncate_filename(name, max_filename_length)
            path = os.path.join(
                savepath, "%s_%s_%s.wav" % (self.global_step, i, truncated_name)
            )
        elif type(name) is list:
            max_filename_length = 25
            # Fails here when len(name) < waveform.shape[0]: name has one
            # entry per prompt, but the loop runs once per candidate.
            truncated_basename = self.truncate_filename(
                os.path.basename(name[i]), max_filename_length
            )
            if ".wav" in truncated_basename:
                truncated_basename = truncated_basename.split(".")[0]
            path = os.path.join(
                savepath,
                "%s.wav" % truncated_basename,
            )
        else:
            raise NotImplementedError
        todo_waveform = waveform[i, 0]
        todo_waveform = (
            todo_waveform / np.max(np.abs(todo_waveform))
        ) * 0.8  # Normalize the energy of the generation output

        sf.write(path, todo_waveform, samplerate=self.sampling_rate)

The prints output:

(3, 1, 163872)
['A_modern_synthesizer_creating_futuristic_soundscapes.']

Why is `waveform.shape[0]` 3, so that the loop raises "IndexError: list index out of range"? Does it output three audio pieces? And why is `name` a list?
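For what it's worth, the 3 matches the `n_cand_3` in the save path and the DDIM data shape `(3, 8, 256, 16)` from the log: three candidate waveforms are generated for the single prompt, while `name` holds only the one prompt-derived filename, so `name[i]` fails for `i >= 1`. One possible guard, sketched below (the `candidate_filenames` helper is hypothetical, not the maintainers' actual fix):

```python
def candidate_filenames(names, n_candidates):
    """Expand a short name list to one filename per candidate waveform,
    suffixing the candidate index so the files don't overwrite each other."""
    if len(names) < n_candidates:
        root = names[0].rsplit(".wav", 1)[0]
        return [f"{root}_cand{i}.wav" for i in range(n_candidates)]
    return names

names = candidate_filenames(
    ["A_modern_synthesizer_creating_futuristic_soundscapes."], 3
)
print(names)
```

With this, `save_waveform` could index `names[i]` safely for every candidate instead of running off the end of the one-element list.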

@wwbnjsace
Author

Any reply? Why is `waveform.shape[0]` 3, so that it generates three audio pieces?
