
torch.stft() signature has been updated for PyTorch 1.7+ Please update PyTorch to remain compatible with later versions of NeMo. #2780

Closed
briebe opened this issue Sep 6, 2021 · 6 comments
Labels: bug (Something isn't working)

Comments

briebe commented Sep 6, 2021

Describe the bug

[NeMo W 2021-09-06 11:58:47 patch_utils:50] torch.stft() signature has been updated for PyTorch 1.7+
Please update PyTorch to remain compatible with later versions of NeMo.

and it is followed by:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in _pad(input, pad, mode, value)
4157 assert len(pad) == 2, "3D tensors expect 2 values for padding"
4158 if mode == "reflect":
-> 4159 return torch._C._nn.reflection_pad1d(input, pad)
4160 elif mode == "replicate":
4161 return torch._C._nn.replication_pad1d(input, pad)

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (256, 256) at dimension 2 of input [1, 2, 2]
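For reference, here is a minimal sketch (my own, not from the notebook) that reproduces the same padding error with an input as short as the one in the traceback:

import torch
import torch.nn.functional as F

# Hypothetical reproduction: reflect padding requires the padding size to be
# smaller than the corresponding input dimension, so a signal with only 2
# samples (input shape [1, 2, 2] above) cannot take a (256, 256) pad.
x = torch.randn(1, 2, 2)
F.pad(x, (256, 256), mode="reflect")  # raises the same RuntimeError

In other words, the immediate crash suggests an audio tensor that is far too short for the STFT's reflect padding, independent of the torch.stft() signature warning.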

Also in this notebook, besides the "AN4 source not available" problem:

Original Cell: restored_model.setup_finetune_model(config.model)

TypeError Traceback (most recent call last)

in ()
----> 1 restored_model.setup_finetune_model(config.model)

If I change it to
Cell: restored_model.setup_finetune_model(model_config=config.model)

TypeError: setup_finetune_model() missing 1 required positional argument: 'model_config'
NameError Traceback (most recent call last)
in ()
----> 1 restored_model.setup_finetune_model(self, model_config=config.model)

NameError: name 'self' is not defined

The same happens with this
Cell: restored_model.set_trainer(trainer_finetune)

TypeError Traceback (most recent call last)
in ()
----> 1 restored_model.set_trainer(trainer_finetune)
2 log_dir_finetune = exp_manager(trainer_finetune, config.get("exp_manager", None))
3 print(log_dir_finetune)

TypeError: set_trainer() missing 1 required positional argument: 'trainer'
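A hedged guess at what these two errors usually indicate (an assumption on my side, not verified against the notebook): a "missing 1 required positional argument" on a method call often means the method is being invoked on the class rather than on an instance, so the value passed gets consumed as self. On a proper EncDecSpeakerLabelModel instance the original calls should work as written, with no explicit self:

# Sketch under that assumption: restored_model must be a model *instance*
# (e.g. the object returned by restore_from()/load_from_checkpoint()), not the
# EncDecSpeakerLabelModel class itself.
restored_model.setup_finetune_model(config.model)
restored_model.set_trainer(trainer_finetune)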

Steps/Code to reproduce bug

Cell: trainer.fit(speaker_model) in
https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/speaker_recognition/Speaker_Recognition_Verification.ipynb

Expected behavior
(As expected by the people who made this notebook: Colab training should work without bug fixing. :-))

Torch 1.9 is installed; no updates seem to be possible.

Environment overview (please complete the following information)

torch @ https://download.pytorch.org/whl/cu102/torch-1.9.0%2Bcu102-cp37-cp37m-linux_x86_64.whl
torch-stft==0.1.4
torchaudio==0.9.0
torchmetrics==0.5.1
torchsummary==1.5.1
torchtext==0.10.0
torchvision @ https://download.pytorch.org/whl/cu102/torchvision-0.10.0%2Bcu102-cp37-cp37m-linux_x86_64.whl

Environment details

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs

briebe added the bug label on Sep 6, 2021
nithinraok (Collaborator) commented Sep 15, 2021

I can't seem to reproduce the issue; it works fine on Colab. Could you rerun?
Updated link: https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/speaker_tasks/Speaker_Identification_Verification.ipynb

briebe (Author) commented Sep 16, 2021

Just so you can be sure I didn't miss anything, I used "Run all" (cells).
Training seems to have worked and the final checkpoint could be loaded, but:

trainer.fit(speaker_model)

[NeMo I 2021-09-16 06:26:58 label_models:240] val_loss: 32.002

Epoch 4, global step 83: val_loss was not in top 3

It now runs without problems until:

"Restoring from a PyTorch Lightning checkpoint

To restore a model using the LightningModule.load_from_checkpoint() class method."

restored_model = nemo_asr.models.EncDecSpeakerLabelModel.load_from_checkpoint(final_checkpoint)


TypeError Traceback (most recent call last)

in ()
----> 1 restored_model = nemo_asr.models.EncDecSpeakerLabelModel.load_from_checkpoint(final_checkpoint)

2 frames

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/saving.py in _load_model_state(cls, checkpoint, strict, cls_kwargs_new)
193 _cls_kwargs = {k: v for k, v in _cls_kwargs.items() if k in cls_init_args_name}
194
--> 195 model = cls(**_cls_kwargs)
196
197 # give model a chance to load something

TypeError: __init__() missing 1 required positional argument: 'cfg'
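In case it helps while this is sorted out, a hedged workaround sketch (the filename is an assumption, and this is not necessarily the tutorial's intended path): if a .nemo file was saved earlier with save_to(), the model can also be restored with NeMo's restore_from(), which does not go through Lightning's checkpoint loader:

# Assumed path for illustration only; use whatever save_to() actually wrote.
restored_model = nemo_asr.models.EncDecSpeakerLabelModel.restore_from("speaker_model.nemo")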

nithinraok (Collaborator) commented:
This looks to me like an issue with the latest pytorch-lightning. Can you manually run
!pip install pytorch_lightning==1.4.2 before the cell where it throws the error? Also, an import fix was provided with #2821.

briebe (Author) commented Sep 16, 2021

This fix brings us to the cell/code:

manifest_filepath = os.path.join(NEMO_ROOT,'embeddings_manifest.json')
device = 'cuda' if torch.cuda.is_available() else 'cpu'
get_embeddings(verification_model, manifest_filepath, batch_size=64,embedding_dir='./', device=device)


[NeMo I 2021-09-16 07:11:06 audio_to_label:445] Time length considered for collate func is 20
[NeMo I 2021-09-16 07:11:06 audio_to_label:446] Shift length considered for collate func is 0.75
[NeMo I 2021-09-16 07:11:06 collections:267] Filtered duration for loading collection is 0.000000.
[NeMo I 2021-09-16 07:11:06 collections:270] # 5 files loaded accounting to # 5 labels
[NeMo I 2021-09-16 07:11:06 label_models:126] Setting up identification parameters


NameError Traceback (most recent call last)

in ()
1 manifest_filepath = os.path.join(NEMO_ROOT,'embeddings_manifest.json')
2 device = 'cuda' if torch.cuda.is_available() else 'cpu'
----> 3 get_embeddings(verification_model, manifest_filepath, batch_size=64,embedding_dir='./', device=device)

in get_embeddings(speaker_model, manifest_file, batch_size, embedding_dir, device)
18 out_embeddings = {}
19
---> 20 for test_batch in tqdm(speaker_model.test_dataloader()):
21 test_batch = [x.to(device) for x in test_batch]
22 audio_signal, audio_signal_len, labels, slices = test_batch

NameError: name 'tqdm' is not defined
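As a local stopgap (my assumption: the only missing name in that notebook-defined helper is tqdm), adding the import in a cell before calling get_embeddings lets the loop run; the proper fix is of course the import change in the repository:

from tqdm import tqdm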

nithinraok (Collaborator) commented:
Please read my comment above; the import fix for that is provided through PR #2821.

briebe (Author) commented Sep 16, 2021

OK, I got you. I used the changes you made there and now it's running without problems! Great work!
I added myself to the finetuning and will see about the results. :-)
Related question:
I was trying to use the "hi-mia" dataset yesterday, because the AN4 source has not been very stable over the last week.
This is the first line of my test.json:

{"audio_filepath": "../rivaclient/NeMo/scripts/dataset_processing/data/dev/SPEECHDATA/wav/SV0280/SV0280_6_07_S3653.wav", "offset": 0, "duration": 1.488, "label": "SV0280"}

KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/data/audio_to_label.py", line 364, in __getitem__
    t = torch.tensor(self.label2id[sample.label]).long()
KeyError: 'SV0280'

Is this related to today's fix? I will try later, thanks!
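A quick hedged check for this kind of KeyError (the file names below are assumptions): the error means the label from the manifest line is not in the dataset's label2id map, which usually happens when the labels in the evaluation manifest do not match the labels the model/dataset was configured with. Collecting the labels from both manifests makes any mismatch visible:

import json

def manifest_labels(path):
    # Each line of a NeMo manifest is a JSON object with a "label" field.
    with open(path) as f:
        return {json.loads(line)["label"] for line in f if line.strip()}

train_labels = manifest_labels("train.json")  # assumed filename
test_labels = manifest_labels("test.json")    # assumed filename
print("labels only in test.json:", sorted(test_labels - train_labels))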
