Fastconform SSL pretrained recipe failed potentially due to PTL 2.0 #7507

Closed
XuesongYang opened this issue Sep 25, 2023 · 7 comments

@XuesongYang
Collaborator

XuesongYang commented Sep 25, 2023

Describe the bug

The examples/asr/speech_pretraining/speech_pre_training.py script does not work properly on the latest main branch, but it works in 1.20. At first glance, the error could be related to the upgrade to PyTorch Lightning 2.0.

Error Brief

lightning_fabric.utilities.exceptions.MisconfigurationException: `ModelCheckpoint(monitor='val_loss')` could not find the monitored key in the returned metrics: ['learning_rate', 'global_step', 'train_contrastive', 'train_backward_timing in s', 'train_step_timing in s', 'epoch', 'step']. HINT: Did you call `log('val_loss', value)` in the `LightningModule`?
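
To make the message concrete, here is a minimal plain-Python sketch of the condition it describes: the monitored key has to appear among the logged metrics. This is not Lightning's actual implementation. The helper name `monitored_key_present` is hypothetical, and the key list is copied from the error above.

```python
from typing import Sequence

# Keys copied from the error message above: the metrics that were logged.
logged_metrics = [
    "learning_rate",
    "global_step",
    "train_contrastive",
    "train_backward_timing in s",
    "train_step_timing in s",
    "epoch",
    "step",
]


def monitored_key_present(monitor: str, metrics: Sequence[str]) -> bool:
    """Hypothetical helper: True if the monitored key is among the logged metrics."""
    return monitor in metrics


# Only training-side keys were logged, so the monitored 'val_loss' is missing.
print(monitored_key_present("val_loss", logged_metrics))
# A key that was logged is found.
print(monitored_key_present("train_contrastive", logged_metrics))
```

Every key in the list comes from the training loop, which suggests that no validation metric was logged under the name `val_loss` by the time the checkpoint callback ran.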

Error Details

error_logs.zip

@XuesongYang added the bug (Something isn't working) label on Sep 25, 2023
@XuesongYang
Collaborator Author

It seems this PR will fix the error: #7505

@titu1994
Collaborator

Good catch. For some reason the PR was closed, though. @KunalDhawan?

@KunalDhawan
Collaborator

I've opened a new PR, #7531, to keep the logging format consistent across RNNT, CTC, and Hybrid models. Once it is approved, we can make the same changes for SSL, SLU, and other models.

@nithinraok
Collaborator

@XuesongYang are you still seeing these errors?

@XuesongYang
Collaborator Author

I switched back to the r1.20.0 release, and it worked well. I haven't tried the latest r1.21.0 yet. Do we have a nightly test for the minimalist FastConformer SSL training recipe?

@nithinraok
Collaborator

We have an SSL training notebook, and it was fixed for the PTL upgrade.

@XuesongYang
Collaborator Author

Let's close this issue then. Thanks!
