Temporary workaround for loading best model at end with DeepSpeed by regisss · Pull Request #95 · huggingface/optimum-habana

regisss · 2022-09-09T00:07:36Z

What does this PR do?

Loading the best model at the end of training with --load_best_model_at_end fails with the current version of Habana DeepSpeed (0.6.1, see huggingface/transformers#17114).
This PR adds a temporary workaround: at the end of training, the best model is loaded as a regular PyTorch model rather than as a DeepSpeed engine. This should not be an issue, since the best model is loaded for inference only, ZeRO-3 has not been validated yet (see here), and ZeRO-1/2 are useful for training only.
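
For illustration only, here is a minimal sketch of what "loading as a regular PyTorch model" could look like. It is not the code from this PR. The helper name `load_best_model_as_pytorch`, the `load_fn` parameter, and the `pytorch_model.bin` file name are assumptions made for the example. The idea is to read the saved weights from the best checkpoint directory and hand them to the model's `load_state_dict`, without going through the DeepSpeed engine.

```python
import os

# Assumed weights file name inside a checkpoint directory (hypothetical for this sketch).
WEIGHTS_NAME = "pytorch_model.bin"


def load_best_model_as_pytorch(model, best_checkpoint_dir, load_fn=None):
    """Load the best checkpoint into `model` as plain PyTorch weights.

    Hypothetical helper: `model` is any object exposing `load_state_dict`,
    and `load_fn` maps a file path to a state dict. When `load_fn` is not
    given, `torch.load` is used and tensors are mapped to CPU.
    """
    if load_fn is None:
        # Imported lazily so the sketch can be exercised without PyTorch installed.
        import torch

        def load_fn(path):
            return torch.load(path, map_location="cpu")

    weights_path = os.path.join(best_checkpoint_dir, WEIGHTS_NAME)
    if not os.path.isfile(weights_path):
        raise FileNotFoundError(f"No weights file found at {weights_path}")

    state_dict = load_fn(weights_path)
    # Bypass the DeepSpeed engine: load the weights directly into the model.
    return model.load_state_dict(state_dict)
```

A sketch like this is only reasonable under the conditions stated above: the loaded model is used for inference, and the checkpoint holds the full weights, which is the expectation with ZeRO-1/2 but would need checking with ZeRO-3.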

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2022-09-09T00:13:32Z

The documentation is not available anymore as the PR was closed or merged.

Temporary WA for loading best model at end with DeepSpeed

5f90bef

regisss requested a review from libinta September 9, 2022 00:07

Make style

e1c655e

libinta approved these changes Sep 9, 2022

regisss merged commit 3d07854 into main Sep 9, 2022

regisss deleted the fix_load_best_model_deepspeed branch September 9, 2022 06:34

hsubramony pushed a commit that referenced this pull request Mar 13, 2024

Run custom ctc_loss only for Gaudi2 (#95)

05735e5

yeonsily pushed a commit that referenced this pull request Mar 22, 2024

Run custom ctc_loss only for Gaudi2 (#95)

f10da08
