New option called "best" for args.save_strategy (#31817)
ArthurZucker merged 20 commits into huggingface:main from seanswyi:feat/new-best-save-strategy

Conversation
Gentle ping @muellerzr @SunMarc |
muellerzr left a comment
While this may seem hacky to you, I think this perfectly expands our given system. Any chance you could add a few tests to this? (Over in test_trainer.py). Very nice work!
Thanks! And sure, I think I'll have some time to work on it over the weekend.
@muellerzr (cc. @SunMarc) Hello. I've implemented a new test case inside of `test_trainer.py`. The first test case is when a value for `metric_for_best_model` is explicitly provided, and the second is when it's not. I've changed the `DefaultFlowCallback` object so that if the save strategy is "best", a final checkpoint is not forced at the end of training. Please let me know if there are any problems or additional changes that I should make!

After inspecting the tests it seems like I wasn't supposed to manually alter the training arguments. The most recent two commits undid those changes.
@muellerzr @SunMarc No rush, but just a gentle nudge/reminder! Thanks.
@muellerzr Hi. Just wondering if this PR is still relevant or not. Thanks. |
ArthurZucker left a comment
Looks good, just a small nit!
1. Logic to determine the best metric was separated out from `_save_checkpoint`. 2. In `_maybe_log_save_evaluate`, whether or not a new best metric was achieved is determined after each evaluation, and if the save strategy is "best" then the TrainerControl is updated accordingly.
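The separated-out comparison can be sketched roughly as follows. This is a hypothetical, simplified stand-in for the PR's `_determine_best_metric` (the real method works on `TrainerState` and `args.greater_is_better`; a plain dict stands in for the state here):

```python
def determine_best_metric(state, metrics, metric_key, greater_is_better=False):
    """Return True if metrics[metric_key] beats the best value seen so far.

    `state` is a plain dict standing in for TrainerState; the real Trainer
    code also handles the "eval_" metric-name prefix.
    """
    value = metrics[metric_key]
    best = state.get("best_metric")
    # A lower value wins for loss-like metrics, a higher one for accuracy-like.
    is_new_best = best is None or (value > best if greater_is_better else value < best)
    if is_new_best:
        state["best_metric"] = value
    return is_new_best
```

The return value is what the caller uses to decide whether to flip the save flag.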
Same as IntervalStrategy, but with a new attribute called BEST.
`save_strategy` previously followed `IntervalStrategy` but now follows `SaveStrategy`. Changes were made accordingly to the code and the docstring.
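A minimal sketch of what such an enum could look like (member names assumed from the description above; the actual class lives in `trainer_utils` and subclasses the library's string-enum base):

```python
from enum import Enum

class SaveStrategy(str, Enum):
    # Same values as IntervalStrategy, plus the new BEST member.
    NO = "no"
    STEPS = "steps"
    EPOCH = "epoch"
    BEST = "best"
```

Subclassing `str` keeps comparisons against plain strings like `"best"` working, which matters for backward compatibility with existing `save_strategy` values.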
1. Checks for both cases where `metric_for_best_model` is explicitly provided and when it's not provided. 2. The first case should have two checkpoints saved, whereas the second should have three saved.
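As a toy illustration of why the two counts differ, a best-only strategy writes a checkpoint only when the tracked metric improves (a hypothetical simulation, not the actual test code):

```python
def simulate_best_saves(eval_losses):
    # Write a "checkpoint" only when the evaluation loss improves on the best
    # value seen so far; a per-evaluation strategy would save every time.
    best, saved = None, []
    for step, loss in enumerate(eval_losses, start=1):
        if best is None or loss < best:
            best = loss
            saved.append(f"checkpoint-{step}")
    return saved
```

With losses like `[0.9, 0.7, 0.8]` only two checkpoints are written, while three evaluations under an interval strategy would write three.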
The Trainer saves a checkpoint at the end of training by default as long as `save_strategy != SaveStrategy.NO`. This condition was modified to also exclude `SaveStrategy.BEST`, because it would be counterintuitive to ask for only the best checkpoint to be saved and still have the last one saved as well.
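The modified condition can be pictured like this (a hypothetical reduction of the `should_training_end` change; the real check lives in `DefaultFlowCallback` and works on `SaveStrategy` members rather than raw strings):

```python
def force_final_checkpoint(save_strategy):
    # Skip the forced end-of-training save both when saving is disabled ("no")
    # and when only the best checkpoint should survive ("best").
    return save_strategy not in ("no", "best")
```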
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Changes were made for consistency and also to fix a potential bug.
ArthurZucker left a comment
Great work, thanks for iterating!
@ArthurZucker @SunMarc @muellerzr Sorry to keep pinging you guys on this, but I'm noticing that the tests may be stuck after the merge. Could anybody check this when they have the chance? Thanks. https://github.com/huggingface/transformers/runs/32165247051
Indeed, thanks for noticing @seanswyi, but don't worry, I checked the most recent CI and the test is passing. Can you cancel the run @ArthurZucker? Or will it just cancel by itself after a few days?
Ah, got it. Thanks for following up!
* Add _determine_best_metric and new saving logic.
  1. Logic to determine the best metric was separated out from `_save_checkpoint`.
  2. In `_maybe_log_save_evaluate`, whether or not a new best metric was achieved is determined after each evaluation, and if the save strategy is "best" then the TrainerControl is updated accordingly.
* Added SaveStrategy. Same as IntervalStrategy, but with a new attribute called BEST.
* IntervalStrategy -> SaveStrategy
* IntervalStrategy -> SaveStrategy for save_strat.
* Interval -> Save in docstring.
* Updated docstring for save_strategy.
* Added SaveStrategy and made corresponding changes. `save_strategy` previously followed `IntervalStrategy` but now follows `SaveStrategy`; the code and the docstring were updated accordingly.
* Changes from `make fixup`.
* Removed redundant metrics argument.
* Added new test_save_best_checkpoint test.
  1. Checks for both cases where `metric_for_best_model` is explicitly provided and when it's not provided.
  2. The first case should have two checkpoints saved, whereas the second should have three saved.
* Changed should_training_end saving logic. The Trainer saves a checkpoint at the end of training by default as long as `save_strategy != SaveStrategy.NO`. This condition was modified to also exclude `SaveStrategy.BEST`, because it would be counterintuitive to save only the best checkpoint but force a final one as well.
* `args.metric_for_best_model` defaults to loss.
* Undo metric_for_best_model update.
* Remove checking metric_for_best_model.
* Added test cases for loss and no metric.
* Added error for metric and changed default best_metric.
* Removed unused import.
* `new_best_metric` -> `is_new_best_metric` (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* Applied `is_new_best_metric` to all. Changes were made for consistency and also to fix a potential bug.

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    raise ValueError(
        f"You have set `args.eval_strategy` to {args.eval_strategy} but you didn't pass an `eval_dataset` to `Trainer`. Either set `args.eval_strategy` to `no` or pass an `eval_dataset`. "
    )
if args.save_strategy == SaveStrategy.BEST or args.load_best_model_at_end:
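The guard added on this branch of the diff can be reduced to the following sketch (hypothetical standalone function; in the Trainer the check runs against `self.args` during initialization):

```python
def check_best_metric_args(save_strategy, load_best_model_at_end, metric_for_best_model):
    # When a "best" checkpoint is being tracked, a metric to rank checkpoints
    # by is required; it no longer silently defaults to the evaluation loss.
    if (save_strategy == "best" or load_best_model_at_end) and metric_for_best_model is None:
        raise ValueError(
            "`args.metric_for_best_model` must be provided when using "
            '`save_strategy="best"` or `load_best_model_at_end=True`.'
        )
```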
@seanswyi Q: does this change mean that this logic:
... Will default to "loss" if unspecified and load_best_model_at_end=True (to use the evaluation loss).
is broken now?
I haven't tested it yet, but it seems like it is. The docs probably need an update.
@shcheklein Yeah, I don't think that there's any case where metric_for_best_model defaults to loss.
This solves #35070, cheers! 🥳
What does this PR do?

Addresses #31626.

Adds a new option called `"best"` for `TrainingArguments.save_strategy` which saves the model checkpoint each time a new best performance is achieved.

Details

- Previously, the `_save_checkpoint` method was in charge of not only saving the model checkpoint but also determining the best metric and best checkpoint. The logic for determining a new best metric was separated out into the `_determine_best_metric` method.
- `_determine_best_metric` is called after every evaluation inside of `_maybe_log_save_evaluate`. The return value `new_best_metric` is used to determine whether or not a new best metric has been achieved, and if the save strategy is `"best"` then the `TrainerControl`'s `should_save` flag is switched on.
- `best_metric` does not seem to be tracked by default; rather, it's only tracked when `args.metric_for_best_model` is provided. I believe that a best metric of some sort should always be tracked, and therefore if a value is not provided then the validation loss is used to determine a new best.
- A new `SaveStrategy` was created in `trainer_utils` that adds a new attribute called `BEST` to the previous `IntervalStrategy`.
- I'm not sure if I like the rather "hack-y" way that I implemented this by manually switching the `TrainerControl`'s `should_save` flag rather than delegating it to the callback handler like the other flags are dealt with. The problem is that the flags are normally updated before calling `_maybe_log_save_evaluate` inside of the inner training loop, which means there's no way for us to determine whether or not a new best metric has been achieved with the current logic. I'm not sure if I'm making sense, but I'm open to any other suggestions.

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@muellerzr @SunMarc