[constant scheduler] fix: model won't be updated on first training step#1463

Merged
eric-haibin-lin merged 3 commits into verl-project:main from 0x404:fix_lr
May 21, 2025
Conversation

@0x404 (Collaborator) commented May 9, 2025

What does this PR do?

While running the FSDP checkpoint test introduced by #1288, I found that after one step of training both comparisons pass: the merged FSDP checkpoint matches the verl-saved HF model, and it also still matches the original HuggingFace model. The latter comparison should fail after a training step, so the FSDP model is not actually being updated.

However, the training log shows a learning rate of 1e-6 at the first step, which seemed inconsistent with the model not updating. Digging in, I found two issues in the existing code:

  1. There's a problem with get_constant_schedule_with_warmup: when num_warmup_steps=0, the first step (current_step=0) gets a multiplier of 0.0 instead of 1.0. This is wrong and inconsistent with the constant-LR definition in Transformers: https://github.com/huggingface/transformers/blob/774dc274ac966f4bccbcd90d55bba23f6cca37ae/src/transformers/optimization.py#L72

  2. The log saves the learning rate after actor_lr_scheduler.step(), which is incorrect since it records the next step's LR, thus hiding the problem with get_constant_schedule_with_warmup.
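Issue 1 can be reproduced with a minimal sketch. The buggy function below is illustrative (not verl's exact code): a common buggy pattern applies the warmup ratio unconditionally, so with num_warmup_steps=0 the first step gets a multiplier of 0.0. The fixed version matches the Transformers reference behavior of holding at 1.0 once warmup is over:

```python
def buggy_constant_warmup(current_step: int, num_warmup_steps: int) -> float:
    # Buggy: always applies the warmup ratio, even past warmup.
    # With num_warmup_steps=0 and current_step=0 this yields 0.0,
    # so the first optimizer step effectively multiplies the LR by zero.
    return min(1.0, current_step / max(1, num_warmup_steps))


def fixed_constant_warmup(current_step: int, num_warmup_steps: int) -> float:
    # Matches transformers' get_constant_schedule_with_warmup:
    # ramp linearly during warmup, then hold the multiplier at 1.0.
    if current_step < num_warmup_steps:
        return current_step / max(1.0, num_warmup_steps)
    return 1.0


print(buggy_constant_warmup(0, 0))  # 0.0 -> zero LR on the first step
print(fixed_constant_warmup(0, 0))  # 1.0 -> full LR from step 0
```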

This PR fixes these issues and decreases the tolerance in the FSDP checkpoint test.
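Issue 2 (the masking effect) can also be sketched with a toy scheduler. The class below is a hypothetical stand-in for a LambdaLR-style scheduler using the buggy warmup lambda from point 1, not verl's actual code; it shows why logging the LR after step() hides the zero LR used on the first step:

```python
class BuggyConstantWarmupScheduler:
    """Illustrative stand-in for a LambdaLR-style scheduler whose
    lambda applies the warmup ratio unconditionally (the bug in point 1)."""

    def __init__(self, base_lr: float, num_warmup_steps: int):
        self.base_lr = base_lr
        self.num_warmup_steps = num_warmup_steps
        self.step_count = 0

    def get_last_lr(self) -> float:
        # Buggy lambda: warmup ratio applied even past warmup.
        return self.base_lr * min(
            1.0, self.step_count / max(1, self.num_warmup_steps)
        )

    def step(self) -> None:
        self.step_count += 1


sched = BuggyConstantWarmupScheduler(base_lr=1e-6, num_warmup_steps=0)

# Correct order: record the LR *before* advancing the scheduler,
# so the log reflects the rate the optimizer actually used.
lr_used = sched.get_last_lr()          # 0.0 -- the first step's real LR
sched.step()
lr_logged_after = sched.get_last_lr()  # 1e-6 -- what logging after step() reports
```

Logging after step() reports 1e-6 and makes the first step look healthy, even though the optimizer actually applied a learning rate of zero.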

Additional Info.

  • Training: FSDP
  • Inference: both

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

0x404 added 2 commits May 9, 2025 11:03
- Fix LR scheduler step timing to properly record and apply learning rate
- Correct warmup scheduler implementation to maintain constant rate after warmup
- Increase learning rate in test script for better checkpoint validation
@0x404 (Collaborator, Author) commented May 21, 2025

Hi @eric-haibin-lin, could you re-review this? I resolved the conflicts a few days ago.

@eric-haibin-lin eric-haibin-lin merged commit 80af51b into verl-project:main May 21, 2025
34 checks passed
cedricbeta pushed a commit to cedricbeta/verl that referenced this pull request May 21, 2025
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request May 22, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026