[misc] fix: deprecate rollout.mode config option#4690
[misc] fix: deprecate rollout.mode config option#4690vermouth1992 merged 1 commit intoverl-project:mainfrom
Conversation
The `mode` field in RolloutConfig is deprecated since only "async" mode is supported. This PR: 1. Adds validation in RolloutConfig.__post_init__ to: - Raise ValueError for "sync" mode (moved from main_ppo.py) - Warn about unknown modes 2. Simplifies main_ppo.py to always use AsyncActorRolloutRefWorker since sync mode is no longer supported 3. Removes dead code paths for sync mode in: - fsdp_workers.py - megatron_workers.py 4. Simplifies ray_trainer.py to always assume async mode The `mode` field is kept for backward compatibility with existing shell scripts that set `actor_rollout_ref.rollout.mode=async`. Fixes verl-project#4604 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
There was a problem hiding this comment.
Code Review
This pull request effectively deprecates the rollout.mode configuration option, correctly removing the 'sync' mode and simplifying the codebase to only support 'async' mode. The changes are well-contained and the validation logic has been appropriately moved to the configuration class. I have one suggestion to make the validation stricter to avoid silent behavior changes for users with misconfigured modes, which could be easily missed.
| if self.mode == "sync": | ||
| raise ValueError( | ||
| "Rollout mode 'sync' has been removed. Please set " | ||
| "`actor_rollout_ref.rollout.mode=async` or remove the mode setting entirely." | ||
| ) | ||
| if self.mode != "async": | ||
| warnings.warn( | ||
| f"Unknown rollout mode '{self.mode}'. Only 'async' mode is supported. " | ||
| "The 'mode' field is deprecated and will be removed in a future version.", | ||
| DeprecationWarning, | ||
| stacklevel=2, | ||
| ) |
There was a problem hiding this comment.
Using warnings.warn for unknown rollout.mode values can lead to silent behavior changes that might be missed by users, as DeprecationWarning is often suppressed by default. Since only 'async' mode is supported now, any other value should be treated as an error to prevent unexpected behavior. For example, a typo like mode=asyn would have previously fallen back to sync-like behavior, but with this change, it would be treated as 'async' with just a potentially hidden warning. Raising a ValueError for all non-'async' modes is safer and more explicit about the supported configuration.
if self.mode != "async":
if self.mode == "sync":
raise ValueError(
"Rollout mode 'sync' has been removed. Please set "
"`actor_rollout_ref.rollout.mode=async` or remove the mode setting entirely."
)
raise ValueError(
f"Unknown rollout mode '{self.mode}'. Only 'async' mode is supported. "
"Please set `actor_rollout_ref.rollout.mode=async` or remove the mode setting entirely."
)
Summary
Addresses #4604 - Deprecates the
actor_rollout_ref.rollout.modeconfig option since onlyasyncmode is now supported.Changes:
RolloutConfig validation (
verl/workers/config/rollout.py):__post_init__to raiseValueErrorfor "sync" modeDeprecationWarningfor unknown modesSimplified main_ppo.py:
AsyncActorRolloutRefWorkerinstead of conditional selectionRemoved dead code paths:
fsdp_workers.py: Removed sync mode switch-to-trainer-mode codemegatron_workers.py: Same cleanupray_trainer.py: Simplified async mode checks since sync is no longer validBackward Compatibility:
The
modefield is kept in the config for backward compatibility with existing shell scripts that setactor_rollout_ref.rollout.mode=async. These scripts will continue to work.Follow-up Work:
Shell scripts and documentation that explicitly set
mode=asynccan be updated in separate PRs to remove the now-redundant setting.