[doc] fix: set use_dist_checkpointing to False for ref model in qwen3moe-30b script #3198
Code Review
This pull request aims to fix an incorrect configuration for the reference model in the qwen3moe-30b training script by setting use_dist_checkpointing to False. While the intent is correct, the implementation introduces a potential issue by using a shared variable ${USE_DIST_CKPT}. As the pull request description notes, the reference model does not support distributed checkpointing, so this setting should be hardcoded to False to prevent future misconfigurations that could lead to runtime errors. The other change, which fixes a trailing backslash and adds a newline at the end of the file, is a good correction.
  actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=${infer_ppo_micro_batch_size_per_gpu} \
  actor_rollout_ref.ref.log_prob_max_token_len_per_gpu=${infer_ppo_max_token_len} \
- actor_rollout_ref.ref.megatron.use_dist_checkpointing=True \
+ actor_rollout_ref.ref.megatron.use_dist_checkpointing=${USE_DIST_CKPT} \
Based on the pull request description, use_dist_checkpointing must be False for the reference model because it lacks a distributed checkpoint path. Using the ${USE_DIST_CKPT} variable makes this setting configurable. If a user sets USE_DIST_CKPT=True (e.g., for the actor model), it would also be incorrectly enabled for the reference model, likely causing a runtime error. To ensure the script's robustness and prevent misconfiguration, this value should be hardcoded to False for the reference model.
- actor_rollout_ref.ref.megatron.use_dist_checkpointing=${USE_DIST_CKPT} \
+ actor_rollout_ref.ref.megatron.use_dist_checkpointing=False \
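To illustrate why the hardcoded value is safer, here is a minimal sketch of how the two overrides could sit side by side. It is not taken from the script itself: `build_overrides` is a hypothetical helper used only so the result can be printed, and the actor-side key `actor_rollout_ref.actor.megatron.use_dist_checkpointing` is assumed by analogy with the ref-side key shown in the diff.

```shell
#!/usr/bin/env bash
# Sketch only. USE_DIST_CKPT is the shared toggle named in the diff;
# defaulting it to False here is an assumption made for this example.
USE_DIST_CKPT=${USE_DIST_CKPT:-False}

# Hypothetical helper: prints the two checkpointing overrides, one per line.
build_overrides() {
    # Actor side follows the shared toggle (key name assumed, see lead-in).
    printf '%s\n' "actor_rollout_ref.actor.megatron.use_dist_checkpointing=${USE_DIST_CKPT}"
    # Ref side is pinned to False, so flipping USE_DIST_CKPT cannot enable it.
    printf '%s\n' "actor_rollout_ref.ref.megatron.use_dist_checkpointing=False"
}

build_overrides
```

With this layout, running the script with `USE_DIST_CKPT=True` changes only the actor line. The ref line stays `False` regardless of the environment.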
…moe-30b script (verl-project#3198) ### What does this PR do? Set use_dist_checkpointing to False for ref model in qwen3moe-30b script, because there is not dist_megatron_ckpt model path for ref model.
What does this PR do?
Set use_dist_checkpointing to False for the ref model in the qwen3moe-30b script, because there is no dist_megatron_ckpt model path for the ref model.