
chore: Update RL to use megatron-bridge tot #1358

Merged
terrykong merged 14 commits into main from yuya/update-to-use-mbridge-tot on Nov 4, 2025

Conversation

Contributor

@yaoyu-33 yaoyu-33 commented Oct 14, 2025

What does this PR do?

As title

Tests

METRIC PASS    [Step 10/10]         : code_snapshots_mcore_tot/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack/6814205-logs/ray-driver.log
METRIC PASS    [Step 20/20]         : code_snapshots_mcore_tot/dpo-llama3.1-8b-instruct-4n8g-megatrontp2pp2-quick/6814215-logs/ray-driver.log
METRIC FAIL    [Step 150/150]       : code_snapshots_mcore_tot/dpo-llama3.1-8b-instruct-4n8g-megatron.v2/6826018-logs/ray-driver.log
METRIC PASS    [Step 500/500]       : code_snapshots_mcore_tot/grpo-llama3.2-1b-instruct-1n8g-megatron/6814226-logs/ray-driver.log
METRIC PASS    [Step 3/3]           : code_snapshots_mcore_tot/grpo-math-qwen3-30ba3b-megatron-tp4-32k/6826014-logs/ray-driver.log
METRIC PASS    [Step 30/30]         : code_snapshots_mcore_tot/grpo-moonlight-16ba3b-4n8g-megatron/6814290-logs/ray-driver.log
METRIC PASS    [Step 30/30]         : code_snapshots_mcore_tot/grpo-qwen2.5-7b-instruct-4n8g-megatron/6814301-logs/ray-driver.log
METRIC PASS    [Step 30/30]         : code_snapshots_mcore_tot/grpo-qwen3-30ba3b-8n8g-megatron/6833541-logs/ray-driver.log
METRIC FAIL    [Step 300/300]       : code_snapshots_mcore_tot/sft-llama3.1-70b-8n8g-tp4pp2-long-megatron/6814309-logs/ray-driver.log
METRIC PASS    [Step 250/250]       : code_snapshots_mcore_tot/sft-llama3.1-8b-1n8g-megatron/6823954-logs/ray-driver.log
METRIC PASS    [Step 250/250]       : code_snapshots_mcore_tot/sft-llama3.1-8b-1n8g-megatron-seqpack/6814313-logs/ray-driver.log
METRIC PASS    [Step 200/200]       : code_snapshots_mcore_tot/vlm_grpo-qwen2.5-vl-3b-instruct-clevr-1n2g-megatrontp2.v1/6850360-logs/ray-driver.log

The 70b failure is a slight memory bump that also exists in main.
The DPO failure (dpo-llama3.1-8b-instruct-4n8g-megatron.v2) is due to the num_workers change: the default is now 1 instead of 0, which changed the shuffling order.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Enhanced mixed-precision support with optional float32 expert biases for MoE routers.
    • Automatic padded vocab size calculation during setup.
    • Explicit TP/PP/EP configuration for vLLM.
    • Default generation temperature set to 1.0 across Megatron and vLLM.
  • Bug Fixes

    • More robust checkpoint handling when FSDP is unavailable.
    • Correct boolean handling for vLLM enforce_eager.
    • Aligned generation defaults for consistent behavior.
  • Refactor

    • Updated model initialization flow for improved stability.
  • Chores

    • Added config defaults (train_iters, bias_activation_fusion) and clearer validation for required vocab size.

@yaoyu-33 yaoyu-33 requested review from a team as code owners October 14, 2025 22:56

coderabbitai bot commented Oct 14, 2025

📝 Walkthrough

Adds a pre-finalize step to Megatron community import. Introduces a CustomFloat16Module and mixed_precision_wrapper logic in Megatron policy worker, including FSDP detection, vocab padding, and stricter config assertions. Updates refit_verifier to explicitly pass TP/PP/EP to vLLM, set temperatures, add train_iters, and fix boolean types.

Changes

  • Megatron community import init flow (nemo_rl/models/megatron/community_import.py): Inserted model_provider.finalize() before initialize_model_parallel(...) in the import_model_from_hf_name initialization sequence.
  • Policy worker mixed-precision, MoE bias, vocab, and FSDP handling (nemo_rl/models/policy/megatron_policy_worker.py): Added CustomFloat16Module to re-enable float32 expert biases in MoE routers; introduced mixed_precision_wrapper selection across model/ref model init; guarded the FSDP import behind a HAVE_FSDP2 flag; added padded vocab size computation and enforced a vocab_size assertion; updated construction paths to use the new wrapper and config types.
  • Refit verifier config alignment for Megatron and vLLM (tools/refit_verifier.py): Added temperature: 1.0 (Megatron and vLLM); set train_iters: 1 and bias_activation_fusion: False; passed TP/PP/EP explicitly to vLLM (removed product computation); converted enforce_eager to boolean False; updated comments/structure accordingly.
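The refit_verifier alignment described above can be illustrated with a minimal config fragment. Key names mirror the summary; the exact schema and the parallel-size values are assumptions for illustration, not the real NeMo-RL config.

```python
# Illustrative only: key names follow the change summary above; the actual
# refit_verifier config layout in tools/refit_verifier.py may differ.
megatron_generation_cfg = {
    "temperature": 1.0,  # aligned with vLLM for logprob comparison
}

megatron_cfg = {
    "train_iters": 1,               # now required by the Megatron config
    "bias_activation_fusion": False,
}

vllm_cfg = {
    # Parallel sizes are passed explicitly instead of being derived.
    "tensor_parallel_size": 2,
    "pipeline_parallel_size": 1,
    "expert_parallel_size": 1,
    "enforce_eager": False,  # a real boolean, not the string "False"
    "temperature": 1.0,
}
```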

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant CommunityImport as Community Import
  participant Provider as Megatron Provider

  Caller->>CommunityImport: import_model_from_hf_name(...)
  CommunityImport->>Provider: finalize()
  Note over Provider: Prepares global state before parallel init
  CommunityImport->>Provider: initialize_model_parallel(...)
  Provider-->>Caller: Initialized model
sequenceDiagram
  autonumber
  participant Worker as PolicyWorker
  participant Tokenizer
  participant Vocab as VocabUtil
  participant Wrapper as MixedPrecisionWrapper
  participant Model as MegatronModel
  participant Router as MoE Routers
  participant FSDP as torch_FSDP (optional)

  Worker->>Tokenizer: load tokenizer
  Worker->>Vocab: calculate_padded_vocab_size(vocab_size)
  Vocab-->>Worker: final_padded_vocab_size
  Worker->>Wrapper: select wrapper (Float16 / CustomFloat16 / None)
  alt FSDP available
    Worker->>FSDP: set HAVE_FSDP2 flag
  end
  Worker->>Model: get_model(..., mixed_precision_wrapper=Wrapper, vocab_size=asserted)
  opt Using CustomFloat16
    Worker->>Wrapper: re_enable_float32_expert_bias()
    Wrapper->>Router: _maintain_float32_expert_bias()
  end
  Note over Worker,Model: Same flow applied to reference model (pre/post load)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 50.00%, below the required 80.00% threshold. Run @coderabbitai generate docstrings to improve docstring coverage.
  • Test Results For Major Changes ⚠️ Warning: The PR introduces substantial new functionality and modifications to core Megatron integration (a custom float16 wrapper, revised initialization flow, and API surface changes) that qualify as major changes, but the PR description contains only a placeholder checklist and no actual test results, numeric or convergence validation, or performance benchmarks. Please augment the PR description with concrete testing information.
  • Title check ❓ Inconclusive: The title is vague; 'megatron-bridge tot' is unclear without context. Consider a more descriptive title such as 'Update RL models to use latest megatron-bridge version'.
✅ Passed checks (1 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
nemo_rl/models/policy/megatron_policy_worker.py (1)

2026-2060: CustomFloat16Module correctly maintains float32 MoE router bias.

The new CustomFloat16Module class properly extends Float16Module to ensure MoE router expert bias stays in float32 for numerical stability. The re_enable_float32_expert_bias() method correctly:

  • Handles VLM models by unwrapping language_model
  • Walks decoder layers to find routers
  • Invokes _maintain_float32_expert_bias() when available

Consider adding defensive checks for robustness:

 def re_enable_float32_expert_bias(self) -> None:
     """Ensure MoE router expert bias stays in float32 for numerical stability.

     Walks the wrapped module to find MoE routers and invokes the
     `_maintain_float32_expert_bias()` helper which recreates or casts the
     expert bias tensors to float32 as required by Megatron-LM.
     """
     module = self.module
     # Handle VLM models where language model is nested
     if hasattr(module, "language_model"):
         module = module.language_model
-    if hasattr(module, "decoder") and hasattr(module.decoder, "layers"):
-        for layer in module.decoder.layers:
-            mlp = getattr(layer, "mlp", None)
-            router = getattr(mlp, "router", None) if mlp is not None else None
-            if router is not None and hasattr(
-                router, "_maintain_float32_expert_bias"
-            ):
-                router._maintain_float32_expert_bias()
+    # Only process if the model has the expected decoder structure
+    if not (hasattr(module, "decoder") and hasattr(module.decoder, "layers")):
+        return
+    for layer in module.decoder.layers:
+        mlp = getattr(layer, "mlp", None)
+        router = getattr(mlp, "router", None) if mlp is not None else None
+        if router is not None and hasattr(router, "_maintain_float32_expert_bias"):
+            router._maintain_float32_expert_bias()
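To see why keeping the expert bias in float32 matters, here is a stdlib-only sketch (no torch; the helper emulates bfloat16's 8-bit mantissa) showing small updates being absorbed when a bias of magnitude 1.0 is held in bfloat16:

```python
import struct

def to_bf16(x: float) -> float:
    """Round a float to the nearest bfloat16 (round-to-nearest-even on the top 16 bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

bias_bf16 = 1.0
bias_fp32 = 1.0  # Python floats stand in for float32 here
for _ in range(1000):
    bias_bf16 = to_bf16(bias_bf16 + 1e-3)  # 1e-3 is below bf16 resolution at 1.0
    bias_fp32 += 1e-3

# bias_bf16 is still exactly 1.0: every update rounds away.
# bias_fp32 has accumulated to ~2.0.
```

This is why casting the router's expert bias down with the rest of the Float16Module would silently freeze it; the wrapper re-enables float32 for that one parameter.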
nemo_rl/models/megatron/community_import.py (1)

72-72: Update docstring to document finalize() call
Add that model_provider.finalize() runs deferred post-init logic, validates the provider/config, and must be called after config modifications and before initialize_model_parallel().
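The required call order can be sketched with a stub provider. The class below is hypothetical; only the ordering contract mirrors community_import.py.

```python
class StubProvider:
    """Hypothetical stand-in for a Megatron-Bridge model provider."""
    def __init__(self) -> None:
        self.calls: list[str] = []

    def finalize(self) -> None:
        # Deferred post-init logic and config validation happen here.
        self.calls.append("finalize")

    def initialize_model_parallel(self) -> None:
        # Parallel state setup; must run after finalize().
        self.calls.append("initialize_model_parallel")

def import_model_sketch(provider: StubProvider) -> StubProvider:
    provider.finalize()
    provider.initialize_model_parallel()
    return provider

provider = import_model_sketch(StubProvider())
```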

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 355aa98 and 40979a3.

📒 Files selected for processing (3)
  • nemo_rl/models/megatron/community_import.py (1 hunks)
  • nemo_rl/models/policy/megatron_policy_worker.py (9 hunks)
  • tools/refit_verifier.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts

Files:

  • nemo_rl/models/megatron/community_import.py
  • tools/refit_verifier.py
  • nemo_rl/models/policy/megatron_policy_worker.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)

Files:

  • nemo_rl/models/megatron/community_import.py
  • nemo_rl/models/policy/megatron_policy_worker.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (9)
tools/refit_verifier.py (3)

161-161: LGTM! Consistent temperature configuration.

Adding temperature: 1.0 to the Megatron generation config ensures consistency with the vLLM configuration (line 271), which is important for accurate logprob comparison in this verification script.


216-217: LGTM! Required Megatron config additions.

Adding train_iters: 1 and bias_activation_fusion: False aligns with new Megatron configuration requirements. The assertion at line 651 in megatron_policy_worker.py confirms that train_iters is now mandatory.


277-279: LGTM! Explicit parallelism configuration.

Explicitly passing tensor_parallel_size, pipeline_parallel_size, and expert_parallel_size to vLLM config (instead of computing them) improves clarity and aligns with the updated VllmGeneration expectations noted in the comment at line 262.

nemo_rl/models/policy/megatron_policy_worker.py (6)

134-139: LGTM! Proper FSDP2 feature detection.

The try/except block correctly detects FSDP2 availability without requiring a hard dependency. The ImportError-specific exception handling follows best practices and is used appropriately at lines 316 and 771 to conditionally adjust checkpoint loading behavior.
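The guarded-import pattern looks roughly like this; the imported symbol is illustrative, and the actual FSDP2 entry point guarded in megatron_policy_worker.py may differ.

```python
# Feature-detect FSDP2 without making torch's FSDP a hard dependency.
# The symbol name below is illustrative, not necessarily the one NeMo-RL guards.
try:
    from torch.distributed.fsdp import fully_shard  # noqa: F401
    HAVE_FSDP2 = True
except ImportError:
    HAVE_FSDP2 = False

# Callers branch on the flag instead of importing at use sites:
checkpoint_strategy = "fsdp2" if HAVE_FSDP2 else "fallback"
```

Catching only ImportError keeps the try body minimal and avoids masking unrelated failures.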


249-249: Stricter vocab_size validation.

The assertion now requires vocab_size to be explicitly specified in the model config, replacing any previous fallback behavior. This is a breaking change that ensures explicit configuration.

Ensure all configuration files and documentation specify vocab_size explicitly.


814-818: LGTM! Explicit padded vocab size calculation.

Calculating final_padded_vocab_size using the imported calculate_padded_vocab_size utility ensures correct vocab padding for tensor parallelism. The calculated value is used at line 1466 for inference configuration.
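The padding computation itself is a round-up to a multiple of the divisibility factor times the tensor-parallel size. A sketch (the function name and signature are illustrative, not the actual calculate_padded_vocab_size API):

```python
def padded_vocab_size(vocab_size: int, make_divisible_by: int, tp_size: int) -> int:
    """Round vocab_size up so each TP rank gets an equal, aligned shard."""
    multiple = make_divisible_by * tp_size
    return ((vocab_size + multiple - 1) // multiple) * multiple
```

For example, a 50257-token vocab padded with make_divisible_by=128 at TP=2 becomes 50432, while 32000 is already a multiple of 256 and stays unchanged.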


254-276: Clarify precedence when both freeze_moe_router and defer_fp32_logits are enabled.

The logic sets mixed_precision_wrapper = CustomFloat16Module when freeze_moe_router is true (lines 271-272), then overrides it to None when defer_fp32_logits is enabled (lines 275-276). This means defer_fp32_logits takes precedence.

Verify that the precedence is intentional. If both options can be enabled simultaneously, consider adding a comment or assertion to clarify the expected behavior:

# If deferring fp32 logits, disable mixed-precision wrapper entirely
# This takes precedence over freeze_moe_router which also sets the wrapper
if policy_cfg["megatron_cfg"].get("defer_fp32_logits", None):
    mixed_precision_wrapper = None
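The precedence being questioned can be made explicit in a small sketch; wrapper names are returned as strings here purely for illustration.

```python
def select_mixed_precision_wrapper(freeze_moe_router: bool, defer_fp32_logits: bool):
    """Mirror the selection order in megatron_policy_worker.py (illustrative)."""
    wrapper = "Float16Module"
    if freeze_moe_router:
        wrapper = "CustomFloat16Module"
    if defer_fp32_logits:
        # Evaluated last, so defer_fp32_logits wins over freeze_moe_router.
        wrapper = None
    return wrapper
```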

745-758: LGTM! Consistent wrapper configuration for reference model.

The reference model uses the same mixed precision wrapper selection logic as the main model (lines 254-276), ensuring consistency. The ref_mixed_precision_wrapper is correctly passed to get_model at line 758.


93-93: LGTM! Required import for CustomFloat16Module.

The TransformerConfig import is necessary for the new CustomFloat16Module class definition at line 2038.


yfw commented Oct 15, 2025

@ZhiyuLi-Nvidia can you double check the fp32 expert bias change in this PR?

yfw
yfw previously approved these changes Oct 15, 2025

ZhiyuLi-Nvidia commented Oct 15, 2025

@ZhiyuLi-Nvidia can you double check the fp32 expert bias change in this PR?

@yaoyu-33, could you help me run a test on moonshotai/Moonlight-16B-A3B-Instruct for verification?

I think we are good to go if the experiment is successful.

export HF_HOME=<xxx>

export NCCL_NVLS_ENABLE=0
export NRL_FORCE_REBUILD_VENVS=true 
PYTHONPATH=$HF_HOME/modules:$PYTHONPATH uv run python examples/run_grpo_math.py --config=examples/configs/grpo_math_1B_megatron.yaml \
    cluster.num_nodes=2 \
    grpo.val_batch_size=2 \
    policy.model_name=moonshotai/Moonlight-16B-A3B-Instruct \
    policy.generation.vllm_cfg.tensor_parallel_size=16 \
    cluster.gpus_per_node=8 \
    policy.megatron_cfg.pipeline_model_parallel_size=8 \
    policy.megatron_cfg.num_layers_in_first_pipeline_stage=5 \
    policy.megatron_cfg.num_layers_in_last_pipeline_stage=4 \
    policy.max_total_sequence_length=1024 \
    checkpointing.enabled=False \
    checkpointing.save_period=10 \
    grpo.val_period=10 \
    grpo.max_val_samples=16 \
    grpo.val_batch_size=4 \
    checkpointing.keep_top_k=100 \
    checkpointing.checkpoint_dir=results/grpo_moonlight \
    checkpointing.enabled=False \
    grpo.val_period=-1 \
    policy.megatron_cfg.expert_model_parallel_size=2 \
    policy.generation.vllm_cfg.async_engine=True \
    grpo.max_num_steps=10 \
    grpo.num_prompts_per_step=8 \
    grpo.num_generations_per_prompt=8 \
    policy.train_global_batch_size=64 \
    policy.train_micro_batch_size=1 \
    logger.wandb_enabled=False \
    logger.wandb.project='grpo-dev-zhiyul' \
    logger.wandb.name='moonlight-16b' \
    policy.megatron_cfg.apply_rope_fusion=False

@github-actions

❌ Submodule Fast-Forward Check Failed

Check based on commit: 0f93ad0 (PR #1358 from yuya/update-to-use-mbridge-tot)

❌ Submodules that need attention:

Megatron-Bridge: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/62f4704b8d665ac4a8c318a809a070217caa8901/
CURRENT (PR #1358 from yuya/update-to-use-mbridge-tot): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/063498590a2923a63c04df1fd3e7a5467a519880/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

@github-actions

❌ Submodule Fast-Forward Check Failed

Check based on commit: 371e458 (PR #1358 from yuya/update-to-use-mbridge-tot)

✅ Submodules that are properly updated:

Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

❌ Submodules that need attention:

Megatron-Bridge: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/62f4704b8d665ac4a8c318a809a070217caa8901/
CURRENT (PR #1358 from yuya/update-to-use-mbridge-tot): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/063498590a2923a63c04df1fd3e7a5467a519880/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

@github-actions

❌ Submodule Fast-Forward Check Failed

Check based on commit: 752a7cf (PR #1358 from yuya/update-to-use-mbridge-tot)

✅ Submodules that are properly updated:

Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

❌ Submodules that need attention:

Megatron-Bridge: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/62f4704b8d665ac4a8c318a809a070217caa8901/
CURRENT (PR #1358 from yuya/update-to-use-mbridge-tot): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/f003cd8ca3e4876853b6097e816f0a94ea8fefc1/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

ZhiyuLi-Nvidia
ZhiyuLi-Nvidia previously approved these changes Oct 28, 2025
@ZhiyuLi-Nvidia ZhiyuLi-Nvidia left a comment

Thank you @yaoyu-33. LGTM!

Synced up offline. Correct convergence/logprob behavior on the Moonlight model should verify the effectiveness of re_enable_float32_expert_bias.

@terrykong terrykong requested a review from a team as a code owner October 28, 2025 06:55
@terrykong terrykong requested a review from a team as a code owner October 28, 2025 19:43
@terrykong terrykong force-pushed the yuya/update-to-use-mbridge-tot branch from 87451fe to d7a3e40 Compare October 28, 2025 19:45
@terrykong terrykong removed the r0.4.0 label Oct 28, 2025
yaoyu-33 and others added 9 commits November 3, 2025 22:54
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>

fix more stuff

Signed-off-by: Terry Kong <terryk@nvidia.com>

fix

Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong force-pushed the yuya/update-to-use-mbridge-tot branch from 8797154 to 535f66c Compare November 4, 2025 06:54
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Nov 4, 2025
@terrykong
Collaborator

@yfw i updated the branches: 4ee29a2

@github-actions

github-actions bot commented Nov 4, 2025

❌ Submodule Fast-Forward Check Failed

Check based on commit: 4ee29a2 (PR #1358 from yuya/update-to-use-mbridge-tot)

✅ Submodules that are properly updated:

Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

❌ Submodules that need attention:

Megatron-Bridge: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/62f4704b8d665ac4a8c318a809a070217caa8901/
CURRENT (PR #1358 from yuya/update-to-use-mbridge-tot): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/f003cd8ca3e4876853b6097e816f0a94ea8fefc1/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

@terrykong terrykong enabled auto-merge (squash) November 4, 2025 07:03
@terrykong terrykong merged commit 2e2c2b3 into main Nov 4, 2025
39 of 41 checks passed
@terrykong terrykong deleted the yuya/update-to-use-mbridge-tot branch November 4, 2025 11:39
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
@coderabbitai coderabbitai bot mentioned this pull request Jan 13, 2026
4 tasks
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>

Labels

CI:L1 (Run doctests, unit tests, and functional tests), t-mcore

5 participants