
feat: Add GPT-OSS support via mcore #1452

Merged
terrykong merged 3 commits into main from ashors/gpt-oss-tot
Dec 17, 2025

Conversation

@ashors1
Contributor

@ashors1 ashors1 commented Oct 31, 2025

What does this PR do ?

NOTE: this was blocked on vllm-project/vllm#27722; the fix has since been merged.

20B example run

uv run examples/run_grpo_math.py \
    --config examples/configs/grpo_math_70B_megatron.yaml \
    cluster.num_nodes=8 \
    policy.model_name=openai/gpt-oss-20b \
    policy.megatron_cfg.expert_model_parallel_size=4 \
    policy.megatron_cfg.tensor_model_parallel_size=4 \
    policy.megatron_cfg.pipeline_model_parallel_size=4 \
    policy.megatron_cfg.sequence_parallel=True \
    policy.megatron_cfg.moe_router_dtype=fp64 \
    policy.megatron_cfg.moe_permute_fusion=True \
    policy.sequence_packing.enabled=False \
    policy.dynamic_batching.enabled=False \
    loss_fn.use_importance_sampling_correction=True
(screenshot: 20B reward curve)

wandb: https://wandb.ai/nvidia/ashors-gpt-oss?nw=jkyx3u96d2

120B example run

uv run examples/run_grpo_math.py \
    --config examples/configs/grpo_math_70B_megatron.yaml \
    cluster.num_nodes=32 \
    policy.model_name=openai/gpt-oss-120b \
    policy.megatron_cfg.expert_model_parallel_size=8 \
    policy.megatron_cfg.tensor_model_parallel_size=8 \
    policy.megatron_cfg.pipeline_model_parallel_size=4 \
    policy.megatron_cfg.sequence_parallel=True \
    policy.megatron_cfg.moe_router_dtype=fp64 \
    policy.megatron_cfg.moe_permute_fusion=True \
    policy.sequence_packing.enabled=False \
    loss_fn.use_importance_sampling_correction=True
(screenshot: 120B reward curve)

wandb: https://wandb.ai/nvidia/ashors-gpt-oss?nw=x17i7zpgtap

Issues

closes #869

Usage

A ready-to-use config is provided for 20B. This config requires 8 nodes and can be used to reproduce the reward curve displayed in the above section.

uv run examples/run_grpo_math.py \
    --config examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml \
    logger.wandb_enabled=True \
    logger.wandb.project=gpt-oss-test \
    logger.wandb.name=20b

For 120B, please refer to the example command in the above section. Optionally enable wandb logging by adding the following command-line args:

logger.wandb_enabled=True logger.wandb.project=gpt-oss-test logger.wandb.name=120b

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Added support for GPT-OSS-20B model training with GRPO methodology.
    • Introduced optimized model name handling for GPT-OSS variants.
    • Added compatibility improvements for enhanced model execution.
  • Tests

    • Added comprehensive test suite for GPT-OSS-20B training on multi-node GPU clusters with integrated metrics validation.


@github-actions github-actions bot added documentation Improvements or additions to documentation and removed documentation Improvements or additions to documentation labels Oct 31, 2025
@github-actions

❌ Submodule Fast-Forward Check Failed

Check based on commit: 9c9711a (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

❌ Submodules that need attention:

Megatron-Bridge: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/62f4704b8d665ac4a8c318a809a070217caa8901/
CURRENT (PR #1452 from ashors/gpt-oss-tot): https://github.com/NVIDIA-NeMo/Megatron-Bridge/commits/7367429371012ca6aa4f78f56091b9dbb325c12d/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

@github-actions

github-actions bot commented Nov 4, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: 0f7e8e8 (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 46fcd7d (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: fec527e (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 71bc701 (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 63d10eb (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: bee06cf (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 8e7c503 (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: f3e880b (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 716f05a (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@ashors1 ashors1 marked this pull request as ready for review December 4, 2025 00:35
@ashors1 ashors1 requested review from a team as code owners December 4, 2025 00:35
@coderabbitai
Contributor

coderabbitai bot commented Dec 4, 2025

📝 Walkthrough

Walkthrough

This PR adds GPT-OSS model support to NeMo RL with Megatron-LM integration. It updates submodule references to OSS-specific branches, introduces a new GRPO configuration file for 20B models on 8-node clusters, modifies vLLM and Megatron worker initialization to support GPT-OSS model naming conventions, applies a TransformerEngine import-time patch for Triton dtype compatibility, and adds corresponding test infrastructure.

Changes

  • Submodule Updates (.gitmodules, 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge, 3rdparty/Megatron-LM-workspace/Megatron-LM): Updated submodule branch references and commit pointers to GPT-OSS-specific branches (yuya/nemo-rl-use-dev → ashors/dev-with-gpt-oss for Megatron-LM; main → ashors/gpt-oss-tot for Megatron-Bridge)
  • GRPO Configuration (examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml): New GRPO recipe for 20B GPT-OSS models with 64 prompts, 32 generations per prompt, Megatron parallelism (8 experts, 4 tensor-parallel), a vLLM generation backend, and an 8-node/8-GPU cluster setup
  • vLLM Worker Enhancement (nemo_rl/models/generation/vllm/vllm_worker.py): Added model name rewriting logic to convert openai/gpt-oss-* references to unsloth BF16 variants for Megatron export compatibility
  • Megatron Worker Enhancement (nemo_rl/models/policy/megatron_policy_worker.py): Added a module-level _apply_transformer_engine_patch() function that patches TransformerEngine's permutation.py at import time to fix Triton int dtype handling
  • Test Infrastructure (tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh, tests/test_suites/release.txt): New GRPO experiment test script with metrics validation (token_mult_prob_error, reward, step time thresholds) and a test suite registry entry
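The model-name translation described above can be pictured with a small sketch. The helper name and the exact mapping here are assumptions for illustration, not the actual vllm_worker.py code:

```python
def resolve_model_name(model_name: str) -> str:
    """Rewrite openai/gpt-oss-* references to their unsloth BF16 re-uploads.

    Hypothetical sketch: the real worker's rewrite may differ in naming and
    scope; the idea is that Megatron export sees BF16 weights.
    """
    prefix = "openai/gpt-oss-"
    if model_name.startswith(prefix):
        size = model_name[len(prefix):]        # e.g. "20b" or "120b"
        return f"unsloth/gpt-oss-{size}-BF16"  # assumed naming convention
    return model_name  # non-GPT-OSS models pass through unchanged

print(resolve_model_name("openai/gpt-oss-20b"))  # unsloth/gpt-oss-20b-BF16
```

Non-matching names are returned untouched, so the rewrite is safe to apply unconditionally at worker initialization.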

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Megatron policy worker patch: Verify TransformerEngine file patching logic, guard conditions (ImportError/FileNotFoundError/AttributeError), and side effects at module import time
  • vLLM model name translation: Confirm model naming conversion logic correctly handles openai/gpt-oss variants and maps to unsloth BF16 equivalents
  • GRPO configuration parameters: Validate tensor-parallel strategy (8 experts, 4 tensor-model-parallel), batch sizes, sequence lengths, and cluster resource allocation
  • Test metrics thresholds: Ensure metric validation bounds (token_mult_prob_error < 1.05, reward > 0.60, step time quantiles < 210 ms) are appropriate for the model/hardware configuration
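The import-time TransformerEngine patch in the first bullet is a source-level rewrite of permutation.py. A minimal sketch of that pattern is below; the _OLD/_NEW strings are placeholders standing in for the real Triton dtype fix, and the function names are hypothetical:

```python
from pathlib import Path

# Placeholder strings; the actual patch targets a different expression.
_OLD = "dtype=int64"
_NEW = "dtype=tl.int64"

def patch_permutation_source(source: str) -> str:
    """Return patched source; idempotent, and a no-op if the target is absent."""
    if _NEW in source or _OLD not in source:
        return source  # already patched, or upstream code changed
    return source.replace(_OLD, _NEW)

def apply_patch(path: Path) -> bool:
    """Rewrite the file in place; returns True only if a change was made."""
    try:
        source = path.read_text()
    except FileNotFoundError:
        return False  # TE not installed at the expected location
    patched = patch_permutation_source(source)
    if patched == source:
        return False
    path.write_text(patched)
    return True
```

The guard conditions matter for exactly the reasons the review bullet lists: the patch must tolerate a missing install, an already-patched file, and upstream drift without failing the module import.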

Possibly related PRs

  • PR #1334 — Modifies nemo_rl/models/generation/vllm/vllm_worker.py with vLLM initialization/logprobs handling changes
  • PR #1389 — Modifies nemo_rl/models/policy/megatron_policy_worker.py with Megatron tensor-parallel and collective averaging helpers
  • PR #1514 — Updates same Megatron submodule references and applies changes to nemo_rl/models/policy/megatron_policy_worker.py

Suggested labels

CI, Run CICD

Suggested reviewers

  • yaoyu-33
  • terrykong
  • chtruong814

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Test Results For Major Changes (⚠️ Warning): The PR introduces a major GPT-OSS support feature but lacks documented test results, performance metrics, convergence data, or validation evidence, and the pre-checks checklist is uncompleted. Resolution: provide actual training results, performance metrics, numerics validation, evidence of TransformerEngine patch correctness, and complete the pre-checks checklist with local testing confirmation.

✅ Passed checks (5 passed)

  • Title check: The title 'feat: Add GPT-OSS support via mcore' directly and clearly summarizes the main objective of the pull request, matching the core change of adding GPT-OSS support through mcore.
  • Linked Issues check: The PR successfully implements Mcore support for GPT-OSS as required by issue #869, evidenced by submodule updates, model configuration, worker implementation, and comprehensive test additions.
  • Out of Scope Changes check: All changes are directly related to GPT-OSS Mcore support: submodule references, configuration files, worker modifications for model compatibility, and test infrastructure.
  • Docstring Coverage: Docstring coverage is 100.00%, which is sufficient (required threshold: 80.00%).
  • Description Check: Check skipped - CodeRabbit's high-level summary is enabled.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c371a9 and 716f05a.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • .gitmodules (1 hunks)
  • 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (1 hunks)
  • 3rdparty/Megatron-LM-workspace/Megatron-LM (1 hunks)
  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml (1 hunks)
  • nemo_rl/models/generation/vllm/vllm_worker.py (1 hunks)
  • nemo_rl/models/policy/megatron_policy_worker.py (1 hunks)
  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh (1 hunks)
  • tests/test_suites/release.txt (1 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

  • 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge
  • 3rdparty/Megatron-LM-workspace/Megatron-LM
  • tests/test_suites/release.txt
  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • .gitmodules
  • nemo_rl/models/policy/megatron_policy_worker.py
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Use uv run instead of python to execute scripts
Follow the Google Shell Style Guide for shell scripts

Files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
tests/test_suites/**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

tests/test_suites/**/*.sh: When adding support for a new model, create a corresponding driver shell script under tests/test_suites/ in the matching domain
Driver shell scripts should match the YAML base name with .sh extension and invoke training entrypoint with uv run

Files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
**/*.{py,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)

Files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
examples/configs/recipes/**/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

When adding support for a new model, create a recipe YAML under examples/configs/recipes/ in the appropriate domain subdirectory (llm, vlm, etc.)

Files:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
examples/configs/recipes/llm/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Recipe YAML files should follow the naming pattern: <algo>-<model>-<nodes>n<gpus>g-<strategy-and-params>[-modifiers][-long][.vN].yaml for LLM recipes

Files:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code

Files:

  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes

Files:

  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
🧠 Learnings (6)
📚 Learning: 2025-10-12T14:46:57.171Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.

Applied to files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
📚 Learning: 2025-10-12T14:46:55.513Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.

Applied to files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
📚 Learning: 2025-11-24T17:24:41.976Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-11-24T17:24:41.976Z
Learning: Applies to tests/test_suites/**/*.sh : Driver shell scripts should match the YAML base name with .sh extension and invoke training entrypoint with uv run

Applied to files:

  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
📚 Learning: 2025-11-24T17:24:41.976Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-11-24T17:24:41.976Z
Learning: Applies to examples/configs/recipes/llm/*.yaml : Recipe YAML files should follow the naming pattern: <algo>-<model>-<nodes>n<gpus>g-<strategy-and-params>[-modifiers][-long][.vN].yaml for LLM recipes

Applied to files:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
📚 Learning: 2025-11-24T17:24:41.976Z
Learnt from: CR
Repo: NVIDIA-NeMo/RL PR: 0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-11-24T17:24:41.976Z
Learning: Applies to examples/configs/recipes/vlm/*.yaml : Recipe YAML files should follow the naming pattern: vlm_<algo>-<model>-<nodes>n<gpus>g-<strategy>[-modifiers][.vN].yaml for VLM recipes

Applied to files:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
📚 Learning: 2025-09-19T03:00:58.662Z
Learnt from: shuo-nvidia
Repo: NVIDIA-NeMo/RL PR: 1006
File: examples/configs/recipes/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-fsdp2tp1.v1.yaml:85-101
Timestamp: 2025-09-19T03:00:58.662Z
Learning: In distillation and GRPO configurations, max_new_tokens is intentionally set to the full context window (max_total_sequence_length) for consistency across the codebase. Overflow cases when prompt + generation tokens exceed max_model_len are handled by safeguards implemented in vllm_worker.py.

Applied to files:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
🪛 GitHub Actions: CICD NeMo RL
examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml

[error] 1-1: minimize-check-llm: YAML not minimized. Suggested fix: minimize the file in-place.

🪛 Ruff (0.14.7)
nemo_rl/models/policy/megatron_policy_worker.py

81-81: Do not catch blind exception: Exception

(BLE001)

🪛 Shellcheck (0.11.0)
tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh

[warning] 6-6: NUM_NODES appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 9-9: NUM_RUNS appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 16-16: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[error] 28-28: Double quote array expansions to avoid re-splitting elements.

(SC2068)

🔇 Additional comments (5)
3rdparty/Megatron-LM-workspace/Megatron-LM (1)

1-1: Submodule pointer update only.

This file represents a git submodule reference update with no reviewable code logic. The AI summary indicates the PR adds GPT-OSS support through multiple implementation files (GRPO config, vLLM worker modifications, Megatron worker modifications, TransformerEngine patch, and test infrastructure), but only the submodule pointer is visible in this review context.

To provide a comprehensive review of the PR's changes, please include the following files mentioned in the AI summary:

  • examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/megatron_policy_worker.py
  • tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh
  • .gitmodules
tests/test_suites/release.txt (1)

26-28: ✓ Test entry follows the established pattern.

The new GPT-OSS GRPO test entry is properly formatted and consistently integrated into the release test suite. The section header, script reference, and spacing align with existing entries.

3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (1)

1-1: The submodule is correctly configured for the GPT-OSS branch.

The .gitmodules configuration confirms the Megatron-Bridge submodule is tracking the ashors/gpt-oss-tot branch as intended. The commit hash 1f1ba22275ab66433c6a61bcb93502432a91443f is a valid Git object properly registered as a submodule reference. No action is required.

tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh (2)

1-40: Script structure and execution approach are sound.

The overall structure follows the established test suite pattern: proper environment sourcing, standard configuration variables (NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS, NUM_MINUTES), correct use of uv run for both main execution and post-processing, and metrics validation via jq. The metrics thresholds and evaluation logic align with the GRPO evaluation strategy. Minor fixes above aside, the script provides good coverage for validating the GPT-OSS GRPO configuration.

The SC2034 warnings (unused variables on lines 6, 9, 10) are expected and documented as false positives per established patterns—these variables are consumed by external test infrastructure and common.env sourcing, so no action is needed on those.


16-16: Add error handling to cd command.

Line 16 performs a directory change without error handling. If the directory does not exist or the change fails, the script continues executing commands in the wrong location, potentially causing cascading failures.

Apply this diff to add error handling:

-cd $PROJECT_ROOT
+cd $PROJECT_ROOT || exit 1
⛔ Skipped due to learnings
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.
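The jq-based metrics gating the review describes for the test script boils down to comparing a few scalar metrics against fixed bounds. A minimal Python sketch, assuming hypothetical metric keys and the two unambiguous thresholds quoted in the review (token_mult_prob_error < 1.05, reward > 0.60):

```python
# Bounds mirror those quoted in the review; key names are assumed.
THRESHOLDS = {
    "token_mult_prob_error": ("max", 1.05),
    "reward": ("min", 0.60),
}

def check_metrics(metrics: dict[str, float]) -> list[str]:
    """Return a list of failure messages; an empty list means all gates passed."""
    failures = []
    for key, (kind, bound) in THRESHOLDS.items():
        value = metrics[key]
        if kind == "max" and value >= bound:
            failures.append(f"{key}={value} exceeds {bound}")
        elif kind == "min" and value <= bound:
            failures.append(f"{key}={value} below {bound}")
    return failures

print(check_metrics({"token_mult_prob_error": 1.01, "reward": 0.65}))  # []
```

Collecting every failure instead of stopping at the first mirrors how a release-gate script should report, so a single run surfaces all out-of-bounds metrics at once.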

@github-actions

github-actions bot commented Dec 5, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: 763c73f (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

github-actions bot commented Dec 5, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: d328112 (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 0ec5c16 (PR #1452 from ashors/gpt-oss-tot)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

@github-actions

❌ Submodule Fast-Forward Check Failed

Check based on commit: 5249e11 (PR #1452 from ashors/gpt-oss-tot)

❌ Submodules that need attention:

Megatron-LM: ❌ Commits have DIVERGED from a common ancestor
TARGET (main branch): https://github.com/terrykong/Megatron-LM/commits/25a62edf77b5130f888328ca8119d6a76117cf23/
CURRENT (PR #1452 from ashors/gpt-oss-tot): https://github.com/terrykong/Megatron-LM/commits/7933a950e879cd1d5af35840be86f547901dd9ed/

Please ensure all submodule commits are fast-forwards of the main branch before merging.

Signed-off-by: ashors1 <ashors@nvidia.com>
@terrykong terrykong enabled auto-merge (squash) December 17, 2025 01:17
Signed-off-by: ashors1 <ashors@nvidia.com>
@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Dec 17, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 17, 2025
@terrykong terrykong merged commit 441f745 into main Dec 17, 2025
40 of 41 checks passed
@terrykong terrykong deleted the ashors/gpt-oss-tot branch December 17, 2025 09:43
@ashors1 ashors1 mentioned this pull request Dec 18, 2025
DeL-TaiseiOzaki pushed a commit to DeL-TaiseiOzaki/RL that referenced this pull request Jan 8, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
parthmannan pushed a commit to parthmannan/RL that referenced this pull request Jan 15, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Parth Mannan <pmannan@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 12, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 8, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 8, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 9, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>

Labels

CI:L1 Run doctests, unit tests, and functional tests


Development

Successfully merging this pull request may close these issues.

gpt-oss Mcore support

2 participants