feat: add Megatron support for on-policy distillation #1324

Merged
terrykong merged 27 commits into NVIDIA-NeMo:main from zpqiu:feat-distillation-mcore
Oct 21, 2025

Conversation

zpqiu (Contributor) commented Oct 9, 2025

This pull request introduces support for Megatron-based training in the on-policy distillation pipeline. It adds new configuration files to enable Megatron parallelism and sequence packing, and removes previous restrictions against Megatron.

Issues

List issues that this PR closes (syntax):

Closes #1151

Usage

  • You can potentially add a usage example below
uv run examples/run_distillation_math.py --config examples/configs/distillation_math_megatron.yaml

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

Alignment Experiments between DTensor and Megatron Paths

  • Blue: DTensor path, baseline
  • Green: Megatron, CP + Sequence Packing + TP
  • Red: Megatron, CP + Sequence Packing + TP + PP
[Image: Screenshot 2025-10-12 22 24 09]

Summary by CodeRabbit

  • New Features

    • Enabled Megatron backend for distillation (student and teacher) with scheduler steps auto-aligned to run length.
    • Implemented top‑k logits for Megatron policies with support for sequence packing, tensor/pipeline/context parallelism.
  • Documentation

    • Removed outdated limitation indicating Megatron was unsupported.
  • Examples

    • Added comprehensive Megatron distillation configs and a Qwen3 32B→1.7B recipe.
  • Tests

    • Added functional and unit tests for Megatron distillation and top‑k; included a new nightly test entry and slightly increased the nightly GPU‑hour threshold.
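
The summary above mentions top-k logits computed under tensor parallelism, where the vocabulary dimension is split across ranks. As a rough illustration of the general idea only (not the code merged in this PR), each shard can pick its local top-k candidates, and the global top-k is then selected from the pooled candidates. The function names `local_topk` and `global_topk` and the two-shard layout are hypothetical, and plain Python lists stand in for sharded tensors:

```python
import heapq
from typing import List, Tuple


def local_topk(
    shard_logits: List[float], vocab_offset: int, k: int
) -> List[Tuple[float, int]]:
    """Return the top-k (logit, global_token_id) pairs of one vocabulary shard."""
    pairs = [(logit, vocab_offset + i) for i, logit in enumerate(shard_logits)]
    return heapq.nlargest(k, pairs, key=lambda p: p[0])


def global_topk(shards: List[List[float]], k: int) -> List[Tuple[float, int]]:
    """Merge per-shard top-k candidates into the global top-k.

    Any token in the global top-k must also be in the top-k of its own
    shard, so pooling k candidates per shard loses nothing.
    """
    candidates: List[Tuple[float, int]] = []
    offset = 0
    for shard in shards:
        candidates.extend(local_topk(shard, offset, k))
        offset += len(shard)  # the next shard covers the following token ids
    return heapq.nlargest(k, candidates, key=lambda p: p[0])


# Hypothetical setup: a vocabulary of 8 tokens split across 2 shards.
shards = [[0.1, 2.5, -1.0, 0.7], [3.2, 0.0, 2.9, -0.3]]
topk = global_topk(shards, k=3)

# Reference result computed on the unsharded logits.
flat = [x for shard in shards for x in shard]
reference = heapq.nlargest(3, [(x, i) for i, x in enumerate(flat)], key=lambda p: p[0])
```

In a real multi-GPU run the candidate pooling would be a collective operation across ranks rather than a Python loop; the sketch only shows why exchanging k candidates per shard is sufficient.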

Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu linked an issue Oct 9, 2025 that may be closed by this pull request
zpqiu added 5 commits October 10, 2025 02:46
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu added the CI:L1 Run doctests, unit tests, and functional tests label Oct 11, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Oct 11, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Oct 11, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Oct 11, 2025
zpqiu added 2 commits October 12, 2025 05:57
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu added the CI:L1 Run doctests, unit tests, and functional tests label Oct 17, 2025
@zpqiu zpqiu requested review from terrykong and yuki-97 October 20, 2025 15:09
@zpqiu zpqiu added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Oct 20, 2025
@terrykong terrykong merged commit 73c8725 into NVIDIA-NeMo:main Oct 21, 2025
41 of 42 checks passed
chtruong814 pushed a commit that referenced this pull request Oct 21, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: alexchiu <qiuzhaopeng@foxmail.com>
Signed-off-by: alexchiu <alexq@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
zpqiu added a commit to zpqiu/NeMo-RL that referenced this pull request Oct 26, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: alexchiu <qiuzhaopeng@foxmail.com>
Signed-off-by: alexchiu <alexq@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuki Huang <yukih@nvidia.com>
lbliii pushed a commit that referenced this pull request Nov 3, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: alexchiu <qiuzhaopeng@foxmail.com>
Signed-off-by: alexchiu <alexq@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
@coderabbitai coderabbitai bot mentioned this pull request Nov 27, 2025
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: alexchiu <qiuzhaopeng@foxmail.com>
Signed-off-by: alexchiu <alexq@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuki Huang <yukih@nvidia.com>
@coderabbitai coderabbitai bot mentioned this pull request Feb 12, 2026
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: alexchiu <qiuzhaopeng@foxmail.com>
Signed-off-by: alexchiu <alexq@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
Labels

CI:L1 (Run doctests, unit tests, and functional tests)
r0.4.0

Development

Successfully merging this pull request may close these issues.

[Feature Request] Megatron support for On-policy distillation

7 participants