remove deprecated MP params #2511

Merged
ko3n1g merged 2 commits into main from dpykhtar/remove_mp_args on Feb 25, 2026
Conversation

@dimapihtar (Contributor) commented Feb 24, 2026

What does this PR do ?

Removes deprecated parameters from the ModelParallel config, as they were removed upstream in Megatron-LM (MLM).
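The moe_extended_tp removal can be pictured with a minimal, hypothetical sketch of a TypedDict cleanup; the real ModelParallelKwargs in model_provider.py has more fields, and every field name below other than moe_extended_tp is illustrative:

```python
from typing import TypedDict

# Hypothetical sketch of the ModelParallelKwargs cleanup; field names other
# than moe_extended_tp are illustrative, not the real TypedDict contents.
class ModelParallelKwargs(TypedDict, total=False):
    tensor_model_parallel_size: int
    pipeline_model_parallel_size: int
    # moe_extended_tp: bool  # removed: deprecated upstream in Megatron-LM

# Callers that still pass the deprecated key now fail static type checking,
# while the runtime dict simply no longer carries the field.
kwargs: ModelParallelKwargs = {
    "tensor_model_parallel_size": 2,
    "pipeline_model_parallel_size": 1,
}
```

Because the field is gone from the TypedDict rather than merely defaulted, there is no longer a forced `False` override at load time, which matches the change described for load_megatron_model.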

Changelog

  • Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

Summary by CodeRabbit

  • Updates

    • Updated Megatron-LM submodule to latest version
  • Bug Fixes

    • Corrected distributed training parallelization configuration to use proper defaults instead of forced overrides
    • Improved model parallelization settings across multiple model providers
    • Fixed mixture of experts configuration handling

dimapihtar and others added 2 commits February 24, 2026 03:59
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
@copy-pr-bot bot commented Feb 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai bot (Contributor) commented Feb 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8896cd and f2ccbb6.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • 3rdparty/Megatron-LM
  • src/megatron/bridge/models/deepseek/deepseek_v2_bridge.py
  • src/megatron/bridge/models/deepseek/deepseek_v3_bridge.py
  • src/megatron/bridge/models/kimi/kimi_provider.py
  • src/megatron/bridge/models/model_provider.py
  • src/megatron/bridge/models/qwen_vl/qwen3_vl_provider.py
  • src/megatron/bridge/recipes/nemotronh/nemotron_3_nano.py
  • src/megatron/bridge/training/model_load_save.py
  • tests/unit_tests/peft/test_utils.py
  • tests/unit_tests/training/test_model_load_save.py
💤 Files with no reviewable changes (9)
  • tests/unit_tests/peft/test_utils.py
  • src/megatron/bridge/models/model_provider.py
  • src/megatron/bridge/models/qwen_vl/qwen3_vl_provider.py
  • src/megatron/bridge/models/kimi/kimi_provider.py
  • src/megatron/bridge/models/deepseek/deepseek_v2_bridge.py
  • tests/unit_tests/training/test_model_load_save.py
  • src/megatron/bridge/models/deepseek/deepseek_v3_bridge.py
  • src/megatron/bridge/recipes/nemotronh/nemotron_3_nano.py
  • src/megatron/bridge/training/model_load_save.py

📝 Walkthrough

Removes the async_tensor_model_parallel_allreduce configuration flag from multiple model provider bridges and training recipes, eliminates the moe_extended_tp field from ModelParallelKwargs and related code, and updates the Megatron-LM submodule pointer. These are configuration cleanup changes that streamline model parallel settings across the codebase.

Changes

Cohort / File(s) | Summary

Submodule Update
3rdparty/Megatron-LM
Updated the Megatron-LM git submodule pointer from commit 3d1a4ba to 23dd639.

Bridge Providers - Async Tensor Model Parallel Allreduce Removal
src/megatron/bridge/models/deepseek/deepseek_v2_bridge.py, src/megatron/bridge/models/deepseek/deepseek_v3_bridge.py, src/megatron/bridge/models/kimi/kimi_provider.py, src/megatron/bridge/models/qwen_vl/qwen3_vl_provider.py
Removed the async_tensor_model_parallel_allreduce field/assignment from provider configurations across the DeepSeek V2/V3, Kimi, and Qwen3VL model providers.

Training Configuration - MoE Extended TP Removal
src/megatron/bridge/models/model_provider.py, src/megatron/bridge/training/model_load_save.py
Removed the moe_extended_tp field from the ModelParallelKwargs TypedDict and eliminated the forced False assignment in the load_megatron_model function.

Recipes - Async Tensor Model Parallel Allreduce Removal
src/megatron/bridge/recipes/nemotronh/nemotron_3_nano.py
Removed the explicit enabling of async_tensor_model_parallel_allreduce from the pretrain and finetune configuration paths.

Test Utilities
tests/unit_tests/peft/test_utils.py, tests/unit_tests/training/test_model_load_save.py
Removed the async_tensor_model_parallel_allreduce initialization from MockModelParallelConfig and removed moe_extended_tp assignments and assertions from the model_load_save tests.
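Downstream code that still builds configs with the removed flags will break against the updated submodule. One defensive pattern (a sketch under assumed names, not code from this PR) is to filter deprecated keys out of a plain config dict before constructing the config object:

```python
# Sketch of a defensive cleanup step, not code from this PR: drop keys that
# the updated Megatron-LM no longer accepts before building a config object.
DEPRECATED_KEYS = frozenset({
    "async_tensor_model_parallel_allreduce",
    "moe_extended_tp",
})

def strip_deprecated(cfg: dict) -> dict:
    """Return a copy of cfg without keys that were removed upstream."""
    return {k: v for k, v in cfg.items() if k not in DEPRECATED_KEYS}

cfg = {
    "tensor_model_parallel_size": 4,
    "async_tensor_model_parallel_allreduce": True,  # deprecated
    "moe_extended_tp": False,                       # deprecated
}
clean = strip_deprecated(cfg)
```

This PR takes the cleaner route of deleting the assignments at their source, so no such shim is needed inside Megatron-Bridge itself.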

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

Run CICD

Suggested reviewers

  • ananthsub
  • kamran-nvidia
🚥 Pre-merge checks | ✅ 4 passed

  • Description Check: Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The title 'remove deprecated MP params' clearly and concisely describes the main change in the pull request: removing deprecated ModelParallel parameters across multiple configuration files and providers.
  • Docstring Coverage: Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.
  • Test Results For Major Changes: Passed. The PR qualifies as minor changes: a straightforward removal of 12 lines across 9 files, removing deprecated configuration parameters with no new logic, API changes, or complex refactoring.


@ko3n1g (Contributor) commented Feb 24, 2026

/ok to test f2ccbb6

@ko3n1g ko3n1g merged commit b058b66 into main Feb 25, 2026
56 checks passed
@ko3n1g ko3n1g deleted the dpykhtar/remove_mp_args branch February 25, 2026 10:36
pengdurice pushed a commit to pengdurice/Megatron-Bridge that referenced this pull request Feb 26, 2026
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: pengdurice <pengduhit@gmail.com>
pengdurice pushed a commit to pengdurice/Megatron-Bridge that referenced this pull request Feb 26, 2026
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: pengdurice <pengduhit@gmail.com>
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>