[fix] Example scripts miscellaneous enhancement by cuichenx · Pull Request #2362 · NVIDIA-NeMo/Megatron-Bridge

cuichenx · 2026-02-12T23:18:21Z

What does this PR do ?

Fix MTP example script.
Add note to upgrade to transformers v5 and not use uv run in ministral example scripts.
Add note to set WandB token or disable WandB.

Changelog

Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI sectionin the Contributing doc for how to trigger the CI. A Nvidia developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items you can still open "Draft" PR.

Additional Information

Related to # (issue)

Summary by CodeRabbit

Documentation
- Updated multi-token prediction training guide with configuration adjustments and setup instructions.
- Added setup guidance to training scripts, including WANDB API key configuration and Transformers v5 version requirements.
Chores
- Simplified launcher commands in example scripts for improved execution clarity.

Signed-off-by: Chen Cui <chcui@nvidia.com>

coderabbitai · 2026-02-12T23:22:03Z

📝 Walkthrough

Walkthrough

Updates documentation for multi-token prediction training with new import paths and configuration structure, moves MTP config keys under cfg.model, adds forward step parameter to training invocation, and adds Weights & Biases setup guidance comments to multiple VLM example scripts along with launcher command updates.

Changes

Cohort / File(s)	Summary
Multi-Token Prediction Documentation `docs/training/multi-token-prediction.md`	Updated import path to use `qwen3_30b_a3b_pretrain_config`, relocated MTP configuration keys from top-level `cfg` to `cfg.model`, added `forward_step` parameter to training invocation, and adjusted example configuration initialization.
VLM Script Wandb Guidance `examples/models/vlm/gemma3_vl/peft.sh`, `examples/models/vlm/gemma3_vl/sft.sh`, `examples/models/vlm/glm_45v/slurm_peft.sh`, `examples/models/vlm/glm_45v/slurm_sft.sh`, `examples/models/vlm/qwen3_vl/peft.sh`, `examples/models/vlm/qwen3_vl/sft.sh`	Added comment blocks instructing users to set `WANDB_API_KEY` or disable wandb logging before training; no functional changes.
Ministral3 Script Updates `examples/models/vlm/ministral3/conversion.sh`, `examples/models/vlm/ministral3/inference.sh`, `examples/models/vlm/ministral3/peft.sh`, `examples/models/vlm/ministral3/sft.sh`	Replaced `uv run` launcher invocations with direct `python` or `torchrun` commands, added comments noting Transformers v5 requirement and wandb guidance; hyperparameters added to peft.sh training call.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

[model] fix: correct GLM-4.5V inference parallelism for 46-layer model #2322: Modifies MTP configuration by setting m.config.mtp_num_layers at load time during inference, directly related to the relocation of MTP config keys to cfg.model in this PR.
[docs] add MTP guide #2138: Originally added the MTP training documentation that this PR now updates with new configuration structure and import paths.
cp: [docs, model] Add Ministral 3 Examples (2139) into r0.3.0 #2204: Modifies the same Ministral3 example scripts for launcher and configuration updates as this PR.

Suggested reviewers

kamran-nvidia
yaoyu-33
ko3n1g

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title is vague and generic, using the non-descriptive term 'miscellaneous enhancement' that does not clearly convey the specific changes made (MTP example fixes, Ministral 3 launcher updates, and WandB setup guidance).	Use a more specific title that describes the primary changes, such as 'Fix MTP example and update launcher commands for Ministral 3 scripts' or include the main objective more clearly.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`
Test Results For Major Changes	✅ Passed	Changes are minor documentation updates, comments, and example enhancements without affecting core training algorithms or numerics.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch chcui/misc_fixes

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@examples/models/vlm/ministral3/conversion.sh`:
- Around line 23-35: The script uses bare `python` to run the conversion and
round-trip commands (the `convert_checkpoints.py import` and
`convert_checkpoints.py export` invocations and the `python -m
torch.distributed.run ...` call), which deviates from the repo guideline to use
`uv run`; either change those three invocations to use `uv run python ...` or,
if there is a legitimate reason to call `python` directly (e.g., specific
interpreter/virtualenv requirements for distributed launch), add a brief inline
comment above each command explaining the intentional deviation so it is not
auto-corrected later.

In `@examples/models/vlm/ministral3/sft.sh`:
- Line 48: Replace the direct use of "torchrun" in the launch command in the
scripts/training invocation with the project's required "uv run" form (e.g., use
"uv run python -m torch.distributed.run" or "uv run torchrun" when invoking the
distributed launcher in the sft.sh recipe), updating the command that currently
calls torchrun in the line with scripts/training/run_recipe.py; if there is a
known incompatibility with "uv run" and transformers v5, instead keep the
existing torchrun but add a clear inline comment immediately above that command
explaining the exact incompatibility and why "uv run" cannot be used so future
contributors won't revert it.

🧹 Nitpick comments (3)

examples/models/vlm/glm_45v/slurm_sft.sh (1)

47-49: Minor duplication with existing WANDB guidance at line 96.

The new comment block duplicates the WANDB_API_KEY guidance already present in the "Authentication tokens" section (line 96). Not a blocker for an example script, but consider consolidating to avoid drift between the two locations.

examples/models/vlm/glm_45v/slurm_peft.sh (1)

47-49: Minor duplication with existing WANDB guidance at line 96.

Same as slurm_sft.sh — the auth tokens section already documents WANDB_API_KEY. Consider consolidating.
examples/models/vlm/ministral3/inference.sh (1)
19-20: Nit: Consider specifying the minimum version.

The comment says "requires transformers version 5" but the pip command would install whatever latest is available. Consider making it more explicit:
 # Note: Ministral 3 requires transformers version 5
-# pip install --upgrade transformers
+# pip install "transformers>=5"

examples/models/vlm/ministral3/conversion.sh

examples/models/vlm/ministral3/sft.sh

Signed-off-by: Chen Cui <chcui@nvidia.com>

cuichenx added 3 commits February 12, 2026 15:09

fix example script in mtp doc

7b30fd2

Signed-off-by: Chen Cui <chcui@nvidia.com>

add transformers v5 note to ministral scripts

c4abbb9

Signed-off-by: Chen Cui <chcui@nvidia.com>

add note on wandb in training scripts

8e1c9bc

Signed-off-by: Chen Cui <chcui@nvidia.com>

cuichenx added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 12, 2026

copy-pr-bot bot had a problem deploying to nemo-ci February 12, 2026 23:18 Error

copy-pr-bot bot had a problem deploying to test February 12, 2026 23:19 Error

cuichenx added the docs-only With great power comes great responsibility. label Feb 12, 2026

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

examples/models/vlm/ministral3/conversion.sh Outdated Show resolved Hide resolved

examples/models/vlm/ministral3/sft.sh Outdated Show resolved Hide resolved

add note why uv run is not used

604f928

Signed-off-by: Chen Cui <chcui@nvidia.com>

copy-pr-bot bot temporarily deployed to nemo-ci February 12, 2026 23:26 Inactive

remove duplicate wandb

7c60891

Signed-off-by: Chen Cui <chcui@nvidia.com>

copy-pr-bot bot temporarily deployed to nemo-ci February 12, 2026 23:28 Inactive

typo

6695869

Signed-off-by: Chen Cui <chcui@nvidia.com>

copy-pr-bot bot temporarily deployed to nemo-ci February 12, 2026 23:28 Inactive

cuichenx requested a review from yaoyu-33 February 12, 2026 23:29

use uv run no sync

93d6f35

Signed-off-by: Chen Cui <chcui@nvidia.com>

copy-pr-bot bot temporarily deployed to nemo-ci February 12, 2026 23:51 Inactive

yaoyu-33 approved these changes Feb 13, 2026

View reviewed changes

yaoyu-33 merged commit 65c02a5 into main Feb 13, 2026
24 checks passed

yaoyu-33 deleted the chcui/misc_fixes branch February 13, 2026 01:41

cuichenx added a commit that referenced this pull request Feb 14, 2026

[fix] Example scripts miscellaneous enhancement (#2362)

ac24d05

This was referenced Feb 14, 2026

[docs] Cherrypick 2267 2362 #2382

Merged

[misc] fix: Fix CLI_OVERRIDES quoting in slurm scripts and use uv wrapper #2467

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[fix] Example scripts miscellaneous enhancement#2362

[fix] Example scripts miscellaneous enhancement#2362
yaoyu-33 merged 7 commits intomainfrom
chcui/misc_fixes

cuichenx commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 12, 2026

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cuichenx commented Feb 12, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do ?

Changelog

GitHub Actions CI

Before your PR is "Ready for review"

Additional Information

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 12, 2026

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cuichenx commented Feb 12, 2026 •

edited by coderabbitai bot

Loading