
cp: feat: default packed_sequence to True in all finetune recipes (2284) into r0.3.0#2332

Closed
ko3n1g wants to merge 1 commit into r0.3.0 from cherry-pick-2284-r0.3.0

Conversation


@ko3n1g ko3n1g commented Feb 11, 2026

beep boop [🤖]: Hi @yaoyu-33 👋,

we've cherry-picked #2284 into r0.3.0 for you! 🚀

Please review and approve this cherry-pick at your convenience!

Summary by CodeRabbit

Release Notes

  • Configuration Changes
    • Sequence packing is now enabled by default during finetuning for most model architectures (Gemma, Qwen, Llama, GPT-OSS, Nemotron, Moonlight, and OLMoE), improving training efficiency and memory utilization.
    • GLM 4.5 keeps sequence packing disabled by default; users can override this setting as needed.
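For context, sequence packing concatenates several short samples into one fixed-length sequence so less compute is wasted on padding tokens. A minimal greedy sketch of the idea (illustrative only, not the Megatron-Bridge implementation):

```python
def pack_sequences(samples, max_len):
    """Greedily concatenate token lists into packs of at most max_len tokens."""
    packs, current = [], []
    for tokens in samples:
        # Start a new pack when the next sample would overflow the current one.
        if current and len(current) + len(tokens) > max_len:
            packs.append(current)
            current = []
        current = current + tokens
    if current:
        packs.append(current)
    return packs

# Three short samples fit into one 10-token pack instead of three padded rows.
packs = pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8]], max_len=10)
```

With padding to a fixed length of 10, the same three samples would occupy three mostly-empty rows; packing them reduces that to a single dense row.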

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>

copy-pr-bot bot commented Feb 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


ko3n1g commented Feb 11, 2026

/ok to test dca08ae


coderabbitai bot commented Feb 11, 2026

📝 Walkthrough

This PR updates the default value of the packed_sequence parameter from False to True across multiple model recipe finetune helper functions and the default_squad_config utility. The GLM4.5 recipe adds an assertion to enforce packed_sequence as False despite the signature change. One test file adds a mock attribute.

Changes

  • Gemma and Gemma3 Recipes (src/megatron/bridge/recipes/gemma/gemma2.py, src/megatron/bridge/recipes/gemma/gemma3.py): Updated the packed_sequence parameter default from False to True in the finetune helper functions.
  • GLM4.5 Recipe (src/megatron/bridge/recipes/glm/glm45.py): Added packed_sequence: bool to GLM45FinetuneKwargs, changed the default in _glm45_finetune_common to True, and added a runtime assertion (assert not packed_sequence) to enforce False behavior. The finetune config helpers explicitly set packed_sequence to False.
  • GPT-OSS Recipe (src/megatron/bridge/recipes/gpt_oss/gpt_oss.py): Updated the packed_sequence parameter default from False to True in _gpt_oss_finetune_common.
  • Llama Recipe (src/megatron/bridge/recipes/llama/llama3.py): Updated the packed_sequence parameter default from False to True in _llama3_finetune_common.
  • Moonlight Recipe (src/megatron/bridge/recipes/moonlight/moonlight_16b.py): Updated the packed_sequence parameter default from False to True in _moonlight_finetune_common.
  • Nemotron Recipes (src/megatron/bridge/recipes/nemotronh/nemotron_3_nano.py, src/megatron/bridge/recipes/nemotronh/nemotronh.py): Updated the packed_sequence parameter default from False to True in the finetune helper functions.
  • OLMoE Recipe (src/megatron/bridge/recipes/olmoe/olmoe_7b.py): Updated the packed_sequence parameter default from False to True in _olmoe_finetune_common.
  • Qwen Recipes (src/megatron/bridge/recipes/qwen/qwen2.py, src/megatron/bridge/recipes/qwen/qwen3.py, src/megatron/bridge/recipes/qwen/qwen3_moe.py, src/megatron/bridge/recipes/qwen/qwen3_next.py): Updated the packed_sequence parameter default from False to True in the finetune helper functions. Qwen3-Next also explicitly sets packed_sequence: False in its recommended config, with a documentation note.
  • Finetune Utilities (src/megatron/bridge/recipes/utils/finetune_utils.py): Updated the default_squad_config function signature to default the packed_sequence parameter to True instead of False.
  • Test Updates (tests/unit_tests/recipes/test_qwen_recipes.py): Added a context_parallel_size = 1 attribute to the mock model configuration used by the Qwen test suite.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Suggested labels

r0.3.0

Suggested reviewers

  • yaoyu-33
  • cuichenx
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
  • Test Results For Major Changes: ⚠️ Warning. The PR changes the default packed_sequence from False to True across 13+ recipe files, affecting training behavior, but lacks test results, convergence validation, or performance metrics. Resolution: add test results, convergence validation, and performance metrics to the PR documentation demonstrating no regressions across the affected models.

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly describes the main change: defaulting packed_sequence to True across all finetune recipes. It is specific, concise, and directly reflects the primary purpose of the changeset.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 90.00%, above the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/megatron/bridge/recipes/glm/glm45.py (1)

417-417: Default packed_sequence=True immediately triggers the assertion on line 446.

The signature defaults to True, but line 446 asserts not packed_sequence. Any direct call to _glm45_finetune_common() without explicitly passing packed_sequence=False will raise AssertionError. While the public configs both override this to False, the contradictory default is confusing and fragile.

Consider defaulting to False here (unlike the other recipes) since GLM 4.5 doesn't support packing, and preferably use raise ValueError(...) instead of assert for input validation that should not be stripped by -O.

Suggested fix
-    packed_sequence: bool = True,
+    packed_sequence: bool = False,

And on line 446:

-    assert not packed_sequence, "Packed sequence is not supported for GLM 4.5 finetuning"
+    if packed_sequence:
+        raise ValueError("Packed sequence is not supported for GLM 4.5 finetuning")

Also applies to: 446-446
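The reviewer's preference for raise over assert is worth underlining: CPython strips assert statements when run with python -O, so assert-based input validation can silently vanish in optimized runs. A hedged sketch of the suggested guard (the standalone function name is hypothetical; in the PR this check lives inside _glm45_finetune_common):

```python
def validate_packed_sequence(packed_sequence: bool) -> None:
    # A plain raise survives `python -O`, unlike `assert not packed_sequence`,
    # which is compiled away when __debug__ is False.
    if packed_sequence:
        raise ValueError("Packed sequence is not supported for GLM 4.5 finetuning")

validate_packed_sequence(False)  # no-op for the supported configuration
```

Combined with a False default in the GLM 4.5 signature, this keeps direct calls to the helper from failing out of the box.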



@ko3n1g ko3n1g marked this pull request as draft February 11, 2026 18:22
@cuichenx cuichenx marked this pull request as draft February 17, 2026 18:55
@cuichenx
Contributor

need thorough testing before merging


ko3n1g commented Feb 26, 2026

merged via the finetune PR

@ko3n1g ko3n1g closed this Feb 26, 2026
