[Canonical LoRA] fix: use correct q_out_features for `linear_q` by HollowMan6 · Pull Request #1627 · NVIDIA-NeMo/Megatron-Bridge

HollowMan6 · 2025-12-07T12:50:07Z

What does this PR do ?

Fix the following error:

  File "megatron/core/transformer/transformer_layer.py", line 455, in forward
    hidden_states, context = self._forward_attention(*args, **kwargs)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "megatron/core/transformer/transformer_layer.py", line 529, in _forward_attention
    attention_output_with_bias = self.self_attention(
                                 ^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "megatron/core/transformer/attention.py", line 768, in forward
    qkv_output = self.get_query_key_value_tensors(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "megatron/core/transformer/attention.py", line 1151, in get_query_key_value_tensors
    mixed_qkv, _ = self.linear_qkv(hidden_states)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "Megatron-Bridge/src/megatron/bridge/peft/canonical_lora.py", line 88, in forward
    return linear_output + adapter_output, bias
           ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (2560) must match the size of tensor b (1536) at non-singleton dimension 2
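
The `RuntimeError` above is the standard element-wise shape check: two tensors can be added only if each pair of dimensions is equal or one of them is 1. As a rough illustration, here is a pure-Python sketch of that rule. The helper name `first_mismatched_dim` and the leading dimensions of the shapes are hypothetical; only the sizes 2560 and 1536 and the dimension index 2 come from the traceback.

```python
from typing import Optional, Sequence


def first_mismatched_dim(a_shape: Sequence[int], b_shape: Sequence[int]) -> Optional[int]:
    """Return the index of the first non-broadcastable dimension, or None.

    Simplification for this sketch: both shapes must have the same rank.
    """
    if len(a_shape) != len(b_shape):
        raise ValueError("this sketch only handles shapes of equal rank")
    # Walk from the trailing dimension backwards, as broadcasting does.
    for dim in reversed(range(len(a_shape))):
        a, b = a_shape[dim], b_shape[dim]
        # Sizes are compatible when they are equal or one of them is 1.
        if a != b and a != 1 and b != 1:
            return dim
    return None


# Hypothetical [sequence, batch, features] shapes; only the last sizes are from the log.
linear_output_shape = (16, 1, 2560)
adapter_output_shape = (16, 1, 1536)
mismatch = first_mismatched_dim(linear_output_shape, adapter_output_shape)
print(f"mismatch at dimension {mismatch}")
```

In this sketch the feature dimension is the one that disagrees, which matches the "non-singleton dimension 2" reported for `linear_output + adapter_output`.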

Changelog

Use `m.config.kv_channels * m.config.num_attention_heads` to calculate `q_out_features` instead of using `in_features` directly.
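
To make the changelog entry concrete, the following minimal sketch compares the two ways of sizing the query projection. It is not the code from `canonical_lora.py`: `AttnConfig` is a hypothetical stand-in for the model config, the numeric values are illustrative, and `in_features` is assumed here to equal the hidden size.

```python
from dataclasses import dataclass


@dataclass
class AttnConfig:
    """Hypothetical stand-in for the model config; not the real Megatron class."""

    hidden_size: int
    num_attention_heads: int
    kv_channels: int  # per-head projection width


def q_out_features_before(in_features: int) -> int:
    # Old behaviour per the changelog: reuse the layer's input width.
    return in_features


def q_out_features_after(config: AttnConfig) -> int:
    # New behaviour per the changelog: per-head width times number of heads.
    return config.kv_channels * config.num_attention_heads


# Illustrative values where the per-head width is not hidden_size / num_heads.
config = AttnConfig(hidden_size=2560, num_attention_heads=32, kv_channels=128)
before = q_out_features_before(config.hidden_size)
after = q_out_features_after(config)
print(before, after)
```

The two computations agree whenever `kv_channels * num_attention_heads` happens to equal the input width, so the difference only shows up for configurations where the per-head width is set independently of the hidden size.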

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

Related to # (issue)

✨ Presented to you with Mind Lab - A Lab for Experiential Intelligence.


copy-pr-bot · 2025-12-07T12:50:11Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

yaoyu-33 · 2025-12-07T19:14:46Z

/ok to test 290e577


### What does this PR do?

Now that several important fixes have been merged into Megatron-Bridge, it's better to update the instructions so that everything can really work correctly.

Related to:
- NVIDIA-NeMo/Megatron-Bridge#1564
- NVIDIA-NeMo/Megatron-Bridge#1603
- NVIDIA-NeMo/Megatron-Bridge#1627
- NVIDIA-NeMo/Megatron-Bridge#1628

Fix: #4303

### Checklist Before Starting

- [X] Search for similar PRs. Paste at least one query link here: ...
- [X] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [X] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [X] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [X] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [X] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [X] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

✨ Presented to you with <a href="https://macaron.im/mindlab">Mind Lab</a> - A Lab for Experiential Intelligence.

Signed-off-by: Hollow Man <hollowman@opensuse.org>


github-actions bot added the community-request label Dec 7, 2025

HollowMan6 mentioned this pull request Dec 7, 2025

[LoRA] Fix LoRA merge and support CanonicalLoRA merge #1603

Merged

5 tasks

yaoyu-33 approved these changes Dec 7, 2025


copy-pr-bot bot temporarily deployed to nemo-ci December 7, 2025 19:15 Inactive

yaoyu-33 enabled auto-merge (squash) December 7, 2025 19:15

copy-pr-bot bot temporarily deployed to test December 7, 2025 19:15 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci December 7, 2025 21:13 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci December 7, 2025 21:24 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci December 7, 2025 21:30 Inactive

yaoyu-33 merged commit 322df45 into NVIDIA-NeMo:main Dec 8, 2025
47 checks passed

matthew-frank pushed a commit to matthew-frank/Megatron-Bridge that referenced this pull request Dec 8, 2025

[Canonical LoRA] fix: use correct q_out_features for linear_q (NVID…

4758d40

…IA-NeMo#1627) Signed-off-by: Hollow Man <hollowman@opensuse.org>

HollowMan6 deleted the canonical_lora_q branch December 8, 2025 21:32

HollowMan6 mentioned this pull request Dec 15, 2025

[megatron,ci] chore: update instructions and scripts for LoRA verl-project/verl#4533

Merged

7 tasks

yaoyu-33 merged 1 commit into NVIDIA-NeMo:main from HollowMan6:canonical_lora_q
