
Support Qwen3-ASR Megatron Bridge#2836

Open
yuekaizhang wants to merge 7 commits into NVIDIA-NeMo:main from yuekaizhang:qwen3-asr

Conversation

@yuekaizhang
Contributor

@yuekaizhang yuekaizhang commented Mar 17, 2026

What does this PR do ?

Support https://github.com/QwenLM/Qwen3-ASR in M-bridge.

Changelog

  • Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for Qwen3-ASR model, enabling audio speech recognition capabilities.
    • Enhanced model architecture lookup with fallback mechanism for custom and string-registered models.
  • Tests

    • Added comprehensive unit tests for the new ASR model functionality, including model freezing, embeddings, and tensor handling.

- Add Qwen3-ASR model bridge with audio encoder, thinker model, and RoPE
- Add transformer config conversion for Qwen3-ASR
- Add provider bridge with kv_channels support
- Fix auto_bridge to support custom models not in transformers
- Fix bridge registration to use string-based source
- Fix dtype mismatch in audio encoder forward pass
- Add unit tests for Qwen3-ASR model

Signed-off-by: zhangyuekai <zhangyuekai@foxmail.com>
Signed-off-by: root <root@h20-2.cm.cluster>
@copy-pr-bot

copy-pr-bot bot commented Mar 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yuekaizhang
Contributor Author

Megatron Inference Results:

AUDIO_URL_3="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-Audio/audio/1272-128104-0000.flac"

assistant\nlanguage English<asr_text>Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

📝 Walkthrough

Walkthrough

This pull request introduces Qwen3-ASR (automatic speech recognition) support to the Megatron framework. It adds a complete new model implementation that combines a HuggingFace audio encoder with a Qwen3-based language model, includes a bridge for converting HuggingFace Qwen3-ASR models, and integrates the new architecture into the framework with fallback registry support.

Changes

Cohort / File(s) Summary
ASR Model Implementation
src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/__init__.py, src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/model.py, src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/thinker_model.py, src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/rope.py, src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/transformer_config.py
New model components for Qwen3-ASR: wrapper model (Qwen3ASRModel), combined audio+language thinker model with audio encoder integration, MRope position indexing, and ASR-specific transformer configuration with audio token IDs and multimodal rope settings.
ASR Bridge and Provider
src/megatron/bridge/models/qwen3_asr/__init__.py, src/megatron/bridge/models/qwen3_asr/qwen3_asr_bridge.py, src/megatron/bridge/models/qwen3_asr/qwen3_asr_provider.py
Bridge registration, HuggingFace-to-Megatron model conversion with parameter mapping (QKV, gated MLP), and provider for constructing Qwen3-ASR models with audio encoder, language model, parallelism, and freezing configurations.
Framework Integration
src/megatron/bridge/models/__init__.py, src/megatron/bridge/models/conversion/auto_bridge.py
Exposed new ASR entities (Qwen3ASRBridge, Qwen3ASRModel, Qwen3ASRModelProvider) in public API; added fallback registry lookup in AutoBridge for custom/string-registered architectures to support models not directly in transformers.
Unit Tests
tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/test_qwen3_asr_model.py
Test suite covering model instantiation, freezing API with language/audio parameter control, shared embedding access, and input tensor routing for pre/post-processing stages.
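The fallback registry lookup noted under Framework Integration can be illustrated with a minimal sketch. All names below (BRIDGE_REGISTRY, register_bridge, lookup_bridge) are hypothetical stand-ins, not the actual AutoBridge internals:

```python
# Minimal sketch of a string-keyed bridge registry with fallback lookup.
# BRIDGE_REGISTRY, register_bridge, and lookup_bridge are illustrative
# names only; the real AutoBridge implementation differs.

BRIDGE_REGISTRY: dict = {}

def register_bridge(source):
    """Register a bridge under a class object or an architecture string."""
    def decorator(cls):
        BRIDGE_REGISTRY[source] = cls
        return cls
    return decorator

def lookup_bridge(hf_config: dict):
    """Fall back to the architecture string recorded in the HF config
    when the model class is not importable from transformers."""
    arch = hf_config["architectures"][0]
    bridge = BRIDGE_REGISTRY.get(arch)
    if bridge is None:
        raise KeyError(f"no bridge registered for {arch!r}")
    return bridge

@register_bridge("Qwen3ASRForConditionalGeneration")
class Qwen3ASRBridge:
    pass

cfg = {"architectures": ["Qwen3ASRForConditionalGeneration"]}
assert lookup_bridge(cfg) is Qwen3ASRBridge
```

String keys let custom architectures register themselves without requiring the corresponding transformers class to exist at import time.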

Sequence Diagram(s)

sequenceDiagram
    participant Client as Model Forward
    participant ThinkerModel as Qwen3ASRThinkerModel
    participant AudioEnc as Audio Encoder
    participant LanguageModel as Qwen3VLGPTModel
    participant RopeIdx as RopeIndex
    
    Client->>ThinkerModel: forward(input_ids, input_features, ...)
    activate ThinkerModel
    
    ThinkerModel->>RopeIdx: get_rope_index(input_ids, attention_mask)
    RopeIdx-->>ThinkerModel: position_ids, mrope_deltas
    
    ThinkerModel->>AudioEnc: get_audio_features(input_features)
    activate AudioEnc
    AudioEnc-->>ThinkerModel: audio_embeddings
    deactivate AudioEnc
    
    ThinkerModel->>ThinkerModel: merge audio embeddings at audio token positions
    ThinkerModel->>LanguageModel: forward(decoder_input, position_ids, attention_mask, labels, ...)
    activate LanguageModel
    LanguageModel-->>ThinkerModel: output logits/loss
    deactivate LanguageModel
    
    ThinkerModel-->>Client: model output
    deactivate ThinkerModel
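The "merge audio embeddings at audio token positions" step in the diagram can be sketched as a masked scatter. This is a hypothetical illustration: AUDIO_TOKEN_ID and the tensor shapes are placeholders, not the real Qwen3-ASR values.

```python
import torch

# Hypothetical sketch of merging audio features into the text embedding
# sequence: positions whose token id equals AUDIO_TOKEN_ID are
# overwritten with the encoder's audio embeddings.
AUDIO_TOKEN_ID = 151646  # placeholder id, not the real Qwen3-ASR value

def merge_audio_embeddings(input_ids, text_embeds, audio_embeds):
    # input_ids: [batch, seq]; text_embeds: [batch, seq, hidden]
    # audio_embeds: [num_audio_tokens, hidden], flattened over the batch
    mask = input_ids == AUDIO_TOKEN_ID           # [batch, seq] boolean
    merged = text_embeds.clone()
    merged[mask] = audio_embeds.to(merged.dtype)  # fill the audio slots
    return merged

ids = torch.tensor([[1, AUDIO_TOKEN_ID, AUDIO_TOKEN_ID, 2]])
text = torch.zeros(1, 4, 8)
audio = torch.ones(2, 8)
out = merge_audio_embeddings(ids, text, audio)
assert out[0, 1].sum().item() == 8.0  # audio slot filled
assert out[0, 0].sum().item() == 0.0  # text slot untouched
```

The `.to(merged.dtype)` cast matters in practice, since the encoder and language model may run in different precisions (the PR changelog mentions a dtype mismatch fix in the audio encoder forward pass).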

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 79.31%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Test Results For Major Changes: ⚠️ Warning. The PR introduces major Qwen3-ASR changes but lacks test execution results, performance metrics, and complete changelog documentation. Resolution: update the PR description with test results, performance analysis, and a complete changelog, and address architectural concerns before marking ready for review.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "Support Qwen3-ASR Megatron Bridge" clearly and concisely describes the main change, adding support for Qwen3-ASR in the Megatron-Bridge repository.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (3)
tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/test_qwen3_asr_model.py (1)

84-85: Add the unit-test marker.

These new tests live under tests/unit_tests/..., but the class is only tagged with timeout.

Suggested fix
-@pytest.mark.timeout(30)
+@pytest.mark.unit
+@pytest.mark.timeout(30)
 class TestQwen3ASRModel:
As per coding guidelines, `tests/**/*.py`: Use pytest markers to categorize tests (unit, integration, system).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/test_qwen3_asr_model.py`
around lines 84 - 85, The test class TestQwen3ASRModel is missing the unit test
marker; update the class decorators to include the pytest marker for unit tests
(e.g., add `@pytest.mark.unit` alongside the existing `@pytest.mark.timeout`(30)),
or define a module-level pytestmark = [pytest.mark.unit,
pytest.mark.timeout(30)] so the tests under TestQwen3ASRModel are correctly
categorized as unit tests per the test guidelines.
src/megatron/bridge/models/qwen3_asr/qwen3_asr_provider.py (1)

48-48: Simplify default_factory - lambda wrapper is unnecessary.

The lambda is redundant when the factory is just calling the class constructor with no arguments.

Suggested simplification
-    thinker_config: Qwen3ASRThinkerConfig = field(default_factory=lambda: Qwen3ASRThinkerConfig())
+    thinker_config: Qwen3ASRThinkerConfig = field(default_factory=Qwen3ASRThinkerConfig)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/models/qwen3_asr/qwen3_asr_provider.py` at line 48, The
field definition for thinker_config uses a redundant lambda as default_factory;
update the dataclass field declaration to use the constructor directly by
setting default_factory=Qwen3ASRThinkerConfig so replace
"default_factory=lambda: Qwen3ASRThinkerConfig()" with
"default_factory=Qwen3ASRThinkerConfig" on the thinker_config field to simplify
the code.
src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/model.py (1)

83-97: Add type hints for untyped parameters.

Per coding guidelines, all function arguments should have type hints. The following parameters are missing type annotations:

  • input_features
  • feature_attention_mask
  • audio_feature_lengths
Suggested type annotations
     def forward(
         self,
         input_ids: torch.Tensor,
-        input_features=None,
+        input_features: torch.Tensor | None = None,
         position_ids: torch.Tensor | None = None,
         attention_mask: torch.Tensor | None = None,
         labels: torch.Tensor | None = None,
         loss_mask: torch.Tensor | None = None,
         inference_params: InferenceParams | None = None,
         packed_seq_params: PackedSeqParams | None = None,
         extra_block_kwargs: dict | None = None,
-        feature_attention_mask=None,
-        audio_feature_lengths=None,
+        feature_attention_mask: torch.Tensor | None = None,
+        audio_feature_lengths: torch.Tensor | None = None,
         **kwargs,
     ) -> torch.Tensor:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/model.py` around
lines 83 - 97, The forward signature in modeling_qwen3_asr.model.py is missing
type hints for input_features, feature_attention_mask, and
audio_feature_lengths; update the forward method signature (function name:
forward) to add explicit types—e.g. input_features: torch.Tensor | None,
feature_attention_mask: torch.Tensor | None, and audio_feature_lengths:
torch.Tensor | Sequence[int] | None (or int[]-like type your codebase uses)—and
keep existing types for input_ids, position_ids, attention_mask, labels,
loss_mask, inference_params: InferenceParams | None, and packed_seq_params:
PackedSeqParams | None so static checkers and IDEs can validate usage. Ensure
imports/types used are available in the module or add them if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/megatron/bridge/models/__init__.py`:
- Around line 124-128: The top-level imports of Qwen3ASRBridge, Qwen3ASRModel,
and Qwen3ASRModelProvider currently force-import the external qwen_asr package;
either add qwen_asr to project dependencies in pyproject.toml or guard the
imports with lazy loading—implement a module-level __getattr__ (or try/except
import inside a function) that imports and returns Qwen3ASRBridge,
Qwen3ASRModel, and Qwen3ASRModelProvider on-demand, and apply the same change
for the other import block referenced (lines around the second occurrence).
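The suggested lazy-import guard relies on PEP 562 module-level `__getattr__`. The sketch below demonstrates the mechanism with a throwaway in-memory module so it runs standalone; in the real fix, `__getattr__` would live in `src/megatron/bridge/models/__init__.py` and perform the deferred `qwen_asr` import:

```python
import sys
import types

# Demonstration of PEP 562 lazy attribute access: the expensive/optional
# import is deferred until the symbol is first requested. demo_pkg and
# its contents are stand-ins for the real package.
mod = types.ModuleType("demo_pkg")
mod._loaded = []  # records which lazy symbols have been materialized

def _lazy_getattr(name):
    if name == "Qwen3ASRBridge":
        # In the real module this is where `from . import qwen3_asr`
        # would run, wrapped in try/except to surface a clear
        # ImportError if the optional dependency is missing.
        mod._loaded.append(name)
        return object  # placeholder for the bridge class
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")

mod.__getattr__ = _lazy_getattr
sys.modules["demo_pkg"] = mod

import demo_pkg

assert demo_pkg._loaded == []        # nothing imported at module load
_ = demo_pkg.Qwen3ASRBridge          # first access triggers the lazy path
assert demo_pkg._loaded == ["Qwen3ASRBridge"]
```

This keeps `import megatron.bridge.models` working for users who never touch the ASR entry points and don't have the optional dependency installed.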

In `@src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/thinker_model.py`:
- Around line 233-261: The bug is that position_ids remain global while
combined_embeddings is split for context-parallel (CP) ranks; when cp_size>1 and
packed_seq_params is None you must shard position_ids the same way as
combined_embeddings so each CP rank gets matching RoPE positions. After
computing position_ids via get_rope_index (and before applying SP
padding/scatter), detect the same cp_size/cp_rank condition used for
combined_embeddings and call the same splitter (split_data_cp_rank) on
position_ids (preserving the same dims/order), then continue with the existing
SP padding/replicate padding logic so position_ids and combined_embeddings stay
aligned; ensure you reference combined_embeddings, split_data_cp_rank, cp_size,
cp_rank, position_ids, get_rope_index, and packed_seq_params when editing.
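The alignment requirement above can be shown with a small sketch. `split_data_cp_rank` here is a stand-in for the real helper, and a plain contiguous split along the sequence dimension is used purely for illustration:

```python
import torch

# Hypothetical sketch: whatever splitter shards the combined embeddings
# across context-parallel ranks must also shard position_ids, so each
# rank's RoPE positions match its embedding slice.
def split_data_cp_rank(x, cp_size, cp_rank, seq_dim):
    # Stand-in splitter: contiguous chunking along the sequence dim.
    return torch.chunk(x, cp_size, dim=seq_dim)[cp_rank]

seq = 8
embeddings = torch.randn(seq, 1, 16)           # [s, b, h]
position_ids = torch.arange(seq).view(1, seq)  # [b, s]

cp_size, cp_rank = 2, 1
emb_shard = split_data_cp_rank(embeddings, cp_size, cp_rank, seq_dim=0)
pos_shard = split_data_cp_rank(position_ids, cp_size, cp_rank, seq_dim=1)

# Rank 1 holds matching halves: positions 4..7 pair with the second
# half of the embedding sequence.
assert emb_shard.shape[0] == pos_shard.shape[1] == seq // cp_size
assert pos_shard.tolist() == [[4, 5, 6, 7]]
```

If `position_ids` were left global, rank 1 would apply RoPE angles for positions 0..3 to embeddings that actually sit at positions 4..7.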
- Around line 55-84: The constructor currently types pg_collection as optional
but immediately dereferences it (self.cp_group = pg_collection.cp etc.), causing
AttributeError; either make pg_collection a required parameter (remove the "= None" and update callers) or guard/materialize defaults before use by resolving
None" and update callers) or guard/mater ialize defaults before use by resolving
a default ProcessGroupCollection (e.g., via your parallel_state default getter)
and assigning self.pg_collection to that resolved instance before setting
self.cp_group, self.tp_group, self.pp_group and self.embd_group; also keep the
existing assert for embd but ensure it runs against the non-None
self.pg_collection.
- Around line 163-180: In the loop in thinker_model.py where you iterate "for
input_feature, feature_len in zip(input_features, feature_lens):", convert the
0-dim tensor feature_len to a Python int before using it for slicing (e.g., use
feature_len.item() for the slice limit) and pass a proper 1-D tensor/shape to
the audio model for feature_lens (e.g., construct a
torch.tensor([feature_len_value], device=input_feature.device, dtype=torch.long)
or otherwise match expected type), and change the zip call to
zip(input_features, feature_lens, strict=True) to enforce batch-size
consistency; update references to feature_len used for slicing and for
feature_lens argument accordingly.

In
`@tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/test_qwen3_asr_model.py`:
- Around line 88-113: The teardown currently destroys the global distributed
process group even when setup_class skipped initialization; modify setup_class
to record ownership (e.g., set a class attribute like cls._owns_process_group =
True only when you call dist.init_process_group) and ensure teardown_class only
calls dist.destroy_process_group() if dist.is_initialized() and
cls._owns_process_group is True; set the flag to False when skipping init so you
don't destroy pre-initialized groups, and clear or reset the flag after
destroying in teardown_class.
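The ownership-flag pattern the reviewer asks for can be sketched as below. In the real tests `dist` is `torch.distributed`; a tiny fake is used here so the sketch runs without a distributed backend:

```python
# Sketch of process-group ownership in test setup/teardown: only the
# class that initialized the group destroys it. FakeDist stands in for
# torch.distributed.
class FakeDist:
    def __init__(self):
        self._init = False
    def is_initialized(self):
        return self._init
    def init_process_group(self):
        self._init = True
    def destroy_process_group(self):
        self._init = False

dist = FakeDist()

class TestQwen3ASRModel:
    _owns_process_group = False

    @classmethod
    def setup_class(cls):
        if not dist.is_initialized():
            dist.init_process_group()
            cls._owns_process_group = True  # we created it, we destroy it

    @classmethod
    def teardown_class(cls):
        if dist.is_initialized() and cls._owns_process_group:
            dist.destroy_process_group()
            cls._owns_process_group = False

# A pre-initialized group is left alone by the tests:
dist.init_process_group()
TestQwen3ASRModel.setup_class()
TestQwen3ASRModel.teardown_class()
assert dist.is_initialized()
```

Without the flag, running these tests inside a session that already initialized the group would tear it down from under later tests.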


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dd9f3779-471e-44ea-80e2-f4c687b581c9

📥 Commits

Reviewing files that changed from the base of the PR and between 589cee4 and cd0ef60.

📒 Files selected for processing (13)
  • src/megatron/bridge/models/__init__.py
  • src/megatron/bridge/models/conversion/auto_bridge.py
  • src/megatron/bridge/models/qwen3_asr/__init__.py
  • src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/__init__.py
  • src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/model.py
  • src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/rope.py
  • src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/thinker_model.py
  • src/megatron/bridge/models/qwen3_asr/modeling_qwen3_asr/transformer_config.py
  • src/megatron/bridge/models/qwen3_asr/qwen3_asr_bridge.py
  • src/megatron/bridge/models/qwen3_asr/qwen3_asr_provider.py
  • tests/unit_tests/models/qwen3_asr/__init__.py
  • tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/__init__.py
  • tests/unit_tests/models/qwen3_asr/modeling_qwen3_asr/test_qwen3_asr_model.py

Signed-off-by: root <zhangyuekai@foxmail.com>
@yaoyu-33
Contributor

/claude review

@yaoyu-33 added the needs-follow-up (Issue needs follow-up), needs-author (Author action is required before review or merge can continue), and area:model (Model implementations and HF bridge logic) labels on Mar 18, 2026
@yaoyu-33
Contributor

Please check AI comments.

@chtruong814 removed the needs-follow-up (Issue needs follow-up) label on Mar 18, 2026
Signed-off-by: root <zhangyuekai@foxmail.com>
Signed-off-by: root <zhangyuekai@foxmail.com>
@yuekaizhang
Contributor Author

Please check AI comments.

Done. Many thanks for the review.

@yaoyu-33 added the needs-follow-up (Issue needs follow-up) and needs-author (Author action is required before review or merge can continue) labels and removed the needs-author label on Mar 19, 2026
@yaoyu-33
Contributor

@yuekaizhang need to add functional tests and an L0 bash script. This is required for all new models.

@chtruong814 removed the needs-follow-up (Issue needs follow-up) label on Mar 19, 2026
Signed-off-by: root <zhangyuekai@foxmail.com>
@yuekaizhang
Contributor Author

@yuekaizhang need to add functional tests and an L0 bash script. This is required for all new models.

Done.

@chtruong814 added the needs-follow-up (Issue needs follow-up) label on Mar 22, 2026
@chtruong814 removed the needs-follow-up (Issue needs follow-up) label on Mar 23, 2026
@yuekaizhang
Contributor Author

@chtruong814 @yaoyu-33 All issues have been resolved. Could you please help review and trigger the CI/CD tests? Thanks.

@chtruong814 added the needs-follow-up (Issue needs follow-up) label on Mar 26, 2026
@suiyoubi
Contributor

/ok to test 78e019f


Labels

area:model (Model implementations and HF bridge logic), community-request, needs-author (Author action is required before review or merge can continue)


5 participants