add qwen3 to megatron conversion #802

abdalgader-a · 2025-07-30T12:20:49Z

IMPORTANT:
The NeMo submodule should be updated as in this PR: NVIDIA-NeMo/NeMo#14378

What does this PR do?

This PR allows exporting the qwen3 model type from Megatron format to HF.

Test

Tested locally by running the Megatron conversion script on Qwen3-1.7B.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

Signed-off-by: Abdalgader Abubaker <[email protected]>

ashors1 · 2025-07-31T22:34:22Z

Thank you for the contribution! It looks like the NeMo submodule change touched a lot of unrelated files. Would it be possible to revert those unrelated changes and only keep the relevant ones?

abdalgader-a · 2025-08-01T07:14:08Z

Thanks @ashors1 for reviewing. My apologies, it seems I mistakenly merged other changes. I have now raised a new PR (#14378) in the NeMo submodule with only the relevant changes, and I have closed the wrong PR.

ashors1 · 2025-08-01T21:42:39Z

nemo_rl/models/megatron/community_import.py


        exporter_cls = HFQwen2Exporter
+
+    elif hf_config.model_type == "qwen3":


Suggested change

-    elif hf_config.model_type == "qwen3":
+    elif hf_config.model_type in ("qwen3", "qwen3_moe"):

Ah, actually I just tested this and it looks like this doesn't work with MoE models at the moment. Could we extend this to support MoE? If that's too much work, we could always add MoE support in a follow-up PR.
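
For illustration only, the dispatch being discussed can be sketched as a lookup from `hf_config.model_type` to an exporter class. This is a minimal sketch, not the code in `community_import.py`: the two exporter classes are empty stand-ins for the real NeMo classes, and `FakeHFConfig`, `EXPORTERS`, and `select_exporter_cls` are hypothetical names introduced here. The `"qwen2"` key is assumed to be the model type that maps to `HFQwen2Exporter`. `"qwen3_moe"` is deliberately left out of the table, so an unsupported MoE checkpoint fails with an explicit error instead of being routed to the dense exporter.

```python
from dataclasses import dataclass


# Empty stand-ins for the real exporter classes; only the dispatch
# logic is illustrated here, not the actual weight conversion.
class HFQwen2Exporter:
    pass


class HFQwen3Exporter:
    pass


@dataclass
class FakeHFConfig:
    """Hypothetical minimal config exposing only `model_type`."""

    model_type: str


# Hypothetical registry: model_type string -> exporter class.
# "qwen3_moe" is intentionally absent because the dense exporter
# was reported not to work with MoE models.
EXPORTERS = {
    "qwen2": HFQwen2Exporter,
    "qwen3": HFQwen3Exporter,
}


def select_exporter_cls(hf_config):
    """Return the exporter class for a config, or raise if unsupported."""
    try:
        return EXPORTERS[hf_config.model_type]
    except KeyError:
        raise ValueError(
            f"No exporter registered for model_type={hf_config.model_type!r}"
        ) from None


if __name__ == "__main__":
    print(select_exporter_cls(FakeHFConfig("qwen3")).__name__)
```

With a table like this, adding MoE support later would amount to registering a dedicated exporter under `"qwen3_moe"` once one exists.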

ashors1 · 2025-08-05T19:08:15Z

@abdalgader-a let me know if you'd like to extend this PR to support Qwen3 MoE. I'm happy to help out if not

abdalgader-a · 2025-08-07T11:24:34Z

@ashors1, let me try to extend it. If it takes too long, we can raise another PR. I'll get back to you on this ASAP.

ashors1 · 2025-08-07T14:18:02Z

@abdalgader-a I went ahead and started this process because we received some internal requests for the qwen3 moe exporter: https://github.com/NVIDIA-NeMo/RL/tree/ashors/qwen3-moe-export. I can raise the new PR and I'll be sure to add you as co-author.

abdalgader-a · 2025-08-08T08:42:57Z

No worries @ashors1! I can help out in the new PR too. I'll let you choose whether to merge this PR or have everything in the new one.

ashors1 · 2025-08-08T16:04:52Z

@abdalgader-a I went ahead and opened a new PR which adds MoE export support on top of your commits: #873. Please take a look. If this looks good to you, let's close this PR and focus on #873.

abdalgader-a · 2025-08-14T06:14:34Z

@ashors1, sorry for the late reply. Let's go ahead with #873. I'll close this PR.

add qwen3 to megatron conversion

9d6701d

Signed-off-by: Abdalgader Abubaker <[email protected]>

terrykong requested a review from ashors1 July 31, 2025 19:23

abdalgader-a mentioned this pull request Aug 1, 2025

Fix: add HFQwen3Exporter NVIDIA-NeMo/NeMo#14378

Closed

8 tasks

ashors1 reviewed Aug 1, 2025

abdalgader-a closed this Aug 14, 2025
