@abdalgader-a
Contributor

@abdalgader-a abdalgader-a commented Jul 30, 2025

IMPORTANT:
The NeMo submodule should be updated as in this PR: NVIDIA-NeMo/NeMo#14378

What does this PR do ?

This PR adds support for exporting Qwen3 models from Megatron format to HF.

Test

Tested locally by running the Megatron conversion script on Qwen3-1.7B.
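
One quick sanity check after such a conversion is to confirm that the exported directory carries the expected `model_type` in its `config.json`. The sketch below is an illustration only, not the test that was run for this PR; the helper name `exported_model_type` and the export directory are assumptions.

```python
import json
from pathlib import Path


def exported_model_type(export_dir):
    """Read `model_type` from the config.json of an exported HF checkpoint.

    `export_dir` is a hypothetical path to the directory written by the
    conversion script. Returns None if the key is missing.
    """
    config_path = Path(export_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return json.load(f).get("model_type")
```

For a Qwen3-1.7B export, calling `exported_model_type(...)` on the output directory would be expected to return `"qwen3"`. This only checks the config, not the converted weights.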

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Signed-off-by: Abdalgader Abubaker <[email protected]>
@terrykong terrykong requested a review from ashors1 July 31, 2025 19:23
@ashors1
Contributor

ashors1 commented Jul 31, 2025

Thank you for the contribution! It looks like the NeMo submodule change touched a lot of unrelated files. Would it be possible to revert those unrelated changes and only keep the relevant ones?

@abdalgader-a
Contributor Author

Thanks @ashors1 for reviewing. My apologies, it seems I mistakenly merged other changes. I have now raised a new PR (#14378) in the NeMo submodule with only the relevant changes, and closed the wrong PR.


exporter_cls = HFQwen2Exporter

elif hf_config.model_type == "qwen3":

Suggested change
- elif hf_config.model_type == "qwen3":
+ elif hf_config.model_type in ("qwen3", "qwen3_moe"):

Ah, actually I just tested this and it looks like it doesn't work with MoE models at the moment. Could we extend this to support MoE? If that is too much work, we could always add MoE support in a follow-up PR.
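
To illustrate the concern, here is a minimal sketch of dispatching on `model_type` so that an unsupported type such as `qwen3_moe` fails loudly instead of being routed to an exporter that cannot handle it. The stub classes, the `EXPORTERS` table, and `select_exporter` are hypothetical names for illustration; they are not the exporter classes or the code in this PR.

```python
from dataclasses import dataclass


@dataclass
class HFConfigStub:
    """Stand-in for a Hugging Face config; only `model_type` matters here."""

    model_type: str


# Hypothetical placeholders; the real exporter classes are defined in NeMo.
class Qwen2ExporterStub:
    pass


class Qwen3ExporterStub:
    pass


# Only model types with a known-working exporter are registered.
# "qwen3_moe" is intentionally absent until MoE export is supported.
EXPORTERS = {
    "qwen2": Qwen2ExporterStub,
    "qwen3": Qwen3ExporterStub,
}


def select_exporter(hf_config):
    """Return the exporter class for `hf_config.model_type`, or raise ValueError."""
    try:
        return EXPORTERS[hf_config.model_type]
    except KeyError:
        supported = ", ".join(sorted(EXPORTERS))
        raise ValueError(
            f"No exporter for model_type={hf_config.model_type!r}; "
            f"supported: {supported}"
        ) from None
```

With a table like this, adding MoE support later would amount to registering one more entry once a working exporter exists, rather than widening an `elif` condition ahead of time.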

@ashors1
Contributor

ashors1 commented Aug 5, 2025

@abdalgader-a let me know if you'd like to extend this PR to support Qwen3 MoE. I'm happy to help out if not

@abdalgader-a
Contributor Author

@ashors1, let me try to extend it. If it takes too long, we can raise another PR. I'll get back to you on this ASAP.

@ashors1
Contributor

ashors1 commented Aug 7, 2025

@abdalgader-a I went ahead and started this process because we received some internal requests for the Qwen3 MoE exporter: https://github.com/NVIDIA-NeMo/RL/tree/ashors/qwen3-moe-export. I can raise the new PR and I'll be sure to add you as co-author.

@abdalgader-a
Contributor Author

No worries @ashors1! I can help out in the new PR too. I'll let you choose whether to merge this PR or have all the changes in the new one.

@ashors1
Contributor

ashors1 commented Aug 8, 2025

@abdalgader-a I went ahead and opened a new PR which adds MoE export support on top of your commits: #873. Please take a look. If this looks good to you, let's close this PR and focus on #873.

@abdalgader-a
Contributor Author

@ashors1, sorry for the late reply. Let's go ahead with #873. I'll close this PR.
