Qwen3 MoE #2060

Merged
Borda merged 15 commits into Lightning-AI:main from ysjprojects:qwen3_moe on Jun 16, 2025

@ysjprojects (Collaborator) commented May 28, 2025

https://qwenlm.github.io/blog/qwen3/
https://arxiv.org/abs/2505.09388

Performance: (image not preserved)

Models Added:
Qwen/Qwen3-235B-A22B
Qwen/Qwen3-30B-A3B
Qwen/Qwen3-30B-A3B-Base

@ysjprojects changed the title from "(WIP) qwen3 moe" to "Qwen3 MoE" on May 29, 2025
@Borda (Collaborator) commented Jun 10, 2025

Not sure if the GPU error is related...
I believe it is #2062.

@KaelanDt (Contributor) left a comment

Very nice, thank you @ysjprojects!

@Borda merged commit 72ea185 into Lightning-AI:main on Jun 16, 2025
24 checks passed
mseeger pushed a commit to mseeger/litgpt that referenced this pull request Jul 4, 2025
Co-authored-by: shijie.yu <shijie@tensorplex.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>