Fix(qwenmoe): Fix SP issue in Qwen Moe #2741

Ace-To-HYB · 2025-10-16T08:18:37Z

PR types

Bug fixes

PR changes

Description

Fix a hang in the Qwen2/3 MoE series models when sequence parallelism (SP) is enabled.

paddle-bot · 2025-10-16T08:18:43Z

Thanks for your contribution!

codecov-commenter · 2025-10-16T08:45:29Z

Codecov Report

❌ Patch coverage is 83.67347% with 8 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@9d3e01d). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
paddleformers/transformers/qwen2_moe/modeling.py	84.61%	4 Missing ⚠️
paddleformers/transformers/qwen3_moe/modeling.py	82.60%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #2741   +/-   ##
==========================================
  Coverage           ?   29.08%           
==========================================
  Files              ?      334           
  Lines              ?    55916           
  Branches           ?        0           
==========================================
  Hits               ?    16261           
  Misses             ?    39655           
  Partials           ?        0

lugimzzz · 2025-10-17T04:01:31Z

paddleformers/transformers/qwen2_moe/modeling.py


-        shared_expert_output = self.shared_expert(hidden_states)
-        shared_expert_output = F.sigmoid(self.shared_expert_gate(hidden_states)) * shared_expert_output
+        shared_expert_output = self.shared_expert(residuals)


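The shared expert is a non-SP layer. If its input is not gathered, each card only processes part of the data. The input to a TP layer should be the full sequence, so the shared expert's input should also be gathered.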
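To illustrate the point above, here is a toy, framework-free sketch of why a tensor-parallel (TP) layer needs the gathered sequence under SP. It is not the PaddleFormers code: the two simulated ranks, the column-parallel weight split, and every helper name (`matmul`, `all_gather_seq`, `concat_features`) are assumptions made only for this illustration.

```python
# Toy simulation: sequence-parallel (SP) shards feeding a column-parallel TP layer.
# Plain Python lists stand in for tensors; all names are hypothetical.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*b)] for row in a]

def all_gather_seq(shards):
    """Concatenate per-rank sequence shards back into the full sequence."""
    return [token for shard in shards for token in shard]

def concat_features(per_rank_outputs):
    """Concatenate per-rank outputs along the feature dimension."""
    return [sum(rows, []) for rows in zip(*per_rank_outputs)]

# Full sequence: 4 tokens, hidden size 2.
full_seq = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Full 2x2 weight and its column shards: rank 0 holds column 0, rank 1 holds column 1.
weight = [[1, 0], [2, 1]]
weight_shards = [[[1], [2]], [[0], [1]]]

# Single-device reference result.
reference = matmul(full_seq, weight)

# Under SP each rank holds only half of the tokens.
seq_shards = [full_seq[:2], full_seq[2:]]

# Without a gather: rank r multiplies only its own tokens by its own column shard,
# so the per-rank pieces belong to different tokens and cannot be combined correctly.
ungathered = [matmul(seq_shards[r], weight_shards[r]) for r in range(2)]

# With a gather: every rank sees the full sequence before the TP layer.
gathered_input = all_gather_seq(seq_shards)
gathered = [matmul(gathered_input, weight_shards[r]) for r in range(2)]
combined = concat_features(gathered)

print("matches reference with gather:", combined == reference)
print("tokens seen per rank without gather:", len(ungathered[0]))
```

In this sketch the gathered path reproduces the single-device result, while the ungathered path only ever computes one column block for half of the tokens on each rank, which is the situation the review comment describes.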

lugimzzz

LGTM

Fix SP issue in qwenmoe

99010be

paddle-bot bot added the contributor label Oct 16, 2025

lugimzzz reviewed Oct 17, 2025

Ace-To-HYB added 3 commits October 17, 2025 17:42

Merge branch 'PaddlePaddle:develop' into fix_qwen_sp

b20db7b

fix sp bug

02f748a

Merge branch 'PaddlePaddle:develop' into fix_qwen_sp

6034afa

lugimzzz approved these changes Oct 20, 2025

lugimzzz merged commit ddaa474 into PaddlePaddle:develop Oct 20, 2025
5 of 6 checks passed

Ace-To-HYB deleted the fix_qwen_sp branch October 20, 2025 10:47
