@Ace-To-HYB
Collaborator

PR types

Bug fixes

PR changes

Description

Fix a hang that occurs in Qwen2/3 MoE-series models when sequence parallelism (SP) is enabled.

@paddle-bot

paddle-bot bot commented Oct 16, 2025

Thanks for your contribution!

@codecov-commenter

codecov-commenter commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 83.67347% with 8 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@9d3e01d). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| paddleformers/transformers/qwen2_moe/modeling.py | 84.61% | 4 Missing ⚠️ |
| paddleformers/transformers/qwen3_moe/modeling.py | 82.60% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #2741   +/-   ##
==========================================
  Coverage           ?   29.08%           
==========================================
  Files              ?      334           
  Lines              ?    55916           
  Branches           ?        0           
==========================================
  Hits               ?    16261           
  Misses             ?    39655           
  Partials           ?        0           

shared_expert_output = self.shared_expert(hidden_states)
shared_expert_output = F.sigmoid(self.shared_expert_gate(hidden_states)) * shared_expert_output
shared_expert_output = self.shared_expert(residuals)
Collaborator

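The shared expert is not an SP layer, so its input is not gathered, which means each device only processes part of the data. The input to a TP layer should be the full sequence, so the shared expert's input should be gathered as well.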
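
To illustrate the point about gathering, here is a toy single-process sketch of 2-way TP combined with SP. It does not use any Paddle API and is not the code from this PR: the weights, the helper names (`shared_expert_partial`, `all_reduce_sum`), and the plain all-reduce (a real SP layer would typically reduce-scatter) are all assumptions made for illustration. It only shows that feeding each rank's local sequence shard into a TP-sharded MLP mixes partial results from different tokens, while gathering first reproduces the unsharded result.

```python
# Toy simulation of tensor parallelism (TP) + sequence parallelism (SP).
# Everything here is hypothetical; no Paddle collectives are involved.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

TP = 2
tokens = [[1, 2], [3, 4], [5, 6], [7, 8]]    # full sequence: 4 tokens, hidden size 2
w_up = [[1, 0, 2, 1], [0, 1, 1, 2]]          # 2 x 4 up projection
w_down = [[1, 0], [0, 1], [1, 1], [2, 0]]    # 4 x 2 down projection

# TP sharding: up projection split by columns, down projection split by rows.
w_up_shards = [[row[r * 2:(r + 1) * 2] for row in w_up] for r in range(TP)]
w_down_shards = [w_down[r * 2:(r + 1) * 2] for r in range(TP)]

# SP sharding: each rank holds only its slice of the sequence.
seq_shards = [tokens[r * 2:(r + 1) * 2] for r in range(TP)]

def shared_expert_partial(rank, x):
    """Partial output of the TP-sharded MLP computed on one rank."""
    return matmul(matmul(x, w_up_shards[rank]), w_down_shards[rank])

def all_reduce_sum(partials):
    """Stand-in for an all-reduce: sum the partial outputs of all ranks."""
    out = partials[0]
    for p in partials[1:]:
        out = add(out, p)
    return out

# Unsharded reference result.
reference = matmul(matmul(tokens, w_up), w_down)

# Without a gather: each rank feeds its local tokens, so the reduction
# adds partial results that belong to different tokens.
buggy = all_reduce_sum([shared_expert_partial(r, seq_shards[r]) for r in range(TP)])

# With a gather: every rank sees the full sequence before the TP layer.
gathered = [tok for shard in seq_shards for tok in shard]
fixed = all_reduce_sum([shared_expert_partial(r, gathered) for r in range(TP)])

print("fixed matches reference:", fixed == reference)
print("buggy matches reference:", buggy == reference)
```

In this sketch the ungathered path does not even produce an output of the right sequence length, which is one way to see why a TP layer that is not SP-aware needs the full sequence as input.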

Collaborator Author

done

Collaborator

@lugimzzz lugimzzz left a comment

LGTM

@lugimzzz lugimzzz merged commit ddaa474 into PaddlePaddle:develop Oct 20, 2025
5 of 6 checks passed
@Ace-To-HYB Ace-To-HYB deleted the fix_qwen_sp branch October 20, 2025 10:47