[AutoParallel] Add auto parallel moe layer #9886

pkuzyc · 2025-02-17T13:36:52Z

PR types

New features

PR changes

Models

Description

Add auto parallel moe layer

paddle-bot · 2025-02-17T13:36:57Z

Thanks for your contribution!

codecov · 2025-02-17T14:11:40Z

Codecov Report

Attention: Patch coverage is 14.25960% with 469 lines in your changes missing coverage. Please review.

Project coverage is 51.17%. Comparing base (347d77c) to head (8bfa877).
Report is 6 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/transformers/moe_gate_auto.py	10.47%	282 Missing ⚠️
paddlenlp/transformers/moe_layer_auto.py	18.36%	120 Missing ⚠️
...addlenlp/transformers/deepseek_v2/modeling_auto.py	26.41%	39 Missing ⚠️
paddlenlp/transformers/auto_utils.py	12.50%	28 Missing ⚠️

❌ Your patch check has failed because the patch coverage (14.25%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (51.17%) is below the target coverage (58.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9886      +/-   ##
===========================================
- Coverage    51.34%   51.17%   -0.18%     
===========================================
  Files          745      748       +3     
  Lines       118567   119129     +562     
===========================================
+ Hits         60877    60961      +84     
- Misses       57690    58168     +478

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

DrownFish19 · 2025-02-18T04:00:46Z

paddlenlp/transformers/moe_gate_auto.py

+            me = paddle.stack(me_list).mean(0)
+            ce = paddle.stack(ce_list).mean(0)
+        aux_loss = paddle.sum(me * ce) * float(self.num_experts)
+        return aux_loss


辛苦这里增加一下计算seq_aux_loss的代码
https://github.com/PaddlePaddle/PaddleNLP/pull/9876/files#diff-f64288c3a9f8f14d1f8e2f18a5538e17198567d85c055c90b1dd902de939a677

DrownFish19 · 2025-02-18T04:01:56Z

paddlenlp/transformers/moe_gate_auto.py

+            # Make sure the capacity value does not exceed the number of tokens.
+            capacity = int(min(new_capacity, paddle.tensor(mask1.size(0))))
+
+        l_aux = self._cal_aux_loss(gates, mask1)


这里的l_aux需要判断一下是否使用seq_aux_loss，https://github.com/PaddlePaddle/PaddleNLP/pull/9876/files#diff-f64288c3a9f8f14d1f8e2f18a5538e17198567d85c055c90b1dd902de939a677

DrownFish19 reviewed Feb 18, 2025

View reviewed changes

pkuzyc added 3 commits February 20, 2025 15:16

add auto parallel moe layer

81a9849

add local layer in MoELayer

986c1de

add expert parallel with dygraph auto parallel

8bfa877

pkuzyc force-pushed the auto_moe_layer branch from a75f03b to 8bfa877 Compare February 24, 2025 02:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AutoParallel] Add auto parallel moe layer #9886

[AutoParallel] Add auto parallel moe layer #9886

pkuzyc commented Feb 17, 2025

paddle-bot bot commented Feb 17, 2025

codecov bot commented Feb 17, 2025 •

edited

Loading

DrownFish19 Feb 18, 2025

DrownFish19 Feb 18, 2025

[AutoParallel] Add auto parallel moe layer #9886

Are you sure you want to change the base?

[AutoParallel] Add auto parallel moe layer #9886

Conversation

pkuzyc commented Feb 17, 2025

PR types

PR changes

Description

paddle-bot bot commented Feb 17, 2025

codecov bot commented Feb 17, 2025 • edited Loading

Codecov Report

DrownFish19 Feb 18, 2025

Choose a reason for hiding this comment

DrownFish19 Feb 18, 2025

Choose a reason for hiding this comment

codecov bot commented Feb 17, 2025 •

edited

Loading