[DSK] Implement mla use matrix-absorption #9875

Open
wants to merge 39 commits into
base: develop

yuanlehome
Collaborator

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py
  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

Performance optimization

PR changes

Others

Description

paddle-bot bot commented Feb 16, 2025

Thanks for your contribution!

codecov bot commented Feb 16, 2025

Codecov Report

Attention: Patch coverage is 0% with 401 lines in your changes missing coverage. Please review.

Project coverage is 51.23%. Comparing base (235c24e) to head (03d6a0b).
Report is 23 commits behind head on develop.

Current head 03d6a0b differs from pull request most recent head 2fb3378

Please upload reports for the commit 2fb3378 to get more accurate results.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...erimental/transformers/fused_transformer_layers.py | 0.00% | 276 Missing ⚠️ |
| .../experimental/transformers/deepseek_v2/modeling.py | 0.00% | 78 Missing ⚠️ |
| paddlenlp/experimental/transformers/proposers.py | 0.00% | 9 Missing ⚠️ |
| ...enlp/experimental/transformers/generation_utils.py | 0.00% | 6 Missing ⚠️ |
| ...dlenlp/experimental/transformers/bloom/modeling.py | 0.00% | 5 Missing ⚠️ |
| ...p/experimental/transformers/chatglm_v2/modeling.py | 0.00% | 5 Missing ⚠️ |
| ...dlenlp/experimental/transformers/llama/modeling.py | 0.00% | 5 Missing ⚠️ |
| ...enlp/experimental/transformers/mixtral/modeling.py | 0.00% | 5 Missing ⚠️ |
| ...dlenlp/experimental/transformers/qwen2/modeling.py | 0.00% | 5 Missing ⚠️ |
| ...lp/experimental/transformers/qwen2_moe/modeling.py | 0.00% | 5 Missing ⚠️ |
... and 1 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9875      +/-   ##
===========================================
- Coverage    51.66%   51.23%   -0.43%     
===========================================
  Files          739      745       +6     
  Lines       117426   118834    +1408     
===========================================
+ Hits         60668    60886     +218     
- Misses       56758    57948    +1190     


@yuanlehome yuanlehome marked this pull request as draft February 17, 2025 07:38
// float in_scale,
// bool causal,
// cudaStream_t &stream,
// paddle::Tensor *out);
Collaborator

Besides MLA decode attention, does this file still support GQA/MHA?

fix MLA && trick avoid append_dec
@CLAassistant

CLAassistant commented Feb 20, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ lizhenyun01
✅ yuanlehome
❌ lizhenyu04


lizhenyu04 does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@yuanlehome yuanlehome marked this pull request as ready for review February 24, 2025 14:11
4 participants