
add long sequence strategies #8076

Merged: 30 commits into PaddlePaddle:develop on Mar 26, 2024

Conversation

WAI-clear (Contributor)

PR types

PR changes

Models, APIs

Description

Decouple the long-sequence strategies from the models.
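A minimal sketch of what the decoupling could look like (all names below are illustrative assumptions, not the exact PaddleNLP API): instead of each model hard-coding its long-sequence scheme, models ask a single factory for a strategy object by name, so adding a new scheme needs no per-model changes.

```python
# Hypothetical registry-based factory; names are illustrative, not
# the PR's actual API.
class RotaryEmbedding:
    """Stand-in for an embedding-side long-sequence strategy."""

    def __init__(self, **init_args):
        self.init_args = init_args


_STRATEGIES = {
    ("embedding_strategies", "RotaryEmbedding"): RotaryEmbedding,
}


def build_long_sequence_strategy(strategy_type, strategy_name, **init_args):
    # Models only name a strategy; construction lives in one place, so a
    # new long-sequence scheme needs no per-model code changes.
    strategy_class = _STRATEGIES[(strategy_type, strategy_name)]
    return strategy_class(**init_args)


emb = build_long_sequence_strategy(
    "embedding_strategies", "RotaryEmbedding", rotary_dim=64
)
```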


paddle-bot bot commented Mar 8, 2024

Thanks for your contribution!

@gongel gongel self-requested a review March 8, 2024 04:23

codecov bot commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 43.16940%, with 104 lines in your changes missing coverage. Please review.

Project coverage is 55.41%. Comparing base (db49062) to head (dc8da0a).
Report is 52 commits behind head on develop.

Files Patch % Lines
...s/long_sequence_strategies/embedding_strategies.py 25.39% 47 Missing ⚠️
...s/long_sequence_strategies/attention_strategies.py 37.50% 15 Missing ⚠️
...ng_sequence_strategies/long_sequence_strategies.py 31.25% 11 Missing ⚠️
paddlenlp/transformers/llama/modeling.py 35.71% 9 Missing ⚠️
paddlenlp/transformers/chatglm/modeling.py 41.66% 7 Missing ⚠️
paddlenlp/transformers/bloom/modeling.py 50.00% 6 Missing ⚠️
paddlenlp/transformers/chatglm_v2/modeling.py 45.45% 6 Missing ⚠️
paddlenlp/transformers/qwen/modeling.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8076      +/-   ##
===========================================
- Coverage    56.56%   55.41%   -1.16%     
===========================================
  Files          589      600      +11     
  Lines        89964    91642    +1678     
===========================================
- Hits         50889    50782     -107     
- Misses       39075    40860    +1785     


@@ -152,7 +152,7 @@ def main():
)
if hasattr(model_config, "use_flash_attention"):
model_config.use_flash_attention = model_args.use_flash_attention

Member:

This file doesn't need to be modified, does it?

"zero_padding": false,
"use_flash_attention": false
}
Copy link
Member:

This JSON doesn't need to be modified either.


class AttentionWithLinearBias(nn.Layer):
"""
init_args:bool_attention_mask,num_heads,dtype,tensor_parallel_degree
+ self._get_interleave(2 * closest_power_of_2)[0::2][: n - closest_power_of_2]
)

def forward(self, bool_attention_mask: Tensor, num_heads: int, dtype: paddle.dtype, tensor_parallel_degree=1):
Member:

What is the purpose of passing in tensor_parallel_degree?
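One plausible answer, offered as an assumption since the diff shown here omits the body: under tensor parallelism each rank holds only num_heads // tensor_parallel_degree attention heads, so the ALiBi bias built in forward must cover just that rank's slice of the slopes. A sketch of the slicing:

```python
def slopes_for_rank(slopes, tensor_parallel_degree, rank):
    # Each tensor-parallel rank owns a contiguous block of heads and
    # therefore only needs that block's ALiBi slopes.
    block = len(slopes) // tensor_parallel_degree
    return slopes[rank * block : (rank + 1) * block]


# Example: 8 heads split across 2 ranks (slope values are illustrative).
all_slopes = [2.0 ** -(i + 1) for i in range(8)]
rank0 = slopes_for_rank(all_slopes, 2, 0)  # slopes for the first 4 heads
rank1 = slopes_for_rank(all_slopes, 2, 1)  # slopes for the last 4 heads
```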

def _get_interleave(self, n):
def _get_interleave_power_of_2(n):
start = 2 ** (-(2 ** -(math.log2(n) - 3)))
ratio = start
Member:

ratio here equals start; can one of them be reused?

"""
try:
import_class = importlib.import_module(f"paddlenlp.transformers.LongSequenceStrategies.{strategy_type}")
except ValueError:
Member:

Shouldn't this be ModuleNotFoundError?

strategy_class = getattr(import_class, stratety_name)
strategy_instance = strategy_class(**init_args)
return strategy_instance
except AttributeError:
Member:

If it's strategy_class that fails, is the raised error AttributeError?
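Both review questions concern which exceptions the dynamic lookup can actually raise: importlib.import_module raises ModuleNotFoundError (a subclass of ImportError) for a missing module, and getattr raises AttributeError for a missing attribute; neither raises ValueError on its own. A hedged sketch of the lookup with matching handlers (the wrapper name and messages are illustrative, not the PR's code):

```python
import importlib


def build_strategy(module_path, strategy_name, **init_args):
    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:  # missing module, not ValueError
        raise ValueError(f"Unknown strategy module: {module_path}") from exc
    try:
        strategy_class = getattr(module, strategy_name)
    except AttributeError as exc:  # class absent from the module
        raise ValueError(f"Unknown strategy class: {strategy_name}") from exc
    return strategy_class(**init_args)


# Stdlib example: resolves collections.Counter dynamically.
counter = build_strategy("collections", "Counter")
```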

@@ -0,0 +1,49 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Member:

File names and directory names should be lowercase.

@wawltor wawltor merged commit 6b5099a into PaddlePaddle:develop Mar 26, 2024
7 of 10 checks passed
5 participants