
Fix attn_bias_add bug. #37147

Merged · 6 commits · Nov 16, 2021

Conversation

@limin2021 (Contributor) commented Nov 11, 2021

PR types

Bug fixes

PR changes

OPs

Describe

Problem description:
The fused_attention_op implementation uses bias_add, which was built on the kernel primitives. Later, the kernel primitives' WriteData API and its internals changed: the out-of-bounds check was moved into a template parameter. The call site then took the wrong branch, performing out-of-bounds writes that corrupted other GPU memory. Symptom: a single run of test_fused_attention_op_api.py almost never fails, but looping over inputs of different shapes yields intermittently incorrect results, so the bug is easy to miss.

Fix:
At the call site, change
kernel_primitives::WriteData<OutT, VecSize, 1, 1>(out + fix, result, num);
to
kernel_primitives::WriteData<OutT, VecSize, 1, 1, true>(out + fix, result, num);
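For context, here is a minimal sketch, assuming a simplified version of the kernel primitive (Paddle's real WriteData takes <T, NX, NY, BlockSize, IsBoundary>), of how a trailing boolean template parameter selects the bounds-checked branch at compile time. Leaving the flag at its default of false on the last, partial tile is exactly the out-of-bounds write described above:

```cpp
// Minimal sketch, NOT Paddle's actual kernel_primitives code: the last
// template parameter selects the bounds-checked path at compile time.
template <typename OutT, int VecSize, bool IsBoundary = false>
__device__ void WriteDataSketch(OutT* dst, const OutT* src, int num) {
  if (IsBoundary) {
    // Guarded path for the final partial tile: write only `num` elements.
#pragma unroll
    for (int i = 0; i < VecSize; ++i) {
      if (i < num) dst[i] = src[i];
    }
  } else {
    // Fast path: assumes a full tile of VecSize elements. Taking this
    // branch on a partial tile writes VecSize - num elements past the
    // end of `dst`, corrupting whatever lives there.
#pragma unroll
    for (int i = 0; i < VecSize; ++i) {
      dst[i] = src[i];
    }
  }
}
```

Before the change the check lived inside the function body, so the old call compiled unchanged against the new API but silently lost its guard; passing true restores it.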

To prevent this class of bug from recurring, the bias_add and reduce paths in fused_attention_op are changed to call the outermost interfaces of existing Paddle ops directly, as sketched below.
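For illustration, a rough sketch of what dispatching through the existing elementwise interface looks like. The launcher name comes from this PR's review thread ("LGTM for LaunchElementwiseCudaKernel"); the exact signature, functor, and surrounding types are assumptions based on the 2021-era codebase, not verbatim from this PR:

```cpp
// Hedged sketch: route bias add through the existing broadcast elementwise
// launcher instead of a hand-written kernel, so boundary handling stays in
// one maintained place. Types and signature are approximations (assumption).
std::vector<const framework::Tensor*> ins = {&input, &bias};  // bias broadcasts
std::vector<framework::Tensor*> outs = {&output};
LaunchElementwiseCudaKernel<ElementwiseType::kBinary, T, T>(
    dev_ctx, ins, &outs, /*axis=*/-1, AddFunctor<T>());
```

The design point is less this particular call than not duplicating kernel-level boundary logic: the fused op now inherits any future fixes to the shared elementwise path for free.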

2. Updated the English docs for the fused_attention and fused_feedforward op API arguments (revised per chenlong's review on #36972).
Preview (only the changed parts shown):

functional.fused_multi_head_attention:
[screenshots of the updated docstring preview]

nn.FusedMultiHeadAttention:
[screenshot of the updated docstring preview]

@paddle-bot-old
Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

xingfeng01 previously approved these changes Nov 12, 2021

@xingfeng01 (Contributor) left a comment:


Does this change affect performance?

@@ -11,10 +11,14 @@ limitations under the License. */

#pragma once

#include "paddle/fluid/operators/fused/attn_bias_add.cu.h"
// #include "paddle/fluid/operators/fused/attn_bias_add.cu.h"
Contributor:

Suggest removing the unneeded lines in the next PR.

Contributor Author:

Done.

Contributor Author:

The previously hand-written bias_add mimicked the existing op's implementation, so restoring the code to call the existing op directly has no performance impact.

  }
}

void ComputeBackward(const T* input, const T* weight, const T* d_output,
                     T* d_input, T* d_weight, T* d_bias) {
  // void ComputeBackward(const T* input, const T* weight, const T* d_output,
Contributor:

Suggest removing the unneeded lines in the next PR.

@limin2021 (Contributor Author) Nov 12, 2021:

Done.

@AnnaTrainingG (Contributor):

LGTM for LaunchElementwiseCudaKernel

xingfeng01 previously approved these changes Nov 13, 2021
@lanxianghit merged commit a9e7a85 into PaddlePaddle:develop on Nov 16, 2021
zkh2016 pushed a commit to zkh2016/Paddle that referenced this pull request Nov 16, 2021