Add sparse_attention OP, test=develop #35676
Thanks for your contribution!
Which models is this meant to support?
The sparse transformer from the NLP team.
@@ -46,6 +46,7 @@
'cudnn_lstm', \
'rnn', \
'lgamma', \
'sparse_attention', \
Is there sufficient justification for adding this op to the whitelist?
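At the moment the precision error comes mainly from the matrix multiplication. I consulted 彭军才, who agreed to adding it.
The API shown in the example, paddle.fluid.core.sparse_attention, does not conform to the 2.0 API conventions.
This PR already has a lot of lines, so the Python API wrapper will be submitted in the next PR.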
CI has no environment with CUDA 11.2 or above. Please paste the results of running the unit tests locally into the PR description.
&output_lists[i], M, N, false, false);
}
#else
PADDLE_THROW(platform::errors::InvalidArgument(
I believe there is an Unsupported error type.
return -1


def get_linux_platform():
The operating system check can be handled in CMakeLists.txt.
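CMake can control the C++-side parameters, but it cannot directly control the Python side. That would have to be done through something like operator registration, for example #26180.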
It is not that complicated. CMake can check whether the system is WIN or MACOS, and this unit test is simply not defined on those two systems.
OK, Thanks
… Add_sparse_attention_api
… Add_sparse_attention_api
LGTM. This PR can be merged first. The sparse handle should be bound to the stream that already exists in ctx; that needs to be handled in the next PR.
const T* srcptr = src + layout_rowptr[cur_block_row];
T* attnptr = nullptr;
if (attn_mask != nullptr) {
const T* attnptr = attn_mask + cur_block_row * num_rows;
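This is presumably meant to assign to the attnptr defined on L84, but written this way it declares a new local variable, so the assignment has no effect.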
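A minimal sketch of the shadowing issue described above, using float in place of the template parameter T. The function names SelectMaskRowShadowed and SelectMaskRow are hypothetical and not part of the PR. The outer pointer is declared const here so that the assignment from attn_mask compiles; the hunk declares it as a non-const T*.

```cpp
#include <cassert>
#include <cstddef>

// Buggy shape, mirroring the hunk: the inner declaration creates a second
// variable named attnptr that shadows the outer one, so the outer pointer
// is still nullptr once the if-block ends.
const float* SelectMaskRowShadowed(const float* attn_mask, std::size_t row,
                                   std::size_t num_rows) {
  const float* attnptr = nullptr;
  if (attn_mask != nullptr) {
    const float* attnptr = attn_mask + row * num_rows;  // shadows the outer
    (void)attnptr;  // silence the unused-variable warning
  }
  return attnptr;  // always nullptr
}

// Fixed shape: declare the pointer once and assign to it inside the branch.
const float* SelectMaskRow(const float* attn_mask, std::size_t row,
                           std::size_t num_rows) {
  const float* attnptr = nullptr;
  if (attn_mask != nullptr) {
    attnptr = attn_mask + row * num_rows;
  }
  return attnptr;
}
```

Compiling with -Wshadow (GCC and Clang) reports this kind of redeclaration.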
*/
template <typename DeviceContext, typename T>
void SparseSoftmaxForward(const platform::CUDADeviceContext& ctx,
const Tensor* offset, const Tensor* columns,
Inputs should preferably be passed as const Tensor&.
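To illustrate the suggestion, here is a small sketch contrasting a pointer parameter with a const reference parameter. ToyTensor, RowLengthPtr and RowLengthRef are hypothetical stand-ins written for this note, not Paddle types or functions. The point is only that a reference signals a required, read-only input and removes the need for a null check in the callee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor holding CSR row offsets.
struct ToyTensor {
  std::vector<int> data;
};

// Pointer parameter: the callee has to decide what a null input means.
std::size_t RowLengthPtr(const ToyTensor* offset, std::size_t row) {
  if (offset == nullptr) return 0;
  assert(row + 1 < offset->data.size());
  return static_cast<std::size_t>(offset->data[row + 1] - offset->data[row]);
}

// Reference parameter: the input is required and cannot be null, so the
// signature itself documents the contract.
std::size_t RowLengthRef(const ToyTensor& offset, std::size_t row) {
  assert(row + 1 < offset.data.size());
  return static_cast<std::size_t>(offset.data[row + 1] - offset.data[row]);
}
```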
}
}

void CusparseDestroy(cusparseDnMatDescr_t* dn_mat_first,
This style of wrapper is not ideal. In general, whoever creates a resource should also be the one to destroy it.
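One common way to follow that principle in C++ is to tie each descriptor's lifetime to a scoped owner, so the code that creates it also releases it automatically. The sketch below is only an illustration under assumptions: FakeDescr, FakeCreateDescr and FakeDestroyDescr are hypothetical placeholders for a C-style create/destroy pair and are not cuSPARSE or Paddle calls.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style descriptor API (placeholder, not a real library).
struct FakeDescr {
  int rows;
  int cols;
};

// Counts descriptors that have been created but not yet destroyed.
inline int& LiveDescrCount() {
  static int count = 0;
  return count;
}

FakeDescr* FakeCreateDescr(int rows, int cols) {
  ++LiveDescrCount();
  return new FakeDescr{rows, cols};
}

void FakeDestroyDescr(FakeDescr* descr) {
  --LiveDescrCount();
  delete descr;
}

// Scoped owner: the destroy call runs when the owner goes out of scope.
struct DescrDeleter {
  void operator()(FakeDescr* descr) const { FakeDestroyDescr(descr); }
};
using ScopedDescr = std::unique_ptr<FakeDescr, DescrDeleter>;

ScopedDescr MakeDescr(int rows, int cols) {
  return ScopedDescr(FakeCreateDescr(rows, cols));
}

// Creates two descriptors, uses them, and releases both on return without
// a separate bulk "destroy everything" helper.
int UseTwoDescriptors() {
  ScopedDescr a = MakeDescr(4, 8);
  ScopedDescr b = MakeDescr(8, 4);
  return a->rows * b->cols;
}
```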
GetTransposeOperation(b_transpose), &alpha, mat_a, mat_b, &beta, mat_c,
gpu_type, CUSPARSE_SDDMM_ALG_DEFAULT, &buffer_size);
auto d_buffer_ptr = paddle::memory::Alloc(ctx, buffer_size);
void* d_buffer = static_cast<void*>(d_buffer_ptr->ptr());
Is this the workspace?
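The hunk reads like the usual two-phase pattern: query the required buffer size, allocate that many bytes, then hand the buffer to the computation as scratch space. The sketch below shows that pattern in plain C++ under assumptions; ToyRowSumBufferSize, ToyScaledRowSum and RunWithWorkspace are hypothetical names invented for this note and do not correspond to cuSPARSE or Paddle functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Phase 1 (hypothetical): report how many scratch bytes the call needs.
std::size_t ToyRowSumBufferSize(std::size_t cols) {
  return cols * sizeof(float);
}

// Phase 2 (hypothetical): run the computation using caller-provided scratch
// memory. The routine does not allocate anything itself.
float ToyScaledRowSum(const float* row, std::size_t cols, float scale,
                      void* workspace) {
  float* scratch = static_cast<float*>(workspace);
  for (std::size_t i = 0; i < cols; ++i) {
    scratch[i] = row[i] * scale;
  }
  float sum = 0.0f;
  for (std::size_t i = 0; i < cols; ++i) {
    sum += scratch[i];
  }
  return sum;
}

// Caller side: query the size, allocate, compute. The caller owns the
// workspace and frees it when the vector goes out of scope.
float RunWithWorkspace(const std::vector<float>& row, float scale) {
  const std::size_t bytes = ToyRowSumBufferSize(row.size());
  std::vector<unsigned char> workspace(bytes);
  return ToyScaledRowSum(row.data(), row.size(), scale, workspace.data());
}
```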
Add sparse_attention OPs, python api will be added in next pr
PR types
New features
PR changes
OPs
Describe
Add paddle._C_ops.sparse_attention OPs
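For readers unfamiliar with the operation, the following is a single-head CPU reference sketch, not the PR's CUDA implementation. It assumes the usual scaled dot-product form, softmax(Q K^T / sqrt(d)) V, with the softmax restricted to the positions listed in a CSR layout (offset and columns, the names used in the SparseSoftmaxForward signature in this PR). The function name SparseAttentionRef, the row-major layout, and the 1/sqrt(d) scaling are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// q, k, v: row-major [seq_len x head_dim]. offset has seq_len + 1 entries;
// columns lists, per row, which key positions that row attends to.
std::vector<float> SparseAttentionRef(const std::vector<float>& q,
                                      const std::vector<float>& k,
                                      const std::vector<float>& v,
                                      const std::vector<int>& offset,
                                      const std::vector<int>& columns,
                                      std::size_t seq_len,
                                      std::size_t head_dim) {
  assert(offset.size() == seq_len + 1);
  std::vector<float> out(seq_len * head_dim, 0.0f);
  const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
  for (std::size_t r = 0; r < seq_len; ++r) {
    const std::size_t begin = static_cast<std::size_t>(offset[r]);
    const std::size_t end = static_cast<std::size_t>(offset[r + 1]);
    if (begin == end) continue;  // a row with no entries stays zero

    // SDDMM-like step: dot products only at the stored positions.
    std::vector<float> score(end - begin);
    for (std::size_t p = begin; p < end; ++p) {
      const std::size_t c = static_cast<std::size_t>(columns[p]);
      float dot = 0.0f;
      for (std::size_t d = 0; d < head_dim; ++d) {
        dot += q[r * head_dim + d] * k[c * head_dim + d];
      }
      score[p - begin] = dot * scale;
    }

    // Sparse softmax over the stored entries of this row (max-subtracted
    // for numerical stability).
    const float max_score = *std::max_element(score.begin(), score.end());
    float denom = 0.0f;
    for (float& s : score) {
      s = std::exp(s - max_score);
      denom += s;
    }

    // SpMM-like step: weighted sum of the selected value rows.
    for (std::size_t p = begin; p < end; ++p) {
      const float w = score[p - begin] / denom;
      const std::size_t c = static_cast<std::size_t>(columns[p]);
      for (std::size_t d = 0; d < head_dim; ++d) {
        out[r * head_dim + d] += w * v[c * head_dim + d];
      }
    }
  }
  return out;
}
```

A reference like this can be useful for comparing against GPU results within a tolerance, since float accumulation order differs between the two.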
Example
Precautions
The code in this PR only supports CUDA 11.2. CI currently has no GPU environment with CUDA 11.2, so all tests are skipped automatically.
The new OP is paddle._C_ops.sparse_attention. The Python API will be added in a follow-up PR.
This PR lacks tests for dynamic graphs and static graphs; they will be added in subsequent PRs.
Result