Add sparse_attention OP, test=develop #35676

Liu-xiandong · 2021-09-13T03:02:50Z

PR types

New features

PR changes

OPs

Describe

Add paddle._C_ops.sparse_attention OPs

Example

import paddle
import numpy as np

query_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
key_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
value_data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
sparse_csr_offset_data = np.array([[[0, 2, 4, 6, 8]]]).astype("int32")
sparse_csr_columns_data = np.array([[[0, 1, 0, 1, 2, 3, 2, 3]]]).astype("int32")
print(query_data.shape)
# (1, 1, 4, 2)
print(sparse_csr_offset_data.shape)
# (1, 1, 5)
print(sparse_csr_columns_data.shape)
# (1, 1, 8)
paddle.disable_static()
query = paddle.to_tensor(query_data, stop_gradient=False, place=paddle.CUDAPlace(0))
key = paddle.to_tensor(key_data, stop_gradient=False, place=paddle.CUDAPlace(0))
value = paddle.to_tensor(value_data, stop_gradient=False, place=paddle.CUDAPlace(0))
offset = paddle.to_tensor(sparse_csr_offset_data, stop_gradient=False, place=paddle.CUDAPlace(0))
columns = paddle.to_tensor(sparse_csr_columns_data, stop_gradient=False, place=paddle.CUDAPlace(0))
output = paddle._C_ops.sparse_attention(query, key, value, offset, columns)
print(output)

# [[[[1.60885942, 2.60885954],
#       [1.99830270, 2.99830270],
#       [1.60885942, 2.60885954],
#       [1.99830270, 2.99830270]]]]
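
For readers without a CUDA 11.2 GPU, the example can be cross-checked on the CPU. The following is a minimal NumPy sketch, not the kernel from this PR. It assumes the OP computes softmax(Q K^T / sqrt(d)) V with each query row restricted to the key columns listed in the CSR layout; the 1/sqrt(d) scaling is an assumption that happens to be consistent with the printed output above. The function name sparse_attention_reference is hypothetical.

```python
import numpy as np


def sparse_attention_reference(q, k, v, offset, columns):
    """CPU reference for attention restricted to a CSR sparsity pattern.

    q, k, v: [batch, heads, seq_len, head_dim]
    offset:  [batch, heads, seq_len + 1]  CSR row offsets
    columns: [batch, heads, nnz]          CSR column indices
    """
    q = q.astype(np.float64)
    k = k.astype(np.float64)
    v = v.astype(np.float64)
    batch, heads, seq_len, head_dim = q.shape
    scale = 1.0 / np.sqrt(head_dim)  # assumed scaling, see lead-in
    out = np.zeros_like(v)
    for b in range(batch):
        for h in range(heads):
            for row in range(seq_len):
                start = offset[b, h, row]
                end = offset[b, h, row + 1]
                cols = columns[b, h, start:end]
                if cols.size == 0:
                    continue  # a row with no attended keys stays zero
                # Scores only for the key positions listed for this row.
                scores = (k[b, h, cols] @ q[b, h, row]) * scale
                scores = scores - scores.max()  # numerically stable softmax
                weights = np.exp(scores)
                weights /= weights.sum()
                out[b, h, row] = weights @ v[b, h, cols]
    return out


data = np.array([[[[0, 1], [2, 3], [0, 1], [2, 3]]]]).astype("float32")
offset = np.array([[[0, 2, 4, 6, 8]]]).astype("int32")
columns = np.array([[[0, 1, 0, 1, 2, 3, 2, 3]]]).astype("int32")

result = sparse_attention_reference(data, data, data, offset, columns)
print(result)
```

In this layout, rows 0 and 1 attend to columns 0 and 1, and rows 2 and 3 attend to columns 2 and 3, so the pattern is two 2x2 diagonal blocks.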

Precautions

This PR's code supports only CUDA 11.2. CI currently has no GPU environment with CUDA 11.2, so all tests are skipped automatically.
The new OP is paddle._C_ops.sparse_attention. The Python API will be added in a follow-up PR.
This PR lacks tests for dynamic and static graphs; they will be added in subsequent PRs.

Result

paddle-bot-old · 2021-09-13T03:03:03Z

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

ZHUI · 2021-09-17T07:56:49Z

Which models is this intended to support?

Liu-xiandong · 2021-09-17T08:14:47Z

> Which models is this intended to support?

The sparse transformer on the NLP side.

lanxianghit · 2021-09-17T09:06:19Z

python/paddle/fluid/tests/unittests/white_list/op_threshold_white_list.py

@@ -46,6 +46,7 @@
    'cudnn_lstm', \
    'rnn', \
    'lgamma', \
+    'sparse_attention', \


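Is there sufficient justification for adding this to the whitelist?

At present the precision error comes mainly from matrix multiplication. I consulted Peng Juncai, who agreed to the addition.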
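
To illustrate the kind of error mentioned in this reply, here is a small NumPy sketch that compares a float32 matrix product against a float64 reference. It does not reproduce Paddle's precision check or the threshold used by op_threshold_white_list; the matrix size and random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256))
b = rng.standard_normal((256, 256))

# float64 product used as the reference result.
ref = a @ b

# Same product computed in float32, then widened for comparison.
low = (a.astype(np.float32) @ b.astype(np.float32)).astype(np.float64)

max_abs_err = np.abs(low - ref).max()
print("max abs error of float32 matmul:", max_abs_err)
```

The error grows with the length of the inner dimension, because each output element accumulates one rounding error per multiply-add.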

lanxianghit · 2021-09-17T09:09:18Z

The API shown in the example does not conform to the 2.0 specification: paddle.fluid.core.sparse_attention

Liu-xiandong · 2021-09-22T07:08:09Z

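> The API shown in the example does not conform to the 2.0 specification: paddle.fluid.core.sparse_attention

This PR already contains a large number of lines, so the Python API wrapper will be submitted in the next PR.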

Xreki

CI has no environment with CUDA 11.2 or later, so please paste the results of running the unit tests locally into the PR description.

Xreki · 2021-09-24T02:28:35Z

paddle/fluid/operators/sparse_attention_op.cu

+                               &output_lists[i], M, N, false, false);
+    }
+#else
+    PADDLE_THROW(platform::errors::InvalidArgument(


I believe there is an Unsupported error type.

Xreki · 2021-09-24T02:33:29Z

python/paddle/fluid/tests/unittests/test_sparse_attention_op.py

+        return -1
+
+
+def get_linux_platform():


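The operating system check can be handled in CMakeLists.txt.

CMake can control C++-side parameters, but the Python side cannot be controlled directly. It would have to be implemented in a way similar to operator registration, for example #26180.

It's not that complicated: CMake can detect whether the system is WIN or MACOS, and on those two systems this unit test is simply not defined.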
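
For context on the alternative being debated in this thread, a Python-side platform guard could look roughly like the sketch below. This is an illustration only, not the code in test_sparse_attention_op.py and not the CMake approach the reviewer recommends; the helper is_linux and the test class name are hypothetical.

```python
import platform
import unittest


def is_linux():
    """Hypothetical helper: True when the interpreter runs on Linux."""
    return platform.system() == "Linux"


@unittest.skipIf(not is_linux(), "test is only defined for Linux in this sketch")
class TestPlatformGuard(unittest.TestCase):
    def test_runs_only_on_linux(self):
        # Reached only when the skip condition above is False.
        self.assertEqual(platform.system(), "Linux")


# Run the case programmatically; on other systems it is reported as skipped.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPlatformGuard)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Excluding the test at the CMake level, as the reviewer suggests, keeps the platform decision in the build configuration, so the test file needs no platform logic at all.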

Xreki

LGTM. This PR can be merged first. Binding the sparse handle to the existing stream in ctx needs to be handled in the next PR.

Xreki · 2021-09-27T05:49:58Z

paddle/fluid/operators/sparse_attention_op.cu

+    const T* srcptr = src + layout_rowptr[cur_block_row];
+    T* attnptr = nullptr;
+    if (attn_mask != nullptr) {
+      const T* attnptr = attn_mask + cur_block_row * num_rows;


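This is presumably meant to assign to the attnptr defined at L84, but written this way it declares a new local variable, so the assignment has no effect.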

Xreki · 2021-09-28T08:37:57Z

paddle/fluid/operators/sparse_attention_op.cu

+*/
+template <typename DeviceContext, typename T>
+void SparseSoftmaxForward(const platform::CUDADeviceContext& ctx,
+                          const Tensor* offset, const Tensor* columns,


Inputs should preferably be of type const Tensor&.

Xreki · 2021-09-28T08:40:31Z

paddle/fluid/operators/sparse_attention_op.cu

+  }
+}
+
+void CusparseDestroy(cusparseDnMatDescr_t* dn_mat_first,


This way of wrapping is not ideal. In general, whoever creates a resource should also be the one to destroy it.

Xreki · 2021-09-28T08:42:49Z

paddle/fluid/operators/sparse_attention_op.cu

+      GetTransposeOperation(b_transpose), &alpha, mat_a, mat_b, &beta, mat_c,
+      gpu_type, CUSPARSE_SDDMM_ALG_DEFAULT, &buffer_size);
+  auto d_buffer_ptr = paddle::memory::Alloc(ctx, buffer_size);
+  void* d_buffer = static_cast<void*>(d_buffer_ptr->ptr());


Is this the workspace?

Xreki · 2021-09-28T08:45:26Z

python/paddle/fluid/tests/unittests/test_sparse_attention_op.py

+        return -1
+
+
+def get_linux_platform():


It's not that complicated: CMake can detect whether the system is WIN or MACOS, and on those two systems this unit test is simply not defined.

Add sparse_attention OPs, python api will be added in next pr

Add sparse_attention api, test=develop

617ad1e

Liu-xiandong added 7 commits September 13, 2021 04:20

clean useless code

a03c68f

add feature, batch_size and multi_heads

a48b7d9

clean code

5fe2fdc

clean useless code

d3f38a9

clean useless code

b8b810b

fix bug

1798bc8

fix ROCM-compile bug

8feba2b

xingfeng01 reviewed Sep 17, 2021

Modify the code

55d1360

AnnaTrainingG reviewed Sep 17, 2021

Liu-xiandong closed this Sep 17, 2021

add GetGpuOperation func

a478897

Liu-xiandong reopened this Sep 17, 2021

modify the PADDLE_ENFORCE_EQ

126527b

lanxianghit reviewed Sep 17, 2021

zkh2016 reviewed Sep 17, 2021

Liu-xiandong added 5 commits September 22, 2021 08:02

use memory::Alloc and remove cudaMalloc

fabe062

Added windows judgment logic

c261042

clean some code

7251d8b

clean some code

ab7cd2b

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

3c35c33

… Add_sparse_attention_api

Xreki reviewed Sep 24, 2021

Liu-xiandong added 2 commits September 24, 2021 12:15

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

274563f

… Add_sparse_attention_api

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

3f9bb9b

… Add_sparse_attention_api

Liu-xiandong added 2 commits September 26, 2021 09:26

modify based on reviewer

bfb2827

modify CMake

2a069bc

xingfeng01 approved these changes Sep 27, 2021

AnnaTrainingG approved these changes Sep 27, 2021

zkh2016 approved these changes Sep 27, 2021

Xreki approved these changes Sep 28, 2021

kolinwei approved these changes Sep 28, 2021

juncaipeng approved these changes Sep 28, 2021

lanxianghit approved these changes Sep 28, 2021

Liu-xiandong changed the title ~~Add sparse_attention api, test=develop~~ Add sparse_attention OP, test=develop Sep 28, 2021

lanxianghit merged commit 6b587e9 into PaddlePaddle:develop Sep 28, 2021

AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this pull request Sep 29, 2021

Add sparse_attention api, test=develop (PaddlePaddle#35676)

8456d0d

Add sparse_attention OPs, python api will be added in next pr

Liu-xiandong mentioned this pull request Oct 8, 2021

Add nn.functional.sparse_attention and some test cases, test=develop #35757

Merged

Liu-xiandong added a commit to Liu-xiandong/Paddle that referenced this pull request Oct 14, 2021

Add sparse_attention api, test=develop (PaddlePaddle#35676)

551425f

Add sparse_attention OPs, python api will be added in next pr

This was referenced Oct 14, 2021

[cherry-pick]Add sparse attention cherrypick #36447

Merged

[cherry-pick]Add nn sparse attention #36448

Closed

Add nn.functional.sparse_attention and some test cases, test=develop … #36551

Merged
