
Modified RandomKernel with Kernel Primitive API #39666

Merged
merged 4 commits into PaddlePaddle:develop on Feb 22, 2022

Conversation

AnnaTrainingG
Contributor

@AnnaTrainingG AnnaTrainingG commented Feb 17, 2022

PR types

Others

PR changes

OPs

Describe

Modified RandomKernel with Kernel Primitive API

Background: to raise KP operator coverage and provide operator support for 40+ XPU models, this PR replaces the kernel implementations of gaussian/uniform_random with Kernel Primitive (KP) implementations.

PR changes:

  1. [New] IndexKernel, a kernel that generates random numbers from each element's data index, added in paddle/fluid/operators/index_impl.cu.h.
  2. [Code unification] paddle/fluid/operators/uniform_random_inplace_op.cu and paddle/fluid/operators/uniform_random_op.cu duplicated a lot of code; the shared code has been moved into paddle/fluid/operators/uniform_random_op.h.
  3. [New] InitWithDataIndex API added to [primitive/datamover_primitives.h], which loads data indices into registers.

Correctness: the unit tests for uniform_random_inplace_op / uniform_random_op / gaussian_random already exist on the develop branch and are covered by the py3 CI.

  1. test_uniform_random_bf16_op
  2. test_gaussian_random_op
  3. test_uniform_random_op

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

Contributor

@ZzSean ZzSean left a comment


LGTM for op benchmark

@AnnaTrainingG AnnaTrainingG changed the title from "Modifide RandomKernel with Kernel Primitive API" to "Modified RandomKernel with Kernel Primitive API" on Feb 21, 2022
@zhangting2020
Copy link
Contributor

The PR description should explain the background of the problem and the PR's changes, for example what this PR adds, removes, or modifies.

Contributor

@limin2021 limin2021 left a comment


LGTM.

@@ -714,5 +714,14 @@ __device__ __forceinline__ void ReadDataBc(
}
}

template <typename T, int NX, int NY, int BlockSize>
Contributor


Should this interface have a description of its usage scenarios? All the other interfaces have explanations.

Contributor Author


Yes, comments and documentation will be added for all of these in a follow-up.

@@ -21,7 +21,7 @@
import paddle.fluid.core as core
from paddle.fluid.op import Operator
from paddle.fluid.executor import Executor
from op_test import OpTest
from paddle.fluid.tests.unittests.op_test import OpTest
Contributor


This line shouldn't need to change, should it?

Contributor Author


If this isn't changed, the import breaks when the file is referenced from another file (in a different directory).

int grid = config.block_per_grid.x;
int block = config.thread_per_block.x;
auto stream = dev_ctx.stream();
#endif
Contributor


Would it be better to merge the XPU thread configuration into GetGpuLaunchConfig1D, so the thread configuration here wouldn't need a branch? The same goes for obtaining the stream. Consider optimizing this.

Contributor Author


Sure, that will be added later.

@AnnaTrainingG AnnaTrainingG merged commit 9f94821 into PaddlePaddle:develop Feb 22, 2022