【CUDA Kernel No.32】Add .h file for box_clip_kernel -part #75592
Related PaddleCustomDevice PR: #2021
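Q1: The kernel I should declare in the header is GPUBoxClipKernel, right?
A1: Yes.
Q2: Should it be treated as a GPU-only kernel, or as a kernel implemented on both CPU and GPU?
A2: Treat it as GPU-only; the header can go in the gpu directory.
Q3: So I don't need to include this header in the CPU implementation?
A3: No, the CPU implementation does not need to include it. The CPU side can be left unchanged.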
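For illustration, here is a minimal sketch of the declaration/definition split discussed above. It uses stand-in types so that it compiles on its own: the `phi_sketch` namespace, the simplified `DenseTensor` and `GPUContext` structs, the parameter list, and the placeholder body are all assumptions and not the actual contents of this PR or of Paddle's headers.

```cpp
#ifndef BOX_CLIP_KERNEL_SKETCH_H_
#define BOX_CLIP_KERNEL_SKETCH_H_

#include <cassert>
#include <vector>

namespace phi_sketch {  // hypothetical stand-in for the real phi namespace

// Simplified stand-in for a tensor type; the real class is far richer.
struct DenseTensor {
  std::vector<float> data;
};

// Simplified stand-in for a GPU device context.
struct GPUContext {};

// --- What would live in the GPU-only header ---
// Declaration only. The CPU implementation never includes this header,
// so the CPU kernel can keep its own name and file untouched.
template <typename T, typename Context>
void GPUBoxClipKernel(const Context& dev_ctx,
                      const DenseTensor& input,
                      const DenseTensor& im_info,
                      DenseTensor* output);

// --- What would live in the .cu source file ---
// Placeholder body: a real implementation would launch a CUDA kernel here.
// This sketch only copies the input so the example stays runnable on a host.
template <typename T, typename Context>
void GPUBoxClipKernel(const Context& dev_ctx,
                      const DenseTensor& input,
                      const DenseTensor& im_info,
                      DenseTensor* output) {
  (void)dev_ctx;   // unused in this placeholder
  (void)im_info;   // unused in this placeholder
  assert(output != nullptr);
  output->data = input.data;
}

}  // namespace phi_sketch

#endif  // BOX_CLIP_KERNEL_SKETCH_H_
```

The point of the sketch is only the layout: the declaration is what other GPU-side code would include, while the definition and the kernel registration stay in the GPU source file.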
Thanks a lot! The code has been updated.
LGTM
/re-run all-failed
hi, @algorithm1832
PR Category
Custom Device
PR Types
Improvements
Description
One more small question:
The CPU implementation registers phi::BoxClipKernel, while the GPU implementation registers phi::GPUBoxClipKernel, so the names differ. In that case: