Skip to content

elementwise op#238

Merged
zjing14 merged 22 commits into
developfrom
eltwise_op
May 19, 2022
Merged

elementwise op#238
zjing14 merged 22 commits into
developfrom
eltwise_op

Conversation

@rocking5566
Copy link
Copy Markdown
Collaborator

  1. kernel of 1D binary_elementwise.
  2. Device op of 1D and 2D of binary_elementwise
  3. elementwise_add and broadcast_add example

@rocking5566 rocking5566 requested a review from zjing14 May 17, 2022 05:02
@zjing14 zjing14 mentioned this pull request May 17, 2022
@rocking5566 rocking5566 requested review from asroy and qianfengz and removed request for asroy May 17, 2022 05:35
Comment thread include/ck/tensor_operation/gpu/device/device_binary_elementwise.hpp Outdated
Comment thread include/ck/tensor_operation/gpu/device/device_binary_elementwise.hpp Outdated
@rocking5566 rocking5566 requested a review from zjing14 May 17, 2022 17:08
Comment thread include/ck/tensor_operation/gpu/grid/gridwise_binary_elementwise_1d.hpp Outdated
Comment thread example/19_binary_elementwise/elementwise_add_1d.cpp Outdated
Comment thread example/19_binary_elementwise/elementwise_add_1d.cpp Outdated
Comment thread example/19_binary_elementwise/elementwise_add_1d.cpp Outdated
Comment thread include/ck/tensor_operation/gpu/device/device_binary_elementwise.hpp Outdated
Comment thread example/19_binary_elementwise/elementwise_add_1d.cpp Outdated
Comment thread example/19_binary_elementwise/elementwise_add_1d.cpp Outdated
Comment thread include/ck/tensor_operation/gpu/grid/gridwise_binary_elementwise_1d.hpp Outdated
Comment thread include/ck/tensor_operation/gpu/grid/gridwise_binary_elementwise_1d.hpp Outdated
Copy link
Copy Markdown
Contributor

@zjing14 zjing14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zjing14 zjing14 merged commit aafc3ac into develop May 19, 2022
#pragma once
#include <vector>

namespace ck {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function doesn't need to stay in ck namespce

Copy link
Copy Markdown
Collaborator Author

@rocking5566 rocking5566 May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I send another PR
#242

a_m_device_buf.GetDeviceBuffer(),
b_m_device_buf.GetDeviceBuffer(),
c_m_device_buf.GetDeviceBuffer(),
ck::convert_vector_element_type<std::size_t, ck::index_t>(nchw),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does assign works? If so, convert_vector_element_type would not be needed
https://en.cppreference.com/w/cpp/container/vector/assign

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion.
You make it found a better way.
I will use constructor instead.
std::vectorck::index_t{nchw.begin(), nchw.end()}


Tensor<ABDataType> a_m(nchw);
Tensor<ABDataType> b_m(nchw);
Tensor<ABDataType> c_m(nchw);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, CDataType

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I send another PR
#242


struct Invoker : public BaseInvoker
{
Invoker(index_t blockSize) : BaseInvoker(), blockSize_(blockSize) {}
Copy link
Copy Markdown
Contributor

@asroy asroy May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blockSize should be inside Argument, not Invoker.

By design, DeviceOp and Argument should contain all the information necessary to launch a kernel, otherwise IsSupportedArgument(Argument*) would not know for sure if DeviceOp support a configuration or not

I think MakeArgument() should not have blockSize as argument, because MakeArgument() should not reflect kernel implementation detail. You can put blockSize as a member of this DeviceOp, and give it a default value (1024, for example), user can choose to change the value of blockSize then self, if really needed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I send another PR
#242

@rocking5566 rocking5566 deleted the eltwise_op branch July 26, 2023 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants