
OneDNN md-in-tensor refactoring part 5: Memory descriptor enabled for elementwises, reductions and expand_v2 ops #43036

Merged 5 commits into PaddlePaddle:develop on May 30, 2022

Conversation

jakpiase (Contributor)

PR types: Others

PR changes: OPs

Describe:

Memory descriptor enabled for elementwises, reductions and expand_v2 ops

jakpiase added the Intel label on May 26, 2022
@paddle-bot-old

Sorry to inform you that, after our discussion, this PR does not yet meet the merging standard (reference: Paddle native operator development specification). You may submit a new PR; this one will be closed for now. Thank you for your contribution.

jakpiase (Contributor, Author)

Reopened due to the license/cla check being stuck.

jakpiase reopened this on May 27, 2022
jakpiase requested a review from jczaja on May 27, 2022 at 16:25
jczaja (Contributor) previously approved these changes on May 27, 2022

LGTM

jakpiase (Contributor, Author)

@tsocha please review this PR

Comment on lines +709 to +711
platform::Place cpu_place, const Tensor* x,
Tensor* out, float scale_x, float scale_y,
const std::vector<int64_t>& extended_x_dims)
Contributor
[CONCERN]
The types are almost the same, but the argument order has changed.
I hope that our tests will detect any potential problems with it.

Was this change of order necessary?

Contributor Author
It was not necessary, but for me the order input, output is more intuitive than output, input. I have checked all the tests for reduce_grad and expand_v2 and they all work correctly after my changes.

Comment on lines 719 to 721
dnnl::primitive_attr attrs;
attrs.set_scales(DNNL_ARG_SRC_0, 0, {scale_x});
attrs.set_scales(DNNL_ARG_SRC_1, 0, {scale_y});
Contributor
[OPINION]
The old name was good.

const Tensor* y, std::vector<int64_t> y_tz,
const dnnl::primitive_attr& attr = NULL)
const Tensor* out, std::vector<int64_t> out_tz,
const dnnl::primitive_attr& attrs = NULL)
Contributor
[COMMENT]
If dnnl::primitive_attr can hold multiple attributes, its name should be changed in oneDNN; if it cannot, then our parameter should keep the old name.

Contributor Author

It can hold scales, post-ops, and zero-points, so it definitely supports multiple attributes, but it will probably never be renamed inside oneDNN for backward-compatibility reasons.
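For context, a minimal standalone sketch (assuming the oneDNN 2.x-era C++ API used in this PR; the helper name is hypothetical) of a single dnnl::primitive_attr aggregating several kinds of attributes: per-argument scales plus a fused post-op.

#include <vector>
#include "dnnl.hpp"

// Hypothetical helper: one primitive_attr carrying scales and a post-op.
dnnl::primitive_attr MakeAttrs(float scale_x, float scale_y) {
  dnnl::primitive_attr attrs;
  // Per-argument scales, as in the snippet discussed above.
  attrs.set_scales(DNNL_ARG_SRC_0, 0, {scale_x});
  attrs.set_scales(DNNL_ARG_SRC_1, 0, {scale_y});
  // A fused elementwise post-op (oneDNN 2.x signature: scale, alg, alpha, beta).
  dnnl::post_ops ops;
  ops.append_eltwise(1.0f, dnnl::algorithm::eltwise_relu, 0.0f, 0.0f);
  attrs.set_post_ops(ops);
  return attrs;
}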


std::vector<int64_t> output_dims(phi::vectorize(input->dims()));
std::vector<int64_t> output_dims = phi::vectorize(input->dims());
Contributor
[QUESTION]
Was this change necessary?

Contributor Author
No, it produces the exact same assembly; I will revert it then.
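A minimal standalone illustration (make_dims is a hypothetical stand-in for phi::vectorize(input->dims())) of why both forms generate identical code: each initializes the vector directly from the returned temporary.

#include <cstdint>
#include <vector>

// Hypothetical stand-in for phi::vectorize(input->dims()).
std::vector<int64_t> make_dims() { return {1, 3, 224, 224}; }

int main() {
  std::vector<int64_t> a(make_dims());   // direct-initialization
  std::vector<int64_t> b = make_dims();  // copy-initialization; since C++17 both
                                         // construct the vector in place
  return a == b ? 0 : 1;
}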

for (size_t i = 0; i < reduce_dims.size(); ++i) {
// handle negative dims, f.e. -1 means last dimension
Contributor
[CONCERN]
I think "last dimension" is not the best description.
In (1, 3, -1, 512, 512), is the last dimension -1 or 512?

Contributor Author
You're right, I'll change it to "rightmost"
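For illustration, a minimal standalone sketch (the function name and signature are hypothetical, not the kernel's actual code) of the negative-index handling described in the comment: -1 refers to the rightmost axis, -2 to the one before it, and so on.

#include <cstdint>
#include <vector>

// Hypothetical helper: map negative reduce dims onto absolute axis indices.
std::vector<int64_t> NormalizeReduceDims(std::vector<int64_t> reduce_dims,
                                         int64_t rank) {
  for (size_t i = 0; i < reduce_dims.size(); ++i) {
    if (reduce_dims[i] < 0) {
      reduce_dims[i] += rank;  // e.g. -1 -> rank - 1 (rightmost axis)
    }
  }
  return reduce_dims;
}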

Comment on lines +55 to +64
const auto* x = ctx.Input<LoDTensor>("X");
auto* out = ctx.Output<Tensor>("Out");

auto reduce_dims = ctx.Attr<std::vector<int>>("dim");
bool reduce_all = ctx.Attr<bool>("reduce_all");
bool keep_dim = ctx.Attr<bool>("keep_dim");

auto output_dims =
CalculateReducedDims(input, output, reduce_dims, reduce_all, keep_dim);
auto input_dims = phi::vectorize(input->dims());
auto x_tz = phi::vectorize(x->dims());
auto out_tz =
CalculateReducedDims(x, out, reduce_dims, reduce_all, keep_dim);
Contributor
[OPINION]
I see why you changed the names here, but IMO input is better than x.

Contributor Author
I changed them to unify all of our ops, since all new ops follow this pattern, i.e. inputs are named 'X' and 'Y' and the output is named 'Out'. Some legacy operators have different names because of backward compatibility, but I thought that unifying them to get rid of the additional noise would be nice.


auto reorder_dst_memory_p = reorder_handler.AcquireDstMemory(
output, input->mem_desc(), ctx.GetPlace());
// reuse same mem desc since it is a simple copy
Contributor
Suggested change
// reuse same mem desc since it is a simple copy
// reuse mem desc since it is a simple copy
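As a side note, a minimal standalone oneDNN sketch (not the Paddle helper above; shapes and buffers are made up for illustration) of the point in this comment: when the destination reuses the exact same memory descriptor as the source, the reorder degenerates into a simple copy.

#include <vector>
#include "dnnl.hpp"

int main() {
  dnnl::engine eng(dnnl::engine::kind::cpu, 0);
  dnnl::stream strm(eng);

  // One descriptor shared by src and dst, so no layout conversion is needed.
  dnnl::memory::desc md({2, 3, 4, 5}, dnnl::memory::data_type::f32,
                        dnnl::memory::format_tag::nchw);
  std::vector<float> src_buf(2 * 3 * 4 * 5, 1.0f), dst_buf(src_buf.size(), 0.0f);
  dnnl::memory src(md, eng, src_buf.data());
  dnnl::memory dst(md, eng, dst_buf.data());

  dnnl::reorder(src, dst).execute(strm, src, dst);  // behaves as a plain copy
  strm.wait();
  return 0;
}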

@@ -124,25 +126,18 @@ class ReduceGradMKLDNNKernel : public framework::OpKernel<T> {
bool keep_dim = ctx.Attr<bool>("keep_dim");
bool reduce_all = ctx.Attr<bool>("reduce_all");
auto dims = ctx.Attr<std::vector<int>>("dim");
const auto* dout = ctx.Input<Tensor>(framework::GradVarName("Out"));
auto* dout = ctx.Input<Tensor>(framework::GradVarName("Out"));
Contributor
[CONCERN]
Do you plan to change this pointer?

Contributor Author

Nope, nice catch, that was a mistake, thanks

Comment on lines +132 to +133
auto dout_tz = CalculateReducedDims(dx, dout, dims, reduce_all, keep_dim);
auto dx_tz = phi::vectorize(dx->dims());
Contributor
[CONCERN]
The logic changed here.
Input has been swapped with output; was that intentional?

What do "tz" and "d" mean in this context?

Contributor Author
Input and output were incorrectly swapped here earlier. The reduce kernel was one of the first ops that I made, and it had some inconsistencies and bugs, so this change is intentional.

"d" means that it is a grad tensor, so dout is the input and dx is the output.
I have no idea what "tz" means, but it is commonly used as a synonym for dims inside PaddlePaddle, oneDNN, and other frameworks.

jczaja (Contributor) left a comment

LGTM

jczaja merged commit 12d8a56 into PaddlePaddle:develop on May 30, 2022
fuyou765 pushed a commit to fuyou765/Paddle that referenced this pull request Jun 7, 2022
… elementwises, reductions and expand_v2 ops (PaddlePaddle#43036)

* enabled md in elementwises, reductions and expand_v2

* CI fix for invalid numpy copy

* fixed formatting

* CI rerun

* changes after review