
Added shape (U)INT8/BF16/FP32 oneDNN kernel #36033

Merged · 13 commits · Feb 11, 2022
43 changes: 43 additions & 0 deletions paddle/fluid/operators/mkldnn/shape_mkldnn_op.cc
@@ -0,0 +1,43 @@
/* Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/fluid/framework/op_registry.h"
#include "paddle/fluid/operators/shape_op.h"
#include "paddle/fluid/platform/mkldnn_helper.h"

namespace paddle {
namespace operators {

using paddle::framework::Tensor;

template <typename T>
class ShapeMKLDNNKernel : public ShapeKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
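// Run the plain CPU shape kernel first; it fills "Out" with the input's dims.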
ShapeKernel<T>::Compute(ctx);

auto* out = ctx.Output<Tensor>("Out");
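// Mark the output as oneDNN-formatted so the oneDNN execution path can
// consume it; a plain format tag is chosen based on the output's rank.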
out->set_layout(framework::DataLayout::kMKLDNN);
out->set_format(platform::GetPlainMKLDNNFormat(out->dims().size()));
}
};
} // namespace operators
} // namespace paddle

namespace ops = paddle::operators;
REGISTER_OP_KERNEL(shape, MKLDNN, paddle::platform::CPUPlace,
ops::ShapeMKLDNNKernel<float>,
ops::ShapeMKLDNNKernel<paddle::platform::bfloat16>,
ops::ShapeMKLDNNKernel<int8_t>,
ops::ShapeMKLDNNKernel<uint8_t>);
25 changes: 25 additions & 0 deletions paddle/fluid/operators/shape_op.cc
@@ -35,6 +35,21 @@ class ShapeOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", {in_dim.size()});
}

framework::OpKernelType GetExpectedKernelType(
const framework::ExecutionContext &ctx) const override {
auto input_data_type =
framework::OperatorWithKernel::IndicateVarDataType(ctx, "Input");

#ifdef PADDLE_WITH_MKLDNN
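// CanMKLDNNBeUsed() consults the "use_mkldnn" attribute (and the data type)
// to decide whether the oneDNN kernel should be dispatched.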
if (this->CanMKLDNNBeUsed(ctx, input_data_type)) {
return framework::OpKernelType(input_data_type, ctx.GetPlace(),
framework::DataLayout::kMKLDNN,
framework::LibraryType::kMKLDNN);
}
#endif
return framework::OpKernelType(input_data_type, ctx.GetPlace());
}

protected:
framework::OpKernelType GetKernelTypeForVar(
const std::string &var_name, const framework::Tensor &tensor,
@@ -58,6 +73,16 @@ Shape Operator.

Return the shape of the input.
)DOC");
AddAttr<bool>("use_mkldnn",
baoachun (Contributor):

@lidanqing-intel Do we still support modifying the properties of native operators now?

Contributor:

@baoachun The plan (as executed in #36541) is that the only new attribute to be added to native ops is "use_mkldnn". That is because in training there are no passes, so we still need an attribute (set via the env var FLAGS_use_mkldnn) so that PaddlePaddle runs oneDNN kernels.

Contributor Author:

The use_mkldnn attribute is necessary for running mkldnn kernels during training, since there are no inference passes in that mode. This attribute determines whether the mkldnn kernel should be run, and that check happens in GetExpectedKernelType(). It is common behavior for all mkldnn kernels, so the attribute must be included in the original operator definition. I can create another onednn-based op and move the mkldnn_data_type attribute into it, but that cannot be done with the use_mkldnn attribute. Personally, I would like to keep both of these attributes in the native op to allow use of the common inference infrastructure, but if that's a problem then I'll create a separate op for that.
Here's the code for the shape op's GetExpectedKernelType(), which uses use_mkldnn inside the this->CanMKLDNNBeUsed() call:

framework::OpKernelType GetExpectedKernelType(
const framework::ExecutionContext &ctx) const override {
auto input_data_type =
framework::OperatorWithKernel::IndicateVarDataType(ctx, "Input");
#ifdef PADDLE_WITH_MKLDNN
if (this->CanMKLDNNBeUsed(ctx, input_data_type)) {
return framework::OpKernelType(input_data_type, ctx.GetPlace(),
framework::DataLayout::kMKLDNN,
framework::LibraryType::kMKLDNN);
}
#endif
return framework::OpKernelType(input_data_type, ctx.GetPlace());
}

jczaja (Contributor), Feb 10, 2022:

@baoachun, @jakpiase
Training does not use passes. The Executor used for training therefore has a mechanism: when FLAGS_use_mkldnn=True, it iterates through all ops and, for every op that has a use_mkldnn attribute, sets that attribute to True. That is how Paddle is told, during training, to use oneDNN kernels.
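A minimal sketch of that mechanism (illustrative only; the block/AllOps() traversal shown here is an assumption, not the actual Executor code):

// Illustrative sketch: when FLAGS_use_mkldnn is set, walk every op in the
// program and enable "use_mkldnn" wherever the attribute exists, so that
// GetExpectedKernelType() later dispatches the oneDNN kernel.
if (FLAGS_use_mkldnn) {
  for (auto* op_desc : block->AllOps()) {  // hypothetical traversal
    if (op_desc->HasAttr("use_mkldnn")) {
      op_desc->SetAttr("use_mkldnn", true);
    }
  }
}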

Regarding GPU and its "use_cudnn": I looked at the code, and most operators (older ones like conv2d, softmax, gelu, activation, abs, ...) have "use_cudnn" defined, probably for API-compatibility reasons.
Some newer ops, e.g. matmul, do not have use_cudnn; the matmul CPU and GPU kernels are the same method, and the execution decision is made based on the Place parameter (CPU or GPU).

Others, like affine_channel, have their own GPU kernel, different from the CPU one, and the decision about which one runs is made by searching the registered kernels, keyed by OpKernelType, using the following comparison:

bool OpKernelType::operator==(const OpKernelType& o) const {
  return platform::places_are_same_class(place_, o.place_) &&
         data_type_ == o.data_type_ && data_layout_ == o.data_layout_ &&
         library_type_ == o.library_type_ &&
         customized_type_value_ == o.customized_type_value_;
}

So you can see that a kernel is chosen when the required key matches, e.g. library, place, and data type.

For GPU kernels there is a GPU place, so a plain library type plus the GPU place indicates that GPU kernels should be used.
For oneDNN kernels it is a bit harder, as there is only a CPU place, which is shared by CPU kernels and oneDNN kernels.
So to pick oneDNN kernels we also need to set the library to MKLDNN, and to set the library to MKLDNN we currently use "use_mkldnn".

What I'm trying to say is that if there were a oneDNN place, we could do the same as for GPU, i.e. discard the use_mkldnn attribute. But since there is no oneDNN place and we use the CPU place, we need some way to indicate during training that the library should be MKLDNN, and this is done with the use_mkldnn attribute.
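To make that concrete, here is a small sketch of the two kernel keys being contrasted (it mirrors the GetExpectedKernelType() code above; both keys use the CPU place and differ only in layout and library):

// Both keys carry a CPU place, so under OpKernelType::operator== they
// differ only in data_layout_ and library_type_, which are exactly the
// fields that the "use_mkldnn" attribute toggles.
framework::OpKernelType plain_cpu_key(input_data_type, ctx.GetPlace());
framework::OpKernelType onednn_key(input_data_type, ctx.GetPlace(),
                                   framework::DataLayout::kMKLDNN,
                                   framework::LibraryType::kMKLDNN);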

"(bool, default false) Only used in mkldnn kernel")
.SetDefault(false)
.AsExtra();
AddAttr<std::string>(
"mkldnn_data_type",
"(string, default \"float32\"). Data type of mkldnn kernel")
.SetDefault("float32")
.InEnum({"float32", "bfloat16", "int8"})
.AsExtra();
}
};

9 changes: 0 additions & 9 deletions paddle/fluid/platform/mkldnn_helper.h
@@ -346,31 +346,22 @@ inline dnnl::memory::format_tag GetPlainMKLDNNFormat(int tensor_rank) {
switch (tensor_rank) {
case 1:
return dnnl::memory::format_tag::a;
break;
case 2:
return dnnl::memory::format_tag::ab;
break;
case 3:
return dnnl::memory::format_tag::abc;
break;
case 4:
return dnnl::memory::format_tag::abcd;
break;
case 5:
return dnnl::memory::format_tag::abcde;
break;
case 6:
return dnnl::memory::format_tag::abcdef;
break;
case 7:
return dnnl::memory::format_tag::abcdefg;
break;
case 8:
return dnnl::memory::format_tag::abcdefgh;
break;
case 9:
return dnnl::memory::format_tag::abcdefghi;
break;
default:
PADDLE_THROW(platform::errors::Unimplemented(
"Paddle support tensors with rank in range <1, 9>, but received "
@@ -0,0 +1,63 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from auto_scan_test import MkldnnAutoScanTest
from program_config import TensorConfig, ProgramConfig, OpConfig
import numpy as np
from functools import partial
import unittest
from hypothesis import given
import hypothesis.strategies as st


class TestMkldnnShapeOp(MkldnnAutoScanTest):
def is_program_valid(self, program_config: ProgramConfig) -> bool:
return True

def sample_program_configs(self, *args, **kwargs):
def generate_input(*args, **kwargs):
return np.random.random(kwargs['in_shape']).astype(kwargs[
'in_dtype'])

shape_op = OpConfig(
type="shape",
inputs={"Input": ["input_data"]},
outputs={"Out": ["output_data"]})

program_config = ProgramConfig(
ops=[shape_op],
weights={},
inputs={
"input_data": TensorConfig(data_gen=partial(generate_input,
*args, **kwargs)),
},
outputs=["output_data"])

yield program_config

def sample_predictor_configs(self, program_config):
config = self.create_inference_config(use_mkldnn=True)
yield config, (1e-5, 1e-5)

@given(
in_shape=st.lists(
st.integers(
min_value=1, max_value=3), min_size=1, max_size=9),
in_dtype=st.sampled_from([np.float32, np.uint16, np.int8, np.uint8]))
def test(self, *args, **kwargs):
self.run_test(quant=False, *args, **kwargs)


if __name__ == "__main__":
unittest.main()
62 changes: 62 additions & 0 deletions python/paddle/fluid/tests/unittests/mkldnn/test_shape_mkldnn_op.py
@@ -0,0 +1,62 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import print_function

import unittest
import numpy as np
from paddle.fluid.tests.unittests.op_test import OpTest, OpTestTool
import paddle
from paddle.fluid import core
from paddle.fluid.op import Operator


@OpTestTool.skip_if_not_cpu_bf16()
class TestShape3DFP32OneDNNOp(OpTest):
def setUp(self):
self.op_type = "shape"
self.config()
self.attrs = {'use_mkldnn': True}
self.inputs = {'Input': np.zeros(self.shape).astype(self.dtype)}
self.outputs = {'Out': np.array(self.shape)}

def config(self):
self.shape = [5, 7, 4]
self.dtype = np.float32

def test_check_output(self):
self.check_output_with_place(core.CPUPlace())


class TestShape6DBF16OneDNNOp(TestShape3DFP32OneDNNOp):
def config(self):
self.shape = [10, 2, 3, 4, 5, 2]
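# np.uint16 is the NumPy stand-in that Paddle's tests use for bfloat16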
self.dtype = np.uint16


class TestShape9DINT8OneDNNOp(TestShape3DFP32OneDNNOp):
def config(self):
self.shape = [1, 2, 3, 4, 5, 6, 7, 8, 9]
self.dtype = np.int8


class TestShape2DUINT8OneDNNOp(TestShape3DFP32OneDNNOp):
def config(self):
self.shape = [7, 11]
self.dtype = np.uint8


if __name__ == '__main__':
paddle.enable_static()
unittest.main()