MKLDNN sum OP: implement primitive cache and class refactor #12
base: refactor
Conversation
This commit may add some overhead of managing NDArray for each fallback.
Conflicts: src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h
2. Add memory into signature; 3. Try to split BatchNorm into .h file and .cc file. Will finish it after backward code is refactored.
Caching primitive for BatchNorm forward computation
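The commit above caches the BatchNorm forward primitive keyed by an op signature. A minimal standalone sketch of that caching pattern is below; all names (`StubBnFwd`, `OpSignature`, `GetBnFwd`) are hypothetical stand-ins, since the real code wraps `mkldnn::batch_normalization_forward` and MXNet's `MKLDNNOpSignature`:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an MKL-DNN forward primitive; the real
// class would hold the mkldnn primitive and its memory descriptors.
struct StubBnFwd {
  explicit StubBnFwd(float eps) : eps(eps) {}
  float eps;
};

// Simplified op signature: a stream of integers describing the op's
// parameters and input shapes, comparable and hashable as a whole.
struct OpSignature {
  std::vector<int64_t> vals;
  void AddSign(int64_t v) { vals.push_back(v); }
  bool operator==(const OpSignature &o) const { return vals == o.vals; }
};

struct OpSignatureHash {
  size_t operator()(const OpSignature &s) const {
    size_t h = 0;
    for (int64_t v : s.vals) h = h * 31 + std::hash<int64_t>()(v);
    return h;
  }
};

// One cached primitive per unique signature; a thread_local map
// avoids locking, at the cost of one cache per thread.
StubBnFwd &GetBnFwd(float eps, const std::vector<int64_t> &shape) {
  static thread_local std::unordered_map<OpSignature, StubBnFwd,
                                         OpSignatureHash> cache;
  OpSignature key;
  key.AddSign(static_cast<int64_t>(eps * 1e6f));  // fold param into key
  for (int64_t d : shape) key.AddSign(d);         // fold shape into key
  auto it = cache.find(key);
  if (it == cache.end())
    it = cache.emplace(key, StubBnFwd(eps)).first;  // create on first use
  return it->second;
}
```

Repeated calls with the same parameters and shapes then reuse one primitive instead of rebuilding it on every forward pass.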
return this->num_args == other.num_args;
}
};
Why do you create a new parameter? You don't have to use MKLDNNParamOpSign; you can use MKLDNNOpSignature.
The parameter definition is moved from src/operator/tensor/elemwise_sum.cc to this header file so it can be used by mkldnn_sum.cc. To support the "add_n" (alias "ElementWiseSum") OP, which has a valid param, I used MKLDNNParamOpSign.
param = nnvm::get<ElementWiseSumParam>(attrs.parsed);
} else {
memset(&param, 0, sizeof(param));
}
The parameter doesn't exist.
When MKLDNNSumCompute() is invoked via the "add_n" (alias "ElementWiseSum") OP, the param exists. To adapt to both scenarios and provide a unified interface, the param is set to 0 when there is none.
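The zero-fill fallback described above can be sketched standalone. This is a hedged illustration, not the actual MXNet code: `ElementWiseSumParam` here is reduced to a plain POD struct, and the `has_param` flag stands in for checking whether the op was invoked as "add_n" (where the real code reads the param from `attrs.parsed`):

```cpp
#include <cassert>
#include <cstring>

// Simplified POD version of the op parameter; zero-filling with
// memset is only safe because the struct is trivially copyable.
struct ElementWiseSumParam {
  int num_args;
};

// If the op carries a param (the "add_n" path), use it; otherwise
// zero-initialize so both paths feed the same signature/cache logic.
ElementWiseSumParam GetParam(bool has_param, int num_args_if_any) {
  ElementWiseSumParam param;
  if (has_param) {
    param.num_args = num_args_if_any;  // real code: nnvm::get<...>(attrs.parsed)
  } else {
    std::memset(&param, 0, sizeof(param));  // plain sum path has no param
  }
  return param;
}
```

The zeroed param acts as a neutral key so the same primitive-cache lookup works whether or not the invoking OP defines a parameter.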
src/operator/nn/mkldnn/mkldnn_sum.cc
Outdated
private:
std::shared_ptr<mkldnn::sum> fwd;
std::vector<std::shared_ptr<mkldnn::memory>> in_data;
mkldnn_output_t out;
You shouldn't use mkldnn_output_t. It is designed to be used with CreateMKLDNNMem and CommitOutput; it's not meant for holding the mkldnn memory, because its second field is a raw pointer, and we shouldn't use a raw pointer to hold memory.
Thanks for the suggestion; I will change it to use std::shared_ptr to hold the output memory.
src/operator/nn/mkldnn/mkldnn_sum.cc
Outdated
CommitOutput(out_data, out_mem);
stream->RegisterPrim(*(this->fwd));
auto out_mem = CreateMKLDNNMem(output, this->fwd_pd->dst_primitive_desc(),
                               req);
you shouldn't call CreateMKLDNNMem twice. The way you restructure the code makes it difficult to work with the interface I designed.
To invoke CreateMKLDNNMem() only once while keeping a unified SetDataHandle/Execute API (passing NDArray as parameters), one option is to define two members in MKLDNNSumFwd, "std::shared_ptr<mkldnn::memory> out" and "OutDataOp op", to save the created memory and data op for later use (maybe a little ugly). Please let me know if there is a better choice.
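The member-caching idea proposed above can be sketched with stub types. This is an assumption-laden illustration of the shape of the class, not the real MKLDNNSumFwd: `Memory` stands in for `mkldnn::memory`, the enum mimics MXNet's `OutDataOp`, and `Execute` computes the sum directly where the real code would register the cached primitive on the stream:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Memory {            // stub for mkldnn::memory
  void *handle = nullptr;
};
enum class OutDataOp { Noop, CopyBack, AddBack };  // mimics MXNet's enum

// Hypothetical MKLDNNSumFwd shape: the output memory and the commit
// action are members, so CreateMKLDNNMem runs once (in SetDataHandle)
// and Execute only replays the cached primitive and commits.
class SumFwd {
 public:
  void SetDataHandle(const std::vector<void *> &inputs, void *output) {
    in_data_.clear();
    for (void *p : inputs) {
      auto m = std::make_shared<Memory>();
      m->handle = p;
      in_data_.push_back(m);
    }
    out_ = std::make_shared<Memory>();  // real code: CreateMKLDNNMem(...)
    out_->handle = output;
    op_ = OutDataOp::CopyBack;          // remembered for the later commit
  }

  void Execute(float *dst, const std::vector<const float *> &srcs, size_t n) {
    // Stand-in for: stream->RegisterPrim(*fwd); CommitOutput(out, {op_, ...});
    for (size_t i = 0; i < n; ++i) {
      float acc = 0.f;
      for (const float *s : srcs) acc += s[i];
      dst[i] = acc;
    }
  }

 private:
  std::vector<std::shared_ptr<Memory>> in_data_;
  std::shared_ptr<Memory> out_;       // shared_ptr, not a raw pointer
  OutDataOp op_ = OutDataOp::Noop;
};
```

Holding `out_` as a `shared_ptr` addresses the raw-pointer concern raised earlier, at the cost of the two extra members the author acknowledges may be a little ugly.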
Force-pushed from c5b06e8 to 8ba9736 (Compare)
* Added tutorial for FIT API
* Added tests for Fit API tutorial
* Updated index.md for the new tutorial to show up
* Addressed PR feedback
* Addressed PR feedback
* Removed spurious comment for Py2 and Py3 compatibility
* Address PR feedback
* Addressed PR feedback
* Fixed typo
* Added example to showcase custom event handler
* Fixed imports as estimator moved to contrib package
* Added a side note to inform about estimator reference being updated by the handlers
* Corrected typo
* update tutorial
* address comments
* new line
* fix import
* fix cached graph
* fix import
* address comments
* fix doc gen
* add softmax
* add to website index
* fix doc string
* Fix doc gen (zheng-da#12)
* fix warining
* fix test
* fix
* fix
* fix print
* fix test (zheng-da#13)
* fix warning (zheng-da#14)
* fix href (zheng-da#15)