Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

cherry-pick bug fixes in MKLDNN for v1.2.0 #11212

Merged
merged 12 commits into from
Jun 13, 2018
Merged

Conversation

zheng-da
Copy link
Contributor

@zheng-da zheng-da commented Jun 8, 2018

Description

This PR cherry-picks bug fixes for MKLDNN in the master branch to the v1.2.0 branch

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

xinyu-intel and others added 6 commits June 8, 2018 20:33
* add windows mkldnn instruction

* update readme

* typo full mkl to mkldnn

* update blas

* update mxnet url

* update mkl build

* intel mkl liscence

* retrigger
* use mkl-dnn for 'valid' pooling_convention only

* pooling convention full not supported by current mkl-dnn impl

* disable unreachable code

* add sample model test for mkldnn

* fix review feedback

* add jira link to comment

* fix lint issue

* rename python test for mkl

* enable python tests for mkldnn in CI

* use vgg16 with convention full

* fix unittest
…#10616)

* ensure same mkldnn engine is used for consistency

* add unittest for mkldnn engine thread testing

* add comments for thread context switching

* fix lint issue

* use dummy data
* handle inplace in mkldnn FallBackCompute

* add comments

* handle kAddTo in mkldnn FallBackCompute

* add PR feedback

* add unittest for mkldnn inplace sum with cpu data

* add back mkldnn engine threading unittest

* separate mkldnn install test and fix pylint issue

* remove --build from mkldnn jenkins test

* update mkldnn unittests

* update comments for mkldnn test

* remove python test doc string so unittest name is used
@zheng-da
Copy link
Contributor Author

zheng-da commented Jun 8, 2018

@anirudh2290

@anirudh2290
Copy link
Member

anirudh2290 commented Jun 8, 2018

Was #10591 mentioned in the dev ? Which PR are
89d7b9b and e728cf0 part of ?

@pengzhao-intel
Copy link
Contributor

@zheng-da please try if this issue (#11028) is gone in your branch.

@anirudh2290
Copy link
Member

this pr is also making a change to the mkldnn submodule through #10578 . This worries me since the mkldnn submodule change has a change of 124 files and this makes it a really big change for the patch release. I am not comfortable including #10578. Do other PRs depend on the change ?

@zheng-da
Copy link
Contributor Author

@anirudh2290 i have removed #10578
@pengzhao-intel it should work fine for #11028

@pengzhao-intel
Copy link
Contributor

@anirudh2290 The update of MKL-DNN in #10578 fixed the depthConv issue.

@anirudh2290
Copy link
Member

@pengzhao-intel I understand depthwise convolution is widely used but it is too big a change to be added to the patch release. What do others think @zheng-da @piiswrong ?

@pengzhao-intel
Copy link
Contributor

@anirudh2290 Totally understand it will be a risk.
The reason is MXNET used the specified CI of MKL-DNN. When we updated MKL-DNN to the bugfix CI, there are lots of other CIs so you see big changes. We have verified the performance of this CI, it works very well :)

@anirudh2290
Copy link
Member

@pengzhao-intel There are still a lot of non CI-specific code changes. Since MKLDNN is marked as experimental for the release, I would err on the side of caution, add it to known issues for MKLDNN, and not add it to the release.

// Check whether the number of format is correct.
CHECK_EQ(mkldnn_format_last, 56);
CHECK_EQ(mkldnn_nchw, 5);
CHECK_EQ(mkldnn_oihw, 12);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these unnecessary tests and how are they running on master ?

@zheng-da
Copy link
Contributor Author

zheng-da commented Jun 12, 2018 via email

@anirudh2290
Copy link
Member

@zheng-da ok got it!

zheng-da and others added 6 commits June 13, 2018 11:30
* test inference multiple times.

* Fix a bug in GetMKLDNNData().

* Update comments.

* Handle all cases for GetMKLDNNDataReorder

* avoid unnecessary message.

* Add C++ unit test for NDArray.

* Fix a minor bug.

* Unit tests on GetMKLDNNDataReorder.

* Fix lint error.

* Add more test cases.

* add comments for the test code.

* Reorganize test code.

* Fix cpp tests.

* test.

* Add a new Jenkins compile task.

* Update jenkins.

* update jenkins.

* Fix a Jenkins.

* Fix jenkins.

* Fix jenkins.

* Fix CMake for MKLDNN.

* Fix jenkins.

* update jenkins.

* update CMake.

* Fix cmake.

* update CI.

* add comment.

* add comments.

* cmake builds mkldnn with -mtune=generic by default.

* adjust comments.

remove unnecessary tests.
* Revert "Revert "invalidate outputs for imperative.""

This reverts commit b428937.

* invalidate mkldnn memory.

* enable test.
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
* Fix bugs in MKLDNN.

* add more test cases.

* Fix CopyFrom when it's the view of an NDArray.

* add test.

* check same shape correctly.

* add unit test for CopyFrom.

* Fix warning.

* Add test sum.

* fix sum.

* Fix fallback.

* Fix fallback of sum.

* add tests.

* Update mkldnn.cc
@anirudh2290 anirudh2290 merged commit 62a47a7 into apache:v1.2.0 Jun 13, 2018
@zheng-da zheng-da deleted the v1.2.0 branch September 29, 2018 21:31
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants