Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

handle fallback correctly for write inplace when the array is MKLDNN. #10651

Merged
merged 5 commits into from
May 15, 2018

Conversation

zheng-da
Copy link
Contributor

@zheng-da zheng-da commented Apr 23, 2018

Description

For non-MKLDNN operators, providing an MKLDNN array triggers fallback. In this case, the input array and the output array are no longer using the same memory even if the output requirement is write-inplace. For operators that handle write-inplace differently, this can potentially cause problems, as spotted in #9835

The fix in this PR is to change the original output requirement to WriteTo. @piiswrong is this an acceptable fix?

Thank @ashokei for finding the root cause of #9835
The code below can trigger the problem.

from mxnet import gluon
from mxnet import nd
import mxnet as mx
import numpy as np
net = gluon.nn.HybridSequential()
with net.name_scope():
    net.add(gluon.nn.Conv2D(channels=32, kernel_size=3, activation='relu'))
    net.add(gluon.nn.Flatten())
    net.add(gluon.nn.Dense(10))
net.hybridize()
ctx = mx.cpu(0)
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
x=nd.array(np.ones([32,3,224,224]), ctx)
y=net(x)
z=y[0].asnumpy()
print(z[0:5])

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@ThomasDelteil
Copy link
Contributor

ThomasDelteil commented Apr 23, 2018

@zheng-da could we add a unit test since you are able to reproduce the problem with this snippet?

Also, if you could create a jira issue and stick it in the title of your PR, that'd be great 👍

@zheng-da
Copy link
Contributor Author

yes, I'll add unit tests. I'm thinking of adding C++ unit tests to have more comprehensive tests to cover all different cases.

@pengzhao-intel
Copy link
Contributor

related to #10765

@pengzhao-intel
Copy link
Contributor

@zheng-da what's the progress of this PR? We need this fix for our tests.

@piiswrong piiswrong merged commit bae1332 into apache:master May 15, 2018
jinhuang415 pushed a commit to jinhuang415/incubator-mxnet that referenced this pull request May 29, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
zheng-da added a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 8, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
zheng-da added a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 11, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
zheng-da added a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 13, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
anirudh2290 pushed a commit that referenced this pull request Jun 13, 2018
…#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
zheng-da added a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…apache#10651)

* handle writeinplace correctly for mkldnn arrays.

* Add unit tests.

* Fix a bug in mkldnn copy.

* Fix a bug in ndarray copy.

* Verify results.
@zheng-da zheng-da deleted the fix_fallback_inplace branch July 5, 2018 06:20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants