This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

MKLDNN gradients incorrect with hybrid called with static_shape=True and train_mode=False #13445

Closed
azai91 opened this issue Nov 28, 2018 · 3 comments

Comments

@azai91
Contributor

azai91 commented Nov 28, 2018

When a Gluon network is hybridized with static_shape=True and train_mode=False, the gradients differ from those computed when hybridize is not called.

The unit test for this has been disabled (https://github.com/apache/incubator-mxnet/pull/12411/files#diff-962bd5bb7248659d7eb3be37ee8a4c6bR1246). To reproduce, change the following line to:

check_hybrid_static_memory(train_mode=[True, False], static_alloc=True, static_shape=True)

@vrakesh
Contributor

vrakesh commented Nov 28, 2018

@mxnet-label-bot add [MKLDNN, Gluon]

@vrakesh
Contributor

vrakesh commented Nov 28, 2018

@azai91 Thank you for reporting the issue. Could you add an example of the Gluon network code being used, for reference?

@pengzhao-intel
Contributor

Closing, since there is no reproducible case and we have fixed several bugs recently.
If the issue still exists, feel free to reopen.
