FullyConnected op with float64 and MKL-DNN fails if gradient are not set in a specific way #15767
Comments
Hey, this is the MXNet Label Bot.
@mxnet-label-bot add [Bug]
@wuxun-zhang please take a look at this bug.
@matteosal Thanks for reporting this issue. I can reproduce this issue locally. Firstly,
I also get the same problem with
Other RNN modes besides 'rnn_tanh' are also affected.
@wuxun-zhang let's double-check all data types in the MKL-DNN backend. Maybe the fix should be in 1.5.1. @TaoLv
It seems that there is no data type check for the MKL-DNN stateful RNN implementation (see https://github.com/apache/incubator-mxnet/blob/master/src/operator/rnn.cc#L226). So, when input data is
The execution trace of the RNN is marked out below.
It's not all about float64, but about
@pengzhao-intel @TaoLv v1.5.0 doesn't have this issue, so we don't need to fix it in v1.5.1.
@ZhennanQin Can we add a data type check here #L1663 to disable the subgraph when the input data type is not supported by MKL-DNN?
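For illustration only, the kind of guard being proposed could look roughly like the sketch below. It is plain Python rather than the actual C++ subgraph code, every name in it (`MKLDNN_SUPPORTED_DTYPES`, `can_use_mkldnn`, `choose_backend`) is hypothetical, and treating float32 as the only supported type is an assumption made for the example.

```python
# Hypothetical sketch of a dtype guard; not the real MXNet subgraph code.
# Assumption: only float32 inputs are handled by the MKL-DNN path.
MKLDNN_SUPPORTED_DTYPES = frozenset({"float32"})


def can_use_mkldnn(input_dtypes):
    """Return True only if every input dtype is in the supported set."""
    return all(dtype in MKLDNN_SUPPORTED_DTYPES for dtype in input_dtypes)


def choose_backend(input_dtypes):
    """Fall back to the native implementation for unsupported dtypes."""
    return "mkldnn" if can_use_mkldnn(input_dtypes) else "native"
```

With a check like this, a float64 input would be routed to the default implementation instead of reaching an MKL-DNN kernel that cannot handle it.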
That's nice, and we can try to resolve it in 1.6.
@matteosal sorry for the delay. The PR was blocked by a 3rd party package, but that is resolved and it will be merged soon.
Description
With MKL-DNN and float64 arrays, getting the output of a FullyConnected op after a forward pass fails unless the gradient update method is not 'null' and explicit gradient arrays are specified (even though no backward pass is involved).
Environment info (Required)
Package used: python
Build info (Required if built from source)
Compiler: gcc
MXNet commit hash: 3255d87
Build config: plain config.mk with USE_OPENCV=0
Error Message:
Minimum reproducible example
The above script works, but setting args_grad = None or grad_req = 'null' (or both) makes it fail with this error:
Every combination used to work in commit 076b2f3
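The original script and error log are not shown in this copy of the thread. As a rough sketch of the four combinations described above (not the reporter's script), something along these lines could be used on an MXNet 1.x build with MKL-DNN. The shapes, the layer name `fc`, and the helper names `run_case` and `main` are arbitrary choices made for the example.

```python
import itertools

# The four combinations discussed in the report:
# explicit gradient arrays (True) or args_grad=None (False),
# combined with grad_req 'write' or 'null'.
CASES = list(itertools.product([True, False], ["write", "null"]))


def run_case(with_grad_arrays, grad_req, dtype="float64"):
    """Bind a small FullyConnected symbol and read its output after forward."""
    import mxnet as mx  # imported lazily; needs an MXNet 1.x installation

    data = mx.sym.Variable("data")
    net = mx.sym.FullyConnected(data, num_hidden=3, name="fc")

    # Illustrative shapes only; they are not taken from the report.
    shapes = {"data": (2, 4), "fc_weight": (3, 4), "fc_bias": (3,)}
    args = {name: mx.nd.ones(shape, dtype=dtype) for name, shape in shapes.items()}

    grads = None
    if with_grad_arrays:
        grads = {name: mx.nd.zeros(shape, dtype=dtype) for name, shape in shapes.items()}

    ex = net.bind(mx.cpu(), args, args_grad=grads, grad_req=grad_req)
    ex.forward()
    # Reading the output is the step the report describes as failing.
    return ex.outputs[0].asnumpy()


def main():
    """Try every combination and print whether reading the output worked."""
    for with_grad_arrays, grad_req in CASES:
        try:
            out = run_case(with_grad_arrays, grad_req)
            status = "ok, output shape %s" % (out.shape,)
        except Exception as exc:
            status = "failed: %s" % exc
        print("grad arrays=%s grad_req=%r: %s" % (with_grad_arrays, grad_req, status))
```

Calling `main()` runs all four cases. According to the description, only the case with explicit gradient arrays and a grad_req other than 'null' would be expected to succeed on the affected commit.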