MKL-DNN gives wrong convolution bias gradient if weights gradient is not requested #15464
Comments
Hey, this is the MXNet Label Bot. Thanks for reporting the potential issue.
@matteosal I think this is caused by the fact that the forward convolution operator supports computation either with or without bias, and both cases require weights. Correspondingly, MKL-DNN exposes only two backward APIs, one for each of these forward cases, so in the backward pass the bias gradient can only be computed together with the weights gradient. Actually, we can just compute the weights gradient anyway and discard it (see the sketch below). We have located the problem; further verification is required before opening a PR.
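As a user-level illustration of that idea, here is a minimal Python sketch, assuming the symbolic `simple_bind` API; the variable names, shapes, and values are illustrative and not taken from this issue:

```python
import mxnet as mx

# Hypothetical workaround sketch: when only the bias gradient is needed,
# also request the weights gradient and simply ignore it, so that the
# "with bias" backward path is exercised.
data = mx.sym.Variable('data')
w = mx.sym.Variable('w')
b = mx.sym.Variable('b')
out = mx.sym.Convolution(data=data, weight=w, bias=b,
                         num_filter=1, kernel=(3, 3), pad=(1, 1))

exe = out.simple_bind(mx.cpu(),
                      grad_req={'data': 'null', 'w': 'write', 'b': 'write'},
                      data=(1, 1, 3, 3), w=(1, 1, 3, 3), b=(1,))
exe.arg_dict['data'][:] = 1
exe.arg_dict['w'][:] = 1
exe.arg_dict['b'][:] = 0
exe.forward(is_train=True)
exe.backward(mx.nd.ones((1, 1, 3, 3)))

# The bias gradient is the sum of the output gradient (9 ones here);
# the weights gradient is computed but discarded.
grads = dict(zip(out.list_arguments(), exe.grad_arrays))
print(grads['b'].asnumpy())  # [9.]
```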
I don't have a use case for this; I'm reporting the bug because our unit tests at Wolfram Research spotted it.
@matteosal No problem, we will create the test case and file a PR soon.
Thanks! |
The fix is merged; closing this issue.
Description
When using MKL-DNN and asking the gradient of a convolution with respect to its biases, the result is wrong unless the gradient with respect to the weights is also requested.
Environment info (Required)
Using the Python interface
Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio): gcc
MXNet commit hash: 6a8d9eb
Build config: unchanged `config.mk`, except for `USE_OPENCV = 0`
Minimum reproducible example
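A minimal reproduction sketch along these lines, assuming the symbolic API (the names `grad1`/`grad2` match the output described below, but the shapes, values, and `pad` setting are illustrative assumptions):

```python
import mxnet as mx

# A small convolution with bias, bound twice with different grad_req
# settings: exe1 requests gradients for both weights and bias, while
# exe2 requests the bias gradient only.
data = mx.sym.Variable('data')
w = mx.sym.Variable('w')
b = mx.sym.Variable('b')
out = mx.sym.Convolution(data=data, weight=w, bias=b,
                         num_filter=1, kernel=(3, 3), pad=(1, 1))

shapes = {'data': (1, 1, 3, 3), 'w': (1, 1, 3, 3), 'b': (1,)}

exe1 = out.simple_bind(mx.cpu(),
                       grad_req={'data': 'null', 'w': 'write', 'b': 'write'},
                       **shapes)
exe2 = out.simple_bind(mx.cpu(),
                       grad_req={'data': 'null', 'w': 'null', 'b': 'write'},
                       **shapes)

for exe in (exe1, exe2):
    exe.arg_dict['data'][:] = 1
    exe.arg_dict['w'][:] = 1
    exe.arg_dict['b'][:] = 0
    exe.forward(is_train=True)
    # With an all-ones output gradient over the 3x3 output, the
    # correct bias gradient is 9.
    exe.backward(mx.nd.ones((1, 1, 3, 3)))

grad1 = dict(zip(out.list_arguments(), exe1.grad_arrays))
grad2 = dict(zip(out.list_arguments(), exe2.grad_arrays))
print("grad1['b'] =", grad1['b'].asnumpy())
print("grad2['b'] =", grad2['b'].asnumpy())
```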
The above script prints a wrong value (0) for `grad2['b']`, while `grad1['b']` is correct (9). Running with `MXNET_MKLDNN_ENABLED=0` produces the correct result (9) for both gradients.