This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
When using gluon.nn.BatchNorm(scale=False) on GPU, the computed gradient for beta is incorrect: it appears to be accumulated across iterations instead of being recomputed each time.
When setting scale=True, or when running on CPU, the gradient is correct.
This problem can make a network hard to converge during training.
Environment info (Required)
CentOS Linux release 7.2.1511 (Core)
GTX 1080Ti
Driver Version: 384.69
CUDA Version 9.0.176
installed with pip:
numpy 1.17.2
mxnet-cu90 1.5.0
Code
In this example, the grad of beta should be [1, 1, 1] at each iteration.
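The original snippet was not captured in this page, so the following is only a sketch of the math behind the expected value (the shapes and loss are assumptions, not the reporter's exact code). With scale=False, BatchNorm effectively computes y = x_hat + beta (gamma fixed at 1), so for a loss of y.mean(axis=0).sum() the gradient of beta is exactly 1 per channel, every iteration, with no accumulation:

```python
import numpy as np

# Hypothetical reproduction of the expected gradient math.
# With scale=False, BatchNorm computes y = x_hat + beta (no gamma),
# so for loss = y.mean(axis=0).sum(), dL/dbeta_c = 1 per channel.
rng = np.random.default_rng(0)
n, c = 4, 3               # assumed batch size and channel count
beta = np.zeros(c)

for it in range(3):
    x = rng.normal(size=(n, c))
    x_hat = (x - x.mean(0)) / np.sqrt(x.var(0) + 1e-5)  # batch-normalized input
    y = x_hat + beta                                    # scale=False: no gamma
    # Backward pass for beta given loss = y.mean(axis=0).sum():
    dy = np.ones((n, c)) / n    # d(loss)/dy
    dbeta = dy.sum(axis=0)      # d(loss)/dbeta, fresh each iteration
    print(it, dbeta)            # → [1. 1. 1.] every time
```

The bug report says the GPU path with scale=False instead returns a value that grows each iteration, as if dbeta were added to the previous gradient rather than overwriting it (i.e. behaving like grad_req='add').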
Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended label(s): Gluon, Bug
output: