Problem when setting fix_gamma=True in BatchNorm #9624
Comments
This is a legacy design defect. When fix_gamma is true, there shouldn't be a gamma parameter.
@sandeep-krishnamurthy: Tag: Bug
@solin319 how did you check the value of gamma? I suspect I am facing the same issue and hence want to verify, but unfortunately I couldn't find gamma in either arg_params or aux_params.
@mxnet-label-bot add [Operator]
@solin319 I guess the fix is already merged and closed.
@Vikas89 the fix is only for the CoreML converter. The operator hasn't been fixed yet.
@solin319 Does it mean that the parameters would be updated even if grad_req is set to "null", as long as wd_mult is not set to zero? I think this behavior is unexpected.
If fix_gamma is true, BatchNorm sets gamma to 1 and its gradient to 0.
But the value of gamma is still changed during the parameter update, so the gamma saved in the param file is not 1. This causes problems when converting MXNet parameters to other deep-learning frameworks.
This problem is caused by the default weight decay in the SGD optimizer: even with a zero gradient, weight decay still shrinks gamma a little at every update.
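A minimal sketch of the drift (plain Python, made-up lr/wd values, momentum and gradient rescaling ignored): MXNet's SGD step is roughly `w = w - lr * (grad + wd * w)`, so a zero gradient does not stop the wd term.

```python
# Illustration only: why gamma drifts away from 1 under default weight decay.
lr, wd = 0.1, 1e-4              # assumed example hyperparameters
gamma, grad = 1.0, 0.0          # fix_gamma=True forces the gradient to 0
for step in range(10000):
    gamma -= lr * (grad + wd * gamma)   # simplified SGD update with weight decay
print(gamma)                    # ~0.905, no longer exactly 1.0
```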
We must define the gamma variable with wd_mult=0 to keep gamma fixed at 1 during training, as in the sketch below.
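A minimal sketch of that workaround with the symbolic API (the layer/variable names like `bn0_gamma` are just for illustration):

```python
import mxnet as mx

data = mx.sym.Variable('data')
# Declare gamma explicitly with wd_mult=0 so the optimizer's weight decay
# leaves it untouched while fix_gamma=True keeps its gradient at 0.
gamma = mx.sym.Variable('bn0_gamma', wd_mult=0.0)
beta = mx.sym.Variable('bn0_beta')
net = mx.sym.BatchNorm(data=data, gamma=gamma, beta=beta,
                       fix_gamma=True, name='bn0')
```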
Can MXNet set the wd of gamma to 0 automatically when fix_gamma=True?