Describe the bug
There is an edge case where the focal re-weighting scheme can produce numerically unstable derivatives. It is defined as: $$u(p) = -\alpha (1 - p)^\gamma \log(p)$$
The derivative w.r.t. $p$ is: $$\frac{\partial u}{\partial p} = \alpha\gamma(1-p)^{\gamma-1}\log(p) - \alpha (1 - p)^\gamma \cdot \frac{1}{p}$$
Since $\gamma - 1 \leq 0$ when $\gamma \leq 1$, the first term is undefined at $1 - p = 0$. I ran into this while training a model, and it resulted in NaNs in the model weights.
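The blow-up can be checked numerically. A minimal sketch (the values $\alpha = 0.25$, $\gamma = 0.5$ and the function name `du_dp` are illustrative assumptions, not taken from the repository):

```python
import numpy as np

def du_dp(p, alpha=0.25, gamma=0.5):
    """Analytic derivative of u(p) = -alpha * (1 - p)**gamma * log(p)."""
    p = np.asarray(p, dtype=np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        # As p -> 1 with gamma < 1, (1 - p)**(gamma - 1) tends to infinity
        # while log(p) tends to 0, so the first term evaluates to inf * 0 = nan.
        return (alpha * gamma * (1.0 - p) ** (gamma - 1.0) * np.log(p)
                - alpha * (1.0 - p) ** gamma / p)

print(du_dp(0.9))  # finite
print(du_dp(1.0))  # nan
```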
I fixed it by adding a small value to the focal re-weighting implementation (line 43 in `ece_loss\loss.py`): `self.alpha * ((1.0 - loss_p + 1e-10) ** self.gamma) * loss`
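A sketch of why the added epsilon restores a finite gradient (again, $\alpha = 0.25$, $\gamma = 0.5$ and the function name are assumptions for illustration; only the `1e-10` inside the base mirrors the patched expression):

```python
import numpy as np

def du_dp_patched(p, alpha=0.25, gamma=0.5, eps=1e-10):
    """Derivative of the patched weighting -alpha * (1 - p + eps)**gamma * log(p)."""
    q = 1.0 - np.asarray(p, dtype=np.float64) + eps
    # With eps > 0, q**(gamma - 1) stays finite even at p == 1, so its
    # product with log(p) -> 0 is a well-defined 0 instead of inf * 0.
    return alpha * gamma * q ** (gamma - 1.0) * np.log(p) - alpha * q ** gamma / p

print(du_dp_patched(1.0))  # small finite value instead of nan
```

An alternative would be clamping `p` away from 1 before the power, but adding the epsilon inside the base is the smaller change and matches the fix above.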