Describe the bug
There is an edge case where the focal re-weighting scheme can produce numerically unstable derivatives. It is defined as: $$u(p) = -\alpha (1 - p)^\gamma \log(p)$$
The derivative w.r.t. $p$ is: $$\frac{\partial u}{\partial p} = \alpha\gamma(1-p)^{\gamma-1}\log(p) - \alpha (1 - p)^\gamma \cdot \frac{1}{p}$$
Since $\gamma - 1 \leq 0$ when $\gamma \leq 1$, the first term is undefined at $1 - p = 0$. I ran into this while training a model, and it resulted in NaNs in the model weights.
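The blow-up can be checked numerically. A minimal sketch (the values $\alpha = 0.25$, $\gamma = 0.5$ and the function name `du_dp` are illustrative assumptions, not taken from the repository):

```python
import numpy as np

def du_dp(p, alpha=0.25, gamma=0.5):
    """Analytic derivative of u(p) = -alpha * (1 - p)**gamma * log(p)."""
    p = np.asarray(p, dtype=np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        # As p -> 1 with gamma < 1, (1 - p)**(gamma - 1) tends to infinity
        # while log(p) tends to 0, so the first term evaluates to inf * 0 = nan.
        return (alpha * gamma * (1.0 - p) ** (gamma - 1.0) * np.log(p)
                - alpha * (1.0 - p) ** gamma / p)

print(du_dp(0.9))  # finite
print(du_dp(1.0))  # nan
```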
I fixed it by adding a small value to the focal re-weighting implementation (line 43 in `ece_loss\loss.py`): `self.alpha * ((1.0 - loss_p + 1e-10) ** self.gamma) * loss`
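A sketch of why the added epsilon restores a finite gradient (again, $\alpha = 0.25$, $\gamma = 0.5$ and the function name are assumptions for illustration; only the `1e-10` inside the base mirrors the patched expression):

```python
import numpy as np

def du_dp_patched(p, alpha=0.25, gamma=0.5, eps=1e-10):
    """Derivative of the patched weighting -alpha * (1 - p + eps)**gamma * log(p)."""
    q = 1.0 - np.asarray(p, dtype=np.float64) + eps
    # With eps > 0, q**(gamma - 1) stays finite even at p == 1, so its
    # product with log(p) -> 0 is a well-defined 0 instead of inf * 0.
    return alpha * gamma * q ** (gamma - 1.0) * np.log(p) - alpha * q ** gamma / p

print(du_dp_patched(1.0))  # small finite value instead of nan
```

An alternative would be clamping `p` away from 1 before the power, but adding the epsilon inside the base is the smaller change and matches the fix above.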