Got -9223372036854775808 (-2^63) when differentiating a sequential sum of squares #370
Comments
I think it's integer overflow, i.e. if you ensure your input array is floating point it will likely work. It might still be considered a bug in autograd, though, as silently returning the wrong value is pretty nasty, even if the returned value is likely to be obviously wrong.
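As a quick illustration of why this particular value shows up at all (a sketch of one plausible mechanism, not a verified trace through autograd's internals): in NumPy, casting a NaN to a 64-bit integer yields INT64_MIN on common platforms, which is exactly -9223372036854775808.

```python
import numpy as np

# Casting a non-finite float to int64 is implementation-defined; on common
# platforms NumPy produces INT64_MIN = -2**63 (possibly with a RuntimeWarning).
nan_grad = np.array([np.nan, -2.0])   # hypothetical gradient with a NaN entry
print(nan_grad.astype(np.int64))      # e.g. [-9223372036854775808  -2]
```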
I tried your code and converted your input, and got a corresponding output. Perhaps something is going on with np.linalg.norm: a direct reformulation of the sum works fine given your test vector. That version won't necessarily work on the rows of a matrix, but if I understand you correctly you can extend it to operate row-wise; this still works fine for the test vector, and seems to work for a test matrix as well (see the sketch below).
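The code snippets from this comment did not survive above; what follows is my own minimal sketch of the kind of reformulation being described (assumed, not the commenter's original code): compute the squared differences directly, so np.linalg.norm never appears in the graph.

```python
import autograd.numpy as np
from autograd import grad

# Vector case: sum of squared differences between adjacent entries.
def f_vec(x):
    return np.sum((x[1:] - x[:-1]) ** 2)

# Matrix case: sum of squared norms between adjacent rows, written without
# np.linalg.norm so no sqrt-of-zero shows up during differentiation.
def f_rows(x):
    return np.sum((x[1:, :] - x[:-1, :]) ** 2)

x = np.array([[1.0], [0.0], [0.0]])   # hypothetical test input with equal adjacent rows
print(grad(f_rows)(x))                # [[ 2.] [-2.] [ 0.]] -- finite, no -2**63 or NaN
```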
This is probably a problem that essentially all automatic differentiation algorithms will encounter. A minimal example looks like this (run it twice to get rid of the warning :-)):

```python
from autograd import grad
import autograd.numpy as np

func = lambda x: np.sqrt(x**2)**2
x = np.array([0.0])
grad_fval = grad(func)(x)
print(grad_fval)
```

Mathematically, the reason is clear when we write down the chain rule:

$$\frac{d}{dx}\left(\sqrt{x^2}\right)^2 = (2 w_2) \cdot \frac{1}{2\sqrt{w_1}} \cdot (2x), \qquad w_1 = x^2,\; w_2 = \sqrt{x^2}.$$

When x = 0, this is something like 0 * (1/0) * 0, a typical indeterminate form. In other words, the function norm() creates a discontinuity of the derivative at the origin, but then you square it, making it continuous again. The program is just not clever enough to eliminate this spurious discontinuity. Luckily, as @jermwatt pointed out, there is a warning, and this warning should not be ignored.

BTW: I'm worried about this "bug", because a highly nonlinear objective function might -- if unfortunate enough -- hit this problem. Do we have a guarantee that the warning about a non-smooth or indeterminate result is always shown?
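To see that the indeterminate form is an artifact of the intermediate sqrt rather than of the composite function itself, it helps to compare against the algebraically equivalent form with no square root in the graph (a small follow-up sketch, not part of the original comment):

```python
import autograd.numpy as np
from autograd import grad

x = np.array([0.0])

composed = lambda v: np.sum(np.sqrt(v ** 2) ** 2)  # sqrt makes the derivative 0 * inf * 0 at zero
direct   = lambda v: np.sum(v ** 2)                # same function, no sqrt in the graph

print(grad(composed)(x))  # [nan], with a RuntimeWarning
print(grad(direct)(x))    # [0.]
```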
Fixes HIPS#370. Now the gradient of np.linalg.norm() at the zero (origin) point is the same as for np.abs, namely zero, which is one of its subgradients. For second-order gradients, mathematically they should be +infinity, but here when ord >= 2 it returns 0 (same as np.abs()); when 1 < ord < 2, it is NaN with plenty of warnings, which should be enough to prevent the user from doing that.
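As a stop-gap before such a fix lands, the same subgradient convention can be emulated in user code by wrapping the norm as a custom primitive (a rough sketch using autograd's extension API, not the PR's actual diff; safe_norm is a made-up name and it only handles the plain 2-norm of a vector):

```python
import autograd.numpy as np
from autograd import grad
from autograd.extend import primitive, defvjp

@primitive
def safe_norm(x):
    # Ordinary Euclidean norm; only its gradient rule is customized below.
    return np.sqrt(np.sum(x ** 2))

def safe_norm_vjp(ans, x):
    def vjp(g):
        if ans == 0.0:
            return np.zeros_like(x)   # pick the subgradient 0 at the origin
        return g * x / ans            # usual x / ||x|| away from the origin
    return vjp

defvjp(safe_norm, safe_norm_vjp)

x = np.array([0.0, 0.0])
print(grad(lambda v: safe_norm(v) ** 2)(x))   # [0. 0.] instead of [nan nan]
```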
I have the following objective function. Given an N by M matrix x, it computes the sum of squared norms between adjacent rows. (For instance, if x is a sequence of 2D locations, then this function computes the sum of squared step lengths along the trajectory.) I'd like to compute the gradient of this function with respect to x, so I used autograd.grad. But for the example below, I get -9223372036854775808 for some of the gradient entries.
This doesn't look right (unless I made a silly mistake): if I manually compute the gradient of f with respect to a component of x, say x[i], the formula I get is df(x)/dx[i] = 4*x[i] - 2*x[i-1] - 2*x[i+1] (for interior components). Then, in the above example, when i = 1, df(x)/dx[1] = 4*0 - 2*1 - 2*0 = -2. How come it becomes -9223372036854775808? I'm very confused.

However, if I just change the entry x[2] from 0 to 0.1, I get the expected result: again when i = 1, df(x)/dx[1] = 4*0 - 2*1 - 2*0.1 = -2.2.
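For reference, the formula used in the two hand computations above follows directly from differentiating the two squared-norm terms in which x[i] appears (my restatement, consistent with the numbers quoted):

$$
f(x) = \sum_{i=0}^{N-2} \lVert x_{i+1} - x_i \rVert^2,
\qquad
\frac{\partial f}{\partial x_i} = 2(x_i - x_{i-1}) - 2(x_{i+1} - x_i) = 4x_i - 2x_{i-1} - 2x_{i+1}
\quad\text{for } 0 < i < N-1.
$$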
I found that autograd doesn't compute the correct result when any of the adjacent components of x have the same value. It just uniformly outputs -9223372036854775808 for the gradient w.r.t. all those components. Is this a bug?
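The objective function and the input arrays from the original report are not reproduced above, so here is a hedged reconstruction (my own guess at f and at the inputs, chosen to be consistent with the numbers quoted in the issue) that ties together the integer-dtype and norm-at-zero observations from the comments:

```python
import autograd.numpy as np
from autograd import grad

# Reconstructed objective: sum of squared norms between adjacent rows of x.
# The original snippet may differ in details (e.g. whether np.linalg.norm is used).
def f(x):
    return np.sum(np.linalg.norm(x[1:, :] - x[:-1, :], axis=1) ** 2)

x_int   = np.array([[1], [0], [0]])         # integer dtype, adjacent rows equal
x_float = np.array([[1.0], [0.0], [0.1]])   # float dtype, no equal adjacent rows

print(grad(f)(x_float))   # [[ 2. ] [-2.2] [ 0.2]] -- matches the hand-computed formula
print(grad(f)(x_int))     # reported in this issue to contain -9223372036854775808
                          # (likely a NaN gradient forced into the int64 input dtype)
```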