Got -9223372036854775808 (-2^63) when differentiating a sequential sum of squares #370

Open
zkytony opened this issue Mar 12, 2018 · 3 comments


zkytony commented Mar 12, 2018

I have the following objective function. Given an N by M matrix x, it computes the sum of squared norms of the differences between adjacent rows. (For instance, if x is a sequence of 2D locations, then this function measures the total displacement along the sequence.)

def min_displacement_obj(x):
    """
    Computes:
      T-1
      sum ||x_{t+1} - x_{t}||^2
      t=1
    """
    return np.sum(np.array([np.linalg.norm(x[i] - x[i-1])**2
                            for i in range(1, len(x))]))

I'd like to compute the gradient of this function with respect to x, so I used autograd.grad. But for the example below,

import autograd.numpy as np
from autograd import grad

f = min_displacement_obj
grad_f = grad(f)

x = np.array([[1],[0],[0],[1]])
grad_fval = grad_f(x)
print(grad_fval)

I get

[[                   2]
 [-9223372036854775808]
 [-9223372036854775808]
 [                   2]]

This doesn't look right (unless I made a silly mistake): if I manually compute the gradient of f with respect to a component of x, say x[i], the formula I get is:

df(x)/dx[i] = d(||x[i] - x[i-1]||^2 + ||x[i+1] - x[i]||^2)/dx[i] = 4x[i] - 2x[i-1] - 2x[i+1]

Then, in the above example, when i=1, df(x)/dx[1] = 4*0 - 2*1 - 2*0 = -2. How come it becomes -9223372036854775808? I'm very confused.

However, if I just change x to

x = np.array([[1],[0],[0.1],[1]])

I get

[[ 2. ]
 [-2.2]
 [-1.6]
 [ 1.8]]

This is correct, because again when i=1, df(x)/dx[1] = 4*0 - 2*1 - 2*0.1 = -2.2

I found that autograd doesn't compute the correct result whenever adjacent components of x have the same value. It uniformly outputs -9223372036854775808 for the gradient w.r.t. all of those components. Is this a bug?
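
For what it's worth, a quick central finite-difference check (just a sketch, reusing min_displacement_obj from above and casting x to float) gives the analytic values rather than the huge negative number:

def finite_diff_grad(f, x, eps=1e-6):
    # approximate df/dx component by component with central differences
    g = np.zeros(x.shape)
    for k in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp.flat[k] += eps
        xm.flat[k] -= eps
        g.flat[k] = (f(xp) - f(xm)) / (2 * eps)
    return g

x_float = np.array([[1.0], [0.0], [0.0], [1.0]])
print(finite_diff_grad(min_displacement_obj, x_float))
# analytic gradient: [[ 2.], [-2.], [-2.], [ 2.]]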

@dhirschfeld
Contributor

Think it's integer overflow; i.e., if you ensure your input array is floating point, it will likely work.

If so, it might still be considered a bug in autograd, since silently returning the wrong value is pretty nasty, even if the returned value is likely to be obviously wrong.
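
A minimal sketch of that suggestion, reusing f and grad_f from the original post:

x = np.array([[1], [0], [0], [1]], dtype=float)  # float array instead of int
print(grad_f(x))

As the next comment shows, the -2^63 garbage then disappears, but NaNs still appear in the entries where adjacent rows are identical.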


neonwatty commented Mar 14, 2018

I tried your code and converted your input x = np.array([[1],[0],[0],[1]]) to floats first, but got the runtime warning

RuntimeWarning: invalid value encountered in double_scalars
  return expand(g / ans) * x

and a corresponding output with nans of

[[  2.]
 [ nan]
 [ nan]
 [  2.]]

Perhaps something is going on with np.linalg? I rewrote what you gave above without np.linalg, using np.sum instead, calling it min_displacement_obj_2:

def min_displacement_obj_2(x):
    """
    Computes:
      T-1
      sum ||x_{t+1} - x_{t}||^2
      t=1
    """    
    return np.sum([np.sum(x[i] - x[i-1])**2  for i in range(1, x.shape[0])])

This works fine given your test vector:

f = min_displacement_obj_2
grad_f = grad(f)

x = np.array([[1.0],[0.0],[0.0],[1.0]])
x = x.astype(float)
grad_fval = grad_f(x)
print(grad_fval)

[[ 2.]
 [-2.]
 [-2.]
 [ 2.]]

This won't necessarily work on the rows of a matrix, but if I understand you correctly, you can extend it as min_displacement_obj_3 below:

def min_displacement_obj_3(x):
    """
    Computes:
      T-1
      sum ||x_{t+1} - x_{t}||^2
      t=1
    """    
    return np.sum([np.sum(x[i, :] - x[i-1, :], axis=0)**2 for i in range(1, x.shape[0])])

This still works fine for the test vector, and it seems to work for a test matrix as well, e.g.,

grad_f = grad(min_displacement_obj_3)

X = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4]])
X = X.astype(float)
grad_fval = grad_f(X)
print(grad_fval)

[[-6. -6. -6.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 6.  6.  6.]]
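
Along the same lines, a fully vectorized variant (just a sketch) avoids both the Python loop and np.linalg.norm by squaring the row differences elementwise before summing, which matches the ||x_{t+1} - x_{t}||^2 in the docstring directly:

def min_displacement_obj_vec(x):
    # sum over t of ||x_{t+1} - x_{t}||^2, computed with no sqrt anywhere
    return np.sum((x[1:] - x[:-1]) ** 2)

grad_f = grad(min_displacement_obj_vec)
print(grad_f(np.array([[1.0], [0.0], [0.0], [1.0]])))
# expected: [[ 2.], [-2.], [-2.], [ 2.]]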

@bewantbe

This is probably a problem that essentially all automatic differentiation algorithms will encounter.

A minimal example looks like this (run it twice to get rid of the warning :-)):

from autograd import grad
import autograd.numpy as np

func = lambda x: np.sqrt(x**2)**2
x = np.array([0.0])
grad_fval = grad(func)(x)
print(grad_fval)

Mathematically, the reason is clear when you write down the chain rule:

$ \frac{d}{dx}\left(\sqrt{x^2}\right)^2 = (2 w_2) \cdot \frac{1}{2\sqrt{w_1}} \cdot (2x), \quad w_1 = x^2,\ w_2 = \sqrt{x^2}. $

When x = 0, it is something like 0 * (1/0) * 0, a typical indeterminate form.

In other words, the function norm() creates a discontinuity of the derivative at the origin, but then you square it, which makes the composite smooth again. The program is just not clever enough to eliminate this spurious discontinuity.
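
To make the point concrete, here is a minimal comparison (a sketch, same imports as in the example above):

print(grad(lambda x: np.linalg.norm(x))(np.array([0.0])))  # [nan] (with a RuntimeWarning): (1/0) * 0 in the norm's derivative
print(grad(lambda x: np.sum(x ** 2))(np.array([0.0])))     # [0.]: no sqrt, so no indeterminate form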

Luckily, as @jermwatt pointed out, there is a warning, and this warning should not be ignored. The expression expand(g / ans) * x in the message, essentially g / ans * x, is the derivative of the L2 norm times the incoming gradient g.

BTW: I'm worried about this "bug", because a highly nonlinear objective function might, if unlucky enough, hit this problem. Do we have a guarantee that the warning about a non-smooth or indeterminate result is always shown?
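
One way to make sure it is never missed (just a sketch, using NumPy's standard floating-point error handling) is to promote these warnings to errors around the grad call:

import numpy

with numpy.errstate(divide='raise', invalid='raise'):
    # any 1/0 or inf * 0 inside the backward pass now raises FloatingPointError
    # instead of only printing a RuntimeWarning
    grad_fval = grad(func)(np.array([0.0]))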

bewantbe added a commit to bewantbe/autograd that referenced this issue May 3, 2018
Fix issues HIPS#370

Now the gradient of np.linalg.norm() at the zero point (the origin) is the same as that of np.abs, which is zero, one of its subgradients.
For second-order gradients, mathematically they should be +infinity, but here, when ord >= 2, it returns 0 (same as np.abs()); when 1 < ord < 2, it is NaN with plenty of warnings, which should be enough to keep users from doing that.