
Second order gradient wrt inputs, expected behaviour. #14991

Closed
larroy opened this issue May 18, 2019 · 7 comments · Fixed by #14779

Comments

@larroy
Contributor

larroy commented May 18, 2019

What would be the expected behaviour of this code?

It tries to calculate the gradient, with respect to the original inputs, of a function built from the first gradient.

import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.ones((3, 3))
    y = mx.nd.ones((3, 3))
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = x + y
        # First-order gradients of z wrt x and y, keeping the graph
        # so they can be differentiated again.
        x_grad_y_grad = mx.autograd.grad(z, [x, y], create_graph=True, retain_graph=True)
        print(x_grad_y_grad)
        # Flatten and concatenate the first-order gradients
        # (loop variable renamed from x to g to avoid shadowing x).
        first_grad = nd.concat(*[g.reshape(-1) for g in x_grad_y_grad], dim=0)
        fg_f = 2 * first_grad
        # Second grad call: differentiate a function of the first gradient
        # wrt the original inputs.
        second_grad = mx.autograd.grad(fg_f, [x, y], retain_graph=True)
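
Note that with z = x + y the first gradients are constant ones, so a second differentiation has nothing left that depends on x or y; that is consistent with the "no inputs in computation graph that require gradients" failure further down. A minimal variant with a nonlinear head, where the second gradient is well defined (a sketch, assuming a build with second-order support for the ops involved, i.e. the work in #14779):

import mxnet as mx

x = mx.nd.array([1.0, 2.0, 3.0])
x.attach_grad()
with mx.autograd.record():
    z = x * x * x  # z = x^3, dz/dx = 3x^2, d2z/dx2 = 6x
    dz_dx = mx.autograd.grad(z, [x], create_graph=True, retain_graph=True)[0]
    # No create_graph here: we only need values, not a third-order graph.
    d2z_dx2 = mx.autograd.grad(dz_dx, [x], create_graph=False, retain_graph=True)[0]
print(d2z_dx2)  # expected 6x = [6., 12., 18.]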
@vdantu
Contributor

vdantu commented May 19, 2019

@mxnet-label-bot add [question]

@apeforest
Contributor

apeforest commented May 20, 2019

Calling autograd.grad on a first-order gradient ndarray does not seem to work this way. The API design could have been better documented.
The following block works:

import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.array([1, 2, 3])
    y = mx.nd.array([2, 3, 4])
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = nd.elemwise_add(x, y)
        # Keep the graph of the first gradient so it can be differentiated again.
        first_grad = mx.autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
        print(first_grad)
        fg_f = 2 * first_grad
    # Backward through the first-gradient graph accumulates into x.grad.
    fg_f.backward()
    print(x.grad)
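
For reference, what this computes (assuming the second backward runs at all; on master at the time it raised the error shown in the next comment):

# z    = x + y      -> dz/dx = 1 elementwise, constant in x
# fg_f = 2 * dz/dx  -> also constant in x
# so d(fg_f)/dx = 0, and x.grad should print as zeros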

@larroy
Contributor Author

larroy commented May 20, 2019

Which branch are you using? I'm getting the following when running your example:

  File "/home/ANT.AMAZON.COM/pllarroy/devel/mxnet/python/mxnet/ndarray/ndarray.py", line 2216, in backward
    ctypes.c_void_p(0)))
  File "/home/ANT.AMAZON.COM/pllarroy/devel/mxnet/python/mxnet/base.py", line 254, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:37:55] /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/src/imperative/imperative.cc:357: Check failed: var_nodes.variable_nodes.size() > 0 (0 vs. 0) : There are no inputs in computation graph that require gradients.
Stack trace:
  [bt] (0) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4a) [0x7f6590fc7b5e]
  [bt] (1) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(mxnet::Imperative::CreateGradientVariableNodes(std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<nnvm::NodeEntry, std::allocator<nnvm::NodeEntry> > const&)+0x535) [0x7f65911c63e1]
  [bt] (2) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(mxnet::Imperative::Backward(std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, bool, bool, bool)+0x2ff) [0x7f65911c6ae1]
  [bt] (3) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(MXAutogradBackwardEx+0x249) [0x7f6591034c12]
  [bt] (4) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f65aac85dae]
  [bt] (5) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x22f) [0x7f65aac8571f]
  [bt] (6) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2b4) [0x7f65aae99524]
  [bt] (7) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x11b93) [0x7f65aae99b93]
  [bt] (8) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/bin/python(_PyObject_FastCallKeywords+0x19c) [0x5a730c]


-------------------- >> begin captured logging << --------------------
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1526948194 to reproduce.
--------------------- >> end captured logging << ---------------------

@larroy
Contributor Author

larroy commented May 20, 2019

I merged your branch and it works now, thanks.

@larroy
Contributor Author

larroy commented May 21, 2019

I get the warning: "[16:36:03] /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/src/imperative/imperative.cc:362: There are no inputs in computation graph that require gradients."

@larroy
Contributor Author

larroy commented May 21, 2019

This example works for me. I'm able to call grad twice, and the second gradient has the correct value: given that the second function is 3 * 2 * x, its gradient is 6 for all elements.

When using autograd.grad the gradient is not stored in x.grad, though. One key issue here is that create_graph in the second call to grad has to be set to False; otherwise we re-create the graph and hit the problem I had before, where the graph contains only backward nodes and NOT the original nodes.
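
For reference, the printed values below follow directly from the definitions:

# z     = x*x + y    -> dz/dx = 2x   (printed as twice the x values)
# f     = 3 * dz/dx  -> f = 6x
# df/dx = 6          -> the second grad is 6 everywhere
# x.grad stays 0 because autograd.grad returns the gradients instead of
# accumulating them into the .grad buffers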


import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.array([[2, 2, 2], [2, 2, 2], [2, 2, 4]])
    y = mx.nd.array([[2, 2, 2], [3, 3, 3], [4, 4, 4]])
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = nd.elemwise_add(nd.elemwise_mul(x, x), y)  # z = x^2 + y
        # First grad: dz/dx = 2x, kept differentiable via create_graph=True.
        x_grad_y_grad = mx.autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
        print("dz/dx |x")
        print(type(x_grad_y_grad))
        print(x_grad_y_grad)
        fg_f = 3 * x_grad_y_grad  # f = 3 * dz/dx = 6x
        # create_graph must be False here, otherwise the graph is re-created
        # with only backward nodes and the original nodes are lost.
        second_grad = mx.autograd.grad(fg_f, [x], create_graph=False, retain_graph=True)
        print("second grad")
        print(second_grad)
        print("x.grad")
        print(x.grad)
    #fg_f.backward()
    print("x.grad")
    print(x.grad)

test_autograd.test_ag_grad ... variables 
[[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 4.]]
<NDArray 3x3 @cpu(0)>
var_handles: <mxnet.base.c_void_p_Array_1 object at 0x7f0b9e536a60>
dz/dx |x: 

[[4. 4. 4.]
 [4. 4. 4.]
 [4. 4. 8.]]
<NDArray 3x3 @cpu(0)>
variables 
[[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 4.]]
<NDArray 3x3 @cpu(0)>
var_handles: <mxnet.base.c_void_p_Array_1 object at 0x7f0b9e536b70>
second grad: 

[[6. 6. 6.]
 [6. 6. 6.]
 [6. 6. 6.]]
<NDArray 3x3 @cpu(0)>
x.grad: 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
<NDArray 3x3 @cpu(0)>
x.grad: 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
<NDArray 3x3 @cpu(0)>
