
Commit

CR comments
larroy committed Jun 3, 2019
1 parent c073cde commit 1b86fc1
Showing 1 changed file with 20 additions and 3 deletions.
23 changes: 20 additions & 3 deletions docs/api/python/autograd/autograd.md
@@ -98,23 +98,40 @@ backward nodes, not the full initial graph that includes the forward nodes.
The idiom to calculate higher order gradients is the following:

```python
from mxnet import ndarray as nd
from mxnet import autograd as ag

x = nd.array([1, 2, 3])
x.attach_grad()  # mark x as a variable so gradients with respect to it can be computed

def f(x):
    # A function which supports higher order gradients
    return x * x
```

If the operators used in `f` don't support higher order gradients, you will get an error like
`operator ... is non-differentiable because it didn't register FGradient attribute.`. This means
that the operator doesn't support taking the gradient of the gradient, that is, running backward on
the backward graph.
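
A minimal sketch of how such a failure surfaces, assuming `x` and `f` are defined as in the snippet above; with
`f(x) = x*x` both orders are supported, so the `except` branch is only reached when an operator in `f` is missing
the second-order `FGradient` registration:

```python
from mxnet.base import MXNetError

try:
    with ag.record():
        y = f(x)
        y_grad = ag.grad(y, x, create_graph=True, retain_graph=True)[0]
        # Fails here if an operator in f lacks a registered second-order FGradient.
        y_grad_grad = ag.grad(y_grad, x, create_graph=False, retain_graph=True)[0]
except MXNetError as err:
    print(err)
```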

Using `mxnet.autograd.grad` multiple times:

```python
with ag.record():
    y = f(x)
    y_grad = ag.grad(y, x, create_graph=True, retain_graph=True)[0]
    y_grad_grad = ag.grad(y_grad, x, create_graph=False, retain_graph=True)[0]
```
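
For the `f` defined above (`f(x) = x*x`), the first gradient evaluates to `2*x` and the second to the constant `2`;
continuing the snippet, this can be checked directly:

```python
print(y_grad)       # first gradient, 2*x at x=[1,2,3] -> [2. 4. 6.]
print(y_grad_grad)  # second gradient, constant 2      -> [2. 2. 2.]
```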

or running backward on the backward graph:

```python
from mxnet import autograd as ag

with ag.record():
    y = f(x)
    y_grad = ag.grad(y, x, create_graph=True, retain_graph=True)[0]
y_grad.backward()
# The gradient of y_grad with respect to x is now stored in x.grad.
```

Both methods are equivalent, except that in the second case `retain_graph` defaults to `False` when running
backward. In both cases a backward pass is run over the backward graph as usual to obtain the gradient of the
first gradient `y_grad` with respect to `x`, evaluated at the value of `x`.
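
Continuing the two snippets above (a sketch, assuming `x.attach_grad()` was called when `x` was created): after
`y_grad.backward()` the gradient of `y_grad` with respect to `x` ends up in `x.grad`, matching the `y_grad_grad`
returned by the first method:

```python
print(y_grad_grad)  # from the first method:  [2. 2. 2.] for f(x) = x*x
print(x.grad)       # from the second method: [2. 2. 2.] -- the same values
```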



