[MXNET-978] Higher Order Gradient Support reciprocal, abs. #15413
Conversation

This PR adds support for higher order gradients for reciprocal and abs.
```cpp
const std::unordered_map<std::string, std::string> args = {{"scalar", "-2.0"}};

auto dydx_mul_dldy = nnvm::NodeEntry{n};  // f'(x) * head_grads
auto dydx = MakeNode("elemwise_div", n->attrs.name + "_dydx",
```
Do we need to divide explicitly here? I think the final _backward_grad_grad_input will also carry the head_grads term in its output, so we may not need this extra node?
Now I see that you need this node for the first output "_backward_grad_grad"
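For concreteness, here is a minimal numpy sketch of the identity this elemwise_div relies on (values are illustrative, not MXNet internals):

```python
import numpy as np

# For y = 1/x, the first backward emits one fused node computing
# f'(x) * head_grads, so an elementwise division by head_grads
# recovers the bare f'(x).
x = np.array([0.5, 1.0, 2.0])
head_grads = np.array([0.1, 0.2, 0.3])        # dL/dy
dydx_mul_dldy = (-1.0 / x**2) * head_grads    # output of _backward_reciprocal
dydx = dydx_mul_dldy / head_grads             # what the elemwise_div node computes
assert np.allclose(dydx, -1.0 / x**2)
```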
```python
    return nd.reciprocal(x)

def grad_grad_op(x):
    return 2/x**3
```
Add spaces around the `/`.
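As a quick sanity check of the closed form used by grad_grad_op, a sympy one-liner (not part of the PR's test code):

```python
import sympy as sp

# Verify d^2/dx^2 (1/x) = 2 / x**3.
x = sp.symbols('x')
assert sp.simplify(sp.diff(1 / x, x, 2) - 2 / x**3) == 0
```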
@larroy Please help to review.
@mxnet-label-bot add [pr-awaiting-review]
```python
shape = rand_shape_nd(dim)
array = random_arrays(shape)
check_second_order_unary(array, abs, grad_grad_op)

```
nit: please remove extra line
Two blank lines between functions, as per PEP 8: https://stackoverflow.com/questions/2953250/python-pep8-blank-lines-convention
It is actually fixed. I guess I removed the line below, so it is not showing up here.
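For context, a hedged sketch of how a second-order check along the lines of check_second_order_unary can be written with mx.autograd; the real helper lives in the test suite and may differ in detail:

```python
from mxnet import autograd, nd

def check_second_order_unary_sketch(array, op, grad_grad_op):
    # Hedged sketch, not the actual helper.
    x = nd.array(array)
    expected = grad_grad_op(x)
    x.attach_grad()
    with autograd.record():
        y = op(x)
        # First backward, kept differentiable so we can backprop again.
        x_grad = autograd.grad(y, x, create_graph=True, retain_graph=True)[0]
    x_grad.backward()   # second backward; x.grad now holds the 2nd-order grad
    assert nd.norm(x.grad - expected).asscalar() < 1e-4
```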
```cpp
[](const nnvm::NodePtr& n, const std::vector<nnvm::NodeEntry>& ograds) {
  // ograds[0]: dL/dxgrad
  // inputs[0]: dL/dy
  // inputs[1]: y
```
Shouldn't this term be x? _backward_abs is using ElemwiseGradUseIn.
Fixed.
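A tiny numpy illustration of why the gradient of abs needs the input x rather than the output y:

```python
import numpy as np

# _backward_abs is built with ElemwiseGradUseIn, so its second input is
# the forward *input* x, and d|x|/dx = sign(x). The output y = |x| has
# already lost the sign information.
x = np.array([-2.0, -0.5, 0.5, 2.0])
y = np.abs(x)
assert np.allclose(np.sign(x), [-1.0, -1.0, 1.0, 1.0])
assert np.allclose(np.sign(y), [1.0, 1.0, 1.0, 1.0])  # wrong gradient if y were used
```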
* fix extra line in tests.
* fix missing space.
* fix incorrect comment.
…to develop/add-higher-order/reciprocal-abs
```cpp
auto dydx_mul_dldy = nnvm::NodeEntry{n};  // f'(x) * head_grads
auto dydx = MakeNode("elemwise_div", n->attrs.name + "_dydx",
                     {dydx_mul_dldy, n->inputs[0]}, nullptr, &n);
auto fx = MakeNode("reciprocal", n->attrs.name + "_fx",
```
Small thing: could we get fx from the first backward (node->inputs) if we used ElemwiseGradUseInOut? I guess we would avoid additional divisions if so.
I don't think we can use it: for that to work, our _backward_reciprocal, which is binary, would have to support 3 inputs.
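For what it's worth, the algebra behind the suggestion, as a hedged numpy sketch (not the PR's code): with the saved output y = 1/x, both derivatives become multiplications.

```python
import numpy as np

# If y = 1/x were available from the first backward, no division is needed.
x = np.array([0.5, 1.0, 2.0])
y = 1.0 / x
assert np.allclose(-y**2, -1.0 / x**2)     # dy/dx
assert np.allclose(2 * y**3, 2.0 / x**3)   # d2y/dx2
```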
```cpp
std::vector<nnvm::NodeEntry> ret;

ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad",
```
Maybe a comment would help here. This one is the output corresponding to dL/dy from the first backward, right? Since the previous PRs, I'm still unclear on what dL/dxgrad * dy/dx represents. To me it is not obvious, even after spending more than half an hour thinking about it.
Even I am not sure of its significance in the literature. But if you look at dL/dx = dL/dy * dy/dx as just c = a * b, then dc/da = b while dc/db = a. So that is all I am thinking of: how does dL/dy affect our dL/dx?
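Spelled out as a two-line sympy check (illustrative only):

```python
import sympy as sp

# The product view from the comment: treat dL/dx = dL/dy * dy/dx as c = a * b.
a, b = sp.symbols('a b')   # a = dL/dy, b = dy/dx
c = a * b                  # c = dL/dx
assert sp.diff(c, a) == b  # dc/da = b: the sensitivity of dL/dx to dL/dy
assert sp.diff(c, b) == a  # dc/db = a
```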
This term will be useful when you calculate the third order (and above) gradient.
```cpp
ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad",
                          {ograds[0], nnvm::NodeEntry{dydx}}, nullptr, &n));
ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad_inp",
```
This seems ok.
```cpp
std::vector<nnvm::NodeEntry> ret;
ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad",
                          {ograds[0], nnvm::NodeEntry(dydx)}, nullptr, &n));
ret.emplace_back(MakeNode("zeros_like", n->attrs.name + "_backward_grad_grad_in",
```
Ok.
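A small numpy check of why zeros_like is the right x-gradient for abs, away from the kink at zero:

```python
import numpy as np

# Away from x == 0, d|x|/dx = sign(x) is locally constant, so the second
# derivative is identically zero, which is what zeros_like encodes.
x = np.array([-2.0, -0.5, 0.5, 2.0])   # keep away from the kink at 0
eps = 1e-3
second = (np.sign(x + eps) - np.sign(x - eps)) / (2 * eps)
assert np.allclose(second, 0.0)
```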
```cpp
                     {nnvm::NodeEntry{n}, n->inputs[0]}, nullptr, &n);

std::vector<nnvm::NodeEntry> ret;
ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad",
```
Same question as above.
I don't get the first output, but the result for x_grad_grad looks fine to me.
Sorry that I've been busy this week (for the upcoming conference). I'll take a look next week.
Sure. No worries. Good luck!
LGTM
Description

This PR intends to add support for higher order gradients for reciprocal and abs.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

Changes

* Higher order gradient support for reciprocal and abs.
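A hedged usage sketch of what this enables: a differentiable second backward pass through reciprocal (values illustrative):

```python
from mxnet import autograd, nd

x = nd.array([0.5, 1.0, 2.0])
x.attach_grad()
with autograd.record():
    y = nd.reciprocal(x)
    # First backward, kept in the graph so it can be differentiated again.
    x_grad = autograd.grad(y, x, create_graph=True, retain_graph=True)[0]
x_grad.backward()
print(x.grad)      # expected to match 2 / x**3
print(2 / x**3)
```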