fix test_activation by lowering threshold + validate eps for check_numeric_gradient #12560

azai91 · 2018-09-14T05:18:07Z

Description

Addresses the problem in #12377 by setting the threshold more appropriately. Ran the test with 10000 random seeds and it did not produce an error.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

set numeric_eps in the activation test to 1e-5
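
The title also mentions validating eps for check_numeric_gradient, but the check itself is not shown on this page. The following is a hypothetical sketch of what such a guard could look like, assuming float32 inputs and the 1e-5 floor mentioned in the review discussion. The names validate_numeric_eps and MIN_FLOAT32_EPS are made up for illustration and are not part of MXNet.

```python
import numpy as np

# Assumed lower bound for float32, taken from the review discussion
# (hypothetical constant, not an MXNet name).
MIN_FLOAT32_EPS = 1e-5


def validate_numeric_eps(numeric_eps, dtype=np.float32):
    """Hypothetical guard: reject step sizes too small for float32."""
    if np.dtype(dtype) == np.dtype(np.float32) and numeric_eps < MIN_FLOAT32_EPS:
        raise ValueError(
            "numeric_eps=%g is below %g; float32 lacks the precision "
            "for a finite difference this small" % (numeric_eps, MIN_FLOAT32_EPS)
        )
    return numeric_eps
```

With this sketch, validate_numeric_eps(1e-5) passes, while validate_numeric_eps(1e-6) raises a ValueError for float32 inputs.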

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

lebeg · 2018-09-14T09:25:53Z

tests/python/mkl/test_mkldnn.py

@@ -292,7 +291,7 @@ def check_activation_training(stype):
            in_location = [mx.nd.array(data_tmp).tostype(stype)]

            test = mx.symbol.Activation(data, act_type="relu")
-            check_numeric_gradient(test, in_location, numeric_eps=1e-2, rtol=0.16, atol=1e-4)
+            check_numeric_gradient(test, in_location, numeric_eps=1e-5, rtol=0.16, atol=1e-4)


Isn't this almost exactly the same fix as in #12418, which didn't solve the problem?

There's a significant difference between using 1e-5 and 1e-6. I commented in #12377. In short, you should never use anything less than 1e-5, as floats do not have enough precision to calculate the difference in the numerator.

Ok, thanks for the explanation!
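
The precision argument above can be checked with a small NumPy experiment. This is only a sketch: it uses plain NumPy rather than MXNet's check_numeric_gradient, and the helper names (relu, central_difference, max_abs_error) and the sample points are illustrative assumptions, not code from this PR. It estimates the ReLU gradient at positive inputs, where the true slope is 1, using a central difference in float32 and float64.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0)


def central_difference(f, x, eps, dtype):
    """Estimate f'(x) as (f(x + eps) - f(x - eps)) / (2 * eps) in `dtype`."""
    x = np.asarray(x, dtype=dtype)
    eps = dtype(eps)  # keep the step in the same precision as the data
    return (f(x + eps) - f(x - eps)) / (dtype(2) * eps)


def max_abs_error(dtype, eps, points=(1.0, 2.0, 3.0)):
    """Largest deviation from the true ReLU slope (1) at positive points."""
    grad = central_difference(relu, points, eps, dtype)
    return float(np.max(np.abs(grad - 1)))


if __name__ == "__main__":
    for eps in (1e-2, 1e-5, 1e-6, 1e-8):
        print(
            "eps=%g  float32 error=%.6f  float64 error=%.2e"
            % (eps, max_abs_error(np.float32, eps), max_abs_error(np.float64, eps))
        )
```

In float32, once eps falls below the spacing between representable values near x, x + eps rounds back to x and the estimated gradient collapses to 0. float64 has enough mantissa bits to resolve the same step.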

kalyc · 2018-09-14T17:45:25Z

Thanks for your contribution @azai91
Could you update the PR title to be more descriptive?

@mxnet-label-bot[pr-awaiting-review]

lebeg · 2018-09-17T08:48:14Z

Ok, thanks for the explanation!

lupesko · 2018-09-17T19:15:05Z

Flagging for @anirudh2290 @sandeep-krishnamurthy @nswamy for review/merge.

sandeep-krishnamurthy

Thanks! LGTM.

azai91 added 4 commits September 13, 2018 14:14

remove disable flag

776b5b2

finite difference should use mean

231cb36

lower numerical eps

adcdced

set threshold to 1e-5

10b20ac

azai91 mentioned this pull request Sep 14, 2018

Flaky test: test_mkldnn.test_activation #12377

Closed

lebeg reviewed Sep 14, 2018

check numeric_eps

82c5a3d

azai91 requested a review from szha as a code owner September 14, 2018 17:30

marcoabreu added the pr-awaiting-review label (PR is waiting for code review) Sep 14, 2018

azai91 changed the title ~~Fix/test activation~~ fix test_activation by lowering threshold + validate eps for check_numeric_gradient Sep 14, 2018

azai91 added 2 commits September 14, 2018 15:02

update assertion

f007322

fix lint

7c6d7ac

lebeg approved these changes Sep 17, 2018

lupesko approved these changes Sep 17, 2018

access2rohit approved these changes Sep 17, 2018

sandeep-krishnamurthy approved these changes Sep 20, 2018

sandeep-krishnamurthy merged commit 97a7457 into apache:master Sep 20, 2018

lebeg mentioned this pull request Oct 9, 2018

[MXNET-12377] Disable Flaky Test: test_mkldnn.test_activation #12496

Closed
