-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Fixes #17304 Flaky Test -> test_higher_order_grad.test_tanh #17321
Fixes #17304 Flaky Test -> test_higher_order_grad.test_tanh #17321
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17321 +/- ##
=======================================
Coverage 67.5% 67.5%
=======================================
Files 275 275
Lines 31227 31227
Branches 4721 4721
=======================================
Hits 21080 21080
Misses 8780 8780
Partials 1367 1367 Continue to review full report at Codecov.
|
@kshitij12345 Have you tried to run this new version with > 500 trials? |
@haojin2 , Hi can you tell me how I can do that. |
@kshitij12345 like this: |
@haojin2 Thank You. That helped, first order test failed once. Have relaxed the tolerance for first order to Ran this -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will merge after CI passes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is some numerical instability due to precision. Please re-run your tests multiple times as @haojin2 suggested. Otherwise, LGTM. Thanks for your prompt response.
@kshitij12345 Could you please also do so for other operators? I suspect they might have similar issues as well. Thanks! |
@apeforest @haojin2 , Yes I have seen that it is happening for |
Merged, please continue with fixing other flaky higher-order gradient tests @kshitij12345 |
@apeforest @reminisce
Going through the code for
_backward_tanh
, it is implemented as1 - (tanh^2(x))
, which is equivalent tosech^2(x)
or(1/cosh(x))^2
.However for the failed seed, I have verified that
1 - (tanh^2(x))
is not coming equivalent to(1/ cosh(x))^2
for the randomly generated array of dim 4 (will try to investigate further for the cause).For now replacing
grad_op
with its equivalent1- tanh^2(x)
looks to work without the need to relax tolerance.