This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Flaky test: test_operator.test_layer_norm #10227

Closed
reminisce opened this issue Mar 23, 2018 · 12 comments


@reminisce
Contributor

It failed in one of my PRs on Windows, Python 2, GPU. Please confirm whether the difference is expected. If so, consider using a bigger atol when comparing two values that are close to zero.
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-9552/34/pipeline/577
@sxjscience

FAIL: test_operator.test_layer_norm
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py2\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 2551, in test_layer_norm
    forward_check_eps=forward_check_eps)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 2542, in check_layer_normalization
    numeric_eps=1e-2, rtol=1e-2, atol=1e-3)
  File "C:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu\python\mxnet\test_utils.py", line 917, in check_numeric_gradient
    ("NUMERICAL_%s"%name, "BACKWARD_%s"%name))
  File "C:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu\python\mxnet\test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 4.857872 exceeds tolerance rtol=0.010000, atol=0.001000.  Location of maximum error:(9, 4, 4), a=0.005262, b=0.000385
 NUMERICAL_data: array([[[-0.05418463, -0.29006594,  0.1394535 , -0.35051909,  0.55530882],
        [ 0.72161782, -0.4398739 ,  0.49481598, -1.1300931 ,  0.35353807],
        [-0.11519305, -0.61403626,  0.79913807, -0.87961853,  0.80963707],...
 BACKWARD_data: array([[[-0.0541792 , -0.29006325,  0.13945937, -0.35054162,  0.55532471],
        [ 0.72162286, -0.43987606,  0.49481591, -1.13009876,  0.35353626],
        [-0.11518719, -0.6140214 ,  0.79915995, -0.87960533,  0.80965401],...
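For context on why near-zero gradients trip the check: with a numpy.isclose-style criterion |a - b| <= atol + rtol * |b| (a sketch of the kind of mixed tolerance assert_almost_equal applies; the exact MXNet formula may differ), the failing pair from the log above fails while a pair of the same absolute accuracy away from zero passes:

```python
import numpy as np

# Two (numerical, backward) gradient pairs taken from the log above.
# The first pair sits very close to zero, the second does not.
a = np.array([0.005262, 0.55530882])  # numerical gradients
b = np.array([0.000385, 0.55532471])  # backward-pass gradients

# numpy.isclose-style criterion: |a - b| <= atol + rtol * |b|
rtol, atol = 1e-2, 1e-3
ok = np.abs(a - b) <= atol + rtol * np.abs(b)
print(ok)  # the near-zero pair fails, the larger pair passes
```

Near zero the rtol term contributes almost nothing, so the whole tolerance budget collapses to atol, which is why a bigger atol was suggested.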
@sxjscience
Member

We should change atol to 1E-2.

@reminisce
Contributor Author

Thanks, I will make the change along with the PR.

@sxjscience
Member

I find sometimes it's really hard to make the numerical check pass for all the seeds 😅

@reminisce
Contributor Author

Same here. We should come up with a way to avoid comparing gradient values that are too close to zero.
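One way to do that (an illustrative sketch only; the helper name and the min_mag threshold are made up here, this is not an mxnet.test_utils API) is to mask out entries whose reference gradient magnitude is below a threshold before comparing:

```python
import numpy as np

def assert_close_ignoring_tiny(a, b, rtol=1e-2, atol=1e-3, min_mag=1e-2):
    """Compare a against b, skipping entries where the reference
    gradient b is too close to zero to be checked reliably.
    (Hypothetical helper for illustration, not part of mxnet.test_utils.)"""
    mask = np.abs(b) >= min_mag          # keep only well-conditioned entries
    diff = np.abs(a[mask] - b[mask])
    tol = atol + rtol * np.abs(b[mask])
    bad = diff > tol
    assert not bad.any(), "max error %g" % diff.max()

a = np.array([0.005262, 0.55530882])     # numerical gradients from the log
b = np.array([0.000385, 0.55532471])     # backward gradients from the log
assert_close_ignoring_tiny(a, b)         # passes: the near-zero entry is skipped
```

The trade-off is that gradients which should be exactly zero are no longer verified at all, so the threshold has to stay small.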

@haojin2
Contributor

haojin2 commented Aug 2, 2018

Happened again: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11986/2/pipeline

======================================================================
FAIL: test_operator.test_layer_norm
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py2\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 3147, in test_layer_norm
    forward_check_eps=forward_check_eps)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 3083, in check_layer_normalization
    numeric_eps=1e-2, rtol=1e-2, atol=1e-2)
  File "C:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu\python\mxnet\test_utils.py", line 917, in check_numeric_gradient
    ("NUMERICAL_%s"%name, "BACKWARD_%s"%name))
  File "C:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu\python\mxnet\test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 1.031757 exceeds tolerance rtol=0.010000, atol=0.010000.  Location of maximum error:(4, 2, 0), a=0.016902, b=0.027503
 NUMERICAL_data: array([[[-0.83571225, -0.39195567, -0.65418553, -0.39927065, -0.25080964],
        [ 0.68750679,  0.03032088,  0.02044216, -0.84786415, -0.23824871],
        [-0.61543953, -0.70688128, -0.02324693, -0.69272518, -0.51978898],...
 BACKWARD_data: array([[[-0.83574855, -0.39192304, -0.65416901, -0.39928018, -0.25077481],
        [ 0.68751844,  0.03033861,  0.02035761, -0.84786007, -0.23824724],
        [-0.6154799 , -0.70687513, -0.02326486, -0.6927109 , -0.51977907],...

@ChaiBapchya
Contributor

Happened again: #16336 (unrelated PR)
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-16336/7/pipeline/

======================================================================
FAIL: test_operator.test_layer_norm
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/work/mxnet/tests/python/unittest/test_operator.py", line 3863, in test_layer_norm
    finite_grad_check=finite_grad_check)
  File "/work/mxnet/tests/python/unittest/test_operator.py", line 3733, in check_layer_normalization
    assert_almost_equal(exe.grad_dict['data'].asnumpy(), gt_data_grad, backward_check_eps, backward_check_eps)
  File "/work/mxnet/python/mxnet/test_utils.py", line 533, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 1.454597 exceeds tolerance rtol=0.000100, atol=0.000100.  Location of maximum error:(8, 3, 0), a=0.732305, b=0.732053
 a: array([[[ -2.4676247 ,   0.11147801,   1.2913225 ,   0.03173554,
           1.0330889 ],
        [  1.348523  ,  -0.26897073,   0.22984871,  -0.33579123,...
 b: array([[[ -2.4676248 ,   0.111478  ,   1.29132235,   0.03173556,
           1.03308889],
        [  1.34852288,  -0.2689707 ,   0.22984875,  -0.33579115,...

@sxjscience
Member

I guess we need to use 1E-3

@ChaiBapchya
Contributor

Um... 1E-1, 1E-2, now 1E-3... it has to stop somewhere. But anyway, as a temporary fix I will push that as a PR.

@sxjscience
Member

@ChaiBapchya Yep, would it be possible to just test a few seeds? Also, we could somehow remove the finite-difference test.

@ChaiBapchya
Contributor

Fixing the seed isn't good practice, is it? And wouldn't removing the finite-difference check make this test weaker?

@sxjscience
Member

@ChaiBapchya In theory it will not if our manual backward logic is correct.

@sxjscience
Member

@ChaiBapchya I mean the test would be as strict as the original. Also, fixing the seed works in some randomness tests.
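For reference, the finite-difference check being debated estimates each gradient entry with a central difference and compares it against the analytical backward pass. A minimal self-contained sketch of that scheme (this is not MXNet's check_numeric_gradient itself, just the underlying idea):

```python
import numpy as np

def numeric_grad(f, x, eps=1e-2):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        i = it.multi_index
        orig = x[i]
        x[i] = orig + eps
        fp = f(x)                     # f(x + eps * e_i)
        x[i] = orig - eps
        fm = f(x)                     # f(x - eps * e_i)
        x[i] = orig                   # restore the entry
        g[i] = (fp - fm) / (2 * eps)  # O(eps**2) accurate estimate
        it.iternext()
    return g

# The analytical gradient of sum(x**2) is 2*x; the estimate should match
# within the kind of tolerances used in the failing test above.
x = np.array([[0.5, -1.0], [2.0, 0.1]])
num = numeric_grad(lambda v: (v ** 2).sum(), x.copy())
ana = 2 * x
assert np.allclose(num, ana, rtol=1e-2, atol=1e-3)
```

The O(eps**2) truncation error plus floating-point cancellation in fp - fm is what makes the tolerance choice so seed-sensitive: for some random inputs the true gradient lands near zero and the noise dominates.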
