Flaky test test_operator_gpu.test_sparse_dot #10920

eric-haibin-lin · 2018-05-13T05:53:11Z

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10913/7/pipeline

======================================================================

FAIL: test_operator_gpu.test_sparse_dot

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest

    self.test(*self.arg)

  File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc

    return func(*arg, **kw)

  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new

    orig_test(*args, **kwargs)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1343, in test_sparse_dot

    lhs_d, rhs_d, False, True)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1237, in test_infer_forward_stype

    assert_almost_equal(out.tostype('default').asnumpy(), out_np, rtol=1e-4, atol=1e-5)

  File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal

    raise AssertionError(msg)

AssertionError: 

Items are not equal:

Error 1.067252 exceeds tolerance rtol=0.000100, atol=0.000010.  Location of maximum error:(34, 15), a=0.022217, b=0.022204

 a: array([[ -9.000863  ,   4.9705057 ,  -2.7022123 , ..., -10.717851  ,

         19.614717  ,  17.951117  ],

       [-19.35049   ,   2.4999516 ,  -7.9741106 , ...,  15.310856  ,...

 b: array([[ -9.000862  ,   4.9705014 ,  -2.7022119 , ..., -10.717848  ,

         19.614723  ,  17.951107  ],

       [-19.350492  ,   2.4999518 ,  -7.9741096 , ...,  15.310851  ,...

@haojin2

haojin2 · 2018-05-13T20:37:20Z

This should be an rtol/atol problem; I will fix it ASAP.
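To make the "Error 1.067252 exceeds tolerance" line easier to read, here is a minimal sketch of how such a ratio can be computed. It assumes the check is `|a - b| / (atol + rtol * |b|)` with failure when the maximum exceeds 1; the helper name `max_violation` is made up for illustration and is not the function in `mxnet/test_utils.py`. The inputs are the rounded values printed in the log, so the ratio will not match the logged one to every digit.

```python
import numpy as np


def max_violation(a, b, rtol=1e-4, atol=1e-5):
    """Return the largest |a - b| / (atol + rtol * |b|) and where it occurs.

    Hypothetical helper: it mirrors the assumed form of the tolerance check,
    not the actual MXNet implementation.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    ratio = np.abs(a - b) / (atol + rtol * np.abs(b))
    idx = np.unravel_index(np.argmax(ratio), ratio.shape)
    return float(ratio[idx]), idx


# Element reported at location (34, 15), as printed (rounded) in the log.
small_ratio, small_idx = max_violation(np.array([[0.022217]]),
                                       np.array([[0.022204]]))

# First element of the printed arrays: much larger magnitude.
large_ratio, _ = max_violation(np.array([[-9.000863]]),
                               np.array([[-9.000862]]))

print("near-zero element ratio:", small_ratio)  # above 1, so the check fails
print("large element ratio:", large_ratio)      # far below 1, so it passes
```

Under this reading, the failing elements are the ones close to zero: there `rtol * |b|` contributes almost nothing and the absolute difference only has to exceed roughly `atol` to trip the assertion.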

haojin2 · 2018-05-15T03:27:50Z

Hit this again:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10931/7/pipeline

======================================================================

FAIL: test_operator_gpu.test_sparse_dot

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest

    self.test(*self.arg)

  File "/usr/local/lib/python2.7/dist-packages/nose/util.py", line 620, in newfunc

    return func(*arg, **kw)

  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new

    orig_test(*args, **kwargs)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1343, in test_sparse_dot

    lhs_d, rhs_d, False, True)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1237, in test_infer_forward_stype

    assert_almost_equal(out.tostype('default').asnumpy(), out_np, rtol=1e-4, atol=1e-5)

  File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal

    raise AssertionError(msg)

AssertionError: 

Items are not equal:

Error 1.045963 exceeds tolerance rtol=0.000100, atol=0.000010.  Location of maximum error:(1, 3), a=0.018232, b=0.018245

 a: array([[  4.6516194,  -1.9191772, -12.820089 , ...,   8.844241 ,

        -13.252303 , -12.635316 ],

       [ -4.1542487,   7.7496295,  -0.1049877, ...,   4.691146 ,...

 b: array([[  4.65162   ,  -1.919177  , -12.820091  , ...,   8.844247  ,

        -13.252301  , -12.635329  ],

       [ -4.154251  ,   7.749628  ,  -0.10498619, ...,   4.6911426 ,...

-------------------- >> begin captured logging << --------------------

common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=33958090 to reproduce.

--------------------- >> end captured logging << ---------------------

eric-haibin-lin · 2018-06-22T21:57:38Z

======================================================================

FAIL: test_operator_gpu.test_sparse_dot

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest

    self.test(*self.arg)

  File "/usr/local/lib/python2.7/dist-packages/nose/util.py", line 620, in newfunc

    return func(*arg, **kw)

  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new

    orig_test(*args, **kwargs)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1385, in test_sparse_dot

    lhs_d, rhs_d, False, False)

  File "/work/mxnet/tests/python/gpu/../unittest/test_sparse_operator.py", line 1281, in test_infer_forward_stype

    assert_almost_equal(out.tostype('default').asnumpy(), out_np, rtol=1e-4, atol=1e-5)

  File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal

    raise AssertionError(msg)

AssertionError: 

Items are not equal:

Error 1.040007 exceeds tolerance rtol=0.000100, atol=0.000010.  Location of maximum error:(41, 6), a=-0.002038, b=-0.002048

 a: array([[-17.577438  ,  -1.1808437 ,  13.078005  , ...,  -3.15891   ,

         -6.6346364 ,  -1.1883267 ],

       [ 14.642713  ,  -4.0540075 ,  15.738339  , ..., -14.301232  ,...

 b: array([[-17.577438  ,  -1.1808434 ,  13.078006  , ...,  -3.1589074 ,

         -6.63463   ,  -1.1883259 ],

       [ 14.642715  ,  -4.0540066 ,  15.738338  , ..., -14.301228  ,...

-------------------- >> begin captured logging << --------------------

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11360/8/pipeline/731

anirudh2290 · 2018-06-27T23:41:45Z

Assigned to haibin. @haojin2 has a PR open for this.

haojin2 · 2018-06-29T21:55:32Z

@eric-haibin-lin This should be fixed now.

zheng-da · 2018-09-11T22:31:47Z

It seems that the previous PR didn't fix the problem completely.

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11641/24/pipeline
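The thread does not state why the results differ, so the following is only a sketch of one plausible reading: float32 rounding in a dot product leaves a small absolute error that does not shrink when the output happens to land near zero. NumPy dense matmul stands in for the GPU sparse dot here, and the shapes and seed are arbitrary choices, not values from the test.

```python
import numpy as np

# Arbitrary seed and shapes, chosen only for illustration.
rng = np.random.default_rng(0)
lhs = rng.normal(size=(50, 200)).astype(np.float32)
rhs = rng.normal(size=(200, 50)).astype(np.float32)

# Same inputs, computed once in float32 and once in float64 as a reference.
out32 = lhs @ rhs
out64 = lhs.astype(np.float64) @ rhs.astype(np.float64)

abs_err = np.abs(out32.astype(np.float64) - out64)

# Tolerances used by the failing assertion in the tracebacks.
rtol, atol = 1e-4, 1e-5
ratio = abs_err / (atol + rtol * np.abs(out64))
worst = np.unravel_index(np.argmax(ratio), ratio.shape)

print("largest absolute error:", abs_err.max())
print("largest error/tolerance ratio:", ratio.max())
print("reference magnitude at that position:", abs(out64[worst]))
```

If the largest ratio tends to occur where the reference value is small, that would be consistent with the failures reported above, where the offending elements had magnitudes around 0.002 to 0.02 while the rest of the output was of order 10.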

eric-haibin-lin added Test Flaky labels May 13, 2018

haojin2 mentioned this issue Jun 25, 2018

[MXNET-566] Fix flaky test_operator_gpu.test_sparse_dot #11389

Merged

6 tasks

anirudh2290 assigned eric-haibin-lin Jun 27, 2018

eric-haibin-lin closed this as completed Jun 30, 2018

eric-haibin-lin reopened this Sep 11, 2018

haojin2 mentioned this issue Sep 12, 2018

Further bump up tolerance for sparse dot #12527

Merged

5 tasks

eric-haibin-lin closed this as completed in #12527 Sep 12, 2018
