This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Flaky test: test_hybrid_static_memory #11809

Closed
anirudh2290 opened this issue Jul 19, 2018 · 6 comments · Fixed by #16843

Comments

@anirudh2290
Member

Please see: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11806/2/pipeline/846/


======================================================================
FAIL: test_operator_gpu.test_hybrid_static_memory
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\Anaconda3\envs\py3\lib\site-packages\nose\util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\../unittest\common.py", line 172, in test_new
    orig_test(*args, **kwargs)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\../unittest\test_gluon.py", line 1204, in test_hybrid_static_memory
    check_hybrid_static_memory()
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\../unittest\test_gluon.py", line 1200, in check_hybrid_static_memory
    assert_almost_equal(grads1[key].asnumpy(), grads2[key].asnumpy(), rtol=1e-3, atol=1e-5)
  File "C:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu_mkldnn\python\mxnet\test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 1.541786 exceeds tolerance rtol=0.001000, atol=0.000010.  Location of maximum error:(2,), a=-0.009735, b=-0.009766
 a: array([ 5.0684638 ,  0.49830177, -0.00973539, ...,  3.59016871,
       -1.98168564, -1.38615167], dtype=float32)
 b: array([ 5.06845665,  0.49829596, -0.00976587, ...,  3.59016991,
       -1.9816792 , -1.3861587 ], dtype=float32)
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2006690968 to reproduce.
--------------------- >> end captured logging << ---------------------

@haojin2
Contributor

haojin2 commented Jul 23, 2018

This looks like a problem with the tolerance. Would you please disable the test for now? @anirudh2290

@ChaiBapchya
Contributor

@haojin2 should I disable?

@haojin2
Contributor

haojin2 commented Aug 26, 2019

@ChaiBapchya Can you actually bump up the tolerance a bit, like atol to 1e-4 instead of 1e-5?
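
For context, the "Error" values reported in the failure logs in this thread are consistent with the ratio |a - b| / (atol + rtol * |b|), where any value above 1 counts as a failure. The sketch below is plain Python written for illustration, not MXNet's own implementation; the helper name `violation` is hypothetical, and the input values are copied from the logs, so the results only approximate the reported figures.

```python
def violation(a, b, rtol, atol):
    """Ratio of the absolute difference to the allowed tolerance.

    Assumed formula (inferred from the logged numbers, not taken from
    MXNet's source): |a - b| / (atol + rtol * |b|). A value above 1
    means the comparison would fail.
    """
    return abs(a - b) / (atol + rtol * abs(b))


# (a, b) pairs at the location of maximum error, copied from the logs.
failures = [
    (-0.00973539, -0.00976587),  # log from July 2018
    (0.00251389, 0.00254440),    # log from November 2019
]

for a, b in failures:
    current = violation(a, b, rtol=1e-3, atol=1e-5)   # tolerance used by the test
    proposed = violation(a, b, rtol=1e-3, atol=1e-4)  # tolerance suggested above
    print(f"a={a}, b={b}: atol=1e-5 -> {current:.2f}, atol=1e-4 -> {proposed:.2f}")
```

Under this assumed formula both logged failures exceed 1 with atol=1e-5 and fall well below 1 with atol=1e-4, because the failing elements are close to zero and the absolute term dominates the allowed tolerance.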

@haojin2
Contributor

haojin2 commented Nov 16, 2019

Happening again: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-16829/9/pipeline/

======================================================================
FAIL: test_gluon_gpu.test_hybrid_static_memory
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/gpu/../unittest/test_gluon.py", line 1680, in test_hybrid_static_memory
    check_hybrid_static_memory(static_alloc=True)
  File "/work/mxnet/tests/python/gpu/../unittest/test_gluon.py", line 1675, in check_hybrid_static_memory
    assert_almost_equal(grads1[key].asnumpy(), grads2[key].asnumpy(), rtol=1e-3, atol=1e-5)
  File "/work/mxnet/python/mxnet/test_utils.py", line 627, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 2.432765 exceeds tolerance rtol=1.000000e-03, atol=1.000000e-05 (mismatch 0.002713%).
Location of maximum error: (45, 50, 2, 1), a=0.00251389, b=0.00254440
 ACTUAL: array([[[[ -4.9708405 ,   5.0227804 ,   1.2681608 ],
         [ -3.95312   ,  20.59486   ,  17.561035  ],
         [ -2.8548512 ,  12.628487  ,   5.902029  ]],...
 DESIRED: array([[[[ -4.9708395 ,   5.022783  ,   1.2681627 ],
         [ -3.9531207 ,  20.59486   ,  17.561031  ],
         [ -2.8548517 ,  12.628485  ,   5.902028  ]],...
-------------------- >> begin captured stdout << ---------------------

*** Maximum errors for vector of size 36864:  rtol=0.001, atol=1e-05

  1: Error 2.432765  Location of error: (45, 50, 2, 1), a=0.00251389, b=0.00254440

--------------------- >> end captured stdout << ----------------------
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1509166681 to reproduce.
--------------------- >> end captured logging << ---------------------

@ChaiBapchya
Contributor

Alright will bump it up then.


4 participants