
[Flaky] test_operator_gpu.test_dropout #14288

Open · junrushao opened this issue Feb 28, 2019 · 6 comments

Comments

@junrushao
Member

http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-gpu/branches/PR-14270/runs/5/nodes/271/steps/624/log/?start=0

======================================================================
FAIL: test_operator_gpu.test_dropout
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/local/lib/python2.7/dist-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 173, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 6107, in test_dropout
    check_dropout_ratio(1.0, shape, cudnn_off=False)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 6040, in check_dropout_ratio
    check_correctness(exe, exe.arg_arrays[0].asnumpy(), ratio)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 6005, in check_correctness
    assert output_zeroes == len(input)
AssertionError: 
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=623337751 to reproduce.
--------------------- >> end captured logging << ---------------------
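
For anyone trying to reproduce this outside the test harness, here is a minimal standalone sketch of the failing ratio=1.0 check (the GPU context and flat shape are assumptions, not the test's exact setup; the MXNET_TEST_SEED above applies to the nose runner, not to this script):

    import mxnet as mx

    # With a drop probability of 1.0 in training mode, every output element
    # should be zero. The flaky failure above shows 9999 of 10000 zeros on
    # the cuDNN path.
    shape = (10000,)
    a = mx.nd.random.uniform(shape=shape, ctx=mx.gpu(0))
    with mx.autograd.record(train_mode=True):         # dropout active in train mode
        b = mx.nd.Dropout(a, p=1.0, cudnn_off=False)  # take the cuDNN path
    out = b.asnumpy()
    assert (out == 0).sum() == out.size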
@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Test, Flaky

@junrushao junrushao changed the title Flaky test: test_operator_gpu.test_dropout [Flaky] test_operator_gpu.test_dropout Feb 28, 2019
@junrushao
Member Author

@mxnet-label-bot add [Test, Flaky]

@perdasilva
Contributor

Seeing it again: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/master/650/pipeline

Making a PR to disable the test until it is fixed.
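
A disabling change of this kind usually just marks the test as skipped with a pointer back to this issue; a sketch of what that looks like (the decorator choice here is an assumption, not necessarily what the actual PR does):

    import unittest

    @unittest.skip("Flaky, tracked in #14288")
    def test_dropout():
        ...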

@ChaiBapchya
Contributor

Was able to reproduce this issue.

With cudnn_off=True, check_dropout_ratio(1.0, shape) works correctly: assert output_zeroes == len(input) passes, with both values equal to 10000.

However, with cudnn_off=False (i.e. with cuDNN), check_dropout_ratio(1.0, shape) fails at the assertion: output_zeroes is 9999 while len(input) is 10000.

@eric-haibin-lin
Member

@DickJC123 @ptrendx it looks like a bug in the cuDNN dropout implementation for the ratio=1 case
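
For illustration only (this is not the actual cuDNN source): if the dropout mask is built by comparing uniform samples against the drop probability, a single sample landing exactly on the boundary at p = 1.0 survives the mask, which would produce exactly the 9999/10000 pattern reported above:

    import numpy as np

    # Hypothetical mask construction: keep an element when its sample >= p.
    # At p = 1.0 this keeps any element whose sample is exactly 1.0.
    rng = np.random.default_rng(0)
    samples = rng.uniform(0.0, 1.0, size=10000)
    samples[1234] = 1.0                  # simulate one boundary draw
    p = 1.0
    mask = samples >= p                  # True -> keep, False -> drop
    print(mask.size - mask.sum(), "of", mask.size, "elements dropped")  # 9999 of 10000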

leezu added a commit that referenced this issue Nov 17, 2020
Two issues.

Issue 1: #14288

Issue 2:

        def check_passthrough(ratio, shape, cudnn_off=True):
            # test inference_mode forward and then backward
            a = mx.random.uniform(shape=shape)
            a.attach_grad()
            with mx.autograd.record(train_mode=False):
                b = mx.nd.Dropout(a, ratio, cudnn_off=cudnn_off) # dropout acts as identity
            b.backward()
            assert_almost_equal(a.grad.asnumpy(), mx.nd.ones_like(b).asnumpy())

        shape = (100, 100)
        check_dropout_ratio(0.5, shape)
        check_dropout_ratio(0.0, shape)
>       check_dropout_ratio(1.0, shape)
[...]
        # Hopefully should be within ratio/2 %
        error = abs(output_sum - input_sum) / input_sum
        if ratio == 1.0:
>           assert output_zeroes == len(input)
E           assert 9999 == 10000
E             +9999
E             -10000
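
For context, the passthrough check above relies on dropout being the identity in inference mode, so the gradient with respect to the input should be all ones. A standalone version of that check (CPU context assumed) would look like:

    import mxnet as mx

    a = mx.nd.random.uniform(shape=(100, 100))
    a.attach_grad()
    with mx.autograd.record(train_mode=False):   # inference mode: dropout is identity
        b = mx.nd.Dropout(a, 0.5)
    b.backward()
    # d(identity)/da == 1 everywhere
    assert (a.grad.asnumpy() == 1).all()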
leezu added a commit that referenced this issue Nov 25, 2020