Flaky test: test_gluon_gpu.test_slice_batchnorm_reshape_batchnorm #12767
Comments
Possibly related to: |
I'm not sure this is a flaky test; I think it's a CUDA / cuDNN or CI environment problem. Could you reproduce it? |
@mxnet-label-bot [flaky, Gluon] |
Another consecutive run failed on master CI:
|
The deconvolution failure is tracked in #12579 |
Flaky test failure. |
Another failure:
|
Another failure can be seen here:
|
@lebeg is anybody working on this? The tests are still failing. |
Another failure for me here: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-12749/18/pipeline/996
|
@lanking520 I proposed a mitigation in #12768 until this is fixed. You are welcome to join the discussion and help get it merged. Although it will not fix the problem, it could help reduce the failure rate. As far as I know, @nswamy was investigating the root cause. We have been working on updating the CUDA drivers in #12850, but that is blocked until the new AMIs with updated CUDA drivers are deployed. |
@larroy is currently doing the driver updates. |
Duplicate issue: #12887 |
Did you re-enable the test?
On Thu, Nov 1, 2018 at 8:05 AM, Anton Chernov wrote:
Closed #12767.
|
@nswamy I was thinking https://github.com/apache/incubator-mxnet/pull/12986/files would re-enable it. |
#12986 did re-enable the test. |
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1728/pipeline