Flaky test: test_operator_gpu.test_countsketch #10988
Comments
We should make a focussed effort to resolve all flaky tests in Q3 2018. |
I think I hit this same issue.
|
I was able to reproduce the issue with a different seed. The problem is that the absolute tolerance is very low (1e-12), so I've bumped it up to 1e-5 to see whether the test can pass 1000 runs. |
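To illustrate why an absolute tolerance of 1e-12 can be too strict, here is a minimal NumPy sketch. It assumes the compared values carry roughly single-precision rounding error, which the thread does not state; the float32 round trip, the sample values, and the `within_tolerance` helper are all hypothetical stand-ins and are not taken from the actual test.

```python
import numpy as np

def within_tolerance(actual, expected, atol, rtol=0.0):
    """Return True if np.testing.assert_allclose accepts the pair."""
    try:
        np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
        return True
    except AssertionError:
        return False

# Assumption: a float32 round trip stands in for single-precision arithmetic.
expected = np.array([0.1, 0.2, 0.3], dtype=np.float64)
actual = expected.astype(np.float32).astype(np.float64)

# Largest absolute rounding error introduced by the round trip.
max_err = np.max(np.abs(actual - expected))

strict_ok = within_tolerance(actual, expected, atol=1e-12)
relaxed_ok = within_tolerance(actual, expected, atol=1e-5)
print(max_err, strict_ok, relaxed_ok)
```

With `rtol` set to zero, the strict check rejects rounding error that the relaxed check accepts. Whether 1e-5 is the right bound for this operator still depends on the magnitudes involved, which is why the 1000-run check is worth doing.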
Assigned to haibin. @haojin2 is working on this. |
From the reproduced error we can see that only part of the grad ndarray is filled:
|
Fix in #11780. |
@larroy Please take a look at my comments on how the test could fail without a sync. |
Reopening since this failed again here: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-15118/13/pipeline |
But all of the tests are failing there, so it might not be specific to this one; it looks like memory corruption or a hardware issue. |
|
Fixed in a 'tack on' commit to #20876. |
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10983/1/pipeline/