[Nightly Test Failure] Tutorial test_tutorials.test_gluon_end_to_end Test Failure #14026
Comments
Hey, this is the MXNet Label Bot. |
@mxnet-label-bot add [Test, Gluon] |
@roywei can you please take a look at this failure. It is probably caused by your PR. |
@roywei it times out: either too many epochs or the dataset download takes too long. |
@mseth10 @ThomasDelteil I'm looking into it. |
@roywei According to the logs, this still appears to be broken: http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTestsForBinaries/job/master/220/console Did you double-check the changes by running NightlyTestsForBinaries locally? If it fails in Jenkins but not locally, you should use Docker containers to simulate the exact environment. Please try to fix this urgently, as the nightly tests have been broken for 15 days. Thanks! |
@vishaalkapoor I'm trying to fix it. It passes in local tests in 120s, well below the timeout limit, but I'm not able to run the Docker container setup according to cwiki step 2. I'm using the Deep Learning Base AMI (Ubuntu) on a g3.8xlarge instance.
|
There's a connection issue in the logs. Perhaps it has to do with running a Docker image and being sandboxed in some manner.
Re: Docker
(no need for docker registry above!) Additionally, try a different region and/or use a higher verbosity with docker. |
@marcoabreu @Chancebair could you reopen this issue?
On the test failure
That's why this test passes locally and only fails on Docker.
On the Docker reproduction failure
Building the dependencies on g3.8xlarge with cuda9.1, cudnn7, nvidia-docker2 using the following command
gives the following error:
Running the test without the dependencies built:
gives |
Hi @roywei, Cool - do you have a log message that points to #14119? If this is at the root cause stage or even at the stage pending a fix for #14119, I think it would be best to comment out the test, try to repro with Docker using the same hardware instance type as the test runner, fix and then re-enable. The Berlin office hours are tomorrow morning if they're still Tuesday mornings. It might be helpful to debug the Docker issues so that you can more easily repro. Vishaal |
Hi @vishaalkapoor, I have created another PR to disable it and marked the fix as WIP. I will try to reproduce and test the fix before merging. |
Great, thank you @roywei :) |
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/206/pipeline
This appears to have been failing since Jan 24th as a result of #13411