Windows Buildkite agents are taking new jobs before the VM has rebooted #42
This often manifests itself with the following error message:
|
Thanks to the debugging code added in, we can see in this example that the previous run was canceled by a SIGINT, presumably sent because the entire job was canceled due to a new commit being pushed before the old commit was done testing.
Old run: https://buildkite.com/julialang/julia-master/builds/20727#0186216b-d26a-4bde-b924-469346201a0b/4392-4408
This shouldn't cause a problem, but clearly we do have a problem, so let's trace through and see if we can find any issues with our reboot functionality:
So the smoking gun points to, unsurprisingly, the NSSM script that should be invoking shutdown, but it's failing for some reason.
I'm always nervous about shell quoting, particularly on Windows. Are we sure that this ends up getting parsed appropriately by |
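One way to sanity-check the quoting concern is to look at the single command-line string an argument list collapses into on Windows. The sketch below is only an illustration, not the script from this issue: it uses Python's `subprocess.list2cmdline` helper (present in CPython, though not part of the documented API), which applies the Microsoft C runtime quoting rules. The `shutdown` arguments shown are hypothetical examples, and `cmd.exe` or PowerShell may apply additional parsing of their own before the program sees the string.

```python
import subprocess


def windows_command_line(args):
    """Join an argument list into one string using MS C runtime quoting rules."""
    return subprocess.list2cmdline(args)


# Hypothetical argument lists; the real NSSM invocation is not shown in the thread.
plain = windows_command_line(["shutdown", "/r", "/t", "0"])
spaced = windows_command_line(["shutdown", "/r", "/c", "rebooting after job"])
quoted = windows_command_line(["echo", 'say "hi"'])

for line in (plain, spaced, quoted):
    print(line)
```

Arguments without spaces pass through unchanged, arguments containing spaces are wrapped in double quotes, and embedded double quotes are backslash-escaped. If the string the service manager actually receives differs from this, quoting is a plausible place to look.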
I logged in to a machine running a job, and used a PowerShell script to tail the Windows event log for
So it looks to me like it is correctly invoking the |
I wonder if |
Those are all the success case though. Do we have any logs for the failure case?
No, I have to log in and instrument the VM beforehand, as all state is lost after each job finishes.
Can we dump the logs in one of the early scripts if |