x/build/cmd/gomote: 502 Bad Gateway error #28365
I'm seeing this frequently too. I don't think it's actually a time limit, but it's opaque enough that I'm not sure. CC @golang/osp-team
I ran into the same problem this morning. Very mysterious error.
Change https://golang.org/cl/200738 mentions this issue:
Go 1.11 added ReverseProxy.ErrorHandler; use it to make the httputil.ReverseProxy failures print the underlying error back to the (trusted) client. (Normally the client isn't necessarily trusted enough to get the full info.)
Also, log more to stderr where we can search for it.
Updates golang/go#28365
Change-Id: Iac2d863b159f24fda2e0e6e1f7374ed05434d3e4
Reviewed-on: https://go-review.googlesource.com/c/build/+/200738
Reviewed-by: Bryan C. Mills <[email protected]>
Run-TryBot: Bryan C. Mills <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Deployed coordinator with new logging. Please report any errors you see. We can also search the coordinator's logs for them.
New error just now:
Change https://golang.org/cl/203217 mentions this issue:
…llation
Maybe this will solve the golang/go#28365 problems. But at least it gets us into codepaths that are known & trusted, and removes use of deprecated API.
Also, add more logging to help debug golang/go#28365.
Updates golang/go#28365
Change-Id: Ibff2b03fd82573cbeedbbc22d12c30ae1a3c3aa0
Reviewed-on: https://go-review.googlesource.com/c/build/+/203217
Reviewed-by: Bryan C. Mills <[email protected]>
As @bcmills noted in #37001 (comment), there's a chance this is the same issue as #37001, which was resolved on Feb 6, 2020. /cc @toothrot The last comments here are from 2019. Has this happened to anyone again since then?
I spent some quality time with a
Per above, I think this is fixed. Setting to WaitingForInfo to see if anyone chimes in with recent gomote trouble.
Timed out in state WaitingForInfo. Closing. (I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)
I just saw this on an iOS gomote. Log:
I've been running stress tests on a pool of 8 windows-amd64-longtest builders and over the course of about an hour I've gotten dozens of these failures:

2022/01/19 13:06:00 creating windows-amd64-longtest buildlet

One odd thing is that they seem to happen in clusters. My stress testing tool will retry setting up a new gomote 5 times before giving up, and many of the logs show the buildlet failing part way through make.bat and then failing several times in a row with "502 Bad Gateway":

2022/01/19 13:03:05 creating windows-amd64-longtest buildlet
2022/01/19 13:03:33 created buildlet user-austin-windows-amd64-longtest-3
2022/01/19 13:03:34 installing go1.4
2022/01/19 13:03:41 Remote doesn't have "src/syscall/zsysnum_freebsd_386.go"
2022/01/19 13:03:41 Remote doesn't have "test/fixedbugs/issue19548.go"
2022/01/19 13:03:41 Remote doesn't have "src/cmd/compile/internal/ssa/deadcode_test.go"
2022/01/19 13:03:41 Remote doesn't have "src/cmd/go/internal/modfetch/sumdb.go"
2022/01/19 13:03:41 Remote doesn't have "src/cmd/vendor/golang.org/x/sys/windows/race0.go"
2022/01/19 13:03:41 Remote doesn't have 10994 files (only showed 5).
2022/01/19 13:03:41 Remote lacks a VERSION file; sending a fake one
2022/01/19 13:03:47 Uploading 10995 new/changed files; 40413585 byte .tar.gz
Building Go cmd/dist using C:\workdir\go1.4
Building Go toolchain1 using C:\workdir\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Error running run: Error trying to execute go/src/make.bat: missing Process-State trailer from HTTP response; buildlet built with old (<= 1.4) Go?
2022/01/19 13:06:00 setup command failed: exit status 1
2022/01/19 13:06:00 creating windows-amd64-longtest buildlet
2022/01/19 13:06:28 created buildlet user-austin-windows-amd64-longtest-3
2022/01/19 13:06:28 installing go1.4
Error running push: 502 Bad Gateway; body: (golang.org/issue/28365): gomote proxy error: Post "http://10.128.0.55/writetgz?dir=go1.4": read tcp 10.102.128.7:49068->10.128.0.55:80: read: connection reset by peer
2022/01/19 13:06:29 setup command failed: exit status 1
2022/01/19 13:06:29 creating windows-amd64-longtest buildlet
2022/01/19 13:07:08 created buildlet user-austin-windows-amd64-longtest-3
2022/01/19 13:07:08 installing go1.4
Error running push: 502 Bad Gateway; body: (golang.org/issue/28365): gomote proxy error: Post "http://10.128.0.70/writetgz?dir=go1.4": read tcp 10.102.128.7:38334->10.128.0.70:80: read: connection reset by peer
2022/01/19 13:07:10 setup command failed: exit status 1
2022/01/19 13:07:11 creating windows-amd64-longtest buildlet
2022/01/19 13:07:48 created buildlet user-austin-windows-amd64-longtest-3
2022/01/19 13:07:48 installing go1.4
Error running push: 502 Bad Gateway; body: (golang.org/issue/28365): gomote proxy error: Post "http://10.128.0.76/writetgz?dir=go1.4": read tcp 10.102.128.7:41910->10.128.0.76:80: read: connection reset by peer
2022/01/19 13:07:50 setup command failed: exit status 1
2022/01/19 13:07:50 creating windows-amd64-longtest buildlet
2022/01/19 13:08:31 created buildlet user-austin-windows-amd64-longtest-3
2022/01/19 13:08:31 installing go1.4
Error running push: 502 Bad Gateway; body: (golang.org/issue/28365): gomote proxy error: Post "http://10.128.0.82/writetgz?dir=go1.4": read tcp 10.102.128.7:33378->10.128.0.82:80: read: connection reset by peer
2022/01/19 13:08:33 setup command failed: exit status 1
2022/01/19 13:08:33 giving up after 5 retries
exited: status 1
To add to my previous message, I've noticed something very strange. I had spun up 25 windows-amd64-2016 builders and they ran fine for a few minutes. Then all but 3 of them died with errors like those in my previous message. My gopool program kept trying to recreate the 22 failed builders, but I never saw one that lasted more than 2.1 seconds after creation. All but the original 3 kept failing with 502 errors. I believe exactly the same thing had happened in a previous run, though I wasn't watching it as closely. While this was happening, I saw the following on farmer:
As a data point, I tried running
in 12 parallel loops. I expected this to quickly produce 502 errors, but I let it go for about an hour and didn't get any 502 errors.
I often encounter this 502 Bad Gateway error while actively working with a buildlet created using gomote. I guess this is because I was holding the buildlet too long (~30min) and there is a hidden time limit on each buildlet. If so, we need a better error message than this:
And, gomote list still shows the lease is not yet expired. That's misleading.