x/build: meta bug tracking trybot build speed (goal: 15 minutes; further improvements are blocked on completing LUCI migration and robust metrics) #17104
Comments
CL https://golang.org/cl/29153 mentions this issue.

The go_test_bench:* tests run:

    go test -short -race -run=^$ -benchtime=.1s -cpu=4 $PKG

... on each discovered package with any tests. (The same set used for the "go_test:*" tests)

That set was 168 packages:

    $ go tool dist test -list | grep go_test: | wc -l
    168

But only 76 of those have a "func Benchmark", and running each "go_test_bench:" test and compiling it in race mode, just to do nothing took 1-2 seconds each.

So stop doing that and filter out the useless packages earlier. Now:

    $ go tool dist test -list -race | grep go_test_bench: | wc -l
    76

Should save 90-180 seconds. (or maybe 45 seconds for trybots, since they're sharded)

Updates #17104

Change-Id: I08ccb072a0dc0454ea425540ee8e74b59f83b773
Reviewed-on: https://go-review.googlesource.com/29153
Run-TryBot: Brad Fitzpatrick
TryBot-Result: Gobot Gobot
Reviewed-by: Ian Lance Taylor
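The filtering idea above can be sketched as follows. This is only an illustration of detecting whether a test file declares any benchmark; the helper name `hasBenchmark` is hypothetical, and the CL itself may detect benchmarks differently (for example with a plain text search for "func Benchmark").

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// hasBenchmark reports whether the Go source in src declares a top-level
// function (not a method) whose name starts with "Benchmark".
// Hypothetical helper, not taken from cmd/dist.
func hasBenchmark(src string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x_test.go", src, 0)
	if err != nil {
		return false, err
	}
	for _, decl := range f.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Benchmark") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	withBench := "package p\nimport \"testing\"\nfunc BenchmarkX(b *testing.B) {}\n"
	withoutBench := "package p\nimport \"testing\"\nfunc TestX(t *testing.T) {}\n"
	for _, src := range []string{withBench, withoutBench} {
		ok, err := hasBenchmark(src)
		fmt.Println(ok, err)
	}
}
```

A package whose test files all report false would be skipped before any race-mode compile is attempted, which is where the 1-2 seconds per package would be saved.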
CL https://golang.org/cl/29156 mentions this issue.

Shave 6.5 minutes off the *-race build time.

The *-race builders run:

    go test -short -race -run=^$ -benchtime=.1s -cpu=4 $PKG

... for each package with benchmarks.

The point isn't to measure the speed of the packages, but rather to see if there are any races. (which is why a benchtime of 0.1 seconds is used)

But running in race mode makes things slower and our benchmarks aren't all very fast to begin with.

The regexp benchmarks in race were taking over 6.5 minutes. With this CL, it's now 8 seconds.

Updates #17104

Change-Id: I054528d09b1568d37aac9f9b515d6ed90a5cf5b0
Reviewed-on: https://go-review.googlesource.com/29156
Run-TryBot: Brad Fitzpatrick
TryBot-Result: Gobot Gobot
Reviewed-by: David Crawshaw
Don't benchmark so many sizes during the race builder's benchmark run. This package doesn't even use goroutines.

Cuts off 10 seconds.

Updates #17104

Change-Id: Ibb2c7272c18b9014a775949c656a5b930f197cd4
Reviewed-on: https://go-review.googlesource.com/29158
Reviewed-by: David Crawshaw
CL https://golang.org/cl/29163 mentions this issue.
CL https://golang.org/cl/29159 mentions this issue.
No coverage is gained by running the 1e6 versions of the test over the 1e4 versions. It just adds 140 seconds of race overhead time.

Updates #17104

Change-Id: I41408aedae34a8b1a148eebdda20269cdefffba3
Reviewed-on: https://go-review.googlesource.com/29159
Run-TryBot: Brad Fitzpatrick
TryBot-Result: Gobot Gobot
Reviewed-by: Josh Bleecher Snyder
No need to test so many sizes in race mode, especially for a package which doesn't use goroutines.

Reduces test time from 2.5 minutes to 25 seconds.

Updates #17104

Change-Id: I7065b39273f82edece385c0d67b3f2d83d4934b8
Reviewed-on: https://go-review.googlesource.com/29163
Reviewed-by: David Crawshaw
Current race build speeds: [table not preserved]
If the coordinator has a better guess as to how long various tests take, then it can do better critical path scheduling and reduce the overall time the sharded tests take to complete.

This table still needs to die and be based on recent empirical data, but at least it's more accurate now, after a long delay in being updated.

Update is from golang/go#17104 (comment)

Updates golang/go#17104

Change-Id: I115aad23fbdb0cde1b196e71a4131fbe36480cc0
Reviewed-on: https://go-review.googlesource.com/29167
Reviewed-by: David Crawshaw
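To see why better duration guesses help, here is a minimal sketch of one common heuristic: hand out the longest estimated tests first, each to the shard with the least work so far. This is not the coordinator's actual scheduler; `testEstimate`, `assignShards`, and the test names and durations are all made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// testEstimate pairs a test name with a guess at how long it takes.
type testEstimate struct {
	Name string
	Est  time.Duration
}

// assignShards places tests on n shards (n must be at least 1), longest
// estimate first, always onto the shard with the least estimated work so far.
// It returns the test names per shard and the estimated total per shard.
func assignShards(tests []testEstimate, n int) ([][]string, []time.Duration) {
	sorted := append([]testEstimate(nil), tests...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Est > sorted[j].Est })

	shards := make([][]string, n)
	load := make([]time.Duration, n)
	for _, t := range sorted {
		least := 0
		for i := 1; i < n; i++ {
			if load[i] < load[least] {
				least = i
			}
		}
		shards[least] = append(shards[least], t.Name)
		load[least] += t.Est
	}
	return shards, load
}

func main() {
	// Hypothetical names and durations, for illustration only.
	tests := []testEstimate{
		{"fast_a", 5 * time.Second},
		{"slow", 10 * time.Second},
		{"fast_b", 5 * time.Second},
	}
	shards, load := assignShards(tests, 2)
	for i := range shards {
		fmt.Println(i, shards[i], load[i])
	}
}
```

If the estimate for the slow test were badly wrong (say, recorded as 1 second), it could be scheduled last behind other work and stretch the overall run, which is the failure mode that a more accurate table avoids.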
OpenBSD is deployed and tested. FreeBSD has only been tested by hand, but this CL doesn't fail the buildlet if the remount fails. It only logs either way.

Updates golang/go#17104

Change-Id: Ia9662b42ae8305ad9eaa4292c94fa3194cc26b11
Reviewed-on: https://go-review.googlesource.com/29238
Reviewed-by: Matthew Dempsky
CL https://golang.org/cl/29430 mentions this issue.

Some builders (OpenBSD, FreeBSD, Plan9 at least?) have their buildlet process's stdout/stderr hooked up to their serial console. The log line for each untarred Go1.4 + Go src tarball going to the serial console added just shy of 1 minute (!!) to the build time.

Now it takes 3 seconds. (Or 12 seconds before change to use an async+noatime filesystem on the BSDs)

Updates golang/go#17104

Change-Id: I1e6f00bcca955ead26b279a79729e50319384593
Reviewed-on: https://go-review.googlesource.com/29430
Reviewed-by: Matthew Dempsky
CL https://golang.org/cl/29473 mentions this issue.

Also, don't start obtaining test sharding buildlets early if they're reverse buildlets. Reverse buildlets are either immediately available, or they're not. No point monopolizing them earlier than needed.

Updates golang/go#17104

Change-Id: If5a0bbd0c59b55750adfeeaa8d0f81cdbcc8ad48
Reviewed-on: https://go-review.googlesource.com/29473
Reviewed-by: Matthew Dempsky
CL https://golang.org/cl/29551 mentions this issue.

Our builders are named of the form "GOOS-GOARCH" or "GOOS-GOARCH-suffix". Over time we've grown many builders. This CL doesn't change that. Builders continue to be named and operate as before.

Previously the build configuration file (dashboard/builders.go) made each builder type ("linux-amd64-race", etc) define how to create a host running a buildlet of that type, even though many builders had identical host configs. For example, these builders all share the same host type (a Kubernetes container):

    linux-amd64
    linux-amd64-race
    linux-386
    linux-386-387

And these are the same host type (a GCE VM):

    windows-amd64-gce
    windows-amd64-race
    windows-386-gce

This CL creates a new concept of a "hostType" which defines how the buildlet is created (Kube, GCE, Reverse, and how), and then each builder itself references a host type.

Users never see the hostType. (except perhaps in gomote list output) But they at least never need to care about them.

Reverse buildlets now can only be one hostType at a time, which simplifies things. We were no longer using multiple roles per machine once moving to VMs for OS X.

gomote continues to operate as it did previously but its underlying protocol changed and clients will need to be updated. As a new feature, gomote now has a new flag to let you reuse a buildlet host connection for different builder rules if they share the same underlying host type. But users can ignore that.

This CL is a long-standing TODO (previously attempted and aborted) and will make many things easier and faster, including the linux-arm cross-compilation effort, and keeping pre-warmed buildlets of VM types ready to go.

Updates golang/go#17104

Change-Id: Iad8387f48680424a8441e878a2f4762bf79ea4d2
Reviewed-on: https://go-review.googlesource.com/29551
Reviewed-by: Matthew Dempsky
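The builder-to-host relationship described above might look roughly like the sketch below. The type names, field names, and host type names ("host-linux-kube", "host-windows-gce") are assumptions for illustration and are not copied from dashboard/builders.go; only the builder names come from the commit message.

```go
package main

import "fmt"

// hostConfig describes how a buildlet host is created.
// Hypothetical type; Kind is "kube", "gce", or "reverse".
type hostConfig struct {
	Kind string
}

// builderConfig is one named builder. It no longer describes how to create
// its host; it only references a host type by name.
type builderConfig struct {
	Name     string
	HostType string
}

// Host type names below are made up for this example.
var hosts = map[string]hostConfig{
	"host-linux-kube":  {Kind: "kube"},
	"host-windows-gce": {Kind: "gce"},
}

var builders = []builderConfig{
	{"linux-amd64", "host-linux-kube"},
	{"linux-amd64-race", "host-linux-kube"},
	{"linux-386", "host-linux-kube"},
	{"linux-386-387", "host-linux-kube"},
	{"windows-amd64-gce", "host-windows-gce"},
	{"windows-amd64-race", "host-windows-gce"},
	{"windows-386-gce", "host-windows-gce"},
}

// sameHost reports whether two builders reference the same host type, which
// is the condition under which one buildlet host could serve both.
func sameHost(a, b builderConfig) bool {
	return a.HostType == b.HostType
}

// buildersByHost groups builder names by the host type they reference.
func buildersByHost(bs []builderConfig) map[string][]string {
	out := make(map[string][]string)
	for _, b := range bs {
		out[b.HostType] = append(out[b.HostType], b.Name)
	}
	return out
}

func main() {
	for host, names := range buildersByHost(builders) {
		fmt.Println(host, hosts[host].Kind, names)
	}
}
```

With this shape, seven builders collapse to two host definitions, so a pool of pre-warmed hosts only needs to be keyed by host type rather than by builder name.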
CL https://golang.org/cl/29670 mentions this issue.

This is a new builder in prep for the change to the "linux-arm" builder where the GOARCH=arm make.bash will be cross-compiled from a Kubernetes container on fast hardware.

Updates golang/go#17105 (cross-compile ARM builders' make.bash)
Updates golang/go#17104 (5 minute trybots)

Change-Id: Icfd2644d77639f731151abe54839322960418254
Reviewed-on: https://go-review.googlesource.com/29670
Reviewed-by: Matthew Dempsky
CL https://golang.org/cl/29677 mentions this issue.
CL https://golang.org/cl/29751 mentions this issue.
Takes a bit too long to run it all the time.

Fixes #17217
Update #17104

Change-Id: I4802190ea16ee0f436a7f95b093ea0f995f5b11d
Reviewed-on: https://go-review.googlesource.com/29751
Run-TryBot: Keith Randall
Reviewed-by: Brad Fitzpatrick
TryBot-Result: Gobot Gobot
Saves 4.5 minutes or so by using fast x86 machines to build the ARM build instead of running make.bash on Scaleway ARM machines.

We still run the tests on ARM, and have a separate builder only running make.bash on ARM (see prior golang.org/cl/29670)

Fixes golang/go#17105
Updates golang/go#17104

Change-Id: I1cb7b0e5b1cc8b644195f262328884ed3aff120a
Reviewed-on: https://go-review.googlesource.com/29677
Reviewed-by: Brad Fitzpatrick
Change https://golang.org/cl/279723 mentions this issue.

The goal in CL 279512 was to reduce numTryTestHelpers from 5 to 4, but it unintentionally set the field only in the 386 builder, not the amd64 one. This caused the amd64 OpenBSD TryBot to get very slow (20-25 minutes), making it very much a bottleneck for overall TryBot completion time.

Also stop explicitly setting some fields in openbsd-amd64-62 builder that have no effect by now, to simplify configuration.

For golang/go#17104.
Updates golang/go#35712.

Change-Id: I22823e4848cab65f11bde2a1cc70527929d0792d
Reviewed-on: https://go-review.googlesource.com/c/build/+/279723
Trust: Dmitri Shuralyov
Run-TryBot: Dmitri Shuralyov
TryBot-Result: Go Bot
Reviewed-by: Alexander Rakoczy
Change https://golang.org/cl/303669 mentions this issue.

Replace low-level Stackdriver monitoring API usage for OpenCensus with a Stackdriver exporter. To benefit local development, expose metrics at an /metrics endpoint (to be picked up with Prometheus).

This makes it much easier to add new metrics, to test them locally, and brings our metrics solution in sync with what's currently in use in x/playground (see CL 302769).

It's expected to be preferable to migrate to OpenTelemetry in the future when a good migration path becomes available, and both x/build and x/playground can be updated at that time.

This CL is based on work in CL 229679 and CL 138522.

For golang/go#26779.
For golang/go#44406.
For golang/go#17104.

Co-authored-by: Alexander Rakoczy
Co-authored-by: Emmanuel T Odeke
Change-Id: Iad45730feace471db1668e828b7c9775377be8a9
Reviewed-on: https://go-review.googlesource.com/c/build/+/303669
Run-TryBot: Dmitri Shuralyov
TryBot-Result: Go Bot
Trust: Dmitri Shuralyov
Reviewed-by: Alexander Rakoczy
Reviewed-by: Emmanuel Odeke
Change https://golang.org/cl/313210 mentions this issue.

Between Go 1.15, 1.16 and tip, the following 29 ports don't have a real TryBot and instead rely on misc-compile TryBots for their pre-submit coverage:

• aix/ppc64
• darwin/amd64
• darwin/arm64
• dragonfly/amd64
• freebsd/386
• freebsd/arm
• freebsd/arm64
• illumos/amd64
• linux/mips
• linux/mips64
• linux/mipsle
• linux/mips64le
• linux/ppc64
• linux/ppc64le
• linux/riscv64
• linux/s390x
• netbsd/386
• netbsd/amd64
• netbsd/arm
• netbsd/arm64
• openbsd/386
• openbsd/arm
• openbsd/arm64
• openbsd/mips64
• plan9/386
• plan9/amd64
• plan9/arm
• solaris/amd64
• windows/arm

The previous approach for misc-compile target selection was to break them up primarily by GOOS value. However, as new architectures were added over time, some misc-compile TryBots got to a point where they were testing upwards of 5 ports (for example, misc-compile-openbsd was testing 386, amd64, arm, arm64, and mips64 architectures).

Since each port is tested sequentially, allocating too many to one misc-compile TryBot can cause it to become the bottleneck of an entire TryBot run, exceeding the 10 minute completion time goal.

Arrange it so misc-compile TryBot target selection is done explicitly in x/build, and pick 3 as max number of targets per TryBot for now. Based on recent timing observations, that should strike a decent balance between resource use (spinning up a builder) vs chance of a misc-compile TryBot becoming a bottleneck. It will also give us an opportunity to compare timing of 1, 2 and 3 targets per misc-compile in the future. (When we start tracking timing for TryBot completion time holistically, we'll be in a better position to refine this strategy further.)

Making misc-compile target selection explicit in x/build also enables removing unnecessary duplicate misc-compile coverage from ports that already have a real TryBot (for example, openbsd/amd64 was previously tested via both the openbsd-amd64-68 TryBot and misc-compile-openbsd). This shouldn't be needed, so it's no longer done.

For golang/go#17104.
Fixes golang/go#32632.

Change-Id: Iac918377b91af3e48780b38ffdf3153e213eeba2
Reviewed-on: https://go-review.googlesource.com/c/build/+/313210
Trust: Dmitri Shuralyov
Run-TryBot: Dmitri Shuralyov
TryBot-Result: Go Bot
Reviewed-by: Heschi Kreinick
Reviewed-by: Carlos Amedee
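As a rough illustration of the "at most 3 targets per misc-compile TryBot" rule, the sketch below splits a port list into consecutive groups of bounded size. The CL selects groups explicitly in x/build, so mechanical chunking like this is an assumption made only to show the arithmetic; `groupPorts` is a hypothetical helper.

```go
package main

import "fmt"

// groupPorts splits ports into consecutive groups of at most limit entries.
// A limit below 1 is treated as 1. Hypothetical helper for illustration.
func groupPorts(ports []string, limit int) [][]string {
	if limit < 1 {
		limit = 1
	}
	var groups [][]string
	for len(ports) > 0 {
		n := limit
		if len(ports) < n {
			n = len(ports)
		}
		groups = append(groups, ports[:n])
		ports = ports[n:]
	}
	return groups
}

func main() {
	// A subset of the ports listed above that lack a real TryBot.
	ports := []string{
		"openbsd/386", "openbsd/arm", "openbsd/arm64", "openbsd/mips64",
		"plan9/386", "plan9/amd64", "plan9/arm",
	}
	for i, g := range groupPorts(ports, 3) {
		fmt.Println(i, g)
	}
}
```

Because each port in a group is built sequentially, the group size is a direct bound on how long one misc-compile TryBot can take relative to a single-port run.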
Change https://go.dev/cl/413420 mentions this issue.

This change increases the default resources allocated to builders running on GCE. Testing has shown a reduction in all.bash build times for the machines with larger resources. A few unit tests that use the default sizes have also been updated throughout the repository.

Updates golang/go#17104

Change-Id: I92ba4509bf667da432f011d8f61d2dea7dac5fc4
Reviewed-on: https://go-review.googlesource.com/c/build/+/413420
Reviewed-by: Alex Rakoczy
Reviewed-by: Dmitri Shuralyov
Change https://go.dev/cl/419077 mentions this issue.

CL 413420 changed the default GCE instance type from e2-highcpu-2 (2 vCPU, 2 GB) to e2-standard-8 (8 vCPU, 32 GB), and the default containerized instance type from e2-standard-4 (4 vCPU, 16 GB) to e2-standard-16 (16 vCPU, 64 GB).

Any entries in the dashboard table that previously opted in to bigger, faster machines are now opting into smaller, slower machines. That seems undesirable, so delete all the opt-in smaller, slower machineType entries.

For golang/go#17104.

Change-Id: I88806c1586b219257229c9e2302464efdcc558a6
Reviewed-on: https://go-review.googlesource.com/c/build/+/419077
Reviewed-by: Dmitri Shuralyov
Reviewed-by: Carlos Amedee
Auto-Submit: Russ Cox
Reviewed-by: Dmitri Shuralyov
TryBot-Result: Gopher Robot
Run-TryBot: Russ Cox
Change https://go.dev/cl/515735 mentions this issue.

It completes in under 30 minutes as is. Go with a tentative target of 10 minutes for now (via golang/go#17104), we can revisit it later.

In comparison, the previous build system currently uses effectively the equivalent of 33 shards (that each handle one port), so even 12 shards is only about a third of that.

For golang/go#61698.
For golang/go#17104.

Change-Id: Ie641e0574127fcb72b081219f540fb5e7a9cf020
Reviewed-on: https://go-review.googlesource.com/c/build/+/515735
TryBot-Bypass: Dmitri Shuralyov
Reviewed-by: Dmitri Shuralyov
Auto-Submit: Dmitri Shuralyov
Reviewed-by: Michael Knyszek
Updated in 2021: The goal is to make most trybot runs complete within the time specified in the issue subject. It's currently blocked on creating a mechanism to systematically measure and track TryBot completion time.
Updated in 2024: Also blocked on finishing the build system migration to LUCI.