x/build/cmd/coordinator: misc/cgo/testplugin.TestIssue25756pie fails on darwin-arm64 builder deterministically but only in sharded mode #46239

Closed
cherrymui opened this issue May 18, 2021 · 10 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@cherrymui
Member

CL https://go-review.googlesource.com/c/go/+/319489 added a test to misc/cgo/testplugin. The test is failing in the darwin-arm64 builder.

Interestingly, I cannot reproduce the failure locally on a darwin-arm64 machine, by running cd GOROOT/misc/cgo/testplugin; go test. I also cannot reproduce the failure on gomote, by running gomote run $VM go/bin/go tool dist test testplugin or gomote run $VM go/src/all.bash.

But it does fail consistently on the build dashboard and with trybot ( https://storage.googleapis.com/go-build-log/d4df70bf/darwin-arm64-11_0-toothrot_5af7c061.log )

Maybe there is some unusual setting on the builder?

cc @toothrot @cagedmantis @dmitshur

@bcmills bcmills added this to the Go1.17 milestone May 18, 2021
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 18, 2021
@dmitshur dmitshur added the Builders x/build issues (builders, bots, dashboards) label May 18, 2021
@dmitshur
Contributor

dmitshur commented May 18, 2021

This test has also passed for me locally on an M1 machine with macOS 11.3.1, so a builder issue seems likely. I'll try to look at why it's failing in post-submit but not via gomote.

@dmitshur
Contributor

dmitshur commented May 19, 2021

The test also passes on the builder as invoked during release tests (i.e., with something like release -target=darwin-arm64 -version=go1.17beta123 -watch -rev=ff7d5f97b3375a87e2de90f42e96983e5f0f95a4). This supports a theory that the problem could be related to coordinator using SplitMakeRun mode (where it compiles, then snapshots, then runs tests) during TryBots and post-submit builds.

@bcmills
Contributor

bcmills commented May 19, 2021

> This supports a theory that the problem could be related to coordinator using SplitMakeRun mode

That suggests a possible connection to #33598.

@dmitshur
Contributor

I've investigated this more today. The test passes on a physical darwin/arm64 machine, via sequential (non-sharded) test execution, and when executed on its own, but fails specifically when executed by cmd/coordinator in sharded test mode on this builder. It's really nice that it fails so reproducibly and gives us a chance to investigate it, as the root cause is likely shared with other builder issues, though the feedback loop is quite slow.
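
For readers unfamiliar with the distinction, here is a minimal sketch of what "sharded" means compared with sequential execution: the list of tests is partitioned and each partition runs on a separate worker, so a given test no longer runs after the same neighbors or in the same process environment. The `shard` helper, the round-robin policy, and the test names are all hypothetical; cmd/coordinator's real scheduling is not shown in this thread and may differ.

```go
package main

import "fmt"

// shard splits test names across n workers in round-robin order.
// Hypothetical helper for illustration only; this is NOT the
// algorithm cmd/coordinator uses.
func shard(names []string, n int) [][]string {
	if n < 1 {
		n = 1
	}
	out := make([][]string, n)
	for i, name := range names {
		out[i%n] = append(out[i%n], name)
	}
	return out
}

func main() {
	// Placeholder test names, chosen only to make the output readable.
	names := []string{"testplugin", "testshared", "testcarchive", "runtime", "net"}

	// Sequential mode: one worker runs everything in order.
	fmt.Println("sequential:", shard(names, 1)[0])

	// Sharded mode: each worker sees only its own slice of the list.
	for i, s := range shard(names, 2) {
		fmt.Printf("shard %d: %v\n", i, s)
	}
}
```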

Getting to the bottom of it will take longer, so I'll send a CL to skip the test for now, so that its failure doesn't mask other darwin/arm64 regressions in the meantime.
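
A builder-specific skip of this kind could look roughly like the sketch below. It assumes the builder is identified through the `GO_BUILDER_NAME` environment variable and uses the builder name that appears in the TryBot log URL earlier in this thread. Whether the actual CL matches on exactly this string is an assumption, and `skipReason` is a hypothetical helper, not code from the Go tree.

```go
package main

import (
	"fmt"
	"os"
)

// skipReason returns a non-empty reason if the test should be skipped
// on the named builder, or "" if it should run.
// Hypothetical helper; the builder name comes from the TryBot log URL
// in this issue, and matching on it exactly is an assumption.
func skipReason(builder string) string {
	if builder == "darwin-arm64-11_0-toothrot" {
		return "broken on darwin/arm64 builder in sharded mode; see issue 46239"
	}
	return ""
}

func main() {
	// Assumed to be set in the environment on Go builders and
	// unset on a developer machine.
	if r := skipReason(os.Getenv("GO_BUILDER_NAME")); r != "" {
		fmt.Println("skip:", r)
		return
	}
	fmt.Println("run test")
}
```

Inside a real test function the same check would call `t.Skip(reason)` instead of printing. Keying the skip on the builder name rather than on GOOS/GOARCH keeps the test running on local darwin/arm64 machines, where it passes.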

@gopherbot
Contributor

Change https://golang.org/cl/321349 mentions this issue: misc/cgo/testplugin: skip TestIssue25756pie on darwin/arm64 builder

gopherbot pushed a commit that referenced this issue May 20, 2021
This test is known to be broken on the darwin/arm64 builder.
Skip it while it's being investigated so it doesn't mask other failures.

For #46239.
Updates #43228.

Change-Id: I8fe57a0636bba84c3100337146dcb96cc264e524
Reviewed-on: https://go-review.googlesource.com/c/go/+/321349
Trust: Dmitri Shuralyov <[email protected]>
Run-TryBot: Dmitri Shuralyov <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
@dmitshur dmitshur added the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label May 27, 2021
@heschi heschi removed the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label Jun 10, 2021
@dmitshur
Contributor

Based on the investigation above, this is a bug in the build system rather than a test failure. The test is skipped on the problematic builder via CL 321349, so this is no longer blocking the Go 1.17 release.

I think this is a good opportunity for us to investigate further, likely after the 1.17 release, so retitling and moving to next milestone.

@dmitshur dmitshur changed the title from "misc/cgo/testplugin: TestIssue25756pie fails on darwin-arm64 builder" to "x/build/cmd/coordinator: misc/cgo/testplugin.TestIssue25756pie fails on darwin-arm64 builder deterministically but only in sharded mode" Jun 10, 2021
@dmitshur dmitshur modified the milestones: Go1.17, Go1.18 Jun 10, 2021
@heschi
Contributor

heschi commented Dec 6, 2021

I confirmed that the test passes on macOS 12 with a simple all.bash invocation. I'll add a skip matching https://go-review.googlesource.com/c/go/+/321349/.

@gopherbot
Contributor

Change https://golang.org/cl/369748 mentions this issue: [release-branch.go1.17] misc/cgo/testplugin: skip TestIssue25756pie on darwin/arm64 builder

@gopherbot
Contributor

Change https://golang.org/cl/369752 mentions this issue: misc/cgo/testplugin: remove skip in TestIssue25756pie

gopherbot pushed a commit that referenced this issue Dec 6, 2021
…n darwin/arm64 builder

Repeat of CL 321349 for macOS 12. We won't need to do this again -- the
test is passing at tip.

Updates #46239.

Change-Id: Ib279ada443ee03eb8e70fde4bbfba65ce0f6322e
Reviewed-on: https://go-review.googlesource.com/c/go/+/369748
Trust: Heschi Kreinick <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Run-TryBot: Heschi Kreinick <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopherbot pushed a commit that referenced this issue Dec 7, 2021
Though this was a problem for Go 1.17,
it appears not to be a problem on tip.

This reverts change made in CL 321349.

For #46239.

Change-Id: Ie4d6649fbabce3bb2c1cf04d97760ba6ceadaca5
Reviewed-on: https://go-review.googlesource.com/c/go/+/369752
Run-TryBot: Dmitri Shuralyov <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
@ianlancetaylor
Member

This seems to be fixed. We are running the test and nobody is complaining.

@heschi heschi moved this to Done in Go Release Sep 27, 2022
@golang golang locked and limited conversation to collaborators Jan 28, 2023