Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: TestScript/list_std_stale fails if CGO_CFLAGS was set to a custom value during ./make.bash #46347

Closed
dmitshur opened this issue May 24, 2021 · 22 comments
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@dmitshur
Copy link
Contributor

dmitshur commented May 24, 2021

As found by @bcmills in #46292, this happens on tip (tested with commit 52d7033).

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/dmitshur/Library/Caches/go-build"
GOENV="/Users/dmitshur/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/dmitshur/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/dmitshur/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/dmitshur/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/dmitshur/gotip/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="devel go1.17-52d7033ff6 Mon May 24 15:03:18 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/dmitshur/gotip/src/go.mod"
CGO_CFLAGS="-O3"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/zb/5p8cwfhj29gf_m8vdy8ylmlr00jwcj/T/go-build1142646826=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

# Set CGO_CFLAGS in environment.
$ export CGO_CFLAGS='-O3'

# Build Go and run a test.
$ cd gotip/src
$ ./make.bash
$ export PATH="$(cd ../bin; pwd):$PATH"
$ go test -run=TestScript/list_std_stale -short -count=1 cmd/go

What did you expect to see?

A test result that verifies issue #44725 is fixed, or a skipped test execution if the current environment doesn't make it possible to run the test reliably.

What did you see instead?

The test fails if ./make.bash was run with CGO_CFLAGS set as above.

go test proxy running at GOPROXY=http://127.0.0.1:51387/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/list_std_stale (0.24s)
        script_test.go:252: 
            # https://golang.org/issue/44725: packages in std should not be reported as stale,
            # regardless of whether they are listed from within or outside GOROOT/src.
            # Control case: net should not be stale at the start of the test,
            # and should depend on vendor/golang.org/… instead of golang.org/…. (0.186s)
            > ! stale net
            FAIL: testdata/script/list_std_stale.txt:7: net is unexpectedly stale
            
            
FAIL
FAIL	cmd/go	1.683s
FAIL

Unsetting the env var afterwards doesn't make a difference:

$ unset CGO_CFLAGS
$ go test -run=TestScript/list_std_stale -short -count=1 cmd/go
[...]
FAIL	cmd/go	1.605s

Edit: Simplified reproduce case from CGO_CFLAGS='-mmacosx-version-min=10.13' to CGO_CFLAGS='-O3'.

@dmitshur dmitshur added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. GoCommand cmd/go labels May 24, 2021
@dmitshur dmitshur added this to the Go1.17 milestone May 24, 2021
@bcmills
Copy link
Contributor

bcmills commented May 24, 2021

Probably we should either explicitly propagate or explicitly clear out every environment variable in the union of envcmd.MkEnv(), envcmd.ExtraEnvVars(), and envcmd.ExtraEnvVarsCostly(), except for the ones that aren't writable (as controlled by envcmd.checkEnvWrite).

That way we can at least make a conscious decision about which environment variables should affect the test.

CC @jayconrod @matloob

@bcmills bcmills added NeedsFix The path to resolution is known, but the work has not been done. Testing An issue that has been verified to require only test changes, not just a test failure. labels May 24, 2021
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 24, 2021
@dmitshur
Copy link
Contributor Author

The current plan in #46292 is to rely on this testing issue being resolved to unblock the release, so transitively applying release-blocker label.

@bcmills
Copy link
Contributor

bcmills commented May 24, 2021

Hmm, I'm thinking about this some more.

If make.bash sets the environment variable CGO_CFLAGS='-mmacosx-version-min=10.13', won't that imply that users on MacOS will also see the net package as stale (and rebuild it) the first time they build anything that uses it?
Or does that setting somehow get baked into the resulting binaries?

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 24, 2021
@gopherbot gopherbot removed the NeedsFix The path to resolution is known, but the work has not been done. label May 24, 2021
@dmitshur
Copy link
Contributor Author

dmitshur commented May 24, 2021

The change to set CGO_CFLAGS='-mmacosx-version-min=10.13' was done as part the work to fix issues #36025 and #35459. Looking at it now, it does seem it might cause net package (or other cgo packages) to become stale if that environment variable is used as part of staleness calculation now and users don't set it. I remember that issue had a lot of complexity and nuance though, so there may be additional factors to consider here. (CC @cagedmantis, @jayconrod who worked on it last time.)

When I looked into it more recently for darwin/arm64 in CL 278435 I hadn't found a way to confirm that setting CGO_CFLAGS before make.bash was still needed to fix a problem, so that that's why the same change hasn't been applied for darwin/arm64 at this time.

@bcmills
Copy link
Contributor

bcmills commented May 24, 2021

I think the default -mmacosx-version-min flag probably ought to be baked into cmd/go, and then not set explicitly on the builder (CC @cherrymui for macOS and linker expertise).

I also wonder why we don't have the same problem with ARM builders from setting the GOARM flags: https://github.com/golang/build/blob/d0819edf598a7f08d0fa642ca67f9d2c094e775b/cmd/release/release.go#L421-L425

@dmitshur
Copy link
Contributor Author

dmitshur commented May 24, 2021

I also wonder why we don't have the same problem with ARM builders from setting the GOARM flags

It looks like cmd/go's TestScript is completely skipped on GOARCH=arm via testenv.SkipIfShortAndSlow.

@dmitshur
Copy link
Contributor Author

dmitshur commented May 24, 2021

I think the default -mmacosx-version-min flag probably ought to be baked into cmd/go, and then not set explicitly on the builder (CC @cherrymui for macOS and linker expertise).

If that's possible, I think it would likely be an improvement. The fewer special cases we need to maintain in x/build/cmd/release (which apply on top of make.bash and cmd/go but only during the release process and release testing), the better.

I wanted to verify that the original report in this issue applies more generally, so I tested on linux/amd64, and can confirm it also reproduces with something like this:

$ export CGO_CFLAGS='-O3'
$ ./make.bash # etc.
$ go test -run=TestScript/list_std_stale cmd/go

(The same test passes if CGO_CFLAGS in unset before make.bash.)

@cherrymui
Copy link
Member

$ CGO_CFLAGS='-mmacosx-version-min=10.13' ./make.bash 
...
$ CGO_CFLAGS='' go list -f '{{.Stale}}' net
true
$ CGO_CFLAGS='-mmacosx-version-min=10.13' go list -f '{{.Stale}}' net
false

Apparently it is not reported stale if the same flag is passed at staleness check and at make.bash. Does TestScript/list_std_stale do anything special with the flags, like unsetting CGO_CFLAGS?

@bcmills
Copy link
Contributor

bcmills commented May 24, 2021

Does TestScript/list_std_stale do anything special with the flags, like unsetting CGO_CFLAGS?

TestScript (intentionally) only propagates a specific subset of environment variables. I think the idea is that the behavior of the toolchain ought to be independent of the user's environment — especially given that the Go toolchain may be installed by a system administrator but run by an individual user — and most users should not need to set CGO_CFLAGS explicitly.

We have another existing staleness check that is currently skipped on darwin configurations here:

[!darwin] ! stale cmd/cgo # The darwin builders are spuriously stale; see #33598.

If cmd/cgo is actually stale on the macOS builds that we're distributing to users, I'm not sure exactly what that implies — we don't generally cache binaries, so is it possible that cmd/cgo is being reinstalled on users' machines, or relinked from the build cache every time a macOS user builds any package involving cgo?

@cherrymui
Copy link
Member

I'm not sure exactly what that implies — we don't generally cache binaries, so is it possible that cmd/cgo is being reinstalled on users' machines, or relinked from the build cache every time a macOS user builds any package involving cgo?

I don't think we check binary staleness when invoking a tool binary. The go command just invokes the installed binary, whether it is stale or not.

@cherrymui
Copy link
Member

I think the idea is that the behavior of the toolchain ought to be independent of the user's environment — especially given that the Go toolchain may be installed by a system administrator but run by an individual user — and most users should not need to set CGO_CFLAGS explicitly.

This sounds like we may want to somehow bake the make.bash-time CGO_CFLAGS into the toolchain.

On the other hand, what if a user just wants to set CGO_CFLAGS when building the standard library and toolchain, but not when building user code? Or maybe nobody wants to do that...

@bcmills
Copy link
Contributor

bcmills commented May 25, 2021

On the other hand, what if a user just wants to set CGO_CFLAGS when building the standard library and toolchain, but not when building user code?

I could perhaps see someone setting CGO_FLAGS for building the toolchain.

I don't think that would make sense for the standard library, though — it would just cause each cgo-enabled standard-library package to be rebuilt when the user first builds a program that imports it.

@bcmills
Copy link
Contributor

bcmills commented May 25, 2021

Hmm, ok. So the impact of a CGO_CFLAGS mismatch is:

  • Certain packages in std are reported as stale (because they might be changed by the flags).

    • These packages will be rebuilt, at a small one-time cost, when first needed as a go build or go test dependency.
  • Certain binaries in cmd are reported as stale (by go list -json and similar commands). It isn't obvious to me whether that is inherently a problem for external tools.

  • Certain tests in cmd may fail:

    • cmd/link.TestDWARF and cmd/link.TestDWARFiOS require that cmd/link itself is not stale.

    • Multiple tests in cmd/go assume that packages in the Go distribution are not stale:

      • TestScript/build_package_not_stale_trailing_slash assumes that runtime, os, and io are not stale. It is not obvious to me whether os can depend on CGO_CFLAGS, particularly on the platforms that use C libraries for system calls.
      • TestScript/list_std_stale assumes that net is not stale. That test uses staleness as a proxy for whether the dependencies of the package are resolved identically inside and outside of std, but it could perhaps be refactored not to do that.
      • TestScript/test_race_install_cgo checks that cmd/cgo is not stale at the start of the test, but that property might no longer be important to the test as of CL 68338 (Oct. 2017): the test will no longer fail spuriously even if cmd/cgo is stale, as demonstrated by the fact that the test passes on the darwin builders today.
      • Tests that involve the -i flag check that the runtime package is not stale, because otherwise they would attempt to install it to GOROOT. (Tests must in general not assume that they can write to GOROOT.)
        • TestScript/test_relative_import_dash_i tests go test -i.
        • TestGoTestDashIDashOWritesBinary tests go test -i.
        • TestInstallDeps tests go install -i.
        • Note that as of CL 266368 (Go 1.16) the -i flag is deprecated (cmd/go: deprecate the -i flag #41696).

@bcmills
Copy link
Contributor

bcmills commented May 25, 2021

Looking at the above, several things stand out:

  1. We have a lot of unintended technical debt surrounding CGO_CFLAGS and staleness checks. 😅
  2. A mismatched CGO_CFLAGS setting is mostly harmless as long as the user doesn't run go test cmd.
  3. Of the handful of cmd tests that do fail due to staleness, most (all?) arguably shouldn't, especially given that we already bypass some of those staleness checks on the darwin builders.

@cherrymui
Copy link
Member

cmd/link.TestDWARF and cmd/link.TestDWARFiOS require that cmd/link itself is not stale.

I still think that test should not check linker staleness at all. (This was added just to help local development.)

@bcmills
Copy link
Contributor

bcmills commented May 25, 2021

It appears that TestScript/build_package_not_stale_trailing_slash is out-of-date anyway: the test is checking for a missed filepath.Clean, but even if I change GOROOT to a lexically-different symlink the targets are still not reported as stale.

@perillo
Copy link
Contributor

perillo commented May 25, 2021

The problem seems to be a build ID mismatch. The build ID stored in net.o does not match with the build ID used in Builder.useCache.

@perillo
Copy link
Contributor

perillo commented May 25, 2021

The problem occurs when setting any of the CGO flags. From cgo documentation:

When building, the CGO_CFLAGS, CGO_CPPFLAGS, CGO_CXXFLAGS, CGO_FFLAGS and
CGO_LDFLAGS environment variables are added to the flags derived from
these directives. Package-specific flags should be set using the
directives, not the environment variables, so that builds work in
unmodified environments. Flags obtained from environment variables
are not subject to the security limitations described above.

Adding an fmt.Println in the Builder.buildActionID function to print each processed package cflags, I got:

$ CGO_CFLAGS='-O3' ./make.bash
runtime/cgo [-O3 -Wall -Werror]
os/user [-O3]
net [-O3]
os/signal/internal/pty [-O3]
plugin [-O3]

and

go test -run=TestScript/list_std_stale -short -count=1 cmd/go
runtime/cgo [-g -O2 -Wall -Werror]
net [-g -O2]

On the other hand:

$ ./make.bash
runtime/cgo [-g -O2 -Wall -Werror]
os/user [-g -O2]
net [-g -O2]
os/signal/internal/pty [-g -O2]
plugin [-g -O2]
$ go env CGO_CFLAGS
-g -O2

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/322629 mentions this issue: cmd/go,cmd/link: do not check for staleness in most tests

@bcmills bcmills self-assigned this May 25, 2021
@perillo
Copy link
Contributor

perillo commented May 25, 2021

I have a doubt: if for packages using cgo the build ID is meaningless, what's the purpose of creating a build ID for them?

@bcmills
Copy link
Contributor

bcmills commented May 25, 2021

The build ID is not meaningless — that determines whether the package needs to be rebuilt.

It just happens that “rebuilding a package” is not nearly as big a deal now that we always have a build cache.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/381854 mentions this issue: cmd/go: rewrite TestScript/cgo_stale_precompiled to be agnostic to staleness

gopherbot pushed a commit that referenced this issue Jan 31, 2022
…aleness

The configuration set by x/build/cmd/releasebot causes runtime/cgo to
be stale in the darwin/amd64 release (see #36025, #35459).
That staleness is mostly benign because we can reasonably assume that
users on macOS will either disable CGO entirely or have a C compiler
installed to rebuild (and cache) the stale packages if needed.

Fixes #50892
Fixes #50893
Updates #46347

Change-Id: Ib9ce6b5014de436264238f680f7ca4ae02c9a220
Reviewed-on: https://go-review.googlesource.com/c/go/+/381854
Trust: Bryan Mills <[email protected]>
Run-TryBot: Bryan Mills <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Russ Cox <[email protected]>
@rsc rsc unassigned bcmills Jun 23, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

5 participants