
cmd/go: cache link output binaries in the build cache #69290

Open
matloob opened this issue Sep 5, 2024 · 25 comments

Comments

@matloob
Contributor

matloob commented Sep 5, 2024

Proposal Details

This proposal is for cmd/go to cache the binary outputs of link actions in the build cache. Binary outputs would be trimmed from the cache earlier than package outputs.

cmd/go currently caches the outputs of compiling a package (build actions) in the build cache, but does not cache the outputs of linking a binary (link actions) in the build cache:

// Cache package builds, but not binaries (link steps).

The primary reasons binaries are not cached are that built binaries are much larger than individual package object files and they are not reused as often. We would mitigate that by trimming binaries with a shorter trimLimit than we currently use for package objects in the cache: we currently remove a package output if it hasn't been used for five days, but we would perhaps choose two days for binaries. To make it easy to identify binaries for trimming, we would store them in a different location from package objects: perhaps instead of $GOCACHE/<prefix>/<hash> they would be stored in $GOCACHE/exe/<prefix>/<hash>.
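For illustration, here is a minimal sketch of the kind of trim policy described above. The exe/ layout, the two-day limit, and the use of file mtime as the "last used" signal are assumptions for the sketch, not the actual cmd/go implementation.

// Minimal sketch only; names and layout are hypothetical, not cmd/go's code.
package cachesketch

import (
	"os"
	"path/filepath"
	"time"
)

const (
	pkgTrimLimit = 5 * 24 * time.Hour // package outputs: removed if unused for five days
	exeTrimLimit = 2 * 24 * time.Hour // linked binaries: hypothetically removed after two days
)

// trimExeCache walks $GOCACHE/exe and removes binaries whose mtime is older
// than exeTrimLimit, leaving package outputs (outside exe/) untouched.
func trimExeCache(gocache string, now time.Time) error {
	return filepath.Walk(filepath.Join(gocache, "exe"), func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		if now.Sub(info.ModTime()) > exeTrimLimit {
			return os.Remove(path)
		}
		return nil
	})
}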

We would also need to figure out what to do about potential ETXTBSY issues when executing the built binaries: see #22220. If the go command writes to a binary and then executes it, the exec can fail with ETXTBSY. That matters here because, for go run, we would need to write the build id into the binary and then execute it.
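To make the ETXTBSY hazard concrete, here is a minimal, unix-oriented sketch of one common mitigation (write the binary, close it, then retry the exec briefly). This is illustrative only; it is not what cmd/go does or a proposal for how it should handle the problem.

// Sketch of a write-then-execute path that retries on ETXTBSY; illustrative only.
package cachesketch

import (
	"errors"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// writeThenRun writes the linked binary to path, closes it, and executes it,
// retrying briefly if the kernel reports the text file as busy (e.g. because a
// concurrently forked child still held the write descriptor open at exec time).
func writeThenRun(path string, binary []byte, args ...string) error {
	if err := os.WriteFile(path, binary, 0o755); err != nil {
		return err
	}
	for i := 0; i < 10; i++ {
		err := exec.Command(path, args...).Run()
		if !errors.Is(err, syscall.ETXTBSY) {
			return err // success, or a different failure
		}
		time.Sleep(10 * time.Millisecond)
	}
	return syscall.ETXTBSY
}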

cc @rsc @samthanawalla

@gopherbot gopherbot added this to the Proposal milestone Sep 5, 2024
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Sep 5, 2024
@ianlancetaylor
Contributor

How often do we think the cache would be used for a binary? Pretty much any change to the code of any of the inputs will require a relink. When would this save time in practice?

@ianthehat

I think the main use case is go run package@version which will always produce the same binary.

Could we consider a total-space eviction policy rather than a time-based one? (Drop the oldest entries from the cache when adding something would push it over the limit.)
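A minimal sketch of the total-space policy being suggested, assuming a simple "oldest mtime first" order and a fixed byte limit (all names hypothetical; cmd/go today trims by age, not total size):

// Sketch of size-based eviction: drop the least recently used entries until
// the cache fits under limit bytes. Hypothetical, for illustration only.
package cachesketch

import (
	"os"
	"path/filepath"
	"sort"
)

func evictToLimit(dir string, limit int64) error {
	type entry struct {
		path  string
		size  int64
		mtime int64
	}
	var entries []entry
	var total int64
	err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		entries = append(entries, entry{path, info.Size(), info.ModTime().UnixNano()})
		total += info.Size()
		return nil
	})
	if err != nil {
		return err
	}
	// Remove oldest entries first until the total size fits under the limit.
	sort.Slice(entries, func(i, j int) bool { return entries[i].mtime < entries[j].mtime })
	for _, e := range entries {
		if total <= limit {
			break
		}
		if err := os.Remove(e.path); err != nil {
			return err
		}
		total -= e.size
	}
	return nil
}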

@matloob
Contributor Author

matloob commented Sep 6, 2024

Yes, I think the main use would be for go run package@version. But I think it could also help plain go run package: for tools that projects track in a tools.go file, and for the new tool feature being implemented for Go 1.24 (#48429).
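For concreteness, a hedged example of the kind of invocation that would benefit (the tool, version, and flags are illustrative, not part of the proposal):

go run golang.org/x/tools/cmd/stringer@v0.24.0 -type=Color   # links the tool and caches the binary
go run golang.org/x/tools/cmd/stringer@v0.24.0 -type=Color   # reuses the cached binary instead of relinking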

@ianlancetaylor
Contributor

Thanks. That at least raises the possibility of using the build cache for go run but not for go build or go install.

@ConradIrwin
Contributor

One thing I'd like from this is for the package name to show up in the output of ps.

I think that means we should store binaries like stringer on disk in a directory, $GOCACHE/exe/<ha>/<hash>/stringer (though it may also be OK to do something like $GOCACHE/exe/<hash>-stringer if there's a lot of per-directory overhead), so that if a tool is misbehaving I can ps ax | grep stringer to find it.

@gopherbot
Contributor

Change https://go.dev/cl/613095 mentions this issue: cmd/go: prototype of binary caching

@matloob
Contributor Author

matloob commented Sep 13, 2024

I've put together a basic prototype (caches all binaries, doesn't use package name or ExeName as file name) at golang.org/cl/613095

@adonovan
Member

Does this need a proposal? It seems to be a mere implementation detail that shouldn't affect the observable behavior, except quantitatively.

(I suppose @ConradIrwin's point that "go run" processes might notice their argv[0] name has changed is a counterargument, but @matloob has an implementation that avoids that by using a subdirectory <hash>/stringer.)

@mvdan
Member

mvdan commented Sep 19, 2024

ETXTBSY might not be an issue on Linux for too long: #22315 (comment)

@ianlancetaylor
Contributor

There are already people who complain about the size of the build cache (e.g., #68872), so I do think this is more than an implementation detail.

@ConradIrwin
Contributor

The original go tool proposal had us caching the outputs not based on the build graph, but on the current module path + tool package path.

For the go tool use-case, it doesn't make much difference. If the key is based on build graph then you'll end up with multiple copies in the cache when the dependencies change (which is rare), but you reduce the number of copies if you are actively working on multiple modules that use the same tool at the same version (which is rare).

That said, I would ideally like a longer cache expiry time for tools than 2 days. It's very common for people to take weekends off (and even long weekends happen occasionally). 5 days is actually quite good for the use-case of "stop working on Thursday afternoon and pick it up again on Tuesday morning".

If we're changing to cache go run too as part of this change, it's more of an issue. It's very common to make changes to a module and then continually go run it. If we use a build-graph-based cache key we'll accumulate builds that will almost never be reused; if we choose a cache key based on current module + package path then we only ever keep the latest version, which would significantly reduce the cache size and probably doesn't affect the cache hit rate very much (it only matters if someone "undoes" back to a previous state of the code, which would be nice to have fast but is not a very common case).

To be clear, if we're not changing the caching behavior of go run, then it doesn't matter either way; if we are changing that behavior, then I think we should deliberately choose our cache keys to ensure that only the latest copy of "the same" binary is kept. (And I think current module path + package path is a reasonable definition of "the same").
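To make the two key choices concrete, here is a minimal sketch; the hashing scheme and the function names are hypothetical, not cmd/go's actual cache keys.

// Sketch contrasting the two cache-key choices discussed above; hypothetical only.
package cachesketch

import (
	"crypto/sha256"
	"fmt"
)

// pathBasedKey keys a cached tool binary by the current module plus the tool's
// package path, so rebuilding "the same" tool overwrites a single cache entry.
func pathBasedKey(currentModule, toolPackage string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(currentModule+"\x00"+toolPackage)))
}

// graphBasedKey keys the binary by everything that went into the link, so any
// change to a dependency produces a new entry and old ones linger until trimmed.
func graphBasedKey(linkInputIDs []string) string {
	h := sha256.New()
	for _, id := range linkInputIDs {
		h.Write([]byte(id))
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}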

@aclements
Member

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.

@aclements aclements moved this from Incoming to Active in Proposals Oct 23, 2024
@ChrisHines
Contributor

@ConradIrwin said:

If the key is based on build graph then you'll end up with multiple copies in the cache when the dependencies change (which is rare), but you reduce the number of copies if you are actively working on multiple modules that use the same tool at the same version (which is rare).

Please elaborate on these statements; I'm not sure I understand. The last part, about it being rare to actively work on multiple modules that use the same tool at the same version, caught my attention. My current project has several mock services written as standalone modules that are used by many other modules' test suites. The tests go build the mock services and run them. go build can find the mock service main packages because we've used the tool dependency pattern to get them into the go.mod file of the module under test. That pattern is repeated across many of our micro-service modules, and they all depend on the same versions of the mock services.

@ConradIrwin
Contributor

@ChrisHines interesting. Are all your modules in one workspace? My experience has been using one module with a bunch of packages to get the same effect. How do you keep all your services' dependencies' versions in sync?

@ChrisHines
Contributor

@ChrisHines interesting. Are all your modules in one workspace? My experience has been using one module with a bunch of packages to get the same effect.

Each module is in a separate Git repository.

How do you keep all your services' dependencies' versions in sync?

Excellent question. This project was late to adopt modules for reasons I wrote about last year in #60915. Prior to modules our dependencies were managed by a tool that allowed depending on a branch head and we used that for our internal dependencies. With that tool all of our "modules" referenced specific versions of third party libraries, but referenced the main branch of our own "modules". The main branch is protected and only updated by CI if all checks pass. Meanwhile we have a tip branch that is the integration branch all PRs are merged to.

We wanted to preserve the existing CI and developer workflows the project had for years as we transitioned to Go modules and fortunately we were able to figure out a way to do that. The short and simplified explanation follows.

For any new commit to the tip branch of a module, or to the main branch of any of its internal dependencies, our CI pipeline runs go get <dependency>@main for our own modules. If that results in changes to the go.mod file, they are committed back to tip. If there are no changes to go.mod then all tests are run and, if they pass, the main branch is fast-forwarded to match the revision of tip in the build. That change might trigger other modules to update/build in the same way, until any update to a library module is fully propagated. This process requires that we don't have any cycles in the internal subset of our module graph.

@aclements
Member

I think it makes sense to cache the build output of go run and not go build or go install. What about go test? That seems less valuable because we can already cache test output, but not useless either.

@mvdan
Member

mvdan commented Oct 30, 2024

I would not do go test right from the start - even setting aside test output caching, I develop code while testing changes continuously, so that would likely mean dozens if not hundreds of cached test binaries per day.

@seankhliao
Member

it seems less useful for go test since most of the time you only rerun a test after changing / fixing some code

@matloob
Contributor Author

matloob commented Oct 31, 2024

Yeah, I don't think we need to cache the binary for go test.

@matloob
Contributor Author

matloob commented Nov 4, 2024

For the practical purposes of this proposal we'd want to support go run, and also the new custom go tool behavior in #48429.

@aclements aclements moved this from Active to Likely Accept in Proposals Nov 13, 2024
@aclements
Member

Based on the discussion above, this proposal seems like a likely accept.

This proposal is for cmd/go to cache the binary outputs of link actions to the build cache for go run and custom go tools. Binary outputs would be trimmed from the cache earlier than package outputs.

@bradfitz
Contributor

Yeah, I don't think we need to cache the binary for go test.

There's actually one valid use case for that: sharding slow tests across many machines, with each running a different subset of tests.

e.g. we have a test helper like:

// Note: the imports and the testNum declaration below were not in the original
// snippet; they are added here so the helper is self-contained.
import (
	"os"
	"strconv"
	"strings"
	"sync/atomic"
	"testing"
)

// testNum counts how many sharded tests this process has seen so far.
var testNum atomic.Int32

// Shard skips t if the TS_TEST_SHARD test shard is set to "n/m" and this
// test's execution number in the process, mod m, is not equal to n-1.
// That is, to run with 4 shards, set TS_TEST_SHARD=1/4, ..., TS_TEST_SHARD=4/4
// for the four jobs.
func Shard(t testing.TB) {
	e := os.Getenv("TS_TEST_SHARD")
	a, b, ok := strings.Cut(e, "/")
	if !ok {
		return
	}
	wantShard, _ := strconv.ParseInt(a, 10, 32)
	shards, _ := strconv.ParseInt(b, 10, 32)
	if wantShard == 0 || shards == 0 {
		return
	}

	shard := ((testNum.Add(1) - 1) % int32(shards)) + 1
	if shard != int32(wantShard) {
		t.Skipf("skipping shard %d/%d (process has TS_TEST_SHARD=%q)", shard, shards, e)
	}
}

And then we configure GitHub to run our tests multiple times, each running a different subset of the tests:

  test:
    strategy:
      fail-fast: false # don't abort the entire matrix if one element fails
      matrix:
        include:
          - goarch: amd64
            coverflags: "-coverprofile=/tmp/coverage.out"
          - goarch: amd64
            buildflags: "-race"
            shard: '1/3'
          - goarch: amd64
            buildflags: "-race"
            shard: '2/3'
          - goarch: amd64
            buildflags: "-race"
            shard: '3/3'
...
    - name: integration tests as root
      run: PATH=$PWD/tool:$PATH /tmp/testwrapper -exec "sudo -E" -race ./tstest/integration/
      env:
        TS_TEST_SHARD: ${{ matrix.shard }}

So with GOCACHEPROG, we'd like all three machines to share the same linked test binary.

@aclements
Member

I don't want to let the perfect be the enemy of the good here. I think we have a solid, agreed-upon core for caching go run and go tools that we can move forward on, and we can separately figure out if and how this should interact with go test.

@aclements aclements moved this from Likely Accept to Accepted in Proposals Nov 21, 2024
@aclements
Member

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.

This proposal is for cmd/go to cache the binary outputs of link actions to the build cache for go run and custom go tools. Binary outputs would be trimmed from the cache earlier than package outputs.

@aclements aclements changed the title from "proposal: cmd/go: cache link output binaries in the build cache" to "cmd/go: cache link output binaries in the build cache" Nov 21, 2024
@aclements aclements modified the milestones: Proposal, Backlog Nov 21, 2024
@dmitshur dmitshur modified the milestones: Backlog, Go1.24 Nov 22, 2024
@dmitshur
Contributor

This cmd/go change might be noteworthy enough¹ to be covered in Go 1.24 release notes, so reopening with a release blocker to track that.

¹ In addition to being a nice enhancement, it'd help users understand changes to the size of the build cache as mentioned in #69290 (comment).
