Repeated issues with unable to create directory for _empty.go on Windows #3558

Closed
tomqwpl opened this issue May 12, 2023 · 19 comments · Fixed by #3566

Comments

@tomqwpl

tomqwpl commented May 12, 2023

What version of rules_go are you using?

v0.39.0

What version of gazelle are you using?

v0.29.0

What version of Bazel are you using?

bazel 6.1.0

Does this issue reproduce with the latest releases of all the above?

Yes. Tried rules_go 0.39.1, Bazel 6.2.0, and gazelle 0.30.0.

What operating system and processor architecture are you using?

Windows x64

Any other potentially useful information about your toolchain?

What did you do?

Just trying to build golang

What did you expect to see?

That it works

What did you see instead?

Many golang targets (almost always tests, don't know whether that is significant) fail with errors like:

compilepkg: could not create directory for _empty.go: mkdir C:\users\me\_bazel_me\dotzqxa5\execroot\altr_sw_hub\bazel-out\x64_windows-fastbuild\bin\golang\pkg\auth\auth\go_altair_com_slchub_pkg_auth_auth: Cannot create a file when that file already exists.

There is an issue elsewhere that reports that this only happens when there are multiple go_test targets in the BUILD.bazel file. That isn't the case here, or at least not in all cases (it is in a couple).

It also happens when running with --jobs=1, though I haven't run a completely clean build like that. I ran bazel with --keep_going to try and run all targets that would succeed and then tried running with --jobs=1 to see whether the rest would succeed.
I could get it to run using --jobs=1 and then each time it failed, deleting the directory it was trying to create. After a few rounds of that the build succeeded.

I have repeated the test by first manually deleting the "execroot" output directory to force a completely clean build and I get the same thing.

The curious thing is that, looking at the source code for the rules, I don't really understand why it's even trying to make an _empty.go file. It appears that the file is only created if len(srcs.goSrcs) == 0. But all of my rules have lists of go files, and none of the files have golang build conditions in them, so it ought to have real code to build.

@tomqwpl
Author

tomqwpl commented May 12, 2023

I have just done a complete clean build with "--jobs=1" and it ran through successfully without intervention, so this would appear to be a concurrency issue of some kind. Note that I'm not using the experimental "sandboxing" feature on Windows.

As I say, the curiosity I have is why it's even attempting to create an "_empty.go" file.

@fmeum
Collaborator

fmeum commented May 15, 2023

cc @sluongng as you worked on this in the past

I could see this happening if a build action is interrupted before performing the cleanup that deletes the directory, but that should be rare.

@tomqwpl Does the directory exist after every build? Are there two targets with importpaths that become identical after being converted to directory names (e.g. slashes replaced with underscores)?

@sluongng
Contributor

Yeah, I ran into this in the past. See #3145 for a detailed explanation.

If you could help provide how you configured your tests in the same package, it would be tremendously helpful for us to troubleshoot this.

Part of the rules_go test suite configuration also "demonstrates" this well: there is no sandboxing on Windows. Bazel's experimental sandboxing for Windows is actually a no-op without a BazelSandbox.exe which is not included in any releases.

So long story short, when working with Bazel on Windows, expect no sandbox available.
I will create a topic on Bazel discussion https://github.com/bazelbuild/bazel/discussions to share my understanding of the state of Windows sandboxing further.

@tomqwpl
Author

tomqwpl commented May 15, 2023

@fmeum In response to "Does the directory exist after every build?".
Not totally sure what you mean. Perhaps this helps.
I just tried running a build. Failed in the above described way.
I then deleted the directory it said it was trying to create and reran the build. It failed on another instance of the same thing. However, the directory I had deleted now exists again, and is empty.
The directory that it says it's trying to create for the second failure isn't empty; it contains "_empty.go".

So I believe that when it fails with the "could not create directory for _empty.go" error, the directory it's trying to create exists, and already contains _empty.go. If I delete that directory and rerun, then the directory is created, and at the end of the build is actually empty.

Hopefully that makes sense

@tomqwpl
Author

tomqwpl commented May 15, 2023

@sluongng Here is an example that's just failed:

load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")

go_library(
    name = "rbac",
    srcs = ["migrations.go"],
    embedsrcs = glob(["migrations/*.sql"]),
    importpath = "go.altair.com/slchub/pkg/migrations/rbac",
    visibility = ["//visibility:public"],
)

go_test(
    name = "rbac_test",
    srcs = ["migrations_test.go"],
    embed = [":rbac"],
    deps = [
        "@com_github_onsi_ginkgo//:ginkgo",
        "@com_github_onsi_ginkgo//reporters",
        "@com_github_onsi_gomega//:gomega",
    ],
)

None of the source files have any build conditions in them.

Note that the BUILD.bazel file has only one go_test in it. BUILD.bazel is generated by gazelle.

@tomqwpl
Author

tomqwpl commented May 15, 2023

Are there two targets with importpaths that become identical after being converted to directory names (e.g. slashes replaced with underscores)?

As far as I can tell, no. All of the BUILD.bazel files are generated by gazelle. For the one that just failed (golang\pkg\migrations\rbac\go_altair_com_slchub_pkg_migrations_rbac), I'm sure that there's not another package with the same name if you convert slashes to underscores. And anyway, it would have to exist in the same "golang\pkg\migrations\rbac\BUILD.bazel" build file for it to be trying to create a directory under the same "golang\pkg\migrations\rbac" directory. That file is shown above. It's as simple as it can be.
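For anyone who wants to check a larger repository for this kind of clash mechanically, here is a minimal sketch. The `sanitize` function is a hypothetical stand-in that replaces every non-alphanumeric character with an underscore; it is not the rules_go `sanitizePathForIdentifier` implementation (the bazelisk log later in this thread suggests the real mapping differs between versions), and the import paths in `main` are invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitize is a hypothetical stand-in for the import-path-to-directory-name
// conversion: every character that is not an ASCII letter or digit becomes "_".
func sanitize(importPath string) string {
	return strings.Map(func(r rune) rune {
		if r >= 'a' && r <= 'z' || r >= 'A' && r <= 'Z' || r >= '0' && r <= '9' {
			return r
		}
		return '_'
	}, importPath)
}

// collisions groups import paths by their sanitized name and keeps only the
// groups that contain more than one import path.
func collisions(importPaths []string) map[string][]string {
	byDir := make(map[string][]string)
	for _, p := range importPaths {
		d := sanitize(p)
		byDir[d] = append(byDir[d], p)
	}
	for d, paths := range byDir {
		if len(paths) < 2 {
			delete(byDir, d)
		}
	}
	return byDir
}

func main() {
	// Invented import paths: the first two differ only in "/" versus "_".
	paths := []string{
		"example.com/pkg/auth/store",
		"example.com/pkg/auth_store",
		"example.com/pkg/migrations",
	}
	for dir, clashing := range collisions(paths) {
		fmt.Printf("%s <- %v\n", dir, clashing)
	}
}
```

Even when two import paths map to the same name, they would only conflict on disk if both targets also write under the same output directory, which is the point made above.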

@sluongng
Contributor

Rewriting the error above a bit for easier reading:

compilepkg: 
  could not create directory for _empty.go: 
    mkdir C:\users\me\_bazel_me\dotzqxa5\execroot\altr_sw_hub\bazel-out\x64_windows-fastbuild\bin\golang\pkg\auth\auth\go_altair_com_slchub_pkg_auth_auth: 
      Cannot create a file when that file already exists.

This is happening on

emptyDir := filepath.Join(filepath.Dir(outPath), sanitizePathForIdentifier(importPath))
if err := os.Mkdir(emptyDir, 0o700); err != nil {
	return fmt.Errorf("could not create directory for _empty.go: %v", err)
}
defer os.RemoveAll(emptyDir)

It seems like the directory C:\users\me\_bazel_me\dotzqxa5\execroot\altr_sw_hub\bazel-out\x64_windows-fastbuild\bin\golang\pkg\auth\auth\go_altair_com_slchub_pkg_auth_auth is being re-created 🤔, which suggests that there is another go_test in the same package that may have created the directory and entered the same race condition.
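To make the failure mode concrete, here is a small standalone sketch (not rules_go code) showing that a second os.Mkdir on a path that already exists returns an error satisfying errors.Is(err, fs.ErrExist). The directory name "example_com_pkg" and the helper names are made up for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// mkdirTwice calls os.Mkdir on the same path twice and returns the error
// from the second call. os.Mkdir, unlike os.MkdirAll, fails if the
// directory is already there.
func mkdirTwice(parent, name string) error {
	dir := filepath.Join(parent, name)
	if err := os.Mkdir(dir, 0o700); err != nil {
		return err
	}
	return os.Mkdir(dir, 0o700)
}

// secondMkdirFailsWithExist runs mkdirTwice inside a throwaway temp directory
// and reports whether the second call failed because the directory existed.
func secondMkdirFailsWithExist() (bool, error) {
	parent, err := os.MkdirTemp("", "emptygo-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(parent)
	err = mkdirTwice(parent, "example_com_pkg")
	return errors.Is(err, fs.ErrExist), nil
}

func main() {
	exists, err := secondMkdirFailsWithExist()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("second Mkdir failed with ErrExist:", exists)
}
```

So any situation where the directory survives, whether from a concurrent action, an interrupted action, or a failed cleanup, would surface as exactly the error reported here.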

In https://github.com/bazelbuild/rules_go/pull/3145/files#diff-f5ba5554688cd727c306fb33aa10f80e92fd27201da575159972b133f7c6916cR105, when there is no explicit importpath attribute set on the go_test target, we use the label name (things like name = "auth") as the importpath instead.

Since I don't have a way to reproduce this on my end, @tomqwpl is it possible for you to set up a small example git repo that we could use to reproduce and fix the problem?

@tomqwpl
Author

tomqwpl commented May 15, 2023

This is happening on

The bit I'm suspicious about is the "if len(srcs.goSrcs) == 0 {" line. Given the BUILD.bazel file above, why is that condition true? Or am I misunderstanding something?

I can have a go at setting something up, but as you can see from my example above, it's a pretty simple example. This has the feel of something that will only occur in a large build though. It doesn't always happen in the same place, though it always happens somewhere for me at the moment.
I will try to separate that into its own example and see if it fails on its own.

@sluongng
Contributor

The bit I'm suspicious about is the "if len(srcs.goSrcs) == 0 {" line. Given the BUILD.bazel file above, why is that condition true? Or am I misunderstanding something?

You would need to understand how Go (and rules_go) compiles tests in a package. I would recommend reading #3145 as I tried explaining it there in detail.

But when there is a go_test target being compiled, rules_go will actually split that into 2 libraries underneath, 1 internal and 1 external.

# Compile the library to test with internal white box tests
internal_library = go.new_library(go, testfilter = "exclude")
internal_source = go.library_to_source(go, ctx.attr, internal_library, ctx.coverage_instrumented())
internal_archive = go.archive(go, internal_source)
go_srcs = split_srcs(internal_source.srcs).go
# Compile the library with the external black box tests
external_library = go.new_library(
    go,
    name = internal_library.name + "_test",
    importpath = internal_library.importpath + "_test",
    testfilter = "only",
)
external_source = go.library_to_source(go, struct(
    srcs = [struct(files = go_srcs)],
    embedsrcs = [struct(files = internal_source.embedsrcs)],
    deps = internal_archive.direct + [internal_archive],
    x_defs = ctx.attr.x_defs,
), external_library, ctx.coverage_instrumented())
external_source, internal_archive = _recompile_external_deps(go, external_source, internal_archive, [t.label for t in ctx.attr.embed])
external_archive = go.archive(go, external_source)

It's possible that you use external tests or internal tests exclusively in that package though. In that case, one of the 2 packages will be empty without any source files to compile. And because Bazel expects deterministic outputs, we would still need to provide it with something... which is why we would create _empty.go there and compile it to create a dummy result.
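As a rough illustration of that split (a sketch only, not the actual rules_go filtering code), the snippet below partitions a set of sources by package clause: files whose package name ends in "_test" go to the external package, everything else to the internal one. The function name `splitTestSources` and the sample file contents are assumptions made for this example.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"sort"
	"strings"
)

// splitTestSources partitions sources (file name -> contents) into the
// internal package and the external "<name>_test" package, based only on
// each file's package clause.
func splitTestSources(srcs map[string]string) ([]string, []string, error) {
	var internal, external []string
	fset := token.NewFileSet()
	for name, content := range srcs {
		// PackageClauseOnly stops parsing after the "package x" line.
		f, err := parser.ParseFile(fset, name, content, parser.PackageClauseOnly)
		if err != nil {
			return nil, nil, err
		}
		if strings.HasSuffix(f.Name.Name, "_test") {
			external = append(external, name)
		} else {
			internal = append(internal, name)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(internal)
	sort.Strings(external)
	return internal, external, nil
}

func main() {
	// Mirrors the rbac example above, assuming both files declare package rbac.
	srcs := map[string]string{
		"migrations.go":      "package rbac\n",
		"migrations_test.go": "package rbac\n",
	}
	internal, external, err := splitTestSources(srcs)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("internal:", internal)
	fmt.Println("external:", external)
	if len(external) == 0 {
		fmt.Println("external package has no sources; a placeholder such as _empty.go would be needed")
	}
}
```

Under that assumption the external half ends up with no sources, which would explain why the _empty.go path is taken even though the BUILD.bazel file lists real Go files.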

@tomqwpl
Author

tomqwpl commented May 15, 2023

It feels to me like there are two complete copies of bazel running at the same time, both trying to do the same build and thus conflicting with each other. The things it is complaining about aren't duplicated as far as I can see. There is only one "go_test" in each BUILD.bazel file that it complains about.

At the beginning of the build log I get:

DEBUG: C:/work/slc-hub/main/packaging/naming.bzl:12:10: {"version": "0.0.0.0", "branch": "master", "name_suffix": "DEV", "noinfo": "passed"}
DEBUG: C:/work/slc-hub/main/packaging/naming.bzl:12:10: {"version": "0.0.0.0", "branch": "master", "name_suffix": "DEV", "noinfo": "passed"}

I can only see one place in the BUILD.bazel files where the code to generate that is being referenced, yet it's being output twice.

I have only one java.exe process running. If I bazel shutdown it goes away. Symptoms persist when I try again after that.

I will try rebooting and see whether that makes any difference.

@tomqwpl
Author

tomqwpl commented May 15, 2023

Rebooting makes no difference.
It runs reliably with --jobs=1.
I only get errors when there are tests being built. If I'm building just the "build the distribution zip or distribution MSI" target I don't have a problem. But any time I do "build //..." it fails with this kind of error, unless I use "--jobs=1". That takes a while though (albeit less time than manually restarting it after deleting the directories it complains about each time).

@tomqwpl
Author

tomqwpl commented May 15, 2023

It runs reliably with --jobs=1.

Possibly spoke too soon on that front.
Paraphrased output:

bazel build --jobs=1 //...

...

compilepkg: could not create directory for _empty.go: mkdir C:\users\tquarendon\_bazel_tquarendon\dotzqxa5\execroot\altr_sw_hub\bazel-out\x64_windows-fastbuild\bin\golang\pkg\auth\usergroup\store\go_altair_com_slchub_pkg_auth_usergroup_store: Cannot create a file when that file already exists

erase /q <that directory>

bazel build --jobs=1 //...

same complaint.

That is, it complained twice on subsequent runs about the same directory, despite the fact that I erased it in between (and I have checked the directories against each other; they are the same and I deleted the same one). So the second time the directory didn't exist before bazel started, which means it must have been created by bazel.
In fact I appear to be able to do that over and over.

@sluongng
Contributor

Yeah that's because you have multiple actions under //... that are creating a similar path.
It might help narrow things down if you were to do

bazel clean --expunge
bazel build --jobs=1 //golang/pkg/auth/usergroup/store/:all

That would let you build all targets under the same package and check if there is any duplication within that package.

@tomqwpl
Author

tomqwpl commented May 15, 2023

>bazel clean --expunge
INFO: Starting clean.

>bazel build --jobs=1 //golang/pkg/auth/usergroup/store:all
INFO: Analyzed 2 targets (178 packages loaded, 10293 targets configured).
INFO: Found 2 targets...
INFO: Elapsed time: 208.235s, Critical Path: 57.10s
INFO: 137 processes: 9 internal, 128 local.
INFO: Build completed successfully, 137 total actions

@sluongng
Contributor

@fmeum I know it's unlikely due to bazel-out\x64_windows-fastbuild, but is there a chance that a downstream transition forced the same target to recompile using the same path and caused this collision? Not quite sure why the same package is being recompiled multiple times here.

@fmeum
Collaborator

fmeum commented May 17, 2023

Different configurations should result in different output paths and the directory we create _empty.go in is relative to the output path.

@tomqwpl I have access to Windows - it would be very helpful if you could provide me with a reproducer.

@tomqwpl
Author

tomqwpl commented May 18, 2023

So far I've been unable to reproduce on a small example. If I take just the one BUILD.bazel file that fails, for example, and put it in a separate directory, it then succeeds. But equally, running just the one single build target seems to work. The BUILD.bazel files are nothing complicated though, see example above.

Now, the interesting thing is that I don't seem to be able to reproduce this any more. I've just done a "clean --expunge" and then a "build //..." and it worked without issue. Previously this would have given me an issue. I'm fairly sure nothing in the code has changed, and that I'm using the same directory as I was before.
I will try again, but I may have to just close this as now unreproducible.

@fmeum
Collaborator

fmeum commented May 19, 2023

This is happening in Bazel CI (https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/3035#0188343c-c91d-4305-866f-aabf6f5e5a9d):

(13:48:16) ERROR: C:/b/bk-windows-v72b/bazel-downstream-projects/bazelisk/platforms/BUILD:14:8: GoCompilePkg platforms/platforms_test_test.external.a failed: (Exit 1): builder.exe failed: error executing GoCompilePkg command (from target //platforms:platforms_test)
--
  | cd /d C:/b/txumw2c7/execroot/__main__
  | SET CGO_ENABLED=1
  | SET GOARCH=amd64
  | SET GOOS=windows
  | SET GOPATH=
  | SET GOROOT=external/go_sdk
  | SET GOROOT_FINAL=GOROOT
  | SET INCLUDE=C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt
  | SET LIB=C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\lib\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\um\x64
  | SET PATH=;C:/Program Files (x86)/Microsoft Visual Studio/2019/BuildTools/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\VC\VCPackages;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\Tools\;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\Tools\devinit;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\bin\Roslyn;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64;C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\\MSBuild\Current\Bin;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\x64;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Windows\system32
  | SET TEMP=C:\temp
  | SET TMP=C:\temp
  | bazel-out\x64_windows-opt-exec-ST-f007e4e0eadc\bin\external\go_sdk\builder.exe compilepkg -sdk external/go_sdk -installsuffix windows_amd64 -src platforms/platforms.go -src platforms/platforms_test.go -embedroot  -embedroot bazel-out/x64_windows-fastbuild/bin -embedlookupdir platforms -arc github.com/bazelbuild/bazelisk/versions=github.com/bazelbuild/bazelisk/versions=bazel-out/x64_windows-fastbuild/bin/versions/go_default_library.x -arc github.com/hashicorp/go-version=github.com/hashicorp/go-version=bazel-out/x64_windows-fastbuild/bin/external/com_github_hashicorp_go_version/go-version.x -arc github.com/bazelbuild/bazelisk/platforms=github.com/bazelbuild/bazelisk/platforms=bazel-out/x64_windows-fastbuild/bin/platforms/platforms_test.internal.x -importpath github.com/bazelbuild/bazelisk/platforms -p github.com/bazelbuild/bazelisk/platforms_test -package_list bazel-out/x64_windows-opt-exec-ST-f007e4e0eadc/bin/external/go_sdk/packages.txt -o bazel-out/x64_windows-fastbuild/bin/platforms/platforms_test_test.external.a -x bazel-out/x64_windows-fastbuild/bin/platforms/platforms_test_test.external.x -testfilter only -gcflags  -asmflags
  | # Configuration: 2a3cf7208bd3ae1853df083f9823da61b7246f31df8a4ec512381271fcad013d
  | # Execution platform: @local_config_platform//:host
  | compilepkg: could not create directory for _empty.go: mkdir C:\b\txumw2c7\execroot\__main__\bazel-out\x64_windows-fastbuild\bin\platforms\github.com_bazelbuild_bazelisk_platforms: Cannot create a file when that file already exists.

It's possible that file deletion sometimes just fails and that leaves behind _empty.go files from previous invocations. Although in this case there really should be sources...

@fmeum
Collaborator

fmeum commented May 19, 2023

Turns out that every test triggers this logic since there is a separate compile action for the external test package <name>_test, which is often empty.
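One way to make such a step tolerant of leftovers is to clear any stale directory before recreating it. The sketch below shows that idea under stated assumptions: `freshDir` and `entriesAfterFresh` are hypothetical helpers, and this is not necessarily what the fix in #3566 does. It also does nothing about two actions racing on the same path at the same time.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// freshDir removes whatever is left at dir from an earlier run and then
// recreates it empty. os.RemoveAll returns nil if dir does not exist.
func freshDir(dir string) error {
	if err := os.RemoveAll(dir); err != nil {
		return err
	}
	return os.MkdirAll(dir, 0o700)
}

// entriesAfterFresh simulates a stale directory that still contains
// _empty.go, calls freshDir on it, and returns how many entries remain.
func entriesAfterFresh() (int, error) {
	parent, err := os.MkdirTemp("", "emptygo-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(parent)

	dir := filepath.Join(parent, "example_com_pkg")
	if err := os.Mkdir(dir, 0o700); err != nil {
		return 0, err
	}
	// Leave a stale placeholder behind, as a failed cleanup would.
	if err := os.WriteFile(filepath.Join(dir, "_empty.go"), []byte("package empty\n"), 0o600); err != nil {
		return 0, err
	}

	if err := freshDir(dir); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := entriesAfterFresh()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("entries left after freshDir:", n)
}
```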

fmeum added a commit to fmeum/bazelisk that referenced this issue May 19, 2023
Duplicated `go_tests` can result in compilation actions failing due to bazelbuild/rules_go#3558.
fweikert pushed a commit to bazelbuild/bazelisk that referenced this issue May 19, 2023
Duplicated `go_tests` can result in compilation actions failing due to bazelbuild/rules_go#3558.