Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Genrules in 6.0.0-pre.20220112.2 occasionally try to run shared libraries built by go_binary #14626

Closed
phst opened this issue Jan 22, 2022 · 13 comments · Fixed by bazelbuild/rules_go#3108

Comments

@phst
Copy link
Contributor

phst commented Jan 22, 2022

Description of the problem / feature request:

It looks like starting with the newest rolling release (6.0.0-pre.20220112.2) genrules occasionally try to execute shared libraries instead of binaries (which obviously fails).

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Apologies for not providing a more minimal example, but it seems that this issue is a bit subtle. It should be reproducible like this:

  1. Clone https://github.com/phst/emacs and check out commit 0cc0712941a0620a38f290b04de6c6d7314fba69 (most likely the precise commit doesn't matter)
  2. Try to build the repository with the newest rolling release, e.g. using Bazelisk and USE_BAZEL_VERSION=6.0.0-pre.20220112.2 bazelisk build -j 1 -s //...
  3. This will result in an error like
$ USE_BAZEL_VERSION=6.0.0-pre.20220112.2 bazelisk build -j 1 -s //...
Starting local Bazel server and connecting to it...
INFO: Analyzed 21 targets (201 packages loaded, 18517 targets configured).
INFO: Found 21 targets...
[…]
SUBCOMMAND: # @emacs_module_header_master//:gen_header [action 'Executing genrule @emacs_module_header_master//:gen_header', configuration: 8960923b9e7dc13418be101268efd8e57d80283213d18174705345598b699c6b, execution platform: @local_config_platform//:host]
(cd /home/phst/.cache/bazel/_bazel_phst/f2d3c0c57843c32aaa8cc9f44a3dc5a4/execroot/com_github_phst_emacs && \
  exec env - \
    PATH=/home/phst/.cache/bazelisk/downloads/bazelbuild/bazel-6.0.0-pre.20220112.2-linux-x86_64/bin:/home/phst/.local/bin:/home/phst/bin:/home/phst/go/bin:/home/phst/.cask/bin:/home/phst/.local/bin:/home/phst/bin:/home/phst/go/bin:/home/phst/.cask/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/k8-opt-exec-2B5CBBC6/bin/genheader_/genheader --template=external/emacs_module_header_master/emacs-module.h.in --output=bazel-out/k8-fastbuild/bin/external/emacs_module_header_master/emacs-module.h -- external/emacs_module_header_master/module-env-25.h external/emacs_module_header_master/module-env-26.h external/emacs_module_header_master/module-env-27.h external/emacs_module_header_master/module-env-28.h')
# Configuration: 8960923b9e7dc13418be101268efd8e57d80283213d18174705345598b699c6b
# Execution platform: @local_config_platform//:host
SUBCOMMAND: # @emacs_module_header_master//:gen_header [action 'Executing genrule @emacs_module_header_master//:gen_header', configuration: 3493caf1e8e99fb13b159a0d045be4e79e2d96bbadc15963937346a69bbd7c42, execution platform: @local_config_platform//:host]
(cd /home/phst/.cache/bazel/_bazel_phst/f2d3c0c57843c32aaa8cc9f44a3dc5a4/execroot/com_github_phst_emacs && \
  exec env - \
    PATH=/home/phst/.cache/bazelisk/downloads/bazelbuild/bazel-6.0.0-pre.20220112.2-linux-x86_64/bin:/home/phst/.local/bin:/home/phst/bin:/home/phst/go/bin:/home/phst/.cask/bin:/home/phst/.local/bin:/home/phst/bin:/home/phst/go/bin:/home/phst/.cask/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/k8-opt-exec-2B5CBBC6-ST-a88bc2bcbeb3/bin/genheader_/libgenheader.so --template=external/emacs_module_header_master/emacs-module.h.in --output=bazel-out/k8-fastbuild-ST-a88bc2bcbeb3/bin/external/emacs_module_header_master/emacs-module.h -- external/emacs_module_header_master/module-env-25.h external/emacs_module_header_master/module-env-26.h external/emacs_module_header_master/module-env-27.h external/emacs_module_header_master/module-env-28.h')
# Configuration: 3493caf1e8e99fb13b159a0d045be4e79e2d96bbadc15963937346a69bbd7c42
# Execution platform: @local_config_platform//:host
ERROR: /home/phst/.cache/bazel/_bazel_phst/f2d3c0c57843c32aaa8cc9f44a3dc5a4/external/emacs_module_header_master/BUILD.bazel:33:8: Executing genrule @emacs_module_header_master//:gen_header failed: (Segmentation fault): bash failed: error executing command (from target @emacs_module_header_master//:gen_header) /bin/bash -c ... (remaining 1 argument skipped)

Use --sandbox_debug to see verbose messages from the sandbox
INFO: Elapsed time: 13.244s, Critical Path: 2.43s
INFO: 6 processes: 4 internal, 2 linux-sandbox.
FAILED: Build did NOT complete successfully

Note how Bazel for the second action attempts to execute a shared library (libgenheader.so), which segfaults.

This doesn't happen with the previous rolling release (try USE_BAZEL_VERSION=6.0.0-pre.20220105.5 bazelisk build -j 1 -s //...). My guess is that it's related to "genrule switched to use exec transition instead of host" in https://github.com/bazelbuild/bazel/blob/master/CHANGELOG.md#release-600-pre202201122-2022-01-19. Changing $(location) to $(execpath) in the genrule command doesn't fix the issue.

Maybe this is also an issue with rules_go, but I didn't find anything obviously wrong when checking the implementation of go_binary.

What operating system are you running Bazel on?

GNU/Linux, recent Debian testing

What's the output of bazel info release?

$ USE_BAZEL_VERSION=6.0.0-pre.20220112.2 bazelisk info release
Starting local Bazel server and connecting to it...
release 6.0.0-pre.20220112.2

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

$ git remote get-url origin ; git rev-parse master ; git rev-parse HEAD
https://github.com/phst/emacs
0cc0712941a0620a38f290b04de6c6d7314fba69
0cc0712941a0620a38f290b04de6c6d7314fba69

Have you found anything relevant by searching the web?

No

@aiuto aiuto added team-Local-Exec Issues and PRs for the Execution (Local) team untriaged and removed team-Local-Exec Issues and PRs for the Execution (Local) team labels Jan 23, 2022
@aiuto
Copy link
Contributor

aiuto commented Jan 23, 2022

Is this only true for Go, or does it happen for other langauges?
Have you tried reporting in rules_go?

@fmeum
Copy link
Collaborator

fmeum commented Jan 23, 2022

Using cquery, you can see that your build depends on //:genheader in two configurations with the following diff:

FragmentOptions com.google.devtools.build.lib.analysis.config.CoreOptions {
  affected by starlark transition: [@io_bazel_rules_go//go/config:linkmode], []
}
FragmentOptions user-defined {
  @io_bazel_rules_go//go/config:linkmode: c-shared, null
}

I'm pretty sure that what you are seeing is essentially a (much more concerning) variant of bazelbuild/rules_go#2999: The bin_name go_binary in your emacs_module macro causes rules_go to transition on linkmode, but that transition is not reset before it transitively depends on //:master, which in turn depends on the genrule. This causes the genrule tool binary to be built with linkmode = "c-shared", which causes the issue you are seeing. This didn't happen when genrule built its tools in the host configuration as that one appears to swallow all transitions.

If I set linkmode = "normal" (the default, unless transitions already changed it) explicitly on //:genheader, the build passes for me.

@phst Could you open an issue in rules_go? I started to work on resolving these kinds of issues in bazelbuild/rules_go#3005, but at that point didn't notice that it could also have implications for correctness and not just performance.

fmeum added a commit to fmeum/rules_go that referenced this issue Apr 9, 2022
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
@fmeum
Copy link
Collaborator

fmeum commented Apr 9, 2022

@phst I just finished the work on bazelbuild/rules_go#3108, which should address this issue on a fundamental level. Could you take a look and verify that it fixes this issue (I did, but better to be sure)?

fmeum added a commit to fmeum/rules_go that referenced this issue Apr 11, 2022
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
phst added a commit to phst/emacs that referenced this issue Apr 14, 2022
@phst
Copy link
Contributor Author

phst commented Apr 14, 2022

Maybe I'm doing something wrong, but with that PR I always get the error message

error: file '//go/private/rules:transition.bzl' does not contain symbol 'go_reset_target'

See e.g. https://github.com/phst/emacs/runs/6029654171

@phst
Copy link
Contributor Author

phst commented Apr 14, 2022

I've now filed bazelbuild/rules_go#3120 so that this is properly tracked in rules_go.

@fmeum
Copy link
Collaborator

fmeum commented Apr 14, 2022

Maybe I'm doing something wrong, but with that PR I always get the error message

error: file '//go/private/rules:transition.bzl' does not contain symbol 'go_reset_target'

See e.g. https://github.com/phst/emacs/runs/6029654171

I think that the build is failing because you are using the first (bazelbuild/rules_go@9fc6d45) and not the last (bazelbuild/rules_go@efd118d) commit of bazelbuild/rules_go#3108. Could you give it another try with the last commit?

phst added a commit to phst/emacs that referenced this issue Apr 14, 2022
@phst
Copy link
Contributor Author

phst commented Apr 14, 2022

Ah, good catch, I messed up the order of the commits.

@fmeum
Copy link
Collaborator

fmeum commented Apr 14, 2022

@phst https://github.com/phst/emacs/runs/6030260629?check_suite_focus=true is an interesting failure: It looks like my PR ends up reenabling nogo for a Go tool dependency, which is certainly not intended. I will look into this.

phst added a commit to phst/emacs that referenced this issue Apr 14, 2022
@phst
Copy link
Contributor Author

phst commented Apr 14, 2022

This might be another instance of bazelbuild/rules_go#2513, though I'm wondering why Bazel 5 isn't affected.

@fmeum
Copy link
Collaborator

fmeum commented Apr 15, 2022

The dependency chain from //:buildifier_test to the go_library that fails the nogo check is:

//:buildifier_test (d9cf67d)
@com_github_bazelbuild_buildtools//buildifier:buildifier (c3d5957)
@com_github_bazelbuild_buildtools//buildifier:go_default_library (c3d5957)
@com_github_bazelbuild_buildtools//warn:go_default_library (c3d5957)
@com_github_bazelbuild_buildtools//edit:go_default_library (c3d5957)
@com_github_bazelbuild_buildtools//api_proto:go_default_library (c3d5957)
@com_github_bazelbuild_buildtools//api_proto:api_proto_go_proto (c3d5957)
@io_bazel_rules_go//proto:go_proto (c3d5957)
@org_golang_google_protobuf//types/known/timestamppb:timestamppb (c3d5957)
@org_golang_google_protobuf//runtime/protoimpl:protoimpl (c3d5957)
@org_golang_google_protobuf//internal/impl:impl (c3d5957)
@org_golang_google_protobuf//internal/encoding/tag:tag (c3d5957)
@org_golang_google_protobuf//internal/encoding/defval:defval (c3d5957)
@org_golang_google_protobuf//internal/encoding/text:text (c3d5957)

The last config does not have the bootstrap_nogo setting set to True, so nogo is not disabled. With Bazel 5, all of the configs are the HOST config, which prevents further transitions that could enable nogo.

All in all, I think that this is a separate issue and not caused (but also not prevented) by my changes. It would be very helpful if build settings had a build_setting_exec_default, which could be used here to disable nogo automatically when transitioning to the exec configuration rather than putting the onus on rules_go to apply custom transitions on all exec edges.

@gregestren Have there been any considerations to allow Starlark build settings to provide defaults for the exec configuration? In Java, fragments have getHost() to implement custom logic and make good use of it, but a build_setting_exec_default that, if set, overrides the current value of a setting when switching to the exec config, would already be very helpful.

@gregestren
Copy link
Contributor

Yes, but that's in pre-design: we know we'd like a better APi than hard-coded getHost() but haven't committed time to what that API would look like yet.

fmeum added a commit to fmeum/rules_go that referenced this issue May 5, 2022
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
fmeum added a commit to fmeum/rules_go that referenced this issue May 5, 2022
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
fmeum added a commit to fmeum/rules_go that referenced this issue May 6, 2022
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
linzhp pushed a commit to bazelbuild/rules_go that referenced this issue May 9, 2022
* Trim transitioned Go settings on non-Go dependencies

By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.

* Verify transition hermeticity for go_{binary,library,test}

The test verifies that Go settings such as gotags and linkmode do not
affect dependencies through attributes other than deps.
@fmeum
Copy link
Collaborator

fmeum commented May 10, 2022

bazelbuild/rules_go#3120 has been fixed, so this issue can be closed.

@ckolli5
Copy link

ckolli5 commented May 27, 2022

Hey @phst, With respect to the above comment we are closing this issue. Feel free to reach us if above issue still persists. Thank you.

@ckolli5 ckolli5 closed this as completed May 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants