Bazel's own integration tests fail locally on Linux #20753
Comments
CC @lberki |
Let me see if I can just remove that flag from the test bazelrc. Now that we have hermetic |
Note to self: |
I bet it's the recent reshuffling of
And my guess is that this is because |
Bingo! There is a sinister comment here: bazel/src/main/java/com/google/devtools/build/lib/sandbox/AbstractSandboxSpawnRunner.java Line 432 in fb4f63f
which indicates that 'll probably have to do more digging, sigh. |
That comment probably references this code: bazel/src/main/java/com/google/devtools/build/lib/sandbox/AbstractContainerizingSandboxedSpawn.java Line 121 in fb4f63f
and apparently that's how it creates Arguably, Grump squared. |
Update: I have a fix for this, but I still see widespread breakages when I remove There are still a few test cases in |
The test failure in the coverage integration test looks similar to this as of now unresolved Bazel 7.0.0 regression: #20556 |
I had some scraps of time to track down One lesson is that the presence of For that test case in particular, one could argue that it's WAI: it tests that builds requiring files under the There are a number of other test cases that I think fail for the same reason and Android builds are hopelessly broken. I haven't been able to figure out how because the little scraps of time I had didn't allow me to put together an Android SDK and NDK good enough for Bazel today. |
In addition to
I'm not sure if the right approach here is to special-case |
This is necessary because the paths of those directories are different when seen by Bazel and by the processes within the sandbox, and the sandbox interprets paths to writable directories as being within the sandbox. This is notably the case for $TEST_TMPDIR.

The reason why this worked at all is that the $TEST_TMPDIR that Bazel passes to the test is relative to the working directory (it's absolutized in the test wrapper script).

Progress on #20753.

RELNOTES: None.
PiperOrigin-RevId: 596566851
Change-Id: Ifb56a3016a521b6a0cd4b5700172951d6feabddf
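The point about the relative $TEST_TMPDIR can be illustrated with a small sketch. It is only an illustration in Python, not the test wrapper script itself: the helper name `absolutize` and both working-directory paths are hypothetical. It shows that one relative value resolves to different absolute paths depending on the working directory it is resolved against, which is why a relative value kept working even though absolute paths differ inside and outside the sandbox.

```python
import posixpath


def absolutize(test_tmpdir: str, working_dir: str) -> str:
    """Resolve a possibly relative TEST_TMPDIR against a working directory.

    An absolute `test_tmpdir` is returned unchanged (apart from normalization).
    """
    return posixpath.normpath(posixpath.join(working_dir, test_tmpdir))


# Hypothetical values, for illustration only.
relative_tmpdir = "_tmp/abc123"
outside_cwd = "/home/user/.cache/bazel/execroot/_main"
inside_cwd = "/tmp/bazel-working-directory/_main"

# The same relative value yields a different absolute path in each view.
print(absolutize(relative_tmpdir, outside_cwd))
print(absolutize(relative_tmpdir, inside_cwd))
```

An absolute path computed outside the sandbox would not have this property, which matches the commit's observation that writable directories need their paths translated.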
…ata map. This happens for action templates. The reason is that the Linux sandbox needs to know where the tree file artifact is materialized to, which information is stored in its parent.

The reason for making this a flag was performance, but it's only at most one tree artifact per action, it's a constant time operation per tree artifact and it only happens rarely (on templated actions) so I think this time, simplicity is better than performance.

The surprising change to ActionInputMapHelper was needed because before this change, the tree artifact was a discovered input and therefore was processed by the code path in ActionExecutionFunction.addDiscoveredInputs(), which populated archivedTreeArtifacts but addToMap() did not. It was presumably the bug.

The depOwner argument of putTreeArtifact may also be wrong there. It's at least inconsistent with the other two call sites, but I don't want to add more polish at some risk to this change so I decided to not add it to addArchivedTreeArtifactMaybe().

Progress towards #20753.

RELNOTES: None.
PiperOrigin-RevId: 596586625
Change-Id: Ib690a84508da07560ec376d4e87ed8fb2211979f
wrt. Bazel 7.0.1, I propose we do the following:
If (2) doesn't make any more skeletons fall out from the closet, we call the fire drill done. If it does, well, that's unfortunate. @meteorcloudy WDYT? |
@bazel-io fork 7.0.1 |
@bazel-io fork 7.1.0 |
Yet Another issue: it looks like deploy jar actions don't quite work. The symptom is:
Quick debugging shows that the action input
After adding My guess is that this is an instance of the problem that the integration tests of Bazel do the very thing which |
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
A thousand thanks and a bhattleship :) |
I know that there are still failures, that's why this bug is still open, but AFAICT they are not due to bugs in Bazel, but due to bugs in the test battery. That particular test is fixed by replacing an |
Related bug: #21190. Disabling |
Temporarily work around #20753 to unblock bazelbuild/java_tools#87.

Setting `--sandbox_tmpfs_path=/tmp` for Linux matches [the current behavior of bazelci.py](https://github.com/bazelbuild/continuous-integration/blob/7a8d90d15520b81e0f330a85772c5416a04d0061/buildkite/bazelci.py#L1976).

PiperOrigin-RevId: 604341263
Change-Id: I37fe324afe4328d861b06fc64a03e82cc55de38f
The bind mounting scheme used with the Linux sandbox's hermetic `/tmp` feature is modified to preserve all paths as they are outside the sandbox, which removes the need to rewrite paths when staging inputs into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just like any user-specified bind mount under `/tmp`: They are mounted under the hermetic tmp directory with their path relativized against `/tmp` before the hermetic tmp directory is mounted as `/tmp` as the final step.

There is one caveat compared to user-specified mounts: Source roots, which may themselves not lie under `/tmp`, can be symlinks to directories under `/tmp` (e.g., when they arise from a `local_repository`). To handle this situation in the common case, all parent directories of package path entries (up to direct children of `/tmp`) are mounted into the sandbox. If users use `local_repository`s with fixed target paths under `/tmp`, they will need to specify `--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but ultimately doesn't seem to work for this use case since its `lowerdir`, which would be `/tmp`, is not allowed to have child mounts from a different user namespace (see https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts). However, this is exactly the situation created by a Bazel-in-Bazel test and can also arise if the user has existing mounts under `/tmp` when using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).
This replaces and mostly reverts the following commits, but keeps their tests:
* bf6ebe9
* fb6658c
* bc1d9d3
* 1829883
* 70691f2
* a556969
* 8e32f44 (had its test lost in an incorrect merge conflict resolution, this PR adds it back)

Fixes #20533
Work towards #20753
Fixes #21215
Fixes #22117
Fixes #22226
Fixes #22290

RELNOTES: Paths in the Linux sandbox are now again identical to those outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes #22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61
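The relativization step in this commit message (a path under `/tmp` is mounted at the same relative location under the hermetic tmp directory, which is then itself mounted as `/tmp`) can be sketched as follows. This is a minimal illustration in Python, not Bazel's actual Java implementation; the function name `mount_target_in_hermetic_tmp` and all example paths are hypothetical.

```python
from pathlib import PurePosixPath

TMP = PurePosixPath("/tmp")


def mount_target_in_hermetic_tmp(path: str, hermetic_tmp: str) -> PurePosixPath:
    """Return where a path under /tmp would be bind-mounted before the final step.

    Hypothetical helper: relativizes `path` against /tmp and re-roots it under
    `hermetic_tmp`. Once `hermetic_tmp` is mounted as /tmp, the content becomes
    visible at the original `path` again, so no path rewriting is needed.
    """
    p = PurePosixPath(path)
    if TMP not in p.parents:
        raise ValueError(f"{path} is not under /tmp")
    return PurePosixPath(hermetic_tmp) / p.relative_to(TMP)


# Hypothetical output base under /tmp and hypothetical hermetic tmp directory.
target = mount_target_in_hermetic_tmp(
    "/tmp/bazel-out-base/execroot", "/sandbox/42/_hermetic_tmp"
)
print(target)
```

Because the relative part of the path is preserved, a process inside the sandbox sees the directory at exactly the path it has outside, which is the property the RELNOTES entry describes.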
Another commit with the same message additionally carries the trailer `Fixes #22291`.
Description of the bug:
Many of Bazel's own integration tests fail locally on Linux with an error such as the following:
This appears to be due to a combination of the outer Bazel process now running with `--incompatible_sandbox_hermetic_tmp` and the inner Bazel process running with `--sandbox_tmpfs_path=/tmp` due to `bazel/src/test/shell/testenv.sh.tmpl` (line 261 in 055e25b).
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
bazel test //src/test/shell/bazel:bazel_rules_cc_test --test_output=errors
Which operating system are you running Bazel on?
Linux
What is the output of `bazel info release`?
No response
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response