`--incompatible_sandbox_hermetic_tmp` breaks outputs materialized as symlinks to source artifacts
#21215
Comments
@bazel-io fork 7.1.0 |
A tricky edge case: The output may be declared as an unresolved symlink ( Maybe we need to follow symlinks exactly until after the last one that points under one of the sandbox roots under |
If it's an unresolved symlink, I'm not even sure we should do any path transformation at all. That has the potential to break cases where the symlink target path is textually fixed. (If one intentionally wants a |
Is the attitude that this is WAI viable? On the practical side, I'd rather have less code than more code and following symlinks and magically replacing On the theoretical side, the whole point of sandboxing is that what the action does and what it results in shouldn't be a function of the location of the output base, the location of the source trees or whether the action is sandboxed. And arguably the result of I do realize that the stance of "thou shalt always copy your inputs to outputs" may harm performance, though. (For the record, |
I started out firmly in camp "WAI", but my opinion has changed to "we should perform this magic replacement". If it were just for the But the original repro by @tyler-french in #20886 (comment) makes a more convincing case for this being a "Bazel bug": When using the sandbox, Bazel hands the user inputs in a form (symlinks pointing under |
Meh. I can't put up any good argument against yours, but then we'll have to gear up for implementing symlink resolution yet another time :( |
just to confirm, are we still planning to make progress on this for 7.1.0? It's looking like we'll cut the first RC without fixing this anyway, but if this is being worked on, we can still get the fix into later RCs. |
Were we? I didn't realize that this was slated for 7.1.0 and there is no indication on this bug to that effect. (cc @oquenchil @meteorcloudy ) |
I forked it for 7.1.0 in #21215 (comment), which was after @oquenchil marked it as P1 and "potential release blocker". The Happy to help fix this, it looks like the last real blocker to more widespread hermetic tmp adoption. |
Having to fix every output symlink seems costly. It will be costly for every build since we have to scan outputs of every action to see if they are symlinks and it will be costly in terms of maintenance. An alternative is the following order of bind mounting which I think should fix this issue (even for builds with source root and exec root under
That's not ideal either. Still costly and hard to maintain. I think the whole implementation of hermetic tmp should be replaced with (handwave) a single line of code:
I tried locally on the terminal and it works as expected. The current process will see any files it writes to Files from the system Please let me know what you all think and I can send a change adding this. |
This actually seems much better than the current semantics as it is less of a breaking change (e.g. Docker sockets under |
I like this idea too! But to be clear, the new behavior of What about older Linux kernels where |
Could we test this as part of the one-time "sandboxing available" check? Requiring a manual override for a default wouldn't be a great user experience. But I also don't know what the market share of affected kernel versions is. With the overlay approach, there could theoretically still be issues with concurrent Java compilations, but only if there is a concurrent one on the host with a PID of 12, which is presumably too low to occur in practice (unless the concurrent action is also sandboxed in a similar way). I think that's acceptable. |
https://docs.kernel.org/filesystems/overlayfs.html#changes-to-underlying-filesystems sounds as if this scheme may not be safe when there are concurrent accesses to |
@oquenchil and I discussed this offline and found the thread at https://lore.kernel.org/all/CA+ZH+jFBAaRi6VPmf3PBdDZQQMOaT6WUByATqaw1QL5M9+-dxg@mail.gmail.com/, which makes it sound as if overlayfs would be fine to use in practice under the weaker assumption that there are no concurrent changes to the subtree of If anyone here knows anyone who happens to know more about kernel FS implementations, that would of course be very helpful. :-) |
Perfect, thank you Fabian. I'd go with the overlayfs solution when we detect sourceroot and execroot are under The behavior between the two would be different since for builds with source-execroot under @lberki The commit 8e32f44 affected even builds without source-execroot under |
A memory-backed tmpfs for
I see pros and cons here: If we are relatively certain that overlayfs is safe and it actually is in 99% of all cases but has issues in the remaining 1%, we would have a much easier time identifying those cases if all builds used the feature, not just those building under |
I don't want to take sides here (y'all have collectively way more brain cells engaged on this problem than I do), however, I would like to point out two things:
|
Instead of resolving symlinks with a twist, could we perform the symlink resolution in the sandbox binary with the mounts still present? This would require passing the output file paths to the sandbox though. |
I'd say that fixing symlinks being simple in most cases is not a strong argument. The current implementation of hermetic tmp also worked most of the time. What if we don't have just one output and what if they are unpredictable because they are inside a tree artifact? We might fix this issue now and get it to work with extra code but I'd strongly argue in favor of simplifying all this to avoid having to fix bugs in this area as often as we are doing now. I think we should try the overlayfs solution first. |
I really don't like the complexity entailed by resolving symlinks; it's always more complicated than it seems. What if, for example, an output is materialized as a chain of symlinks that jump back and forth between the source and output trees an arbitrary number of times? Should we preserve the full symlink chain, or just replace it with a symlink to the final target? Similarly, what should we do with unresolved symlinks? Earlier in this thread, I claimed that we should copy the symlink out of the sandbox verbatim. I'm no longer sure that would work, because I've since learned that rules_js uses unresolved symlinks pointing to the source tree; I can't say I like that particular design decision, but we're likely stuck with supporting it. But, on the other hand, I'm not convinced that rewriting always works (even if it tolerates dangling symlinks). If overlayfs can be made to work, I'd still prefer it (Lukács: we're already fully bought into Linuxisms thanks to filesystem namespaces; IMO adding overlayfs doesn't make things significantly worse). If we can't, I still think using the mount structure described in #21215 (comment), as complex as it sounds, is better than resolving symlinks. (Also, I fully agree with Fabian that, whichever solution we pick, we should use it unconditionally instead of special-casing "source root is under /tmp"; a seldom-used code path makes bugs more likely.) |
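To make the chain question above concrete, here is a minimal sketch in Python (not Bazel's implementation) that follows a path one symlink hop at a time and records every intermediate path. `symlink_chain` and `max_hops` are names made up for this illustration. It only follows links in the final path component and normalizes `..` textually, which is itself one of the ways symlink resolution turns out more complicated than it seems.

```python
import os
from typing import List


def symlink_chain(path: str, max_hops: int = 40) -> List[str]:
    """Return every path visited while following `path` one symlink hop at a time.

    Hypothetical helper for illustration only: it ignores symlinks in
    intermediate directories and collapses ".." textually.
    """
    chain = [path]
    current = path
    hops = 0
    while os.path.islink(current):
        if hops == max_hops:
            raise OSError(f"too many levels of symbolic links: {path}")
        target = os.readlink(current)
        # A relative target is interpreted relative to the link's own directory;
        # an absolute target replaces the path entirely.
        current = os.path.normpath(os.path.join(os.path.dirname(current), target))
        chain.append(current)
        hops += 1
    return chain
```

With the full chain in hand, a caller could either preserve every hop or keep only the last element, which is exactly the choice the comment above asks about.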
Ok, so let's try |
What's the status of this? We just cut 7.1.0rc2 today, which after a bit of baking should be good to release next Monday. But @fmeum notified me that this issue might be more than just a soft blocker. @oquenchil are you still working on this? What is the timeframe you think it'll be ready by? If it's before EOW, I think we could fit another RC in, but otherwise I'd rather not delay 7.1.0 forever as Chrome OS has been asking for it. Alternatively, Fabian mentioned that we could roll back some commits made in 7.0.0 ("@lberki made the flag flip work for builds with output base under /tmp, but that required some bind mount magic that is now haunting us"). What do y'all think about the feasibility of that? |
If this isn't an easy fix, I'd prefer that we neither rush it nor block the 7.1.0 release on it. We can make patch releases later for the remaining fixes, and users who are affected can use |
gentle ping -- we could still cherry-pick this into 7.1.1 (tentative release date next Wednesday) if it gets a fix soon, otherwise this will need to wait until 7.2.0. |
This can wait till 7.2.0. |
I took over from @oquenchil and have an overlayfs-based prototype that works pretty well - except when you have something mounted under |
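Whether anything is already mounted below a given directory can be checked from Linux's `/proc/self/mountinfo`. The following is a minimal sketch, not code from the prototype: `mounts_below` is a hypothetical helper, the directory is passed in as a parameter, and the input is the text of a mountinfo file, whose fifth space-separated field is the mount point with special characters octal-escaped.

```python
import re
from pathlib import PurePosixPath
from typing import List


def mounts_below(mountinfo_text: str, parent: str) -> List[str]:
    """List mount points strictly below `parent` (hypothetical helper)."""
    parent_path = PurePosixPath(parent)
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split(" ")
        if len(fields) < 5:
            continue  # not a well-formed mountinfo line
        # Field 5 is the mount point; spaces etc. appear as \040-style octal escapes.
        mount_point = re.sub(
            r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), fields[4]
        )
        path = PurePosixPath(mount_point)
        if path != parent_path and parent_path in path.parents:
            found.append(mount_point)
    return found
```

On a Linux host this could be called as `mounts_below(open("/proc/self/mountinfo").read(), "/tmp")`.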
Sandboxing tests should always run with Bazel defaults, which include no tmpfs path. Work towards #21215 Closes #22002. PiperOrigin-RevId: 625615279 Change-Id: If4146f04effeaabc1eb22d38cc5ac32247759c8c Commit 5086f65 Co-authored-by: Fabian Meumertzheim <[email protected]>
Hi all, is this still on track for 7.2? We're aiming to create the first RC on 5/13. |
Yes, I'm working on this and am positive that I'll have something ready in time. |
The bind mounting scheme used with the Linux sandbox's hermetic `/tmp` feature is modified to preserve all paths as they are outside the sandbox, which removes the need to rewrite paths when staging inputs into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just like any user-specified bind mount under `/tmp`: They are mounted under the hermetic tmp directory with their path relativized against `/tmp` before the hermetic tmp directory is mounted as `/tmp` as the final step.

There is one caveat compared to user-specified mounts: Source roots, which may themselves not lie under `/tmp`, can be symlinks to directories under `/tmp` (e.g., when they arise from a `local_repository`). To handle this situation in the common case, all parent directories of package path entries (up to direct children of `/tmp`) are mounted into the sandbox. If users use `local_repository`s with fixed target paths under `/tmp`, they will need to specify `--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but ultimately doesn't seem to work for this use case since its `lowerdir`, which would be `/tmp`, is not allowed to have child mounts from a different user namespace (see https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts). However, this is exactly the situation created by a Bazel-in-Bazel test and can also arise if the user has existing mounts under `/tmp` when using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their tests:
* bf6ebe9
* fb6658c
* bc1d9d3
* 1829883
* 70691f2
* a556969
* 8e32f44 (had its test lost in an incorrect merge conflict resolution, this PR adds it back)

Fixes #20533
Work towards #20753
Fixes #21215
Fixes #22117
Fixes #22226
Fixes #22290

RELNOTES: Paths in the Linux sandbox are now again identical to those outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes #22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61

Fixes #22291
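The path relativization described in the commit message above can be sketched as a pure path computation. This is an illustration in Python rather than the actual Bazel code: `hermetic_tmp_mounts` and the example directory names are hypothetical, and the caveat about mounting parent directories of package path entries is not modeled.

```python
from pathlib import PurePosixPath
from typing import List, Tuple

TMP = PurePosixPath("/tmp")


def hermetic_tmp_mounts(hermetic_tmp: str, host_paths: List[str]) -> List[Tuple[str, str]]:
    """Return (source, target) bind mount pairs in mounting order.

    Host paths under /tmp are re-rooted below the hermetic tmp directory.
    Paths outside /tmp get no entry because they are not affected.
    """
    root = PurePosixPath(hermetic_tmp)
    mounts = []
    for raw in host_paths:
        path = PurePosixPath(raw)
        try:
            rel = path.relative_to(TMP)  # path relativized against /tmp
        except ValueError:
            continue  # not under /tmp
        if rel == PurePosixPath("."):
            continue  # /tmp itself is handled by the final mount below
        mounts.append((str(path), str(root / rel)))
    # Final step: the hermetic tmp directory becomes /tmp, so every path
    # mounted below it reappears at its original location.
    mounts.append((str(root), str(TMP)))
    return mounts
```

For example, a source root at `/tmp/ws/src` would be mounted at `<hermetic tmp>/ws/src` and, once the hermetic directory is mounted as `/tmp`, would be visible again as `/tmp/ws/src`, which is why no path rewriting is needed for outputs.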
Originally reported by @tyler-french in #20886 (but seems to be a separate issue).
Minimal repro (Linux):
.bazelversion
.bazelrc
repo.bzl
WORKSPACE
BUILD
Result:
I believe this is because we copy (actually, move) the symlink as-is out of the sandbox, so it ends up pointing to a location like `/tmp/bazel-source-roots/1/file.txt`, which only makes sense in the context of the sandbox.

Things to keep in mind when fixing this:
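The failure mode reported above, where a symlink moved verbatim out of the sandbox keeps a target that only exists inside it, can be imitated outside Bazel. The sketch below is an assumption-laden stand-in, not the actual sandbox: the directory layout and the function name are hypothetical, and a plain symlink alias plays the role of the sandbox-only bind mount.

```python
import os
import shutil
from typing import Tuple


def move_symlink_out_of_sandbox(workdir: str) -> Tuple[str, bool]:
    """Move an output symlink as-is out of a fake sandbox.

    Returns the link's target text after the move and whether it still resolves.
    """
    # Hypothetical layout: the real source directory on the host...
    host_src = os.path.join(workdir, "host", "src")
    # ...and the alias under which it is visible only inside the sandbox.
    sandbox_alias = os.path.join(workdir, "sandbox", "bazel-source-roots", "1")
    out_dir = os.path.join(workdir, "host", "out")
    os.makedirs(host_src)
    os.makedirs(os.path.dirname(sandbox_alias))
    os.makedirs(out_dir)
    with open(os.path.join(host_src, "file.txt"), "w") as f:
        f.write("hello\n")
    os.symlink(host_src, sandbox_alias)

    # The action creates its output as a symlink to the aliased source file.
    sandbox_out = os.path.join(workdir, "sandbox", "out.txt")
    os.symlink(os.path.join(sandbox_alias, "file.txt"), sandbox_out)

    # Move the link itself (not its target), then tear down the sandbox.
    final_out = os.path.join(out_dir, "out.txt")
    os.rename(sandbox_out, final_out)
    shutil.rmtree(os.path.join(workdir, "sandbox"))

    # os.path.exists follows symlinks, so it reports whether the link resolves.
    return os.readlink(final_out), os.path.exists(final_out)
```

After the move the link still names the alias path textually, but nothing exists there once the sandbox is gone, so the output dangles even though the source file is untouched.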