Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--incompatible_sandbox_hermetic_tmp breaks outputs materialized as a symlinks to source artifacts #21215

Closed
tjgq opened this issue Feb 6, 2024 · 29 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug

Comments

@tjgq
Copy link
Contributor

tjgq commented Feb 6, 2024

Originally reported by @tyler-french in #20886 (but seems to be a separate issue).

Minimal repro (Linux):

.bazelversion

7.0.2

.bazelrc

build --noenable_bzlmod

repo.bzl

def _impl(rctx):
  rctx.file("file.txt", "hello")
  rctx.file("BUILD", """exports_files(["file.txt"])""")

repo = repository_rule(_impl)

WORSKPACE

load(":repo.bzl", "repo")
repo(name= "repo")

BUILD

genrule(
  name = "gen",
  outs = ["file.txt"],
  srcs = ["@repo//:file.txt"],
  cmd = "ln -s $< $@",
)

Result:

ERROR: /usr/local/google/home/tjgq/proj/dangling-symlink-repro/BUILD:1:8: declared output 'file.txt' is a dangling symbolic link
ERROR: /usr/local/google/home/tjgq/proj/dangling-symlink-repro/BUILD:1:8: Executing genrule //:gen failed: not all outputs were created or valid

I believe this is because we copy (actually, move) the symlink as-is out of the sandbox, so it ends up pointing to a location like /tmp/bazel-source-roots/1/file.txt, which only makes sense in the context of the sandbox.

Things to keep in mind when fixing this:

  • Make sure that symlinks to the main repo (as opposed to an external repo) work as well
  • Make sure that an indirection through multiple symlinks works as well
@tjgq tjgq added team-Local-Exec Issues and PRs for the Execution (Local) team type: bug untriaged labels Feb 6, 2024
@tjgq tjgq changed the title Hermetic sandbox breaks outputs materialized as a symlinks to source artifacts --incompatible_sandbox_hermetic_tmp breaks outputs materialized as a symlinks to source artifacts Feb 6, 2024
@oquenchil oquenchil added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Feb 6, 2024
@oquenchil oquenchil self-assigned this Feb 6, 2024
@oquenchil oquenchil added P1 I'll work on this now. (Assignee required) potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone and removed P2 We'll consider working on this in future. (Assignee optional) labels Feb 6, 2024
@fmeum
Copy link
Collaborator

fmeum commented Feb 6, 2024

@bazel-io fork 7.1.0

@bazel-io bazel-io removed the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Feb 6, 2024
@fmeum
Copy link
Collaborator

fmeum commented Feb 6, 2024

Make sure that an indirection through multiple symlinks works as well

A tricky edge case: The output may be declared as an unresolved symlink (ctx.actions.declare_symlink), so we would not want to follow all symlinks.

Maybe we need to follow symlinks exactly until after the last one that points under one of the sandbox roots under /tmp?

@tjgq
Copy link
Contributor Author

tjgq commented Feb 6, 2024

If it's an unresolved symlink, I'm not even sure we should do any path transformation at all. That has the potential to break cases where the symlink target path is textually fixed.

(If one intentionally wants a declare_symlink to point to some other file in the build, it should be a declare_file/declare_directory instead; we've gone to great lengths to optimize for that use case, so there's no reason to make things unnecessarily non-hermetic.)

@lberki
Copy link
Contributor

lberki commented Feb 12, 2024

Is the attitude that this is WAI viable?

On the practical side, I'd rather have less code than more code and following symlinks and magically replacing /tmp/bazel-* paths on any intermediate paths during symlink resolution sounds quite error-prone.

On the theoretical side, the whole point of sandboxing is that what the action does and what it results in shouldn't be a function of the location of the output base, the location of the source trees or whether the action is sandboxed. And arguably the result of ln -s hard-codes said locations.

I do realize that the stance of "thou shalt always copy your inputs to outputs" may harm performance, though.

(For the record, SandboxHelpers.moveOutputs() does exactly what @tjgq suspects: it just calls FileSystemUtils.moveFile(), which in turn calls rename(). It also warns against the harms of copying, though, which is a counter-argument against what I said above.

@fmeum
Copy link
Collaborator

fmeum commented Feb 12, 2024

I started out firmly in camp "WAI", but my opinion has changed to "we should perform this magic replacement".

If it were just for the ln -s case, I would agree that we could reasonably expect users to just not do this.

But the original repro by @tyler-french in #20886 (comment) makes a more convincing case for this being a "Bazel bug": When using the sandbox, Bazel hands the user inputs in a form (symlinks pointing under /tmp/bazel-*) that it would not accept as outputs (e.g. produced via the innocuous cp -r applied to the inputs). "Thou shalt always copy your inputs to outputs while resolving all symlinks" is a much harder sell :-)

@lberki
Copy link
Contributor

lberki commented Feb 13, 2024

Meh. I can't put up any good argument against yours, but then we'll have to gear up for implementing symlink resolution yet another time :(

@Wyverald
Copy link
Member

just to confirm, are we still planning to make progress on this for 7.1.0? It's looking like we'll cut the first RC without fixing this anyway, but if this is being worked on, we can still get the fix into later RCs.

@lberki
Copy link
Contributor

lberki commented Feb 22, 2024

Were we? I didn't realize that this was slated for 7.1.0 and there is no indication on this bug to that effect. (cc @oquenchil @meteorcloudy )

@fmeum
Copy link
Collaborator

fmeum commented Feb 22, 2024

Were we? I didn't realize that this was slated for 7.1.0 and there is no indication on this bug to that effect. (cc @oquenchil @meteorcloudy )

I forked it for 7.1.0 in #21215 (comment), which was after @oquenchil marked it as P1 and "potential release blocker". The bazel-io bot removes this label after it has created the fork, which may have caused confusion. Maybe we could introduce a dedicated label for "marked for cherry-pick" issues that remains after they have been forked?

Happy to help fix this, it looks like the last real blocker to more widespread hermetic tmp adoption.

@oquenchil
Copy link
Contributor

Having to fix every output symlink seems costly. It will be costly for every build since we have to scan outputs of every action to see if they are symlinks and it will be costly in terms of maintenance.

An alternative is the following order of bind mounting which I think should fix this issue (even for builds with source root and exec root under /tmp):

  • $SANDBOX/{unique-integer-per-action}/root/bazel-working-dir this will contain the symlink tree and will be the cwd of the action
  • mkdir $SANDBOX/{unique-integer-per-action}/root this will be where we mount every top level directory of the real system root except /tmp
  • real system root /{anything-other-than-tmp} gets mounted at $SANDBOX/{unique-integer-per-action}/root/{anything-other-than-tmp}
  • real system root /tmp gets mounted at $SANDBOX/{unique-integer-per-action}/_tmp
  • every real system root /tmp/{dir} gets mounted at $SANDBOX/{unique-integer-per-action}/_tmp/{dir} (This is to make sure this solution still works for source roots and execroots under /tmp/)
  • $SANDBOX/{unique-integer-per-action}/root gets mounted at /
  • cwd of the action is /bazel-working-dir
  • mkdir /tmp
  • $SANDBOX/{unique-integer-per-action}/_tmp/ gets mounted at /tmp

That's not ideal either. Still costly and hard to maintain.

I think the whole implementation of hermetic tmp should be replaced with (handwave) a single line of code:

mount -t overlay  overlay  -olowerdir=/tmp/,upperdir=/tmp/{unique-per-action},workdir=/tmp/_mount_point  /tmp

I tried locally on the terminal and it works as expected. The current process will see any files it writes to /tmp but after the action is finished and /tmp is unmounted, the files won't be there. Only that process can see the files that it itself writes to /tmp.

Files from the system /tmp will still be visible by all actions under /tmp which is a change from the current behavior. From #19915, /tmp being empty doesn't seem like a requirement. From the description of the flag, the new behavior would be equivalent to adding --sandbox_add_mount_pair=/tmp.

Please let me know what you all think and I can send a change adding this.

@fmeum
Copy link
Collaborator

fmeum commented Feb 24, 2024

I tried locally on the terminal and it works as expected. The current process will see any files it writes to /tmp but after the action is finished and /tmp is unmounted, the files won't be there. Only that process can see the files that it itself writes to /tmp.
Files from the system /tmp will still be visible by all actions under /tmp which is a change from the current behavior.

This actually seems much better than the current semantics as it is less of a breaking change (e.g. Docker sockets under /tmp can still be used without having sockets created by the action pollute the host /tmp). If it's also simpler and resolves the current issue, I'm 100% for this solution. Great find!

@tjgq
Copy link
Contributor Author

tjgq commented Feb 24, 2024

I like this idea too! But to be clear, the new behavior of --incompatible_sandbox_hermetic_tmp wouldn't be exactly the same as --sandbox_add_mount_pair=/tmp, right? AIUI, the latter not only makes the system /tmp visible to sandboxes, but also makes the files created by a sandbox visible to other sandboxes; the last part is what we're actually trying to protect against.

What about older Linux kernels where mount -t overlay is still a privileged operation? Will we auto-detect or require users to manually override the flag?

@fmeum
Copy link
Collaborator

fmeum commented Feb 24, 2024

What about older Linux kernels where mount -t overlay is still a privileged operation? Will we auto-detect or require users to manually override the flag?

Could we test this as part of the one-time "sandboxing available" check? Requiring a manual override for a default wouldn't be a great user experience. But I also don't know what the market share of affected kernel versions is.

With the overlay approach, there could theoretically still be issues with concurrent Java compilations, but only if there is a concurrent one on the host with a PID of 12, which is presumably too low to occur in practice (unless the concurrent action is also sandboxed in a similar way). I think that's acceptable.

@fmeum
Copy link
Collaborator

fmeum commented Feb 24, 2024

https://docs.kernel.org/filesystems/overlayfs.html#changes-to-underlying-filesystems sounds as if this scheme may not be safe when there are concurrent accesses to /tmp.

@fmeum
Copy link
Collaborator

fmeum commented Feb 26, 2024

https://docs.kernel.org/filesystems/overlayfs.html#changes-to-underlying-filesystems sounds as if this scheme may not be safe when there are concurrent accesses to /tmp.

@oquenchil and I discussed this offline and found the thread at https://lore.kernel.org/all/CA+ZH+jFBAaRi6VPmf3PBdDZQQMOaT6WUByATqaw1QL5M9+-dxg@mail.gmail.com/, which makes it sound as if overlayfs would be fine to use in practice under the weaker assumption that there are no concurrent changes to the subtree of /tmp containing files that the sandboxed process is supposed to read (e.g. the output base under /tmp). Since any "clever" (the bad kind) symlink algorithm or bind mount scheme would also have the potential for breakages, I would actually prefer the overlayfs approach even without a formal correctness guarantee.

If anyone here knows anyone who happens to know more about kernel FS implementations, that would of course be very helpful. :-)

@oquenchil
Copy link
Contributor

Perfect, thank you Fabian. I'd go with the overlayfs solution when we detect sourceroot and execroot are under /tmp but would take a simpler code path when we don't detect that. In these cases a simple bind mount over /tmp (maybe even mounting tmpfs over /tmp by default) should work.

The behavior between the two would be different since for builds with source-execroot under /tmp, the system /tmpwould be visible but for the other case it would start as empty. What do you think? Even though we go on the assumption that overlayfs works as we expect until proven otherwise, it might make sense to reduce its usage if we can avoid it when we are not building under /tmp.

@lberki The commit 8e32f44 affected even builds without source-execroot under /tmp. Do you think it's fine to take a different code path in those cases?

@fmeum
Copy link
Collaborator

fmeum commented Feb 27, 2024

In these cases a simple bind mount over /tmp (maybe even mounting tmpfs over /tmp by default) should work.

A memory-backed tmpfs for /tmp is nice when it works, but I don't think it would be a good default as tests can use GBs of tmp storage, which may overwhelm the host. But a simple bind mount as originally introduced for --incompatible_sandbox_hermetic_tmp.

it might make sense to reduce its usage if we can avoid it when we are not building under /tmp.

I see pros and cons here: If we are relatively certain that overlayfs is safe and it actually is in 99% of all cases but has issues in the remaining 1%, we would have a much easier time identifying those cases if all builds used the feature, not just those building under /tmp.

@lberki
Copy link
Contributor

lberki commented Feb 27, 2024

I don't want to take sides here (y'all have collectively way more brain cells engaged on this problem that I do), however, I would like to point out two things:

  1. I don't think checking every output file is that problematic, because actions usually have very few output files and those are usually not symbolic links. Also, most of the time, doing what's essentially readlink -f with a twist one a file will be negligible compared to the effort of generating that file.
  2. Are we sure we want to add a dependency on overlayfs? Sure, we can do feature detection, but between a very pedestrian "resolve symlinks" and depending on some fancy kernel feature, the former looks much less risky.

@fmeum
Copy link
Collaborator

fmeum commented Feb 27, 2024

Instead of resolving symlinks with a twist, could we perform the symlink resolution in the sandbox binary with the mounts still present? This would require passing the output file paths to the sandbox though.

@oquenchil
Copy link
Contributor

I'd say that fixing symlinks being simple in most cases is not a strong argument. The current implementation of hermetic tmp also worked most of the time.

What if we don't have just one output and what if they are unpredictable because they are inside a tree artifact? We might fix this issue now and get it to work with extra code but I'd strongly argue in favor of simplifying all this to avoid having to fix bugs in this area as often as we are doing now. I think we should try the overlayfs solution first.

@tjgq
Copy link
Contributor Author

tjgq commented Feb 27, 2024

I really don't like the complexity entailed by resolving symlinks; it's always more complicated than it seems.

What if, for example, an output is materialized as a chain of symlinks that jump back and forth between the source and output trees an arbitrary number of times? Should we preserve the full symlink chain, or just replace it with a symlink to the final target?

Similarly, what should we do with unresolved symlinks? Earlier in this thread, I claimed that we should copy the symlink out of the sandbox verbatim. I'm no longer sure that would work, because I've since learned that rules_js uses unresolved symlinks pointing to the source tree; I can't say I like that particular design decision, but we're likely stuck with supporting it. But, on the other hand, I'm not convinced that rewriting always works (even if it tolerates dangling symlinks).

If overlayfs can be made to work, I'd still prefer it (Lukács: we're already fully bought into Linuxisms thanks to filesystem namespaces; IMO adding overlayfs doesn't make things significantly worse). If we can't, I still think using the mount structure described #21215 (comment), as complex as it sounds, is better than resolving symlinks.

(Also, I fully agree with Fabian that, whichever solution we pick, we should use it unconditionally instead of special-casing "source root is under /tmp"; a seldom-used code path makes bugs more likely.)

@oquenchil
Copy link
Contributor

Ok, so let's try overlayfs for both cases then. I will not be working the next 3 days. Can this wait till next week?

@Wyverald
Copy link
Member

Wyverald commented Mar 7, 2024

What's the status of this?

We just cut 7.1.0rc2 today, which after a bit of baking should be good to release next Monday. But @fmeum notified me that this issue might be more than just a soft blocker.

@oquenchil are you still working on this? What is the timeframe you think it'll be ready by? If it's before EOW, I think we could fit another RC in, but otherwise I'd rather not delay 7.1.0 forever as Chrome OS has been asking for it.

Alternatively, Fabian mentioned that we could roll back some commits made in 7.0.0 ("@lberki made the flag flip work for builds with output base under /tmp, but that required some bind mount magic that is now haunting us"). What do y'all think about the feasibility of that?

@meteorcloudy
Copy link
Member

If this isn't an easy fix. I'd prefer we don't rush it and block the 7.1.0 release. We can make patch releases later for remaining fixes, and users do get affected can use --noincompatible_sandbox_hermetic_tmp to workaround?

@Wyverald
Copy link
Member

gentle ping -- we could still cherry-pick this into 7.1.1 (tentative release date next Wednesday) if it gets a fix soon, otherwise this will need to wait until 7.2.0.

@oquenchil
Copy link
Contributor

This can wait till 7.2.0.

@fmeum fmeum assigned fmeum and unassigned oquenchil Apr 12, 2024
@fmeum
Copy link
Collaborator

fmeum commented Apr 12, 2024

I took over from @oquenchil and have an overlayfs-based prototype that works pretty well - except when you have something mounted under /tmp. I don't yet know whether this can be worked around easily.

fmeum added a commit to fmeum/bazel that referenced this issue Apr 15, 2024
Sandboxing tests should always run with Bazel defaults, which include no tmpfs path.

Work towards bazelbuild#21215
copybara-service bot pushed a commit that referenced this issue Apr 17, 2024
Sandboxing tests should always run with Bazel defaults, which include no tmpfs path.

Work towards #21215

Closes #22002.

PiperOrigin-RevId: 625615279
Change-Id: If4146f04effeaabc1eb22d38cc5ac32247759c8c
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Apr 17, 2024
Sandboxing tests should always run with Bazel defaults, which include no tmpfs path.

Work towards bazelbuild#21215

Closes bazelbuild#22002.

PiperOrigin-RevId: 625615279
Change-Id: If4146f04effeaabc1eb22d38cc5ac32247759c8c
github-merge-queue bot pushed a commit that referenced this issue Apr 17, 2024
Sandboxing tests should always run with Bazel defaults, which include no
tmpfs path.

Work towards #21215

Closes #22002.

PiperOrigin-RevId: 625615279
Change-Id: If4146f04effeaabc1eb22d38cc5ac32247759c8c

Commit
5086f65

Co-authored-by: Fabian Meumertzheim <[email protected]>
@keertk
Copy link
Member

keertk commented Apr 29, 2024

Hi all, is this still on track for 7.2? We're aiming to create the first RC on 5/13.

@fmeum
Copy link
Collaborator

fmeum commented May 1, 2024

Yes, I'm working on this and am positive that I'll have something ready in time.

Kila2 pushed a commit to Kila2/bazel that referenced this issue May 13, 2024
Sandboxing tests should always run with Bazel defaults, which include no tmpfs path.

Work towards bazelbuild#21215

Closes bazelbuild#22002.

PiperOrigin-RevId: 625615279
Change-Id: If4146f04effeaabc1eb22d38cc5ac32247759c8c
fmeum added a commit to fmeum/bazel that referenced this issue May 16, 2024
The bind mounting scheme used with the Linux sandbox' hermetic `/tmp` feature is modified to preserve all paths as they are outside the sandbox, which removes the need to rewrite paths when staging inputs into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just like any user-specified bind mount under `/tmp`: They are mounted under the hermetic tmp directory with their path relativized against `/tmp` before the hermetic tmp directory is mounted as `/tmp` as the final step.

There is one caveat compared to user-specified mounts: Source roots, which may themselves not lie under `/tmp`, can be symlinks to directories under `/tmp` (e.g., when they arise from a `local_repository`). To handle this situation in the common case, all parent directories of package path entries (up to direct children of `/tmp`) are mounted into the sandbox. If users use `local_repository`s with fixed target paths under `/tmp`, they will need to specify `--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but ultimately doesn't seem to work for this use case since its `lowerpath`, which would be `/tmp`, is not allowed to have child mounts from a different user namespace (see https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts). However, this is exactly the situation created by a Bazel-in-Bazel test and can also arise if the user has existing mounts under `/tmp` when using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their tests:
* bazelbuild@bf6ebe9
* bazelbuild@fb6658c
* bazelbuild@bc1d9d3
* bazelbuild@1829883
* bazelbuild@70691f2
* bazelbuild@a556969
* bazelbuild@8e32f44 (had its test lost in an incorrect merge conflict resolution, this PR adds it back)

Fixes bazelbuild#20533
Work towards bazelbuild#20753
Fixes bazelbuild#21215
Fixes bazelbuild#22117
Fixes bazelbuild#22226
Fixes bazelbuild#22290

RELNOTES: Paths in the Linux sandbox are now again identical to those outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes bazelbuild#22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61
github-merge-queue bot pushed a commit that referenced this issue May 16, 2024
The bind mounting scheme used with the Linux sandbox' hermetic `/tmp`
feature is modified to preserve all paths as they are outside the
sandbox, which removes the need to rewrite paths when staging inputs
into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just
like any user-specified bind mount under `/tmp`: They are mounted under
the hermetic tmp directory with their path relativized against `/tmp`
before the hermetic tmp directory is mounted as `/tmp` as the final
step.

There is one caveat compared to user-specified mounts: Source roots,
which may themselves not lie under `/tmp`, can be symlinks to
directories under `/tmp` (e.g., when they arise from a
`local_repository`). To handle this situation in the common case, all
parent directories of package path entries (up to direct children of
`/tmp`) are mounted into the sandbox. If users use `local_repository`s
with fixed target paths under `/tmp`, they will need to specify
`--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but
ultimately doesn't seem to work for this use case since its `lowerpath`,
which would be `/tmp`, is not allowed to have child mounts from a
different user namespace (see
https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts).
However, this is exactly the situation created by a Bazel-in-Bazel test
and can also arise if the user has existing mounts under `/tmp` when
using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their
tests:
*
bf6ebe9
*
fb6658c
*
bc1d9d3
*
1829883
*
70691f2
*
a556969
*
8e32f44
(had its test lost in an incorrect merge conflict resolution, this PR
adds it back)

Fixes #20533
Work towards #20753
Fixes #21215
Fixes #22117
Fixes #22226
Fixes #22290

RELNOTES: Paths in the Linux sandbox are now again identical to those
outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes #22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61

Fixes #22291
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants