Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I/O exception during sandboxed execution: input dependency could not be checked for modifications during execution #15340

Closed
lunch-glide-pepper opened this issue Apr 25, 2022 · 11 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug

Comments

@lunch-glide-pepper
Copy link

Description of the bug:

My build fails with I/O exception during sandboxed execution: input dependency <path> could not be checked for modifications during execution. The error seems to be about a symlink inside a tree artifact, with the symlink pointing to a directory inside the same tree artifact, and only happens when using --experimental_use_hermetic_linux_sandbox.

Using bazel 5.1.1 (official binaries), and also reproduced with recent master at 4e6153e, on Ubuntu 20.04.

Simplified reproduction below.
I'm sure that I'm doing a number of questionable things that can be achieved in better ways, but the error looks more like a bug than a deliberate error message, so I figured I'd open an issue.

#!/bin/bash


cat << 'EOF' > .bazelrc
build --experimental_use_hermetic_linux_sandbox
build --experimental_use_sandboxfs
build --spawn_strategy=linux-sandbox
EOF


cat << 'EOF' > BUILD
load(":defs.bzl", "custom_python_binary", "generate_sum", "python_virtualenv")

python_virtualenv(
    name = "virtualenv",
    output = "env",
)

custom_python_binary(
    name = "add_numbers",
    main = "add_numbers.py",
    venv = "virtualenv",
)

generate_sum(
    name = "two_plus_two",
    a = 2,
    b = 2,
    output = "result.txt",
)
EOF


cat << 'EOF' > defs.bzl
# buildifier: disable=module-docstring
def _python_virtualenv_impl(ctx):
    test_directory = ctx.actions.declare_directory(ctx.attr.output)

    ctx.actions.run_shell(
        command = " && ".join([
            'mkdir -p "$DIR_PATH/lib/python3.8/site-packages"',
            'echo "add = lambda a, b: a + b" > "$DIR_PATH/lib/python3.8/site-packages/addition.py"',
            'ln -s lib "$DIR_PATH/lib64"',
            'echo "Change this to trigger rebuild: 1" > /dev/null',
        ]),
        outputs = [test_directory],
        env = {"DIR_PATH": test_directory.path},
    )

    return [DefaultInfo(files = depset([test_directory]))]

python_virtualenv = rule(
    implementation = _python_virtualenv_impl,
    attrs = {
        "output": attr.string(mandatory = True),
    },
)

def _custom_python_binary_impl(ctx):
    output_file = ctx.actions.declare_file(ctx.label.name)

    ctx.actions.write(output_file, """#!/bin/bash
set -Eeuo pipefail
BASE_PATH="$(dirname "${BASH_SOURCE[0]}")"
export PYTHONPATH="$BASE_PATH/env/lib/python3.8/site-packages"
exec python3 -B -s -S "%MAIN_FILE_PATH%" "$@"
""".replace("%MAIN_FILE_PATH%", ctx.file.main.path), is_executable = True)

    return [
        DefaultInfo(
            files = depset(direct = [output_file, ctx.file.main] + ctx.files.venv),
            runfiles = ctx.runfiles(files = [ctx.file.main] + ctx.files.venv),
            executable = output_file,
        ),
    ]

custom_python_binary = rule(
    implementation = _custom_python_binary_impl,
    attrs = {
        "main": attr.label(allow_single_file = [".py"], mandatory = True),
        "venv": attr.label(allow_files = True, mandatory = True),
    },
    executable = True,
)

def _generate_sum_impl(ctx):
    output_file = ctx.actions.declare_file(ctx.attr.output)

    ctx.actions.run_shell(
        inputs = ctx.files.tool,
        command = " && ".join([
            '"$TOOL_PATH" $A $B > "$OUTPUT_PATH"',
        ]),
        outputs = [output_file],
        env = {
            "A": str(ctx.attr.a),
            "B": str(ctx.attr.b),
            "TOOL_PATH": ctx.executable.tool.path,
            "OUTPUT_PATH": output_file.path,
        },
    )

    return [DefaultInfo(files = depset(direct = [output_file]))]

generate_sum = rule(
    implementation = _generate_sum_impl,
    attrs = {
        "a": attr.int(mandatory = True),
        "b": attr.int(mandatory = True),
        "output": attr.string(mandatory = True),
        "tool": attr.label(executable = True, cfg = "target", default = Label("//:add_numbers")),
    },
)
EOF


cat << 'EOF' > add_numbers.py
import sys
from addition import add

a = int(sys.argv[1])
b = int(sys.argv[2])
sum = add(a, b)
print(sum)
EOF


cat << 'EOF' > WORKSPACE
EOF


bazel clean --expunge
bazel build :two_plus_two

Output:

user@localhost:~/path$ ./bug_report.sh 
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:two_plus_two (4 packages loaded, 9 targets configured).
INFO: Found 1 target...
ERROR: /home/user/path/BUILD:14:13: Action result.txt failed: I/O exception during sandboxed execution: input dependency /home/user/.cache/bazel/_bazel_user/c45d20c83f728d794469b8036365b9b4/execroot/__main__/bazel-out/k8-fastbuild/bin/env/lib64 could not be checked for modifications during execution.
Target //:two_plus_two failed to build
INFO: Elapsed time: 2.627s, Critical Path: 0.07s
INFO: 4 processes: 3 internal, 1 linux-sandbox.
FAILED: Build did NOT complete successfully

Workaround that makes the build work (and also works in the real codebase that this example is derived from), but I have no idea what this actually does (based on 4e6153e):

diff --git a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java
index ffb7a5f3b7..eb075319e1 100644
--- a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java
+++ b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java
@@ -24,6 +24,7 @@ import com.google.devtools.build.lib.actions.ExecException;
 import com.google.devtools.build.lib.actions.ExecutionRequirements;
 import com.google.devtools.build.lib.actions.FileArtifactValue;
 import com.google.devtools.build.lib.actions.FileContentsProxy;
+import com.google.devtools.build.lib.actions.FileStateType;
 import com.google.devtools.build.lib.actions.ForbiddenActionInputException;
 import com.google.devtools.build.lib.actions.Spawn;
 import com.google.devtools.build.lib.actions.Spawns;
@@ -391,6 +392,11 @@ final class LinuxSandboxedSpawnRunner extends AbstractSandboxSpawnRunner {
       }
 
       FileArtifactValue metadata = context.getMetadataProvider().getMetadata(input);
+
+      if (metadata.getType() == FileStateType.DIRECTORY) {
+        continue;
+      }
+
       Path path = execRoot.getRelative(input.getExecPath());
 
       try {

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

No response

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added type: bug untriaged team-Local-Exec Issues and PRs for the Execution (Local) team labels Apr 26, 2022
@meisterT meisterT added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Apr 26, 2022
@meisterT meisterT assigned larsrc-google and unassigned meisterT Apr 26, 2022
@larsrc-google
Copy link
Contributor

Looping in @frazze-jobb who wrote the hermetic sandbox code.

@larsrc-google
Copy link
Contributor

Leaving out --experimental_use_sandboxfs doesn't change this but reduces the problem space.

@ulrfa
Copy link
Contributor

ulrfa commented Oct 17, 2022

@frazze-jobb is on another project now. I worked with him on this sandbox and answer instead.

Thanks for reporting this issue.

I will try to find time to investigate this and planning to create test cases for a few related special cases.

@SimonEbner
Copy link

FYI, I'm encountering the same problem in the context of a hermtic cc_toolchain using a symlink.
Minimal reproducible example:

#!/bin/bash
set -exuo pipefail

# Create files for bazel
touch src.cc
mkdir a
ln -s a b

touch WORKSPACE
cat > BUILD << EOF
load("toolchain.bzl", "gcc_toolchain")

cc_binary(
    name = "test",
    srcs = ["src.cc"],
)

cc_toolchain_suite(
    name = "my_toolchain",
    toolchains = {"k8": "dummy_toolchain"},
)

filegroup(
    name = "compiler_files",
    srcs = ["b"],
)

gcc_toolchain("dummy_toolchain")
EOF

cat > toolchain.bzl << EOF
def _impl(ctx):
    crosstool_proto = cc_common.create_cc_toolchain_config_info(
        ctx = ctx,
        toolchain_identifier = "cc-compiler-" + ctx.label.name,
        host_system_name = "x86_64-linux-gnu",
        target_system_name = "linux",
        target_cpu = "k8",
        target_libc = "unknown-glibc",
        compiler = "gcc",
    )
    return [crosstool_proto]

gcc_toolchain_configuration = rule(
    attrs = {},
    provides = [CcToolchainConfigInfo],
    implementation = _impl,
)

def gcc_toolchain(name):
    gcc_toolchain_configuration(name = name + "_configuration")

    native.cc_toolchain(
        name = name,
        toolchain_config = name + "_configuration",
        all_files = "//:compiler_files",
        compiler_files = "//:compiler_files",
        dwp_files = "//:compiler_files",
        linker_files = "//:compiler_files",
        objcopy_files = "//:compiler_files",
        strip_files = "//:compiler_files",
        supports_param_files = True,
        supports_header_parsing = True,
    )
EOF

Trying to build the binary via

bazel build //:test --crosstool_top=//:my_toolchain --experimental_use_hermetic_linux_sandbox

results in

ERROR: .../BUILD:3:10: Compiling src.cc failed: I/O exception during sandboxed execution: input dependency /home/.../de7a35bfd5cefbaf6a88e91201c016c4/execroot/__main__/b could not be checked for modifications during execution.

@larsrc-google
Copy link
Contributor

So the command-line output is not that useful, but looking into the server log I see this stack trace which shows a UnsupportedOperationException at FileArtifactValue.java:352 - it's trying to get a ContentsProxy of a directory, which isn't supported. Despite what FileArtifactValue may think, it's not always an actual file. So the modification check should take appropriate action on things that are not files.

com.google.devtools.build.lib.skyframe.ActionExecutionFunction$ActionExecutionFunctionException: com.google.devtools.build.lib.actions.AlreadyReportedActionEx
ecutionException: I/O exception during sandboxed execution: input dependency /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/b91bdcc9c8a35d79096b4ba9
b5794356/execroot/__main__/b could not be checked for modifications during execution.
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:368)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:169)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:590)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
Caused by: com.google.devtools.build.lib.actions.AlreadyReportedActionExecutionException: I/O exception during sandboxed execution: input dependency /usr/loca
l/google/home/larsrc/.cache/bazel/_bazel_larsrc/b91bdcc9c8a35d79096b4ba9b5794356/execroot/__main__/b could not be checked for modifications during execution.
        ... 7 more
Caused by: com.google.devtools.build.lib.actions.UserExecException: I/O exception during sandboxed execution: input dependency /usr/local/google/home/larsrc/.
cache/bazel/_bazel_larsrc/b91bdcc9c8a35d79096b4ba9b5794356/execroot/__main__/b could not be checked for modifications during execution.
        at com.google.devtools.build.lib.sandbox.AbstractSandboxSpawnRunner.exec(AbstractSandboxSpawnRunner.java:97)
        at com.google.devtools.build.lib.sandbox.SandboxModule$SandboxFallbackSpawnRunner.exec(SandboxModule.java:494)
        at com.google.devtools.build.lib.exec.SpawnRunner.execAsync(SpawnRunner.java:245)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:146)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:108)
        at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:47)
        at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:68)
        at com.google.devtools.build.lib.rules.cpp.CppCompileAction.beginExecution(CppCompileAction.java:1454)
        at com.google.devtools.build.lib.actions.Action.execute(Action.java:133)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:907)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1076)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionR
unner.run(SkyframeActionExecutor.java:1031)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:492)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:856)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:349)
        ... 6 more
Caused by: java.io.IOException: input dependency /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/b91bdcc9c8a35d79096b4ba9b5794356/execroot/__main__/b could not be checked for modifications during execution.
        at com.google.devtools.build.lib.sandbox.LinuxSandboxedSpawnRunner.checkForConcurrentModifications(LinuxSandboxedSpawnRunner.java:399)
        at com.google.devtools.build.lib.sandbox.LinuxSandboxedSpawnRunner.verifyPostCondition(LinuxSandboxedSpawnRunner.java:380)
        at com.google.devtools.build.lib.sandbox.AbstractSandboxSpawnRunner.runSpawn(AbstractSandboxSpawnRunner.java:136)
        at com.google.devtools.build.lib.sandbox.AbstractSandboxSpawnRunner.exec(AbstractSandboxSpawnRunner.java:92)
        ... 22 more
Caused by: java.lang.UnsupportedOperationException
        at com.google.devtools.build.lib.actions.FileArtifactValue$DirectoryArtifactValue.getContentsProxy(FileArtifactValue.java:352)
        at com.google.devtools.build.lib.sandbox.LinuxSandboxedSpawnRunner.checkForConcurrentModifications(LinuxSandboxedSpawnRunner.java:395)
        ... 25 more

@larsrc-google
Copy link
Contributor

Looking at the possible subclasses of FileArtifactValue, they mostly don't support the operations needed for the modification check. @frazze-jobb we'll have to figure out which kinds of inputs to ignore and which to handle properly. Here are the direct subclasses:

google3/third_party/bazel/src/main/java/com/google/devtools/build/lib/actions/ActionCacheChecker.java
851: private static final class ConstantMetadataValue extends FileArtifactValue
google3/third_party/bazel/src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java
332: private static final class DirectoryArtifactValue extends FileArtifactValue {
399: private static final class HashedDirectoryArtifactValue extends FileArtifactValue {
462: private static final class RegularFileArtifactValue extends FileArtifactValue {
549: public static class RemoteFileArtifactValue extends FileArtifactValue {
718: public static final class UnresolvedSymlinkArtifactValue extends FileArtifactValue {
758: public static final class InlineFileArtifactValue extends FileArtifactValue {
832: public static final class SourceFileArtifactValue extends FileArtifactValue {
895: private static final class SingletonMarkerValue extends FileArtifactValue implements Singleton {
940: private static final class OmittedFileValue extends FileArtifactValue implements Singleton {
google3/third_party/bazel/src/main/java/com/google/devtools/build/lib/skyframe/TreeArtifactValue.java
210: private static final class TreeArtifactCompositeFileArtifactValue extends FileArtifactValue {

@SimonEbner
Copy link

Thanks for looking into this. Not that it would help much, but leaving this smaller minimal example here for sake of simplicity

#!/bin/bash
touch WORKSPACE
mkdir a
ln -s a b

cat > BUILD << EOF
genrule(
    name = "target",
    srcs = [":b"],
    outs = ["dummy"],
    cmd = "ls $< > \$@",
)
EOF
bazel build //:target --experimental_use_hermetic_linux_sandbox

@larsrc-google
Copy link
Contributor

That's a nice small repro. It can also be done with just a directory as an input, so both directory and symlink handling needs to be fixed.

@ulrfa
Copy link
Contributor

ulrfa commented Nov 8, 2022

The purpose of checkForConcurrentModifications is to check that an action has not changed the content of hardlinked files outside of the sandbox. HardlinkedSandboxedSpawn.hardLinkRecursive is resolving symbolic links and are only creating hardlinks for files, not directories. Therefor it should be safe to skip the concurrent modification check when metadata.getType().isFile() is false. That should address the use cases in this ticket reliably.

However, then the following subclasses of FileArtifactValue, mentioned by @larsrc-google, remains:

  • ConstantMetadataValue
  • InlineFileArtifactValue
  • RemoteFileArtifactValue
  • SourceFileArtifactValue

I don't see any good way to handle them. Are they rare? As workaround, should we catch UnsupportedOperationException and log a warning each time it occurs? (That could be up to one warning per action.)

@lunch-glide-pepper
Copy link
Author

Thanks for the fix!

I don't know anything about Bazel release policies - is this fix likely to make it to a 6.x release?

@fmeum
Copy link
Collaborator

fmeum commented Nov 23, 2022

@lunch-glide-pepper You can use the @bazel-io flag command in a GitHub comment to mark it for cherry-picking and then create a PR with the commit cherry-picked on release-6.0.0.

lunch-glide-pepper pushed a commit to lunch-glide-pepper/bazel that referenced this issue Nov 23, 2022
…asses

Avoid exceptions from linux hermetic sandbox for unsupported FileArtifactValue subclasses.

Also adds test cases to confirm no regression of existing functionality.

Fixes bazelbuild#15340

Closes bazelbuild#16739.

Change-Id: I0f1373f6f99328b8277fe1cec87d3946b83481c1
PiperOrigin-RevId: 490490477
lunch-glide-pepper pushed a commit to lunch-glide-pepper/bazel that referenced this issue Nov 23, 2022
…asses

Avoid exceptions from linux hermetic sandbox for unsupported FileArtifactValue subclasses.

Also adds test cases to confirm no regression of existing functionality.

Fixes bazelbuild#15340

Closes bazelbuild#16739.

Change-Id: I0f1373f6f99328b8277fe1cec87d3946b83481c1
PiperOrigin-RevId: 490490477
ShreeM01 added a commit that referenced this issue Dec 1, 2022
…asses (#16830)

Avoid exceptions from linux hermetic sandbox for unsupported FileArtifactValue subclasses.

Also adds test cases to confirm no regression of existing functionality.

Fixes #15340

Closes #16739.

Change-Id: I0f1373f6f99328b8277fe1cec87d3946b83481c1
PiperOrigin-RevId: 490490477

Co-authored-by: Ulrik Falklof <[email protected]>
Co-authored-by: kshyanashree <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants