Local fallback not working when remote cache times out. #6219

Closed
ob opened this issue Sep 24, 2018 · 10 comments
Labels: P0 (This is an emergency and more important than other current work. Assignee required), type: bug

@ob
Contributor

ob commented Sep 24, 2018

When the remote cache times out, I would expect the --remote_local_fallback flag to make Bazel fall back to building locally. However, that is not what I'm seeing.

My .bazelrc:

build --spawn_strategy=standalone
build --genrule_strategy=standalone
build --experimental_objc_enable_module_maps
build --features swift.no_generated_module_map
build --symlink_prefix=build/
build --xcode_version 9.4.1
build --remote_local_fallback_strategy=local
build --remote_http_cache=$URL

and I'm building by running:

time bazel build --jobs 128 //:Learning

This is with Bazel built from GitHub as of today (4cba428), but I can also reproduce it with 0.17.1.

The error I get is:

ERROR: /Users/obonilla/r/learning-ios_trunk/Pods/LISemaphoreLib/BUILD.bazel:2:1: C++ compilation of rule '//Pods/LISemaphoreLib:LISemaphoreLib_ObjC' failed: Unexpected IO error.: Exhausted retry attempts (0)
Target //:Learning failed to build
@philwo
Member

philwo commented Sep 25, 2018

We're seeing this on CI a lot, too, and it's causing jobs to fail.

Example:
https://buildkite.com/bazel/google-bazel-presubmit/builds/9435#b624626c-2ed5-442d-9da6-e1fbaf3f7358
Couldn't build file src/bazel: Executing genrule //src:bazel-bin failed: Unexpected IO error.: Exhaused retry attempts (0)

@buchgr We might have to disable remote caching on CI if we can't come up with a fix tomorrow. :(

@philwo philwo added type: bug P0 This is an emergency and more important than other current work. (Assignee required) category: remote execution / caching and removed untriaged labels Sep 25, 2018
@philwo
Member

philwo commented Sep 25, 2018

Ping @ola-rozenfeld - any idea what might cause this and/or how we could fix it?

@ola-rozenfeld
Contributor

Yes, this is my bad, I'm very sorry: this regressed in #5917. Note how in that change (https://github.com/bazelbuild/bazel/pull/5917/files#diff-1bc490926cdfbdaf2b6d5292238dfb97R190) we rethrow all exceptions that are not cache misses, instead of warning on them. I will send a fix right away, sorry! (FYI @werkt)

Btw, local fallback flags are not relevant to local execution mode, since we always execute locally if we execute at all.
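
The difference between rethrowing and warning can be illustrated with a minimal sketch. This is not Bazel's code (Bazel is written in Java); `CacheMiss`, `run_action`, `remote_lookup`, and `execute_locally` are hypothetical names chosen only for illustration. The sketch assumes that a timeout or connection problem surfaces as an `OSError`, which is how Python models such errors.

```python
import warnings


class CacheMiss(Exception):
    """Hypothetical: the remote cache simply has no entry for this action."""


def run_action(action_key, remote_lookup, execute_locally):
    """Return (result, source), where source is 'remote' or 'local'."""
    try:
        return remote_lookup(action_key), "remote"
    except CacheMiss:
        # An ordinary miss: nothing to report, just build locally.
        pass
    except OSError as err:
        # TimeoutError and ConnectionError are OSError subclasses.
        # Warn instead of rethrowing, so the build can still proceed.
        warnings.warn(f"remote cache failure for {action_key}: {err}")
    return execute_locally(action_key), "local"
```

With this shape, a timed-out lookup produces one warning and a locally built result, while rethrowing in the `OSError` branch would abort the action the way the reported error does.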

@ola-rozenfeld ola-rozenfeld self-assigned this Sep 26, 2018
@werkt
Contributor

werkt commented Sep 26, 2018

It is up to the Bazel team to decide which classes of exceptions should not be silently ignored and funnelled into cache misses. To that end, part of the decision could be delegated to the user: ask them to specify whether timeouts, unavailable caches, auth errors, etc. are all accepted as cache misses, or only some configured set of them, and with what noisiness.
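
One way to picture such a user-configured policy is the sketch below. It is purely illustrative and does not correspond to any existing Bazel flag or API; `Failure`, `Noise`, and `MissPolicy` are hypothetical names, and the default set of failures treated as misses is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Failure(Enum):
    TIMEOUT = "timeout"
    UNAVAILABLE = "unavailable"
    AUTH = "auth"


class Noise(Enum):
    SILENT = 0
    WARN = 1


@dataclass(frozen=True)
class MissPolicy:
    # Assumed default: transient problems count as misses, auth errors do not.
    as_miss: frozenset = frozenset({Failure.TIMEOUT, Failure.UNAVAILABLE})
    noise: Noise = Noise.WARN

    def decide(self, failure):
        """Return 'fail', 'miss', or 'miss+warn' for one failure class."""
        if failure not in self.as_miss:
            return "fail"
        return "miss+warn" if self.noise is Noise.WARN else "miss"
```

For example, `MissPolicy().decide(Failure.AUTH)` returns `"fail"`, while a policy built with `as_miss=frozenset(Failure)` and `noise=Noise.SILENT` quietly turns every failure class into a miss.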

@ola-rozenfeld
Contributor

> silently ignored

Not silently: it did print a warning before. I think that's still the right thing to do -- if the user sees lots of warnings, they will realize the remote cache is not working at all (they will also see it in the final count of cache hits). I'm not sure how we could allow more fine-grained control.

@ob
Contributor Author

ob commented Sep 26, 2018

What I think would help is some statistics at the end of the build that can be tracked. They should include remote cache failures. I think the number of cache hits doesn't really convey the right information. If I see a sudden drop in cache hits, was it because the code changed a lot? Maybe the toolchain changed? I would rather see remote cache failures explicitly listed.
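
As a rough sketch of what such end-of-build statistics could look like, the snippet below counts failures separately from hits and misses. `RemoteCacheStats` and its summary format are hypothetical and are not Bazel's actual output.

```python
from collections import Counter


class RemoteCacheStats:
    """Hypothetical per-build counters for remote cache outcomes."""

    OUTCOMES = ("hit", "miss", "failure")

    def __init__(self):
        self.counts = Counter()

    def record(self, outcome):
        if outcome not in self.OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        self.counts[outcome] += 1

    def summary(self):
        c = self.counts
        return (f"remote cache: {c['hit']} hits, "
                f"{c['miss']} misses, {c['failure']} failures")
```

A drop in hits paired with a rise in failures then points at the cache itself, whereas a drop in hits paired with a rise in misses points at changed inputs such as code or toolchain.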

@ola-rozenfeld
Copy link
Contributor

I agree, let's file a separate feature request for a better, unified approach to tracking the various remote failures (caching, execution, BEP). Currently, we do different things for all three.

@bayareabear

@ola-rozenfeld, how is the remote cache timeout determined or set? I see my cache server answered the GET request with "200", but I still hit "Unexpected IO error.: Exhausted retry attempts"

aehlig pushed a commit that referenced this issue Sep 28, 2018
…ad or upload, only warn. Added more tests.

TESTED=new unit tests
RELNOTES: Fix regression #6219, remote cache failures
PiperOrigin-RevId: 214614941
@ola-rozenfeld
Contributor

Are you running with my patch? Can you please run with --verbose_failures?
The timeout is set by the --remote_timeout option; the default is one minute.

aehlig pushed a commit that referenced this issue Oct 1, 2018
@ixdy
Contributor

ixdy commented Oct 2, 2018

It looks like a potential fix was cherry-picked into the 0.18 RCs. Is a fixed 0.17 patch release likely?

dslomov pushed a commit that referenced this issue Oct 5, 2018
bazel-io pushed a commit that referenced this issue Oct 15, 2018
Baseline: c062b1f

Cherry picks:

   + 2834613:
     Include also ext jars in the bootclasspath jar.
   + 2579b79:
     Fix toolchain_java9 on --host_javabase=<jdk9> after
     7eb9ea1
   + faaff7f:
     Release notes: fix markdown
   + b073a18:
     Fix NestHost length computation Fixes #5987
   + bf6a63d:
     Fixes #6219. Don't rethrow any remote cache failures on either
     download or upload, only warn. Added more tests.
   + c1a7b4c:
     Fix broken IdlClassTest on Bazel's CI.
   + 71926bc:
     Fix the Xcode version detection which got broken by the upgrade
     to Xcode 10.0.
   + 86a8217:
     Temporarily restore processing of workspace-wide tools/bazel.rc
     file.

General changes

- New [bazelrc file list](https://docs.bazel.build/versions/master/user-manual.html#where-are-the-bazelrc-files).
  If you need to keep both the old and new lists of .rc files active
  concurrently to support multiple versions of Bazel, you can import the old
  file location into the new list using `try-import`. This imports a file if it
  exists and silently exits if it does not. You can use this method to account
  for a user file that may or may not exist.

- [.bazelignore](https://docs.bazel.build/versions/master/user-manual.html#.bazelignore)
  is now fully functional.

- The startup flag `--host_javabase` has been renamed to
  `--server_javabase` to avoid confusion with the build flag
  `--host_javabase`.

Android

- The Android resource processing pipeline now supports persistence
  via worker processes. Enable it with
  `--persistent_android_resource_processor`. We have observed a 50% increase
  in build speed for clean local builds and up to 150% increase in build
  speed for incremental local builds.

C++

- In-memory package //tools/defaults has been removed (controlled by
  `--incompatible_disable_tools_defaults_package` flag). Please see
  [migration instructions](https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-inmemory-tools-defaults-package)
  and migrate soon, the flag will be flipped in Bazel 0.19, and the legacy
  behavior will be removed in Bazel 0.20.

- Late bound option defaults (a typical example was the `--compiler` flag: when
  it was not specified, its value was computed using the CROSSTOOL) are removed
  (controlled by `--incompatible_disable_late_bound_option_defaults` flag).
  Please see [migration instructions](https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-late-bound-option-defaults)
  and migrate soon, the flag will be flipped in Bazel 0.19, and the legacy
  behavior will be removed in Bazel 0.20.

- Depsets are no longer accepted in `user_compile_flags` and `user_link_flags`
  in the C++ toolchain API (controlled by the
  `--incompatible_disable_depset_in_cc_user_flags` flag). This affects C++ users.
  Please see [migration instructions](https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-depsets-in-c-toolchain-api-in-user-flags)
  and migrate soon, the flag will be flipped in Bazel 0.19, and the legacy
  behavior will be removed in Bazel 0.20.

- CROSSTOOL is no longer consulted when selecting C++ toolchain (controlled by
  `--incompatible_disable_cc_toolchain_label_from_crosstool_proto` flag).
  Please see [migration instructions](https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disallow-using-crosstool-to-select-the-cc_toolchain-label)
  and migrate soon, the flag will be flipped in Bazel 0.19, and the legacy behavior will be removed in Bazel 0.20.

- You can now use [`toolchain_identifier` attribute](857d466)
  on `cc_toolchain` to pair it with CROSSTOOL toolchain.

- C++ specific Make variables
  are no longer passed from the `CppConfiguration`, but from the C++ toolchain
  (controlled by `--incompatible_disable_cc_configuration_make_variables` flag).
  Please see [migration instructions](https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disallow-using-c-specific-make-variables-from-the-configuration)
  and migrate soon, the flag will be flipped
  in Bazel 0.19, and the legacy behavior will be removed in Bazel 0.20.

- The Skylark API for accessing the C++
  toolchain in `ctx.fragments.cpp` is removed (controlled by
  `--incompatible_disable_legacy_cpp_toolchain_skylark_api` flag).
  Please migrate soon, the flag will be flipped
  in Bazel 0.19, and the legacy behavior will be removed in Bazel 0.20.

- cc_binary link action no longer hardcodes
  `-static-libgcc` for toolchains that support embedded runtimes
  (guarded by [`--experimental_dont_emit_static_libgcc`](https://source.bazel.build/bazel/+/2f281960b829e964526a9d292d4c3003e4d19f1c)
  temporarily). Proper deprecation using `--incompatible` flags will follow.

Java

- Future versions of Bazel will require a locally installed JDK
  for Java development. Previously Bazel would fall back to using
  the embedded `--server_javabase` if no JDK was available. Pass
  `--incompatible_never_use_embedded_jdk_for_javabase` to disable the
  legacy behaviour.

- `--javacopt=` no longer affects compilations of tools that are
  executed during the build; use `--host_javacopt=` to change javac
  flags in the host configuration.

Objective C

- `objc_library` now supports the module_name attribute.

Skylark

- Adds `--incompatible_expand_directories` to automatically expand
  directories in skylark command lines. Design doc:
  https://docs.google.com/document/d/11agWFiOUiz2htBLj6swPTob5z78TrCxm8DQE4uJLOwM

- Support fileset expansion in ctx.actions.args(). Controlled by
  `--incompatible_expand_directories`.

Windows

- `--windows_exe_launcher` is deprecated and will be removed
  soon. Please make sure you are not using it.

- Bazel now supports the symlink runfiles tree on Windows with
  `--experimental_enable_runfiles` flag. For more details, see
  [this doc](https://docs.google.com/document/d/1hnYmU1BmtCSJOUvvDAK745DSJQCapToJxb3THXYMrmQ).

Other Changes

- A new experimental option `--experimental_ui_deduplicate` has been added. It
  causes the UI to attempt to deduplicate messages from actions to keep the
  console output cleaner.

- Add `--modify_execution_info`, a flag to customize action execution
  info.

- Add ExecutionInfo to aquery output for ExecutionInfoSpecifier
  actions.

- When computing `--instrumentation_filter`, end filter patterns with
  "[/:]" to match non-top-level packages exactly and treat
  top-level targets consistently.

- Added the `bazel info server_log` command, which obtains the main Bazel
  server log file path. This can help debug Bazel issues.

- `aapt shrink` resources now properly respect filter configurations.
aehlig pushed a commit that referenced this issue Oct 23, 2018
aehlig pushed a commit that referenced this issue Oct 24, 2018
aehlig pushed a commit that referenced this issue Oct 29, 2018
acarlton0 pushed a commit to acarlton0/bazel that referenced this issue Oct 30, 2018
acarlton0 pushed a commit to acarlton0/bazel that referenced this issue Oct 30, 2018
bazel-io pushed a commit that referenced this issue Oct 31, 2018
Baseline: c062b1f

Cherry picks:

   + 2834613:
     Include also ext jars in the bootclasspath jar.
   + 2579b79:
     Fix toolchain_java9 on --host_javabase=<jdk9> after
     7eb9ea1
   + faaff7f:
     Release notes: fix markdown
   + b073a18:
     Fix NestHost length computation Fixes #5987
   + bf6a63d:
     Fixes #6219. Don't rethrow any remote cache failures on either
     download or upload, only warn. Added more tests.
   + c1a7b4c:
     Fix broken IdlClassTest on Bazel's CI.
   + 71926bc:
     Fix the Xcode version detection which got broken by the upgrade
     to Xcode 10.0.
   + 86a8217:
     Temporarily restore processing of workspace-wide tools/bazel.rc
     file.
   + 914b4ce:
     Windows: Fix Precondition check for addDynamicInputLinkOptions
   + e025726:
     Update turbine

Important changes:

  - Fix regression #6219, remote cache failures
meteorcloudy pushed a commit that referenced this issue Nov 29, 2018
8 participants