For QUICHE integration, add --experimental_remap_main_repo to bazelrc. #5767
wu-bin wants to merge 1 commit into envoyproxy:master
Signed-off-by: Bin Wu <wub@google.com>
|
LGTM /retest |
|
🔨 rebuilding |
|
Although the error messages appear unrelated, asan and tsan do seem to fail consistently. Bin, could you try repro'ing locally, to see if/how this might be related to the added flag? |
|
@wu-bin do we build QUICHE by default? I'd like to ensure that for the bulk of Envoy devs, they don't need this flag, even if not using this |
|
@htuch what do we need to do to get QUICHE to not be built by default? The doc for disabling extensions makes it sound like extensions are only built if listed in source/extensions/extensions_build_config.bzl, and currently the code under //{source,test}/extensions/quic_listeners/quiche/ is not listed there. That said, it does look like tests from //test/extensions/quic_listeners/quiche/ are run in presubmit checks, so I'm guessing there's something more that we need to be doing, to exclude QUICHE from normal builds? |
|
@mpwarres good point. I don't think we have a good story for this beyond the heavy hammer of Bazel settings. Take a look at what we do for Google gRPC in https://github.com/envoyproxy/envoy/blob/master/test/common/grpc/BUILD#L63 and other places that use this. I'd suggest a similar approach, defaulting to off for regular builds, and then in CI you could perhaps turn it on. The downside is that things might fail in CI for some folks and not locally until they realize this. We definitely don't want to make this flag required for regular builds; we've delayed PRs for weeks before to avoid this (see #5218 for example). |
|
I've managed to reproduce the asan CI failure. It looks like a bug with the --experimental_remap_main_repo flag: The asan build was run from the envoyproxy/envoy-filter-example repo, using command: The bazelrc used by this repo is the same as envoy's, so the --experimental_remap_main_repo flag is enabled in the command. The command failed for envoy-filter-example's :envoy_binary_test target, because one of its dependencies, '//source/common/common:generate_version_number', depends on envoy's "//:VERSION", but Bazel seems to think it is envoy-filter-example's "//:VERSION". envoy-filter-example does not have "//:VERSION", hence the failure. @mpwarres, @htuch: please suggest what we should do to proceed. Thanks! |
IIUC, seems like this immediate issue could be addressed by having //source/common/common:generate_version_number depend on @envoy//:VERSION, which perhaps is the right thing to do anyways (irrespective of this PR), if the intent is to rely on envoy's :VERSION? Past that, we still need to address @htuch's previous comment, though. @wu-bin, LMK if you want to deal with that in this PR, or would like me to address it in a separate one. |
Since it's a within-repo reference (from generate_version_number to VERSION), I think it makes sense not to use @envoy, but it fails even if @envoy is added.
I just sent #5791 to default disable quic_platform_test. @htuch @mpwarres , can you take a look at it? Thanks! |
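To make the within-repo point above concrete, here is a toy Python model (not Bazel code) of how a label without an explicit repository is normally read: it is resolved against the repository whose BUILD file contains it. The function name `canonicalize` and the simple string handling are assumptions made for illustration only.

```python
def canonicalize(label, defining_repo):
    """Return the label with an explicit repository prefix.

    Toy model only: a label that starts with "//" is taken to belong to the
    repository whose BUILD file mentions it; a label that already starts
    with "@" is returned unchanged.
    """
    if label.startswith("@"):
        return label
    if label.startswith("//"):
        return "@" + defining_repo + label
    raise ValueError("unsupported label form: %r" % label)


# generate_version_number lives in Envoy, so both spellings name the same file.
print(canonicalize("//:VERSION", "envoy"))        # @envoy//:VERSION
print(canonicalize("@envoy//:VERSION", "envoy"))  # @envoy//:VERSION
```

Under this reading, writing `@envoy//:VERSION` explicitly should not change which file is meant, which is consistent with the observation that the build fails either way.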
Oh, I see, I had missed that //source/common/common:generate_version_number is a rule in Envoy and not in the filter example. I agree that seems like a bug in the Bazel fix. @htuch, you mentioned you have contacts on Bazel team, is there someone you'd recommend? Otherwise I can ask on bazel-discuss@. I'll also try to simplify into a smaller example. |
|
I believe this specific issue has been fixed in Bazel 0.22, where this feature is non-experimental. /cc @dkelmer |
To reproduce:
|
|
Just to add some context/history:
|
|
@mpwarres and I took a look at this yesterday. @wu-bin, your repro instructions do indeed reproduce the failure, but it is difficult to discern which Bazel targets are being built and from where. We attempted to create a more minimal repro by doing the following:
Both of the above builds succeed even though I'd expect both to fail as that is what is failing when invoking the CI script. Some additional information from the failure by invoking the CI script is that the top level target that is failing is |
|
Some findings from continuing to play around with this:
IIUC, this means that bazel will be executing with a --package_path that includes both envoy-filter-example and envoy repos, even though the envoy repo is also brought in as a local_repository from the WORKSPACE file (here), which seems odd to me. However, if I tweak --package_path to only be "%workspace%", the build fails due to other missing targets. |
|
Progress (I think): I noticed that the ci/WORKSPACE.filter.example file that ci/build_setup.sh installs for the failing CI runs seems a bit funky: it reuses (here and here) the name "envoy" both for the workspace name and a local_repository name, and also refers to files provided by the envoy local_repository via names that start with "//" rather than "@envoy//" (i.e. "//bazel:repositories.bzl" and "//bazel:cc_configure.bzl").

If I change ci/WORKSPACE.filter.example to use distinct names for the envoy-filter-example workspace and the envoy local_repository, and fully qualify the references to //bazel:repositories.bzl and //bazel:cc_configure.bzl, things appear to work, both with and without the --experimental_remap_main_repo flag. This change is PR #5868. A side benefit is that it brings ci/WORKSPACE.filter.example closer in line with the current actual envoy-filter-example WORKSPACE file, and removes the need to set --package_path in ci/build_setup.sh.

My best guess as to why this fouled things up is that it may have confused Bazel about which rules were defined in which workspaces, which perhaps was handled differently/less strictly prior to --experimental_remap_main_repo? I'm also not sure if there's a reason why ci/WORKSPACE.filter.example needs to be the way it is currently. The local_repository(name="envoy") goes back a couple of years, to PR #742; perhaps the external dependency story was different back then? Not sure if @htuch may remember.

@dkelmer does the above theory seem plausible? @htuch does #5868 seem like a reasonable change (with or without the actual setting of --experimental_remap_main_repo)? |
|
#5868 is absolutely a good change. My very scientific explanation for why it was working before is "you were lucky" :). Bazel generally ignored the name specified in the workspace() function. But with --experimental_remap_main_repo, Bazel now uses the name. For example, if the workspace function was |
|
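One way to picture the name collision discussed above is the following toy Python model (not Bazel code). It assumes the flag makes a reference to `@<workspace name>` behave like a reference to the main repository; the function names, the `envoy_filter_example` workspace name, and the file listings are all hypothetical and chosen only for illustration.

```python
def remap_main_repo(label, workspace_name):
    """Rewrite "@<workspace_name>//..." to "@//..." (the main repository)."""
    prefix = "@" + workspace_name + "//"
    if label.startswith(prefix):
        return "@//" + label[len(prefix):]
    return label


def resolve(label, workspace_name, files):
    """Report whether the remapped label exists; "" keys the main repository."""
    remapped = remap_main_repo(label, workspace_name)
    repo, _, target = remapped.partition("//")
    return target in files.get(repo.lstrip("@"), set())


# Hypothetical file listings: only Envoy ships a top-level VERSION file.
files = {"": {":BUILD"}, "envoy": {":VERSION"}}

# Workspace and local_repository both called "envoy": the label is redirected
# to the main repository (envoy-filter-example), which has no VERSION.
print(resolve("@envoy//:VERSION", "envoy", files))                 # False
# Distinct workspace name: "@envoy" still means the local_repository.
print(resolve("@envoy//:VERSION", "envoy_filter_example", files))  # True
```

In this sketch, giving the workspace and the local_repository distinct names is what keeps `@envoy` pointing at the Envoy checkout, which matches the direction taken in #5868.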
@wu-bin can we close this one out now? |
|
Thanks @dkelmer for the explanation! Closing this PR SGTM, will work on addressing the remaining TEST_WORKSPACE issue in #5868 so that can land. @wu-bin does that sound ok to you? |
|
Proposed fix for issues seen in #5767

Update envoy/ci/WORKSPACE.filter.example, which is used in CI tests (see setup [here](https://github.com/envoyproxy/envoy/blob/f2511a39cf2c4fe392d5499e854c39f262712100/ci/build_setup.sh#L92)), to more closely match current envoy-filter-example [WORKSPACE](https://github.com/envoyproxy/envoy-filter-example/blob/master/WORKSPACE) file. In particular, remove the dual use of "envoy" as both the name of envoy-filter-example workspace as well as the envoy local_repository, and load //bazel:repositories.bzl and //bazel:cc_configure.bzl using fully qualified target names, which removes the need to [set --package_path](https://github.com/envoyproxy/envoy/blob/f2511a39cf2c4fe392d5499e854c39f262712100/ci/build_setup.sh#L63) in ci/build_setup.sh.

Risk Level: low: errors should result in CI failure
Testing: existing CI tests

Signed-off-by: Michael Warres <mpw@google.com>
Description:
To allow source/extensions/quic_listeners/quiche/platform:quic_platform_impl_lib to depend on envoy build rules, add --experimental_remap_main_repo to bazelrc.
For context, this flag is temporarily necessary in order for the Envoy QUICHE platform implementation to be able to depend on Envoy libraries, without incurring link-time errors. More explanation in this comment. Bazel team's plan for graduating the flag-protected behavior is described here.
Note that the Bazel fix is only available in the most recent Bazel release, 0.22.0, which was released yesterday. @htuch , are there timing considerations around when a PR can assume the most recent version of Bazel?
Risk Level: minimal: build only
Testing: Tested build with PR #5758
Docs Changes: none
Release Notes: none
[Optional Fixes #Issue]
[Optional Deprecated:]