Not able to cross-compile py_binary in Bazel 7 without a cc toolchain #1857

Closed
linzhp opened this issue Apr 17, 2024 · 12 comments

@linzhp
Contributor

linzhp commented Apr 17, 2024

🐞 bug report

Affected Rule

The issue is caused by the rule: py_binary

Is this a regression?

Yes, the previous version in which this bug was not present was: Bazel 6.3.2

Description

Cross-compiling py_binary from macOS arm64 to either macOS amd64 or Linux amd64 breaks when using Bazel 7. It worked in Bazel 6.

Not sure if this bug report belongs here or Bazel. Let me know if I should move it to Bazel.

🔬 Minimal Reproduction

Run bazel build --incompatible_enable_cc_toolchain_resolution --platforms //:darwin_amd64 --enable_bzlmod //:hello in the following workspace with Bazel 7 and Bazel 6:

-- MODULE.bazel --
module(name = "cross_compliing")
bazel_dep(name = "rules_python", version = "0.31.0")

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = True,
    ignore_root_user_error = True,
    is_default = True,
    python_version = "3.9",
)
-- WORKSPACE --
-- BUILD.bazel --
load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "hello",
    srcs = ["hello.py"],
)

platform(
    name = "darwin_amd64",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:macos",
    ],
    visibility = ["//visibility:public"],
)

platform(
    name = "linux_amd64",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
    ],
    visibility = ["//visibility:public"],
)
-- hello.py --
if __name__ == "__main__":
    print("hello world")

🔥 Exception or Error


ERROR: /private/var/tmp/_bazel_zplin/91d514326e938596854c0d93324b38ae/external/bazel_tools/tools/cpp/BUILD:58:19: in cc_toolchain_alias rule @@bazel_tools//tools/cpp:current_cc_toolchain: 
Traceback (most recent call last):
        File "/virtual_builtins_bzl/common/cc/cc_toolchain_alias.bzl", line 26, column 48, in _impl
        File "/virtual_builtins_bzl/common/cc/cc_helper.bzl", line 219, column 17, in _find_cpp_toolchain
Error in fail: Unable to find a CC toolchain using toolchain resolution. Target: @@bazel_tools//tools/cpp:current_cc_toolchain, Platform: @@//:darwin_amd64, Exec platform: @@local_config_platform//:host
ERROR: /private/var/tmp/_bazel_zplin/91d514326e938596854c0d93324b38ae/external/bazel_tools/tools/cpp/BUILD:58:19: Analysis of target '@@bazel_tools//tools/cpp:current_cc_toolchain' failed
ERROR: Analysis of target '//:hello' failed; build aborted: Analysis failed

🌍 Your Environment

Operating System:

macOS Sonoma 14.4.1 on Apple M1 Max

Output of bazel version:

  
Build label: 7.1.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Thu Mar 21 18:08:59 2024 (1711044539)
Build timestamp: 1711044539
Build timestamp as int: 1711044539
  

Rules_python version:

  
bazel_dep(name = "rules_python", version = "0.31.0")
  

Anything else relevant?
The issue can be worked around with --noincompatible_enable_cc_toolchain_resolution in Bazel 7, although it worked in Bazel 6 with and without cc_toolchain resolution.

This is the root cause for #1825

@aignas
Collaborator

aignas commented Apr 18, 2024

I previously thought this was caused by the Bazel version upgrade rather than by rules_python changes. Could somebody verify whether the same bug exists in 0.30? I think that is the version before we switched to the Starlark rules; if it is broken in the same way, then I don't think there is anything we can do here.

@linzhp
Contributor Author

linzhp commented Apr 18, 2024

I agree this is due to the Bazel version upgrade. But isn't rules_python responsible for making sure that it works with the latest Bazel?

@aignas
Collaborator

aignas commented Apr 19, 2024

I tested with rules_python 0.28 through 0.31 and the result is the same with the latest bazel.

The problem seems to be that we reference @bazel_tools//tools/cpp:current_cc_toolchain here. We do so because we use a helper function, cc_helper.find_cpp_toolchain(ctx), to get the platform here, and that helper is Bazel-internal, as documented here.

Not sure if there is a better way to do this, but the flipping of the toolchain resolution in bazel 7 indeed broke python binary rule cross-compilation. @rickeylev, do you know if there is a workaround here?

At $dayjob we cross-build items and we have a cc toolchain, so I guess a known workaround for this would be to register a cc_toolchain (e.g. hermetic_cc_toolchain or something similar).
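The workaround of registering a CC toolchain could be sketched in MODULE.bazel roughly as follows. This is only an illustration under stated assumptions: the hermetic_cc_toolchain version number and the exact toolchain label are assumptions that should be checked against that project's README, and the label is chosen to match the //:linux_amd64 platform from the reproduction. Whether an equivalent toolchain is available for the //:darwin_amd64 target is not established in this thread.

```starlark
# MODULE.bazel (sketch; the version and toolchain label below are assumptions)
bazel_dep(name = "hermetic_cc_toolchain", version = "3.0.1")

toolchains = use_extension("@hermetic_cc_toolchain//toolchain:ext.bzl", "toolchains")
use_repo(toolchains, "zig_sdk")

# Register a CC toolchain that targets Linux x86_64, so that toolchain
# resolution for @bazel_tools//tools/cpp:toolchain_type can succeed when
# building with --platforms //:linux_amd64.
register_toolchains(
    "@zig_sdk//toolchain:linux_amd64_gnu.2.28",
)
```

With a matching toolchain registered, the cc_toolchain_alias target from the error message should have something to resolve to, even though py_binary itself compiles no C++ code.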

@linzhp
Contributor Author

linzhp commented Apr 19, 2024

> but the flipping of the toolchain resolution in bazel 7

I don't think this is related to the flipping: bazel build --incompatible_enable_cc_toolchain_resolution --platforms //:darwin_amd64 --enable_bzlmod //:hello works in Bazel 6.

@aignas
Collaborator

aignas commented Apr 19, 2024

You are correct, the offending commit is bazelbuild/bazel@1895585

I ran: bazelisk --bisect=6.0.0..HEAD build //:hello

@rickeylev looking at the list in bazelbuild/bazel#15897 it does not seem that this incompatibility was expected.

@linzhp
Contributor Author

linzhp commented Apr 19, 2024

Do you think I should move this bug to bazel?

@aignas
Collaborator

aignas commented Apr 19, 2024 via email

@rickeylev
Collaborator

This all sounds a bit weird. Build output with --toolchain_resolution_debug would be helpful.

The Bazel 6 java-implementation of the rules also depended on the CC toolchain, so I would expect that to have the same resolution logic.

rules_python releases prior to 0.31.0 didn't enable the rules_python-based Starlark implementation, i.e. they use the Bazel-based Starlark implementation.

So yeah, this does sound like something relating to the Starlark implementation, be it the one in Bazel itself or rules_python.

cc_helper.find_cpp_toolchain used to get toolchain_id in write_build_data

IIRC, the write_build_data codepath isn't active in rules_python/Bazel. But, that fact is a bit moot -- the way toolchain resolution works is it resolves all the rule's toolchains, even if one isn't used.

I think we could do away with using cc_helper.find_cpp_toolchain. IIRC, its logic is pretty simple, so it should be OK to copy/paste or re-implement. I'm not sure if that'll help, though; if the issue is that there isn't a matching toolchain, then there isn't much we can do.

We could make the CC toolchain optional. That should be possible.
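As a rough sketch of what an optional CC toolchain looks like in Starlark (the rule and function names here are hypothetical and this is not the rules_python implementation), a rule can declare the toolchain type with mandatory = False and handle a missing toolchain itself:

```starlark
# example.bzl (hypothetical rule, for illustration only)
_CC_TOOLCHAIN_TYPE = "@bazel_tools//tools/cpp:toolchain_type"

def _example_impl(ctx):
    # With mandatory = False, this lookup yields None when no CC toolchain
    # matches the target platform, instead of failing analysis.
    cc_toolchain = ctx.toolchains[_CC_TOOLCHAIN_TYPE]
    if cc_toolchain == None:
        # Continue without any C++ information.
        return [DefaultInfo()]

    # ... use cc_toolchain here when it is available ...
    return [DefaultInfo()]

example_rule = rule(
    implementation = _example_impl,
    toolchains = [
        config_common.toolchain_type(_CC_TOOLCHAIN_TYPE, mandatory = False),
    ],
)
```

Note that this only relaxes the rule's own toolchain requirement. Any dependency that itself needs a CC toolchain, such as a cc_binary reached through an attribute, still triggers resolution for that dependency.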

Our test matrix includes bazel 6, 7, and rolling releases, but doesn't attempt any cross-building. I'm surprised it ever worked.

I don't have a working mac machine right now, and probably won't for another week.

aignas changed the title from "Not able to cross-compile py_binary in Bazel 7" to "Not able to cross-compile py_binary in Bazel 7 without a cc toolchain" on May 13, 2024
@rickeylev
Collaborator

I think I ran into this with the RBE tests and the analysis test for precompiling. I think this can be repro'd on linux by building for another platform and trying to run the analysis tests.

bazel test //tests/base_rules/py_binary/... --toolchain_resolution_debug=cpp:toolchain_type --platforms=//tests/support:windows_x86_64

I'm not sure where the toolchain lookup is happening, though. I started to comment out all the places where the cc toolchain is mentioned, but something is still trying to resolve the cc toolchain. Additionally, I noticed the cc toolchain is marked as optional.

@rickeylev
Collaborator

Aha, I think I found it: the implicit _launcher attribute:

  • attr _launcher ->
  • alias @bazel_tools//tools/launcher:launcher ->
    • windows: alias @bazel_tools//tools/launcher:launcher_windows ->
      • remote: cc_binary @bazel_tools//src/tools/launcher:launcher
      • default: file @bazel_tools//tools/launcher:launcher.exe
    • default: cc_binary @bazel_tools//src/tools/launcher:launcher

I think the fix here is to have the launcher attribute point to a select. It's only used for windows. Other platforms should just point to a no-op dependency. This should be a pretty easy fix.

My guess is, if we dig up the old Java code, we'll find this attribute was one of those old dynamically-computed-at-the-Java-level attributes and it got a null value for non-Windows platforms.

@rickeylev
Collaborator

For posterity: --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 can also cause this. The RBE tests set this. It prevents the cc toolchain from being auto-detected and registered.

@rickeylev
Collaborator

I'm pretty sure this was fixed by #1902; I fixed it as part of that PR since CI was failing there, too (it was also failing at main under the right circumstances). I moved the launcher to an alias with a select so that only windows has the launcher dependency.
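An alias with a select, as described above, might look roughly like the sketch below. The target names and package layout are illustrative assumptions, not a copy of what the PR actually added:

```starlark
# BUILD.bazel (sketch; target names are hypothetical)
alias(
    name = "launcher",
    actual = select({
        # Only Windows needs the native launcher, which can resolve to a
        # cc_binary and therefore requires a CC toolchain.
        "@platforms//os:windows": "@bazel_tools//tools/launcher:launcher",
        # alias.actual must point at a real target, so use an empty placeholder.
        "//conditions:default": ":no_launcher",
    }),
    visibility = ["//visibility:public"],
)

# Empty target that serves as the no-op dependency on non-Windows platforms.
filegroup(name = "no_launcher")
```

The rule's implicit _launcher attribute would then default to this alias, so builds for non-Windows target platforms never reach the cc_binary and do not need a matching CC toolchain for it.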
