Don't include pyc files as part of the hermetic toolchain #713
pyc files don't appear to be generated deterministically, so including them as part of the :files filegroup means that actions that use the hermetic toolchain do not have deterministic hashes, and so bazel needlessly re-executes them. This can be seen when using bazel together with a remote cache: builds on separate machines will generate pyc files with different hashes, leading to cache misses.
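One well-known way a .pyc can differ between hosts is the source mtime stored in its header under timestamp-based invalidation. The thread does not establish that this is the cause here, so the Python sketch below is only an illustration; `pyc_digest` and the file names are hypothetical.

```python
import hashlib
import os
import py_compile
import struct
import tempfile


def pyc_digest(source_text, mtime):
    """Compile source_text with the given source mtime.

    Returns (SHA-256 hex digest of the .pyc, mtime recorded in its header).
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")
        with open(src, "w") as f:
            f.write(source_text)
        # Pretend the source file was unpacked at `mtime`.
        os.utime(src, (mtime, mtime))
        pyc = py_compile.compile(
            src,
            cfile=os.path.join(tmp, "mod.pyc"),
            dfile="mod.py",  # keep the temporary path out of the code object
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        with open(pyc, "rb") as f:
            data = f.read()
    # Header layout (CPython 3.7+): magic, flags, source mtime, source size.
    (header_mtime,) = struct.unpack("<I", data[8:12])
    return hashlib.sha256(data).hexdigest(), header_mtime


if __name__ == "__main__":
    first = pyc_digest("x = 1\n", 1_600_000_000)
    second = pyc_digest("x = 1\n", 1_600_000_001)
    # Same source, different mtime: the header differs, so the digests differ.
    print(first[0] != second[0])
```

Any input whose bytes depend on when or where it was produced will change the action hash in the same way, which is what makes such files a problem for a remote cache.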
LGTM, Thank you!
No problem - thanks for merging! |
I would first like to understand why this happens before simply excluding .pyc files. To clarify, we download the python toolchain deterministically, then make the directory tree read-only. So, excluding the .pyc files sounds like burying the real problem.
@tomgr You said you see this on all platforms, right? The hash you posted is from the Windows toolchain, for which we don't make the tree read-only. |
Sorry, 43bdb71 should be the one... |
I've just tried using 43bdb71 and it still has the same problem. And just to confirm: we saw this on all three platforms, and this patch appears to have completely solved the issue. We've been using it for all of our CI runs for the last week, and our builds now complete entirely from the cache if there are no changes. Previously, whenever we had a fresh VM bazel would rebuild all of the targets that used python. As an example, with your latest patch, I again saw the hash of
and on a second run:
The same was also true on windows with the first run containing:
and the second run:
It seems that not all python modules suffer from the problem. Looking at the logs I see the issue with: gettext, markupbase, typing, pathlib, socket, site, zipfile, hashlib and message, but not any others, despite the fact we do use more modules than that. There's no obvious pattern from this. I presume that the read-only permissions must be being overridden. We do run our Linux builds inside a docker container with the repository cache mounted as a volume. However our Windows and Mac builds run directly on a VM and we still see the problem on there: each different VM creates different pyc files. We do run bazel using a specific |
@tomgr When you say "first run" and then "second run", is there any |
It throws me off that somehow read-only is not taking effect. |
There's no Is there any extra data that I can gather to help debug why the read-only setting is not working? |
@tomgr any chance your builds are running as |
Interesting. I'm seeing similar issues but only on our stateless build servers (which are running builds as root) vs our stateful build servers which run builds as uid 1000. This only started happening after we configured container-diff diff
Sounds like an interim solution might be |
A better idea is not to run as root! |
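For anyone wondering why the read-only tree does not help under root: on POSIX systems the superuser bypasses ordinary permission bits, so the interpreter can still write .pyc files into a directory that has no write bits set. A small Python sketch of that behaviour follows; the helper names are made up for illustration, and this is not necessarily the check that rules_python performs.

```python
import os
import stat
import tempfile


def running_as_root():
    """True when the effective uid is 0 (always False where geteuid is missing)."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def read_only_dir_is_writable():
    """Strip a directory's write bits and report whether we can still write to it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "lib")
        os.mkdir(target)
        os.chmod(target, stat.S_IRUSR | stat.S_IXUSR)  # r-x------
        try:
            with open(os.path.join(target, "probe.pyc"), "wb"):
                pass
            return True
        except PermissionError:
            return False
        finally:
            # Restore write access so the temporary directory can be cleaned up.
            os.chmod(target, stat.S_IRWXU)


if __name__ == "__main__":
    print("root:", running_as_root())
    print("read-only dir writable:", read_only_dir_is_writable())
```

On a typical Linux or macOS host the two values printed agree: the write succeeds only when the process runs as root.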
I generally agree but we are currently using google cloud build which does not allow you to run as non-root (afaik). Just fyi for anyone else browsing this thread later. Update: actually there is a way |
I use a temporary container locally for working on a project using I get this is not "best-practice", but I'm well aware of the risks and in this case there aren't really any. It seems a bit arbitrary to enforce that you can't use |
BTW, it seems impossible to run this in GitLab because it forces root. Did someone figure out how to get around that? I even tried:
and it still claimed it was running as root, so it was a no-go. |
FYI there is a setting to opt out of this root warning, https://github.com/bazelbuild/rules_python//commit/e67e7dd719d34d5a13c15b24d0234c1ac753b52d |
Modern docker utilizes user mapping through uidmap to implement rootless setup where root user inside docker nicely maps to host user - finally solving annoying ownership/permission issues. |
@ph03 were you able to find any solution to run this as non-root user via gitlab? |
I can say gitlab is getting closer to allowing it: |
@Cartman75 I see, so do we need to wait until this is rolled out and then proceed? |
I did do this recently and it seems to be working. This is when I moved to MODULE.bazel
Pointing to |
I'm sorry, I don't know where to make this change. Can you please point me to it? |
That I won't be able to walk you through. That is a whole thing if you do not understand what I am doing. |
I don't recall the details anymore, but we're happily running GitLab CI without the root user by using an image that sets up a non-root user like this in the related Dockerfile
|
This is causing GCP cloud build to fail |
Did anyone ever consider and/or test |
This is a requirement for one dependency. Otherwise, we get the error: "The current user is root, please run as non-root when using the hermetic Python interpreter. See bazelbuild/rules_python#713."
Hey, for clarification: I do need to add this

    python = use_extension("@rules_python//python/extensions:python.bzl", "python")
    python.toolchain(
        configure_coverage_tool = False,
        ignore_root_user_error = True,
        python_version = "3.11",
    )

while one of my deps is using it works btw but this does not feel good Thanks |
Our CI fails to build Bazel, producing this error: "The current user is root, please run as non-root when using the hermetic Python interpreter. See bazelbuild/rules_python#713." The workaround was suggested at bazelbuild/rules_python#1169 (comment) Closes #24549. PiperOrigin-RevId: 702362811 Change-Id: Ib6d4da1e09fd2b7dfd68af02bbb0eb997e8f427c Co-authored-by: Florian Weikert <[email protected]>
PR Checklist
Please check if your PR fulfills the following requirements:
I'm not sure how to test this.
PR Type
What kind of change does this PR introduce?
What is the current behavior?
Using rules_python with the hermetic toolchain via python_register_toolchains results in cache misses when using bazel with a remote cache. This appears to be because pyc files that are generated on the build host are included as part of one of the globs, and pyc files don't appear to be generated deterministically. For example, from one run we saw that one action that used the hermetic toolchain had the following file as an input: but from another run we saw:
The hashes above are taken from the bazel execution log as per https://bazel.build/docs/remote-execution-caching-debug.
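Once two execution logs have been reduced to path-to-digest mappings (that parsing step is omitted here and depends on the log format), finding the offending inputs is a dictionary comparison. A Python sketch, using hypothetical paths and digests rather than values from our logs:

```python
def differing_inputs(run_a, run_b):
    """Return the sorted paths present in both runs whose digests differ."""
    return sorted(
        path
        for path in run_a.keys() & run_b.keys()
        if run_a[path] != run_b[path]
    )


if __name__ == "__main__":
    # Hypothetical data; real digests would come from the execution log.
    first = {
        "python/lib/typing.py": "aaa",
        "python/lib/__pycache__/typing.pyc": "111",
    }
    second = {
        "python/lib/typing.py": "aaa",
        "python/lib/__pycache__/typing.pyc": "222",
    }
    print(differing_inputs(first, second))
```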
Note that although the hashes above are from the Windows toolchain, we saw the same behaviour on all of macOS, Linux, and Windows.
What is the new behavior?
pyc files are excluded from the glob. We're no longer seeing cache misses when using a cache.
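The effect of the exclusion can be pictured with a Python analogue. The real change lives in the toolchain's Starlark glob, which this page does not show, so the function below (a hypothetical `toolchain_files`) is only an illustration of the filtering idea.

```python
from pathlib import PurePosixPath


def toolchain_files(paths):
    """Keep every path except compiled bytecode (*.pyc)."""
    return [p for p in paths if PurePosixPath(p).suffix != ".pyc"]


if __name__ == "__main__":
    # Hypothetical file list for illustration.
    candidates = [
        "lib/python3.9/typing.py",
        "lib/python3.9/__pycache__/typing.cpython-39.pyc",
        "bin/python3",
    ]
    print(toolchain_files(candidates))
```

With the bytecode left out, the set of files an action depends on is the same on every host, regardless of which modules the interpreter happened to import and compile locally.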
Does this PR introduce a breaking change?