Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't include pyc files as part of the hermetic toolchain #713

Closed
wants to merge 2 commits into from

Conversation

tomgr
Copy link

@tomgr tomgr commented May 25, 2022

PR Checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

I'm not sure how to test this.

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature (please, look at the "Scope of the project" section in the README.md file)
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

Using rules_python with the hermetic toolchain via python_register_toolchains results in cache misses when using bazel with a remote cache. This appears to be because pyc files that are generated on the build host are included as part of one of the globs, but since pyc files don't appear to be generated deterministically. For example, from one run we saw one action that used the hermetic toolchain had the following file as an input:

        {
          "path": "external/python3_9_x86_64-pc-windows-msvc/lib/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "acc87949c12c0a73eace38bed1a4d9e311ff283fe29a7e28bdfb339ff267ed27"
        },

but from another run we saw:

        {
          "path": "external/python3_9_x86_64-pc-windows-msvc/lib/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "afb43608c750e3f7d4dcb0c968aec6d84f678cebc506666325b43c24ce964e5a"
        },

The hashes above are taken from the bazel execution log as per https://bazel.build/docs/remote-execution-caching-debug.
Note that although the above are with the windows toolchain, we saw the same behaviour on all MacOS, Linux, and Windows.

What is the new behavior?

pyc files are excluded from the glob. We're no longer seeing cache misses when using a cache.

Does this PR introduce a breaking change?

  • Yes
  • No

pyc files don't appear to be generated deterministically, so including
them as part of the :files filegroup means that actions that use the
hermetic toolchain do not have deterministic hashes, and so bazel
needlessly re-executes them. This can be seen when using bazel together
with a remote cache: builds on separate machines will generate pyc files
with different hashes, leading to cache misses.
@tomgr tomgr requested review from brandjon and lberki as code owners May 25, 2022 07:06
Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thank you!

@thundergolfer thundergolfer requested review from f0rmiga and removed request for brandjon and lberki May 31, 2022 05:52
@tomgr
Copy link
Author

tomgr commented May 31, 2022

No problem - thanks for merging!

Copy link
Collaborator

@f0rmiga f0rmiga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I first would like to understand why this happens before simply excluding .pyc files. To clarify, we download the python toolchain deterministically, then make the directory tree read-only. So, excluding the .pyc sounds like burying the real problem.

@f0rmiga
Copy link
Collaborator

f0rmiga commented May 31, 2022

@tomgr You said you see this on all platforms, right? The hash you posted is from the Windows toolchain, which we don't make the tree read-only.

@f0rmiga
Copy link
Collaborator

f0rmiga commented May 31, 2022

@tomgr Could you check if 2485990 solves the problem under Windows?

@f0rmiga
Copy link
Collaborator

f0rmiga commented May 31, 2022

Sorry, 43bdb71 should be the one...

@tomgr
Copy link
Author

tomgr commented May 31, 2022

I've just tried using 43bdb71 and it still has the same problem. And just to confirm: we saw this on all three platforms, and this patch appears to have completely solved the issue. We've been using it for all of our CI runs for the last week, and our builds now complete entirely from the cache if there are no changes. Previously, whenever we had a fresh VM bazel would rebuild all of the targets that used python.

As an example, with your latest patch, I again saw the hash of _markupbase.cpython-39.pyc change. For example, on the linux builds on one run we had:

        {
          "path": "external/python3_9_aarch64-unknown-linux-gnu/lib/python3.9/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "3e3c8aaadc6a61e09acbe7cacc9082871fc017209b961c777eeb95f3e59a4507"
        },

and on a second run:

        {
          "path": "external/python3_9_aarch64-unknown-linux-gnu/lib/python3.9/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "ae9d198fb89c74012b2613c07057af98fa8d0cd8f70c1001e0c4e039fba07307"
        },

The same was also true on windows with the first run containing:

        {
          "path": "external/python3_9_x86_64-pc-windows-msvc/lib/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "ee128cd267464d5e68f3b1dd1c4a57d422b7490538d517bc1f5d37b4de178ca5"
        },

and the second run:

        {
          "path": "external/python3_9_x86_64-pc-windows-msvc/lib/__pycache__/_markupbase.cpython-39.pyc",
          "digest": "0b57d0d519df845a69ec5d460a8682e4641c7b78dc70ab3737ce23b055e3c98b"
        },

It seems that not all python modules suffer from the problem. Looking at the logs I see the issue with: gettext, markupbase, typing, pathlib, socket, site, zipfile, hashlib and message, but not any others, despite the fact we do use more modules than that. There's no obvious pattern from this.

I presume that the read-only permissions must be being overridden. We do run our Linux builds inside a docker container with the repository cache mounted as a volume. However our Windows and Mac builds run directly on a VM and we still see the problem on there: each different VM creates different pyc files.

We do run bazel using a specific --repository_cache, otherwise I don't think we use any bazel flags that could affect this.

@f0rmiga
Copy link
Collaborator

f0rmiga commented Jun 1, 2022

@tomgr When you say "first run" and then "second run", is there any bazel clean involved between them, or different machines? Can you ever get into this state from the same machine without cleaning the Bazel cache?

@f0rmiga
Copy link
Collaborator

f0rmiga commented Jun 1, 2022

It throws me off that somehow read-only is not taking any effects.
Nevertheless, we could be hitting https://docs.python.org/3/reference/import.html#pyc-invalidation. I have to check if the metadata of those files changes between extractions.

@tomgr
Copy link
Author

tomgr commented Jun 1, 2022

@tomgr When you say "first run" and then "second run", is there any bazel clean involved between them, or different machines? Can you ever get into this state from the same machine without cleaning the Bazel cache?

There's no bazel clean, but I was deliberately running them on separate machines to reproduce the issue. I've never noticed it happening from the same machine: it seems that once the pyc files have been modified, they are then stable. Although we run our Linux builds inside Docker containers, we do share the repository cache between the containers. I suspect that removing the shared repository cache would reproduce the same issue.

Is there any extra data that I can gather to help debug why the read-only setting is not working?

@mattoberle
Copy link
Contributor

@tomgr any chance your builds are running as root inside the Docker containers?
When running as a user with CAP_DAC_OVERRIDE the read-only mode on the hermetic toolchain directory is ignored.

@thundergolfer thundergolfer added the need: discussion On the agenda for next team meeting label Jun 21, 2022
@alexgartner-bc
Copy link

@tomgr any chance your builds are running as root inside the Docker containers? When running as a user with CAP_DAC_OVERRIDE the read-only mode on the hermetic toolchain directory is ignored.

Interesting. I'm seeing similar issues but only on our stateless build servers (which are running builds as root) vs our stateful build servers which run builds as uid 1000.

This only started happening after we configured python_register_toolchains. I noticed it because our rules_docker python images started permadiffing:

container-diff diff
$ container-diff diff --type=history --type=file us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:45de6b34dddce7c24935a5d2f3706973750d06a9f3d27ad2f8f23a9894036ba2 us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:ee7655603252efbf1d852f7d065622c6d3017c2bb01d8e2ae516183c5de68b09

-----File-----

These entries have been added to us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:45de6b34dddce7c24935a5d2f3706973750d06a9f3d27ad2f8f23a9894036ba2: None

These entries have been deleted from us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:45de6b34dddce7c24935a5d2f3706973750d06a9f3d27ad2f8f23a9894036ba2: None

These entries have been changed between us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:45de6b34dddce7c24935a5d2f3706973750d06a9f3d27ad2f8f23a9894036ba2 and us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:ee7655603252efbf1d852f7d065622c6d3017c2bb01d8e2ae516183c5de68b09:
FILE                                                                                                                                                                 SIZE1        SIZE2
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/__pycache__/zipfile.cpython-38.pyc                    57.3K        57.3K
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/asyncio/__pycache__/base_events.cpython-38.pyc        49.9K        49.9K
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/__pycache__/pathlib.cpython-38.pyc                    43.3K        43.3K
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/json/__pycache__/decoder.cpython-38.pyc               9.7K         9.7K
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/__pycache__/_markupbase.cpython-38.pyc                7.7K         7.7K
/app/src/python/sense/services/mock_isv/image.binary.runfiles/python3_8_x86_64-unknown-linux-gnu/lib/python3.8/__pycache__/hashlib.cpython-38.pyc                    6.7K         6.7K


-----History-----

Docker history lines found only in us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:45de6b34dddce7c24935a5d2f3706973750d06a9f3d27ad2f8f23a9894036ba2: None

Docker history lines found only in us-west2-docker.pkg.dev/REDACTED/oci/mock_isv@sha256:ee7655603252efbf1d852f7d065622c6d3017c2bb01d8e2ae516183c5de68b09: None

Sounds like an interim solution might be capsh --drop=CAP_DAC_OVERRIDE before the build?

@f0rmiga
Copy link
Collaborator

f0rmiga commented Jul 1, 2022

A better idea is not to run as root!

@f0rmiga f0rmiga closed this Jul 1, 2022
@alexgartner-bc
Copy link

alexgartner-bc commented Jul 1, 2022

A better ida is not to run as root!

I generally agree but we are currently using google cloud build which does not allow you to run as non-root (afaik). Just fyi for anyone else browsing this thread later.

Update: actually there is a way

This was referenced Jul 12, 2022
@phlax
Copy link
Contributor

phlax commented Aug 15, 2022

i use a temporary container locally for working on a project using rules_python which currently uses root inside

i get this is not "best-practice" but im well aware of the risks and in this case there arent really any

it seems a bit arbitrary to enforce that you cant use rules_python at all as root

@Cartman75
Copy link

BTw, its seems impossible to run this in gitlab because it forces root. Did someone figure out how to get around that?

I even tried:

      sudo -H -i -u citests bash -e -x << EOS
      whoami
      cd \$CI_BUILDS_DIR
      ./bazelisk --output_base bazel_cache test //...
      EOS

and claimed it was it running as root and it was a no-go.

@keith
Copy link
Member

keith commented Sep 23, 2022

FYI there is a setting to opt out of this root warning, https://github.com/bazelbuild/rules_python//commit/e67e7dd719d34d5a13c15b24d0234c1ac753b52d

@SlausB
Copy link

SlausB commented Oct 17, 2023

Modern docker utilizes user mapping through uidmap to implement rootless setup where root user inside docker nicely maps to host user - finally solving annoying ownership/permission issues.
If you create another user inside docker (one would suggest it as a solution to this issue), then such user won't be mapped to host user anymore.
I am currently trying to buiild/run biomes-game inside docker which uses npm to run python to run bazel to run yarn which fails with "error: you are root". Reinstalling everything as non-root will be a headache. It's huge mess.

@learnsomethingtoday
Copy link

@ph03 were you able to find any solution to run this as non-root user via gitlab?

@Cartman75
Copy link

I can say gitlab is getting closer to allowing it:
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/137907

@learnsomethingtoday
Copy link

learnsomethingtoday commented Jan 10, 2024

@Cartman75 I see, so do we need to wait until this is rolled out and then proceed?
Or did u find any other way to do this..?

@Cartman75
Copy link

I did do this recently and it seems to be working. This is when I moved to MODULE.bazel

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = False,
    ignore_root_user_error = True,
    python_version = "3.11",
)

Pointing to ignore_root_user_error but posted the rest for context.

@learnsomethingtoday
Copy link

I'm sorry I don't know where to do this change, can you please let me know regarding that.

@Cartman75
Copy link

That I won't be able to walk you through. That is a whole thing if you do not understand what I am doing.
https://bazel.build/external/migration
But we are using bzlmod to load our external modules.

@ph03
Copy link

ph03 commented Jan 12, 2024

@ph03 were you able to find any solution to run this as non-root user via gitlab?

I don't recall the details anymore, but we're happily running gitlab CI without root user by using an image that setup a non-root user like this in the related Dockerfile

# Run as developer
RUN \
    useradd --create-home --groups sudo,docker developer && \
    echo "developer:developer" | sudo chpasswd
USER developer

@debkanchan
Copy link

This is causing GCP cloud build to fail

@layus
Copy link

layus commented Jun 25, 2024

Did anyone ever consider and/or test chattr +i to make files and folder immutable, even for root ?

kyakdan added a commit to CodeIntelligenceTesting/oss-fuzz that referenced this pull request Aug 18, 2024
This is a requirement for one dependency. Otherwise, we get the error:
"The current user is root, please run as non-root when using
the hermetic Python interpreter. See
bazelbuild/rules_python#713."
kyakdan added a commit to CodeIntelligenceTesting/oss-fuzz that referenced this pull request Aug 19, 2024
This is a requirement for one dependency. Otherwise, we get the error:
"The current user is root, please run as non-root when using
the hermetic Python interpreter. See
bazelbuild/rules_python#713."
kyakdan added a commit to CodeIntelligenceTesting/oss-fuzz that referenced this pull request Aug 19, 2024
This is a requirement for one dependency. Otherwise, we get the error:
"The current user is root, please run as non-root when using
the hermetic Python interpreter. See
bazelbuild/rules_python#713."
@loeffel-io
Copy link

Hey, for clarification

I do need to add this

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = False,
    ignore_root_user_error = True,
    python_version = "3.11",
)

while one of my deps is using rules_python for my builds since i do not "use" rules_python?

it works btw but this does not feel good

Thanks

fweikert added a commit to fweikert/bazel that referenced this pull request Dec 3, 2024
Our CI fails to build Bazel, producing this error: "The current user is root, please run as non-root when using the hermetic Python interpreter. See bazelbuild/rules_python#713."

The workaround was suggested at bazelbuild/rules_python#1169 (comment)
copybara-service bot pushed a commit to bazelbuild/bazel that referenced this pull request Dec 3, 2024
Our CI fails to build Bazel, producing this error: "The current user is root, please run as non-root when using the hermetic Python interpreter. See bazelbuild/rules_python#713."

The workaround was suggested at bazelbuild/rules_python#1169 (comment)

Closes #24549.

PiperOrigin-RevId: 702362811
Change-Id: Ib6d4da1e09fd2b7dfd68af02bbb0eb997e8f427c
github-merge-queue bot pushed a commit to bazelbuild/bazel that referenced this pull request Dec 4, 2024
Our CI fails to build Bazel, producing this error: "The current user is
root, please run as non-root when using the hermetic Python interpreter.
See bazelbuild/rules_python#713."

The workaround was suggested at
bazelbuild/rules_python#1169 (comment)

Closes #24549.

PiperOrigin-RevId: 702362811
Change-Id: Ib6d4da1e09fd2b7dfd68af02bbb0eb997e8f427c

Co-authored-by: Florian Weikert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need: discussion On the agenda for next team meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.