feat: compile source files at build time (#1902)
This implements precompiling: performing Python source to byte code compilation at build time. This allows improved program startup time by allowing the byte code compilation step to be skipped at runtime.

Precompiling is disabled by default, for now. A subsequent release will enable it by default. This allows the necessary flags and attributes to become available so users can opt-out prior to it being enabled by default. Similarly, `//python:features.bzl` is introduced to allow feature detection.

This implementation is made to serve a variety of use cases, so there are several attributes and flags to control behavior. The main use cases being served are:
* Large mono-repos that need to incrementally enable/disable precompiling.
* Remote execution builds, where persistent workers aren't easily available.
* Environments where toolchains are custom defined instead of using the ones created by rules_python.

To that end, there are several attributes and flags to control behavior, and the toolchains allow customizing the tools used.

Fixes #1761
Showing 39 changed files with 1,976 additions and 74 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,6 +60,7 @@ pypi-dependencies | |
toolchains | ||
pip | ||
coverage | ||
precompiling | ||
gazelle | ||
Contributing <contributing> | ||
support | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# Precompiling | ||
|
||
Precompiling is compiling Python source files (`.py` files) into byte code (`.pyc` | ||
files) at build | ||
time instead of runtime. Doing it at build time can improve performance by | ||
skipping that work at runtime. | ||
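
As a rough illustration of what "compiling to byte code ahead of time" means, the sketch below uses the standard library's `py_compile` module to produce a `.pyc` file and then imports from it without the source being consulted. This is not how the rules_python precompiler is implemented; the `precompile` helper and the `hello` module are hypothetical names used only for this example.

```python
import importlib.machinery
import importlib.util
import pathlib
import py_compile
import tempfile


def precompile(source: pathlib.Path) -> pathlib.Path:
    """Compile one .py file to a .pyc beside it and return the .pyc path."""
    pyc = source.with_suffix(".pyc")
    # doraise=True turns compile errors into exceptions instead of stderr output.
    py_compile.compile(str(source), cfile=str(pyc), doraise=True)
    return pyc


with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "hello.py"
    src.write_text("GREETING = 'hello'\n")

    pyc_path = precompile(src)
    pyc_exists = pyc_path.exists()

    # Load the module from the byte code alone, skipping source compilation.
    loader = importlib.machinery.SourcelessFileLoader("hello", str(pyc_path))
    spec = importlib.util.spec_from_loader("hello", loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    greeting = module.GREETING
```

At runtime the interpreter performs the same compile step lazily on first import; doing it once at build time is what removes that work from program startup.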
|
||
Precompiling is disabled by default for now, so you must opt in to use it; a | |
subsequent release is expected to enable it by default. | |
|
||
|
||
## Overhead of precompiling | ||
|
||
While precompiling helps runtime performance, it has two main costs: | ||
1. Increasing the size (count and disk usage) of runfiles. It approximately | |
doubles the count of the runfiles because for every `.py` file, there is also | |
a `.pyc` file. Compiled files are generally around the same size as the | |
source files, so it approximately doubles the disk usage. | |
2. Precompiling requires running an extra action at build time. While | |
compiling itself isn't that expensive, the overhead can become noticeable | |
as more files need to be compiled. | |
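
To get a feel for the first cost, the following sketch compiles a small throwaway tree with the standard library's `compileall` module and counts the resulting files. It only approximates the effect on runfiles: the helper name `runfiles_footprint` and the generated `modN.py` files are assumptions made for this example, and real size ratios depend on the code being compiled.

```python
import compileall
import pathlib
import tempfile


def runfiles_footprint(root: pathlib.Path) -> dict:
    """Return {suffix: [file_count, total_bytes]} for .py and .pyc files."""
    stats = {".py": [0, 0], ".pyc": [0, 0]}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in stats:
            stats[path.suffix][0] += 1
            stats[path.suffix][1] += path.stat().st_size
    return stats


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for i in range(5):
        (root / f"mod{i}.py").write_text(f"def f():\n    return {i}\n")

    # Compile every source file under root; quiet=1 suppresses the file listing.
    compileall.compile_dir(str(root), quiet=1)
    footprint = runfiles_footprint(root)

py_count, pyc_count = footprint[".py"][0], footprint[".pyc"][0]
```

Every source file gains one compiled counterpart, which is where the doubling of the file count comes from.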
|
||
## Binary-level opt-in | ||
|
||
Because of the costs of precompiling, it may not be feasible to enable it | |
globally for everything in your repo. For example, some binaries may be | |
particularly large, and doubling the number of runfiles isn't feasible. | |
|
||
If this is the case, there's an alternative way to more selectively and | |
incrementally control precompiling on a per-binary basis. | |
|
||
To use this approach, the two basic steps are: | ||
1. Disable pyc files from being automatically added to runfiles: | |
`--@rules_python//python/config_settings:precompile_add_to_runfiles=decided_elsewhere`. | |
2. Set the `pyc_collection` attribute on the binaries/tests that should or should | |
not use precompiling. | |
|
||
The default for the `pyc_collection` attribute is controlled by a flag, so you | ||
can use an opt-in or opt-out approach by setting the flag: | ||
* targets must opt out: `--@rules_python//python/config_settings:pyc_collection=include_pyc` | |
* targets must opt in: `--@rules_python//python/config_settings:pyc_collection=disabled` | |
|
||
## Advanced precompiler customization | ||
|
||
The default implementation of the precompiler is a persistent, multiplexed, | ||
sandbox-aware, cancellation-enabled, json-protocol worker that uses the same | ||
interpreter as the target toolchain. This works well for local builds, but may | ||
not work as well for remote execution builds. To customize the precompiler, two | ||
mechanisms are available: | ||
|
||
* The exec tools toolchain allows customizing the precompiler binary used with | ||
the `precompiler` attribute. Arbitrary binaries are supported. | ||
* The execution requirements can be customized using | ||
`--@rules_python//tools/precompiler:execution_requirements`. This is a list | ||
flag that can be repeated. Each entry is a key=value that is added to the | ||
execution requirements of the `PyPrecompile` action. Note that this flag | ||
is specific to the rules_python precompiler. If a custom binary is used, | ||
this flag will have to be propagated from the custom binary using the | ||
`testing.ExecutionInfo` provider; refer to the `py_interpreter_program` an | ||
|
||
The default precompiler implementation is an asynchronous/concurrent | ||
implementation. If you find it has bugs or hangs, please report them. In the | ||
meantime, the flag `--worker_extra_flag=PyPrecompile=--worker_impl=serial` can | ||
be used to switch to a synchronous/serial implementation that may not perform | ||
as well, but is less likely to have issues. | ||
|
||
The `execution_requirements` keys of most relevance are: | ||
* `supports-workers`: 1 or 0, to indicate if a regular persistent worker is | ||
desired. | ||
* `supports-multiplex-workers`: 1 or 0, to indicate if a multiplexed persistent | |
worker is desired. | |
* `requires-worker-protocol`: json or proto; the rules_python precompiler | |
currently only supports json. | |
* `supports-multiplex-sandboxing`: 1 or 0, to indicate if sandboxing of the | |
worker is supported. | |
* `supports-worker-cancellation`: 1 or 0, to indicate if requests to the worker | |
can be cancelled. | |
|
||
Note that any execution requirements values can be specified in the flag. | ||
|
||
## Known issues, caveats, and idiosyncrasies | |
|
||
* Precompiling requires Bazel 7+ with the Pystar rule implementation enabled. | ||
* Mixing rules_python PyInfo with Bazel builtin PyInfo will result in pyc files | ||
being dropped. | ||
* Precompiled files may not be used in certain cases prior to Python 3.11. This | |
occurs due to Python adding the directory of the binary's main `.py` file to | |
`sys.path`, which causes the module to be found in the workspace source | |
directory instead of within the binary's runfiles directory (where the pyc | |
files are). This can usually be worked around by removing `sys.path[0]` (or | |
otherwise ensuring the runfiles directory comes before the repo's source | |
directory in `sys.path`). | |
* The pyc filename does not include the optimization level (e.g. it is | |
`foo.cpython-39.pyc`, not `foo.cpython-39.opt-2.pyc`). This works fine (it's | |
all byte code), but also means the interpreter `-O` argument can't be used -- | |
doing so will cause the interpreter to look for the non-existent `opt-N` | |
named files. |
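
The last two caveats can be explored with the standard library alone. The sketch below is illustrative only: `pyc_names` shows which cache filename the running interpreter expects at each optimization level (via `importlib.util.cache_from_source`), and `prefer_runfiles` is a hypothetical helper showing the `sys.path[0]` workaround on a plain list. How the workaround is wired into a real binary is not specified here and is left as an assumption.

```python
import importlib.util


def pyc_names(source: str) -> dict:
    """Map optimization level to the cache filename the interpreter looks for.

    "" is the default (no -O), 1 corresponds to -O and 2 to -OO.
    """
    return {
        level: importlib.util.cache_from_source(source, optimization=level)
        for level in ("", 1, 2)
    }


def prefer_runfiles(path: list, main_dir: str) -> list:
    """Return a copy of path without main_dir when it is the first entry.

    Hypothetical helper: mirrors removing sys.path[0] so that later entries
    (such as a runfiles directory) are searched instead.
    """
    if path and path[0] == main_dir:
        return path[1:]
    return list(path)


names = pyc_names("foo.py")
reordered = prefer_runfiles(["/workspace/src", "/runfiles"], "/workspace/src")
```

With `-O` or `-OO` the interpreter looks for the `opt-1` or `opt-2` name, so a pyc file stored under the plain name is not picked up at those levels.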