Fetch `RepoSpec`s in parallel #19294
Conversation
I briefly looked into whether the repo spec computation could be moved out of the …

The flaky test may be related to this PR.
```java
        entry.getKey(),
        moduleFromInterimModule(
            entry.getValue(), overrides.get(entry.getKey().getName()), eventHandler));
ExecutorService executorService = Executors.newWorkStealingPool(8);
```
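For context, the executor-based approach under discussion can be sketched roughly as follows. This is a minimal, hypothetical stand-in (the `fetchRepoSpec` helper and its `IOException` are assumptions for illustration, not the real Bazel code); it shows how a fixed-size work-stealing pool runs one blocking fetch per key, and how a checked exception is "tunneled" back out of a worker thread via `ExecutionException`:

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFetchSketch {
  // Hypothetical stand-in for the real per-module repo spec computation,
  // which would block on network IO.
  static String fetchRepoSpec(String moduleKey) throws IOException {
    return "spec(" + moduleKey + ")";
  }

  public static Map<String, String> fetchAll(List<String> keys)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newWorkStealingPool(8);
    try {
      // Submit one task per key; keep encounter order via a LinkedHashMap.
      Map<String, Future<String>> futures = new LinkedHashMap<>();
      for (String key : keys) {
        futures.put(key, pool.submit(() -> fetchRepoSpec(key)));
      }
      Map<String, String> result = new LinkedHashMap<>();
      for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
        try {
          result.put(e.getKey(), e.getValue().get());
        } catch (ExecutionException ex) {
          // "Tunnel" the worker's checked exception back to the caller.
          if (ex.getCause() instanceof IOException) {
            throw (IOException) ex.getCause();
          }
          throw new IllegalStateException(ex.getCause());
        }
      }
      return result;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(fetchAll(List.of("rules_cc", "rules_java")));
  }
}
```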
First of all I'll admit I'm very unfamiliar with the Java parallel stream stuff, so apologies if I make any rookie mistakes here :)
From my read of things, it looks like we don't even need to have this executor service at all? We could just directly call `collect` on a parallel stream and Java would use the global fork join pool (`ForkJoinPool.commonPool()`) underneath. That's probably better than using a dedicated pool (as the global fork join pool is also used to carry virtual threads).
But I'm not sure what happens when an exception is thrown in that case. In sequential streams, I think the exception just gets propagated out, so we just need to "tunnel" it. Do you know what happens in parallel streams?
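For what it's worth, a quick experiment suggests the exception does surface at the terminal operation in the parallel case too, though possibly as a reflectively reconstructed copy of the original. A small self-contained sketch (the numbers and message are arbitrary):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ParallelStreamExceptions {
  // Returns the message of the exception that escapes the terminal
  // operation, or null if none does.
  public static String runAndCatch() {
    try {
      List.of(1, 2, 3, 4).parallelStream()
          .map(i -> {
            if (i == 3) {
              throw new IllegalStateException("boom on " + i);
            }
            return i * 2;
          })
          .collect(Collectors.toList());
      return null;
    } catch (IllegalStateException e) {
      // The worker's exception (or, when it was thrown on another thread, a
      // reflectively reconstructed copy of it with the original as cause)
      // is rethrown on the calling thread by the terminal operation. If
      // several splits throw, only one of the exceptions is reported.
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println("caught: " + runAndCatch());
  }
}
```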
The problem with using the common pool is that it has `NUM_CPUS - 1` threads by default, which can interact badly with other parallel operations relying on it, especially since we are blocking on IO.
Exceptions in parallel streams are also thrown by the terminal operation. I think that the first throw "wins", but I'm not sure about that.
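The common pool's sizing is easy to inspect directly; a quick illustrative check (the exact value can be overridden by a system property, so this is descriptive rather than definitive):

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolSize {
  public static void main(String[] args) {
    int cpus = Runtime.getRuntime().availableProcessors();
    // Unless overridden via the system property
    // java.util.concurrent.ForkJoinPool.common.parallelism, the common pool
    // targets availableProcessors() - 1 worker threads (with a floor of 1),
    // since the submitting thread is expected to pitch in as well.
    int parallelism = ForkJoinPool.commonPool().getParallelism();
    System.out.println("cpus=" + cpus + ", common pool parallelism=" + parallelism);
  }
}
```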
Thanks for the explanation. I did some more reading on parallel streams and I increasingly feel like they're not a great fit for our use case (maybe I just don't understand them well enough). My main gripe is the fact that it tries to do all the MapReduce-esque splitting and merging which is basically completely unnecessary here where we're just doing a map and basically no reduce. Plus the concurrency primitives are completely hidden and it works "magically" depending on whether you're already in a ForkJoinPool or not; and this doesn't seem to be documented anywhere in the official JDK -- I had to believe StackOverflow. (For example, I'm not confident at all whether the introduction of Loom would automatically cause parallel stream processing to use virtual threads.)
On the other hand, I'll admit that the magic isn't all bad -- just slapping `.parallel()` on the stream and having it work is pretty neat.
I'd probably have preferred to use Skyframe for this; that is, create a RepoSpecFunction/Value and just give everything to Skyframe. To avoid the restart, we could use a SkyKeyComputeState to store everything right before the repo spec computation step. WDYT?
cc @lberki
Yeah, the lack of official documentation on parallel streams' internals really hurts their usefulness. I can only hope that this is explained by the desire to switch them over to a virtual thread executor by default once available - one can dream.
I don't quite see the problem with us not using the full feature set of a MapReduce. Isn't turning a list of values into a `LinkedHashMap` still an (associative) reduction? But we do need to believe StackOverflow that streams just rely on `ForkJoinPool#fork` and thus depend on the `ForkJoinPool` implementation that we started with (`#commonPool` or a custom one).
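The pool-capturing trick that the StackOverflow answers describe can be sketched like this. Note that it relies on the undocumented behavior discussed above (stream tasks fork into whichever `ForkJoinPool` the pipeline is invoked from); the `squareSum` example itself is purely illustrative:

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;

public class CustomPoolStream {
  // Runs a parallel-stream pipeline inside a dedicated ForkJoinPool instead
  // of the common pool: submitting the whole pipeline as a task makes its
  // forked subtasks run on that pool.
  public static int squareSum(List<Integer> values, int parallelism) throws Exception {
    ForkJoinPool pool = new ForkJoinPool(parallelism);
    try {
      return pool.submit(
              () -> values.parallelStream().mapToInt(i -> i * i).sum())
          .get();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(squareSum(List.of(1, 2, 3, 4), 8)); // 1 + 4 + 9 + 16 = 30
  }
}
```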
I am fine with switching to Skyframe here though, especially as it spares anyone reading the code from having to reason about another concurrency framework.
I think I found a way to extract this work into a SkyFunction, please take a look. I am a bit unsure about how to implement `Registry`'s equality correctly, which made me worried about using it in a `SkyKey`.
After spending like 10 seconds looking at this change, I don't think `Registry` is appropriate to be included in a `SkyKey`: `SkyKey`s are supposed to be small and serializable value objects, and `IndexRegistry` keeps references to a `DownloadManager` and a `Gson` instance, which means that they are not value objects at all.

From a quick glance, it looks like each `Registry` instance comes from a URI in `ModuleFileFunction.REGISTRIES`, so maybe the URI could be used in the `SkyKey` instead?
I trimmed this down to a `ModuleKey` and a `String`.
Yep, that sounds much nicer. I'll leave the review to @Wyverald since he is much more familiar with the area (unless he indicates that he could use my brain)
> I don't quite see the problem with us not using the full feature set of a MapReduce. Isn't turning a list of values into a `LinkedHashMap` still an (associative) reduction?
That was poorly worded on my part -- what I wanted to say was that the extra considerations about splitting the stream into appropriately sized substreams so that it could be sent to child ForkJoinTasks and then merging the results together are kind of wasteful. Yes technically a "map" operation is a "reduce" operation itself too, but we really don't need a tree structure for this. Just linearly going over the keys and sending each one off for a download will suffice.
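That "map with encounter order preserved" shape can be sketched as below, with a hypothetical per-key spec computation standing in for the real download. `Collectors.toMap` with a `LinkedHashMap` supplier keeps the source list's order in the result even when entries are computed in parallel, because partial maps are merged left-to-right:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedCollect {
  // Maps each key to a value in parallel while keeping the keys' encounter
  // order in the result. (Hypothetical stand-in for computing one repo spec
  // per module key.)
  public static Map<String, String> computeSpecs(List<String> keys) {
    return keys.parallelStream()
        .collect(Collectors.toMap(
            k -> k,
            k -> "spec(" + k + ")",
            (a, b) -> a,          // no duplicate keys expected
            LinkedHashMap::new));
  }

  public static void main(String[] args) {
    System.out.println(computeSpecs(List.of("rules_cc", "rules_java", "protobuf")));
  }
}
```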
So I'm very happy with the new Skyframe-based implementation :)
Resolved (outdated) review threads on src/main/java/com/google/devtools/build/lib/bazel/bzlmod/BazelModuleResolutionFunction.java
Force-pushed from d1dfd64 to df70c95
@Wyverald I stacked this onto the profiling PR for the moment.
@lberki Do you happen to know how the value of `8` came about?
Nope, sorry. I checked the change history but there doesn't seem to be any indication as to how the author of the change that added the parallelism arrived at that number. Naively, 8 seems to be both too small and too unconfigurable, but I'm saying this with literally zero research, so take it with a grain of salt.
Resolved (outdated) review threads on src/main/java/com/google/devtools/build/lib/bazel/bzlmod/RepoSpecFunction.java and src/main/java/com/google/devtools/build/lib/bazel/bzlmod/BazelModuleResolutionFunction.java
Also please rebase; the profiling PR is merged :)
```diff
@@ -158,6 +158,7 @@ public final class SkyFunctions {
       SkyFunctionName.createHermetic("BAZEL_DEP_GRAPH");
   public static final SkyFunctionName BAZEL_LOCK_FILE =
       SkyFunctionName.createHermetic("BAZEL_LOCK_FILE");
+  public static final SkyFunctionName REPO_SPEC = SkyFunctionName.createNonHermetic("REPO_SPEC");
```
Just an observation: I noticed that `BAZEL_MODULE_RESOLUTION` was marked hermetic even though it probably wasn't before this PR (due to the registry fetches).
ah, that does look like an oversight. However in practice this shouldn't matter much since we assume existing registry contents are immutable anyway.
Computing `RepoSpec`s for all selected Bazel modules is on the critical path for the computation of the main repository mapping and thus benefits from parallelized downloads. On my machine, this change has the following effect on `bazel build //src:bazel-dev --enable_bzlmod --nobuild`:

```
before  compute main repo mapping: 8s 127ms
after   compute main repo mapping: 4s 226ms
```
nice!
Thanks for pushing me towards the Skyframe solution. A side effect of that is that trivial changes to the root module file now no longer result in any network I/O (assuming the Skyframe cache is retained during invocations).
@bazel-io flag
@bazel-io fork 6.4.0
Closes #19294. Commit 8a68310.

PiperOrigin-RevId: 559819452
Change-Id: Ieef957fcfe402c909d2863ba4a4ca3540781a56d
The changes in this PR have been included in Bazel 6.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=last_rc.