Inconsistent behavior when parallel=1 #461
Comments
Huh, great point. Will fix ASAP |
Parallel=1 behaviour comes from the following code, which was added in 2.3.0:
static <T, R, RR> Collector<T, ?, CompletableFuture<RR>> asyncCollector(Function<T, R> mapper, Executor executor, Function<Stream<R>, RR> finisher) {
    return collectingAndThen(toList(), list -> supplyAsync(() -> finisher.apply(list.stream().map(mapper)), executor));
}
Future is |
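To make the deferred mapping concrete, here is a minimal sketch that rebuilds the collector quoted above using JDK classes only. The class name `LazyMapperDemo`, the `Observation` holder, and the identity finisher (`stream -> stream`) are assumptions chosen for illustration; they are not part of parallel-collectors. The sketch counts mapper invocations right after `join()` and again after a terminal operation, and records which thread ran the mapper.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyMapperDemo {

    // Hypothetical holder for what the experiment observed.
    static final class Observation {
        final int callsAfterJoin;
        final int callsAfterTerminalOp;
        final boolean mapperRanOnCallerThread;

        Observation(int callsAfterJoin, int callsAfterTerminalOp, boolean mapperRanOnCallerThread) {
            this.callsAfterJoin = callsAfterJoin;
            this.callsAfterTerminalOp = callsAfterTerminalOp;
            this.mapperRanOnCallerThread = mapperRanOnCallerThread;
        }
    }

    // Same shape as the collector quoted in the comment above.
    static <T, R, RR> Collector<T, ?, CompletableFuture<RR>> asyncCollector(
            Function<T, R> mapper, Executor executor, Function<Stream<R>, RR> finisher) {
        return Collectors.collectingAndThen(
                Collectors.toList(),
                list -> CompletableFuture.supplyAsync(
                        () -> finisher.apply(list.stream().map(mapper)), executor));
    }

    static Observation observe() {
        AtomicInteger calls = new AtomicInteger();
        AtomicReference<Thread> mapperThread = new AtomicReference<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Assumption: the finisher hands the stream back untouched,
            // so only a lazy pipeline is built inside the executor task.
            Collector<Integer, ?, CompletableFuture<Stream<Integer>>> collector =
                    LazyMapperDemo.<Integer, Integer, Stream<Integer>>asyncCollector(
                            i -> {
                                calls.incrementAndGet();
                                mapperThread.set(Thread.currentThread());
                                return i * 2;
                            },
                            executor,
                            stream -> stream);

            Stream<Integer> result = Stream.of(1, 2, 3).collect(collector).join();
            int afterJoin = calls.get(); // the future is complete, the mapper has not run

            result.collect(Collectors.toList()); // terminal operation on the calling thread
            return new Observation(
                    afterJoin, calls.get(), mapperThread.get() == Thread.currentThread());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        Observation o = observe();
        System.out.println("mapper calls after join():      " + o.callsAfterJoin);
        System.out.println("mapper calls after terminal op: " + o.callsAfterTerminalOp);
        System.out.println("mapper ran on caller thread:    " + o.mapperRanOnCallerThread);
    }
}
```

The executor task only assembles the pipeline; the mapper runs later, on whichever thread invokes the terminal operation. If no terminal operation is ever called, side effects inside the mapper never happen.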
Yeah, I know. I even wrote an article about this behaviour of CompletableFuture 🤦♂ - I tried to outsmart the tool by providing simplified "parallel" collectors for parallelism == 1, but forgot about such edge cases... |
Initial fix is here: #462. I will see if I can come up with something better than a revert, though |
An ultimate fix might be a bit cumbersome, but it's still way lighter than reverting to the full-blown parallel collector infrastructure:
|
Released in https://github.com/pivovarit/parallel-collectors/releases/tag/2.3.2 (give it some time to propagate to Maven Central, usually around 12-24h) Thanks for the investigation and making the tool better👏 |
I have just stumbled upon this issue (running the same code on my Windows machine vs some small AWS instance running on Linux) and could not initially figure out what was going on as nothing was executed on my |
Keep in mind that there's one more overload where you can provide the parallelism of your choice. Now when I think about it, I think it was a mistake to try to be smart and have parallelism coupled to the Anyway, I'm glad those issues pop up... it means someone is using the lib :) thanks! |
I would vote for the removal. Most use cases seem to be related to parallelizing blocking IO, and it's easy to set the right parallelism if you are trying to parallelize CPU-bound tasks. |
To parallelize some side effects, I wrote the following code, which works as expected on my local machine:
However, it took me a while to figure out why it doesn't work when deployed on our cluster (Mesos & K8S). Eventually, I figured out that since release 2.3.0, parallel-collectors has inconsistent behavior between parallel=1 and parallel>1.
When parallel=1, the mapping function is only invoked when the stream returned by parallel().join() is consumed by a terminal operation (and some, like count, won't invoke it at all, as the mapping function can be optimized away).
When parallel>1, mapping is always performed eagerly in the provided executor (no need to consume the stream).
This seems to be a bug, as the Javadoc says that parallel computation will be performed on a custom executor, which is not true in this case. It is also dangerous, as it seems too easy to write the same bogus code as I did.
I would propose to revert to 2.2.0 behaviour.
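For comparison, the eager behaviour described for parallel>1 can be approximated for a single task by materializing the mapped values inside the executor before the finisher sees them. This is only a sketch under stated assumptions: `EagerMapperDemo`, `eagerAsyncCollector`, and `Observation` are hypothetical names, and this is not necessarily how 2.2.0, #462, or any released version of the library implements it.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EagerMapperDemo {

    // Hypothetical holder for what the experiment observed.
    static final class Observation {
        final int callsAfterJoin;
        final boolean mapperRanOnCallerThread;

        Observation(int callsAfterJoin, boolean mapperRanOnCallerThread) {
            this.callsAfterJoin = callsAfterJoin;
            this.mapperRanOnCallerThread = mapperRanOnCallerThread;
        }
    }

    // Illustrative variant: the mapped values are collected into a list inside
    // the executor task, so the mapper has already run when the future completes.
    static <T, R, RR> Collector<T, ?, CompletableFuture<RR>> eagerAsyncCollector(
            Function<T, R> mapper, Executor executor, Function<Stream<R>, RR> finisher) {
        return Collectors.collectingAndThen(
                Collectors.toList(),
                list -> {
                    CompletableFuture<List<R>> mapped = CompletableFuture.supplyAsync(
                            () -> list.stream().map(mapper).collect(Collectors.toList()),
                            executor);
                    // The finisher only sees a stream over precomputed results.
                    return mapped.thenApply(results -> finisher.apply(results.stream()));
                });
    }

    static Observation observe() {
        AtomicInteger calls = new AtomicInteger();
        AtomicReference<Thread> mapperThread = new AtomicReference<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Collector<Integer, ?, CompletableFuture<Stream<Integer>>> collector =
                    EagerMapperDemo.<Integer, Integer, Stream<Integer>>eagerAsyncCollector(
                            i -> {
                                calls.incrementAndGet();
                                mapperThread.set(Thread.currentThread());
                                return i * 2;
                            },
                            executor,
                            stream -> stream);

            // The returned stream is deliberately never consumed.
            Stream.of(1, 2, 3).collect(collector).join();
            return new Observation(
                    calls.get(), mapperThread.get() == Thread.currentThread());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        Observation o = observe();
        System.out.println("mapper calls after join():   " + o.callsAfterJoin);
        System.out.println("mapper ran on caller thread: " + o.mapperRanOnCallerThread);
    }
}
```

The trade-off is an intermediate list per collection, in exchange for side effects that run on the supplied executor whether or not the caller consumes the resulting stream.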