A legendary tale of why we should make pmap default to using CachingPool (#33892)

Once upon a time, there was a young Julia user first getting started
with parallelism.
And she found it fearsomely slow.
And so she did investigate, and she did illuminate her issue.
Her closures, they were being reserialized again and again.
And so this young woman, she opened issue #16345.
Lo and behold, a noble soul did come and resolve it,
by making the glorious `CachingPool()` in #16808.

Three long years later this Julia user did bravely return to the world of
parallelism, with many battle-worn scars.
And once more she did face the demon that is `pmap` over closures.
But to her folly, she felt no fear, for she believed the demon to be
crippled and chained by the glorious `CachingPool`.
Fearlessly, she threw her closure over 2GB of data into the maw of the
demon `pmap`.
But alas, alas indeed, she was wrong.
The demon remained unbound, and it slew her, and slew her again.
100 times did it slay her, for 101 items was the user iterating upon.
For the glorious chains of the `CachingPool()` remained unused, left
aside in the user's tool chest, forgotten.
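
For readers who wish to see the chains in use, here is a minimal sketch of passing a `CachingPool` to `pmap` explicitly. The worker count, the array size, and the names `big_data`, `data`, `pool`, and `results` are assumptions made for illustration; they do not come from this commit. The `let` block follows the pattern in the `CachingPool` documentation, since a global variable is not captured by a closure.

```julia
using Distributed

# Assumption: two local workers are enough to illustrate the idea.
addprocs(2)

# Hypothetical stand-in for the large data in the tale (kept small here).
big_data = rand(10^6)

# A CachingPool sends each closure to a given worker only once.
pool = CachingPool(workers())

# Bind the global to a local so the closure really captures the data.
results = let data = big_data
    pmap(i -> sum(data) + i, pool, 1:101)
end

# Drop the cached closures from the workers once finished.
clear!(pool)
```

With the change in this commit, calling `pmap(f, c)` without a pool builds `CachingPool(workers())` on the caller's behalf. Passing `default_worker_pool()` explicitly should still select the previous default.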
oxinabox authored Jul 28, 2023
1 parent 503d5b4 commit 4825a0c
Showing 2 changed files with 5 additions and 3 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -34,6 +34,8 @@ New library features
Standard library changes
------------------------

+* `pmap` now defaults to using a `CachingPool` ([#33892]).

#### Package Manager

#### LinearAlgebra
6 changes: 3 additions & 3 deletions stdlib/Distributed/src/pmap.jl
@@ -40,8 +40,8 @@ For multiple collection arguments, apply `f` elementwise.
Note that `f` must be made available to all worker processes; see
[Code Availability and Loading Packages](@ref code-availability) for details.
-If a worker pool is not specified, all available workers, i.e., the default worker pool
-is used.
+If a worker pool is not specified all available workers will be used via a [`CachingPool`](@ref).
By default, `pmap` distributes the computation over all specified workers. To use only the
local process and distribute over tasks, specify `distributed=false`.
@@ -153,7 +153,7 @@ function pmap(f, p::AbstractWorkerPool, c; distributed=true, batch_size=1, on_er
end

pmap(f, p::AbstractWorkerPool, c1, c...; kwargs...) = pmap(a->f(a...), p, zip(c1, c...); kwargs...)
-pmap(f, c; kwargs...) = pmap(f, default_worker_pool(), c; kwargs...)
+pmap(f, c; kwargs...) = pmap(f, CachingPool(workers()), c; kwargs...)
pmap(f, c1, c...; kwargs...) = pmap(a->f(a...), zip(c1, c...); kwargs...)

function wrap_on_error(f, on_error; capture_data=false)

2 comments on commit 4825a0c

@nanosoldier

Executing the daily package evaluation, I will reply here when finished:

@nanosoldier runtests(isdaily = true)

@nanosoldier

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.