functions are serialised at every batch with pmap and pgenerate #16345
Comments
Work Arounds

The goal here is to cut the serialisation down to once per worker.

Presending (Does Not Work, and IDK why)

The idea is to first send the function into remote Futures. I really thought this would work, but it does not. It does not help; in fact it makes things worse, because there is one extra call initially, per worker.

Global Variable Hack (Works, but leaks memory and is scary)

This is kinda scary. I feel unclean.
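A rough sketch of what that global-variable hack can look like (a reconstruction, not the code from the comment; `pmap_cached` and `__cached_f` are invented names):

```julia
using Distributed

# Ship the (possibly data-heavy) closure to each worker exactly once, stash it
# in a global there, then pmap a tiny wrapper that only refers to that global.
function pmap_cached(f, xs)
    @sync for p in workers()
        # the closure `f` is serialised here, once per worker
        @async remotecall_wait(Core.eval, p, Main, :(__cached_f = $f))
    end
    # the function handed to pmap captures nothing big, so resending it is cheap
    pmap(x -> Base.invokelatest(Main.__cached_f, x), xs)
    # note: `__cached_f` is never cleared on the workers, hence the memory leak
end
```

Called as `pmap_cached(x -> expensive(x, bigdata), items)`, the data captured by the closure crosses the wire once per worker instead of once per batch.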
For this particular scenario we could support a …

Other alternatives (along the lines you are exploring): …

cc: @samoconnor
A very different work-around is what I thought was a good idea a couple of days ago: http://codereview.stackexchange.com/questions/128058/using-remotechannels-for-an-parallel-iterable-map. The idea being that when you start, you create on each worker a task that loops forever, reading from one channel and writing to another. The infinite loop ends when a channel is closed. Problem with …
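A hedged sketch of that channel-based pattern (a reconstruction under an assumed name, `channel_map`, not the code from the linked codereview post):

```julia
using Distributed

# Each worker runs one long-lived loop, so `f` is serialised once per worker;
# afterwards only the items and results travel over the channels.
function channel_map(f, xs)
    xs      = collect(xs)
    jobs    = RemoteChannel(() -> Channel{Tuple{Int,Any}}(32))
    results = RemoteChannel(() -> Channel{Tuple{Int,Any}}(32))

    for p in workers()
        remote_do(p, f, jobs, results) do f, jobs, results
            while true
                item = try
                    take!(jobs)        # blocks; throws once `jobs` is closed and empty
                catch
                    break              # channel closed: end the loop
                end
                i, x = item
                put!(results, (i, f(x)))
            end
        end
    end

    @async begin                       # feed the jobs, then signal completion
        for (i, x) in enumerate(xs)
            put!(jobs, (i, x))
        end
        close(jobs)
    end

    out = Vector{Any}(undef, length(xs))
    for _ in eachindex(xs)             # collect results back into input order
        i, y = take!(results)
        out[i] = y
    end
    out
end
```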
We should probably remember which functions we have sent to which workers and automatically avoid duplicating.

Not a bad idea. The mapping function is typically an anonymous function.

Thinking in more detail, closures might be a separate issue. Internal to the serializer, we could avoid re-sending the code for functions, which is not very big but is slow to serialize. Avoiding re-sending closure data is harder, since it can be arbitrarily big and will need a cache management strategy, as you point out.
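The caching idea can be sketched roughly like this (an illustration only, with invented names `sent` and `cached_remotecall_fetch`; the eventual implementation in Base took a different form):

```julia
using Distributed

# Remember, per (worker, function) pair, a remote reference to an already-shipped
# copy of the function, so the closure and its captured data cross the wire once.
const sent = Dict{Tuple{Int,UInt},Future}()

function cached_remotecall_fetch(f, p::Int, args...)
    ref = get!(sent, (p, objectid(f))) do
        remotecall(identity, p, f)          # ship `f` to worker `p` once
    end
    # on later calls only the small Future and the arguments are serialised
    remotecall_fetch((rf, a...) -> Base.invokelatest(fetch(rf), a...), p, ref, args...)
end
```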
So I improved my hack around the issue to actually only serialize once. This, as with the last one, is twice as fast as pmap in the test from the original post.
Closed by #16808.
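For reference, the `CachingPool` added by that PR is used roughly like this (a sketch; `bigdata`, `cached_run`, and the sizes are made up):

```julia
using Distributed
addprocs(4)

function cached_run()
    bigdata = rand(10^6)                 # stand-in for a large captured object
    pool = CachingPool(workers())
    results = pmap(pool, 1:100) do i
        sum(bigdata) + i                 # the closure is shipped once per worker
    end
    clear!(pool)                         # drop the cached closures on the workers
    results
end
```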
…ool (#33892) Once upon a time, there was a young julia user first getting started with parallelism. And she found it fearsomely slow. And so she did investigate, and she did illuminate upon her issue. Her closures, they were being reserialized again and again. And so this young woman, she opened an issue #16345. Lo and behold, a noble soul did come and resolve it, by making the glorious `CachingPool()` in #16808. 3 long years later this julia user did bravely return to the world of parallelism, with many battle-worn scars. And once more she did face the demon that is `pmap` over closures. But to her folly, she felt no fear, for she believed the demon to be crippled and chained by the glorious `CachingPool`. Fearlessly, she threw her closure over 2GB of data into the maw of the demon `pmap`. But alas, alas indeed, she was wrong. The demon remained unbound, and it slew her, and slew her again. 100 times did it slay her, for 101 items was the user iterating upon. For the glorious chains of the `CachingPool()` remained unused, left aside in the user's tool chest, forgotten.
So I was trying to work out why my parallel code was taking so long.
After all, I only sent the big data structures once, through a closure as the function that was mapped over.
That serialisation should happen once (I thought), since the function is constant.
Not so.
MWE:
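The MWE itself is not preserved on this page. A minimal sketch of the kind of setup described (a reconstruction with invented names such as `Payload`, `run_mwe`, and the log file path):

```julia
using Distributed
addprocs(4)

@everywhere begin
    using Distributed, Serialization

    # Wrap the big data so that every serialisation of it can be logged.
    struct Payload
        data::Vector{Float64}
    end

    function Serialization.serialize(s::AbstractSerializer, p::Payload)
        open("serialize_log.txt", "a") do io
            println(io, "Payload serialised on process $(myid())")
        end
        # defer to the generic serializer for the actual bytes
        invoke(Serialization.serialize, Tuple{AbstractSerializer,Any}, s, p)
    end
end

function run_mwe()
    payload = Payload(rand(10^6))     # the "big data structure"
    pmap(1:16) do i
        sum(payload.data) + i         # the closure captures `payload`
    end
end
```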
Then running the function and counting the lines in the log:
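Continuing the sketch above:

```julia
rm("serialize_log.txt", force = true)
run_mwe()
countlines("serialize_log.txt")   # the issue reports 17 serialisations for pmap
```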
So it was serialized 17 times for pmap.
If I switch to pgenerate it is 18 times (so about the same). I believe that after the batchsplit step is done, there is one serialisation of the closure per batch that was sent.
It only needs to be serialized once.
(In my non-MWE code, it is happening millions of times, and takes 6 seconds apiece...)
I suspect this is already known, but I can't find an issue for it, so maybe not.
See also: pgenerate behaves differently if WorkerPool not specified #16322