user defined transports #9434
Documentation has been added. In the absence of any concerns I'll merge this over the next day or two. cc: @JeffBezanson
Answering my own question, they don't. Could they?
Wouldn't that introduce new dependencies on Base, such as zeromq? Perhaps the test can be conditional on finding zeromq.
Good point. Would the simple one be quick and painless to test, or does it require some infrastructure?
Good idea, I'll add the simple example (unix domain sockets) to test. And maybe the 0mq one with a dummy module at least to ensure that the code loads.
should probably be
And would that test be possible in environments with limited/no internet access like distribution buildbots? As long as it's not more demanding in resources than our current socket or parallel tests it should be fine.
As long as networking is enabled, we can test by starting just 2 additional workers - so that would be a total of 3 julia processes. |
@tkelman - could you take a look now? I had to add a line to the Makefile to copy these examples to the build doc dir. The simple manager is run |
LGTM. Not sure whether this could cause any issues on distribution buildbots, but there's an easy way to find out. I might've done the temporary-file test using a |
# code loads using a stub module definition for ZMQ.
zmq_found=true
try
using zmq
wait, this should probably be capitalized
Yes, good catch.
This builds on #9309 and supports user definable transports.
It supersedes #9046; the discussion there outlines much of the motivation for this PR.
To implement this, custom cluster managers would need to provide their own `connect(manager::FooManager, pid::Integer, config::WorkerConfig)` method.

`connect` should return a pair of `AsyncStream` objects, one for reading data sent from worker `pid`, and the other for writing data that needs to be sent to worker `pid`. Custom cluster managers can use an in-memory `BufferStream` as the plumbing to ferry data between the custom, non-`AsyncStream` transport and the Julia parallel infrastructure.

A `BufferStream` wraps a `PipeBuffer` and condition variables to make a waitable stream. `examples/clustermanager/0mq` is an example of how they are used to set up a star network with a 0MQ broker in the middle.

`connect` is optional, and the default implementation is based on TCP as a transport mechanism.

Another optional method is `kill(manager::FooManager, pid::Int, config::WorkerConfig)`, which is called to remove a worker from the cluster.

Two example implementations are provided:
- `examples/clustermanager/simple` shows the use of unix domain sockets as transport
- `examples/clustermanager/0mq` shows the use of 0MQ as transport

One thought is to move the examples to the package `ClusterManagers.jl` instead of adding them here. Have kept them in this PR so that folks can have a look. Will move them based on feedback.

Documentation still needs to be updated.
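
For readers who want to see the shape of the two optional methods described above, here is a minimal sketch. It is not the code from this PR. It assumes a current Julia release, where this interface lives in the `Distributed` standard library and the returned streams are plain `IO` objects rather than the `AsyncStream` type named above. `FooManager` is a hypothetical type, no real transport is wired in, and `pid` is typed as `Int` (not `Integer`) so the method is not ambiguous with the default TCP-based `connect`.

```julia
using Distributed
import Distributed: connect, kill

# Hypothetical manager type for illustration only. A real cluster manager
# would also implement `launch` and `manage`.
struct FooManager <: ClusterManager end

# Return a pair of streams: the first carries data read from worker `pid`,
# the second accepts data to be written to worker `pid`.
function connect(manager::FooManager, pid::Int, config::WorkerConfig)
    from_worker = Base.BufferStream()  # filled by the custom transport
    to_worker = Base.BufferStream()    # drained by the custom transport
    # Assumption: a real manager would start tasks here that copy bytes
    # between its own transport (e.g. a 0MQ socket) and these two streams.
    return (from_worker, to_worker)
end

# Called when a worker is removed from the cluster.
function kill(manager::FooManager, pid::Int, config::WorkerConfig)
    # Assumption: a real manager would tell the worker to exit and release
    # any transport resources held for it. This sketch does nothing.
    return nothing
end
```

Using in-memory streams on both sides keeps the parallel infrastructure unaware of the underlying transport; only the forwarding tasks, omitted here, would need to know about it.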