Support of parallel launch? #149
Comments
Sounds interesting! Can you suggest a suitable design?
Well, I can think of a few ways. Maybe have an option to construct multiple pipelines manually and have them run in parallel. Or take an input, split it based on some criterion (such as lines), then run a pipeline in parallel for each subpart. Optionally, use a thread pool to limit the number of simultaneous processes. Then there might be a need to join their outputs. Maybe that can be done by lines, in order or out of order. There are quite a few possibilities, but I guess something like a parallel
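To make the "split by lines, limit with a pool, join the outputs" idea concrete, here is a minimal sketch using only the Go standard library. The helper name mapLinesParallel is hypothetical and is not part of script; it only illustrates the shape such a feature might take, joining results out of order.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"sync"
)

// mapLinesParallel is a hypothetical helper (not part of script).
// It splits input into lines, applies fn to each line using at most
// `workers` goroutines, and returns the results in completion order,
// which is not necessarily input order.
func mapLinesParallel(input string, workers int, fn func(string) string) []string {
	if workers < 1 {
		workers = 1 // with zero workers nothing would ever drain the queue
	}
	lines := make(chan string)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for line := range lines {
				results <- fn(line)
			}
		}()
	}

	// Feed the workers, then close results once every worker has finished.
	go func() {
		sc := bufio.NewScanner(strings.NewReader(input))
		for sc.Scan() {
			lines <- sc.Text()
		}
		close(lines)
		wg.Wait()
		close(results)
	}()

	// Join: collect results as they arrive.
	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := mapLinesParallel("a.txt\nb.txt\nc.txt\n", 2, strings.ToUpper)
	fmt.Println(len(out), "results (order may vary)")
}
```

The pool size bounds how many calls to fn run at once; in a real pipeline fn would presumably launch an external command rather than transform a string.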
Great! Let's take a concrete example to help us think about what this might look like as code. The particular use case you mentioned can be done with:
    script.ListFiles(".").SHA256Sums().Stdout()
Can you think of another reason you might want to run a bunch of parallel
Hmmm, does it already work in parallel? I haven't checked. Say I have a heavily single-process program:
    script.ListFiles().Parallel().Exec("./a.out").Stdout()
Or, more complicated, if we would normally do
    p := script.Exec("./a.out").Exec("./b.out")
    script.ListFiles().Parallel(p).Stdout()
Or
    script.ListFiles().Parallel().Exec("./a.out").Exec("./b.out").Join().Stdout()
Reordering output might not necessarily be a good idea, but if a program generates a finite, small amount of output, you can certainly use a reorder buffer to ensure the output appears as if you had run the program serially (using the first API for simplicity):
    script.ListFiles().Parallel().Exec("./a.out").Reorder().Stdout()
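As a rough illustration of the reorder buffer idea, the sketch below runs jobs concurrently but emits each result in input order as soon as all earlier results are available. The names runOrdered and result are hypothetical and standard-library only; nothing here is an existing script API.

```go
package main

import (
	"fmt"
	"sync"
)

// result pairs a job's output with its position in the input.
type result struct {
	index int
	text  string
}

// runOrdered is a hypothetical helper (not part of script). It applies fn
// to every input using at most `workers` goroutines, and calls emit with
// the outputs in input order. Results that finish early wait in a buffer.
func runOrdered(inputs []string, workers int, fn func(string) string, emit func(string)) {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	results := make(chan result)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results <- result{index: i, text: fn(inputs[i])}
			}
		}()
	}

	// Hand out job indices, then close results when all workers are done.
	go func() {
		for i := range inputs {
			jobs <- i
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	pending := make(map[int]string) // reorder buffer: results that arrived early
	next := 0                       // index of the next result to emit
	for r := range results {
		pending[r.index] = r.text
		// Flush every consecutive result starting at next.
		for {
			text, ok := pending[next]
			if !ok {
				break
			}
			emit(text)
			delete(pending, next)
			next++
		}
	}
}

func main() {
	inputs := []string{"a", "b", "c"}
	runOrdered(inputs, 3,
		func(s string) string { return "done " + s },
		func(s string) { fmt.Println(s) })
}
```

Note that the buffer can grow if an early job is slow while later ones finish, which is why this approach fits commands with small, finite output.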
As you say, output ordering is the issue. If you buffer all the output so that you can reorder it in sequence, then there's no point generating it concurrently; you still have to wait for it all to be done before you can see any of it. With the existing sequential code, you do see the first line of output as soon as it is available. On the other hand, some operations don't produce output, or it doesn't matter what order the output arrives in. For example, if you wanted to compress a bunch of large files, it would make sense to have all these compute-intensive operations running in parallel.
    script.ListFiles().ExecParallel("./a.out").Stdout()
I think the implementation would likely be via
The second API I mention is if you want to do
Reordering is probably a niche use case, and it does require finite, small output from the command to work well. I would probably not consider it a frequently used function.
Is it possible to add parallel invocation of commands? Something like GNU Parallel:
Of course, using parallel directly is a solution... but some simple pooling in this package might also be helpful.