Use an user-provided "backend" (threadpool, event loop, ...) #93

Open
gnzlbg opened this issue Sep 7, 2016 · 14 comments

@gnzlbg

gnzlbg commented Sep 7, 2016

Is it possible to tell rayon to use a user-provided thread pool (e.g. one created in my crate's main function)?

For example, if I had a crate using tokio in which I also wanted to use rayon, I would like a single thread pool/event loop/task manager to serve as the backend for both (and to do work stealing for both), instead of two competing ones.

@nikomatsakis
Member

It is not currently possible, but one of my big To Do items for rayon is to factor out the backend into a distinct crate.

@iduartgomez

iduartgomez commented May 19, 2017

From looking at the source, I wonder whether, as an alternative (or as an additional/temporary workaround), it would be possible to have finer control at runtime over the number of threads available for execution at a given time (i.e. setting it before running a parallel iterator chain) through a function call (which would in turn modify the Registry).

This would be really helpful for load balancing in an application where several live threads compete for resources across different pools (other than rayon's), for example. It's a bit hackish, but still something.

(Edit: never mind, cleared up by cuviper. The ideal solution would be IMO to have the same functionality provided by the global thread pool provided by instantiated ThreadPool types so we would have total control over how we split resources; a good step would be to have something more akin to https://github.com/frewsxcv/rust-threadpool incorporated into rayon.)

@cuviper
Member

cuviper commented May 19, 2017

The ideal solution would be IMO to have the same functionality provided by the global thread pool provided by instantiated ThreadPool types so we would have total control over how we split resources

What do you mean by this? Rayon's global thread pool is literally just a global ThreadPool instance. You can ThreadPool::install your parallel iterator into whatever pool you like.

@iduartgomez

iduartgomez commented May 19, 2017

@cuviper I didn't see how to do it in the documentation (maybe it's just that: a lack of documentation/examples?), but for example:

hash_map
    .par_iter()
    .for_each(|(k, v)| { /* do stuff ... */ });

works, but if I use the install method I can't do the same (the install closure's signature is FnOnce() -> R + Send, so it does not expect arguments). That's what I was thinking of when I said 'have the same functionality'.

I'm probably missing something obvious.

@cuviper
Member

cuviper commented May 20, 2017

The install closure has no arguments, but you can capture anything you like. For instance:

let hash_map = get_map();
my_pool.install(|| {
    // this implicitly captures a reference to `hash_map`
    hash_map.par_iter().for_each(|(k, v)| {
        // do stuff ...
    })
});

@nikomatsakis
Member

https://github.com/nikomatsakis/rayon/pull/353 makes it possible to have a custom backend for parallel iterators, at least.

@alexcrichton

alexcrichton commented Oct 10, 2018

FWIW, I believe I'm running into this with the wasm32-unknown-unknown target, where std::thread::spawn doesn't work but where wasm-bindgen lets us get something that looks similar-ish to thread spawning. As things stand, Rayon can't spawn any threads there because it won't work, but I could either empower it to spawn threads or give it a pool of threads to draw from.

(just wanted to chime in with another use case!)

@nikomatsakis
Member

The rustc-rayon fork also adds a "custom main function" -- seems like if you could specify the "spawn thread" function, which @alexcrichton would like, then you could also control the "main". The main difference is that rustc-rayon would also like to know the thread index.

@alexcrichton

Oh that could work! (I think?)

In wasm we'll for sure have a way to get the thread index.

@nikomatsakis
Member

nikomatsakis commented Oct 10, 2018

What I mean is:

I was envisioning that we could give you (on the thread-pool) the ability to specify the spawn function. When spawning threads, we would call the function with an integer (thread index) plus a closure that you are supposed to invoke. The default function would just be |_index, body| std::thread::spawn(body), something like that.
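To make the shape of that proposal concrete, here is a minimal sketch using only std. It is not rayon's API: `WorkerBody`, `start_workers`, and `default_spawn` are hypothetical names, and the worker body only reports its index instead of running a real scheduling loop.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical type: the closure a worker thread is expected to invoke.
type WorkerBody = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical stand-in for a pool's startup code: calls `spawn` once per
/// worker with the thread index and the body to run, then returns the
/// indices of the workers that actually started (sorted).
fn start_workers<S>(num_threads: usize, mut spawn: S) -> Vec<usize>
where
    S: FnMut(usize, WorkerBody),
{
    let (tx, rx) = mpsc::channel();
    for index in 0..num_threads {
        let tx = tx.clone();
        // A real pool would run its main loop here; this body only reports in.
        let body: WorkerBody = Box::new(move || {
            tx.send(index).unwrap();
        });
        spawn(index, body);
    }
    // Drop the original sender so the receiver ends once every body has run.
    drop(tx);
    let mut started: Vec<usize> = rx.iter().collect();
    started.sort();
    started
}

/// The default described above: ignore the index and use std::thread::spawn.
fn default_spawn(_index: usize, body: WorkerBody) {
    thread::spawn(body);
}

fn main() {
    let started = start_workers(4, default_spawn);
    println!("workers started: {:?}", started);
}
```

A target without std::thread::spawn could pass its own function in place of `default_spawn`, as long as that function eventually invokes the body on some thread.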

@alexcrichton

We could get that to work!

@cuviper
Member

cuviper commented Oct 23, 2018

@alexcrichton Ultimately, wasm should just implement std::thread normally, right? In this case, it feels like a custom spawn is just a stopgap measure while wasm works out its thread story.

@nikomatsakis

The default function would just be |_index, body| std::thread::spawn(body), something like that.

Our current spawn loop looks like this:

        for (index, worker) in workers.into_iter().enumerate() {
            let registry = registry.clone();
            let mut b = thread::Builder::new();
            if let Some(name) = builder.get_thread_name(index) {
                b = b.name(name);
            }
            if let Some(stack_size) = builder.get_stack_size() {
                b = b.stack_size(stack_size);
            }
            if let Err(e) = b.spawn(move || unsafe { main_loop(worker, registry, index, breadth_first) }) {
                return Err(ThreadPoolBuildError::new(ErrorKind::IOError(e)))
            }
        }

If we did this with a user's spawn function, we'd need arguments for the name, the stack_size, and the spawn closure, but the last is an FnOnce which makes callbacks awkward. I guess we can also pass the index as you suggest, but I'm not sure what they would use it for.

We could instead define a trait like ThreadBuilder, with methods matching std::thread::Builder as needed. Only the spawn function is really required, and the name and stack_size could probably just be defaulted as no-ops.
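A rough sketch of what such a trait might look like, using only std. The names `ThreadBuilder` (as a trait), `StdBuilder`, `PlainBuilder`, `Body`, and `observed_thread_name` are assumptions for illustration and not an existing rayon interface; only `spawn` is required, while `name` and `stack_size` default to no-ops.

```rust
use std::io;
use std::sync::mpsc;
use std::thread;

// Hypothetical type: the closure the spawned thread must run.
type Body = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical trait mirroring the parts of std::thread::Builder a pool needs.
trait ThreadBuilder: Sized {
    /// Default: ignore the requested name.
    fn name(self, _name: String) -> Self {
        self
    }
    /// Default: ignore the requested stack size.
    fn stack_size(self, _size: usize) -> Self {
        self
    }
    /// The only required method: start a thread that runs `body`.
    fn spawn(self, body: Body) -> io::Result<()>;
}

/// Forwards every setting to std::thread::Builder.
struct StdBuilder(thread::Builder);

impl ThreadBuilder for StdBuilder {
    fn name(self, name: String) -> Self {
        StdBuilder(self.0.name(name))
    }
    fn stack_size(self, size: usize) -> Self {
        StdBuilder(self.0.stack_size(size))
    }
    fn spawn(self, body: Body) -> io::Result<()> {
        self.0.spawn(body).map(|_handle| ())
    }
}

/// A backend that cannot configure threads and relies on the no-op defaults.
struct PlainBuilder;

impl ThreadBuilder for PlainBuilder {
    fn spawn(self, body: Body) -> io::Result<()> {
        thread::spawn(body);
        Ok(())
    }
}

/// Spawns one worker through `builder` and returns the name that thread sees.
fn observed_thread_name<B: ThreadBuilder>(builder: B, name: &str) -> Option<String> {
    let (tx, rx) = mpsc::channel();
    let body: Body = Box::new(move || {
        let seen = thread::current().name().map(String::from);
        tx.send(seen).unwrap();
    });
    builder
        .name(name.to_string())
        .stack_size(256 * 1024)
        .spawn(body)
        .expect("failed to spawn worker");
    rx.recv().expect("worker exited without reporting")
}

fn main() {
    println!("{:?}", observed_thread_name(StdBuilder(thread::Builder::new()), "worker-0"));
    println!("{:?}", observed_thread_name(PlainBuilder, "worker-0"));
}
```

In this sketch the pool's spawn loop stays the same for both builders: the std-backed one honors the name, and the plain one silently ignores it.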

@alexcrichton

@cuviper to me at least it's not actually clear whether std::thread will ever be implementable with wasm. I suspect it will happen eventually, but it will likely remain unimplemented for the next few years. The current proposal is too minimal to implement std::thread as-is, but there are other future proposals/ideas which may empower it.

I think the ideal interface for wasm would be something along the lines of "rayon, you can control this thread for some time" where a thread sort of opts-in to being a rayon worker thread. Requiring that rayon is still the one to spawn the thread may be too restrictive still, but I don't mind testing out to see if it's the case!
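A hedged sketch of that "opt-in" shape, using only std: the host creates the threads (std::thread stands in here for whatever wasm-bindgen would provide) and each one lends itself to a shared queue until the queue closes. `JobQueue`, `run_as_worker`, and `run_demo` are hypothetical names, and this simple shared queue does no work stealing.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical type: one unit of work handed to a lent thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical shared queue that lent threads pull jobs from.
#[derive(Clone)]
struct JobQueue {
    rx: Arc<Mutex<Receiver<Job>>>,
}

impl JobQueue {
    /// Called by a thread the pool did not spawn: run jobs until every
    /// sender is dropped, then return how many jobs this thread ran.
    fn run_as_worker(&self) -> usize {
        let mut ran = 0;
        loop {
            // The lock guard is released at the end of this statement,
            // so the job itself runs without holding the lock.
            let next = self.rx.lock().unwrap().recv();
            match next {
                Ok(job) => {
                    job();
                    ran += 1;
                }
                Err(_) => return ran, // queue closed
            }
        }
    }
}

/// Lends `num_threads` host-created threads to the queue, submits
/// `num_jobs` jobs, and returns (jobs run by workers, counter value).
fn run_demo(num_threads: usize, num_jobs: usize) -> (usize, usize) {
    let (tx, rx) = mpsc::channel::<Job>();
    let queue = JobQueue { rx: Arc::new(Mutex::new(rx)) };
    let counter = Arc::new(AtomicUsize::new(0));

    // The host, not the pool, creates these threads.
    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let queue = queue.clone();
            thread::spawn(move || queue.run_as_worker())
        })
        .collect();

    for _ in 0..num_jobs {
        let counter = Arc::clone(&counter);
        tx.send(Box::new(move || {
            counter.fetch_add(1, Ordering::SeqCst);
        }))
        .unwrap();
    }
    drop(tx); // closing the queue lets the lent threads return

    let ran: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    (ran, counter.load(Ordering::SeqCst))
}

fn main() {
    let (ran, count) = run_demo(2, 8);
    println!("jobs run: {}, counter: {}", ran, count);
}
```

The point of the design is that the pool never calls a spawn function at all; it only needs a way to hand work to whichever threads have opted in.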

@zopsicle

I was envisioning that we could give you (on the thread-pool) the ability to specify the spawn function.

I think this would still be rather high-level and insufficient for some applications. Interoperability with the Windows thread pool, libdispatch, or other thread pools that have a task submission API, and that are also used by OpenMP implementations, MSVC's C++ std::async, and many other libraries, would allow all of these tools to submit tasks to the same thread pool and interoperate seamlessly. I'm not sure whether rayon::join stealing work when one of the tasks finishes early would work well with that model, though.
