-
Notifications
You must be signed in to change notification settings - Fork 507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the ability to customize thread spawning #636
Conversation
cc #93 -- this isn't a total custom backend, but at least it enables custom threads. |
Oops, my refactoring on |
Looks perfect for us, thanks! |
I'm considering that maybe we should go ahead and provide a scoped pool more directly, like: impl ThreadPoolBuilder {
fn build_scoped<W, F, R>(self, wrapper: W, with_pool: F) -> Result<R, ThreadPoolBuildError>
where
W: Fn(ThreadBuilder) + Sync, // expected to call `run()`
F: FnOnce(&ThreadPool) -> R,
{
// `self.spawn` in a `crossbeam::scope`
}
} i.e. pretty close to @Zoxc's |
OK, I added One other thing I would consider is whether the spawn functions should return a join handle, but that's not easy to design in a generic way. We'd want to support |
7912c0c
to
09f58e3
Compare
rayon-core/src/lib.rs
Outdated
/// if the pool is leaked. | ||
pub fn spawn<F>(self, spawn: F) -> Result<ThreadPool, ThreadPoolBuildError> | ||
where | ||
F: FnMut(ThreadBuilder) -> io::Result<()>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want FnMut
here? IIUC, this implies that we are spawning the threads serially, is that a guarantee we want to make?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed in our meeting today, I think we could maybe make this more of a "field" of the builder (rather than being something you mmust specify when the builder is built), but it would require adding a defaulted type parameter to ThreadPoolBuilder
. The idea would be something like this. First, we add a trait SpawnFn
and a type that implements it and has the default behavior:
trait SpawnFn {
}
struct DefaultSpawnFn;
impl<T> SpawnFn for T
where
T: FnMut(ThreadBuilder) -> io::Result<()>
{
}
then we would have
struct ThreadPoolBuilder<S: SpawnFn = DefaultSpawnFn> {
...,
spawn_fn: S
}
and
impl<S> ThreadPoolBuilder<S> {
fn spawn<F>(self, spawn_fn: F) -> ThreadPoolBuilder<F> {
ThreadPoolBuilder { spawn_fn, ..self } // wouldn't actually work, but you get the idea
}
}
...or something like that, anyway.
I guess you could also model it by having the result of spawn
return a wrapper around a ThreadPoolBuilder
, which would be 100% backwards compatible but would require mirroring a lot of API surface.
Why do this? Mostly because it feels more like a 'builder' then -- and if we add further options, they might work in a similar way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tokio's thread pool builder has the around_worker
method - it might be helpful to look at its signature for inspiration. Ignore the &mut Enter
argument because it's irrelevant here. The closure is allocated on the heap as Arc<Fn(&Worker, &mut Enter) + Send + Sync>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want
FnMut
here? IIUC, this implies that we are spawning the threads serially, is that a guarantee we want to make?
It's hard to see parallel thread spawning for WASM, where we're relying on them to spawn threads in the first place. However, maybe we can actually do this join
-style where each new thread is also responsible for spawning half of its remaining peers... interesting!
@alexcrichton would it work for WASM if this spawn callback were Sync + Fn
? I think this would imply that you'd have web workers potentially spawning more web workers -- is that allowed? Otherwise, if WASM can only do this from the main thread, we should probably just stick with FnMut
.
As we discussed in our meeting today, I think we could maybe make this more of a "field" of the builder (rather than being something you m_must_ specify when the builder is built), but it would require adding a defaulted type parameter to
ThreadPoolBuilder
.
I'll play with that idea.
Tokio's thread pool builder has the
around_worker
method
Note that around_worker
requires 'static
, which doesn't help scoped use cases. This is pretty similar to rustc-rayon
's custom main_handler
, and then their scoped_pool
uses unsafe casts to hide the lifetimes. The scoped case also doesn't allow it to return an interim builder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, workers can spawn more workers.
They saw me coming! 😆
(with caveats about browser support)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that'll work yeah! If the main use case here is wasm though I'd be totally fine holding off on merging this until I can prove this works with a proof of concept with wasm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to wait for your wasm PoC, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, maybe we can actually do this join-style where each new thread is also responsible for spawning half of its remaining peers... interesting!
This is exactly what I meant.
To circle back to this, I've got some initial support for this locally working. (it's this demo that I'm porting) Some aspects of the implementation so far:
So all in all at least something works! I think that the threading bug with memory allocation isn't the only bug, I suspect there's at least one more but it's likely all on our end. |
I should also mention that this is raytracing a scene and using rayon to parallelize all the pixels, so it's really cool to see a visual representation of how rayon splits things in half and divides the work! |
This was one of the points I raised in my review, actually -- I'm not sure what we want here. In particular, I'm not sure if we want Would it be harder for you if we had a callback signature like |
For one, |
Ah, yeah, I actually don't think we necessarily need |
It's hard to say without seeing your code, but my wild guess is that you may need something to keep the jobs from splitting too small. e.g. parallelize rows only, or use |
It'd definitely be harder for sure, but also not insurmountable! If it helps I can give you a bit of a better idea about what the code is currently doing. The raytrace example looks like this where this is the main function. I use the new APIs here, and assuming that For some context about that, we maintain a pool of web workers because they're so expensive to start up, so the idea is that when we run a rayon thread we take a worker, send it a message via This isn't really all that hard of a problem, but it's quite convenient that we can do it all on the main thread for sure! Definitely not an insurmountable problem, though, and I feel like for every other user of rayon
I think you're definitely right, I was just hoping that the magic of rayon would fix this all for me :). Previously the synchronization was "each thread takes a big chunk of pixels at a time, figures it out, and then re-synchronizes the output". With rayon it's just "do every pixel in parallel", so I imagine that if we add some more chunking with rayon we're unlikely to really lose all that much performance and likely reduce synchronization costs by quite a bit. Thanks for the tip! |
Ok good news! Now that rust-lang/rust#58855 is in nightly I've been able to restart testing of rayon here. Thankfully the faults I saw related to rust-lang/rust#58855 have all gone away, as expected, but I still continued to see asserts being tripped in rayon (which cascaded to other segfaults I think because panics don't run dtors which is probably pretty critical for rayon). I think, though, that I've found the problem and it's inside rayon-core. Specifically this assert was tripping whenever I would rerender a scene in the demo. I believe that's happening because a thread stops being a rayon thread, and then it starts being a rayon thread again. The general architecture I'm thinking of is that it's my job to manage a pool of web worker worker threads. When a render is requested we take out a fixed number of workers from the pool (spawning more if needed) and then hand them all off to the rayon thread pool. When the render is finished we take back the workers and put them in the general web worker pool. What's happening here then is that a thread, for a short period of time, is a "rayon thread" and doing useful rayon work. That then exits and the thread goes idle for a bit. Afterwards (on the rerender) the thread is reused and booted up to do rayon work again. The assert there though is tripping because the Naively adding a small diff: diff --git a/rayon-core/src/registry.rs b/rayon-core/src/registry.rs
index 63d8a31..ced0b9c 100644
--- a/rayon-core/src/registry.rs
+++ b/rayon-core/src/registry.rs
@@ -589,6 +589,13 @@ impl WorkerThread {
});
}
+ unsafe fn unset_current() {
+ WORKER_THREAD_STATE.with(|t| {
+ assert!(!t.get().is_null());
+ t.set(std::ptr::null());
+ });
+ }
+
/// Returns the registry that owns this worker thread.
pub fn registry(&self) -> &Arc<Registry> {
&self.registry
@@ -789,6 +796,7 @@ unsafe fn main_loop(
}
// We're already exiting the thread, there's nothing else to do.
}
+ WorkerThread::unset_current();
}
/// If already in a worker-thread, just execute `op`. Otherwise, was able to fix it though! Afterwards everything seems to be working perfectly! With some coarser chunking of pixels (parallelizing at the layer of every 100 pixels or so instead of every single pixel) also makes it just as fast as the previous implementation. |
That looks reasonable to me. It might be a little stronger to
🎉 |
Ah true, a destructor would probably be much more prudent for all other platforms! Technically wasm is compiled with panic=abort by default, but the way wasm works you can just pretty easily reenter it after it aborts (even though it "aborted"). With https://github.com/WebAssembly/exception-handling though we'll switch to panic=unwind and fix this! |
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
Ok I've posted the wasm-bindgen modifications to our example to accompany this! |
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
This looks awesome! Thank you @cuviper :) Is there anything still blocking this PR? |
Thanks @torkleyy! Is there a particular use-case that makes you interested in this? I still need to try out @nikomatsakis's suggestion to make this look more like a plain builder method, using some otherwise-default type parameter to allow non-static lifetimes. We'll also have to decide whether we want to require |
Yes, I'd like to get Specs fully WASM compatible. It's an Entity Component System library which we're using in the Amethyst Game Engine and WASM compatibility is one of our main goals for this year. The only feature I'm using is |
OK -- the |
OK, I'm glad the builder parameter can work! I think the trait should be called Given that we don't even try to spawn threads in parallel yet anyway, I'm inclined to leave this to the future. When we have the ability and desire, we can internally specialize for |
I debated about this. The reason that I started with both is that (1) I think, for most people, the code they write will be thread-safe by default, so they would support parallel spawning. Therefore, I thought it'd be better if the "normal" (2) I considered specialization but it seemed like a bit of an inappropriate use -- just because the language considers something thread-safe doesn't imply to me that it necessarily is. The closure might access thread-local state, for example, and we would not know. However, I don't feel very strongly on point (1). I think there won't be very many users of I'm curious why you don't want to expose it in the API, though? Is it just a sense of "YAGNI"? (It does seem a touch premature to me to expose two methods when we always use the same underlying implementation.) |
I'm not sure what you mean by "leave the actual building step out of it" -- maybe that's possible with deeper edits. I didn't look too closely at why we have e.g. As an aside, I am reminded that Rust's "public-private" rules are super annoying here. I'd like for the |
Hmm, actually, the overall design of the |
@cuviper well I pushed a version to branch that uses a defaulted type parameter, but doesn't make the other changes you requested (e.g., renaming to spawn etc). I don't know that I would have time to do that sort of thing, but I'd be happy if you wanted to (hint hint) (and no reason to build on my branch, obviously). Otherwise maybe I'll get to it some other time. In the end, I don't have a strong opinion about the "parallel thread spawn" thing, so long as we have a path forward that lets us support it at some point (and make it the default). I was thinking that if we ever want to support thread-pools of dynamic size (which I sometimes do?) than it would have similar requirements, since we would have to be able to spawn threads at any time. |
Right, this is the line I was hoping to draw by making it
This stance seems antithetical to our purpose as a threading crate. We already warn about this sort of thing in at least one place, but maybe we should be louder about it... rayon/rayon-core/src/scope/mod.rs Lines 103 to 108 in 939d3ee
But I'm also thinking about the scoped-tls example tests. Those cases don't mind because the local is borrowed in a scope outside the entire threadpool. But maybe in an unscoped case, someone would just want to clone some current TLS into the newly spawned TLS, so it matters where they're called from. Hmm... It just feels ugly to expose multiple paths here, especially when we only do one thing at the moment.
This is an additional wrinkle, because also we don't have any lifetime on the |
I don't think that's quite right. In general, the philosophy of Rayon is that you request parallelism and that you supply closures and things that are correct to parallelize -- but the runtime decides whether or not to actually parallelize. We use the type system to check that your closures are safe to parallelize, but we can't actually test that it's correct -- that's up to you. Using specialization to parallelize only if safe kind of blurs the line. I'd prefer if you explicitly "opted in" to parallel execution, or if you explicitly request sequential. Using specialization here feels like it would be akin to
Yes but this is exactly the point: this warning appears on a method where you are not supposed to rely on the fact that it executes within the same thread. That closure has bounds (
To be clear, I'm fine with any path that leaves us room to expose more paths later. I'd probably be happiest exposing just
If we only support sequential spawning, we can do this, as long as we make it a trait that cannot be implemented outside the crate (as we do elsewhere in Rayon already, right?). If/when we later extend to parallel spawning, we can modify the trait to have a more general "build" method. Anyway I don't think users really need to care about the methods on the trait (and I'd be game to use |
Another possibility is to have a single However, I just realized that |
Yeah, we do that with |
Ah, interesting. Yeah, that'd be another possibility. I'm not sure which I would think is better but it's a good point! |
(I would imagine it'd be something like |
So it sounds like we have some consensus to
This leaves a few future routes to expansion if we like:
I think the one downside of |
I just added |
r=me modulo the doc nits |
OK, I added links and examples to the docs, and also cleared the current thread per #636 (comment). bors r=nikomatsakis |
636: Add the ability to customize thread spawning r=nikomatsakis a=cuviper As an alternative to `ThreadPoolBuilder::build()` and `build_global()`, the new `spawn()` and `spawn_global()` methods take a closure which will be responsible for spawning the actual threads. This is called with a `ThreadBuilder` argument that provides the thread index, name, and stack size, with the expectation to call its `run()` method in the new thread. The motivating use cases for this are: - experimental WASM threading, to be externally implemented. - scoped threads, like the new test using `scoped_tls`. Co-authored-by: Josh Stone <[email protected]>
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
One of the best parts about concurrency in Rust is using `rayon` and how easy it makes parallelization of tasks, so it's the ideal example for parallel Rust on the web! Previously we've been unable to use `rayon` because there wasn't a way to customize how rayon threads themselves are spawned, but [that's now being developed for us][rayon]! This commit uses that PR to rewrite the `raytrace-parallel` example in this repository. While not a perfect idiomatic representation of using `rayon` I think this is far more idiomatic than the previous iteration of `raytrace-parallel`! I'm hoping that we can continue to iterate on this, but otherwise show it off as a good example of parallel Rust on the web. [rayon]: rayon-rs/rayon#636
As an alternative to
ThreadPoolBuilder::build()
andbuild_global()
,the new
spawn()
andspawn_global()
methods take a closure which willbe responsible for spawning the actual threads. This is called with a
ThreadBuilder
argument that provides the thread index, name, and stacksize, with the expectation to call its
run()
method in the new thread.The motivating use cases for this are:
scoped_tls
.