Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to customize thread spawning #636

Merged
merged 6 commits into from
Jun 5, 2019

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Feb 18, 2019

As an alternative to ThreadPoolBuilder::build() and build_global(),
the new spawn() and spawn_global() methods take a closure which will
be responsible for spawning the actual threads. This is called with a
ThreadBuilder argument that provides the thread index, name, and stack
size, with the expectation to call its run() method in the new thread.

The motivating use cases for this are:

  • experimental WASM threading, to be externally implemented.
  • scoped threads, like the new test using scoped_tls.

@cuviper cuviper requested a review from nikomatsakis February 18, 2019 23:04
@cuviper
Copy link
Member Author

cuviper commented Feb 18, 2019

cc #93 -- this isn't a total custom backend, but at least it enables custom threads.
cc @alexcrichton and @fitzgen for WASM
cc @Zoxc for possibly displacing rustc-rayon's scoped_pool

@cuviper
Copy link
Member Author

cuviper commented Feb 18, 2019

Oops, my refactoring on global_registry() is wrong -- will fix.

@alexcrichton
Copy link

Looks perfect for us, thanks!

@cuviper
Copy link
Member Author

cuviper commented Feb 19, 2019

I'm considering that maybe we should go ahead and provide a scoped pool more directly, like:

impl ThreadPoolBuilder {
    fn build_scoped<W, F, R>(self, wrapper: W, with_pool: F) -> Result<R, ThreadPoolBuildError>
    where
        W: Fn(ThreadBuilder) + Sync, // expected to call `run()`
        F: FnOnce(&ThreadPool) -> R,
    {
        // `self.spawn` in a `crossbeam::scope`
    }
}

i.e. pretty close to @Zoxc's ThreadPool::scoped_pool. ThreadBuilder::run() gets us a proper FnOnce callback though, whereas their wrapper had to use &mut FnMut.

@cuviper
Copy link
Member Author

cuviper commented Feb 19, 2019

OK, I added build_scoped() too. All of these methods probably need some name bikeshedding.

One other thing I would consider is whether the spawn functions should return a join handle, but that's not easy to design in a generic way. We'd want to support std::thread::JoinHandle, crossbeam's ScopedJoinHandle, etc., but these can't even be trait objects since they need fn join(self).

/// if the pool is leaked.
pub fn spawn<F>(self, spawn: F) -> Result<ThreadPool, ThreadPoolBuildError>
where
F: FnMut(ThreadBuilder) -> io::Result<()>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want FnMut here? IIUC, this implies that we are spawning the threads serially, is that a guarantee we want to make?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed in our meeting today, I think we could maybe make this more of a "field" of the builder (rather than being something you mmust specify when the builder is built), but it would require adding a defaulted type parameter to ThreadPoolBuilder. The idea would be something like this. First, we add a trait SpawnFn and a type that implements it and has the default behavior:

trait SpawnFn {

}

struct DefaultSpawnFn;

impl<T> SpawnFn for T
where
  T: FnMut(ThreadBuilder) -> io::Result<()>
{
}

then we would have

struct ThreadPoolBuilder<S: SpawnFn = DefaultSpawnFn> {
    ...,
    spawn_fn: S
}

and

impl<S> ThreadPoolBuilder<S> {
    fn spawn<F>(self, spawn_fn: F) -> ThreadPoolBuilder<F> {
        ThreadPoolBuilder { spawn_fn, ..self } // wouldn't actually work, but you get the idea
    }
}

...or something like that, anyway.

I guess you could also model it by having the result of spawn return a wrapper around a ThreadPoolBuilder, which would be 100% backwards compatible but would require mirroring a lot of API surface.

Why do this? Mostly because it feels more like a 'builder' then -- and if we add further options, they might work in a similar way.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tokio's thread pool builder has the around_worker method - it might be helpful to look at its signature for inspiration. Ignore the &mut Enter argument because it's irrelevant here. The closure is allocated on the heap as Arc<Fn(&Worker, &mut Enter) + Send + Sync>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want FnMut here? IIUC, this implies that we are spawning the threads serially, is that a guarantee we want to make?

It's hard to see parallel thread spawning for WASM, where we're relying on them to spawn threads in the first place. However, maybe we can actually do this join-style where each new thread is also responsible for spawning half of its remaining peers... interesting!

@alexcrichton would it work for WASM if this spawn callback were Sync + Fn? I think this would imply that you'd have web workers potentially spawning more web workers -- is that allowed? Otherwise, if WASM can only do this from the main thread, we should probably just stick with FnMut.

As we discussed in our meeting today, I think we could maybe make this more of a "field" of the builder (rather than being something you m_must_ specify when the builder is built), but it would require adding a defaulted type parameter to ThreadPoolBuilder.

I'll play with that idea.

Tokio's thread pool builder has the around_worker method

Note that around_worker requires 'static, which doesn't help scoped use cases. This is pretty similar to rustc-rayon's custom main_handler, and then their scoped_pool uses unsafe casts to hide the lifetimes. The scoped case also doesn't allow it to return an interim builder.

Copy link
Member Author

@cuviper cuviper Feb 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Functions_and_classes_available_to_workers

Yes, workers can spawn more workers.

They saw me coming! 😆
(with caveats about browser support)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that'll work yeah! If the main use case here is wasm though I'd be totally fine holding off on merging this until I can prove this works with a proof of concept with wasm

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to wait for your wasm PoC, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, maybe we can actually do this join-style where each new thread is also responsible for spawning half of its remaining peers... interesting!

This is exactly what I meant.

@alexcrichton
Copy link

To circle back to this, I've got some initial support for this locally working. (it's this demo that I'm porting)

Some aspects of the implementation so far:

  • The performance numbers are sort of all over the place, but rayon does look slower than the previous home-brewed version. I think though that this is a case where rayon may not be able to do the optimal thing relative to what was previously happening, but good lord writing the rayon code was so much easier (consumption of rayon, not necessarily implementation). I don't see any glaring bugs in profiles about performance issues in rayon, and there may be missing optimizations in the wasm threading engine which could speed things up a lot with rayon.

  • I was surprised about the signature of spawn where the closure passed in is neither Send nor 'static. I think that means that all worker threads are spawned immediately during startup, right? That's certainly fine on our end, it made it very easy to implement :). Other schemes are also possible for us, but I was curious if that's intentional or a part that may change.

  • Unfortunately nothing is really demo-able yet because of other bugs. The whole "the main thread can't ever block" restriction on wasm is currently not handled for memory allocation (the main thread does indeed try to grab a lock there). This means that the demo crashes on the second rerender or often crashes on the first due to memory contention (more memory contention I believe is happening than before with the previous scheme).

So all in all at least something works! I think that the threading bug with memory allocation isn't the only bug, I suspect there's at least one more but it's likely all on our end.

@alexcrichton
Copy link

I should also mention that this is raytracing a scene and using rayon to parallelize all the pixels, so it's really cool to see a visual representation of how rayon splits things in half and divides the work!

@nikomatsakis
Copy link
Member

@alexcrichton

I was surprised about the signature of spawn where the closure passed in is neither Send nor 'static. I think that means that all worker threads are spawned immediately during startup, right? That's certainly fine on our end, it made it very easy to implement :). Other schemes are also possible for us, but I was curious if that's intentional or a part that may change.

This was one of the points I raised in my review, actually -- I'm not sure what we want here. In particular, I'm not sure if we want FnMut, as it precludes us from "distributing" this thread spawning across threads, and forces us to do it serially.

Would it be harder for you if we had a callback signature like Fn() + Send + Sync + 'static?

@cuviper
Copy link
Member Author

cuviper commented Mar 6, 2019

Would it be harder for you if we had a callback signature like Fn() + Send + Sync + 'static?

For one, 'static means we can't support the scoped use case, which I'd like to have.

@nikomatsakis
Copy link
Member

Ah, yeah, I actually don't think we necessarily need 'static, although it would probably be helpful

@cuviper
Copy link
Member Author

cuviper commented Mar 6, 2019

The performance numbers are sort of all over the place, but rayon does look slower than the previous home-brewed version. I think though that this is a case where rayon may not be able to do the optimal thing relative to what was previously happening

It's hard to say without seeing your code, but my wild guess is that you may need something to keep the jobs from splitting too small. e.g. parallelize rows only, or use par_chunks or with_min_len. You could also try implementing a custom split into blocks with more pixel locality.

@alexcrichton
Copy link

Would it be harder for you if we had a callback signature like Fn() + Send + Sync + 'static?

It'd definitely be harder for sure, but also not insurmountable!

If it helps I can give you a bit of a better idea about what the code is currently doing. The raytrace example looks like this where this is the main function. I use the new APIs here, and assuming that 'static won't be introduced the main thing we'd have to do is make WorkerPool both Send + Sync, which given its usage of Rc and RefCell is somewhat nontrivial.

For some context about that, we maintain a pool of web workers because they're so expensive to start up, so the idea is that when we run a rayon thread we take a worker, send it a message via postMessage, let it run its work, and then when it's done working it'll send us a message with postMessage which we listen for in an onmessage callback. All that to say that we need to take out a worker, register an onmessage, and in the onmessage we put it back into the pool. (you can see this sorta in reclaim_on_message.

This isn't really all that hard of a problem, but it's quite convenient that we can do it all on the main thread for sure! Definitely not an insurmountable problem, though, and I feel like for every other user of rayon Send + Sync makes sense, so that seems totally reasonable.


It's hard to say without seeing your code, but my wild guess is that you may need something to keep the jobs from splitting too small. e.g. parallelize rows only, or use par_chunks or with_min_len. You could also try implementing a custom split into blocks with more pixel locality.

I think you're definitely right, I was just hoping that the magic of rayon would fix this all for me :). Previously the synchronization was "each thread takes a big chunk of pixels at a time, figures it out, and then re-synchronizes the output". With rayon it's just "do every pixel in parallel", so I imagine that if we add some more chunking with rayon we're unlikely to really lose all that much performance and likely reduce synchronization costs by quite a bit.

Thanks for the tip!

@alexcrichton
Copy link

Ok good news! Now that rust-lang/rust#58855 is in nightly I've been able to restart testing of rayon here. Thankfully the faults I saw related to rust-lang/rust#58855 have all gone away, as expected, but I still continued to see asserts being tripped in rayon (which cascaded to other segfaults I think because panics don't run dtors which is probably pretty critical for rayon).

I think, though, that I've found the problem and it's inside rayon-core. Specifically this assert was tripping whenever I would rerender a scene in the demo. I believe that's happening because a thread stops being a rayon thread, and then it starts being a rayon thread again.

The general architecture I'm thinking of is that it's my job to manage a pool of web worker worker threads. When a render is requested we take out a fixed number of workers from the pool (spawning more if needed) and then hand them all off to the rayon thread pool. When the render is finished we take back the workers and put them in the general web worker pool.

What's happening here then is that a thread, for a short period of time, is a "rayon thread" and doing useful rayon work. That then exits and the thread goes idle for a bit. Afterwards (on the rerender) the thread is reused and booted up to do rayon work again. The assert there though is tripping because the WORKER_THREAD_STATE variable was already set by the previous execution.

Naively adding a small diff:

diff --git a/rayon-core/src/registry.rs b/rayon-core/src/registry.rs
index 63d8a31..ced0b9c 100644
--- a/rayon-core/src/registry.rs
+++ b/rayon-core/src/registry.rs
@@ -589,6 +589,13 @@ impl WorkerThread {
         });
     }
 
+    unsafe fn unset_current() {
+        WORKER_THREAD_STATE.with(|t| {
+            assert!(!t.get().is_null());
+            t.set(std::ptr::null());
+        });
+    }
+
     /// Returns the registry that owns this worker thread.
     pub fn registry(&self) -> &Arc<Registry> {
         &self.registry
@@ -789,6 +796,7 @@ unsafe fn main_loop(
         }
         // We're already exiting the thread, there's nothing else to do.
     }
+    WorkerThread::unset_current();
 }
 
/// If already in a worker-thread, just execute `op`. Otherwise,

was able to fix it though!

Afterwards everything seems to be working perfectly! With some coarser chunking of pixels (parallelizing at the layer of every 100 pixels or so instead of every single pixel) also makes it just as fast as the previous implementation.

@cuviper
Copy link
Member Author

cuviper commented Mar 21, 2019

Naively adding a small diff:

That looks reasonable to me. It might be a little stronger to impl Drop for WorkerThread to handle this, and sanity check that we're correctly unsetting our own value. I guess that doesn't add much if wasm "panics don't run dtors" as you said, but you should probably fix that. 🙂

Afterwards everything seems to be working perfectly! With some coarser chunking of pixels (parallelizing at the layer of every 100 pixels or so instead of every single pixel) also makes it just as fast as the previous implementation.

🎉

@alexcrichton
Copy link

Ah true, a destructor would probably be much more prudent for all other platforms!

Technically wasm is compiled with panic=abort by default, but the way wasm works you can just pretty easily reenter it after it aborts (even though it "aborted"). With https://github.com/WebAssembly/exception-handling though we'll switch to panic=unwind and fix this!

alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Mar 21, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
@alexcrichton
Copy link

Ok I've posted the wasm-bindgen modifications to our example to accompany this!

alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Mar 21, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Mar 21, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Mar 22, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
@torkleyy
Copy link
Contributor

torkleyy commented Apr 5, 2019

This looks awesome! Thank you @cuviper :)

Is there anything still blocking this PR?

@cuviper
Copy link
Member Author

cuviper commented Apr 5, 2019

Thanks @torkleyy! Is there a particular use-case that makes you interested in this?

I still need to try out @nikomatsakis's suggestion to make this look more like a plain builder method, using some otherwise-default type parameter to allow non-static lifetimes. We'll also have to decide whether we want to require Send and/or Sync here, giving us the possibility to fork-spawn the pool, but this will make it harder for the wasm implementation.

@torkleyy
Copy link
Contributor

torkleyy commented Apr 5, 2019

Is there a particular use-case that makes you interested in this?

Yes, I'd like to get Specs fully WASM compatible. It's an Entity Component System library which we're using in the Amethyst Game Engine and WASM compatibility is one of our main goals for this year.

The only feature I'm using is ParallelIterator, running code in parallel inside the for_each closure. I am only relying on being able to pass non-'static stuff, Sync / Send is implemented for all the types.

@cuviper
Copy link
Member Author

cuviper commented Apr 5, 2019

OK -- the Send, Sync, and lifetime design questions here are only about the custom spawn handler. Nothing is proposed to change in how the pool is used.

@cuviper
Copy link
Member Author

cuviper commented May 14, 2019

OK, I'm glad the builder parameter can work! I think the trait should be called ThreadPoolSpawn though, with DefaultSpawn and CustomSpawn types, and leave the actual building step out of it. Related, I'm also not really keen on exposing "seq" differences in the API.

Given that we don't even try to spawn threads in parallel yet anyway, I'm inclined to leave this to the future. When we have the ability and desire, we can internally specialize for Send/Sync spawners automatically. You're on the lang team -- how close is specialization? 😄

@nikomatsakis
Copy link
Member

nikomatsakis commented May 14, 2019

Related, I'm also not really keen on exposing "seq" differences in the API.

I debated about this. The reason that I started with both is that

(1) I think, for most people, the code they write will be thread-safe by default, so they would support parallel spawning. Therefore, I thought it'd be better if the "normal" spawn API was something with stricter requirements.

(2) I considered specialization but it seemed like a bit of an inappropriate use -- just because the language considers something thread-safe doesn't imply to me that it necessarily is. The closure might access thread-local state, for example, and we would not know.

However, I don't feel very strongly on point (1). I think there won't be very many users of spawn to begin with, so if we wind up adding some spawn_par or something in the future that would be "ok".

I'm curious why you don't want to expose it in the API, though? Is it just a sense of "YAGNI"? (It does seem a touch premature to me to expose two methods when we always use the same underlying implementation.)

@nikomatsakis
Copy link
Member

I think the trait should be called ThreadPoolSpawn though, with DefaultSpawn and CustomSpawn types, and leave the actual building step out of it.

I'm not sure what you mean by "leave the actual building step out of it" -- maybe that's possible with deeper edits. I didn't look too closely at why we have e.g. Registry::new and Registry::spawn -- oh, it seems like the former just invokes the latter. Yeah, that works.

As an aside, I am reminded that Rust's "public-private" rules are super annoying here. I'd like for the ThreadPoolSpawn (nee ThreadPoolBuild) trait to be "public" for users to name but to have "private" methods that they cannot see, which reference private types. But the compile gets upset. So if we want to have the internal methods and details be "private" we have to make "public but inaccessible" types. Grr.

@nikomatsakis
Copy link
Member

Hmm, actually, the overall design of the ThreadPoolBuild trait -- i.e., having a "build" method and not a method that just spawns threads -- is exactly what would permit it to later support parallel spawning if we wanted that. Though I guess that so long as this is "private" from the user we could change it if we wanted to without them knowing. So we should definitely ensure that this is an "unimplementable" trait.

@nikomatsakis
Copy link
Member

@cuviper well I pushed a version to branch that uses a defaulted type parameter, but doesn't make the other changes you requested (e.g., renaming to spawn etc). I don't know that I would have time to do that sort of thing, but I'd be happy if you wanted to (hint hint) (and no reason to build on my branch, obviously). Otherwise maybe I'll get to it some other time.

In the end, I don't have a strong opinion about the "parallel thread spawn" thing, so long as we have a path forward that lets us support it at some point (and make it the default). I was thinking that if we ever want to support thread-pools of dynamic size (which I sometimes do?) than it would have similar requirements, since we would have to be able to spawn threads at any time.

@cuviper
Copy link
Member Author

cuviper commented May 15, 2019

I'm not sure what you mean by "leave the actual building step out of it"

Hmm, actually, the overall design of the ThreadPoolBuild trait -- i.e., having a "build" method and not a method that just spawns threads -- is exactly what would permit it to later support parallel spawning if we wanted that. Though I guess that so long as this is "private" from the user we could change it if we wanted to without them knowing.

Right, this is the line I was hoping to draw by making it ThreadPoolSpawn instead. The public interface would only be concerned with how we spawn some particular thread, not the strategy for building the entire collection of threads.

just because the language considers something thread-safe doesn't imply to me that it necessarily is. The closure might access thread-local state, for example, and we would not know.

This stance seems antithetical to our purpose as a threading crate. We already warn about this sort of thing in at least one place, but maybe we should be louder about it...

/// # A note on threading
///
/// The closure given to `scope()` executes in the Rayon thread-pool,
/// as do those given to `spawn()`. This means that you can't access
/// thread-local variables (well, you can, but they may have
/// unexpected values).

But I'm also thinking about the scoped-tls example tests. Those cases don't mind because the local is borrowed in a scope outside the entire threadpool. But maybe in an unscoped case, someone would just want to clone some current TLS into the newly spawned TLS, so it matters where they're called from. Hmm...

It just feels ugly to expose multiple paths here, especially when we only do one thing at the moment.

I was thinking that if we ever want to support thread-pools of dynamic size (which I sometimes do?) than it would have similar requirements, since we would have to be able to spawn threads at any time.

This is an additional wrinkle, because also we don't have any lifetime on the ThreadPool itself.

@nikomatsakis
Copy link
Member

@cuviper

This stance seems antithetical to our purpose as a threading crate.

I don't think that's quite right. In general, the philosophy of Rayon is that you request parallelism and that you supply closures and things that are correct to parallelize -- but the runtime decides whether or not to actually parallelize. We use the type system to check that your closures are safe to parallelize, but we can't actually test that it's correct -- that's up to you.

Using specialization to parallelize only if safe kind of blurs the line. I'd prefer if you explicitly "opted in" to parallel execution, or if you explicitly request sequential. Using specialization here feels like it would be akin to foo.iter().map().collect() sometimes automatically executing in parallel if the compiler thought that it could, which would clearly cause problems -- particularly since we know that some users here (e.g., WASM) really want the guarantee of running on the master thread.

We already warn about this sort of thing in at least one place

Yes but this is exactly the point: this warning appears on a method where you are not supposed to rely on the fact that it executes within the same thread. That closure has bounds (Fn + Send + Sync etc) that make that safe.

It just feels ugly to expose multiple paths here, especially when we only do one thing at the moment.

To be clear, I'm fine with any path that leaves us room to expose more paths later. I'd probably be happiest exposing just spawn_seq for now, so that users are explicitly requesting sequential code, and maybe we add a spawn later that is parallel enabled. But I suppose that one could even make a case that spawn_par would be a more consistent name for the "parallel safe path", since we have par_iter. I don't really have that strong of an opinion here, as I think relatively few users will use this anyway, and we could always deprecate spawn later and make up a new name.

Right, this is the line I was hoping to draw by making it ThreadPoolSpawn instead. The public interface would only be concerned with how we spawn some particular thread, not the strategy for building the entire collection of threads.

If we only support sequential spawning, we can do this, as long as we make it a trait that cannot be implemented outside the crate (as we do elsewhere in Rayon already, right?). If/when we later extend to parallel spawning, we can modify the trait to have a more general "build" method. Anyway I don't think users really need to care about the methods on the trait (and I'd be game to use #[doc(hidden)] to suppress the details). We can name the trait ThreadPoolSpawn either way.

@cuviper
Copy link
Member Author

cuviper commented May 17, 2019

Another possibility is to have a single spawn, and a build that works sequentially (status quo), then add a new build_par or whatever constrained where Self: Send + Sync.

However, I just realized that ThreadPoolBuilder is already !Send + !Sync, which I think is because of the get_thread_name: Option<Box<FnMut(usize) -> String>>. So we're going to have to at least front-load the thread names, and then build_par only needs to be conditional on the spawn type.

@cuviper
Copy link
Member Author

cuviper commented May 17, 2019

as long as we make it a trait that cannot be implemented outside the crate (as we do elsewhere in Rayon already, right?)

Yeah, we do that with rayon::private::PrivateMarker for our own Pattern and Try. Both of those traits are also tucked in private modules, so I guess the marker is kind of redundant anyway... meh.

@nikomatsakis
Copy link
Member

@cuviper

Another possibility is to have a single spawn, and a build that works sequentially (status quo), then add a new build_par or whatever constrained where Self: Send + Sync.

Ah, interesting. Yeah, that'd be another possibility. I'm not sure which I would think is better but it's a good point!

@nikomatsakis
Copy link
Member

(I would imagine it'd be something like fn build_par(self) -> Result<..> where S: 'static + Send + Sync or whatever.)

@nikomatsakis
Copy link
Member

So it sounds like we have some consensus to

  • add spawn with the current sequential constraints
  • a type parameter that is defaulted
  • and a trait named XxxSpawn whose internal details are hidden

This leaves a few future routes to expansion if we like:

  • a build_par that requires the spawn method be thread-safe and meet whatever other bounds seems reasonable
  • a spawn_par that requires a safer function, combined with a build that does a parallel build

I think the one downside of build_par that I see, but it's totally surmountable, is that if you don't customize spawn, we would want parallel builds -- but we could do that still so long as you have the "default" spawn in use.

@cuviper
Copy link
Member Author

cuviper commented May 21, 2019

I just added trait ThreadSpawn, struct DefaultSpawn, and struct CustomSpawn<F>, all pub-in-private for the moment. I think we can get away with calling these internal details, but let me know if you want to open that up at all.

rayon-core/src/lib.rs Outdated Show resolved Hide resolved
rayon-core/src/lib.rs Show resolved Hide resolved
rayon-core/src/lib.rs Outdated Show resolved Hide resolved
@nikomatsakis
Copy link
Member

r=me modulo the doc nits

@cuviper
Copy link
Member Author

cuviper commented Jun 5, 2019

OK, I added links and examples to the docs, and also cleared the current thread per #636 (comment).

bors r=nikomatsakis

bors bot added a commit that referenced this pull request Jun 5, 2019
636: Add the ability to customize thread spawning r=nikomatsakis a=cuviper

As an alternative to `ThreadPoolBuilder::build()` and `build_global()`,
the new `spawn()` and `spawn_global()` methods take a closure which will
be responsible for spawning the actual threads. This is called with a
`ThreadBuilder` argument that provides the thread index, name, and stack
size, with the expectation to call its `run()` method in the new thread.

The motivating use cases for this are:
- experimental WASM threading, to be externally implemented.
- scoped threads, like the new test using `scoped_tls`.

Co-authored-by: Josh Stone <[email protected]>
@bors
Copy link
Contributor

bors bot commented Jun 5, 2019

Build succeeded

@bors bors bot merged commit 249ad3f into rayon-rs:master Jun 5, 2019
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Jun 5, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Jun 13, 2019
One of the best parts about concurrency in Rust is using `rayon` and how
easy it makes parallelization of tasks, so it's the ideal example for
parallel Rust on the web! Previously we've been unable to use `rayon`
because there wasn't a way to customize how rayon threads themselves are
spawned, but [that's now being developed for us][rayon]!

This commit uses that PR to rewrite the `raytrace-parallel` example in
this repository. While not a perfect idiomatic representation of using
`rayon` I think this is far more idiomatic than the previous iteration
of `raytrace-parallel`! I'm hoping that we can continue to iterate on
this, but otherwise show it off as a good example of parallel Rust on
the web.

[rayon]: rayon-rs/rayon#636
@cuviper cuviper deleted the custom-spawn branch March 11, 2020 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants