-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for using a jobserver with Rayon #56946
Conversation
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
I'm not too familiar with rayon internals, but can you describe at a high level the strategy for managing the jobserver tokens? |
The strategy is to call |
As someone not familiar with rayon, can you expand on that a bit more? Acquiring and releasing a token is a pretty expensive operation and would be pretty inefficient if we did it super commonly, so is blocking in rayon something that's amortized over a long time? |
There are 2 places where blocking can happen which is not related to setup/teardown:
I'd want to add that we spawn |
cc @michaelwoerister too |
Could the jobserver be made abstract to Rayon? I.e. Rayon would not directly use the jobserver crate and instead just use something with the jobserver interface to acquire and release tokens? That would be a bit more general and rustc (or any other piece of code using Rayon) could do some buffering and application-specific management of tokens (e.g. keeping tokens around a bit longer if it expects for more work to show up soon). Another approach that might be interesting: Add some way to tell Rayon the target number of active threads it should be running. It would then internally try to match this soft target by not assigning any more work to threads it wants to wind down. |
(FYI I have this scheduled for review Thu Dec 20 at 13:00 UTC-05:00.) |
OK, so I read this PR and the other one. @Zoxc let me summarize what I think is going on in the current PRs. I'm putting the comment here because I want to keep conversation "consolidated". I believe that the current design basically has each rayon thread acquire a token from the jobserve before it starts looking for work and release that token when it goes to sleep, right? This obviously makes a lot of sense, though I'm wondering a bit if there is some interaction with the LLVM compilation threads we want to be careful of. In particular, I believe that LLVM execution and (e.g.) trans can overlap -- this was a requirement to help us reduce peak memory usage requirements of incremental compilation, if I recall. Maybe we want to move those LLVM things into spawned rayon tasks, so that they are sharing the same basic thread-pool? (I've sort of forgotten how that system is setup, I'll have to investigate.) I would definitely prefer if the rayon core code was "agnostic" as to the specifics of the thread-pool, as @michaelwoerister suggested. Given how simple the interface is, it basically seems like we are talking about adding two callbacks -- (The PR has some other changes, e.g., adopting |
That's basically how I imagined this work in the future. The current LLVM scheduling is rather complicated but only because codegen/trans is bound to the main thread. All of this should get a lot simpler once the |
ping from triage @Zoxc @nikomatsakis any updates on this? |
I've updated this to use the callbacks I added to Rayon. The |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Rayon threads and LLVM threads will compete for jobserver tokens. I do want to get rid of the LLVM threads once parallel queries is on by default though. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Ping @nikomatsakis |
☔ The latest upstream changes (presumably #57948) made this pull request unmergeable. Please resolve the merge conflicts. |
bd32bc4
to
10097cc
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good, but I have a few questions and nits. =)
use std::mem; | ||
|
||
#[derive(Default)] | ||
pub struct LockedProxyData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this struct pub
?
} | ||
|
||
#[derive(Default)] | ||
pub struct ProxyData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this struct pub
?
cond_var: Condvar, | ||
} | ||
|
||
pub struct Proxy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a doc-comment? (Also, does this struct need to be pub
?)
} | ||
|
||
impl Proxy { | ||
pub fn release_token(&self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I believe these can be private to the module
// Also note that we stick this in a global because there could be | ||
// multiple rustc instances in this process, and the jobserver is | ||
// per-process. | ||
static ref GLOBAL_CLIENT: Client = unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unsafe
keyword feels problematic. In particular, it is asserting that GLOBAL_CLIENT
is used early in the process, but it (locally, at least) has no way to know that. Basically, the clients of this module must invoke one of the public methods "early enough", right?
I feel like pub fn client
should be declared unsafe, and parts of this comment moved onto it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any of this is related to memory safety though.
}) | ||
}; | ||
|
||
static ref GLOBAL_PROXY: Proxy = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no ordering dependency on when GLOBAL_PROXY
is created vs GLOBAL_CLIENT
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GLOBAL_PROXY
requires a GLOBAL_CLIENT
.
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
@bors r=nikomatsakis |
📌 Commit a21370f8978ab051289de2bcc6f29fa2c81584af has been approved by |
☔ The latest upstream changes (presumably #58250) made this pull request unmergeable. Please resolve the merge conflicts. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
@bors r=nikomatsakis |
📌 Commit 5c78fa8 has been approved by |
Add support for using a jobserver with Rayon The Rayon changes are here: Zoxc/rayon#2 cc @alexcrichton r? @nikomatsakis
☀️ Test successful - checks-travis, status-appveyor |
It would be great if we could have some kind of regression test for this. I don't know how to do that though (we don't have tests for the other jobserver support either). |
The Rayon changes are here: Zoxc/rayon#2
cc @alexcrichton
r? @nikomatsakis