Optimize ObligationForest's NodeState handling. #36993
For example, you should be able to get a better perf improvement by skipping over the entire middle "GC" steps if all obligations hit the |
I found this comment to be curt and unhelpful. It made me feel angry and bad. It probably wasn't your intent, but that is the effect it had. Please think about how you express yourself when criticizing other people's code. |
Can you elaborate more? Are you talking about |
I'm sorry. I just think that the best way to deal with an O(n^2) case is to well, make it stop being O(n^2), rather than adding low-level optimizations to the specific O(n^2) part. The main entry point of the obligation forest is If nothing changes in the |
To be clear: this is a reasonable opinion. The problem was not that you criticized my code, the problem was the way you criticized it. I suggest you avoid using the phrase "Totally The Wrong Way" in future reviews. Also, when suggesting alternative approaches, please describe them in reasonable detail. (Your follow-up comment about GCing was much better in this regard.) |
Mod note: As Nick said, please be constructive and nice in your criticism. It seems like you've already acknowledged that, but just in case, leaving this comment 😄 |
Why have my constructive and nice comments been removed? I'm very well acquainted with Rust's code of conduct and I think @nnethercote's reply could very well be considered harassment. He tried to make @arielb1 feel bad for his reply. We shouldn't reply to harassment with harassment. I am so angry that I'm considering writing a Medium blog post about it and posting it to HackerNews. |
Mod note: @zatherz Your comment was deleted for containing profanity and blatantly violating the code of conduct. If you have further questions about moderation or the code of conduct, please ask [email protected] or on one of the forums. This is not the place for such discussion [further offtopic comments will be deleted] |
Moderator note: As @Manishearth said, if you have a specific question or objection with our moderation, then please ask [email protected]. This isn't the place for it. |
Adding myself as another assignee. Sorry I haven't taken a look yet. My initial reaction was somewhat similar to @arielb1's (i.e., I'd rather solve this from the top), but on the other hand the speed improvements that @nnethercote showed are promising! I'd like to give this a closer read. Will try to do so ASAP! Apologies for the delay, @nnethercote!
Oh, please wait a bit! #37231 landed recently and that changed the profiles significantly, so I've started reworking this PR but it's not ready yet. |
@nnethercote ok ping me |
7c64806 to 40f5122
I've updated. The commit no longer converts It feels like there is still plenty of room for improvement in this benchmark, though I'm struggling to find those improvements myself. Here's Cachegrind's summary of the hottest functions (measurements are instruction counts).
You have to squint a bit when interpreting them due to inlining but they give you a good idea. In particular, this call chain is very hot:
Any changes that can speed up any of those functions or reduce the number of calls in this chain are likely to help. |
r=me with the helper extracted (maybe tagged `#[inline]`?)
for dependent in &node.dependents {
    self.mark_as_waiting_from(&self.nodes[dependent.get()]);
}
Rather than making two copies of this code, maybe pull out a helper like `mark_neighbors_as_waiting_from(node)`?
Done.
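For readers following along, the helper extraction suggested above might look roughly like the sketch below. This is not the code from the PR: the `Forest` and `Node` types, the plain `usize` indices, the `parent` field, and the reduced three-variant `NodeState` are simplifying assumptions made for illustration. Only the names `mark_as_waiting_from`, `mark_neighbors_as_waiting_from`, and `dependents` come from the review thread.

```rust
use std::cell::Cell;

// Reduced set of states, assumed for this sketch only.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum NodeState {
    Pending,
    Success,
    Waiting,
}

// Hypothetical node type; the real one stores more than this.
struct Node {
    state: Cell<NodeState>,
    parent: Option<usize>,
    dependents: Vec<usize>,
}

// Hypothetical stand-in for the obligation forest.
struct Forest {
    nodes: Vec<Node>,
}

// Small constructor used only to keep examples short.
fn node(state: NodeState, parent: Option<usize>, dependents: Vec<usize>) -> Node {
    Node { state: Cell::new(state), parent, dependents }
}

impl Forest {
    // Entry point: everything a pending node depends on must keep waiting.
    fn mark_as_waiting(&self) {
        for node in &self.nodes {
            if node.state.get() == NodeState::Pending {
                self.mark_neighbors_as_waiting_from(node);
            }
        }
    }

    // The extracted helper: the single copy of the "visit parent and
    // dependents" loop that both call sites share.
    #[inline]
    fn mark_neighbors_as_waiting_from(&self, node: &Node) {
        if let Some(parent) = node.parent {
            self.mark_as_waiting_from(&self.nodes[parent]);
        }
        for &dependent in &node.dependents {
            self.mark_as_waiting_from(&self.nodes[dependent]);
        }
    }

    fn mark_as_waiting_from(&self, node: &Node) {
        match node.state.get() {
            // Already marked, or handled by the entry-point loop: stop here.
            NodeState::Waiting | NodeState::Pending => return,
            NodeState::Success => node.state.set(NodeState::Waiting),
        }
        self.mark_neighbors_as_waiting_from(node);
    }
}
```

Because each `Success` node is flipped to `Waiting` before its neighbors are visited, the recursion in this sketch terminates even if the graph contains cycles.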
40f5122 to 1011042
This commit partially inlines two functions, `find_cycles_from_node` and `mark_as_waiting_from`, at two call sites in order to avoid unnecessary function calls on hot paths. It also fully inlines and removes `is_popped`. These changes speed up rustc-benchmarks/inflate-0.1.0 by about 2% when doing debug builds with a stage1 compiler.
1011042 to 7b33f7e
Comments addressed. |
@bors r=nikomatsakis |
@bors r+ |
📌 Commit 7b33f7e has been approved by |
Optimize ObligationForest's NodeState handling.

This commit does the following.

- Changes `NodeState` from an enum to a `bitflags`. This makes it possible to check against multiple possible values in a single bitwise operation.
- Replaces all the hot `match`es involving `NodeState` with `if`/`else` chains that ensure that cases are handled in the order of frequency.
- Partially inlines two functions, `find_cycles_from_node` and `mark_as_waiting_from`, at two call sites in order to avoid unnecessary function calls on hot paths.
- Fully inlines and removes `is_popped`.

These changes speed up rustc-benchmarks/inflate-0.1.0 by about 7% when doing debug builds with a stage1 compiler.

r? @arielb1
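To illustrate the first two points of the description, here is a minimal sketch of how a flag-style `NodeState` lets one bitwise AND test membership in a set of states, and how an `if`/`else` chain can order the checks by expected frequency. It is an assumption-laden illustration, not the PR's code: it uses a hand-rolled newtype over `u8` instead of the `bitflags` crate so that it compiles without dependencies, and the state names, the `FINISHED` and `IN_PROGRESS` masks, the `describe` function, and the frequency ordering are all hypothetical.

```rust
// Hand-rolled stand-in for a bitflags-style type (assumption: the PR itself
// uses the `bitflags` crate; names and bit values here are illustrative).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct NodeState(u8);

impl NodeState {
    const PENDING: NodeState = NodeState(1 << 0);
    const SUCCESS: NodeState = NodeState(1 << 1);
    const WAITING: NodeState = NodeState(1 << 2);
    const DONE: NodeState = NodeState(1 << 3);
    const ERROR: NodeState = NodeState(1 << 4);

    // Combine two states into a mask.
    const fn union(self, other: NodeState) -> NodeState {
        NodeState(self.0 | other.0)
    }

    // One bitwise AND answers "is `self` any of the states in `mask`?".
    fn intersects(self, mask: NodeState) -> bool {
        self.0 & mask.0 != 0
    }
}

// Hypothetical groupings of states that are often tested together.
const FINISHED: NodeState = NodeState::DONE.union(NodeState::ERROR);
const IN_PROGRESS: NodeState = NodeState::SUCCESS.union(NodeState::WAITING);

// An `if`/`else` chain that tests the (assumed) most frequent case first,
// rather than leaving the order of comparisons to a `match`.
fn describe(state: NodeState) -> &'static str {
    if state == NodeState::PENDING {
        "pending"
    } else if state.intersects(IN_PROGRESS) {
        "in progress"
    } else {
        "finished"
    }
}
```

With an enum, a test such as "done or error" compares against each variant in turn; with flags it is a single mask check, for example `state.intersects(FINISHED)`.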