Stabilize <[T]>::get_many_mut()
#128318
… of O(N^2) by sorting first
We only need immutable access to the indices array, and if we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter. In simple cases, LLVM is able to avoid the copy, but it has trouble in more complex cases (https://godbolt.org/z/s371G4r9e). Given that I expect `get_many_mut()` to be used most of the time with few, probably only two, indices, this seems unlikely to matter. Taking a reference adds one character to type, but that isn't too bad, and avoiding a copy seems nice.
This hasn't had an FCP yet, so it needs libs-api to take a look.
r? libs-api
// Based on benchmarks, it is faster to sort starting with 9 indices.
if N >= 9 {
    let mut sorted_indices = *indices;
The commit message states:
We only need immutable access to the indices array, and if we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter.
But the array is copied here, so it seems any gains there are unfortunately immediately lost for anything N >= 9. Depending on the exact performance, it may be worth checking `indices.is_sorted()` before duplicating and sorting. Do you have the results of those benchmarks?
This is not true, as we need both the unsorted and the sorted array, so if we took the indices by value we would copy twice.
About checking `is_sorted()`: I can do that, but is it worth it? How common is a sorted slice that you don't know is sorted? After all, if you know it's sorted you can use a guaranteed O(N) API (not in std), and I don't know how common that is on its own, let alone by accident.
However, you made me realize that my assembly inspection was wrong: LLVM did copy the array, but only once. So it seems it is good enough to know when the copy can be elided. I wonder if that means we can trust it?
I meant that if copying is showing up to be significant on benchmarks (enough to justify the by-val to by-ref change), then maybe we should try to avoid the copy whenever possible. But LLVM is usually pretty good about figuring out when it makes sense to pass larger types as a pointer, so I am curious how different the benchmark results turned out.
The `DisjointIndices` type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.
I didn't benchmark passing by ref/value, all I benchmarked was sorting versus O(N^2) check. Without a concrete example where LLVM does not elide the copy, of course benchmarks won't show any difference.
The DisjointIndices type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.
This is the "reusing indices" benefit, but as I said, I doubt how useful this is.
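For readers following the thread, here is a minimal sketch of the two disjointness checks being compared. The function names are hypothetical and this is not the code from the PR; only the threshold of 9 is taken from the diff quoted above.

```rust
/// Quadratic check: compare every pair of indices. No copy is needed.
fn disjoint_quadratic<const N: usize>(indices: &[usize; N]) -> bool {
    for i in 0..N {
        for j in 0..i {
            if indices[i] == indices[j] {
                return false;
            }
        }
    }
    true
}

/// Sort-based check: copy the array, sort the copy, then compare neighbours.
fn disjoint_sorted<const N: usize>(indices: &[usize; N]) -> bool {
    let mut sorted = *indices; // this is the copy discussed in the review
    sorted.sort_unstable();
    sorted.windows(2).all(|w| w[0] != w[1])
}

/// Hypothetical dispatcher that switches strategy at 9 indices.
fn disjoint<const N: usize>(indices: &[usize; N]) -> bool {
    if N >= 9 {
        disjoint_sorted(indices)
    } else {
        disjoint_quadratic(indices)
    }
}
```

Since `N` is a const generic, the branch in `disjoint` is resolved per instantiation, so each call site only pays for one of the two strategies.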
☔ The latest upstream changes (presumably #128360) made this pull request unmergeable. Please resolve the merge conflicts.
Could you please split the implementation changes into a separate PR from stabilization? Also we discussed the signature in the libs-api meeting yesterday and would prefer that the array of indices be taken by value instead of by reference. This is in line with other APIs that take and return arrays.
@Amanieu Will do, but this may take me some days.
Hmm, libs-api requested that
@jdahlstrom Making it return
I'm closing this and will open new PRs with the requested changes.
Tracking issue: #104642.
Best reviewed commit by commit.
Closes #104642.
Stabilization Report
Implementation History
Implemented in #83608. Originally included an immutable version too, but that was deemed unnecessary. Also, at first the indices array was required to be sorted, but this was changed.
It was suggested that a `DisjointIndices` type be added that would encapsulate the invariant and allow indexing in O(1). This has the benefits that it allows reuse of the same keys, and that a constructor could be added that performs the check in O(N) instead of O(N log N) by requiring the indices array to be sorted. I decided not to go this route, since (1) it is unclear how useful those benefits are (I've never needed them personally), (2) regarding the second benefit, we can always add a `get_many_sorted_mut()` method, which would have roughly the same implementation and API complexity as `DisjointIndices` with two constructors, if not better, and (3) we provide the `get_many_unchecked_mut()` primitive, so anyone can build on top of it while the tricky unsafe details of how to soundly implement `get_many_unchecked_mut()` are hidden from them. Also, it would make the call significantly more convoluted (`get_many_mut(&DisjointIndices::new([...])?)?` instead of `get_many_mut(&[...])?`), and while it is possible to simplify that with a trait, that would imply more implementation complexity and weaker inference.
It was also suggested that we add the ability to index with ranges. I decided not to do that either, because of the unclear benefit and the greater implementation and API complexity.
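To make the rejected `DisjointIndices` design concrete, here is a rough sketch of what such a type might look like. Everything in it is an assumption for illustration: the constructor names, the free function `get_with`, and the raw-pointer implementation are not from the PR or from std. Note that this sketch still checks bounds on every lookup; only the distinctness check is moved into the constructor.

```rust
/// Hypothetical wrapper proving that N indices are pairwise distinct.
pub struct DisjointIndices<const N: usize>([usize; N]);

impl<const N: usize> DisjointIndices<N> {
    /// General constructor: O(N log N) check on a sorted copy.
    pub fn new(indices: [usize; N]) -> Option<Self> {
        let mut sorted = indices;
        sorted.sort_unstable();
        sorted.windows(2).all(|w| w[0] != w[1]).then_some(Self(indices))
    }

    /// Constructor for sorted input: O(N) check for strictly increasing order.
    pub fn new_sorted(indices: [usize; N]) -> Option<Self> {
        indices.windows(2).all(|w| w[0] < w[1]).then_some(Self(indices))
    }
}

/// Hypothetical lookup: distinctness is already proven, so only bounds are checked.
pub fn get_with<'a, T, const N: usize>(
    slice: &'a mut [T],
    indices: &DisjointIndices<N>,
) -> Option<[&'a mut T; N]> {
    if indices.0.iter().any(|&i| i >= slice.len()) {
        return None;
    }
    let base = slice.as_mut_ptr();
    // SAFETY: every index is in bounds and the indices are pairwise distinct,
    // so the returned references point to different elements and never alias.
    Some(std::array::from_fn(|k| unsafe { &mut *base.add(indices.0[k]) }))
}
```

A caller would construct the indices once, for example outside a hot loop, and reuse them for repeated lookups, which is the "reuse of the same keys" benefit mentioned above.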
Another suggestion was to add an `index_many_mut()` method that panics on out-of-bounds or overlapping indices, possibly with a better panic message than `get_many_mut().expect()` (due to the ability to differentiate between the two error conditions). While not in this stabilization request, we can always add it later if deemed worthwhile.
Yet another request was the ability to tell from the error whether the failure was due to out-of-bounds or overlapping indices. I reserved the right to do this (by having a private field on the error type), but didn't actually do it now because it leads to worse codegen in common cases (#128214).
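As an illustration of what distinguishing the two failure modes could mean, here is a small sketch. The `ManyMutError` enum and `check_indices` function are hypothetical names invented for this example; they are not the `GetManyMutError` type from the PR, which deliberately does not expose this information.

```rust
/// Hypothetical error type that distinguishes the two failure modes.
#[derive(Debug, PartialEq, Eq)]
pub enum ManyMutError {
    /// An index was not smaller than the slice length.
    OutOfBounds { index: usize, len: usize },
    /// The same index appeared more than once.
    Overlapping { index: usize },
}

/// Validates the indices against a slice length, reporting which condition failed first.
pub fn check_indices<const N: usize>(
    indices: &[usize; N],
    len: usize,
) -> Result<(), ManyMutError> {
    for (i, &idx) in indices.iter().enumerate() {
        if idx >= len {
            return Err(ManyMutError::OutOfBounds { index: idx, len });
        }
        // Compare against all earlier indices (quadratic, fine for small N).
        if indices[..i].contains(&idx) {
            return Err(ManyMutError::Overlapping { index: idx });
        }
    }
    Ok(())
}
```

A panicking `index_many_mut()` could format each variant into its own message, which is the improvement over a single `expect()` string described above.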
In this PR, I:
`GetManyMutError` from std and alloc, which was probably just forgotten.
API Summary
Experience Report
I used it in Advent of Code 2022 day 5, and also suggested it here. It worked great, but in this case I didn't have to handle the error so I can't tell if knowing the kind of error is really needed.
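A sketch of the kind of use case this describes, assuming the task was moving items between two stacks stored in the same slice. The `pair_mut` helper is a hypothetical two-index stand-in for `get_many_mut()`, built on the stable `split_at_mut`, and `move_crates` is an invented name; neither is the author's actual code.

```rust
/// Hypothetical two-index stand-in for `get_many_mut`, built on `split_at_mut`.
fn pair_mut<T>(slice: &mut [T], a: usize, b: usize) -> Option<(&mut T, &mut T)> {
    // Reject overlapping or out-of-bounds indices.
    if a == b || a >= slice.len() || b >= slice.len() {
        return None;
    }
    if a < b {
        let (left, right) = slice.split_at_mut(b);
        Some((&mut left[a], &mut right[0]))
    } else {
        let (left, right) = slice.split_at_mut(a);
        Some((&mut right[0], &mut left[b]))
    }
}

/// Moves `count` items from the top of stack `from` to the top of stack `to`, one at a time.
fn move_crates(stacks: &mut [Vec<char>], count: usize, from: usize, to: usize) -> Option<()> {
    let (src, dst) = pair_mut(stacks, from, to)?;
    for _ in 0..count {
        dst.push(src.pop()?);
    }
    Some(())
}
```

Here both failure modes collapse into `None`, which matches the observation that this use case gave no reason to inspect the error.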
There are a few polyfill crates (https://crates.io/crates/indices, https://crates.io/crates/index_many, https://crates.io/crates/get_many_mut). They don't seem to get much usage. There is also a highly upvoted question on Stack Overflow, and I see this question popping up there from time to time.