specialize some collection and iterator operations to run in-place #70793
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.

cc @Centril isn't there a moratorium on new usage of specialization?

Yeah, there is. However, @matthewjasper recently landed #68970, which provides a mechanism for limited and sound specialization. I believe we need to complete the first two points in their PR description before we can add more specialization:

☔ The latest upstream changes (presumably #70800) made this pull request unmergeable. Please resolve the merge conflicts.
☔ The latest upstream changes (presumably #71230) made this pull request unmergeable. Please resolve the merge conflicts.

@the8472 if you can resolve the conflicts we can get this reviewed.
@Dylan-DPC it's ready for review now.

This could also use a new perf run.
☀️ Test successful - checks-actions, checks-azure |
```rust
#[test]
fn test_from_iter_specialization_with_iterator_adapters() {
    fn assert_in_place_trait<T: InPlaceIterable>(_: &T) {}
    let src: Vec<usize> = vec![0usize; 65535];
```
FWIW, this test takes forever to run in Miri.^^ I'll reduce it to 512 elements when `cfg(miri)` is set to make https://github.com/RalfJung/miri-test-libstd not take forever.
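The reduction described here could be gated on the `miri` cfg roughly as follows. This is a sketch of the approach, not the actual patch; the constant name is illustrative.

```rust
// Use a much smaller input when running under Miri, which interprets
// every memory access and would take minutes on 65535 elements.
#[cfg(not(miri))]
const SRC_LEN: usize = 65535;
#[cfg(miri)]
const SRC_LEN: usize = 512;

fn main() {
    let src: Vec<usize> = vec![0usize; SRC_LEN];
    println!("test input length: {}", src.len());
}
```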
…lacrum Fix liballoc test suite for Miri Mostly, fix the regression introduced by rust-lang#75207 that caused slices (i.e., references) to be created to invalid memory or memory that has aliasing pointers that we want to keep valid. @dylni this changes the type of `check_range` to only require the length, not the full reference to the slice, which indeed is all the information this function requires. Also reduce the size of a test introduced in rust-lang#70793 to make it not take 3 minutes in Miri. This makes https://github.com/RalfJung/miri-test-libstd work again.
@the8472 fwiw, on future PRs, especially if rebasing, it would be great to squash out commits that are fixups (e.g., tidy fixes) and such to have a cleaner git history.

This is a rebase and update of #66383, which was closed due to inactivity.
Recent rustc changes made the compile time regressions disappear, at least for webrender-wrench. Running a stage2 compile and the rustc-perf suite takes hours on the hardware I have at the moment, so I can't do much more than that.
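As context for the perf discussion, the allocation-recycling pattern that these optimizations target can be sketched like this. This is an illustration, not the actual `bench_in_place_recycle` source: without the specialization, every round-trip allocates a fresh `Vec`; with it, the same buffer is reused.

```rust
// Illustrative stand-in for the recycle pattern (not the real
// bench_in_place_recycle benchmark body).
fn recycle(rounds: u32) -> Vec<u32> {
    let mut v: Vec<u32> = (0..1000).collect();
    for _ in 0..rounds {
        // Round-trip Vec -> IntoIter -> Vec; with the in-place
        // specialization this reuses the allocation on each pass.
        v = v.into_iter().map(|x| x + 1).collect();
    }
    v
}

fn main() {
    // Each pass adds 1 to every element, so after 100 rounds v[0] == 100.
    println!("{}", recycle(100)[0]);
}
```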
In the best case of the `vec::bench_in_place_recycle` synthetic microbenchmark these optimizations can provide a 15x speedup over the regular implementation, which allocates a new vec for every benchmark iteration. Benchmark results. In real code the speedups are tiny, but they also depend on the allocator used; a system allocator that uses a process-wide mutex will benefit more than one with thread-local pools.

What was changed

* `SpecExtend`, which covered the `from_iter` and `extend` specializations, was split into separate traits.
* `extend` and `from_iter` now reuse `append_elements` if the passed iterators are from slices.
* The `vec.into_iter().collect::<Vec<_>>()` optimization that passed through the original vec has been generalized further to also cover cases where the original has been partially drained.
* Collecting `IntoIter`s through various iterator adapters into `Vec` and `BinaryHeap` will be performed in place as long as `T` and `U` have the same alignment and size and aren't ZSTs.
* The `SourceIter` and `InPlaceIterable` traits have been added. The first allows reaching through the iterator pipeline to grab a pointer to the source memory. The latter is a marker that promises that the read pointer will advance as fast as or faster than the write pointer, and thus in-place operation is possible in the first place.
* `vec::IntoIter` implements `TrustedRandomAccess` for `T: Copy` to allow in-place collection when there is a `Zip` adapter in the iterator. TRA had to be made an unstable public trait to support this.

In-place collectible adapters
* `Map`
* `MapWhile`
* `Filter`
* `FilterMap`
* `Fuse`
* `Skip`
* `SkipWhile`
* `Take`
* `TakeWhile`
* `Enumerate`
* `Zip` (left-hand side only, `Copy` types only)
* `Peekable`
* `Scan`
* `Inspect`
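One quick way to observe the in-place behavior for such an adapter chain is to compare the source and result buffer pointers. This is a sketch and an observation of an implementation detail, not a guaranteed API contract; the helper name is mine.

```rust
// Returns (allocation_reused, result). Pointer identity is an
// implementation detail, not a stable guarantee.
fn map_in_place(v: Vec<u32>) -> (bool, Vec<u32>) {
    let src_ptr = v.as_ptr();
    // u32 -> u32: same size and alignment, and Map is an in-place
    // collectible adapter, so the specialization can apply.
    let out: Vec<u32> = v.into_iter().map(|x| x + 1).collect();
    (src_ptr == out.as_ptr(), out)
}

fn main() {
    let (reused, out) = map_in_place((0..1000).collect());
    println!("reused: {reused}, out[0] = {}", out[0]);
}
```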
Concerns
`vec.into_iter().filter(|_| false).collect()` will no longer return a vec with 0 capacity; instead it will return its original allocation. This avoids the cost of doing any allocation or deallocation, but could lead to large allocations living longer than expected.

If that's not acceptable, some resizing policy at the end of the attempted in-place collect would be necessary, which in the worst case could result in one more memcopy than the non-specialized case.
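The concern can be demonstrated directly. This sketch reflects the behavior as described in this PR (later toolchains may apply a shrinking heuristic); the helper name is illustrative.

```rust
// Collect through a filter that drops every element. With in-place
// collection this may return the source allocation rather than a
// fresh zero-capacity Vec.
fn filter_all_out(v: Vec<u8>) -> Vec<u8> {
    v.into_iter().filter(|_| false).collect()
}

fn main() {
    let mut w = filter_all_out(vec![0u8; 4096]);
    println!("len = {}, capacity = {}", w.len(), w.capacity());
    // Callers that care about the retained memory can release it:
    w.shrink_to_fit();
    println!("after shrink_to_fit: capacity = {}", w.capacity());
}
```

Calling `shrink_to_fit` is the caller-side mitigation for the allocation-lifetime concern described above.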
Possible followup work
* `ignore-tidy-filelength`
* Get `vec.into_iter().skip(1).collect::<Vec<_>>()` to compile to a `memmove` (currently compiles to a pile of SIMD; see "Missed optimization: repeated pointer increments don't compile to a memcpy" #69187).
* `HashSet` to be in-place collected into a `Vec`.
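The `skip(1)` followup amounts to a function like the following, which one would hope lowers to a single `memmove` of the tail. The helper name is mine; this is a sketch of the pattern under discussion, not code from the PR.

```rust
// Drops the first element while reusing the allocation; ideally this
// compiles down to one memmove of the remaining elements (see #69187).
fn drop_first(v: Vec<u32>) -> Vec<u32> {
    v.into_iter().skip(1).collect()
}

fn main() {
    // [1, 2, 3] with the head removed leaves [2, 3].
    println!("{:?}", drop_first(vec![1, 2, 3]));
}
```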