specialize some collection and iterator operations to run in-place #70793

the8472 · 2020-04-04T23:30:23Z

This is a rebase and update of #66383, which was closed due to inactivity.

Recent rustc changes made the compile time regressions disappear, at least for webrender-wrench. Running a stage2 compile and the rustc-perf suite takes hours on the hardware I have at the moment, so I can't do much more than that.

In the best case, the vec::bench_in_place_recycle synthetic microbenchmark, these optimizations can provide a 15x speedup over the regular implementation, which allocates a new vec for every benchmark iteration (see Benchmark results). In real code the speedups are tiny, and they also depend on the allocator used: a system allocator that uses a process-wide mutex will benefit more than one with thread-local pools.

What was changed

SpecExtend, which covered the from_iter and extend specializations, was split into separate traits.
extend and from_iter now reuse append_elements if the passed iterators are from slices.
A preexisting vec.into_iter().collect::<Vec<_>>() optimization that passed through the original vec has been generalized to also cover cases where the original has been partially drained.
A chain of Vec / BinaryHeap / Box<[T]> IntoIters through various iterator adapters, collected into a Vec or BinaryHeap, will be performed in place as long as the source item type T and the output item type U have the same alignment and size and aren't ZSTs.
To enable the above specialization, the unsafe, unstable SourceIter and InPlaceIterable traits have been added. The first allows reaching through the iterator pipeline to grab a pointer to the source memory. The latter is a marker that promises that the read pointer will advance as fast as or faster than the write pointer, and thus that in-place operation is possible in the first place.
vec::IntoIter implements TrustedRandomAccess for T: Copy to allow in-place collection when there is a Zip adapter in the iterator. TrustedRandomAccess (TRA) had to be made an unstable public trait to support this.
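
To make the read/write invariant described for InPlaceIterable more concrete, here is a minimal safe model of it. This is only a sketch and not the code from this PR: filter_map_in_place is a hypothetical helper, it uses indices instead of raw pointers, and it is restricted to T: Copy so that no unsafe code or drop handling is needed.

```rust
/// Hypothetical helper (not part of std or of this PR): applies `f` to each
/// element and keeps the `Some` results, writing them back into the same buffer.
fn filter_map_in_place<T: Copy>(mut buf: Vec<T>, mut f: impl FnMut(T) -> Option<T>) -> Vec<T> {
    // `write` only advances when an item is produced, `read` advances on every
    // step, so `write <= read` always holds and no unread slot is overwritten.
    let mut write = 0;
    for read in 0..buf.len() {
        if let Some(out) = f(buf[read]) {
            buf[write] = out;
            write += 1;
        }
    }
    // Drop the unused tail; `truncate` leaves the allocated capacity untouched.
    buf.truncate(write);
    buf
}

fn main() {
    let src: Vec<u32> = (1..=8).collect();
    let cap = src.capacity();
    let out = filter_map_in_place(src, |x| if x % 2 == 0 { Some(x * 10) } else { None });
    assert_eq!(out, vec![20, 40, 60, 80]);
    // The buffer was reused, so the capacity is unchanged.
    assert_eq!(out.capacity(), cap);
    println!("{:?} (capacity {})", out, out.capacity());
}
```

An adapter that could yield more than one item per consumed source item would break the write <= read property in this model, which is presumably why only some adapters qualify.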

In-place collectible adapters

Map
MapWhile
Filter
FilterMap
Fuse
Skip
SkipWhile
Take
TakeWhile
Enumerate
Zip (left hand side only, Copy types only)
Peekable
Scan
Inspect
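
As an illustration, the following sketch chains several of the adapters listed above between a Vec source and a Vec destination with the same item type. every_other_squared is a made-up name for this example. Whether the source allocation is actually reused is an implementation detail that can differ between standard library versions, so the example prints the pointer comparison instead of asserting on it.

```rust
/// Hypothetical example function: keeps the items at even indices, squares
/// them, skips the first result and takes at most three of the rest.
fn every_other_squared(v: Vec<u64>) -> Vec<u64> {
    v.into_iter()
        .enumerate()
        .filter(|(i, _)| i % 2 == 0)
        .map(|(_, x)| x * x)
        .skip(1)
        .take(3)
        .collect()
}

fn main() {
    let src: Vec<u64> = (0..10).collect();
    // Remember where the source buffer lives; the pointer is only compared, never dereferenced.
    let src_ptr = src.as_ptr();
    let out = every_other_squared(src);
    assert_eq!(out, vec![4, 16, 36]);
    // Not guaranteed by any stable API, so this is reported rather than asserted.
    println!("allocation reused: {}", out.as_ptr() == src_ptr);
}
```

Note that the intermediate item type (usize, u64) differs from u64; per the description above, only the source and final item types need to match in size and alignment.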

Concerns

vec.into_iter().filter(|_| false).collect() will no longer return a vec with 0 capacity; instead it will return its original allocation. This avoids the cost of doing any allocation or deallocation but could lead to large allocations living longer than expected.
If that's not acceptable, some resizing policy at the end of the attempted in-place collect would be necessary, which in the worst case could result in one more memcpy than the non-specialized case.
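
The concern can be observed, and worked around on the caller side, with a sketch like the one below. collect_none is a hypothetical name, and calling shrink_to_fit afterwards is just one option available to callers, not the resizing policy discussed above. The capacity printed after collect depends on the standard library version in use, so it is not asserted.

```rust
/// Hypothetical example function: filters out every element.
fn collect_none(v: Vec<u8>) -> Vec<u8> {
    v.into_iter().filter(|_| false).collect()
}

fn main() {
    let big = vec![0u8; 1 << 20];
    let original_cap = big.capacity();

    let mut out = collect_none(big);
    // With in-place collection this may still report the original capacity.
    println!("len {}, capacity after collect {}", out.len(), out.capacity());

    // A caller that cares about memory can release the excess explicitly.
    out.shrink_to_fit();
    println!("capacity after shrink_to_fit {}", out.capacity());

    assert!(out.is_empty());
    assert!(out.capacity() <= original_cap);
}
```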

Possible followup work

split liballoc/vec.rs to remove ignore-tidy-filelength
try to get trivial chains such as vec.into_iter().skip(1).collect::<Vec<_>>() to compile to a memmove (currently compiles to a pile of SIMD, see Missed optimization: repeated pointer increments don't compile to a memcpy #69187)
improve the traits so they can be reused by other crates, e.g. itertools. I think currently they're only good enough for internal use
allow iterators sourced from a HashSet to be in-place collected into a Vec

rust-highfive · 2020-04-04T23:30:26Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

sfackler · 2020-04-04T23:58:47Z

cc @Centril isn't there a moratorium on new usage of specialization?

Centril · 2020-04-05T00:08:34Z

Yeah, there is. However, @matthewjasper recently landed #68970, which provides a mechanism for limited and sound specialization. I believe we need to complete the first two points in their PR description before we can add more specialization:

Update libcore and liballoc to compile with min_specialization.

Add a lint to forbid use of feature(specialization) (and other unsound, type system extending features) in the standard library.

bors · 2020-04-05T10:03:19Z

☔ The latest upstream changes (presumably #70800) made this pull request unmergeable. Please resolve the merge conflicts.

Dylan-DPC-zz · 2020-04-16T12:08:15Z

r? @LukasKalbertodt

bors · 2020-04-17T07:25:44Z

☔ The latest upstream changes (presumably #71230) made this pull request unmergeable. Please resolve the merge conflicts.

Dylan-DPC-zz · 2020-04-28T11:40:19Z

@the8472 if you can resolve the conflicts we can get this reviewed

rust-highfive · 2020-05-18T23:59:24Z

The job x86_64-gnu-llvm-8 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

##[section]Starting: Linux x86_64-gnu-llvm-8
##[section]Starting: Initialize job
Agent name: 'Azure Pipelines 9'
Agent machine name: 'fv-az578'
Current agent version: '2.168.2'
##[group]Operating System
16.04.6
LTS
LTS
##[endgroup]
##[group]Virtual Environment
Environment: ubuntu-16.04
Version: 20200512.2
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu16/20200512.2/images/linux/Ubuntu1604-README.md
##[endgroup]
Agent running as: 'vsts'
Prepare build directory.
Set build variables.
Download all required tasks.
Download all required tasks.
Downloading task: Bash (3.163.2)
Checking job knob settings.
   Knob: AgentToolsDirectory = /opt/hostedtoolcache Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: AgentPerflog = /home/vsts/perflog Source: ${VSTS_AGENT_PERFLOG} 
Start tracking orphan processes.
##[section]Finishing: Initialize job
##[section]Starting: Configure Job Name
==============================================================================
---
========================== Starting Command Output ===========================
[command]/bin/bash --noprofile --norc /home/vsts/work/_temp/5c897b11-70af-4966-ac8e-7fbd82149478.sh

##[section]Finishing: Disable git automatic line ending conversion
##[section]Starting: Checkout rust-lang/rust@refs/pull/70793/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
---
##[command]git remote add origin https://github.com/rust-lang/rust
##[command]git config gc.auto 0
##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
##[command]git config --get-all http.proxy
##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/70793/merge:refs/remotes/pull/70793/merge
---
 ---> cb2676f08729
Step 5/8 : ENV RUST_CONFIGURE_ARGS       --build=x86_64-unknown-linux-gnu       --llvm-root=/usr/lib/llvm-8       --enable-llvm-link-shared       --set rust.thin-lto-import-instr-limit=10
 ---> Using cache
 ---> df25ce111862
Step 6/8 : ENV SCRIPT python2.7 ../x.py test --exclude src/tools/tidy &&            python2.7 ../x.py test src/test/mir-opt --pass=build                                   --target=armv5te-unknown-linux-gnueabi &&            python2.7 ../x.py test src/tools/tidy
 ---> 599b9ac96b27
Step 7/8 : ENV NO_DEBUG_ASSERTIONS=1
 ---> Using cache
 ---> 091087e35a36
---
   Compiling fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
   Compiling rustc_ast_pretty v0.0.0 (/checkout/src/librustc_ast_pretty)
   Compiling chalk-rust-ir v0.10.0
   Compiling rustc_hir v0.0.0 (/checkout/src/librustc_hir)
   Compiling rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
   Compiling chalk-solve v0.10.0
   Compiling rustc_hir_pretty v0.0.0 (/checkout/src/librustc_hir_pretty)
   Compiling rustc_parse v0.0.0 (/checkout/src/librustc_parse)
   Compiling rustc_ast_lowering v0.0.0 (/checkout/src/librustc_ast_lowering)
---
   Compiling fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
   Compiling chalk-rust-ir v0.10.0
   Compiling rustc_ast_pretty v0.0.0 (/checkout/src/librustc_ast_pretty)
   Compiling rustc_hir v0.0.0 (/checkout/src/librustc_hir)
   Compiling rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
   Compiling chalk-solve v0.10.0
   Compiling rustc_hir_pretty v0.0.0 (/checkout/src/librustc_hir_pretty)
   Compiling rustc_parse v0.0.0 (/checkout/src/librustc_parse)
   Compiling rustc_ast_lowering v0.0.0 (/checkout/src/librustc_ast_lowering)
---
.......................................................i............................................ 1800/10175
.................................................................................................... 1900/10175
..........................................................................i..i...................... 2000/10175
.................................................................................................... 2100/10175
................................................................iiiii............................... 2200/10175
.................................................................................................... 2400/10175
.................................................................................................... 2500/10175
.................................................................................................... 2600/10175
.................................................................................................... 2700/10175
---
.................................................................................................... 5200/10175
.................................................................................................... 5300/10175
...........................i........................................................................ 5400/10175
....................i............................................................................... 5500/10175
............................iiFii........i...i...................................................... 5600/10175
..............................................................................i..................... 5800/10175
.................................................................................................... 5900/10175
.........................ii.....................................i................................... 6000/10175
.................................................................................................... 6100/10175
.................................................................................................... 6100/10175
.................................................................................................... 6200/10175
......................................................................................ii...i..ii.... 6300/10175
.................................................................................................... 6500/10175
.................................................................................................... 6600/10175
.................................................................................................... 6700/10175
.................................................................................................... 6700/10175
...................i..ii............................................................................ 6800/10175
.................................................................................................... 7000/10175
.........................................................................i.......................... 7100/10175
.................................................................................................... 7200/10175
.................................................................................................... 7300/10175
---
.................................................................................................... 8100/10175
.................................................................................................... 8200/10175
................................................................................................i... 8300/10175
.................................................................................................... 8400/10175
..................................................iiiiii.iiiiii.i................................... 8500/10175
....i............................................................................................... 8700/10175
.................................................................................................... 8800/10175
.................................................................................................... 8900/10175
.................................................................................................... 9000/10175
---
---- [ui] ui/iterators/issue-58952-filter-type-length.rs stdout ----

error: test compilation failed although it shouldn't!
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/iterators/issue-58952-filter-type-length.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/iterators/issue-58952-filter-type-length/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/iterators/issue-58952-filter-type-length/auxiliary"
------------------------------------------

------------------------------------------
stderr:
stderr:
------------------------------------------
error: reached the type-length limit while instantiating `<std::vec::IntoIter<i32> as std:...<std::vec::InPlaceDrop<i32>, !>>`
   |
   |
LL | /     fn try_fold<B, F, R>(&mut self, init: B, mut f: F) -> R
LL | |         Self: Sized,
LL | |         Self: Sized,
LL | |         F: FnMut(B, Self::Item) -> R,
LL | |         Try::from_ok(accum)
LL | |     }
   | |_____^
   |
   |
   = note: consider adding a `#![type_length_limit="1327046"]` attribute to your crate
error: aborting due to previous error


------------------------------------------
---
thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:348:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-8/bin/FileCheck" "--nodejs" "/usr/bin/node" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "8.0.0" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test --exclude src/tools/tidy
Build completed unsuccessfully in 1:05:52
Build completed unsuccessfully in 1:05:52
== clock drift check ==
  local time: Mon May 18 23:59:19 UTC 2020
  network time: Mon, 18 May 2020 23:59:19 GMT
== end clock drift check ==

##[error]Bash exited with code '1'.
##[section]Finishing: Run build
##[section]Starting: Checkout rust-lang/rust@refs/pull/70793/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
Author       : Microsoft
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Cleaning any cached credential from repository: rust-lang/rust (GitHub)
##[section]Finishing: Checkout rust-lang/rust@refs/pull/70793/merge to s
Cleaning up task key
Start cleaning up orphan processes.
Terminate orphan process: pid (3561) (python)
##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

src/test/ui/iterators/issue-58952-filter-type-length.rs

the8472 · 2020-05-24T12:47:52Z

@Dylan-DPC it's ready for review now.

the8472 · 2020-05-26T15:16:34Z

This could also use a new perf run.

bors · 2020-09-03T21:20:26Z

⌛ Testing commit 2f23a0f with merge 0d0f6b1...

bors · 2020-09-03T23:27:39Z

☀️ Test successful - checks-actions, checks-azure
Approved by: Amanieu
Pushing 0d0f6b1 to master...

ecstatic-morse · 2020-09-08T19:43:37Z

Final perf results are here. A small increase in instruction counts during compilation, but a meaningful increase in runtime performance, at least for doc builds. Seems like this is expected. Thanks @the8472.

RalfJung · 2020-09-13T10:26:59Z

library/alloc/tests/vec.rs

+#[test]
+fn test_from_iter_specialization_with_iterator_adapters() {
+    fn assert_in_place_trait<T: InPlaceIterable>(_: &T) {};
+    let src: Vec<usize> = vec![0usize; 65535];


FWIW, this test takes forever to run in Miri.^^ I'll reduce it to 512 elements when cfg(miri) is set to make https://github.com/RalfJung/miri-test-libstd not take forever.
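
A reduction like the one described could look roughly as follows. This is a sketch under assumptions, not the change that landed: source_len is a hypothetical helper, and only the two sizes (65535 and 512) come from the comment above.

```rust
/// Hypothetical helper: pick a smaller input when running under Miri,
/// where interpreting 65535 elements is very slow.
fn source_len() -> usize {
    if cfg!(miri) { 512 } else { 65535 }
}

fn main() {
    let src: Vec<usize> = vec![0usize; source_len()];
    let out: Vec<usize> = src.into_iter().map(|x| x + 1).collect();
    assert_eq!(out.len(), source_len());
    assert!(out.iter().all(|&x| x == 1));
}
```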

@dylni

…lacrum Fix liballoc test suite for Miri Mostly, fix the regression introduced by rust-lang#75207 that caused slices (i.e., references) to be created to invalid memory or memory that has aliasing pointers that we want to keep valid. @dylni this changes the type of `check_range` to only require the length, not the full reference to the slice, which indeed is all the information this function requires. Also reduce the size of a test introduced in rust-lang#70793 to make it not take 3 minutes in Miri. This makes https://github.com/RalfJung/miri-test-libstd work again.


Mark-Simulacrum · 2020-10-26T15:16:56Z

@the8472 fwiw, on future PRs, especially when rebasing, it would be great to squash out commits that are fixups (e.g., tidy fixes) and such to have a cleaner git history.

rust-highfive assigned sfackler Apr 4, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 4, 2020

the8472 force-pushed the in-place-iter-collect branch from 09268c2 to be03b6a Compare April 6, 2020 16:58

the8472 force-pushed the in-place-iter-collect branch from be03b6a to 6e28e84 Compare April 6, 2020 22:34

joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 16, 2020

rust-highfive assigned LukasKalbertodt and unassigned sfackler Apr 16, 2020

joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2020

Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2020

LukasKalbertodt added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 3, 2020

the8472 force-pushed the in-place-iter-collect branch from 6e28e84 to 4cce024 Compare May 18, 2020 22:48

dtolnay added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 28, 2020

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 3, 2020

lcnr removed their request for review September 3, 2020 21:21

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 3, 2020

bors merged commit 0d0f6b1 into rust-lang:master Sep 3, 2020

This was referenced Sep 3, 2020

Detect overflow in proc_macro_server subspan #75146

Merged

Add Iterator::intersperse #75784

Closed

Overload Iterator::last() for IntoIter. #76087

Closed

Stabilize feature(binary_heap_into_iter_sorted) #76234

Closed

the8472 deleted the in-place-iter-collect branch September 5, 2020 21:26

RalfJung mentioned this pull request Sep 13, 2020

Fix liballoc test suite for Miri #76662

Merged

RalfJung reviewed Sep 13, 2020

the8472 mentioned this pull request Oct 3, 2020

reuse RHS allocation for vec.extend(vec.into_iter()) when they do not fit into the LHS #77496

Closed

LeSeulArtichaut mentioned this pull request Oct 20, 2020

regression: could not fully normalize type #78139

Closed

alecmocatta mentioned this pull request Nov 5, 2020

Building on stable RustyYato/vec-utils#3

Merged

marmeladema mentioned this pull request Nov 20, 2020

Performance regression in 1.48.0 #79246

Closed

glandium mentioned this pull request Dec 4, 2020

Do not inline finish_grow #78682

Merged

SimonSapin mentioned this pull request May 16, 2021

Public unstable feature="inplace_iteration" APIs without a tracking issue #85373

Closed

SkiFire13 mentioned this pull request May 18, 2021

The implementation of InPlaceIterable for Peekable is unsound #85322

Closed

MarinPostma mentioned this pull request Sep 13, 2021

Implement documents API meilisearch/milli#324

Merged

steffahn mentioned this pull request Sep 9, 2022

In-place optimisation in IntoIterator can lead to memory leak #101628

Closed

tbu- mentioned this pull request Jul 25, 2023

Comments: Learn Unsafe Rust from My Mistakes geo-ant/blog#49

Open
