Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specialize some collection and iterator operations to run in-place #70793

Merged
merged 63 commits into from
Sep 3, 2020

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Apr 4, 2020

This is a rebase and update of #66383 which was closed due inactivity.

Recent rustc changes made the compile time regressions disappear, at least for webrender-wrench. Running a stage2 compile and the rustc-perf suite takes hours on the hardware I have at the moment, so I can't do much more than that.

Screenshot_2020-04-05 rustc performance data

In the best case of the vec::bench_in_place_recycle synthetic microbenchmark these optimizations can provide a 15x speedup over the regular implementation which allocates a new vec for every benchmark iteration. Benchmark results. In real code the speedups are tiny, but it also depends on the allocator used, a system allocator that uses a process-wide mutex will benefit more than one with thread-local pools.

What was changed

  • SpecExtend which covered from_iter and extend specializations was split into separate traits
  • extend and from_iter now reuse the append_elements if passed iterators are from slices.
  • A preexisting vec.into_iter().collect::<Vec<_>>() optimization that passed through the original vec has been generalized further to also cover cases where the original has been partially drained.
  • A chain of Vec / BinaryHeap / Box<[T]> IntoIters through various iterator adapters collected into Vec and BinaryHeap will be performed in place as long as T and U have the same alignment and size and aren't ZSTs.
  • To enable above specialization the unsafe, unstable SourceIter and InPlaceIterable traits have been added. The first allows reaching through the iterator pipeline to grab a pointer to the source memory. The latter is a marker that promises that the read pointer will advance as fast or faster than the write pointer and thus in-place operation is possible in the first place.
  • vec::IntoIter implements TrustedRandomAccess for T: Copy to allow in-place collection when there is a Zip adapter in the iterator. TRA had to be made an unstable public trait to support this.

In-place collectible adapters

  • Map
  • MapWhile
  • Filter
  • FilterMap
  • Fuse
  • Skip
  • SkipWhile
  • Take
  • TakeWhile
  • Enumerate
  • Zip (left hand side only, Copy types only)
  • Peek
  • Scan
  • Inspect

Concerns

vec.into_iter().filter(|_| false).collect() will no longer return a vec with 0 capacity, instead it will return its original allocation. This avoids the cost of doing any allocation or deallocation but could lead to large allocations living longer than expected.
If that's not acceptable some resizing policy at the end of the attempted in-place collect would be necessary, which in the worst case could result in one more memcopy than the non-specialized case.

Possible followup work

  • split liballoc/vec.rs to remove ignore-tidy-filelength
  • try to get trivial chains such as vec.into_iter().skip(1).collect::<Vec<)>>() to compile to a memmove (currently compiles to a pile of SIMD, see Missed optimization: repeated pointer increments don't compile to a memcpy #69187 )
  • improve up the traits so they can be reused by other crates, e.g. itertools. I think currently they're only good enough for internal use
  • allow iterators sourced from a HashSet to be in-place collected into a Vec

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 4, 2020
@sfackler
Copy link
Member

sfackler commented Apr 4, 2020

cc @Centril isn't there a moratorium on new usage of specialization?

@Centril
Copy link
Contributor

Centril commented Apr 5, 2020

Yeah, there is. However, @matthewjasper recently landed #68970, which provides a mechanism for limited and sound specialization. I believe we need to complete the first two points in their PR description before we can add more specialization:

  • Update libcore and liballoc to compile with min_specialization.
  • Add a lint to forbid use of feature(specialization) (and other unsound, type system extending features) in the standard library.

@bors
Copy link
Contributor

bors commented Apr 5, 2020

☔ The latest upstream changes (presumably #70800) made this pull request unmergeable. Please resolve the merge conflicts.

@the8472

This comment has been minimized.

@joelpalmer joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 16, 2020
@Dylan-DPC-zz
Copy link

r? @LukasKalbertodt

@bors
Copy link
Contributor

bors commented Apr 17, 2020

☔ The latest upstream changes (presumably #71230) made this pull request unmergeable. Please resolve the merge conflicts.

@joelpalmer joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2020
@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2020
@Dylan-DPC-zz
Copy link

Dylan-DPC-zz commented Apr 28, 2020

@the8472 if you can resolve the conflicts we can get this reviewed

@the8472

This comment has been minimized.

@LukasKalbertodt LukasKalbertodt added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 3, 2020
@the8472

This comment has been minimized.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-8 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
##[section]Starting: Linux x86_64-gnu-llvm-8
##[section]Starting: Initialize job
Agent name: 'Azure Pipelines 9'
Agent machine name: 'fv-az578'
Current agent version: '2.168.2'
##[group]Operating System
16.04.6
LTS
LTS
##[endgroup]
##[group]Virtual Environment
Environment: ubuntu-16.04
Version: 20200512.2
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu16/20200512.2/images/linux/Ubuntu1604-README.md
##[endgroup]
Agent running as: 'vsts'
Prepare build directory.
Set build variables.
Download all required tasks.
Download all required tasks.
Downloading task: Bash (3.163.2)
Checking job knob settings.
   Knob: AgentToolsDirectory = /opt/hostedtoolcache Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: AgentPerflog = /home/vsts/perflog Source: ${VSTS_AGENT_PERFLOG} 
Start tracking orphan processes.
##[section]Finishing: Initialize job
##[section]Starting: Configure Job Name
==============================================================================
---
========================== Starting Command Output ===========================
[command]/bin/bash --noprofile --norc /home/vsts/work/_temp/5c897b11-70af-4966-ac8e-7fbd82149478.sh

##[section]Finishing: Disable git automatic line ending conversion
##[section]Starting: Checkout rust-lang/rust@refs/pull/70793/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
---
##[command]git remote add origin https://github.com/rust-lang/rust
##[command]git config gc.auto 0
##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
##[command]git config --get-all http.proxy
##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/70793/merge:refs/remotes/pull/70793/merge
---
 ---> cb2676f08729
Step 5/8 : ENV RUST_CONFIGURE_ARGS       --build=x86_64-unknown-linux-gnu       --llvm-root=/usr/lib/llvm-8       --enable-llvm-link-shared       --set rust.thin-lto-import-instr-limit=10
 ---> Using cache
 ---> df25ce111862
Step 6/8 : ENV SCRIPT python2.7 ../x.py test --exclude src/tools/tidy &&            python2.7 ../x.py test src/test/mir-opt --pass=build                                   --target=armv5te-unknown-linux-gnueabi &&            python2.7 ../x.py test src/tools/tidy
 ---> 599b9ac96b27
Step 7/8 : ENV NO_DEBUG_ASSERTIONS=1
 ---> Using cache
 ---> 091087e35a36
---
   Compiling fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
   Compiling rustc_ast_pretty v0.0.0 (/checkout/src/librustc_ast_pretty)
   Compiling chalk-rust-ir v0.10.0
   Compiling rustc_hir v0.0.0 (/checkout/src/librustc_hir)
   Compiling rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
   Compiling chalk-solve v0.10.0
   Compiling rustc_hir_pretty v0.0.0 (/checkout/src/librustc_hir_pretty)
   Compiling rustc_parse v0.0.0 (/checkout/src/librustc_parse)
   Compiling rustc_ast_lowering v0.0.0 (/checkout/src/librustc_ast_lowering)
---
   Compiling fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
   Compiling chalk-rust-ir v0.10.0
   Compiling rustc_ast_pretty v0.0.0 (/checkout/src/librustc_ast_pretty)
   Compiling rustc_hir v0.0.0 (/checkout/src/librustc_hir)
   Compiling rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
   Compiling chalk-solve v0.10.0
   Compiling rustc_hir_pretty v0.0.0 (/checkout/src/librustc_hir_pretty)
   Compiling rustc_parse v0.0.0 (/checkout/src/librustc_parse)
   Compiling rustc_ast_lowering v0.0.0 (/checkout/src/librustc_ast_lowering)
---
.......................................................i............................................ 1800/10175
.................................................................................................... 1900/10175
..........................................................................i..i...................... 2000/10175
.................................................................................................... 2100/10175
................................................................iiiii............................... 2200/10175
.................................................................................................... 2400/10175
.................................................................................................... 2500/10175
.................................................................................................... 2600/10175
.................................................................................................... 2700/10175
---
.................................................................................................... 5200/10175
.................................................................................................... 5300/10175
...........................i........................................................................ 5400/10175
....................i............................................................................... 5500/10175
............................iiFii........i...i...................................................... 5600/10175
..............................................................................i..................... 5800/10175
.................................................................................................... 5900/10175
.........................ii.....................................i................................... 6000/10175
.................................................................................................... 6100/10175
.................................................................................................... 6100/10175
.................................................................................................... 6200/10175
......................................................................................ii...i..ii.... 6300/10175
.................................................................................................... 6500/10175
.................................................................................................... 6600/10175
.................................................................................................... 6700/10175
.................................................................................................... 6700/10175
...................i..ii............................................................................ 6800/10175
.................................................................................................... 7000/10175
.........................................................................i.......................... 7100/10175
.................................................................................................... 7200/10175
.................................................................................................... 7300/10175
---
.................................................................................................... 8100/10175
.................................................................................................... 8200/10175
................................................................................................i... 8300/10175
.................................................................................................... 8400/10175
..................................................iiiiii.iiiiii.i................................... 8500/10175
....i............................................................................................... 8700/10175
.................................................................................................... 8800/10175
.................................................................................................... 8900/10175
.................................................................................................... 9000/10175
---
---- [ui] ui/iterators/issue-58952-filter-type-length.rs stdout ----

error: test compilation failed although it shouldn't!
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/iterators/issue-58952-filter-type-length.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/iterators/issue-58952-filter-type-length/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/iterators/issue-58952-filter-type-length/auxiliary"
------------------------------------------

------------------------------------------
stderr:
stderr:
------------------------------------------
error: reached the type-length limit while instantiating `<std::vec::IntoIter<i32> as std:...<std::vec::InPlaceDrop<i32>, !>>`
   |
   |
LL | /     fn try_fold<B, F, R>(&mut self, init: B, mut f: F) -> R
LL | |         Self: Sized,
LL | |         Self: Sized,
LL | |         F: FnMut(B, Self::Item) -> R,
LL | |         Try::from_ok(accum)
LL | |     }
   | |_____^
   |
   |
   = note: consider adding a `#![type_length_limit="1327046"]` attribute to your crate
error: aborting due to previous error


------------------------------------------
---
thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:348:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-8/bin/FileCheck" "--nodejs" "/usr/bin/node" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "8.0.0" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test --exclude src/tools/tidy
Build completed unsuccessfully in 1:05:52
Build completed unsuccessfully in 1:05:52
== clock drift check ==
  local time: Mon May 18 23:59:19 UTC 2020
  network time: Mon, 18 May 2020 23:59:19 GMT
== end clock drift check ==

##[error]Bash exited with code '1'.
##[section]Finishing: Run build
##[section]Starting: Checkout rust-lang/rust@refs/pull/70793/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
Author       : Microsoft
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Cleaning any cached credential from repository: rust-lang/rust (GitHub)
##[section]Finishing: Checkout rust-lang/rust@refs/pull/70793/merge to s
Cleaning up task key
Start cleaning up orphan processes.
Terminate orphan process: pid (3561) (python)
##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

@the8472
Copy link
Member Author

the8472 commented May 24, 2020

@Dylan-DPC it's ready for review now.

@the8472
Copy link
Member Author

the8472 commented May 26, 2020

This could also use a new perf run.

@dtolnay dtolnay added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 28, 2020
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 3, 2020
@bors
Copy link
Contributor

bors commented Sep 3, 2020

⌛ Testing commit 2f23a0f with merge 0d0f6b1...

@lcnr lcnr removed their request for review September 3, 2020 21:21
@bors
Copy link
Contributor

bors commented Sep 3, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: Amanieu
Pushing 0d0f6b1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 3, 2020
@bors bors merged commit 0d0f6b1 into rust-lang:master Sep 3, 2020
@the8472 the8472 deleted the in-place-iter-collect branch September 5, 2020 21:26
@ecstatic-morse
Copy link
Contributor

Final perf results are here. A small increase in instruction counts during compilation, but a meaningful increase in runtime performance, at least for doc builds. Seems like this is expected. Thanks @the8472.

#[test]
fn test_from_iter_specialization_with_iterator_adapters() {
fn assert_in_place_trait<T: InPlaceIterable>(_: &T) {};
let src: Vec<usize> = vec![0usize; 65535];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this test takes forever to run in Miri.^^ I'll reduce it to 512 elements when cfg(miri) is set to make https://github.com/RalfJung/miri-test-libstd not take forever.

RalfJung added a commit to RalfJung/rust that referenced this pull request Sep 13, 2020
…lacrum

Fix liballoc test suite for Miri

Mostly, fix the regression introduced by rust-lang#75207 that caused slices (i.e., references) to be created to invalid memory or memory that has aliasing pointers that we want to keep valid. @dylni  this changes the type of `check_range` to only require the length, not the full reference to the slice, which indeed is all the information this function requires.

Also reduce the size of a test introduced in rust-lang#70793 to make it not take 3 minutes in Miri.

This makes https://github.com/RalfJung/miri-test-libstd work again.
RalfJung added a commit to RalfJung/rust that referenced this pull request Sep 13, 2020
…lacrum

Fix liballoc test suite for Miri

Mostly, fix the regression introduced by rust-lang#75207 that caused slices (i.e., references) to be created to invalid memory or memory that has aliasing pointers that we want to keep valid. @dylni  this changes the type of `check_range` to only require the length, not the full reference to the slice, which indeed is all the information this function requires.

Also reduce the size of a test introduced in rust-lang#70793 to make it not take 3 minutes in Miri.

This makes https://github.com/RalfJung/miri-test-libstd work again.
RalfJung added a commit to RalfJung/rust that referenced this pull request Sep 16, 2020
…lacrum

Fix liballoc test suite for Miri

Mostly, fix the regression introduced by rust-lang#75207 that caused slices (i.e., references) to be created to invalid memory or memory that has aliasing pointers that we want to keep valid. @dylni  this changes the type of `check_range` to only require the length, not the full reference to the slice, which indeed is all the information this function requires.

Also reduce the size of a test introduced in rust-lang#70793 to make it not take 3 minutes in Miri.

This makes https://github.com/RalfJung/miri-test-libstd work again.
@Mark-Simulacrum
Copy link
Member

@the8472 fwiw, on future PRs, especially if rebasing it would be great to squash out commits that are fixups (e.g., tidy fixes) and such to have a cleaner git history.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.