BTree: lazily locate leaves in rangeless iterators #86031

ssomers · 2021-06-05T17:32:28Z

BTree iterators always locate both the first and last leaf edge and often only need either one, i.e., whenever they are traversed in a single direction, like in for-loops and in the common use of iter().next() or iter().next_back() to retrieve the first or last key/value-pair (#62924). It's fairly easy to avoid because the iterators with this disadvantage already are quite separate from other iterators.

r? @Mark-Simulacrum

ssomers · 2021-06-05T17:48:34Z

About alloc's own benches:

They're generally positive, but who knows what that means in other code.
The btree::map::iter_X benches obviously benefit, but they're quite artificial now since they merely invoke iter().
In small maps, or intersection benches that can quickly bail out, the "descend" to the leaf node didn't take long anyway and the extra flags that take care of avoiding it slow down.
The btree::map::first_and_last_X_stable benches show that legacy iter().next() calls are twice as fast, but they are still considerably slower than the nightly first and last functions promoted in Tracking issue for map_first_last: first/last methods on BTreeSet and BTreeMap #62924. No idea why.

 name                                           mid ns/iter  new ns/iter  diff ns/iter   diff %  speedup
 btree::map::clone_fat_val_100                  10,772       11,452                680    6.31%   x 0.94
 btree::map::clone_fat_val_100_and_clear        10,746       10,737                 -9   -0.08%   x 1.00
 btree::map::clone_fat_val_100_and_drain_all    44,356       44,764                408    0.92%   x 0.99
 btree::map::clone_fat_val_100_and_drain_half   31,685       31,179               -506   -1.60%   x 1.02
 btree::map::clone_fat_val_100_and_into_iter    9,913        10,070                157    1.58%   x 0.98
 btree::map::clone_fat_val_100_and_pop_all      54,122       54,185                 63    0.12%   x 1.00
 btree::map::clone_fat_val_100_and_remove_all   45,772       46,705                933    2.04%   x 0.98
 btree::map::clone_fat_val_100_and_remove_half  29,832       29,763                -69   -0.23%   x 1.00
 btree::map::clone_slim_100                     4,303        4,309                   6    0.14%   x 1.00
 btree::map::clone_slim_100_and_clear           4,255        4,231                 -24   -0.56%   x 1.01
 btree::map::clone_slim_100_and_drain_all       6,091        6,150                  59    0.97%   x 0.99
 btree::map::clone_slim_100_and_drain_half      5,370        5,307                 -63   -1.17%   x 1.01
 btree::map::clone_slim_100_and_into_iter       4,303        4,240                 -63   -1.46%   x 1.01
 btree::map::clone_slim_100_and_pop_all         6,321        6,144                -177   -2.80%   x 1.03
 btree::map::clone_slim_100_and_remove_all      7,097        6,901                -196   -2.76%   x 1.03
 btree::map::clone_slim_100_and_remove_half     5,590        5,575                 -15   -0.27%   x 1.00
 btree::map::clone_slim_10k                     181,173      182,340             1,167    0.64%   x 0.99
 btree::map::clone_slim_10k_and_clear           179,972      185,704             5,732    3.18%   x 0.97
 btree::map::clone_slim_10k_and_drain_all       375,875      372,945            -2,930   -0.78%   x 1.01
 btree::map::clone_slim_10k_and_drain_half      316,386      313,505            -2,881   -0.91%   x 1.01
 btree::map::clone_slim_10k_and_into_iter       184,300      184,380                80    0.04%   x 1.00
 btree::map::clone_slim_10k_and_pop_all         406,045      406,948               903    0.22%   x 1.00
 btree::map::clone_slim_10k_and_remove_all      537,710      482,480           -55,230  -10.27%   x 1.11
 btree::map::clone_slim_10k_and_remove_half     482,560      491,320             8,760    1.82%   x 0.98
 btree::map::find_rand_100                      9            11                      2   22.22%   x 0.82
 btree::map::find_rand_10_000                   52           53                      1    1.92%   x 0.98
 btree::map::find_seq_100                       11           12                      1    9.09%   x 0.92
 btree::map::find_seq_10_000                    36           35                     -1   -2.78%   x 1.03
 btree::map::first_and_last_0_nightly           9            9                       0    0.00%   x 1.00
 btree::map::first_and_last_0_stable            24           24                      0    0.00%   x 1.00
 btree::map::first_and_last_100_nightly         33           48                     15   45.45%   x 0.69
 btree::map::first_and_last_100_stable          236          102                  -134  -56.78%   x 2.31
 btree::map::first_and_last_10k_nightly         68           60                     -8  -11.76%   x 1.13
 btree::map::first_and_last_10k_stable          259          121                  -138  -53.28%   x 2.14
 btree::map::insert_rand_100                    38           38                      0    0.00%   x 1.00
 btree::map::insert_rand_10_000                 38           38                      0    0.00%   x 1.00
 btree::map::insert_seq_100                     46           46                      0    0.00%   x 1.00
 btree::map::insert_seq_10_000                  93           95                      2    2.15%   x 0.98
 btree::map::iter_0                             1,254        1,755                 501   39.95%   x 0.71
 btree::map::iter_1                             4,013        1,755              -2,258  -56.27%   x 2.29
 btree::map::iter_100                           4,388        1,724              -2,664  -60.71%   x 2.55
 btree::map::iter_10k                           5,952        1,755              -4,197  -70.51%   x 3.39
 btree::map::iter_1m                            8,179        1,755              -6,424  -78.54%   x 4.66
 btree::map::iteration_1000                     4,092        3,850                -242   -5.91%   x 1.06
 btree::map::iteration_100000                   526,900      464,360           -62,540  -11.87%   x 1.13
 btree::map::iteration_20                       65           60                     -5   -7.69%   x 1.08
 btree::map::iteration_mut_1000                 4,250        4,011                -239   -5.62%   x 1.06
 btree::map::iteration_mut_100000               532,400      476,360           -56,040  -10.53%   x 1.12
 btree::map::iteration_mut_20                   66           62                     -4   -6.06%   x 1.06
 btree::map::range_included_excluded            393,385      406,255            12,870    3.27%   x 0.97
 btree::map::range_included_included            401,615      402,760             1,145    0.29%   x 1.00
 btree::map::range_included_unbounded           137,408      139,191             1,783    1.30%   x 0.99
 btree::map::range_unbounded_unbounded          45,399       42,853             -2,546   -5.61%   x 1.06
 btree::map::range_unbounded_vs_iter            45,758       18,328            -27,430  -59.95%   x 2.50
 btree::set::clone_100                          1,408        1,401                  -7   -0.50%   x 1.00
 btree::set::clone_100_and_clear                1,410        1,411                   1    0.07%   x 1.00
 btree::set::clone_100_and_drain_all            2,960        2,950                 -10   -0.34%   x 1.00
 btree::set::clone_100_and_drain_half           2,309        2,304                  -5   -0.22%   x 1.00
 btree::set::clone_100_and_into_iter            1,436        1,557                 121    8.43%   x 0.92
 btree::set::clone_100_and_pop_all              2,704        2,713                   9    0.33%   x 1.00
 btree::set::clone_100_and_remove_all           3,706        3,561                -145   -3.91%   x 1.04
 btree::set::clone_100_and_remove_half          2,410        2,469                  59    2.45%   x 0.98
 btree::set::clone_10k                          170,150      166,192            -3,958   -2.33%   x 1.02
 btree::set::clone_10k_and_clear                170,544      170,363              -181   -0.11%   x 1.00
 btree::set::clone_10k_and_drain_all            326,732      323,010            -3,722   -1.14%   x 1.01
 btree::set::clone_10k_and_drain_half           282,630      281,396            -1,234   -0.44%   x 1.00
 btree::set::clone_10k_and_into_iter            171,660      182,752            11,092    6.46%   x 0.94
 btree::set::clone_10k_and_pop_all              314,910      310,033            -4,877   -1.55%   x 1.02
 btree::set::clone_10k_and_remove_all           455,900      415,210           -40,690   -8.93%   x 1.10
 btree::set::clone_10k_and_remove_half          446,190      438,070            -8,120   -1.82%   x 1.02
 btree::set::difference_random_100_vs_100       1,009        1,077                  68    6.74%   x 0.94
 btree::set::difference_random_100_vs_10k       2,128        2,116                 -12   -0.56%   x 1.01
 btree::set::difference_random_10k_vs_100       69,723       76,655              6,932    9.94%   x 0.91
 btree::set::difference_random_10k_vs_10k       217,273      227,112             9,839    4.53%   x 0.96
 btree::set::difference_staggered_100_vs_100    1,111        1,266                 155   13.95%   x 0.88
 btree::set::difference_staggered_100_vs_10k    2,050        2,080                  30    1.46%   x 0.99
 btree::set::difference_staggered_10k_vs_10k    110,005      126,791            16,786   15.26%   x 0.87
 btree::set::intersection_100_neg_vs_100_pos    17           28                     11   64.71%   x 0.61
 btree::set::intersection_100_neg_vs_10k_pos    18           30                     12   66.67%   x 0.60
 btree::set::intersection_100_pos_vs_100_neg    17           28                     11   64.71%   x 0.61
 btree::set::intersection_100_pos_vs_10k_neg    18           30                     12   66.67%   x 0.60
 btree::set::intersection_10k_neg_vs_100_pos    18           29                     11   61.11%   x 0.62
 btree::set::intersection_10k_neg_vs_10k_pos    19           32                     13   68.42%   x 0.59
 btree::set::intersection_10k_pos_vs_100_neg    18           29                     11   61.11%   x 0.62
 btree::set::intersection_10k_pos_vs_10k_neg    19           32                     13   68.42%   x 0.59
 btree::set::intersection_random_100_vs_100     717          582                  -135  -18.83%   x 1.23
 btree::set::intersection_random_100_vs_10k     1,882        1,896                  14    0.74%   x 0.99
 btree::set::intersection_random_10k_vs_100     1,861        1,852                  -9   -0.48%   x 1.00
 btree::set::intersection_random_10k_vs_10k     166,006      164,202            -1,804   -1.09%   x 1.01
 btree::set::intersection_staggered_100_vs_100  641          617                   -24   -3.74%   x 1.04
 btree::set::intersection_staggered_100_vs_10k  1,770        1,847                  77    4.35%   x 0.96
 btree::set::intersection_staggered_10k_vs_10k  63,001       61,322             -1,679   -2.67%   x 1.03
 btree::set::is_subset_100_vs_100               626          621                    -5   -0.80%   x 1.01
 btree::set::is_subset_100_vs_10k               1,441        1,349                 -92   -6.38%   x 1.07
 btree::set::is_subset_10k_vs_100               2            2                       0    0.00%   x 1.00
 btree::set::is_subset_10k_vs_10k               63,428       61,306             -2,122   -3.35%   x 1.03

Mark-Simulacrum · 2021-06-07T00:20:18Z

@bors try @rust-timer queue

rust-timer · 2021-06-07T00:20:19Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-06-07T00:20:26Z

⌛ Trying commit 9e1ac543d134a115fe253691eb9b137192560b7a with merge ae3cf1a90b2081be4fab0ca50bf8e1c75877a29d...

bors · 2021-06-07T01:05:39Z

☀️ Try build successful - checks-actions
Build commit: ae3cf1a90b2081be4fab0ca50bf8e1c75877a29d (ae3cf1a90b2081be4fab0ca50bf8e1c75877a29d)

rust-timer · 2021-06-07T01:05:41Z

Queued ae3cf1a90b2081be4fab0ca50bf8e1c75877a29d with parent 35fff69, future comparison URL.

rust-timer · 2021-06-07T04:21:34Z

Finished benchmarking try commit (ae3cf1a90b2081be4fab0ca50bf8e1c75877a29d): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

ssomers · 2021-06-21T07:56:01Z

Last commits fix variance and try to maintain #[inline] placement, which seems to dominate performance.

bors · 2021-07-08T18:05:21Z

☔ The latest upstream changes (presumably #86520) made this pull request unmergeable. Please resolve the merge conflicts.

Mark-Simulacrum · 2021-08-01T14:17:59Z

@bors try @rust-timer queue

Seems generally OK, though I wonder if the performance regressions are largely down to slightly more complicated code. The theoretical improvement from avoiding the extra traversal (in practice usually just one, I imagine; most iterators are likely traversed at least a few times) is worth the possible compile time overheads...

rust-timer · 2021-08-01T14:18:00Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-08-01T14:18:11Z

⌛ Trying commit 35d02e2 with merge d3ac1a943e62f13e360607ac74b33332a0916f76...

bors · 2021-08-01T15:09:52Z

☀️ Try build successful - checks-actions
Build commit: d3ac1a943e62f13e360607ac74b33332a0916f76 (d3ac1a943e62f13e360607ac74b33332a0916f76)

rust-timer · 2021-08-01T15:09:54Z

Queued d3ac1a943e62f13e360607ac74b33332a0916f76 with parent 8d57c0a, future comparison URL.

rust-timer · 2021-08-01T17:37:48Z

Finished benchmarking try commit (d3ac1a943e62f13e360607ac74b33332a0916f76): comparison url.

Summary: This benchmark run did not return any significant changes.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Mark-Simulacrum · 2021-08-01T17:43:47Z

@bors r+

bors · 2021-08-01T17:43:48Z

📌 Commit 35d02e2 has been approved by Mark-Simulacrum

ssomers · 2021-08-01T19:24:01Z

avoiding the extra traversal (in practice usually just one, I imagine; most iterators are likely traversed at least a few times)

I don't understand what you're referring to (but I don't have to). It's always about just one (multi-level) traversal, and one that's going to be done anyway if iterating to the end. For huge trees, iterating to the end could save multiple memory fetches of nodes that get bored in the cache and leave, before they're read for real, but then it's a huge tree anyway, and just traversing to the fixed first leaf node is wasteful.

As to the slightly more complicated code, it should allow reverting #81486.

bors · 2021-08-01T21:45:35Z

⌛ Testing commit 35d02e2 with merge cd5a90f...

bors · 2021-08-02T00:13:35Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing cd5a90f to master...

…lang#86031

…r=Mark-Simulacrum BTree: merge the complication introduced by rust-lang#81486 and rust-lang#86031 Also: - Deallocate the last few tree nodes as soon as an `into_iter` iterator steps beyond the end, instead of waiting around for the drop of the iterator (just to share more code). - Symmetric code for backward iteration. - Mark unsafe the methods on dying handles, modelling dying handles after raw pointers: it's the caller's responsibility to use them safely. r? `@Mark-Simulacrum`

rust-highfive assigned Mark-Simulacrum Jun 5, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 5, 2021

This comment has been minimized.

Sign in to view

ssomers marked this pull request as draft June 5, 2021 17:52

This comment has been minimized.

Sign in to view

ssomers force-pushed the btree_lazy_iterator branch from 2dab5be to 9e1ac54 Compare June 6, 2021 12:28

ssomers marked this pull request as ready for review June 6, 2021 12:28

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 7, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 7, 2021

ssomers marked this pull request as draft June 9, 2021 10:54

Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 9, 2021

ssomers force-pushed the btree_lazy_iterator branch from 9e1ac54 to 0272c3a Compare June 21, 2021 07:20

ssomers marked this pull request as ready for review June 21, 2021 07:51

ssomers marked this pull request as draft June 21, 2021 19:33

This comment has been minimized.

Sign in to view

ssomers force-pushed the btree_lazy_iterator branch from 0272c3a to 1e977a7 Compare June 26, 2021 11:43

ssomers marked this pull request as ready for review June 26, 2021 11:43

BTree: lazily locate leaves in rangeless iterators

35d02e2

ssomers force-pushed the btree_lazy_iterator branch from 1e977a7 to 35d02e2 Compare July 8, 2021 20:34

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 1, 2021

Mark-Simulacrum removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 1, 2021

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 1, 2021

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 1, 2021

Mark-Simulacrum mentioned this pull request Aug 1, 2021

comment summary does not match significant changes filter on comparison pages rust-lang/rustc-perf#952

Closed

ssomers mentioned this pull request Aug 1, 2021

Tracking issue for map_first_last: first/last methods on BTreeSet and BTreeMap #62924

Closed

6 tasks

bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 2, 2021

bors merged commit cd5a90f into rust-lang:master Aug 2, 2021

rustbot added this to the 1.56.0 milestone Aug 2, 2021

bors mentioned this pull request Aug 2, 2021

Port clippy lint redundant_field_names to compiler #87512

Closed

ssomers deleted the btree_lazy_iterator branch August 2, 2021 08:54

ssomers added a commit to ssomers/rust that referenced this pull request Aug 2, 2021

BTree: merge the complication introduced by rust-lang#81486 and rust-…

7b28036

…lang#86031

Eh2406 mentioned this pull request Aug 6, 2021

from and as RangeBounds pubgrub-rs/pubgrub#105

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BTree: lazily locate leaves in rangeless iterators #86031

BTree: lazily locate leaves in rangeless iterators #86031

ssomers commented Jun 5, 2021 •

edited

Loading

This comment has been minimized.

ssomers commented Jun 5, 2021

This comment has been minimized.

Mark-Simulacrum commented Jun 7, 2021

rust-timer commented Jun 7, 2021

bors commented Jun 7, 2021

bors commented Jun 7, 2021

rust-timer commented Jun 7, 2021

rust-timer commented Jun 7, 2021

ssomers commented Jun 21, 2021 •

edited

Loading

This comment has been minimized.

bors commented Jul 8, 2021

Mark-Simulacrum commented Aug 1, 2021

rust-timer commented Aug 1, 2021

bors commented Aug 1, 2021

bors commented Aug 1, 2021

rust-timer commented Aug 1, 2021

rust-timer commented Aug 1, 2021

Mark-Simulacrum commented Aug 1, 2021

bors commented Aug 1, 2021

ssomers commented Aug 1, 2021

bors commented Aug 1, 2021

bors commented Aug 2, 2021

BTree: lazily locate leaves in rangeless iterators #86031

BTree: lazily locate leaves in rangeless iterators #86031

Conversation

ssomers commented Jun 5, 2021 • edited Loading

This comment has been minimized.

ssomers commented Jun 5, 2021

This comment has been minimized.

Mark-Simulacrum commented Jun 7, 2021

rust-timer commented Jun 7, 2021

bors commented Jun 7, 2021

bors commented Jun 7, 2021

rust-timer commented Jun 7, 2021

rust-timer commented Jun 7, 2021

ssomers commented Jun 21, 2021 • edited Loading

This comment has been minimized.

bors commented Jul 8, 2021

Mark-Simulacrum commented Aug 1, 2021

rust-timer commented Aug 1, 2021

bors commented Aug 1, 2021

bors commented Aug 1, 2021

rust-timer commented Aug 1, 2021

rust-timer commented Aug 1, 2021

Mark-Simulacrum commented Aug 1, 2021

bors commented Aug 1, 2021

ssomers commented Aug 1, 2021

bors commented Aug 1, 2021

bors commented Aug 2, 2021

ssomers commented Jun 5, 2021 •

edited

Loading

ssomers commented Jun 21, 2021 •

edited

Loading