Improve Set Difference size_hint lower bound #530

ToMe25 · 2024-06-07T10:52:19Z

This PR makes the Set Difference iterator generate a non-zero lower bound in some situations.

Specifically, it now returns a non-zero lower bound if the difference method is called on a larger set with a smaller set.
That is because in those cases the fact that sets can't really contains duplicates* guarantees that a minimum of self.len() - other.len() items will be returned by the iterator.

* Well, they can, but that is already documented to be causing a random mess

This implementation has the disadvantage that a single next() call may reduce the lower bound by more than one.
Every size hint generated, even the first largest one, is guaranteed to be correct, but it may be confusing if one next() call causes the lower bound to drop by more than one.

This could be avoided by storing the minimum number of resulting elements in the iterator and subtracting one each time next() is called, but I don't think its worth the added complexity.

This implementation isn't perfect, but it is spec compliant as far as I can tell

ToMe25 · 2024-06-07T10:57:28Z

My motivation for this was that I'm a perfectionist and get annoyed by this kind of thing, but I still did A LOT of benchmarking.

The result of my benchmarking was that while the changes to collecting the various iterators to a HashSet were as expected, the changes to collecting to Vec were kinda weird.
For some reason this change improved the performance of collecting an Itersection to a Vec, even though this change doesn't affect Intersection at all.
This also drastically decreases the performance of collecting to a Vec, which I believe to be solely due to the fact that the inlined lower bound is no longer a constant.

The summary of my results is this:
Collecting to a HashSet gets faster by ~20% in the affected cases, and remains unchanged in the other cases.
Collecting to a Vec gets faster or slower by ~20-50% seemingly at random, but overall slower on average.

Here my exact results(click me):

Name	Before	After	Diff (%)
set_iter_collect_difference_large_small_hashset	55,917.77	44,172.59	-21.00%
set_iter_collect_difference_large_small_vec	22,996.61	34,235.16	48.87%
set_iter_collect_difference_small_large_hashset	2,785.54	2,737.30	-1.73%
set_iter_collect_difference_small_large_vec	2,071.05	1,598.02	-22.84%
set_iter_collect_intersection_large_small_hashset	2,862.06	2,860.77	-0.05%
set_iter_collect_intersection_large_small_vec	2,403.75	1,630.63	-32.16%
set_iter_collect_intersection_small_large_hashset	2,866.35	2,885.31	0.66%
set_iter_collect_intersection_small_large_vec	2,490.41	1,623.99	-34.79%
set_iter_collect_symmetric_difference_large_small_hashset	60,269.34	47,027.25	-21.97%
set_iter_collect_symmetric_difference_large_small_vec	24,543.87	34,968.29	42.47%
set_iter_collect_symmetric_difference_small_large_hashset	60,668.83	45,848.87	-24.43%
set_iter_collect_symmetric_difference_small_large_vec	24,864.28	34,386.12	38.30%
set_iter_collect_union_large_small_hashset	42,177.86	42,743.26	1.34%
set_iter_collect_union_large_small_vec	31,568.72	30,323.12	-3.95%
set_iter_collect_union_small_large_hashset	42,909.41	42,364.49	-1.27%
set_iter_collect_union_small_large_vec	25,763.16	30,302.66	17.62%

The tests were run with a "large" 1000 element set and a "small" 100 element set with an overlap of 50 elements.
The large_small/small_large portion of the name indicates which set the operation was called on.
large_small translates to large.$op(small).

That means that from a performance point of view, whether this change is worth it depends on whether collecting to a Vec or a HashSet is more common.

ToMe25 · 2024-06-11T15:26:53Z

One way to potentially get rid of most of the collect::<Vec<T>>::() performance penalty of this change would be to implement a specialized version for Vecs.

I'm not entirely sure how trait based specialization works though, so I can't guarantee that this will work.
Another disadvantage would be that this would add a significant amount of extra code complexity.

The advantage is, though, that doing so should allow making collecting into a Vec faster than without this PR, since the initial first allocation could be larger.
I'm not certain, however, if this outweighs the performance penalty that the trait based specialization probably brings with it.
Nor am I sure whether this would be worth the added complexity.

Amanieu · 2024-06-17T13:28:19Z

The overhead of a saturating add is tiny compared to memory allocation so that is most likely not the direct cause of the regression. Finding the actual cause would require looking at the exact assembly code generated by rustc and comparing the two versions along with profiling information.

However I think it is still better to merge this for now since it does provide a more accurate lower bound for the iterator size.

@bors r+

bors · 2024-06-17T13:28:22Z

📌 Commit f4361bc has been approved by Amanieu

It is now in the queue for this repository.

bors · 2024-06-17T13:29:28Z

⌛ Testing commit f4361bc with merge 65c553d...

bors · 2024-06-17T13:38:15Z

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing 65c553d to master...

Implement XxxAssign operations on HashSets This PR primarily implements the XxxAssign operation traits for `HashSet`. My primary motivation to do so is for convenience, but depending on the situation they can provide a significant performance improvement as well.\* In my tests, which may not be ideal because I don't have much benchmarking experience, the assigning operations are, with the exception of `Sub`, a minimum of 25% faster.\* Note that when swapping the large and the small set around, some of these are significantly slower than the non-assigning variant. Therefore using them is likely only worth it performance wise, if you already know which set is larger, and the right one of the sets just so happens to be the one you don't need to keep. \* Results may have changed due to #530 being merged Here my exact benchmark results, done with the newly added benchmark suit: <!DOCTYPE html> VER | LSIZE | SSIZE | OP | NS/ITER | DIFF (%) | COMMENT -- | -- | -- | -- | -- | -- | -- 1 | 1000 | 100 | and | 5,682.88 | | 1 | 1000 | 100 | or | 41,427.82 | | 1 | 1000 | 100 | xor | 57,404.27 | | 1 | 1000 | 100 | subls | 56,262.53 | | 1 | 1000 | 100 | subsl | 751.42 | | 1 | 1000 | 2 | and | 100.16 | | 1 | 1000 | 2 | or | 40,435.09 | | 1 | 1000 | 2 | xor | 59,058.05 | | 1 | 1000 | 2 | subls | 58,668.34 | | 1 | 1000 | 2 | subsl | 18.89 | | 1 | 1000 | 100 | or_ass | 32,888.49 | -20.61% | unconditional insert 2 | 1000 | 100 | or_ass | 29,397.04 | -29.04% | !contains insert 3 | 1000 | 100 | or_ass | 32,399.65 | -21.79% | extend iter().cloned() 4 | 1000 | 100 | or_ass | 30,693.33 | -25.91% | get_or_insert_owned 5 | 1000 | 100 | or_ass | 33,722.59 | -18.60% | calc intersection; extend rhs.iter() !intersection contains; Requires S: Clone 1 | 1000 | 100 | add_ass | 30,114.17 | -26.66% | !contains insert 1 | 1000 | 100 | xor_ass | 32,309.85 | -43.72% | contains remove else insert 2 | 1000 | 100 | xor_ass | 40,058.48 | -30.22% | extract_if rhs contains; extend !removed contains 3 | 1000 | 100 | xor_ass | 31,801.04 | -44.60% | raw_entry().from_key() replace_entry_with / insert 4 | 1000 | 100 | xor_ass | 31,935.07 | -44.37% | raw_entry().from_key_hashed_nocheck() replace_entry_with / insert_hashed_nocheck 5 | 1000 | 100 | xor_ass | 31,843.33 | -44.53% | self.map.table.get.is_none self.map.table.insert else self.map.table.remove_entry 1 | 1000 | 100 | subls_ass | 33,366.13 | -40.70% | contains remove 1 | 1000 | 100 | subsl_ass | 10,686.02 | 1322.11% | contains remove 2 | 1000 | 100 | subls_ass | 36,351.69 | -35.39% | retain !contains 2 | 1000 | 100 | subsl_ass | 3,939.67 | 424.30% | retain !contains 3 | 1000 | 100 | subls_ass | 32,012.82 | -43.10% | unconditional remove 3 | 1000 | 100 | subsl_ass | 9,908.76 | 1218.67% | unconditional remove 4 | 1000 | 100 | subls_ass | 36,232.13 | -35.60% | self.map.retain !contains 4 | 1000 | 100 | subsl_ass | 3,939.35 | 424.25% | self.map.retain !contains 5 | 1000 | 100 | subls_ass | 31,879.32 | -43.34% | if rhs smaller self unconditional remove else retain !contains 5 | 1000 | 100 | subsl_ass | 3,946.98 | 425.27% | if rhs smaller self unconditional remove else retain !contains 1 | 1000 | 2 | add_ass | 28,324.95 | -29.27% | 2 | 1000 | 2 | or_ass | 28,322.62 | -29.96% | 1 | 1000 | 2 | xor_ass | 29,107.31 | -50.71% | 3 | 1000 | 2 | xor_ass | 29,026.82 | -50.85% | 1 | 1000 | 2 | subls_ass | 29,310.04 | -50.04% | 1 | 1000 | 2 | subsl_ass | 4,212.56 | 22200.48% | 2 | 1000 | 2 | subls_ass | 34,074.85 | -41.92% | 2 | 1000 | 2 | subsl_ass | 66.43 | 251.67% | 3 | 1000 | 2 | subls_ass | 29,340.86 | -49.99% | 3 | 1000 | 2 | subsl_ass | 5,972.25 | 31515.93% | 5 | 1000 | 2 | subls_ass | 29,460.49 | -49.78% | 5 | 1000 | 2 | subsl_ass | 65.32 | 245.79% | In addition to the Assigning operators this PR changes a few more things: * It changes the allocator bound on the non-assigning set operations to `A: Allocator + Default`. * I also added a benchmark suit for the set operations.

Improve Set Difference size_hint

f4361bc

This implementation isn't perfect, but it is spec compliant as far as I can tell

bors merged commit 65c553d into rust-lang:master Jun 17, 2024
24 checks passed

ToMe25 deleted the fix_set_diff_size_hint branch June 17, 2024 14:28

ToMe25 mentioned this pull request Jun 17, 2024

Implement XxxAssign operations on HashSets #529

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Set Difference size_hint lower bound #530

Improve Set Difference size_hint lower bound #530

ToMe25 commented Jun 7, 2024

ToMe25 commented Jun 7, 2024

ToMe25 commented Jun 11, 2024

Amanieu commented Jun 17, 2024

bors commented Jun 17, 2024

bors commented Jun 17, 2024

bors commented Jun 17, 2024

Improve Set Difference size_hint lower bound #530

Improve Set Difference size_hint lower bound #530

Conversation

ToMe25 commented Jun 7, 2024

ToMe25 commented Jun 7, 2024

ToMe25 commented Jun 11, 2024

Amanieu commented Jun 17, 2024

bors commented Jun 17, 2024

bors commented Jun 17, 2024

bors commented Jun 17, 2024