Iterator::fold is a little slow compared to bare loop #76725
Labels
A-iterators
Area: Iterators
I-slow
Issue: Problems and improvements with respect to performance of generated code.
T-libs
Relevant to the library team, which will review and decide on the PR/issue.
T-libs-api
Relevant to the library API team, which will review and decide on the PR/issue.
Background
I was writing a small project to compare the memory impact of iterators. I decided to also investigate the runtime impact and see how close Rust came to providing various zero cost abstractions to iteration. I compared three types of iteration:
and saw this Criterion output plot:
The outliers to the right are the timings of the
fold
functions while the two pairs on the left are thefilter_map_filter
andraw
versions. Rust did a great job ensuring.filter.map.filter
was the same speed as a raw loop but.fold
seemed to be lacking.Quick Investigation
Looking at the source for
.fold
theaccum
is reassigned with the result of each invocation off
. I quickly tested if this could be improved with a&mut
instead in this PR. The result was surprising (to me):The "custom fold" method was faster than all the other options (which doesn't make a ton of sense to me but that's what y'all are here for!).
Path Forward
I initially was going to suggest adding some sort of
fold_mut
or some better named method to allow for this fasterfold
iterator. This could be a performance improvement in some areas and could also improve the syntax when the closure couldn't "easily" return the new accumulator:I made a branch for this if we want to head in that direction (the tests are slim, the benchmarks are probably overkill, the stability is missing, and the docs are probably slim and improperly formatted but it's a start!) and saw some improvements in the benchmarks I added:
Now I'm not sure if this "
fold_mut
" path is the right way to go - I'm not sure if it's awkward or dangerous. It seems similar to Ruby'seach_with_object
so there's maybe something there. It could also be that with some compiler witchcraft we can just makefold
a "true" zero cost abstraction.In any case, thought I'd post here instead of making a PR so we could decided if there should be any PR and I'm happy to help with whatever path forward we choose!
The text was updated successfully, but these errors were encountered: