Add fast paths to fill and wrap #478

Merged
merged 6 commits into from
Oct 1, 2022

@mgeisler mgeisler commented Oct 1, 2022

If the input is shorter than the wrapping width, we can return it immediately. This avoids finding words, breaking them, and finally reassembling them.

The speedup from this is on the order of 10-25 times, depending on the wrapping width. I tested with a wrapping width of 60 columns, and there the time for fill("") is now around 15 nanoseconds. It was around 150 nanoseconds before:

String lengths/fill_first_fit/0000
                        time:   [15.843 ns 15.880 ns 15.936 ns]
                        change: [-90.123% -90.066% -90.003%] (p = 0.00 < 0.05)
                        Performance has improved.

A similar 10x improvement is seen for fill("abcde"):

String lengths/fill_first_fit/0005
                        time:   [28.256 ns 28.311 ns 28.385 ns]
                        change: [-90.482% -90.428% -90.373%] (p = 0.00 < 0.05)
                        Performance has improved.

As strings get longer and closer to the wrapping width, the improvements get larger. Before this change, it took just shy of a microsecond to wrap a 50 character string on my machine:

String lengths/fill_first_fit/0050
                        time:   [943.91 ns 944.96 ns 946.17 ns]

Now it only takes 34 nanoseconds, a 27x improvement:

String lengths/fill_first_fit/0050
                        time:   [34.438 ns 34.506 ns 34.586 ns]
                        change: [-96.362% -96.336% -96.306%] (p = 0.00 < 0.05)
                        Performance has improved.

The time needed to wrap the input is nearly flat until you hit the wrapping width, at which point it grows linearly with the input length:

image

The zero-padded benchmark ID makes it easy to run ranges of
benchmarks, e.g., to run benchmarks on inputs shorter than 100 bytes,
run the “/00” benchmarks.

This adds a conservative check to `fill` which will avoid wrapping the
input when it is a single line which is shorter than the wrap width.
In that case, wrapping the input is now 10-25 times faster than
before.

The input length is measured in bytes, which means that we can
overestimate the length compared to `display_width`. This is because
multi-byte sequences (such as combining diacritics) can be displayed
as a single character. So we will end up taking the slow path for some
strings, but the vast majority of short strings will take the fast
path.
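
The conservative check described above can be sketched roughly as follows. This is a minimal illustration and not the crate's actual code: `takes_fast_path` and `fill_slow_path` are hypothetical names, the slow path is a simplified greedy stand-in for the real wrapping algorithm, and the character count is used as a crude approximation of `display_width`.

```rust
/// Returns true when the input can be returned without wrapping.
/// The length is measured in bytes, so the check is conservative:
/// it may reject a string whose displayed width would actually fit.
fn takes_fast_path(text: &str, width: usize) -> bool {
    text.len() < width && !text.contains('\n')
}

/// Hypothetical stand-in for the full wrapping algorithm (the slow path).
/// It greedily packs space-separated words into lines of at most `width`
/// characters. The character count only approximates the display width.
fn fill_slow_path(text: &str, width: usize) -> String {
    let mut out = String::new();
    for (i, line) in text.split('\n').enumerate() {
        if i > 0 {
            out.push('\n');
        }
        let mut line_len = 0;
        for word in line.split(' ').filter(|w| !w.is_empty()) {
            let word_len = word.chars().count();
            if line_len > 0 && line_len + 1 + word_len > width {
                // The word does not fit: start a new output line.
                out.push('\n');
                line_len = 0;
            } else if line_len > 0 {
                out.push(' ');
                line_len += 1;
            }
            out.push_str(word);
            line_len += word_len;
        }
    }
    out
}

/// Fills `text` to `width` columns, skipping the expensive work when
/// the input is a single line that is clearly short enough.
fn fill(text: &str, width: usize) -> String {
    if takes_fast_path(text, width) {
        // Fast path: no word finding, breaking, or reassembling.
        return text.trim_end_matches(' ').to_string();
    }
    fill_slow_path(text, width)
}
```

With a width of 5, the string `"caf\u{e9}"` is 5 bytes long but only 4 characters wide, so this sketch sends it down the slow path even though it would fit. That is the overestimation the commit message refers to: the result is still correct, only slower.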

The correctness of the fast path is checked with a new fuzz test.

This speeds up wrapping of multi-line inputs since each line can be
returned as-is when no wrapping is needed. The `wrap` function is now
up to 10 times faster when it sees that no wrapping is needed.
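
As a rough sketch of the per-line idea, assuming a simplified `wrap` that takes a plain width instead of the crate's options: each short line is returned as a borrowed slice of the input, and only long lines go through the wrapping work. The helper `wrap_line_slow_path` is a hypothetical greedy stand-in, not the crate's algorithm.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified slow path: greedy first-fit wrapping of a
/// single line on spaces, measuring words in bytes.
fn wrap_line_slow_path<'a>(line: &'a str, width: usize, out: &mut Vec<Cow<'a, str>>) {
    let mut current = String::new();
    for word in line.split(' ').filter(|w| !w.is_empty()) {
        if !current.is_empty() && current.len() + 1 + word.len() > width {
            // The word does not fit: emit the current line and start over.
            out.push(Cow::Owned(std::mem::take(&mut current)));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    out.push(Cow::Owned(current));
}

/// Wraps `text` line by line. Lines that are clearly short enough are
/// returned as-is, borrowing from the input without any allocation.
fn wrap(text: &str, width: usize) -> Vec<Cow<'_, str>> {
    let mut lines = Vec::new();
    for line in text.split('\n') {
        if line.len() < width {
            // Fast path: conservative byte-length check, no wrapping needed.
            lines.push(Cow::Borrowed(line.trim_end_matches(' ')));
        } else {
            wrap_line_slow_path(line, width, &mut lines);
        }
    }
    lines
}
```

In this sketch, `wrap("short line\nanother", 20)` yields two `Cow::Borrowed` values, while a line at or above the width falls through to the slow path and produces owned strings.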

The correctness of the fast path is checked with a new fuzz test.

This module is only present when running fuzz tests.
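
The property that the fuzz tests mentioned in the commit messages above are checking is, presumably, that the fast path and the slow path agree whenever the fast path applies. The sketch below illustrates that property with the standard library only. It is not the project's fuzz target: the functions `fill_fast` and `fill_slow`, the xorshift generator, and the restricted input shape (lowercase words separated by single spaces) are all assumptions made for illustration.

```rust
/// Simplified, hypothetical stand-in for the full wrapping algorithm.
fn fill_slow(text: &str, width: usize) -> String {
    let mut out = String::new();
    let mut line_len = 0;
    for word in text.split(' ').filter(|w| !w.is_empty()) {
        if line_len > 0 && line_len + 1 + word.len() > width {
            out.push('\n');
            line_len = 0;
        } else if line_len > 0 {
            out.push(' ');
            line_len += 1;
        }
        out.push_str(word);
        line_len += word.len();
    }
    out
}

/// Returns the fast-path result, or None when the slow path is required.
fn fill_fast(text: &str, width: usize) -> Option<String> {
    if text.len() < width && !text.contains('\n') {
        Some(text.trim_end_matches(' ').to_string())
    } else {
        None
    }
}

/// Tiny xorshift generator so the sketch needs no external crates.
fn next(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Builds zero to five lowercase words separated by single spaces.
fn random_text(state: &mut u64) -> String {
    let words = next(state) % 6;
    let mut text = String::new();
    for i in 0..words {
        if i > 0 {
            text.push(' ');
        }
        let len = 1 + next(state) % 8;
        for _ in 0..len {
            text.push((b'a' + (next(state) % 26) as u8) as char);
        }
    }
    text
}

/// Counts inputs where the fast path disagrees with the slow path.
fn count_mismatches(iterations: u32) -> u32 {
    let mut state = 0x2545_F491_4F6C_DD1D_u64;
    let mut mismatches = 0;
    for _ in 0..iterations {
        let text = random_text(&mut state);
        let width = 1 + (next(&mut state) % 80) as usize;
        if let Some(fast) = fill_fast(&text, width) {
            if fast != fill_slow(&text, width) {
                mismatches += 1;
            }
        }
    }
    mismatches
}
```

A real fuzz target would feed arbitrary strings and widths instead of this narrow generator, which is what makes it useful for catching edge cases such as unusual whitespace or multi-byte input.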
@mgeisler mgeisler merged commit 62f8a48 into master Oct 1, 2022
@mgeisler mgeisler deleted the fast-path branch October 1, 2022 20:48
This was referenced Oct 23, 2022