Skip to content

Conversation

@UBarney
Copy link
Contributor

@UBarney UBarney commented Jan 19, 2025

Which issue does this PR close?

Closes #12445.

Rationale for this change

See the issue.

What changes are included in this PR?

  • If all characters in this string are within the ASCII range, reverse bytes directly
  • Move gen_string_array in benches/character_length.rs to helper.rs

Are these changes tested?

Yes. Using existing unit test

bench result

reverse_StringArray_ascii_str_len_8
                        time:   [93.762 µs 94.463 µs 95.301 µs]
                        change: [-38.347% -37.726% -37.143%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

reverse_StringArray_utf8_density_0.8_str_len_8
                        time:   [491.88 µs 496.57 µs 501.74 µs]
                        change: [-1.0373% +0.1648% +1.4032%] (p = 0.79 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

reverse_StringViewArray_ascii_str_len_8
                        time:   [87.226 µs 87.666 µs 88.147 µs]
                        change: [-34.737% -34.039% -33.350%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

reverse_StringViewArray_utf8_density_0.8_str_len_8
                        time:   [502.54 µs 505.02 µs 507.66 µs]
                        change: [-0.5734% +0.3619% +1.3269%] (p = 0.49 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

reverse_StringArray_ascii_str_len_32
                        time:   [98.021 µs 98.794 µs 99.785 µs]
                        change: [-70.309% -70.056% -69.817%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

reverse_StringArray_utf8_density_0.8_str_len_32
                        time:   [1.6026 ms 1.6100 ms 1.6181 ms]
                        change: [-0.0304% +1.0048% +2.0155%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

reverse_StringViewArray_ascii_str_len_32
                        time:   [102.37 µs 102.89 µs 103.44 µs]
                        change: [-68.935% -68.690% -68.454%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

reverse_StringViewArray_utf8_density_0.8_str_len_32
                        time:   [1.6199 ms 1.6269 ms 1.6345 ms]
                        change: [+0.1269% +1.4123% +3.0235%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

reverse_StringArray_ascii_str_len_128
                        time:   [166.01 µs 167.09 µs 168.33 µs]
                        change: [-84.421% -84.299% -84.176%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

reverse_StringArray_utf8_density_0.8_str_len_128
                        time:   [6.1278 ms 6.1569 ms 6.1883 ms]
                        change: [-1.0026% -0.2737% +0.5242%] (p = 0.48 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild

reverse_StringViewArray_ascii_str_len_128
                        time:   [173.15 µs 174.16 µs 175.26 µs]
                        change: [-83.918% -83.731% -83.543%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

reverse_StringViewArray_utf8_density_0.8_str_len_128
                        time:   [6.1279 ms 6.1637 ms 6.2028 ms]
                        change: [-1.3879% -0.5410% +0.3269%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

reverse_StringArray_ascii_str_len_4096
                        time:   [15.579 ms 15.687 ms 15.800 ms]
                        change: [-63.915% -63.620% -63.301%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

reverse_StringArray_utf8_density_0.8_str_len_4096
                        time:   [215.98 ms 217.47 ms 219.18 ms]
                        change: [-0.0524% +0.8008% +1.6803%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

reverse_StringViewArray_ascii_str_len_4096
                        time:   [8.6421 ms 8.8760 ms 9.1174 ms]
                        change: [-75.462% -74.835% -74.087%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

reverse_StringViewArray_utf8_density_0.8_str_len_4096
                        time:   [217.17 ms 218.29 ms 219.58 ms]
                        change: [+0.9451% +1.5937% +2.2792%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Are there any user-facing changes?

No

@github-actions github-actions bot added the functions Changes to functions implementation label Jan 19, 2025
@UBarney UBarney force-pushed the ascii_fast_path_for_reverse branch from 1010906 to 7e91bc8 Compare January 19, 2025 07:18
@UBarney UBarney marked this pull request as ready for review January 19, 2025 08:18
// SAFETY: Since the original string was ASCII, reversing the bytes still results in valid UTF-8.
let reversed = unsafe {
// reverse bytes directly since ASCII characters are single bytes
String::from_utf8_unchecked(s.bytes().rev().collect::<Vec<u8>>())
Copy link
Contributor

@Dandandan Dandandan Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could avoid a allocation here by copying into a Vec<u8> buffer

// slightly faster than using iterator `bytes`
bytes_buf.extend(s.as_bytes())
bytes_buf.reverse()
...
bytes_buf.clear()

Copy link
Contributor Author

@UBarney UBarney Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got an improvement of ~30% at most after add bytes_buf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@UBarney UBarney force-pushed the ascii_fast_path_for_reverse branch from 028333c to e8b44b5 Compare January 21, 2025 14:24
@UBarney UBarney requested a review from Dandandan January 21, 2025 14:47
@Dandandan Dandandan closed this Jan 21, 2025
@Dandandan Dandandan reopened this Jan 21, 2025
@alamb
Copy link
Contributor

alamb commented Jan 22, 2025

change: [-83.918% -83.731% -83.543%] (p = 0.00 < 0.05)

That is some sweet sweet performance

@alamb alamb merged commit 2ac20e3 into apache:main Jan 22, 2025
49 checks passed
@alamb
Copy link
Contributor

alamb commented Jan 22, 2025

Thanks @UBarney and @Dandandan

@UBarney UBarney deleted the ascii_fast_path_for_reverse branch January 23, 2025 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize reverse() string function with ASCII fast path

3 participants