improvements to (i)starts_with and (i)ends_with performance #6118

Merged: 4 commits, Jul 30, 2024

samuelcolvin
Contributor

Which issue does this PR close?

Related to (but not closing) #6107.

Rationale for this change

There is a lot of context in #6107. This PR makes LIKE and ILIKE queries that are simply "starts with" or "ends with" patterns significantly faster.

Running

cargo bench -p arrow --bench comparison_kernels -F test_utils -- like

Gives notably:

Test                                 Change
like_utf8 scalar ends with           -53.921%
like_utf8 scalar starts with         -53.883%
like_utf8view scalar ends with       -26.079%
like_utf8view scalar starts with     -24.864%
nlike_utf8 scalar ends with          -53.944%
nlike_utf8 scalar starts with        -53.610%
ilike_utf8 scalar starts with        -21.921%
nilike_utf8 scalar starts with       -22.727%
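
For reading these numbers: criterion reports the relative change in run time, so -53.921% means the new code takes about 46% of the old time, which is roughly a 2.2x speedup. A minimal sketch of that conversion follows; the helper name is made up for illustration and is not part of criterion.

```rust
/// Hypothetical helper: convert a criterion "change" percentage
/// (negative = faster) into a speedup factor (old time / new time).
pub fn speedup_from_change_pct(change_pct: f64) -> f64 {
    // new_time = old_time * (1 + change_pct / 100)
    1.0 / (1.0 + change_pct / 100.0)
}
```

With the figures above, -53.921% works out to about 2.17x, -26.079% to about 1.35x and -21.921% to about 1.28x.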
Full output
   Compiling arrow-string v52.1.0 (/Users/samuel/code/arrow-rs/arrow-string)
   Compiling arrow v52.1.0 (/Users/samuel/code/arrow-rs/arrow)
    Finished `bench` profile [optimized] target(s) in 4.13s
     Running benches/comparison_kernels.rs (target/release/deps/comparison_kernels-b61fe744923f27b6)
like_utf8 scalar equals time:   [144.33 µs 144.47 µs 144.69 µs]
                        change: [-0.2401% -0.0477% +0.1358%] (p = 0.63 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

like_utf8 scalar contains
                        time:   [195.98 µs 196.10 µs 196.23 µs]
                        change: [-0.2527% -0.0384% +0.1985%] (p = 0.75 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

like_utf8 scalar ends with
                        time:   [66.456 µs 66.538 µs 66.628 µs]
                        change: [-53.995% -53.921% -53.817%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

like_utf8 scalar starts with
                        time:   [67.058 µs 67.093 µs 67.125 µs]
                        change: [-54.038% -53.883% -53.755%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

like_utf8 scalar complex
                        time:   [124.83 µs 124.89 µs 124.96 µs]
                        change: [-2.2102% -1.9432% -1.6682%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

like_utf8view scalar equals
                        time:   [15.993 ms 16.022 ms 16.052 ms]
                        change: [-0.1739% +0.0490% +0.2773%] (p = 0.68 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking like_utf8view scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 18.1s, or reduce sample count to 20.
like_utf8view scalar contains
                        time:   [179.98 ms 180.36 ms 180.76 ms]
                        change: [-0.9752% -0.6832% -0.3939%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 23 outliers among 100 measurements (23.00%)
  9 (9.00%) low mild
  7 (7.00%) high mild
  7 (7.00%) high severe

like_utf8view scalar ends with
                        time:   [21.398 ms 21.473 ms 21.552 ms]
                        change: [-26.453% -26.079% -25.713%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

like_utf8view scalar starts with
                        time:   [21.710 ms 21.790 ms 21.889 ms]
                        change: [-25.151% -24.864% -24.473%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Benchmarking like_utf8view scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 23.3s, or reduce sample count to 20.
like_utf8view scalar complex
                        time:   [231.47 ms 231.93 ms 232.49 ms]
                        change: [+0.6295% +0.8893% +1.1705%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar equals
                        time:   [145.37 µs 145.53 µs 145.70 µs]
                        change: [-0.0855% +0.1646% +0.3920%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

nlike_utf8 scalar contains
                        time:   [197.16 µs 197.41 µs 197.71 µs]
                        change: [-0.7353% -0.3336% +0.0168%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar ends with
                        time:   [67.028 µs 67.212 µs 67.443 µs]
                        change: [-54.194% -53.944% -53.720%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar starts with
                        time:   [67.260 µs 67.327 µs 67.396 µs]
                        change: [-53.731% -53.610% -53.498%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

nlike_utf8 scalar complex
                        time:   [125.99 µs 126.10 µs 126.23 µs]
                        change: [-1.2656% -1.0550% -0.8713%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ilike_utf8 scalar equals
                        time:   [103.50 µs 103.64 µs 103.79 µs]
                        change: [+2.2061% +2.5400% +2.8621%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

ilike_utf8 scalar contains
                        time:   [664.73 µs 665.66 µs 666.72 µs]
                        change: [+0.4162% +0.6823% +0.9311%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe

ilike_utf8 scalar ends with
                        time:   [104.73 µs 104.87 µs 105.01 µs]
                        change: [-0.0506% +0.2329% +0.5070%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

ilike_utf8 scalar starts with
                        time:   [93.878 µs 94.077 µs 94.287 µs]
                        change: [-22.114% -21.921% -21.735%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ilike_utf8 scalar complex
                        time:   [136.51 µs 136.59 µs 136.68 µs]
                        change: [-0.8583% -0.6439% -0.4482%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

nilike_utf8 scalar equals
                        time:   [103.23 µs 103.56 µs 104.04 µs]
                        change: [+2.3182% +2.5182% +2.7169%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

nilike_utf8 scalar contains
                        time:   [664.85 µs 665.41 µs 665.99 µs]
                        change: [+0.9991% +1.1622% +1.3429%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

nilike_utf8 scalar ends with
                        time:   [104.22 µs 104.36 µs 104.50 µs]
                        change: [-0.0951% +0.1088% +0.3029%] (p = 0.31 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild

nilike_utf8 scalar starts with
                        time:   [94.197 µs 94.387 µs 94.575 µs]
                        change: [-23.029% -22.727% -22.441%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

nilike_utf8 scalar complex
                        time:   [136.32 µs 136.47 µs 136.63 µs]
                        change: [-2.8598% -2.4438% -2.0479%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

like_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [29.504 µs 29.550 µs 29.603 µs]
                        change: [+1.1570% +1.4762% +1.8389%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

ilike_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [29.404 µs 29.436 µs 29.471 µs]
                        change: [+0.1443% +0.5140% +0.8086%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

What changes are included in this PR?

  • new implementations of starts_with_ignore_ascii_case and ends_with_ignore_ascii_case, which showed significant improvements (~20%) over the previous implementations
  • new methods crate::predicate::starts_with and crate::predicate::ends_with, which show a 2x to 3x improvement over str::starts_with and str::ends_with
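
For readers without the diff open, here is a minimal sketch of what byte-wise prefix and suffix predicates can look like. This is not the code merged in this PR: all function names are hypothetical, and the sketch only assumes that the case-insensitive variants fold ASCII letters, as their names suggest.

```rust
// Illustrative sketch only; names are hypothetical, not the arrow-string API.

/// Byte-wise prefix check. For valid UTF-8 this agrees with `str::starts_with`,
/// because a matching run of bytes always ends on a character boundary.
pub fn starts_with_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().zip(n).all(|(a, b)| a == b)
}

/// Byte-wise suffix check: walk both strings from the back.
pub fn ends_with_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().rev().zip(n.iter().rev()).all(|(a, b)| a == b)
}

/// Prefix check that ignores ASCII case only; non-ASCII bytes must match exactly.
pub fn starts_with_ignore_ascii_case_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().zip(n).all(|(a, b)| a.eq_ignore_ascii_case(b))
}

/// Suffix check that ignores ASCII case only.
pub fn ends_with_ignore_ascii_case_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len()
        && h.iter()
            .rev()
            .zip(n.iter().rev())
            .all(|(a, b)| a.eq_ignore_ascii_case(b))
}
```

Whether a loop of this shape beats the standard library depends on the haystack and needle lengths, so treat the sketch as a starting point for measurement rather than a guaranteed win.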

Are there any user-facing changes?

There shouldn't be any. I fuzzed all the implementations against the default methods here
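
The fuzzing setup itself is not shown in this thread. As a rough idea of what checking a custom predicate against the standard library methods can look like, here is a std-only differential test sketch. Every name in it is hypothetical, and the tiny xorshift generator merely stands in for a real fuzzer or property-testing crate.

```rust
// Differential-testing sketch; every name here is hypothetical.

/// Candidate under test: byte-wise prefix check.
fn starts_with_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().zip(n).all(|(a, b)| a == b)
}

/// Candidate under test: byte-wise suffix check.
fn ends_with_bytes(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().rev().zip(n.iter().rev()).all(|(a, b)| a == b)
}

/// Reference implementations from the standard library.
fn std_starts_with(haystack: &str, needle: &str) -> bool {
    haystack.starts_with(needle)
}
fn std_ends_with(haystack: &str, needle: &str) -> bool {
    haystack.ends_with(needle)
}

/// Minimal xorshift64 generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Random string of 0..=max_chars characters, mixing ASCII and multi-byte
    /// characters so that character boundaries are exercised.
    fn string(&mut self, max_chars: u64) -> String {
        const ALPHABET: [char; 6] = ['a', 'b', 'A', 'é', 'ß', '😀'];
        let len = self.next_u64() % (max_chars + 1);
        (0..len)
            .map(|_| ALPHABET[(self.next_u64() % ALPHABET.len() as u64) as usize])
            .collect()
    }
}

/// Runs `rounds` random cases and counts disagreements between the two functions.
fn count_mismatches(
    candidate: fn(&str, &str) -> bool,
    reference: fn(&str, &str) -> bool,
    rounds: usize,
    seed: u64,
) -> usize {
    let mut rng = XorShift(seed);
    let mut mismatches = 0;
    for _ in 0..rounds {
        let haystack = rng.string(8);
        let chars: Vec<char> = haystack.chars().collect();
        let k = (rng.next_u64() % (chars.len() as u64 + 1)) as usize;
        // Bias towards needles that really match; purely random needles rarely do.
        let needle: String = match rng.next_u64() % 3 {
            0 => chars[..k].iter().collect(),               // true prefix
            1 => chars[chars.len() - k..].iter().collect(), // true suffix
            _ => rng.string(4),                             // unrelated string
        };
        if candidate(&haystack, &needle) != reference(&haystack, &needle) {
            mismatches += 1;
        }
    }
    mismatches
}
```

Calling `count_mismatches(starts_with_bytes, std_starts_with, 10_000, 42)` should return 0 if the candidate is equivalent on the generated inputs; pairing a candidate with the wrong reference is a quick way to confirm the harness can detect disagreements at all.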

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 25, 2024
@samuelcolvin samuelcolvin force-pushed the starts_with-ends_with-improvements branch from 8d6c7cc to 079b4b1 Compare July 26, 2024 09:03
@samuelcolvin samuelcolvin changed the title improvements to (i)starts_with and (i)ends_with improvements to (i)starts_with and (i)ends_with performance Jul 26, 2024
Contributor

@alamb alamb left a comment

Thank you @samuelcolvin

I also ran the benchmarks and here is what I got -- the results are a bit mixed: some benchmarks go faster and some go slower. On the whole it looks like an improvement to me.

I am rerunning the benchmarks to see how consistent the numbers are from run to run

++ critcmp master starts_with-ends_with-improvements
group                                              master                                 starts_with-ends_with-improvements
-----                                              ------                                 ----------------------------------
ilike_utf8 scalar complex                          1.00    302.5±1.19µs        ? ?/sec    1.00    302.1±1.54µs        ? ?/sec
ilike_utf8 scalar contains                         1.00   1558.3±6.61µs        ? ?/sec    1.03   1599.0±5.82µs        ? ?/sec
ilike_utf8 scalar ends with                        1.00    218.7±0.63µs        ? ?/sec    1.13    246.2±0.45µs        ? ?/sec
ilike_utf8 scalar equals                           1.16    248.7±0.63µs        ? ?/sec    1.00    214.6±0.53µs        ? ?/sec
ilike_utf8 scalar starts with                      1.00    279.5±0.57µs        ? ?/sec    1.07    298.3±0.55µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     88.1±0.14µs        ? ?/sec    1.00     88.1±0.24µs        ? ?/sec
like_utf8 scalar complex                           1.00    283.4±0.99µs        ? ?/sec    1.02    287.8±4.98µs        ? ?/sec
like_utf8 scalar contains                          1.01    349.8±0.65µs        ? ?/sec    1.00    348.1±0.41µs        ? ?/sec
like_utf8 scalar ends with                         1.43    221.4±0.44µs        ? ?/sec    1.00    154.7±0.23µs        ? ?/sec
like_utf8 scalar equals                            1.01    219.5±0.44µs        ? ?/sec    1.00    217.7±0.65µs        ? ?/sec
like_utf8 scalar starts with                       1.40    242.7±0.89µs        ? ?/sec    1.00    173.5±0.54µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     88.2±0.21µs        ? ?/sec    1.00     88.1±0.16µs        ? ?/sec
like_utf8view scalar complex                       1.00    533.0±1.24ms        ? ?/sec    1.01    538.1±9.86ms        ? ?/sec
like_utf8view scalar contains                      1.00    379.1±0.34ms        ? ?/sec    1.00    380.7±0.48ms        ? ?/sec
like_utf8view scalar ends with                     1.14     60.0±0.27ms        ? ?/sec    1.00     52.7±0.21ms        ? ?/sec
like_utf8view scalar equals                        1.00     37.1±0.12ms        ? ?/sec    1.00     37.0±0.11ms        ? ?/sec
like_utf8view scalar starts with                   1.06     60.4±0.36ms        ? ?/sec    1.00     56.8±0.23ms        ? ?/sec
nilike_utf8 scalar complex                         1.00    302.9±1.46µs        ? ?/sec    1.00    302.3±1.97µs        ? ?/sec
nilike_utf8 scalar contains                        1.00   1556.6±6.92µs        ? ?/sec    1.03   1601.2±7.82µs        ? ?/sec
nilike_utf8 scalar ends with                       1.00    218.7±0.54µs        ? ?/sec    1.12    246.0±0.72µs        ? ?/sec
nilike_utf8 scalar equals                          1.16    249.4±3.30µs        ? ?/sec    1.00    214.9±0.98µs        ? ?/sec
nilike_utf8 scalar starts with                     1.00    279.7±0.72µs        ? ?/sec    1.07    298.3±0.58µs        ? ?/sec
nlike_utf8 scalar complex                          1.00    283.4±1.60µs        ? ?/sec    1.00    283.6±2.73µs        ? ?/sec
nlike_utf8 scalar contains                         1.00    350.1±0.66µs        ? ?/sec    1.00    348.4±2.19µs        ? ?/sec
nlike_utf8 scalar ends with                        1.43    221.4±0.70µs        ? ?/sec    1.00    155.0±0.47µs        ? ?/sec
nlike_utf8 scalar equals                           1.01    219.6±1.12µs        ? ?/sec    1.00    217.7±1.46µs        ? ?/sec
nlike_utf8 scalar starts with                      1.35    234.9±0.65µs        ? ?/sec    1.00    173.7±0.67µs        ? ?/sec

@alamb
Contributor

alamb commented Jul 27, 2024

Here is my next run

++ critcmp master starts_with-ends_with-improvements
group                                              master                                 starts_with-ends_with-improvements
-----                                              ------                                 ----------------------------------
ilike_utf8 scalar complex                          1.00    301.7±1.10µs        ? ?/sec    1.00    301.5±1.04µs        ? ?/sec
ilike_utf8 scalar contains                         1.00   1554.5±6.20µs        ? ?/sec    1.03   1595.8±4.49µs        ? ?/sec
ilike_utf8 scalar ends with                        1.00    218.5±0.39µs        ? ?/sec    1.13    247.1±4.53µs        ? ?/sec
ilike_utf8 scalar equals                           1.16    248.8±0.40µs        ? ?/sec    1.00    214.8±0.66µs        ? ?/sec
ilike_utf8 scalar starts with                      1.00    279.4±0.36µs        ? ?/sec    1.07    298.5±0.46µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     88.1±0.14µs        ? ?/sec    1.00     88.2±0.18µs        ? ?/sec
like_utf8 scalar complex                           1.01    283.4±0.82µs        ? ?/sec    1.00    282.0±0.96µs        ? ?/sec
like_utf8 scalar contains                          1.00    347.6±0.52µs        ? ?/sec    1.00    347.4±0.76µs        ? ?/sec
like_utf8 scalar ends with                         1.41    219.4±0.55µs        ? ?/sec    1.00    155.3±2.99µs        ? ?/sec
like_utf8 scalar equals                            1.00    217.6±0.53µs        ? ?/sec    1.00    217.9±0.73µs        ? ?/sec
like_utf8 scalar starts with                       1.34    232.8±0.29µs        ? ?/sec    1.00    173.5±0.30µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     88.1±0.17µs        ? ?/sec    1.00     88.1±0.14µs        ? ?/sec
like_utf8view scalar complex                       1.00    531.3±2.04ms        ? ?/sec    1.00    531.0±2.43ms        ? ?/sec
like_utf8view scalar contains                      1.00    378.6±0.36ms        ? ?/sec    1.01    380.6±1.38ms        ? ?/sec
like_utf8view scalar ends with                     1.13     59.6±0.24ms        ? ?/sec    1.00     52.7±0.23ms        ? ?/sec
like_utf8view scalar equals                        1.00     37.0±0.47ms        ? ?/sec    1.00     36.9±0.09ms        ? ?/sec
like_utf8view scalar starts with                   1.06     60.0±0.48ms        ? ?/sec    1.00     56.8±0.23ms        ? ?/sec
nilike_utf8 scalar complex                         1.00    301.9±1.19µs        ? ?/sec    1.00    300.4±2.48µs        ? ?/sec
nilike_utf8 scalar contains                        1.00   1551.3±4.82µs        ? ?/sec    1.03   1594.3±4.71µs        ? ?/sec
nilike_utf8 scalar ends with                       1.00    219.7±3.41µs        ? ?/sec    1.12    246.3±0.86µs        ? ?/sec
nilike_utf8 scalar equals                          1.16    248.8±0.53µs        ? ?/sec    1.00    214.5±0.33µs        ? ?/sec
nilike_utf8 scalar starts with                     1.00    279.6±1.02µs        ? ?/sec    1.07    298.4±0.46µs        ? ?/sec
nlike_utf8 scalar complex                          1.01    283.2±1.05µs        ? ?/sec    1.00    281.1±1.19µs        ? ?/sec
nlike_utf8 scalar contains                         1.00    347.5±0.63µs        ? ?/sec    1.00    347.2±0.49µs        ? ?/sec
nlike_utf8 scalar ends with                        1.42    219.7±1.25µs        ? ?/sec    1.00    154.8±0.24µs        ? ?/sec
nlike_utf8 scalar equals                           1.00    217.7±0.45µs        ? ?/sec    1.00    217.7±0.47µs        ? ?/sec
nlike_utf8 scalar starts with                      1.34    233.0±0.85µs        ? ?/sec    1.00    173.6±0.48µs        ? ?/sec

BTW I am running this on a c2-standard-8 GCP instance

Script

pushd ~/arrow-rs

#git remote add samuelcolvin https://github.com/samuelcolvin/arrow-rs.git
git fetch -p samuelcolvin
BENCH_COMMAND="cargo bench -p arrow --bench comparison_kernels -F test_utils"
BENCH_FILTER="like"
REPO_NAME="samuelcolvin"
BRANCH_NAME="starts_with-ends_with-improvements"

# remove old test runs
rm -rf target/criterion/

git checkout $BRANCH_NAME
git reset --hard "$REPO_NAME/$BRANCH_NAME"

# Run on test branch
$BENCH_COMMAND -- --save-baseline ${BRANCH_NAME} ${BENCH_FILTER}

# Run on master
MERGE_BASE=$(git merge-base HEAD apache/master)
echo "** Comparing to ${MERGE_BASE}"

git checkout ${MERGE_BASE}
$BENCH_COMMAND -- --save-baseline master  ${BENCH_FILTER}

critcmp master ${BRANCH_NAME}

popd

@samuelcolvin
Contributor Author

samuelcolvin commented Jul 27, 2024 via email

@samuelcolvin samuelcolvin force-pushed the starts_with-ends_with-improvements branch from 079b4b1 to 8cc373b Compare July 28, 2024 13:15
@samuelcolvin
Contributor Author

I think the behavior here is again related to the unrealistically short (4-character) haystack used in the benchmarks, as I explained in #6128 (comment).

Running

#!/usr/bin/env bash
set -ex
git checkout master
rm -rf target/criterion
cargo bench -p arrow --bench comparison_kernels -F test_utils -- --save-baseline master 'like.*(starts|ends) with'
BRANCH_NAME=starts_with-ends_with-improvements
git checkout $BRANCH_NAME
cargo bench -p arrow --bench comparison_kernels -F test_utils -- --save-baseline $BRANCH_NAME 'like.*(starts|ends) with'
critcmp master $BRANCH_NAME

on a c2-standard-8 GCP instance.

With haystack length=4 (default)

group                               master                                 starts_with-ends_with-improvements
-----                               ------                                 ----------------------------------
ilike_utf8 scalar ends with         1.00    234.5±0.69µs        ? ?/sec    1.05    247.4±0.70µs        ? ?/sec
ilike_utf8 scalar starts with       1.00    251.4±0.39µs        ? ?/sec    1.19    298.0±1.11µs        ? ?/sec
like_utf8 scalar ends with          1.52    234.5±0.34µs        ? ?/sec    1.00    154.6±0.58µs        ? ?/sec
like_utf8 scalar starts with        1.22    253.7±0.44µs        ? ?/sec    1.00    207.8±2.90µs        ? ?/sec
like_utf8view scalar ends with      1.25     61.2±0.20ms        ? ?/sec    1.00     48.9±0.22ms        ? ?/sec
like_utf8view scalar starts with    1.07     58.1±0.17ms        ? ?/sec    1.00     54.3±0.15ms        ? ?/sec
nilike_utf8 scalar ends with        1.00    234.7±0.81µs        ? ?/sec    1.06    247.6±0.84µs        ? ?/sec
nilike_utf8 scalar starts with      1.00    251.4±0.34µs        ? ?/sec    1.19    298.2±1.02µs        ? ?/sec
nlike_utf8 scalar ends with         1.52    234.4±0.25µs        ? ?/sec    1.00    154.7±1.81µs        ? ?/sec
nlike_utf8 scalar starts with       1.22    253.8±0.72µs        ? ?/sec    1.00    207.3±0.55µs        ? ?/sec

With random haystack length=0..400

group                                     master                                 starts_with-ends_with-improvements
-----                                     ------                                 ----------------------------------
ilike_utf8 scalar ends with               1.10  1042.3±49.28µs        ? ?/sec    1.00   947.9±45.76µs        ? ?/sec
ilike_utf8 scalar starts with             1.02  1056.6±29.14µs        ? ?/sec    1.00  1036.7±54.46µs        ? ?/sec
like_utf8 scalar ends with                1.75   480.0±14.59µs        ? ?/sec    1.00   274.8±16.16µs        ? ?/sec
like_utf8 scalar starts with              1.39    499.2±6.69µs        ? ?/sec    1.00    358.8±5.57µs        ? ?/sec
like_utf8view scalar ends with            1.16     56.5±0.28ms        ? ?/sec    1.00     48.6±0.17ms        ? ?/sec
like_utf8view scalar starts with          1.08     57.5±0.17ms        ? ?/sec    1.00     53.3±0.24ms        ? ?/sec
nilike_utf8 scalar ends with              1.05   995.0±19.04µs        ? ?/sec    1.00   951.9±49.82µs        ? ?/sec
nilike_utf8 scalar starts with            1.00  1056.0±40.94µs        ? ?/sec    1.00  1050.8±34.07µs        ? ?/sec
nlike_utf8 scalar ends with               1.76    479.3±9.99µs        ? ?/sec    1.00   272.0±25.31µs        ? ?/sec
nlike_utf8 scalar starts with             1.36    499.1±5.47µs        ? ?/sec    1.00    367.9±8.76µs        ? ?/sec

@samuelcolvin
Contributor Author

(I also rebased onto the latest master, which I guess could have had an effect, although I doubt it)

@samuelcolvin
Contributor Author

samuelcolvin commented Jul 29, 2024

Benchmarks after merging master and fixing conflicts:

+ critcmp master starts_with-ends_with-improvements
group                               master                  starts_with-ends_with-improvements
-----                               ------                  ----------------------------------
ilike_utf8 scalar ends with         1.06    584.7±2.31µs    1.00    553.6±6.07µs
ilike_utf8 scalar starts with       1.03    578.1±5.63µs    1.00    559.8±7.77µs
like_utf8 scalar ends with          1.62    150.1±2.55µs    1.00     92.9±1.10µs
like_utf8 scalar starts with        1.63    150.2±4.11µs    1.00     92.2±1.39µs
like_utf8view scalar ends with      1.36     29.3±0.32ms    1.00     21.6±0.28ms
like_utf8view scalar starts with    1.38     28.9±0.21ms    1.00     21.0±0.38ms
nilike_utf8 scalar ends with        1.07   593.0±28.32µs    1.00    552.4±4.85µs
nilike_utf8 scalar starts with      1.04    578.1±2.64µs    1.00    557.9±7.13µs
nlike_utf8 scalar ends with         1.62    150.9±4.47µs    1.00     93.2±1.68µs
nlike_utf8 scalar starts with       1.60    149.5±3.35µs    1.00     93.2±2.92µs
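
For reading the table: the number in front of each timing appears to be that column's mean time divided by the fastest mean in the row, so 1.00 marks the faster side. This is inferred from the figures themselves (for example 150.1µs / 92.9µs ≈ 1.62), not taken from critcmp's source. A small sketch of the calculation, with a hypothetical helper name:

```rust
/// Hypothetical helper: express each mean time relative to the fastest one,
/// so the fastest entry becomes 1.00 and slower entries are greater than 1.
pub fn relative_to_fastest(mean_times: &[f64]) -> Vec<f64> {
    let fastest = mean_times.iter().copied().fold(f64::INFINITY, f64::min);
    mean_times.iter().map(|t| t / fastest).collect()
}
```

Applied to the "like_utf8 scalar ends with" row, `relative_to_fastest(&[150.1, 92.9])` gives roughly `[1.62, 1.00]`, matching the master and branch columns.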

@alamb
Contributor

alamb commented Jul 30, 2024

🚀

@alamb alamb merged commit bf0ea91 into apache:master Jul 30, 2024
17 checks passed
@alamb
Contributor

alamb commented Jul 30, 2024

Thanks again @samuelcolvin

@samuelcolvin samuelcolvin deleted the starts_with-ends_with-improvements branch July 30, 2024 20:09