improvements to `(i)starts_with` and `(i)ends_with` performance #6118

samuelcolvin · 2024-07-25T18:58:52Z

Which issue does this PR close?

Related to (but not closing) #6107.

Rationale for this change

There is a lot of context in #6107. This PR makes LIKE and ILIKE queries that are simply "starts with" or "ends with" patterns significantly faster.

Running

cargo bench -p arrow --bench comparison_kernels -F test_utils -- like

Gives notably:

Test	Change
like_utf8 scalar ends with	-53.921%
like_utf8 scalar starts with	-53.883%
like_utf8view scalar ends with	-26.079%
like_utf8view scalar starts with	-24.864%
nlike_utf8 scalar ends with	-53.944%
nlike_utf8 scalar starts with	-53.610%
ilike_utf8 scalar starts with	-21.921%
nilike_utf8 scalar starts with	-22.727%

Full output

   Compiling arrow-string v52.1.0 (/Users/samuel/code/arrow-rs/arrow-string)
   Compiling arrow v52.1.0 (/Users/samuel/code/arrow-rs/arrow)
    Finished `bench` profile [optimized] target(s) in 4.13s
     Running benches/comparison_kernels.rs (target/release/deps/comparison_kernels-b61fe744923f27b6)
like_utf8 scalar equals time:   [144.33 µs 144.47 µs 144.69 µs]
                        change: [-0.2401% -0.0477% +0.1358%] (p = 0.63 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

like_utf8 scalar contains
                        time:   [195.98 µs 196.10 µs 196.23 µs]
                        change: [-0.2527% -0.0384% +0.1985%] (p = 0.75 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

like_utf8 scalar ends with
                        time:   [66.456 µs 66.538 µs 66.628 µs]
                        change: [-53.995% -53.921% -53.817%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

like_utf8 scalar starts with
                        time:   [67.058 µs 67.093 µs 67.125 µs]
                        change: [-54.038% -53.883% -53.755%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

like_utf8 scalar complex
                        time:   [124.83 µs 124.89 µs 124.96 µs]
                        change: [-2.2102% -1.9432% -1.6682%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

like_utf8view scalar equals
                        time:   [15.993 ms 16.022 ms 16.052 ms]
                        change: [-0.1739% +0.0490% +0.2773%] (p = 0.68 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking like_utf8view scalar contains: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 18.1s, or reduce sample count to 20.
like_utf8view scalar contains
                        time:   [179.98 ms 180.36 ms 180.76 ms]
                        change: [-0.9752% -0.6832% -0.3939%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 23 outliers among 100 measurements (23.00%)
  9 (9.00%) low mild
  7 (7.00%) high mild
  7 (7.00%) high severe

like_utf8view scalar ends with
                        time:   [21.398 ms 21.473 ms 21.552 ms]
                        change: [-26.453% -26.079% -25.713%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

like_utf8view scalar starts with
                        time:   [21.710 ms 21.790 ms 21.889 ms]
                        change: [-25.151% -24.864% -24.473%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Benchmarking like_utf8view scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 23.3s, or reduce sample count to 20.
like_utf8view scalar complex
                        time:   [231.47 ms 231.93 ms 232.49 ms]
                        change: [+0.6295% +0.8893% +1.1705%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar equals
                        time:   [145.37 µs 145.53 µs 145.70 µs]
                        change: [-0.0855% +0.1646% +0.3920%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

nlike_utf8 scalar contains
                        time:   [197.16 µs 197.41 µs 197.71 µs]
                        change: [-0.7353% -0.3336% +0.0168%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar ends with
                        time:   [67.028 µs 67.212 µs 67.443 µs]
                        change: [-54.194% -53.944% -53.720%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

nlike_utf8 scalar starts with
                        time:   [67.260 µs 67.327 µs 67.396 µs]
                        change: [-53.731% -53.610% -53.498%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

nlike_utf8 scalar complex
                        time:   [125.99 µs 126.10 µs 126.23 µs]
                        change: [-1.2656% -1.0550% -0.8713%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ilike_utf8 scalar equals
                        time:   [103.50 µs 103.64 µs 103.79 µs]
                        change: [+2.2061% +2.5400% +2.8621%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

ilike_utf8 scalar contains
                        time:   [664.73 µs 665.66 µs 666.72 µs]
                        change: [+0.4162% +0.6823% +0.9311%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe

ilike_utf8 scalar ends with
                        time:   [104.73 µs 104.87 µs 105.01 µs]
                        change: [-0.0506% +0.2329% +0.5070%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

ilike_utf8 scalar starts with
                        time:   [93.878 µs 94.077 µs 94.287 µs]
                        change: [-22.114% -21.921% -21.735%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ilike_utf8 scalar complex
                        time:   [136.51 µs 136.59 µs 136.68 µs]
                        change: [-0.8583% -0.6439% -0.4482%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

nilike_utf8 scalar equals
                        time:   [103.23 µs 103.56 µs 104.04 µs]
                        change: [+2.3182% +2.5182% +2.7169%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

nilike_utf8 scalar contains
                        time:   [664.85 µs 665.41 µs 665.99 µs]
                        change: [+0.9991% +1.1622% +1.3429%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

nilike_utf8 scalar ends with
                        time:   [104.22 µs 104.36 µs 104.50 µs]
                        change: [-0.0951% +0.1088% +0.3029%] (p = 0.31 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild

nilike_utf8 scalar starts with
                        time:   [94.197 µs 94.387 µs 94.575 µs]
                        change: [-23.029% -22.727% -22.441%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

nilike_utf8 scalar complex
                        time:   [136.32 µs 136.47 µs 136.63 µs]
                        change: [-2.8598% -2.4438% -2.0479%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

like_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [29.504 µs 29.550 µs 29.603 µs]
                        change: [+1.1570% +1.4762% +1.8389%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

ilike_utf8_scalar_dyn dictionary[10] string[4])
                        time:   [29.404 µs 29.436 µs 29.471 µs]
                        change: [+0.1443% +0.5140% +0.8086%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

What changes are included in this PR?

- New implementations of `starts_with_ignore_ascii_case` and `ends_with_ignore_ascii_case`, which showed significant improvements (~20%) over the previous implementations.
- New methods `crate::predicate::starts_with` and `crate::predicate::ends_with`, which show a 2-3x improvement over `str.starts_with` and `str.ends_with`.
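
For readers who have not opened the diff, here is a minimal sketch of what a byte-wise prefix/suffix check with a pluggable comparison can look like. It is an illustration only, not necessarily the code in `arrow-string/src/predicate.rs`; the names `starts_with_by`, `ends_with_by`, `bytes_eq` and `bytes_eq_ignore_ascii_case` are hypothetical.

```rust
/// Byte-wise prefix check. `byte_eq` decides whether two bytes match,
/// so one function serves both the exact and the ASCII-case-insensitive case.
/// (Hypothetical name, for illustration only.)
fn starts_with_by(haystack: &str, needle: &str, byte_eq: impl Fn(u8, u8) -> bool) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    // A needle longer than the haystack can never match.
    n.len() <= h.len() && h.iter().zip(n).all(|(&a, &b)| byte_eq(a, b))
}

/// Byte-wise suffix check: same idea, walking both strings from the end.
fn ends_with_by(haystack: &str, needle: &str, byte_eq: impl Fn(u8, u8) -> bool) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len()
        && h.iter().rev().zip(n.iter().rev()).all(|(&a, &b)| byte_eq(a, b))
}

/// Exact byte comparison.
fn bytes_eq(a: u8, b: u8) -> bool {
    a == b
}

/// ASCII-only case folding; non-ASCII bytes must match exactly.
fn bytes_eq_ignore_ascii_case(a: u8, b: u8) -> bool {
    a.eq_ignore_ascii_case(&b)
}
```

For example, `ends_with_by("hello world", "WORLD", bytes_eq_ignore_ascii_case)` is true, while the same call with `bytes_eq` is false. Comparing bytes rather than `char`s is safe for the exact case because a `&str` prefix or suffix match is a byte match in UTF-8.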

Are there any user-facing changes?

Shouldn't be. I fuzzed all the implementations against the default methods here
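
A differential check of this kind can be sketched as follows. This is not the fuzzing harness referred to above; it is a minimal, dependency-free illustration in which `fast_starts_with` is a hypothetical stand-in for the implementation under test and `XorShift` is a toy random generator chosen so the sketch needs no external crates.

```rust
/// Hypothetical stand-in for the implementation under test.
fn fast_starts_with(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    n.len() <= h.len() && h.iter().zip(n).all(|(a, b)| a == b)
}

/// Toy xorshift64 generator (use a non-zero seed, otherwise it only yields 0).
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    /// Random string of 0..=max_len chars from a small alphabet, including a
    /// multi-byte character so that UTF-8 boundaries are exercised.
    fn string(&mut self, max_len: u64) -> String {
        const ALPHABET: [char; 6] = ['a', 'b', 'A', 'B', 'é', '_'];
        let len = self.next_u64() % (max_len + 1);
        (0..len)
            .map(|_| ALPHABET[(self.next_u64() % 6) as usize])
            .collect()
    }
}

/// Counts how often the candidate disagrees with `str::starts_with`.
fn count_mismatches(iterations: usize, seed: u64) -> usize {
    let mut rng = XorShift(seed);
    let mut mismatches = 0;
    for _ in 0..iterations {
        let haystack = rng.string(8);
        let needle = rng.string(3);
        if fast_starts_with(&haystack, &needle) != haystack.starts_with(needle.as_str()) {
            mismatches += 1;
        }
    }
    mismatches
}
```

Short needles and a tiny alphabet are deliberate: they make actual matches frequent, so both the `true` and `false` branches are compared against the standard library.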

alamb

Thank you @samuelcolvin

I also ran the benchmarks and here is what I got -- it seems a bit mixed where some go faster and some go slower. On the whole it seems an improvement to me.

I am rerunning the numbers to see how consistent it is from run to run

++ critcmp master starts_with-ends_with-improvements
group                                              master                                 starts_with-ends_with-improvements
-----                                              ------                                 ----------------------------------
ilike_utf8 scalar complex                          1.00    302.5±1.19µs        ? ?/sec    1.00    302.1±1.54µs        ? ?/sec
ilike_utf8 scalar contains                         1.00   1558.3±6.61µs        ? ?/sec    1.03   1599.0±5.82µs        ? ?/sec
ilike_utf8 scalar ends with                        1.00    218.7±0.63µs        ? ?/sec    1.13    246.2±0.45µs        ? ?/sec
ilike_utf8 scalar equals                           1.16    248.7±0.63µs        ? ?/sec    1.00    214.6±0.53µs        ? ?/sec
ilike_utf8 scalar starts with                      1.00    279.5±0.57µs        ? ?/sec    1.07    298.3±0.55µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     88.1±0.14µs        ? ?/sec    1.00     88.1±0.24µs        ? ?/sec
like_utf8 scalar complex                           1.00    283.4±0.99µs        ? ?/sec    1.02    287.8±4.98µs        ? ?/sec
like_utf8 scalar contains                          1.01    349.8±0.65µs        ? ?/sec    1.00    348.1±0.41µs        ? ?/sec
like_utf8 scalar ends with                         1.43    221.4±0.44µs        ? ?/sec    1.00    154.7±0.23µs        ? ?/sec
like_utf8 scalar equals                            1.01    219.5±0.44µs        ? ?/sec    1.00    217.7±0.65µs        ? ?/sec
like_utf8 scalar starts with                       1.40    242.7±0.89µs        ? ?/sec    1.00    173.5±0.54µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     88.2±0.21µs        ? ?/sec    1.00     88.1±0.16µs        ? ?/sec
like_utf8view scalar complex                       1.00    533.0±1.24ms        ? ?/sec    1.01    538.1±9.86ms        ? ?/sec
like_utf8view scalar contains                      1.00    379.1±0.34ms        ? ?/sec    1.00    380.7±0.48ms        ? ?/sec
like_utf8view scalar ends with                     1.14     60.0±0.27ms        ? ?/sec    1.00     52.7±0.21ms        ? ?/sec
like_utf8view scalar equals                        1.00     37.1±0.12ms        ? ?/sec    1.00     37.0±0.11ms        ? ?/sec
like_utf8view scalar starts with                   1.06     60.4±0.36ms        ? ?/sec    1.00     56.8±0.23ms        ? ?/sec
nilike_utf8 scalar complex                         1.00    302.9±1.46µs        ? ?/sec    1.00    302.3±1.97µs        ? ?/sec
nilike_utf8 scalar contains                        1.00   1556.6±6.92µs        ? ?/sec    1.03   1601.2±7.82µs        ? ?/sec
nilike_utf8 scalar ends with                       1.00    218.7±0.54µs        ? ?/sec    1.12    246.0±0.72µs        ? ?/sec
nilike_utf8 scalar equals                          1.16    249.4±3.30µs        ? ?/sec    1.00    214.9±0.98µs        ? ?/sec
nilike_utf8 scalar starts with                     1.00    279.7±0.72µs        ? ?/sec    1.07    298.3±0.58µs        ? ?/sec
nlike_utf8 scalar complex                          1.00    283.4±1.60µs        ? ?/sec    1.00    283.6±2.73µs        ? ?/sec
nlike_utf8 scalar contains                         1.00    350.1±0.66µs        ? ?/sec    1.00    348.4±2.19µs        ? ?/sec
nlike_utf8 scalar ends with                        1.43    221.4±0.70µs        ? ?/sec    1.00    155.0±0.47µs        ? ?/sec
nlike_utf8 scalar equals                           1.01    219.6±1.12µs        ? ?/sec    1.00    217.7±1.46µs        ? ?/sec
nlike_utf8 scalar starts with                      1.35    234.9±0.65µs        ? ?/sec    1.00    173.7±0.67µs        ? ?/sec
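
A note on reading the two kinds of output in this thread: the criterion output reports a percentage change in time, while the critcmp table appears to print each time as a multiple of the faster of the two columns (that reading of critcmp's format is an assumption here). The small sketch below converts between the two, using the `like_utf8 scalar ends with` row above (221.4 µs on master, 154.7 µs on the branch); the function names are made up for illustration.

```rust
/// Relative change of `branch` versus `master`, in percent (negative = faster).
fn percent_change(master: f64, branch: f64) -> f64 {
    (branch - master) / master * 100.0
}

/// critcmp-style ratio (assumed): `time` as a multiple of the faster of the pair.
fn ratio_to_fastest(time: f64, other: f64) -> f64 {
    time / time.min(other)
}
```

With those inputs, `ratio_to_fastest(221.4, 154.7)` is about 1.43 and `percent_change(221.4, 154.7)` is about -30%, so a "1.43" in the master column describes the same measurement as a roughly 30% reduction in criterion's terms.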

arrow-string/src/predicate.rs

alamb · 2024-07-27T10:54:50Z

Here is my next run

++ critcmp master starts_with-ends_with-improvements
group                                              master                                 starts_with-ends_with-improvements
-----                                              ------                                 ----------------------------------
ilike_utf8 scalar complex                          1.00    301.7±1.10µs        ? ?/sec    1.00    301.5±1.04µs        ? ?/sec
ilike_utf8 scalar contains                         1.00   1554.5±6.20µs        ? ?/sec    1.03   1595.8±4.49µs        ? ?/sec
ilike_utf8 scalar ends with                        1.00    218.5±0.39µs        ? ?/sec    1.13    247.1±4.53µs        ? ?/sec
ilike_utf8 scalar equals                           1.16    248.8±0.40µs        ? ?/sec    1.00    214.8±0.66µs        ? ?/sec
ilike_utf8 scalar starts with                      1.00    279.4±0.36µs        ? ?/sec    1.07    298.5±0.46µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     88.1±0.14µs        ? ?/sec    1.00     88.2±0.18µs        ? ?/sec
like_utf8 scalar complex                           1.01    283.4±0.82µs        ? ?/sec    1.00    282.0±0.96µs        ? ?/sec
like_utf8 scalar contains                          1.00    347.6±0.52µs        ? ?/sec    1.00    347.4±0.76µs        ? ?/sec
like_utf8 scalar ends with                         1.41    219.4±0.55µs        ? ?/sec    1.00    155.3±2.99µs        ? ?/sec
like_utf8 scalar equals                            1.00    217.6±0.53µs        ? ?/sec    1.00    217.9±0.73µs        ? ?/sec
like_utf8 scalar starts with                       1.34    232.8±0.29µs        ? ?/sec    1.00    173.5±0.30µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     88.1±0.17µs        ? ?/sec    1.00     88.1±0.14µs        ? ?/sec
like_utf8view scalar complex                       1.00    531.3±2.04ms        ? ?/sec    1.00    531.0±2.43ms        ? ?/sec
like_utf8view scalar contains                      1.00    378.6±0.36ms        ? ?/sec    1.01    380.6±1.38ms        ? ?/sec
like_utf8view scalar ends with                     1.13     59.6±0.24ms        ? ?/sec    1.00     52.7±0.23ms        ? ?/sec
like_utf8view scalar equals                        1.00     37.0±0.47ms        ? ?/sec    1.00     36.9±0.09ms        ? ?/sec
like_utf8view scalar starts with                   1.06     60.0±0.48ms        ? ?/sec    1.00     56.8±0.23ms        ? ?/sec
nilike_utf8 scalar complex                         1.00    301.9±1.19µs        ? ?/sec    1.00    300.4±2.48µs        ? ?/sec
nilike_utf8 scalar contains                        1.00   1551.3±4.82µs        ? ?/sec    1.03   1594.3±4.71µs        ? ?/sec
nilike_utf8 scalar ends with                       1.00    219.7±3.41µs        ? ?/sec    1.12    246.3±0.86µs        ? ?/sec
nilike_utf8 scalar equals                          1.16    248.8±0.53µs        ? ?/sec    1.00    214.5±0.33µs        ? ?/sec
nilike_utf8 scalar starts with                     1.00    279.6±1.02µs        ? ?/sec    1.07    298.4±0.46µs        ? ?/sec
nlike_utf8 scalar complex                          1.01    283.2±1.05µs        ? ?/sec    1.00    281.1±1.19µs        ? ?/sec
nlike_utf8 scalar contains                         1.00    347.5±0.63µs        ? ?/sec    1.00    347.2±0.49µs        ? ?/sec
nlike_utf8 scalar ends with                        1.42    219.7±1.25µs        ? ?/sec    1.00    154.8±0.24µs        ? ?/sec
nlike_utf8 scalar equals                           1.00    217.7±0.45µs        ? ?/sec    1.00    217.7±0.47µs        ? ?/sec
nlike_utf8 scalar starts with                      1.34    233.0±0.85µs        ? ?/sec    1.00    173.6±0.48µs        ? ?/sec

BTW I am running this on a c2-standard-8 GCP instance

Script

pushd ~/arrow-rs

#git remote add samuelcolvin https://github.com/samuelcolvin/arrow-rs.git
git fetch -p samuelcolvin
BENCH_COMMAND="cargo bench -p arrow --bench comparison_kernels -F test_utils"
BENCH_FILTER="like"
REPO_NAME="samuelcolvin"
BRANCH_NAME="starts_with-ends_with-improvements"

# remove old test runs
rm -rf target/criterion/

git checkout $BRANCH_NAME
git reset --hard "$REPO_NAME/$BRANCH_NAME"

# Run on test branch
$BENCH_COMMAND -- --save-baseline ${BRANCH_NAME} ${BENCH_FILTER}

# Run on master
MERGE_BASE=$(git merge-base HEAD apache/master)
echo "** Comparing to ${MERGE_BASE}"

git checkout ${MERGE_BASE}
$BENCH_COMMAND -- --save-baseline master  ${BENCH_FILTER}

critcmp master ${BRANCH_NAME}

popd

samuelcolvin · 2024-07-27T13:34:28Z

My reading of this is that (contrary to what I found) my "istarts_with" and "iends_with" are slower than what was here before. That is very easy to revert, but we should first check that the benchmarks are really representative of the most common queries.


samuelcolvin · 2024-07-28T13:56:23Z

I think the behavior here is again related to the unrealistically short (4 character) haystack used in benchmarks as I explained in #6128 (comment).

Running

#!/usr/bin/env bash
set -ex
git checkout master
rm -rf target/criterion
cargo bench -p arrow --bench comparison_kernels -F test_utils -- --save-baseline master 'like.*(starts|ends) with'
BRANCH_NAME=starts_with-ends_with-improvements
git checkout $BRANCH_NAME
cargo bench -p arrow --bench comparison_kernels -F test_utils -- --save-baseline $BRANCH_NAME 'like.*(starts|ends) with'
critcmp master $BRANCH_NAME

on a c2-standard-8 GCP instance.

With haystack length=4 (default)

group                               master                                 starts_with-ends_with-improvements
-----                               ------                                 ----------------------------------
ilike_utf8 scalar ends with         1.00    234.5±0.69µs        ? ?/sec    1.05    247.4±0.70µs        ? ?/sec
ilike_utf8 scalar starts with       1.00    251.4±0.39µs        ? ?/sec    1.19    298.0±1.11µs        ? ?/sec
like_utf8 scalar ends with          1.52    234.5±0.34µs        ? ?/sec    1.00    154.6±0.58µs        ? ?/sec
like_utf8 scalar starts with        1.22    253.7±0.44µs        ? ?/sec    1.00    207.8±2.90µs        ? ?/sec
like_utf8view scalar ends with      1.25     61.2±0.20ms        ? ?/sec    1.00     48.9±0.22ms        ? ?/sec
like_utf8view scalar starts with    1.07     58.1±0.17ms        ? ?/sec    1.00     54.3±0.15ms        ? ?/sec
nilike_utf8 scalar ends with        1.00    234.7±0.81µs        ? ?/sec    1.06    247.6±0.84µs        ? ?/sec
nilike_utf8 scalar starts with      1.00    251.4±0.34µs        ? ?/sec    1.19    298.2±1.02µs        ? ?/sec
nlike_utf8 scalar ends with         1.52    234.4±0.25µs        ? ?/sec    1.00    154.7±1.81µs        ? ?/sec
nlike_utf8 scalar starts with       1.22    253.8±0.72µs        ? ?/sec    1.00    207.3±0.55µs        ? ?/sec

With random haystack length=0..400

group                                     master                                 starts_with-ends_with-improvements
-----                                     ------                                 ----------------------------------
ilike_utf8 scalar ends with               1.10  1042.3±49.28µs        ? ?/sec    1.00   947.9±45.76µs        ? ?/sec
ilike_utf8 scalar starts with             1.02  1056.6±29.14µs        ? ?/sec    1.00  1036.7±54.46µs        ? ?/sec
like_utf8 scalar ends with                1.75   480.0±14.59µs        ? ?/sec    1.00   274.8±16.16µs        ? ?/sec
like_utf8 scalar starts with              1.39    499.2±6.69µs        ? ?/sec    1.00    358.8±5.57µs        ? ?/sec
like_utf8view scalar ends with            1.16     56.5±0.28ms        ? ?/sec    1.00     48.6±0.17ms        ? ?/sec
like_utf8view scalar starts with          1.08     57.5±0.17ms        ? ?/sec    1.00     53.3±0.24ms        ? ?/sec
nilike_utf8 scalar ends with              1.05   995.0±19.04µs        ? ?/sec    1.00   951.9±49.82µs        ? ?/sec
nilike_utf8 scalar starts with            1.00  1056.0±40.94µs        ? ?/sec    1.00  1050.8±34.07µs        ? ?/sec
nlike_utf8 scalar ends with               1.76    479.3±9.99µs        ? ?/sec    1.00   272.0±25.31µs        ? ?/sec
nlike_utf8 scalar starts with             1.36    499.1±5.47µs        ? ?/sec    1.00    367.9±8.76µs        ? ?/sec

samuelcolvin · 2024-07-28T13:58:15Z

(I also rebased onto the latest master, which I guess could have had an effect, although I doubt it.)

samuelcolvin · 2024-07-29T21:35:47Z

Benchmarks after merging master and fixing conflicts:

+ critcmp master starts_with-ends_with-improvements
group                               master                  starts_with-ends_with-improvements
-----                               ------                  ----------------------------------
ilike_utf8 scalar ends with         1.06    584.7±2.31µs    1.00    553.6±6.07µs
ilike_utf8 scalar starts with       1.03    578.1±5.63µs    1.00    559.8±7.77µs
like_utf8 scalar ends with          1.62    150.1±2.55µs    1.00     92.9±1.10µs
like_utf8 scalar starts with        1.63    150.2±4.11µs    1.00     92.2±1.39µs
like_utf8view scalar ends with      1.36     29.3±0.32ms    1.00     21.6±0.28ms
like_utf8view scalar starts with    1.38     28.9±0.21ms    1.00     21.0±0.38ms
nilike_utf8 scalar ends with        1.07   593.0±28.32µs    1.00    552.4±4.85µs
nilike_utf8 scalar starts with      1.04    578.1±2.64µs    1.00    557.9±7.13µs
nlike_utf8 scalar ends with         1.62    150.9±4.47µs    1.00     93.2±1.68µs
nlike_utf8 scalar starts with       1.60    149.5±3.35µs    1.00     93.2±2.92µs

alamb · 2024-07-30T19:49:42Z

🚀

alamb · 2024-07-30T19:49:48Z

Thanks again @samuelcolvin

github-actions bot added the arrow Changes to the arrow crate label Jul 25, 2024

Dandandan approved these changes Jul 26, 2024

samuelcolvin force-pushed the starts_with-ends_with-improvements branch from 8d6c7cc to 079b4b1 Compare July 26, 2024 09:03

samuelcolvin changed the title ~~improvements to (i)starts_with and (i)ends_with~~ improvements to (i)starts_with and (i)ends_with performance Jul 26, 2024

alamb approved these changes Jul 27, 2024

samuelcolvin added 3 commits July 28, 2024 14:15

improvements to "starts_with" and "ends_with"

05c78ca

add tests and refactor slightly

02dd628

add comments

8cc373b

samuelcolvin force-pushed the starts_with-ends_with-improvements branch from 079b4b1 to 8cc373b Compare July 28, 2024 13:15

samuelcolvin mentioned this pull request Jul 28, 2024

improve LIKE regex performance up to 12x #6145

Merged

Merge branch 'master' into starts_with-ends_with-improvements

bafb754

alamb merged commit bf0ea91 into apache:master Jul 30, 2024
17 checks passed

samuelcolvin deleted the starts_with-ends_with-improvements branch July 30, 2024 20:09
