@ctsk ctsk commented May 30, 2025

Which issue does this PR close?

GenericByteArray::value_unchecked permits unsafe code, but still introduces a check due to unwrap being called here:

        let b = std::slice::from_raw_parts(
-           self.value_data.as_ptr().offset(start.to_isize().unwrap()),
-           (end - start).to_usize().unwrap(),
+           self.value_data.as_ptr().offset(start.to_isize().unwrap_unchecked()),
+           (end - start).to_usize().unwrap_unchecked(),
        );

I believe it is sensible to use unwrap_unchecked here instead. While the compiler may be able to prune the first unwrap as unreachable, I believe it cannot prove at compile time that end >= start, so it cannot eliminate the second unwrap. That end >= start holds is an invariant of GenericByteArray.
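
To illustrate the reasoning, here is a minimal, self-contained sketch. It is not the arrow-rs implementation: `ByteArraySketch` and its fields are hypothetical stand-ins, and std's `usize::try_from` is swapped in for the `to_usize()` conversion shown in the diff. It assumes the offsets are non-negative, non-decreasing, and within the bounds of the value buffer.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified stand-in for a variable-length byte array.
/// Assumed invariant: offsets are non-negative, non-decreasing, and
/// every offset is <= values.len().
struct ByteArraySketch {
    offsets: Vec<i32>,
    values: Vec<u8>,
}

impl ByteArraySketch {
    /// Safe accessor: index, conversion, and range checks are all present.
    fn value(&self, i: usize) -> &[u8] {
        let start = usize::try_from(self.offsets[i]).unwrap();
        let end = usize::try_from(self.offsets[i + 1]).unwrap();
        &self.values[start..end]
    }

    /// # Safety
    /// `i + 1 < self.offsets.len()` and the invariant above must hold.
    unsafe fn value_unchecked(&self, i: usize) -> &[u8] {
        unsafe {
            let start = *self.offsets.get_unchecked(i);
            let end = *self.offsets.get_unchecked(i + 1);
            // `unwrap_unchecked` promises the conversion cannot fail, so no
            // panic branch is needed for a negative `start` or `end - start`.
            let offset = usize::try_from(start).unwrap_unchecked();
            let len = usize::try_from(end - start).unwrap_unchecked();
            std::slice::from_raw_parts(self.values.as_ptr().add(offset), len)
        }
    }
}

fn main() {
    let arr = ByteArraySketch {
        offsets: vec![0, 5, 5, 8],
        values: b"helloarc".to_vec(),
    };
    assert_eq!(arr.value(0), b"hello");
    // SAFETY: index 2 is in bounds and the offsets satisfy the invariant.
    let last = unsafe { arr.value_unchecked(2) };
    assert_eq!(last, b"arc");
}
```

With a plain `unwrap()` on `end - start`, the failure path has to stay in the generated code unless the optimizer can prove the difference is non-negative, which depends on runtime data.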

Are there any user-facing changes?

No.

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 30, 2025

@alamb alamb left a comment

Thanks @ctsk -- this also makes sense to me

I took the liberty of merging this PR up from main and running cargo fmt on it to get a clean CI run.

I also queued up some benchmark runs to see whether this change yields any measurable improvement.

alamb commented Jun 8, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix/byte-array-unchecked (7414160) to 9d172a8 diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench comparison_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=fix_byte-array-unchecked
Results will be posted here when complete

alamb commented Jun 8, 2025

🤖: Benchmark completed

group                                                                                                    fix_byte-array-unchecked               main
-----                                                                                                    ------------------------               ----
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.01      2.8±0.03ms        ? ?/sec    1.00      2.8±0.02ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.04      2.9±0.03ms        ? ?/sec    1.00      2.8±0.02ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.04      2.3±0.05ms        ? ?/sec    1.00      2.2±0.06ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.02      2.2±0.02ms        ? ?/sec    1.00      2.1±0.02ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.04      2.8±0.04ms        ? ?/sec    1.00      2.7±0.01ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      2.9±0.02ms        ? ?/sec    1.00      2.9±0.02ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.05      2.3±0.06ms        ? ?/sec    1.00      2.2±0.03ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.02      2.2±0.03ms        ? ?/sec    1.00      2.1±0.02ms        ? ?/sec
eq Float32                                                                                               1.00     44.2±0.11µs        ? ?/sec    1.00     44.3±0.12µs        ? ?/sec
eq Int32                                                                                                 1.00     44.2±0.06µs        ? ?/sec    1.00     44.2±0.06µs        ? ?/sec
eq MonthDayNano                                                                                          1.00     96.8±5.05µs        ? ?/sec    1.00     96.7±4.65µs        ? ?/sec
eq StringArray StringArray                                                                               1.00     33.9±0.23ms        ? ?/sec    1.11     37.8±0.27ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.01     25.0±0.23ms        ? ?/sec    1.00     24.8±0.24ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.00    814.6±1.54µs        ? ?/sec    1.02    829.8±1.66µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.00    571.3±4.54µs        ? ?/sec    1.05    598.0±4.96µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.00    905.1±3.30µs        ? ?/sec    1.01    913.4±3.62µs        ? ?/sec
eq scalar Float32                                                                                        1.00     44.2±0.05µs        ? ?/sec    1.00     44.2±0.07µs        ? ?/sec
eq scalar Int32                                                                                          1.00     44.1±0.06µs        ? ?/sec    1.00     44.2±0.05µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.39     71.6±0.14µs        ? ?/sec    1.00     51.4±0.34µs        ? ?/sec
eq scalar StringArray                                                                                    1.00     25.4±0.31ms        ? ?/sec    1.20     30.6±0.59ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.02     17.0±0.13ms        ? ?/sec    1.00     16.7±0.10ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.02     17.2±0.18ms        ? ?/sec    1.00     16.9±0.15ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.03     17.3±0.13ms        ? ?/sec    1.00     16.7±0.10ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     77.1±0.23µs        ? ?/sec    1.00     77.1±0.15µs        ? ?/sec
gt Float32                                                                                               1.00     57.3±0.12µs        ? ?/sec    1.00     57.6±0.14µs        ? ?/sec
gt Int32                                                                                                 1.00     44.2±0.08µs        ? ?/sec    1.00     44.2±0.11µs        ? ?/sec
gt scalar Float32                                                                                        1.00     45.8±0.04µs        ? ?/sec    1.00     45.9±0.09µs        ? ?/sec
gt scalar Int32                                                                                          1.00     44.2±0.06µs        ? ?/sec    1.00     44.2±0.07µs        ? ?/sec
gt_eq Float32                                                                                            1.00     57.2±0.14µs        ? ?/sec    1.00     57.4±0.12µs        ? ?/sec
gt_eq Int32                                                                                              1.00     44.2±0.08µs        ? ?/sec    1.00     44.2±0.06µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.00     46.5±0.10µs        ? ?/sec    1.00     46.5±0.41µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.00     44.2±0.08µs        ? ?/sec    1.00     44.1±0.05µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     77.1±0.72µs        ? ?/sec    1.00     77.2±0.12µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.03      2.9±0.06ms        ? ?/sec    1.00      2.8±0.05ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.00      4.3±0.04ms        ? ?/sec    1.02      4.4±0.04ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.00  1096.8±26.54µs        ? ?/sec    1.03  1134.9±18.99µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.06   698.9±25.80µs        ? ?/sec    1.00   662.1±29.36µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.00  1074.2±49.39µs        ? ?/sec    1.01  1090.0±51.03µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     77.7±0.67µs        ? ?/sec    1.00     77.6±0.13µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00      2.2±0.05ms        ? ?/sec    1.00      2.2±0.03ms        ? ?/sec
like_utf8 scalar contains                                                                                1.00  1597.0±15.59µs        ? ?/sec    1.15  1829.8±22.65µs        ? ?/sec
like_utf8 scalar ends with                                                                               1.00    411.0±5.32µs        ? ?/sec    1.10   452.4±11.60µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.00    110.2±0.20µs        ? ?/sec    1.00    110.1±0.19µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.00    337.8±7.34µs        ? ?/sec    1.10   371.6±14.57µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.00     77.6±0.67µs        ? ?/sec    1.00     77.5±0.12µs        ? ?/sec
like_utf8view scalar complex                                                                             1.03    202.2±5.06ms        ? ?/sec    1.00    196.4±0.84ms        ? ?/sec
like_utf8view scalar contains                                                                            1.00    152.3±0.29ms        ? ?/sec    1.07    163.1±0.43ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.05     53.4±0.19ms        ? ?/sec    1.00     51.0±0.23ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.04     54.3±0.38ms        ? ?/sec    1.00     52.1±0.17ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.04     54.0±0.19ms        ? ?/sec    1.00     52.0±0.27ms        ? ?/sec
like_utf8view scalar equals                                                                              1.11     38.5±0.13ms        ? ?/sec    1.00     34.5±0.10ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.04     47.8±0.38ms        ? ?/sec    1.00     45.7±0.26ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.05     28.3±0.10ms        ? ?/sec    1.00     26.9±0.07ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.04     49.2±0.36ms        ? ?/sec    1.00     47.2±0.18ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1565.1±63.44µs        ? ?/sec    1.00   1563.4±5.55µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      4.0±0.01ms        ? ?/sec    1.04      4.2±0.01ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1563.9±12.06µs        ? ?/sec    1.00   1566.1±7.32µs        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.02    518.0±4.84µs        ? ?/sec    1.00    508.0±2.03µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00  1796.4±12.88µs        ? ?/sec    1.03  1848.3±11.06µs        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.02   1584.9±7.55µs        ? ?/sec    1.00  1556.8±10.21µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.00      4.1±0.01ms        ? ?/sec    1.00      4.1±0.01ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.02   1575.8±6.04µs        ? ?/sec    1.00   1551.8±4.30µs        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.00    535.3±2.95µs        ? ?/sec    1.01    538.9±1.92µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.00   1815.0±4.11µs        ? ?/sec    1.02  1860.0±10.45µs        ? ?/sec
lt Float32                                                                                               1.00     57.1±0.15µs        ? ?/sec    1.00     57.3±0.13µs        ? ?/sec
lt Int32                                                                                                 1.00     44.3±0.20µs        ? ?/sec    1.00     44.3±0.59µs        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    670.9±4.17µs        ? ?/sec    1.00    668.3±4.76µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.00    859.0±3.31µs        ? ?/sec    1.01    868.2±4.10µs        ? ?/sec
lt scalar Float32                                                                                        1.00     46.4±0.10µs        ? ?/sec    1.00     46.5±0.07µs        ? ?/sec
lt scalar Int32                                                                                          1.00     44.2±0.09µs        ? ?/sec    1.00     44.2±0.08µs        ? ?/sec
lt scalar StringArray                                                                                    1.04     46.1±0.31ms        ? ?/sec    1.00     44.3±0.20ms        ? ?/sec
lt scalar StringViewArray                                                                                1.00     56.2±0.21ms        ? ?/sec    1.07     59.8±0.15ms        ? ?/sec
lt_eq Float32                                                                                            1.00     57.4±0.11µs        ? ?/sec    1.00     57.6±0.13µs        ? ?/sec
lt_eq Int32                                                                                              1.00     44.2±0.09µs        ? ?/sec    1.00     44.2±0.07µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     45.8±0.07µs        ? ?/sec    1.00     45.8±0.06µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     44.2±0.31µs        ? ?/sec    1.00     44.1±0.05µs        ? ?/sec
neq Float32                                                                                              1.00     44.2±0.10µs        ? ?/sec    1.00     44.3±0.11µs        ? ?/sec
neq Int32                                                                                                1.00     44.2±0.10µs        ? ?/sec    1.00     44.2±0.08µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.00    567.5±3.41µs        ? ?/sec    1.05    597.0±4.55µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.00    909.5±4.80µs        ? ?/sec    1.01    915.1±3.38µs        ? ?/sec
neq scalar Float32                                                                                       1.00     44.2±0.09µs        ? ?/sec    1.00     44.2±0.09µs        ? ?/sec
neq scalar Int32                                                                                         1.00     44.2±0.06µs        ? ?/sec    1.00     44.2±0.08µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.04      2.9±0.05ms        ? ?/sec    1.00      2.8±0.07ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.00      4.3±0.06ms        ? ?/sec    1.01      4.4±0.05ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.01  1142.3±48.23µs        ? ?/sec    1.00  1131.3±50.42µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.03   659.0±22.56µs        ? ?/sec    1.00   638.1±10.52µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00  1071.5±37.94µs        ? ?/sec    1.00  1074.9±53.24µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.02      2.2±0.03ms        ? ?/sec    1.00      2.1±0.04ms        ? ?/sec
nlike_utf8 scalar contains                                                                               1.00  1620.7±23.10µs        ? ?/sec    1.12  1807.7±16.83µs        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.00   424.9±15.14µs        ? ?/sec    1.03    439.4±9.31µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00    110.0±0.20µs        ? ?/sec    1.00    110.0±0.10µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.00   346.4±16.63µs        ? ?/sec    1.06    365.7±7.71µs        ? ?/sec

alamb commented Jun 8, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix/byte-array-unchecked (7414160) to 9d172a8 diff
BENCH_NAME=filter_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench filter_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=fix_byte-array-unchecked
Results will be posted here when complete

alamb commented Jun 8, 2025

🤖: Benchmark completed

group                                                                         fix_byte-array-unchecked               main
-----                                                                         ------------------------               ----
filter context decimal128 (kept 1/2)                                          1.11     44.5±4.76µs        ? ?/sec    1.00     39.9±0.40µs        ? ?/sec
filter context decimal128 high selectivity (kept 1023/1024)                   1.00     50.4±2.09µs        ? ?/sec    1.01     50.7±1.36µs        ? ?/sec
filter context decimal128 low selectivity (kept 1/1024)                       1.00    240.7±0.23ns        ? ?/sec    1.00    241.4±0.28ns        ? ?/sec
filter context f32 (kept 1/2)                                                 1.00     69.5±0.15µs        ? ?/sec    1.30     90.4±0.13µs        ? ?/sec
filter context f32 high selectivity (kept 1023/1024)                          1.00     13.8±0.53µs        ? ?/sec    1.00     13.9±0.44µs        ? ?/sec
filter context f32 low selectivity (kept 1/1024)                              1.00    442.4±6.20ns        ? ?/sec    1.05    465.9±0.99ns        ? ?/sec
filter context fsb with value length 20 (kept 1/2)                            1.00     42.5±0.07µs        ? ?/sec    1.66     70.7±0.14µs        ? ?/sec
filter context fsb with value length 20 high selectivity (kept 1023/1024)     1.00     42.4±0.08µs        ? ?/sec    1.67     70.7±0.09µs        ? ?/sec
filter context fsb with value length 20 low selectivity (kept 1/1024)         1.00     42.4±0.06µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 5 (kept 1/2)                             1.00     42.5±0.08µs        ? ?/sec    1.66     70.7±0.08µs        ? ?/sec
filter context fsb with value length 5 high selectivity (kept 1023/1024)      1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.07µs        ? ?/sec
filter context fsb with value length 5 low selectivity (kept 1/1024)          1.00     42.4±0.05µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 50 (kept 1/2)                            1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 50 high selectivity (kept 1023/1024)     1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.10µs        ? ?/sec
filter context fsb with value length 50 low selectivity (kept 1/1024)         1.00     42.4±0.05µs        ? ?/sec    1.67     70.7±0.07µs        ? ?/sec
filter context i32 (kept 1/2)                                                 1.01     22.8±0.04µs        ? ?/sec    1.00     22.6±0.03µs        ? ?/sec
filter context i32 high selectivity (kept 1023/1024)                          1.01      6.7±0.36µs        ? ?/sec    1.00      6.6±0.49µs        ? ?/sec
filter context i32 low selectivity (kept 1/1024)                              1.01    248.2±0.45ns        ? ?/sec    1.00    245.4±0.30ns        ? ?/sec
filter context i32 w NULLs (kept 1/2)                                         1.00     65.7±0.12µs        ? ?/sec    1.43     93.9±0.22µs        ? ?/sec
filter context i32 w NULLs high selectivity (kept 1023/1024)                  1.03     13.6±0.38µs        ? ?/sec    1.00     13.2±0.29µs        ? ?/sec
filter context i32 w NULLs low selectivity (kept 1/1024)                      1.15    542.9±0.76ns        ? ?/sec    1.00    473.4±2.57ns        ? ?/sec
filter context mixed string view (kept 1/2)                                   1.00     87.9±7.74µs        ? ?/sec    1.33    116.8±6.95µs        ? ?/sec
filter context mixed string view high selectivity (kept 1023/1024)            1.01     59.1±2.18µs        ? ?/sec    1.00     58.4±1.69µs        ? ?/sec
filter context mixed string view low selectivity (kept 1/1024)                1.00    633.4±0.85ns        ? ?/sec    1.06    672.5±0.86ns        ? ?/sec
filter context short string view (kept 1/2)                                   1.00     88.8±7.38µs        ? ?/sec    1.24    109.9±2.17µs        ? ?/sec
filter context short string view high selectivity (kept 1023/1024)            1.00     57.4±1.28µs        ? ?/sec    1.02     58.5±1.41µs        ? ?/sec
filter context short string view low selectivity (kept 1/1024)                1.00    445.4±0.62ns        ? ?/sec    1.10    488.5±0.88ns        ? ?/sec
filter context string (kept 1/2)                                              1.00   554.8±10.13µs        ? ?/sec    1.08   598.9±11.17µs        ? ?/sec
filter context string dictionary (kept 1/2)                                   1.00     23.4±0.06µs        ? ?/sec    1.00     23.3±0.07µs        ? ?/sec
filter context string dictionary high selectivity (kept 1023/1024)            1.00      7.3±0.36µs        ? ?/sec    1.02      7.5±0.26µs        ? ?/sec
filter context string dictionary low selectivity (kept 1/1024)                1.00    818.7±1.34ns        ? ?/sec    1.00    816.2±1.14ns        ? ?/sec
filter context string dictionary w NULLs (kept 1/2)                           1.00     66.4±0.11µs        ? ?/sec    1.43     95.0±0.25µs        ? ?/sec
filter context string dictionary w NULLs high selectivity (kept 1023/1024)    1.00     14.3±0.60µs        ? ?/sec    1.02     14.5±0.51µs        ? ?/sec
filter context string dictionary w NULLs low selectivity (kept 1/1024)        1.00   1026.5±1.91ns        ? ?/sec    1.04   1069.9±3.08ns        ? ?/sec
filter context string high selectivity (kept 1023/1024)                       1.00   641.6±15.29µs        ? ?/sec    1.00   642.0±13.73µs        ? ?/sec
filter context string low selectivity (kept 1/1024)                           1.14   1127.3±1.05ns        ? ?/sec    1.00    986.1±1.90ns        ? ?/sec
filter context u8 (kept 1/2)                                                  1.00     18.9±0.12µs        ? ?/sec    1.00     18.8±0.03µs        ? ?/sec
filter context u8 high selectivity (kept 1023/1024)                           1.00   1807.2±5.73ns        ? ?/sec    1.00  1801.9±10.98ns        ? ?/sec
filter context u8 low selectivity (kept 1/1024)                               1.00    233.4±0.30ns        ? ?/sec    1.02    239.0±0.81ns        ? ?/sec
filter context u8 w NULLs (kept 1/2)                                          1.00     61.6±0.08µs        ? ?/sec    1.46     89.8±0.08µs        ? ?/sec
filter context u8 w NULLs high selectivity (kept 1023/1024)                   1.00      8.6±0.06µs        ? ?/sec    1.00      8.6±0.02µs        ? ?/sec
filter context u8 w NULLs low selectivity (kept 1/1024)                       1.00    542.7±1.16ns        ? ?/sec    1.05    567.2±0.98ns        ? ?/sec
filter decimal128 (kept 1/2)                                                  1.00     95.6±0.34µs        ? ?/sec    1.02     97.8±0.28µs        ? ?/sec
filter decimal128 high selectivity (kept 1023/1024)                           1.00     53.3±1.79µs        ? ?/sec    1.02     54.1±1.58µs        ? ?/sec
filter decimal128 low selectivity (kept 1/1024)                               1.02      3.1±0.00µs        ? ?/sec    1.00      3.0±0.01µs        ? ?/sec
filter f32 (kept 1/2)                                                         1.16    231.3±0.30µs        ? ?/sec    1.00    199.3±0.33µs        ? ?/sec
filter fsb with value length 20 (kept 1/2)                                    1.00    134.9±0.62µs        ? ?/sec    1.11    149.8±0.56µs        ? ?/sec
filter fsb with value length 20 high selectivity (kept 1023/1024)             1.03     71.2±2.63µs        ? ?/sec    1.00     68.9±1.17µs        ? ?/sec
filter fsb with value length 20 low selectivity (kept 1/1024)                 1.00      3.2±0.01µs        ? ?/sec    1.01      3.2±0.01µs        ? ?/sec
filter fsb with value length 5 (kept 1/2)                                     1.00    131.6±0.15µs        ? ?/sec    1.17    153.7±0.45µs        ? ?/sec
filter fsb with value length 5 high selectivity (kept 1023/1024)              1.00     11.2±0.56µs        ? ?/sec    1.03     11.5±0.58µs        ? ?/sec
filter fsb with value length 5 low selectivity (kept 1/1024)                  1.00      3.1±0.01µs        ? ?/sec    1.01      3.1±0.01µs        ? ?/sec
filter fsb with value length 50 (kept 1/2)                                    1.04    190.6±6.99µs        ? ?/sec    1.00    183.4±4.17µs        ? ?/sec
filter fsb with value length 50 high selectivity (kept 1023/1024)             1.03    219.8±7.05µs        ? ?/sec    1.00    214.1±5.86µs        ? ?/sec
filter fsb with value length 50 low selectivity (kept 1/1024)                 1.00      3.1±0.01µs        ? ?/sec    1.01      3.2±0.00µs        ? ?/sec
filter i32 (kept 1/2)                                                         1.00     93.2±0.17µs        ? ?/sec    1.00     93.5±0.14µs        ? ?/sec
filter i32 high selectivity (kept 1023/1024)                                  1.02      9.0±0.49µs        ? ?/sec    1.00      8.8±0.39µs        ? ?/sec
filter i32 low selectivity (kept 1/1024)                                      1.01      3.1±0.01µs        ? ?/sec    1.00      3.1±0.00µs        ? ?/sec
filter optimize (kept 1/2)                                                    1.00     84.8±0.20µs        ? ?/sec    1.00     84.9±0.20µs        ? ?/sec
filter optimize high selectivity (kept 1023/1024)                             1.02      2.9±0.01µs        ? ?/sec    1.00      2.8±0.01µs        ? ?/sec
filter optimize low selectivity (kept 1/1024)                                 1.00      2.8±0.01µs        ? ?/sec    1.01      2.8±0.00µs        ? ?/sec
filter run array (kept 1/2)                                                   1.23    445.5±1.01µs        ? ?/sec    1.00    360.8±1.46µs        ? ?/sec
filter run array high selectivity (kept 1023/1024)                            1.28    397.0±1.87µs        ? ?/sec    1.00    310.5±1.71µs        ? ?/sec
filter run array low selectivity (kept 1/1024)                                1.35    333.5±0.68µs        ? ?/sec    1.00    247.9±0.81µs        ? ?/sec
filter single record batch                                                    1.00     93.4±0.28µs        ? ?/sec    1.00     93.0±0.18µs        ? ?/sec
filter u8 (kept 1/2)                                                          1.00     93.1±0.19µs        ? ?/sec    1.00     93.6±0.23µs        ? ?/sec
filter u8 high selectivity (kept 1023/1024)                                   1.00      3.8±0.04µs        ? ?/sec    1.00      3.9±0.01µs        ? ?/sec
filter u8 low selectivity (kept 1/1024)                                       1.01      3.0±0.02µs        ? ?/sec    1.00      3.0±0.00µs        ? ?/sec

alamb commented Jun 8, 2025

I don't see much one way or the other in the benchmarks but I would say this is a reasonable change regardless

ctsk commented Jun 9, 2025

Thank you for driving this forward!

I don't see much one way or the other in the benchmarks but I would say this is a reasonable change regardless

I don't know the details of the filter benchmarks, but this does seem like a significant difference doesn't it?

filter context fsb with value length 20 (kept 1/2)                            1.00     42.5±0.07µs        ? ?/sec    1.66     70.7±0.14µs        ? ?/sec
filter context fsb with value length 20 high selectivity (kept 1023/1024)     1.00     42.4±0.08µs        ? ?/sec    1.67     70.7±0.09µs        ? ?/sec
filter context fsb with value length 20 low selectivity (kept 1/1024)         1.00     42.4±0.06µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 5 (kept 1/2)                             1.00     42.5±0.08µs        ? ?/sec    1.66     70.7±0.08µs        ? ?/sec
filter context fsb with value length 5 high selectivity (kept 1023/1024)      1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.07µs        ? ?/sec
filter context fsb with value length 5 low selectivity (kept 1/1024)          1.00     42.4±0.05µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 50 (kept 1/2)                            1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.08µs        ? ?/sec
filter context fsb with value length 50 high selectivity (kept 1023/1024)     1.00     42.4±0.07µs        ? ?/sec    1.67     70.7±0.10µs        ? ?/sec
filter context fsb with value length 50 low selectivity (kept 1/1024)         1.00     42.4±0.05µs        ? ?/sec    1.67     70.7±0.07µs        ? ?/sec
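
For reference, the ratio column in the rows above can be reproduced from the timings. This is a small sketch under the assumption that each ratio is the row's time divided by the fastest time in that row, which matches the 1.00 shown for the faster side; `relative` is a hypothetical helper, not part of the benchmark script.

```rust
/// Hypothetical helper: each timing divided by the fastest timing in the
/// row, formatted to two decimals like the table's ratio column.
fn relative(times: &[f64]) -> Vec<String> {
    let fastest = times.iter().copied().fold(f64::INFINITY, f64::min);
    times.iter().map(|t| format!("{:.2}", t / fastest)).collect()
}

fn main() {
    // "fsb with value length 20 (kept 1/2)": branch 42.5µs, main 70.7µs.
    println!("{:?}", relative(&[42.5, 70.7])); // prints ["1.00", "1.66"]
    // "fsb with value length 20 high selectivity": branch 42.4µs, main 70.7µs.
    println!("{:?}", relative(&[42.4, 70.7])); // prints ["1.00", "1.67"]
}
```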

alamb commented Jun 9, 2025

I don't know the details of the filter benchmarks, but this does seem like a significant difference doesn't it?

Yes, you are right! I am sorry I missed that. However, I double-checked and fsb stands for FixedSizeBinaryArray (which is not a GenericBinaryArray), so I am not sure what is going on there 🤔

ctsk commented Jun 9, 2025

I see other runs of the benchmarks have the same pattern (with the same numbers for the fsb benchmark). Somehow the second run always takes 66% longer... #7463.

alamb commented Jun 9, 2025

I see other runs of the benchmarks have the same pattern (with the same numbers for the fsb benchmark). Somehow the second run always takes 66% longer... #7463.

Yeah, it is really weird. It would be great if someone could look into that.

I'll file a ticket

alamb commented Jun 9, 2025

I see other runs of the benchmarks have the same pattern (with the same numbers for the fsb benchmark). Somehow the second run always takes 66% longer... #7463.

@alamb alamb merged commit 9482f78 into apache:main Jun 9, 2025
26 checks passed