
Conversation


@alamb alamb commented Nov 16, 2025

…thods, deprecate old methods

Which issue does this PR close?

Rationale for this change

  1. bitwise_bin_op_helper and bitwise_unary_op_helper are somewhat hard to find and use,
    as explained in WIP: special case bitwise ops when buffers are u64 aligned #8807

  2. I want to optimize bitwise operations even more heavily (see WIP: special case bitwise ops when buffers are u64 aligned #8807), so I want the implementations centralized so I can focus the optimization effort there

Also, I think these APIs cover the use case explained by @jorstmann in #8561:

Building a new buffer by starting from an empty state and incrementally appending new bits (append_value, append_slice, append_packed_range and similar methods).

By creating a method on Buffer directly, it is easier to find, and it is clearer that
a new Buffer is being created.

What changes are included in this PR?

Changes:

  1. Add Buffer::from_bitwise_unary and Buffer::from_bitwise_binary methods that do the same thing as bitwise_unary_op_helper and bitwise_bin_op_helper but are easier to find and use
  2. Deprecate bitwise_unary_op_helper and bitwise_bin_op_helper in favor
    of the new Buffer methods
  3. Document the new methods, with examples (specifically that the bitwise operations
    operate on bits, not bytes, and should not perform any cross-byte operations)

Are these changes tested?

Yes, new doc tests

Are there any user-facing changes?

New APIs, some deprecated

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 16, 2025
@alamb alamb force-pushed the alamb/bitwise_ops branch from 3c68505 to 69e68a1 Compare November 16, 2025 14:02
@alamb alamb force-pushed the alamb/bitwise_ops branch from 69e68a1 to d5a3604 Compare November 16, 2025 14:04

alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

Details

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    272.6±1.27ns        ? ?/sec    1.00    272.7±0.86ns        ? ?/sec
and_sliced    1.00   1096.3±7.89ns        ? ?/sec    1.00   1094.7±3.34ns        ? ?/sec
not           1.00    213.1±0.25ns        ? ?/sec    1.00    214.2±1.06ns        ? ?/sec
not_sliced    1.01    965.5±1.32ns        ? ?/sec    1.00    960.6±3.89ns        ? ?/sec
or            1.01    255.1±0.63ns        ? ?/sec    1.00    253.8±1.86ns        ? ?/sec
or_sliced     1.00   1228.0±7.56ns        ? ?/sec    1.00  1227.8±18.85ns        ? ?/sec


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

Details

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                1.00    259.6±0.56ns    55.1 GB/sec    1.00    258.9±2.00ns    55.2 GB/sec
buffer_binary_ops/and_with_offset    1.12   1486.1±2.12ns     9.6 GB/sec    1.00   1322.8±9.40ns    10.8 GB/sec
buffer_binary_ops/or                 1.00    239.3±0.60ns    59.8 GB/sec    1.07    256.3±1.96ns    55.8 GB/sec
buffer_binary_ops/or_with_offset     1.00   1355.4±2.50ns    10.6 GB/sec    1.10  1484.8±14.40ns     9.6 GB/sec
buffer_unary_ops/not                 1.14    257.5±0.71ns    37.0 GB/sec    1.00    225.9±3.19ns    42.2 GB/sec
buffer_unary_ops/not_with_offset     1.00    868.1±2.51ns    11.0 GB/sec    1.34  1160.1±14.15ns     8.2 GB/sec


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

Details

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    272.4±1.45ns        ? ?/sec    1.00    273.1±1.36ns        ? ?/sec
and_sliced    1.00   1096.0±1.60ns        ? ?/sec    1.00   1095.1±2.77ns        ? ?/sec
not           1.00    213.8±0.29ns        ? ?/sec    1.00    214.0±0.40ns        ? ?/sec
not_sliced    1.00    965.6±9.77ns        ? ?/sec    1.00    961.8±5.75ns        ? ?/sec
or            1.00    254.1±0.66ns        ? ?/sec    1.01    255.6±0.41ns        ? ?/sec
or_sliced     1.00   1225.5±2.12ns        ? ?/sec    1.00   1226.9±7.43ns        ? ?/sec


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

Details

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                1.00    259.7±0.55ns    55.1 GB/sec    1.00    259.3±4.36ns    55.2 GB/sec
buffer_binary_ops/and_with_offset    1.13   1486.2±3.20ns     9.6 GB/sec    1.00   1320.5±3.78ns    10.8 GB/sec
buffer_binary_ops/or                 1.00    239.2±0.34ns    59.8 GB/sec    1.07    256.2±0.89ns    55.8 GB/sec
buffer_binary_ops/or_with_offset     1.00   1355.8±4.32ns    10.6 GB/sec    1.09   1483.7±4.32ns     9.6 GB/sec
buffer_unary_ops/not                 1.13    257.1±0.97ns    37.1 GB/sec    1.00    226.6±1.72ns    42.1 GB/sec
buffer_unary_ops/not_with_offset     1.00    863.6±3.06ns    11.0 GB/sec    1.32   1139.4±2.91ns     8.4 GB/sec


alamb commented Nov 18, 2025

The benchmarks show a slowdown for some operations for some reason

buffer_binary_ops/and_with_offset 1.13 1486.2±3.20ns 9.6 GB/sec 1.00 1320.5±3.78ns 10.8 GB/sec

However, given the duration of the benchmark, I am thinking maybe this is cache lines or something.

I have an idea of how to improve the benchmarks so they are less noisy (basically run them in a 100x loop)

let rem = op(left_chunks.remainder_bits(), right_chunks.remainder_bits());
// bits are counted starting from the least significant bit, so to_le_bytes should be correct
let rem = &rem.to_le_bytes()[0..remainder_bytes];
buffer.extend_from_slice(rem);
Contributor:

This might do an extra allocation? Other places avoid this by preallocating the final u64 needed for the remainder as well (collect_bool)

@alamb alamb Nov 19, 2025

That is a good call -- I will make the change

However, this is the same code as the current bitwise_binary_op uses, so I would expect no performance difference 🤔

https://github.com/apache/arrow-rs/pull/8854/files#diff-e7a951ab8abfeef1016ed4427a3aef25be5be470454caa1e1dd93e56968316b5L122

Contributor:

I agree; however, allocations during benchmarking seem to make the results very noisy.

Contributor Author:

🤔 I tried this

    pub fn from_bitwise_binary_op<F>(
        left: impl AsRef<[u8]>,
        left_offset_in_bits: usize,
        right: impl AsRef<[u8]>,
        right_offset_in_bits: usize,
        len_in_bits: usize,
        mut op: F,
    ) -> Buffer
    where
        F: FnMut(u64, u64) -> u64,
    {
        let left_chunks = BitChunks::new(left.as_ref(), left_offset_in_bits, len_in_bits);
        let right_chunks = BitChunks::new(right.as_ref(), right_offset_in_bits, len_in_bits);

        let remainder_bytes = ceil(left_chunks.remainder_len(), 8);
        // if it evenly divides into u64 chunks
        let buffer = if remainder_bytes == 0 {
            let chunks = left_chunks
                .iter()
                .zip(right_chunks.iter())
                .map(|(left, right)| op(left, right));
            // Soundness: `BitChunks` is a trusted-len iterator that
            // correctly reports its upper bound
            unsafe { MutableBuffer::from_trusted_len_iter(chunks) }
        } else {
            // Compute last u64 here so that we can reserve exact capacity
            let rem = op(left_chunks.remainder_bits(), right_chunks.remainder_bits());

            let chunks = left_chunks
                .iter()
                .zip(right_chunks.iter())
                .map(|(left, right)| op(left, right))
                .chain(std::iter::once(rem));
            // Soundness: `BitChunks` is a trusted-len iterator that
            // correctly reports its upper bound, and so is the `chain` iterator
            let mut buffer = unsafe { MutableBuffer::from_trusted_len_iter(chunks) };
            // Adjust the length down if last u64 is not fully used
            let extra_bytes = 8 - remainder_bytes;
            buffer.truncate(buffer.len() - extra_bytes);
            buffer
        };
        buffer.into()
    }

But it seems to be slower.

Contributor Author:

I also tried making a version of MutableBuffer::from_trusted_len_iter that also reserves additional capacity, and it didn't seem to help either (perhaps because the benchmarks happen to avoid reallocation 🤔)

    /// Like [`from_trusted_len_iter`] but can add additional capacity at the end
    /// in case the caller wants to add more data after the initial iterator.
    #[inline]
    pub unsafe fn from_trusted_len_iter_with_additional_capacity<T: ArrowNativeType, I: Iterator<Item = T>>(
        iterator: I,
        additional_capacity: usize,
    ) -> Self {
        let item_size = std::mem::size_of::<T>();
        let (_, upper) = iterator.size_hint();
        let upper = upper.expect("from_trusted_len_iter requires an upper limit");
        let len = upper * item_size;

        let mut buffer = MutableBuffer::new(len + additional_capacity);

        let mut dst = buffer.data.as_ptr();
        for item in iterator {
            // note how there is no reserve here (compared with `extend_from_iter`)
            let src = item.to_byte_slice().as_ptr();
            unsafe { std::ptr::copy_nonoverlapping(src, dst, item_size) };
            dst = unsafe { dst.add(item_size) };
        }
        assert_eq!(
            unsafe { dst.offset_from(buffer.data.as_ptr()) } as usize,
            len,
            "Trusted iterator length was not accurately reported"
        );
        buffer.len = len;
        buffer
    }

@Dandandan Dandandan Nov 19, 2025

There is also an extend_from_trusted_len_iter in MutableBuffer? Another option is to use Vec::extend here as well.

F: FnMut(u64) -> u64,
{
// reserve capacity and set length so we can get a typed view of u64 chunks
let mut result =
Contributor:

As we overwrite the results, we shouldn't need to initialize/zero out the array.

@Dandandan

The benchmarks show a slowdown for some operations for some reason

buffer_binary_ops/and_with_offset 1.13 1486.2±3.20ns 9.6 GB/sec 1.00 1320.5±3.78ns 10.8 GB/sec

However, given the duration of the benchmark, I am thinking maybe this is cache lines or something.

I have an idea of how to improve the benchmarks so they are less noisy (basically run them in a 100x loop)

Might it also be because of the allocation? It looks like and_with_offset and and are not over power-of-two inputs.


alamb commented Dec 13, 2025

run benchmark buffer_bit_ops boolean_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (819210e) to c6cc7f8 diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                3.73   977.9±12.68ns    14.6 GB/sec    1.00    262.0±2.96ns    54.6 GB/sec
buffer_binary_ops/and_with_offset    1.20   1795.3±4.61ns     8.0 GB/sec    1.00  1493.0±24.14ns     9.6 GB/sec
buffer_binary_ops/or                 3.84    977.7±7.18ns    14.6 GB/sec    1.00    254.6±0.80ns    56.2 GB/sec
buffer_binary_ops/or_with_offset     1.39  1838.2±48.86ns     7.8 GB/sec    1.00   1324.5±6.34ns    10.8 GB/sec
buffer_unary_ops/not                 2.76    625.5±8.53ns    15.2 GB/sec    1.00    226.8±3.71ns    42.1 GB/sec
buffer_unary_ops/not_with_offset     1.12    927.0±4.27ns    10.3 GB/sec    1.00    831.2±1.37ns    11.5 GB/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (819210e) to c6cc7f8 diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           3.00    828.3±5.53ns        ? ?/sec    1.00    276.4±1.71ns        ? ?/sec
and_sliced    1.10  1349.6±20.70ns        ? ?/sec    1.00  1230.0±36.69ns        ? ?/sec
not           1.87    402.3±4.99ns        ? ?/sec    1.00    214.9±1.25ns        ? ?/sec
not_sliced    1.11   777.0±10.46ns        ? ?/sec    1.00    701.4±6.43ns        ? ?/sec
or            3.30   821.4±24.65ns        ? ?/sec    1.00    249.0±0.79ns        ? ?/sec
or_sliced     1.23  1340.2±11.33ns        ? ?/sec    1.00   1093.7±9.34ns        ? ?/sec


alamb commented Dec 14, 2025

run benchmark boolean_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (c6a2e40) to c6cc7f8 diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    273.5±5.64ns        ? ?/sec    1.01    275.4±1.59ns        ? ?/sec
and_sliced    1.00  1027.1±17.88ns        ? ?/sec    1.20  1229.1±17.09ns        ? ?/sec
not           1.00    183.9±2.69ns        ? ?/sec    1.18    216.2±2.04ns        ? ?/sec
not_sliced    1.00    619.8±9.56ns        ? ?/sec    1.13    701.3±1.85ns        ? ?/sec
or            1.00    248.0±0.86ns        ? ?/sec    1.01    249.8±1.80ns        ? ?/sec
or_sliced     1.00   1023.1±5.07ns        ? ?/sec    1.07   1092.0±2.68ns        ? ?/sec


alamb commented Dec 14, 2025

run benchmark buffer_bit_ops


alamb commented Dec 14, 2025

run benchmark boolean_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (82bc7aa) to c6cc7f8 diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                1.00    216.0±1.37ns    66.2 GB/sec    1.22   263.6±11.48ns    54.3 GB/sec
buffer_binary_ops/and_with_offset    1.00   1234.9±3.79ns    11.6 GB/sec    1.21  1489.1±11.61ns     9.6 GB/sec
buffer_binary_ops/or                 1.00    211.3±1.96ns    67.7 GB/sec    1.21    255.2±3.15ns    56.1 GB/sec
buffer_binary_ops/or_with_offset     1.00   1268.1±6.32ns    11.3 GB/sec    1.04  1324.3±13.77ns    10.8 GB/sec
buffer_unary_ops/not                 1.00    182.7±1.70ns    52.2 GB/sec    1.24    226.5±1.22ns    42.1 GB/sec
buffer_unary_ops/not_with_offset     1.00   750.8±24.40ns    12.7 GB/sec    1.11    829.7±6.96ns    11.5 GB/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (82bc7aa) to c6cc7f8 diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    211.3±5.58ns        ? ?/sec    1.30    274.8±1.53ns        ? ?/sec
and_sliced    1.00  1033.6±27.44ns        ? ?/sec    1.19   1225.3±5.24ns        ? ?/sec
not           1.00    146.4±0.60ns        ? ?/sec    1.47    215.2±4.93ns        ? ?/sec
not_sliced    1.00    621.0±3.26ns        ? ?/sec    1.13    700.8±7.63ns        ? ?/sec
or            1.00    200.8±0.91ns        ? ?/sec    1.24    249.1±1.57ns        ? ?/sec
or_sliced     1.00   1030.3±4.63ns        ? ?/sec    1.06  1093.6±11.06ns        ? ?/sec


alamb commented Dec 14, 2025

Might it also be because of the allocation? It looks like and_with_offset and and are not over power-of-two inputs.

You were spot on here @Dandandan -- getting rid of the extra allocation made a non-trivial difference in the benchmarks


alamb commented Dec 14, 2025

Update here: benchmarks are looking quite good 😎

I also incorporated the changes from #8807

My next plan is to:

  1. Add more unit tests / fuzzing
  2. Split it into two PRs to make it easier to review: unary and binary

Dandandan pushed a commit that referenced this pull request Dec 17, 2025
…unary` (#8996)

# Which issue does this PR close?


- part of #8806
- broken out from #8854


# Rationale for this change

The current implementation of the unary not kernel has an extra
allocation when operating on sliced data which is not necessary.

Also, we can generate more optimal code by processing u64 words at a
time when the buffer is already u64 aligned (see
#8807)

Also, it is hard to find the code to create new Buffers by copying bits

# What changes are included in this PR?

1. Introduce `BooleanBuffer::from_bitwise_unary` and
`BooleanBuffer::from_bits`
2. Deprecate `bitwise_unary_op_helper`

# Are these changes tested?

Yes with new tests and benchmarks

# Are there any user-facing changes?

New APIs

---------

Co-authored-by: Martin Hilton <[email protected]>
@Dandandan Dandandan closed this in 96637fc Jan 9, 2026
Dandandan pushed a commit to Dandandan/arrow-rs that referenced this pull request Jan 15, 2026
…er::from_bitwise_binary_op` (apache#9090)

# Which issue does this PR close?

- Part of apache#8806
- Closes apache#8854
- Closes apache#8807


This is the next step after
-  apache#8996

# Rationale for this change

- we can help rust / LLVM generate more optimal code by processing u64
words at a time when the buffer is already u64 aligned (see apache#8807)

Also, it is hard to find the code to create new Buffers by applying
bitwise unary operations.

# What changes are included in this PR?

- Introduce optimized `BooleanBuffer::from_bitwise_binary`
- Migrate several kernels that use `bitwise_bin_op_helper` to use the
new BooleanBuffer


# Are these changes tested?

Yes new tests are added

Performance results show 30% performance improvement for the `and` and
`or` kernels for aligned buffers (common case)

# Are there any user-facing changes?

A new API