Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions rust/arrow/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ lazy_static = "1.4"
packed_simd = { version = "0.3.4", optional = true, package = "packed_simd_2" }
chrono = "0.4"
flatbuffers = "0.6"
bitvec = "0.19"
hex = "0.4"
prettytable-rs = { version = "0.8.0", optional = true }

Expand Down
64 changes: 39 additions & 25 deletions rust/arrow/src/buffer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,9 +33,10 @@ use std::sync::Arc;
use crate::datatypes::ArrowNativeType;
use crate::error::{ArrowError, Result};
use crate::memory;
use crate::util::bit_chunk_iterator::BitChunks;
use crate::util::bit_slice_iterator::*;
use crate::util::bit_util;
use crate::util::bit_util::ceil;

#[cfg(feature = "simd")]
use std::borrow::BorrowMut;

Expand Down Expand Up @@ -258,16 +259,20 @@ impl Buffer {
/// Returns a slice of this buffer starting at a certain bit offset.
/// If the offset is byte-aligned the returned buffer is a shallow clone,
/// otherwise a new buffer is allocated and filled with a copy of the bits in the range.
pub fn bit_slice(&self, offset: usize, len: usize) -> Self {
if offset % 8 == 0 && len % 8 == 0 {
return self.slice(offset / 8);
pub fn bit_view(&self, offset_in_bits: usize, len_in_bits: usize) -> Self {
if offset_in_bits % 8 == 0 && len_in_bits % 8 == 0 {
self.slice(offset_in_bits / 8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand why this doesn't need to refer to len_in_bits -- how do we know that len_in_bits covers the entire buffer? Maybe this should be self.slice(len_in_bits/8)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the idea, bit view doesn't cover the whole Buffer. If you give the whole buffer's length in bits and start offset as 0 then it will cover the whole buffer. Otherwise, we can use a partial bit view on the Buffer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you give the whole buffer's length in bits and start offset as 0 then it will cover the whole buffer

Right, what I don't understand is how the test for len_in_bits % 8 == 0 is checking for the whole buffer length. It seems like it is checking that len_in_bits is a multiple of 8 (aka represents whole bytes)

Maybe there is some assumption here like self.len_in_bits < 8?

} else {
self.bit_slice()
.view(offset_in_bits, len_in_bits)
.as_buffer()
}

bitwise_unary_op_helper(&self, offset, len, |a| a)
}

pub fn bit_chunks(&self, offset: usize, len: usize) -> BitChunks {
BitChunks::new(&self, offset, len)
/// Gives bit slice of the underlying buffer
/// This method can be used to get bit views for bit operations on the immutable view over the buffer.
pub fn bit_slice(&self) -> BufferBitSlice {
BufferBitSlice::new(self.data.data())
}

/// Returns an empty buffer.
Expand Down Expand Up @@ -401,20 +406,27 @@ where
let mut result =
MutableBuffer::new(ceil(len_in_bits, 8)).with_bitset(len_in_bits / 64 * 8, false);

let left_chunks = left.bit_chunks(left_offset_in_bits, len_in_bits);
let right_chunks = right.bit_chunks(right_offset_in_bits, len_in_bits);
let left_slice = left.bit_slice().view(left_offset_in_bits, len_in_bits);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the rationale for this change? It seems to use more code to accomplish the same functionality without any performance improvement. I am likely missing something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the remainder calculation here. The remainder calculation was at the end of this method. It looks like more code because now we explicitly know what is bit view and what is chunk iterator. It was quite a bit blurry before. Now you can have a bit view over the buffer while having chunks and remaining bits do their work. It is for readability and not consuming chunk iterator.

let left_chunks = left_slice.chunks::<u64>();

let right_slice = right.bit_slice().view(right_offset_in_bits, len_in_bits);
let right_chunks = right_slice.chunks::<u64>();

let remainder_bytes = ceil(left_chunks.remainder_bit_len(), 8);
let rem = op(left_chunks.remainder_bits(), right_chunks.remainder_bits());
let rem = &rem.to_ne_bytes()[0..remainder_bytes];

let left_chunk_iter = left_chunks.interpret();
let right_chunk_iter = right_chunks.interpret();

let result_chunks = result.typed_data_mut::<u64>().iter_mut();

result_chunks
.zip(left_chunks.iter().zip(right_chunks.iter()))
.zip(left_chunk_iter.zip(right_chunk_iter))
.for_each(|(res, (left, right))| {
*res = op(left, right);
});

let remainder_bytes = ceil(left_chunks.remainder_len(), 8);
let rem = op(left_chunks.remainder_bits(), right_chunks.remainder_bits());
// we are counting its starting from the least significant bit, to to_le_bytes should be correct
let rem = &rem.to_le_bytes()[0..remainder_bytes];
result.extend_from_slice(rem);

result.freeze()
Expand All @@ -435,19 +447,21 @@ where
let mut result =
MutableBuffer::new(ceil(len_in_bits, 8)).with_bitset(len_in_bits / 64 * 8, false);

let left_chunks = left.bit_chunks(offset_in_bits, len_in_bits);
let left_slice = left.bit_slice().view(offset_in_bits, len_in_bits);
let left_chunks = left_slice.chunks::<u64>();

let remainder_bytes = ceil(left_chunks.remainder_bit_len(), 8);
let rem = op(left_chunks.remainder_bits());
let rem = &rem.to_ne_bytes()[0..remainder_bytes];

let left_chunk_iter = left_chunks.interpret();

let result_chunks = result.typed_data_mut::<u64>().iter_mut();

result_chunks
.zip(left_chunks.iter())
.for_each(|(res, left)| {
*res = op(left);
});
result_chunks.zip(left_chunk_iter).for_each(|(res, left)| {
*res = op(left);
});

let remainder_bytes = ceil(left_chunks.remainder_len(), 8);
let rem = op(left_chunks.remainder_bits());
// we are counting its starting from the least significant bit, to to_le_bytes should be correct
let rem = &rem.to_le_bytes()[0..remainder_bytes];
result.extend_from_slice(rem);

result.freeze()
Expand Down
47 changes: 27 additions & 20 deletions rust/arrow/src/compute/kernels/aggregate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -152,9 +152,15 @@ where
let data_chunks = data.chunks_exact(64);
let remainder = data_chunks.remainder();

let bit_chunks = buffer.bit_chunks(array.offset(), array.len());
let buffer_slice = buffer.bit_slice().view(array.offset(), array.len());
let buffer_chunks = buffer_slice.chunks::<u64>();

let buffer_remainder_bits: u64 = buffer_chunks.remainder_bits();

let buffer_chunk_iter = buffer_chunks.interpret();

&data_chunks
.zip(bit_chunks.iter())
.zip(buffer_chunk_iter)
.for_each(|(chunk, mask)| {
chunk.iter().enumerate().for_each(|(i, value)| {
if (mask & (1 << i)) != 0 {
Expand All @@ -163,10 +169,8 @@ where
});
});

let remainder_bits = bit_chunks.remainder_bits();

remainder.iter().enumerate().for_each(|(i, value)| {
if remainder_bits & (1 << i) != 0 {
if buffer_remainder_bits & (1 << i) != 0 {
sum = sum + *value;
}
});
Expand Down Expand Up @@ -216,24 +220,27 @@ where
let data_chunks = data.chunks_exact(64);
let remainder = data_chunks.remainder();

let bit_chunks = buffer.bit_chunks(array.offset(), array.len());
let bit_slice = buffer.bit_slice().view(array.offset(), array.len());
let bit_chunks = bit_slice.chunks::<u64>();
let remainder_bits = bit_chunks.remainder_bits();

data_chunks.zip(bit_chunks).for_each(|(chunk, mut mask)| {
// split chunks further into slices corresponding to the vector length
// the compiler is able to unroll this inner loop and remove bounds checks
// since the outer chunk size (64) is always a multiple of the number of lanes
chunk.chunks_exact(T::lanes()).for_each(|chunk| {
let zero = T::init(T::default_value());
let vecmask = T::mask_from_u64(mask);
let chunk = T::load(&chunk);
let blended = T::mask_select(vecmask, chunk, zero);

vector_sum = vector_sum + blended;

mask = mask >> T::lanes();
data_chunks
.zip(bit_chunks.interpret())
.for_each(|(chunk, mut mask)| {
// split chunks further into slices corresponding to the vector length
// the compiler is able to unroll this inner loop and remove bounds checks
// since the outer chunk size (64) is always a multiple of the number of lanes
chunk.chunks_exact(T::lanes()).for_each(|chunk| {
let zero = T::init(T::default_value());
let vecmask = T::mask_from_u64(mask);
let chunk = T::load(&chunk);
let blended = T::mask_select(vecmask, chunk, zero);

vector_sum = vector_sum + blended;

mask = mask >> T::lanes();
});
});
});

remainder.iter().enumerate().for_each(|(i, value)| {
if remainder_bits & (1 << i) != 0 {
Expand Down
2 changes: 1 addition & 1 deletion rust/arrow/src/compute/kernels/boolean.rs
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ pub fn is_not_null(input: &ArrayRef) -> Result<BooleanArray> {
.with_bitset(len_bytes, true)
.freeze()
}
Some(buffer) => buffer.bit_slice(input.offset(), len),
Some(buffer) => buffer.bit_view(input.offset(), len),
};

let data =
Expand Down
8 changes: 4 additions & 4 deletions rust/arrow/src/compute/util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,10 @@ pub(super) fn combine_option_bitmap(
match left {
None => match right {
None => Ok(None),
Some(r) => Ok(Some(r.bit_slice(right_offset_in_bits, len_in_bits))),
Some(r) => Ok(Some(r.bit_view(right_offset_in_bits, len_in_bits))),
},
Some(l) => match right {
None => Ok(Some(l.bit_slice(left_offset_in_bits, len_in_bits))),
None => Ok(Some(l.bit_view(left_offset_in_bits, len_in_bits))),

Some(r) => Ok(Some(buffer_bin_and(
&l,
Expand Down Expand Up @@ -78,10 +78,10 @@ pub(super) fn compare_option_bitmap(
match left {
None => match right {
None => Ok(None),
Some(r) => Ok(Some(r.bit_slice(right_offset_in_bits, len_in_bits))),
Some(r) => Ok(Some(r.bit_view(right_offset_in_bits, len_in_bits))),
},
Some(l) => match right {
None => Ok(Some(l.bit_slice(left_offset_in_bits, len_in_bits))),
None => Ok(Some(l.bit_view(left_offset_in_bits, len_in_bits))),

Some(r) => Ok(Some(buffer_bin_or(
&l,
Expand Down
Loading