Add BooleanArray::new_from_packed and BooleanArray::new_from_u8 #6127

Merged
59 changes: 58 additions & 1 deletion arrow-array/src/array/boolean_array.rs
@@ -19,7 +19,7 @@ use crate::array::print_long_array;
use crate::builder::BooleanBuilder;
use crate::iterator::BooleanIter;
use crate::{Array, ArrayAccessor, ArrayRef, Scalar};
-use arrow_buffer::{bit_util, BooleanBuffer, MutableBuffer, NullBuffer};
+use arrow_buffer::{bit_util, BooleanBuffer, Buffer, MutableBuffer, NullBuffer};
use arrow_data::{ArrayData, ArrayDataBuilder};
use arrow_schema::DataType;
use std::any::Any;
@@ -110,6 +110,24 @@ impl BooleanArray {
Scalar::new(Self::new(values, None))
}

/// Create a new [`BooleanArray`] from a bit-packed [`Buffer`], where `offset` and `len` are both given in bits.
///
/// Each bit in the selected range of the [`Buffer`] becomes one boolean value in the [`BooleanArray`].
/// Note that:
/// * the constructed [`BooleanArray`] contains no nulls;
/// * apart from the cost of `buffer.into()`, this method is efficient because the booleans are never unpacked or repacked.
pub fn new_from_packed(buffer: impl Into<Buffer>, offset: usize, len: usize) -> Self {
BooleanBuffer::new(buffer.into(), offset, len).into()
}

/// Create a new [`BooleanArray`] from a `&[u8]` of bit-packed values.
///
/// This is equivalent to calling `new_from_packed` with a [`Buffer`] built from `value`, an offset of 0, and a len of `value.len() * 8`.
/// Note that:
Contributor

❤️

Contributor Author

I think many people have said it before, but I still want to say that you are a very nice person. Thank you : )

Contributor

It is all part of my master plan to get everyone working together well to build this ecosystem together; it is great fun.

/// * the constructed [`BooleanArray`] contains no nulls;
/// * the length of the constructed [`BooleanArray`] is always a multiple of 8.
pub fn new_from_u8(value: &[u8]) -> Self {
BooleanBuffer::new(Buffer::from(value), 0, value.len() * 8).into()
}

/// Returns the length of this array.
pub fn len(&self) -> usize {
self.values.len()
@@ -509,6 +527,45 @@ mod tests {
}
}

#[test]
fn test_boolean_array_from_packed() {
let v = [1_u8, 2_u8, 3_u8];
let arr = BooleanArray::new_from_packed(v, 0, 24);
assert_eq!(24, arr.len());
assert_eq!(0, arr.offset());
assert_eq!(0, arr.null_count());
assert!(arr.nulls.is_none());
for i in 0..24 {
assert!(!arr.is_null(i));
assert!(arr.is_valid(i));
assert_eq!(
i == 0 || i == 9 || i == 16 || i == 17,
arr.value(i),
"failed at {i}"
)
}
}

#[test]
fn test_boolean_array_from_slice_u8() {
let v: Vec<u8> = vec![1, 2, 3];
let slice = &v[..];
let arr = BooleanArray::new_from_u8(slice);
assert_eq!(24, arr.len());
assert_eq!(0, arr.offset());
assert_eq!(0, arr.null_count());
assert!(arr.nulls().is_none());
for i in 0..24 {
assert!(!arr.is_null(i));
assert!(arr.is_valid(i));
assert_eq!(
i == 0 || i == 9 || i == 16 || i == 17,
arr.value(i),
"failed at {i}"
)
}
}

#[test]
fn test_boolean_array_from_iter() {
let v = vec![Some(false), Some(true), Some(false), Some(true)];
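The indices `0`, `9`, `16` and `17` asserted in the two new tests follow from least-significant-bit-first numbering within each byte, which is how Arrow lays out packed bits. Below is a minimal std-only sketch of that mapping. `unpack_bits` is a hypothetical helper written for illustration and is not part of arrow-rs; the real constructors wrap the packed buffer rather than unpacking it into a `Vec<bool>`.

```rust
/// Hypothetical helper (not an arrow-rs API): read `len` bits starting at bit
/// `offset`, taking bits least-significant-first within each byte.
fn unpack_bits(bytes: &[u8], offset: usize, len: usize) -> Vec<bool> {
    (offset..offset + len)
        .map(|i| (bytes[i / 8] >> (i % 8)) & 1 == 1)
        .collect()
}

fn main() {
    // Same input as the tests: bytes 1, 2, 3 viewed as 24 bits.
    let bits = unpack_bits(&[1, 2, 3], 0, 24);
    // Collect the positions of the bits that are set.
    let set: Vec<usize> = bits
        .iter()
        .enumerate()
        .filter(|(_, b)| **b)
        .map(|(i, _)| i)
        .collect();
    println!("{:?}", set); // prints [0, 9, 16, 17]
}
```

Byte `1` sets bit 0, byte `2` sets bit 1 of the second byte (index 9), and byte `3` sets bits 0 and 1 of the third byte (indices 16 and 17).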
11 changes: 11 additions & 0 deletions arrow-buffer/src/buffer/boolean.rs
@@ -21,6 +21,7 @@ use crate::{
bit_util, buffer_bin_and, buffer_bin_or, buffer_bin_xor, buffer_unary_not,
BooleanBufferBuilder, Buffer, MutableBuffer,
};

use std::ops::{BitAnd, BitOr, BitXor, Not};

/// A slice-able [`Buffer`] containing bit-packed booleans
@@ -414,4 +415,14 @@ mod tests {
let expected = BooleanBuffer::new(Buffer::from(&[255, 254, 254, 255, 255]), offset, len);
assert_eq!(!boolean_buf, expected);
}

#[test]
fn test_boolean_from_slice_bool() {
let v = [true, false, false];
let buf = BooleanBuffer::from(&v[..]);
assert_eq!(buf.offset(), 0);
assert_eq!(buf.len(), 3);
assert_eq!(buf.values().len(), 1);
assert!(buf.value(0));
}
}
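The `buf.values().len() == 1` assertion in `test_boolean_from_slice_bool` reflects that three booleans fit in a single packed byte. Below is a minimal sketch of the opposite direction (bool slice to packed bytes), assuming the same least-significant-bit-first layout. `pack_bools` is a hypothetical name used for illustration; it is not the arrow-buffer implementation.

```rust
/// Hypothetical helper (not an arrow-rs API): pack booleans into bytes,
/// least-significant bit first, using ceil(len / 8) bytes.
fn pack_bools(values: &[bool]) -> Vec<u8> {
    let mut bytes = vec![0u8; (values.len() + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        if v {
            // Set bit `i % 8` of byte `i / 8`.
            bytes[i / 8] |= 1 << (i % 8);
        }
    }
    bytes
}

fn main() {
    // Same input as the test: three booleans occupy one byte.
    let packed = pack_bools(&[true, false, false]);
    println!("{:?}", packed); // prints [1]
}
```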