Add BooleanArray::new_from_packed and BooleanArray::new_from_u8 #6127
I'm not sure about this, as it doesn't provide a mechanism to indicate a length that isn't a multiple of 8 or aligned. Perhaps you could expand on what the use-case for this is, and why you would be using this as opposed to say
Here, we only convert each bit in
Alternatively, we need to convert the bitstream into an iterator object that returns a boolean, but these come at a performance loss.
Of course, using this method involves two self-evident facts:
If these prerequisites are not met, then this method should not be used.
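To make the iterator alternative mentioned above concrete, here is a minimal std-only sketch. It does not use the arrow crate: `bits_lsb_first` is a hypothetical helper, and the least-significant-bit-first ordering is an assumption that follows the Arrow bitmap convention. It only shows what "an iterator that returns a boolean per bit" could look like, and says nothing about how large the performance difference is.

```rust
/// Hypothetical helper (not part of arrow-rs): lazily yields one `bool` per bit,
/// reading each byte from its least significant bit upward.
fn bits_lsb_first(bytes: &[u8]) -> impl Iterator<Item = bool> + '_ {
    bytes
        .iter()
        .flat_map(|&byte| (0..8).map(move |bit| (byte >> bit) & 1 == 1))
}

fn main() {
    // Bits 0 and 2 are set in the first byte; bit 7 is set in the second byte.
    let packed = [0b0000_0101u8, 0b1000_0000];
    let bits: Vec<bool> = bits_lsb_first(&packed).collect();

    // Two bytes always produce sixteen values; there is no way to ask for fewer.
    assert_eq!(bits.len(), 16);
    assert!(bits[0] && !bits[1] && bits[2]);
    assert!(bits[15]);
}
```

Each bit is produced one at a time through the iterator, whereas a packed buffer can be reused as-is. The output length is also always a multiple of 8, which is the limitation raised at the start of this thread.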
I wonder if this is related to the
Thanks for this proposal @chloro-pn
No, we are not building a new type here.
bcf4fba to 0420332
I thought about this more last night.
I think my main concern is whether the semantics of BooleanArray::from(&[u8]) will be "surprising" to people.
Thus I would like to propose making this a function like BooleanArray::from_u8, like
let arr = BooleanArray::from_u8(..)
We can put the documentation on that method. I think that would avoid any concern about confusion (as users would call the method explicitly).
What do you think?
@@ -50,6 +50,22 @@ use std::sync::Arc;
/// assert_eq!(&values, &[Some(true), None, Some(false), None, Some(false)])
/// ```
///
/// # Example: From `&[u8]`
Thank you.
I think your suggestion is more appropriate, and I will modify the code later.
Might I suggest something along the lines of
This would be explicit and consistent with other methods. Edit: although I wonder if this is then really all that different from
I wonder if we really need a new method...
This is what I mentioned in my previous reply "providing a more convenient way to build BooleanArray from bitstream".
My feeling is that this convenience comes at the expense of being rather confusing to users, especially those less familiar with the internal layout of arrow arrays. My 2 cents: if people want a convenient and easy interface, they can use the builders or iterators; if they want lower-level integration, they can use the lower-level interfaces and construct the array from parts. Providing a method that is convenient but whose correct usage requires knowledge of the internal layout seems undesirable. If Andrew is happy to move forward with this I have no objection, but I'm not sold.
I agree that it takes a lot of knowledge until this is obvious:
It seems to me that adding an example of how to do this in the documentation is valuable in any case. I also think it is valuable to add APIs that make arrow-rs easier to use for less expert users, even if that means they aren't the most concise, and thus I think an API such as the one proposed in the PR is a good addition.
Thank you @chloro-pn -- I think this is a nice addition to the API that is well commented and explains what it does well.
/// Create a new [`BooleanArray`] from `&[u8]`
/// This method uses `new_from_packed` and constructs a [`Buffer`] using `value`, and offset is set to 0 and len is set to `value.len() * 8`
/// using this method will make the following points self-evident:
❤️
I think many people have said it before, but I still want to say that you are a very nice person. Thank you : )
It is all part of my master plan to get everyone working together well to build this ecosystem together -- it is great fun.
Thank you for the review and comments @tustvold
Thanks again @chloro-pn and @tustvold
Which issue does this PR close?
Closes #.
Rationale for this change
Users can easily build BooleanArray from &[u8]
What changes are included in this PR?
BooleanArray::new_from_packed
BooleanArray::new_from_u8
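As a rough illustration of what the review thread describes for these two constructors (packed bits plus an offset and a len, with `new_from_u8` using offset 0 and len `value.len() * 8`), the following std-only sketch models the bit selection. It is not the arrow-rs implementation and does not use the arrow crate. `unpack_bits` and `unpack_all` are hypothetical helpers, the least-significant-bit-first ordering is assumed from the Arrow bitmap convention, and returning `None` on an out-of-range request is a choice made for this sketch only.

```rust
/// Hypothetical model of "packed bytes + offset + len": returns the `len` bits
/// starting at bit index `offset`, reading each byte least significant bit first.
/// Returns `None` when the requested range does not fit in `bytes`.
fn unpack_bits(bytes: &[u8], offset: usize, len: usize) -> Option<Vec<bool>> {
    let end = offset.checked_add(len)?;
    if end > bytes.len() * 8 {
        return None;
    }
    Some(
        (offset..end)
            .map(|i| (bytes[i / 8] >> (i % 8)) & 1 == 1)
            .collect(),
    )
}

/// Hypothetical model of the `&[u8]` convenience path: offset 0, len `bytes.len() * 8`.
fn unpack_all(bytes: &[u8]) -> Vec<bool> {
    unpack_bits(bytes, 0, bytes.len() * 8).expect("range covers exactly the input")
}

fn main() {
    let packed = [0b1010_0110u8, 0b0000_0011];

    // A length that is not a multiple of 8, starting at an unaligned bit offset.
    let some = unpack_bits(&packed, 1, 5).unwrap();
    assert_eq!(some, vec![true, true, false, false, true]);

    // The whole-slice path always yields eight values per input byte.
    assert_eq!(unpack_all(&packed).len(), 16);

    // Asking for bits past the end of the input is rejected in this sketch.
    assert!(unpack_bits(&packed, 10, 7).is_none());
}
```

The explicit offset and len are what allow lengths that are not a multiple of 8, while the whole-slice path trades that flexibility for convenience.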
Are there any user-facing changes?