-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-10378: [Rust] Update take() kernel with support for LargeList. #8556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @drusso , thanks a lot. I haven't look into the changes in detail, but I always understood offsets and indices as different concepts: AFAI understand, According to the spec,
So, if anything, what we could change is to make |
|
I'll add some background context on the implementation of Let's start with a list array: For this example, let's take indices Let's look specifically at the input values array and the output values array: We can think of the values arrays as their sequences of indices into the input values array: Constructing that output values array can be accomplished by: This is what the implementation of The I'll also note that the generic implementation is in |
|
You are obviously right, @drusso. I am sorry for the confusion. |
|
I've been away for a few days, I'll review this in the evening GMT+2, as I worked on the initial |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me -- I had some code style questions, but all in all it looks like a nice generalization to me
nevi-me
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| let start = offsets[ix]; | ||
| let end = offsets[ix + 1]; | ||
| current_offset += (end - start) as i32; | ||
| current_offset = current_offset + (end - start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably did this in response to a clippy lint, but it's fine as we have more lints to fix at some point
This change adds support for
LargeListintake().There is an additional update to the underlying implementation of
take()such that the indices may be anyPrimitiveArrayofArrowNumericType, rather than onlyUInt32Array. This change is motivated by the recursive call totake()intake_list()(here), since in order to supportLargeListArray, which usei64offsets, the recursive call must support indices arrays that areInt64Array.