Add ListViewArray and LargeListViewArray #5375

Open
1 of 7 tasks
Tracked by #5326
alamb opened this issue Feb 8, 2024 · 6 comments
Assignees
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog

Comments

@alamb
Contributor

alamb commented Feb 8, 2024

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Recently, two new types were added to the Arrow format that make it more suitable for certain kinds of operations on lists.

Specifically, when running filter / take on List data, creating a new ListArray or LargeListArray requires copying the underlying lists to a new, packed buffer. The "ListView" layout was designed to remove this limitation and was recently added to the Arrow spec.
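
To make the copying cost concrete, here is a minimal sketch in plain Rust that contrasts `take` on an offsets-only list layout with `take` on an offsets-plus-sizes view layout. The types `ListSketch` and `ListViewSketch` and both functions are hypothetical stand-ins written for illustration; they are not the arrow-rs API, they ignore null bitmaps, and they fix the child type to `i64`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a List layout: `offsets` has n + 1 entries
/// and element i is `values[offsets[i]..offsets[i + 1]]`.
#[derive(Debug, Clone, PartialEq)]
struct ListSketch {
    offsets: Vec<i32>,
    values: Vec<i64>,
}

/// Hypothetical stand-in for a ListView layout: element i is
/// `values[offsets[i]..offsets[i] + sizes[i]]`, so entries may appear
/// in any order relative to the child values.
#[derive(Debug, Clone, PartialEq)]
struct ListViewSketch {
    offsets: Vec<i32>,
    sizes: Vec<i32>,
    values: Arc<Vec<i64>>,
}

impl ListViewSketch {
    /// Returns the child values that make up element `i`.
    fn element(&self, i: usize) -> &[i64] {
        let start = self.offsets[i] as usize;
        let end = start + self.sizes[i] as usize;
        &self.values[start..end]
    }
}

/// `take` on the List layout: every selected element's child values
/// must be copied into a new, packed values buffer.
fn take_list(list: &ListSketch, indices: &[usize]) -> ListSketch {
    let mut offsets = vec![0i32];
    let mut values = Vec::new();
    for &i in indices {
        let start = list.offsets[i] as usize;
        let end = list.offsets[i + 1] as usize;
        values.extend_from_slice(&list.values[start..end]);
        offsets.push(values.len() as i32);
    }
    ListSketch { offsets, values }
}

/// `take` on the ListView layout: only the offsets and sizes are
/// gathered; the child values buffer is shared, not copied.
fn take_list_view(view: &ListViewSketch, indices: &[usize]) -> ListViewSketch {
    ListViewSketch {
        offsets: indices.iter().map(|&i| view.offsets[i]).collect(),
        sizes: indices.iter().map(|&i| view.sizes[i]).collect(),
        values: Arc::clone(&view.values),
    }
}
```

In this sketch, taking indices `[2, 0]` from the lists `[[1, 2], [3], [4, 5, 6]]` rewrites the whole values buffer in the List case, while the view case only produces two new small integer buffers and bumps a reference count.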

Describe the solution you'd like
I would like to implement ListViewArray and LargeListViewArray following the spec: https://arrow.apache.org/docs/format/Columnar.html#listview-layout

Initially, I would suggest we get the basic types in place:

  • DataType
  • Array implementations and layout and basic construction

Then as follow on PRs, add support to key kernels:

  • cast (to/from ListArray / LargeListArray)
  • filter
  • take
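
As a rough illustration of what the cast kernel has to do, the sketch below converts between the two layouts using plain slices. The function names `list_view_to_list` and `list_to_list_view` are hypothetical and not part of arrow-rs; nulls are ignored and the child type is fixed to `i64`. The bounds check reflects my reading of the spec's requirement that each `offsets[i] + sizes[i]` stays within the child array.

```rust
/// Hypothetical helper: packs a ListView (offsets + sizes) into a List
/// (n + 1 monotonic offsets + contiguous values). This direction has to
/// copy child values because the views may be out of order.
fn list_view_to_list(
    offsets: &[i32],
    sizes: &[i32],
    values: &[i64],
) -> Result<(Vec<i32>, Vec<i64>), String> {
    if offsets.len() != sizes.len() {
        return Err("offsets and sizes must have the same length".to_string());
    }
    let mut out_offsets = Vec::with_capacity(offsets.len() + 1);
    out_offsets.push(0i32);
    let mut out_values = Vec::new();
    for (i, (&off, &size)) in offsets.iter().zip(sizes).enumerate() {
        if off < 0 || size < 0 {
            return Err(format!("negative offset or size at index {i}"));
        }
        let start = off as usize;
        let end = start + size as usize;
        if end > values.len() {
            return Err(format!("view {i} extends past the child values"));
        }
        out_values.extend_from_slice(&values[start..end]);
        let next = i32::try_from(out_values.len())
            .map_err(|_| "packed values overflow i32 offsets".to_string())?;
        out_offsets.push(next);
    }
    Ok((out_offsets, out_values))
}

/// Hypothetical helper: the reverse direction needs no value copy. The
/// view offsets are the first n list offsets and each size is the
/// difference between adjacent list offsets.
fn list_to_list_view(list_offsets: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let n = list_offsets.len().saturating_sub(1);
    let view_offsets = list_offsets[..n].to_vec();
    let sizes = list_offsets.windows(2).map(|w| w[1] - w[0]).collect();
    (view_offsets, sizes)
}
```

The asymmetry is the point of the sketch: going from List to ListView can reuse the existing child values as-is, while going from ListView to List generally has to repack them.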

Describe alternatives you've considered

Additional context
This is similar in spirit to the StringViewArray and BinaryViewArray described in #5374

Tasks:

@alamb alamb added enhancement Any new improvement worthy of a entry in the changelog arrow Changes to the arrow crate labels Feb 8, 2024
@Kikkon
Contributor

Kikkon commented Mar 3, 2024

Hi @alamb

I'm interested in the progress of this issue regarding the addition of ListViewArray and LargeListViewArray to arrow-rs. I've been following the discussion and would like to know if there have been any updates or if there's anything specific I can assist with to help move this forward.

@alamb
Contributor Author

alamb commented Mar 3, 2024

> Hi @alamb
>
> I'm interested in the progress of this issue regarding the addition of ListViewArray and LargeListViewArray to arrow-rs. I've been following the discussion and would like to know if there have been any updates or if there's anything specific I can assist with to help move this forward.

Hi @Kikkon -- that is great news. 🙏

There have been discussions related to implementing StringView and BinaryView as part of #5374 and I expect that work to begin shortly (within a month maybe?).

I don't know of any similar work afoot for ListViewArray but I suspect we could follow a very similar pattern to #5374 (e.g. implement the basic array structure first, followed by cast, take, and filter kernels)

Does that make sense?

@Kikkon
Contributor

Kikkon commented Mar 10, 2024

Hi @alamb I have already created a preliminary issue, #5492, and a pull request, #5493. If time permits, I can also try the subsequent tasks.

@alamb
Contributor Author

alamb commented Mar 10, 2024

Thank you @Kikkon -- that sounds awesome

@Kikkon
Contributor

Kikkon commented Mar 13, 2024

@alamb Since #5492 has already been merged, I've created an issue #5501 to keep track of the progress. Is there anything that needs to be added?

@alamb
Contributor Author

alamb commented Mar 13, 2024

> @alamb Since #5492 has already been merged, I've created an issue #5501 to keep track of the progress. Is there anything that needs to be added?

Thanks @Kikkon

I added #5501 to the top of this ticket. In terms of next steps, I think the list of items we are finding on #5374 is worth looking at (e.g. IPC, support for filter/take, etc). We'll keep that list updated.


2 participants