First-Class Array Abstractions #3880

Closed
tustvold opened this issue Mar 17, 2023 · 3 comments · Fixed by #4061
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate enhancement Any new improvement worthy of a entry in the changelog parquet Changes to the parquet crate

Comments

@tustvold
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, Array implementations are wrappers around an ArrayData, which each one stores as a member. This is redundant and confusing, and it results in API friction.

Describe the solution you'd like

I would like the Array implementations to store just their constituent parts, in the same vein as the typed ArrayData abstractions experimented with under #1799. This would involve the following:

  • Add Array::nulls(&self) -> Option<&NullBuffer>
  • Add Array::to_data(&self) -> ArrayData
  • Implement Array::slice, Array::data_type, Array::len, Array::get_buffer_memory_size, etc. for each Array implementation
  • Deprecate Array::data, Array::data_ref and Array::offset
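
As a rough illustration of the list above, here is a minimal sketch of an array that stores its constituent parts directly and builds the untyped representation only on request. It is not the arrow-rs implementation: `NullBuffer`, `ArrayData`, `DataType` and `Int32Array` below are simplified, hypothetical stand-ins that only borrow their names from this issue, and the sketch uses nothing beyond the standard library.

```rust
use std::sync::Arc;

/// Hypothetical, heavily reduced type tag (assumption, not the real enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    Int32,
}

/// Simplified validity mask: `true` means the slot holds a value.
/// A real implementation would use a packed bitmap; this is a stand-in.
#[derive(Debug, Clone)]
pub struct NullBuffer {
    valid: Arc<[bool]>,
    offset: usize,
    len: usize,
}

impl NullBuffer {
    pub fn new(valid: Vec<bool>) -> Self {
        let len = valid.len();
        Self { valid: valid.into(), offset: 0, len }
    }

    pub fn is_valid(&self, i: usize) -> bool {
        assert!(i < self.len);
        self.valid[self.offset + i]
    }

    pub fn null_count(&self) -> usize {
        let window = &self.valid[self.offset..self.offset + self.len];
        window.iter().filter(|v| !**v).count()
    }

    /// Zero-copy slice: shares the allocation and narrows the window.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        Self { valid: Arc::clone(&self.valid), offset: self.offset + offset, len }
    }
}

/// Stand-in for the untyped representation, produced only on demand.
#[derive(Debug, Clone)]
pub struct ArrayData {
    pub data_type: DataType,
    pub len: usize,
    pub offset: usize,           // applies to `values`
    pub values: Arc<[i32]>,
    pub nulls: Option<NullBuffer>, // already narrowed to the same window
}

/// The accessors named in the proposal, implemented per array type.
pub trait Array {
    fn data_type(&self) -> DataType;
    fn len(&self) -> usize;
    fn nulls(&self) -> Option<&NullBuffer>;
    fn to_data(&self) -> ArrayData;
}

/// The array owns its parts directly; no `ArrayData` member is stored.
pub struct Int32Array {
    values: Arc<[i32]>,
    offset: usize,
    len: usize,
    nulls: Option<NullBuffer>,
}

impl Int32Array {
    pub fn new(values: Vec<i32>, nulls: Option<NullBuffer>) -> Self {
        if let Some(n) = &nulls {
            assert_eq!(n.len, values.len());
        }
        let len = values.len();
        Self { values: values.into(), offset: 0, len, nulls }
    }

    pub fn value(&self, i: usize) -> i32 {
        assert!(i < self.len);
        self.values[self.offset + i]
    }

    /// Strongly typed slice: returns `Int32Array`, not a trait object.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        Self {
            values: Arc::clone(&self.values),
            offset: self.offset + offset,
            len,
            nulls: self.nulls.as_ref().map(|n| n.slice(offset, len)),
        }
    }
}

impl Array for Int32Array {
    fn data_type(&self) -> DataType { DataType::Int32 }
    fn len(&self) -> usize { self.len }
    fn nulls(&self) -> Option<&NullBuffer> { self.nulls.as_ref() }
    fn to_data(&self) -> ArrayData {
        ArrayData {
            data_type: DataType::Int32,
            len: self.len,
            offset: self.offset,
            values: Arc::clone(&self.values),
            nulls: self.nulls.clone(),
        }
    }
}
```

In this sketch, slicing only clones reference-counted handles, and callers that need the generic form call `to_data()` explicitly instead of borrowing a stored `ArrayData` through something like `Array::data`.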

Describe alternatives you've considered

Additional context

#3879

@tustvold tustvold added the enhancement Any new improvement worthy of a entry in the changelog label Mar 17, 2023
@tustvold tustvold self-assigned this Mar 17, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 17, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 17, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 17, 2023
tustvold added a commit that referenced this issue Mar 17, 2023
* Add Array::to_data and Array::nulls (#3880)

* Review feedback

* Format
tustvold added a commit that referenced this issue Mar 21, 2023
* Flesh out NullBuffer abstraction (#3880)

* Review feedback
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 21, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 21, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 23, 2023
tustvold added a commit that referenced this issue Mar 23, 2023
* Cleanup uses of Array::data_ref (#3880)

* Further cleanup and fixes
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 23, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 23, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 23, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 23, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 24, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 24, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 24, 2023
spebern pushed a commit to spebern/arrow-rs that referenced this issue Mar 25, 2023
* Add Array::to_data and Array::nulls (apache#3880)

* Review feedback

* Format
spebern pushed a commit to spebern/arrow-rs that referenced this issue Mar 25, 2023
* Flesh out NullBuffer abstraction (apache#3880)

* Review feedback
tustvold added a commit that referenced this issue Mar 28, 2023
* Add PrimitiveArray slice (#3880)

* Add ByteArray::slice (#3929)

* Add strongly typed Array::slice (#3929)
tustvold added a commit that referenced this issue Mar 30, 2023
* Add typed buffers to UnionArray (#3880)

* Clippy

* Update arrow-array/src/array/union_array.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
tustvold added a commit to tustvold/arrow-rs that referenced this issue Apr 2, 2023
tustvold added a commit that referenced this issue Apr 3, 2023
* Cleanup more uses of Array::data (#3880)

* Fix failing test

* Fix parquet map array

* Further cleanup

* Further cleanup

* More cleanup

* Fix test

* Clippy
tustvold added a commit to tustvold/arrow-rs that referenced this issue Apr 4, 2023
tustvold added a commit that referenced this issue Apr 5, 2023
* Deprecate Array::data (#3880)

* Review feedback
tustvold added a commit to tustvold/arrow-rs that referenced this issue Apr 12, 2023
tustvold added a commit that referenced this issue Apr 12, 2023
* Remove ArrayData from Array (#3880)

* Fix doc

* Fix pyarrow-integration-testing

* Review feedback
@tustvold tustvold added the arrow Changes to the arrow crate label Apr 21, 2023
@tustvold
Contributor Author

label_issue.py automatically added labels {'arrow'} from #3877

@tustvold tustvold added the parquet Changes to the parquet crate label Apr 21, 2023
@tustvold
Contributor Author

label_issue.py automatically added labels {'parquet'} from #3930

@tustvold
Contributor Author

label_issue.py automatically added labels {'arrow-flight'} from #3965

@tustvold tustvold added the arrow-flight Changes to the arrow-flight crate label Apr 21, 2023