ARROW-18237: [Java] Extend Table code #14573
covers all numeric types
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also:
@davisusanibar Would you please review? This is a partial implementation but it's already quite large. The code is mostly very simple and well tested. I think before adding anything more it would be best to review and merge what's here.
davisusanibar left a comment:
LGTM, thanks for all the documentation added on the methods/classes (+TODO).
@lidavidm Could you please give this a final check and merge? It was originally intended to include new code for mutable table support, but has been simplified to add only a small number of necessary methods to the existing classes.
import org.apache.arrow.vector.holders.NullableUInt8Holder;

/**
 * TODO: Modify the getters for Duration and others so that they return something better than ArrowBuf when possible
Did you mean to leave this TODO in here? (It should perhaps be linked to an issue?)
removed obsolete TODOs
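For readers following this thread, the idea behind the TODO (returning something friendlier than a raw buffer for Duration values) can be illustrated roughly as below. This is a minimal sketch under assumptions, not the code from this PR: it assumes a duration value is available as a 64-bit count plus a time unit, and the class `DurationGetterSketch` and method `toDuration` are hypothetical names that are not part of the Arrow API.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; none of these names come from Arrow.
public class DurationGetterSketch {

    // Assumption: the stored value is a signed 64-bit count of `unit`.
    // Converts that raw count into a java.time.Duration.
    static Duration toDuration(long rawValue, ChronoUnit unit) {
        return Duration.of(rawValue, unit);
    }

    public static void main(String[] args) {
        // 1500 milliseconds expressed as a Duration object
        Duration d = toDuration(1500L, ChronoUnit.MILLIS);
        System.out.println(d);
    }
}
```

A caller would then work with a typed `Duration` rather than decoding bytes from a buffer itself.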
Benchmark runs are scheduled for baseline = 3b0e135 and contender = 2e9611a. 2e9611a is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
This PR bumps Apache Arrow version from 10.0.0 to 11.0.0. Main changes related to PyAmber:

## Java/Scala side:
- Distribute Apple M1 compatible JNI libraries via Maven Central ([#14472](apache/arrow#14472)).
- Improve performance by short-circuiting null checks when comparing non-null field types ([#15106](apache/arrow#15106)).
- Extend Table copy functionality, and support returning copies of individual vectors ([#14389](apache/arrow#14389)).
- Several enhancements to dictionary encoding ([#14891](apache/arrow#14891), [#14902](apache/arrow#14902), [#14874](apache/arrow#14874)).
- Extend Table to support additional vector types ([#14573](apache/arrow#14573)).
- Enhance and simplify handling of allocation management by integrating C Data into allocator hierarchy ([#14506](apache/arrow#14506)).

## Python side:
- PyArrow now requires pandas >= 1.0 ([ARROW-18173](https://issues.apache.org/jira/browse/ARROW-18173)).
- Added support for the [DataFrame Interchange Protocol](https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html) for pyarrow.Table ([GH-33346](apache/arrow#33346)).
- Support for custom metadata of record batches in the IPC read and write APIs ([ARROW-16430](https://issues.apache.org/jira/browse/ARROW-16430)).
- The Time32Scalar, Time64Scalar, Date32Scalar and Date64Scalar classes got a .value attribute to access the underlying integer value, similar to the other date-time related scalars ([ARROW-18264](https://issues.apache.org/jira/browse/ARROW-18264)).
- Casting to string is now supported for duration ([ARROW-15822](https://issues.apache.org/jira/browse/ARROW-15822)) and decimal ([ARROW-17458](https://issues.apache.org/jira/browse/ARROW-17458)) types, which also means those can now be written to CSV.

## Issues fixed:
- Do_action (from the Python server back to the Java client) now returns a stream of results properly, and alerts when the results are not fully consumed by the client. These results will be used to send flow control credits back from the Python side. For now we limit the results to exactly 1, although it can be a stream.
- Fix a bug in the Python proxy server: when an unregistered action is invoked, it should not parse and return the results.
Initial merge of MutableTable / MutableRow plus tests. Authored-by: Larry White <[email protected]> Signed-off-by: David Li <[email protected]>