-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-43454: [C++][Python] Add Opaque canonical extension type #43458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying my review comments from #41823 (review)
For the Python bindings, looks good as well, just a few minor comments:
- We probably want to add
OpaqueScalartopyarrow/__init__.pyas well? (although I see we forgot this for the tensor type, which I assume you have been imitating) - Can you add the three new classes to
test_extension_type_constructor_errorsintest_misc.py? - Can you list the new type in the python API docs (docs/source/python/api/datatypes.rst, docs/source/python/api/arrays.rst
)? See eg #39652 for a previous PR adding a new type where this is added to the docs.
|
I believe I've fixed everything here, thanks for the reviews! |
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't looked in detail at the C++ side again, but the Python changes look good!
|
Any further comments here? |
None for me. I based some of my work on Bool8 on the implementation here, so ideally we could get another C++ reviewer to approve as a third party. |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, a couple minor comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: checked_cast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps also compare with an extension type that is not Opaque?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Co-authored-by: Weston Pace <[email protected]>
|
I'm going to merge this in a bit given the radio silence |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 6e7125b. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 38 possible false positives for unstable benchmarks that are known to sometimes produce them. |
### Rationale for this change C++ and Python implementations of #43234 ### What changes are included in this PR? - Implement C++ `Bool8Type`, `Bool8Array`, `Bool8Scalar`, and tests - Implement Python bindings to C++, as well as zero-copy numpy conversion methods - TODO: docs waiting for rebase on #43458 ### Are these changes tested? Yes ### Are there any user-facing changes? Bool8 extension type will be available in C++ and Python libraries * GitHub Issue: #17682 Authored-by: Joel Lubinitsky <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
|
Apologies for the late question. Is opaque type intentionally not registered? arrow/cpp/src/arrow/extension_type.cc Lines 145 to 157 in f078942
If it's an oversight I can add registration in #37298. |
|
It was an oversight, thanks for spotting it |
Let's do that as a separate PR (it should also require to update the tests slightly, and then it doesn't depend on the timeline of merging the other PR) |
|
Opened #43787 to keep track of it. |
…3788) This is to resolve #43787 > The Opaque extension type implementation for C++ (plus python bindings) was added in #43458, but it was not registered by default, which we should do for canonical extension types (see #43458 (comment)) Additionally, this adds `bool8` extension type builds with `ARROW_JSON=false` as discussed [here](5258819#r145613657) ### Rationale for this change Canonical types should be registered by default if possible (except e.g. if they can't be compiled due to `ARROW_JSON=false`). ### What changes are included in this PR? This adds default registration for `opaque`, changes when `bool8` is built and moves all canonical tests under the same test target. ### Are these changes tested? Changes are tested by previously existing tests. ### Are there any user-facing changes? `opaue` will now be registered by default and `bool8` will be present in case `ARROW_JSON=false` at build time. * GitHub Issue: #43787 Authored-by: Rok Mihevc <[email protected]> Signed-off-by: Rok Mihevc <[email protected]>
…3788) This is to resolve #43787 > The Opaque extension type implementation for C++ (plus python bindings) was added in apache/arrow#43458, but it was not registered by default, which we should do for canonical extension types (see apache/arrow#43458 (comment)) Additionally, this adds `bool8` extension type builds with `ARROW_JSON=false` as discussed [here](apache/arrow@441ecb0#r145613657) ### Rationale for this change Canonical types should be registered by default if possible (except e.g. if they can't be compiled due to `ARROW_JSON=false`). ### What changes are included in this PR? This adds default registration for `opaque`, changes when `bool8` is built and moves all canonical tests under the same test target. ### Are these changes tested? Changes are tested by previously existing tests. ### Are there any user-facing changes? `opaue` will now be registered by default and `bool8` will be present in case `ARROW_JSON=false` at build time. * GitHub Issue: #43787 Authored-by: Rok Mihevc <[email protected]> Signed-off-by: Rok Mihevc <[email protected]>
Rationale for this change
Add the newly ratified extension type.
What changes are included in this PR?
The C++/Python implementation only.
Are these changes tested?
Yes
Are there any user-facing changes?
No.