GH-39533: [Python] NumPy 2.0 compat: remove usage of np.core #39535
@github-actions crossbow submit pandas
Revision: 4e403be Submitted crossbow builds: ursacomputing/crossbow @ actions-fbef62db1c
python/pyarrow/pandas_compat.py
Outdated
['object', 'bool'])
"int8", "int16", "int32", "int64",
"uint8", "uint16", "uint32", "uint64",
"float16", "float32", "float64", "float128",
Do we actually support conversion to/from float128?
That's a good point ;) No, I don't think we even have float128 in the arrow spec, right? I just hardcoded the current dynamic content, but that can indeed be removed.
On second thought: those types are not arrow types, but the numpy dtype stored in the pandas metadata, i.e. the original dtype of the pandas DataFrame column that was converted to a pyarrow.Table.
So in theory you can have a pandas DataFrame with a float128 column and get that in the metadata (in which case including it in the list above is fine). In practice this is currently not possible: we haven't implemented the conversion of numpy float128 to a pyarrow float array, so the conversion of such a DataFrame fails.
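To make the distinction concrete, here is a minimal sketch of how a column's original numpy dtype name appears in the pandas metadata. The JSON below is hand-written and trimmed for illustration (real metadata carries more fields), and the helper name `numpy_types_by_column` is hypothetical, not part of pyarrow.

```python
import json

# Hand-written, trimmed stand-in for the JSON pandas metadata attached to a table.
# Each column entry records the dtype of the original DataFrame column as a string.
PANDAS_METADATA = json.dumps({
    "columns": [
        {"name": "a", "field_name": "a", "pandas_type": "int64",
         "numpy_type": "int64", "metadata": None},
        {"name": "b", "field_name": "b", "pandas_type": "float64",
         "numpy_type": "float64", "metadata": None},
    ],
})


def numpy_types_by_column(raw_metadata):
    """Map each column name to the numpy dtype name stored in the metadata."""
    meta = json.loads(raw_metadata)
    return {col["name"]: col["numpy_type"] for col in meta["columns"]}


print(numpy_types_by_column(PANDAS_METADATA))
```

The point is that `numpy_type` is a plain string describing the source column, so a name can be listed there even when Arrow itself has no matching type.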
### Rationale for this change

Removing usage of `np.core`, as that is deprecated and will be removed in numpy 2.0.

For this specific case, we can just hardcode the list of data types instead of using a numpy api (this list doesn't typically change).

* Closes: #39533

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 72ed584. There were 5 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Removing usage of `np.core`, as that is deprecated and will be removed in numpy 2.0.
For this specific case, we can just hardcode the list of data types instead of using a numpy api (this list doesn't typically change).
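A rough sketch of the hardcoding approach, assuming the dtype names are compared as plain strings. The names `SUPPORTED_NUMPY_TYPE_NAMES` and `is_supported_numpy_type` are hypothetical and not taken from pyarrow; the entries mirror the diff excerpt under review, including `float128`, whose inclusion is discussed in the review comments.

```python
# Hardcoded dtype names: no numpy import is needed to build this set,
# so it does not depend on np.core or any other numpy API.
SUPPORTED_NUMPY_TYPE_NAMES = frozenset({
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
    "float16", "float32", "float64", "float128",
    "object", "bool",
})


def is_supported_numpy_type(name):
    """Return True if the dtype name is in the hardcoded set."""
    return name in SUPPORTED_NUMPY_TYPE_NAMES


print(is_supported_numpy_type("int32"))
print(is_supported_numpy_type("complex128"))
```

The trade-off is that the set must be updated by hand if numpy ever adds a relevant dtype, which the rationale above considers unlikely.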