copy the dtypes module to the namedarray package. #8250
Merged
18 commits
6cce0ea move dtypes module to namedarray (andersy005)
2e5d18c keep original dtypes (andersy005)
5fda4cf Merge branch 'main' into move-dtypes-to-namedarray (andersy005)
54998d6 revert utils changes (andersy005)
911ea92 Update xarray/namedarray/dtypes.py (andersy005)
a619861 Merge branch 'main' into move-dtypes-to-namedarray (andersy005)
c45057a Apply suggestions from code review (andersy005)
f0a65ed [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
ffe4a44 fix missing imports (andersy005)
f7bbbfb update typing (andersy005)
a9b3420 fix return types (andersy005)
823369c Merge branch 'main' into move-dtypes-to-namedarray (andersy005)
f54ee4b Merge branch 'main' into move-dtypes-to-namedarray (andersy005)
6848022 Merge branch 'main' into move-dtypes-to-namedarray (dcherian)
a719a70 type fixes (Illviljan)
432227a [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
e2de9ce type fixes (Illviljan)
f5a74c0 Merge branch 'move-dtypes-to-namedarray' of https://github.com/anders… (Illviljan)
@@ -0,0 +1,199 @@
from __future__ import annotations

import functools
import sys
from typing import Any, Literal

if sys.version_info >= (3, 10):
    from typing import TypeGuard
else:
    from typing_extensions import TypeGuard

import numpy as np

from xarray.namedarray import utils

# Use as a sentinel value to indicate a dtype appropriate NA value.
NA = utils.ReprObject("<NA>")


@functools.total_ordering
class AlwaysGreaterThan:
    def __gt__(self, other: Any) -> Literal[True]:
        return True

    def __eq__(self, other: Any) -> bool:
        return isinstance(other, type(self))


@functools.total_ordering
class AlwaysLessThan:
    def __lt__(self, other: Any) -> Literal[True]:
        return True

    def __eq__(self, other: Any) -> bool:
        return isinstance(other, type(self))


# Equivalence to np.inf (-np.inf) for object-type
INF = AlwaysGreaterThan()
NINF = AlwaysLessThan()


# Pairs of types that, if both found, should be promoted to object dtype
# instead of following NumPy's own type-promotion rules. These type promotion
# rules match pandas instead. For reference, see the NumPy type hierarchy:
# https://numpy.org/doc/stable/reference/arrays.scalars.html
PROMOTE_TO_OBJECT: tuple[tuple[type[np.generic], type[np.generic]], ...] = (
    (np.number, np.character),  # numpy promotes to character
    (np.bool_, np.character),  # numpy promotes to character
    (np.bytes_, np.str_),  # numpy promotes to unicode
)


def maybe_promote(dtype: np.dtype[np.generic]) -> tuple[np.dtype[np.generic], Any]:
    """Simpler equivalent of pandas.core.common._maybe_promote

    Parameters
    ----------
    dtype : np.dtype

    Returns
    -------
    dtype : Promoted dtype that can hold missing values.
    fill_value : Valid missing value for the promoted dtype.
    """
    # N.B. these casting rules should match pandas
    dtype_: np.typing.DTypeLike
    fill_value: Any
    if np.issubdtype(dtype, np.floating):
        dtype_ = dtype
        fill_value = np.nan
    elif np.issubdtype(dtype, np.timedelta64):
        # See https://github.com/numpy/numpy/issues/10685
        # np.timedelta64 is a subclass of np.integer
        # Check np.timedelta64 before np.integer
        fill_value = np.timedelta64("NaT")
        dtype_ = dtype
    elif np.issubdtype(dtype, np.integer):
        dtype_ = np.float32 if dtype.itemsize <= 2 else np.float64
        fill_value = np.nan
    elif np.issubdtype(dtype, np.complexfloating):
        dtype_ = dtype
        fill_value = np.nan + np.nan * 1j
    elif np.issubdtype(dtype, np.datetime64):
        dtype_ = dtype
        fill_value = np.datetime64("NaT")
    else:
        dtype_ = object
        fill_value = np.nan

    dtype_out = np.dtype(dtype_)
    fill_value = dtype_out.type(fill_value)
    return dtype_out, fill_value


NAT_TYPES = {np.datetime64("NaT").dtype, np.timedelta64("NaT").dtype}


def get_fill_value(dtype: np.dtype[np.generic]) -> Any:
    """Return an appropriate fill value for this dtype.

    Parameters
    ----------
    dtype : np.dtype

    Returns
    -------
    fill_value : Missing value corresponding to this dtype.
    """
    _, fill_value = maybe_promote(dtype)
    return fill_value


def get_pos_infinity(
    dtype: np.dtype[np.generic], max_for_int: bool = False
) -> float | complex | AlwaysGreaterThan:
    """Return an appropriate positive infinity for this dtype.

    Parameters
    ----------
    dtype : np.dtype
    max_for_int : bool
        Return np.iinfo(dtype).max instead of np.inf

    Returns
    -------
    fill_value : positive infinity value corresponding to this dtype.
    """
    if issubclass(dtype.type, np.floating):
        return np.inf

    if issubclass(dtype.type, np.integer):
        return np.iinfo(dtype.type).max if max_for_int else np.inf
    if issubclass(dtype.type, np.complexfloating):
        return np.inf + 1j * np.inf

    return INF


def get_neg_infinity(
    dtype: np.dtype[np.generic], min_for_int: bool = False
) -> float | complex | AlwaysLessThan:
    """Return an appropriate negative infinity for this dtype.

    Parameters
    ----------
    dtype : np.dtype
    min_for_int : bool
        Return np.iinfo(dtype).min instead of -np.inf

    Returns
    -------
    fill_value : negative infinity value corresponding to this dtype.
    """
    if issubclass(dtype.type, np.floating):
        return -np.inf

    if issubclass(dtype.type, np.integer):
        return np.iinfo(dtype.type).min if min_for_int else -np.inf
    if issubclass(dtype.type, np.complexfloating):
        return -np.inf - 1j * np.inf

    return NINF


def is_datetime_like(
    dtype: np.dtype[np.generic],
) -> TypeGuard[np.datetime64 | np.timedelta64]:
    """Check if a dtype is a subclass of the numpy datetime types"""
    return np.issubdtype(dtype, np.datetime64) or np.issubdtype(dtype, np.timedelta64)


def result_type(
    *arrays_and_dtypes: np.typing.ArrayLike | np.typing.DTypeLike,
) -> np.dtype[np.generic]:
    """Like np.result_type, but with type promotion rules matching pandas.

    Examples of changed behavior:
    number + string -> object (not string)
    bytes + unicode -> object (not unicode)

    Parameters
    ----------
    *arrays_and_dtypes : list of arrays and dtypes
        The dtype is extracted from both numpy and dask arrays.

    Returns
    -------
    numpy.dtype for the result.
    """
    types = {np.result_type(t).type for t in arrays_and_dtypes}

    for left, right in PROMOTE_TO_OBJECT:
        if any(issubclass(t, left) for t in types) and any(
            issubclass(t, right) for t in types
        ):
            return np.dtype(object)

    return np.result_type(*arrays_and_dtypes)
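The INF and NINF sentinels in the diff stand in for np.inf and -np.inf when the values being compared are arbitrary Python objects. The sketch below is illustrative only: it re-declares the two classes from the diff so that it runs without xarray or NumPy, and the sample strings are made up for the demonstration.

```python
from __future__ import annotations

import functools
from typing import Any, Literal


@functools.total_ordering
class AlwaysGreaterThan:
    # Compares greater than any other object; equal only to its own kind.
    def __gt__(self, other: Any) -> Literal[True]:
        return True

    def __eq__(self, other: Any) -> bool:
        return isinstance(other, type(self))


@functools.total_ordering
class AlwaysLessThan:
    # Compares less than any other object; equal only to its own kind.
    def __lt__(self, other: Any) -> Literal[True]:
        return True

    def __eq__(self, other: Any) -> bool:
        return isinstance(other, type(self))


INF = AlwaysGreaterThan()
NINF = AlwaysLessThan()

# Hypothetical object-typed values mixed with the two sentinels.
values = ["pear", INF, "apple", NINF]

ordered = sorted(values)  # NINF sorts first, INF sorts last
largest = max(values)     # the INF sentinel
smallest = min(values)    # the NINF sentinel
```

Because str does not know how to compare itself with these classes, Python falls back to the reflected method on the sentinel, which is why `"pear" < INF` also evaluates to True.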
This one is not considered np.typing.ArrayLike, therefore I don't like it. Is it possible to avoid it?
Example error from #8211:
I don't see why the constructor has a default value of "NA"