Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unpin numpy #8061

Merged
merged 8 commits into from
Aug 17, 2023
Merged

unpin numpy #8061

merged 8 commits into from
Aug 17, 2023

Conversation

keewis
Copy link
Collaborator

@keewis keewis commented Aug 10, 2023

It seems in a previous PR I "temporarily" pinned numpy to get CI to pass, but then forgot to unpin later and merged it as-is. As a result, we have not been running the main CI with numpy>=1.24 ever since, even though now numpy=1.25 has been around for a while.

@github-actions github-actions bot added CI Continuous Integration tools dependencies Pull requests that update a dependency file labels Aug 10, 2023
@keewis
Copy link
Collaborator Author

keewis commented Aug 12, 2023

The doctests CI fails because unstacking a int array uses np.full_like to create a integer array but fills it with nan (the result of dtypes.get_fill_value(dtype) for int dtypes). Newer numpy versions complain about this by emitting a warning, which then gets converted to an error.

xarray/xarray/core/variable.py

Lines 1870 to 1880 in eceec5f

data = np.full_like(
self.data,
fill_value=fill_value,
shape=new_shape,
dtype=dtype,
)
# Indexer is a list of lists of locations. Each list is the locations
# on the new dimension. This is robust to the data being sparse; in that
# case the destinations will be NaN / zero.
data[(..., *indexer)] = reordered

In this particular case, the fill value itself will never appear in the unstacked array, so we could just use np.empty_like instead of np.full_like (not sure what to do for sparse, but maybe we don't need to do anything?).

Copy link
Collaborator

@headtr1ck headtr1ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These warnings were getting annoying, thanks!

@headtr1ck
Copy link
Collaborator

Unrelated, but noticed again:
Seeing that new Mypy versions create problems maybe we should pin the Mypy version and set up an updater action with a PR?

@keewis
Copy link
Collaborator Author

keewis commented Aug 13, 2023

Seeing that new Mypy versions create problems maybe we should pin the Mypy version

We can do that, but note that the only error that's new is

xarray/core/dataset.py:503: Definition of "__eq__" in base class "DatasetOpsMixin" is incompatible with definition in base class "Mapping"  [misc]

the other ones appear to be numpy:

xarray/tests/test_groupby.py:731: error: Argument 1 to "groupby" of "Dataset" has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Hashable | DataArray | IndexVariable"  [arg-type]
xarray/tests/test_groupby.py:731: note: Following member(s) of "ndarray[Any, dtype[signedinteger[Any]]]" have conflicts:
xarray/tests/test_groupby.py:731: note:     __hash__: expected "Callable[[], int]", got "None"
xarray/tests/test_coding_strings.py:36: error: Unused "type: ignore" comment  [unused-ignore]

Any idea how to fix those?

keewis added 2 commits August 13, 2023 18:47
this important because the fix in `numpy` that now means we *don't*
need it anymore has been around for less than 3 months (requires a
sufficiently new version of `mypy`).
@headtr1ck
Copy link
Collaborator

headtr1ck commented Aug 13, 2023

I don't think those are numpys fault.
We just have to add ndarray to groupbys argument Type list (I think there is already a DataArray Compatible definition somewhere, maybe that works).

Edit: nevermind, numpy Arrays are not supposed to work and mypy is actually correct here. Your ignore is correct. Somehow the older Mypy did not recognize this.

The Hashable None error usually disappears magically when you fix the other errors.

@headtr1ck
Copy link
Collaborator

We just have to add ndarray to groupbys argument Type list

There's a explicit check that raises a TypeError, so ignoring the (intentionally) wrong typing in the test seems appropriate? See ca245d6

Yes sry. Saw that too late. You are totally correct

@keewis
Copy link
Collaborator Author

keewis commented Aug 13, 2023

hah, yeah, I also didn't see your edit soon enough.

In any case, that leaves us with the __eq__ error that is new in mypy=1.5

@keewis
Copy link
Collaborator Author

keewis commented Aug 14, 2023

I barely know what I'm doing when it comes to typing, but from what I can tell the reason is a typeshed update in mypy combined with OVERRIDE_TYPESHED in xarray.util.generate_ops not having any effect Edit: maybe not, I totally missed the .pyi file

@keewis
Copy link
Collaborator Author

keewis commented Aug 16, 2023

once tests pass this should be ready for a final review. Edit: see above for the mypy failure

@keewis keewis added the plan to merge Final call for comments label Aug 17, 2023
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@keewis keewis merged commit ab096b0 into pydata:main Aug 17, 2023
@keewis keewis deleted the unpin-numpy branch August 17, 2023 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration tools dependencies Pull requests that update a dependency file plan to merge Final call for comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants