Make iris.pandas.as_data_frame() n-dimensional behaviour opt-in
#5059
lbdreyer
left a comment
Just a couple small tweaks and then this should be good to go.
Regarding your outstanding question
should I actually make all of iris.pandas sensitive to the iris.FUTURE.pandas_ndim switch?
I think it doesn't matter too much which we go for, as both are quite reasonable. It's more a question of how you document the future switch, as that will determine a user's expectations of its behaviour.
I am however leaning towards keeping the switch only applicable to as_data_frame. The only real benefit I can see for making it applicable to all is that you are saying to the user "by enabling this, you are opting into the new world of pandas integration" but I don't think a user would feel much benefit from this? By doing that we are effectively forcing them to upgrade. They may have the (admittedly unusual) scenario of wanting to use as_series but also the new as_data_frame. They could handle all this with context managers but maybe that's just more annoying to have to keep including in your code?
So overall, I don't really mind but I lean towards keeping it applicable to as_data_frame only as I don't think the other option benefits the user more.
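To make the trade-off concrete, here is a minimal sketch of the "switch applies to as_data_frame only" option. It is a toy stand-in, not Iris code: the Future class, the FUTURE instance and both converter functions below are hypothetical simplifications that return plain tuples rather than pandas objects, and the warning text is invented for illustration.

```python
import contextlib
import warnings


class Future:
    """Toy flag holder (hypothetical, not the real iris.Future)."""

    def __init__(self, pandas_ndim=False):
        self.pandas_ndim = pandas_ndim

    @contextlib.contextmanager
    def context(self, **flags):
        # Temporarily override flags, restoring the previous values on exit.
        previous = {name: getattr(self, name) for name in flags}
        for name, value in flags.items():
            setattr(self, name, value)
        try:
            yield
        finally:
            for name, value in previous.items():
                setattr(self, name, value)


FUTURE = Future()


def as_data_frame(values):
    """Hypothetical converter: the only function that consults the flag."""
    if FUTURE.pandas_ndim:
        return ("ndim", list(values))
    warnings.warn(
        "legacy behaviour; set FUTURE.pandas_ndim = True to opt in",
        FutureWarning,
    )
    return ("legacy", list(values))


def as_series(values):
    """Hypothetical legacy converter: ignores the flag entirely."""
    return ("series", list(values))


# Mixing old and new behaviour needs no extra juggling, because
# as_series() does not care whether the flag is set.
with FUTURE.context(pandas_ndim=True):
    new_style = as_data_frame([1, 2])
    old_style = as_series([1, 2])
```

If the switch instead governed every function, the as_series() call would have to sit outside the with block (or inside a second block that turns the flag off), which is the extra context-manager overhead mentioned above.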
lib/iris/tests/test_pandas.py
Outdated
        assert cube.data[0] == 99

    def test_copy_int64_false(self):
        cube = Cube(np.array([0, 1, 2, 3, 4], dtype=np.int32), long_name="foo")
-        cube = Cube(np.array([0, 1, 2, 3, 4], dtype=np.int32), long_name="foo")
+        cube = Cube(np.array([0, 1, 2, 3, 4], dtype=np.int64), long_name="foo")
Although you are restoring these tests, this test does look wrong and this looks like an easy fix so worth doing?
docs/src/whatsnew/latest.rst
Outdated
:func:`iris.pandas.as_data_frame`\'s conversion of :class:`~iris.cube.Cube`\s to
:class:`~pandas.DataFrame`\s. This includes better handling of multiple
:class:`~iris.cube.Cube` dimensions, auxiliary coordinates and attribute
information. **Note:** the improvements are opt-in, via :class:`iris.Future`.
Perhaps mention the full future flag name, iris.FUTURE.pandas_ndim, so a user doesn't have to go looking in the docs for it (like how we do here in the 3.3 release).
Sounds good, especially as that's the path of no further changes!
I can't fix the link-check failure without bringing in #5064. Should I do a cherry-pick, or should we just wait until the feature branch is merged into […]
I guess it depends how much more work you expect to do on this branch. If not much more (i.e. if you plan to merge the feature branch onto main right after this PR goes in), we can probably just leave it. If you plan to make more changes, a cherry-pick would be best.
* added link to the docs archive. * added whatsnew
Thanks @lbdreyer!
🚀 Pull Request
Description
I think the diff might render unhelpfully. Here's a summary of what I have done:
TL;DR: I have written barely any new code - I've moved some stuff around and added some info for the user.
- […] iris.pandas.as_data_frame()'s default behaviour from Iris main
- […] iris.FUTURE.pandas_ndim
- […] iris.pandas.as_series() from Iris main
- […] iris.pandas.as_data_frame()'s default behaviour from Iris main
- […] (TestAsDataFrameNDim)
- […] FUTURE switch into the code and the docstring
- […] FUTURE warning, and some more for deprecation warnings that I felt were missing
Outstanding question
Currently iris.FUTURE.pandas_ndim only controls the behaviour of iris.pandas.as_data_frame() - this is the only function that definitely needs controlling as it's the only one with old and new behaviour. But for consistent UX, should I actually make all of iris.pandas sensitive to the iris.FUTURE.pandas_ndim switch?
Here is what that would look like:
Columns: pandas_ndim == False, pandas_ndim == True
- as_cube(): […] Cubes (deprecated)
- as_series(): […] Cubes (deprecated)
- as_cubes(): […] Cubes
- as_data_frame(): […] Cubes (opt-in)
Consult Iris pull request check list