Merged
Changes from 6 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -682,6 +682,7 @@ Other
- Bug in :class:`DataFrame` and :class:`Series` raising for data of complex dtype when ``NaN`` values are present (:issue:`53627`)
- Bug in :class:`DatetimeIndex` where ``repr`` of index passed with time does not print time is midnight and non-day based freq(:issue:`53470`)
- Bug in :class:`FloatingArray.__contains__` with ``NaN`` item incorrectly returning ``False`` when ``NaN`` values are present (:issue:`52840`)
- Bug in :func:`api.interchange.from_dataframe` was raising during interchanging from non-pandas tz-aware data containing ``NaN`` values (:issue:`54287`)
Member

Can we write

non-pandas tz-aware data containing null values

instead? Technically, NaN is different: it specifically refers to the result of invalid mathematical operations such as 0/0.

Contributor Author


oops, my mistake. I corrected that.
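
The distinction raised in this thread, NaN versus a null (missing) value, can be illustrated with plain Python. This is only a standard-library sketch, not pandas behavior: `None` stands in for a null value, and the variable names are illustrative.

```python
import math

# NaN is a floating-point value produced by an invalid operation.
# Note: in plain Python, 0.0 / 0.0 raises ZeroDivisionError, so inf - inf is used here.
nan = float("inf") - float("inf")

# A null value means "no data at all"; None plays that role in this sketch.
missing = None

print(math.isnan(nan))   # NaN is still a float
print(nan == nan)        # NaN never compares equal, not even to itself
print(missing is None)   # a null is checked by identity, not by arithmetic
```

A tz-aware timestamp column has no floating-point NaN at all, which is why "null values" is the more accurate wording for the changelog entry.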

- Bug in :func:`api.interchange.from_dataframe` when converting an empty DataFrame object (:issue:`53155`)
- Bug in :func:`assert_almost_equal` now throwing assertion error for two unequal sets (:issue:`51727`)
- Bug in :func:`assert_frame_equal` checks category dtypes even when asked not to check index type (:issue:`52126`)
6 changes: 6 additions & 0 deletions pandas/core/interchange/from_dataframe.py
@@ -7,6 +7,7 @@
import numpy as np

from pandas.compat._optional import import_optional_dependency
from pandas.errors import SettingWithCopyError

import pandas as pd
from pandas.core.interchange.dataframe_protocol import (
@@ -513,5 +514,10 @@ def set_nulls(
        # cast the `data` to nullable float dtype.
        data = data.astype(float)
        data[null_pos] = None
    except SettingWithCopyError:
        # SettingWithCopyError happens if the `data` appears
        # to have 'NaT'. If this happens, copy the `data`.
Member

"appears to" sounds a bit mysterious :) how about

`SettingWithCopyError` may happen for datetime-like with missing values

Contributor Author


thank you, I changed the wording as you suggested.

        data = data.copy()
        data[null_pos] = None

    return data
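
The fallback pattern in `set_nulls` (try the assignment in place, and on a specific exception convert the data and retry) can be sketched without pandas. The following is a standard-library analogy only, not the pandas implementation: `set_nulls_sketch` is a hypothetical name, `array.array` stands in for a NumPy-backed column, and NaN stands in for the missing-value marker. It mirrors the existing `TypeError` branch; the `SettingWithCopyError` branch added in this PR appears to follow the same shape, with `data.copy()` in place of the cast.

```python
import math
from array import array


def set_nulls_sketch(data, null_pos):
    """Mark the positions in `null_pos` as missing, widening storage if needed."""
    try:
        for i in null_pos:
            data[i] = math.nan
    except TypeError:
        # An integer-typed array cannot hold NaN, so build a
        # double-typed copy and retry the assignment on it.
        data = array("d", data)
        for i in null_pos:
            data[i] = math.nan
    return data


ints = array("q", [1, 2, 3])
result = set_nulls_sketch(ints, [1])
print(result.typecode, list(ints))
```

Because the retry operates on a new object, the caller's original array is left untouched, which is also the effect of the `data.copy()` fallback.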
19 changes: 19 additions & 0 deletions pandas/tests/interchange/test_impl.py
@@ -295,3 +295,22 @@ def test_datetimetzdtype(tz, unit):
    )
    df = pd.DataFrame({"ts_tz": tz_data})
    tm.assert_frame_equal(df, from_dataframe(df.__dataframe__()))


def test_interchange_from_non_pandas_tz_aware():
    # GH 54239, 54287
    pa = pytest.importorskip("pyarrow")
Member

we may need to set a minimum version here

Suggested change
- pa = pytest.importorskip("pyarrow")
+ pa = pytest.importorskip("pyarrow", "11.0.0")

Contributor Author


thank you, I added the minimum version here
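
As a rough illustration of what a minimum-version gate like the one suggested above checks, here is a standard-library sketch. It is not pytest's implementation: `parse_version` and `meets_minimum` are hypothetical helpers, the version parsing is deliberately naive (leading digits of each dot-separated piece), and `fake_arrow` is a stand-in module created only for the demonstration.

```python
import importlib
import re
import sys
import types


def parse_version(text):
    """Turn '11.0.0' into (11, 0, 0); naive, keeps only leading digits per piece."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def meets_minimum(modname, minversion):
    """Return True if `modname` imports and its __version__ is at least `minversion`."""
    try:
        module = importlib.import_module(modname)
    except ImportError:
        return False
    found = getattr(module, "__version__", None)
    if found is None:
        return False
    return parse_version(found) >= parse_version(minversion)


# Register a fake module so the sketch runs without any third-party package.
fake = types.ModuleType("fake_arrow")
fake.__version__ = "10.0.1"
sys.modules["fake_arrow"] = fake

print(meets_minimum("fake_arrow", "11.0.0"))
print(meets_minimum("fake_arrow", "10.0.0"))
```

In a test suite, a failed check of this kind leads to the test being skipped rather than failing, which is the behavior `pytest.importorskip` provides.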

    import pyarrow.compute as pc

    arr = pa.array([datetime(2020, 1, 1), None, datetime(2020, 1, 2)])
    arr = pc.assume_timezone(arr, "Asia/Kathmandu")
    table = pa.table({"arr": arr})
    exchange_df = table.__dataframe__()
    result = from_dataframe(exchange_df)

    expected = pd.DataFrame(
        ["2020-01-01 00:00:00+05:45", "NaT", "2020-01-02 00:00:00+05:45"],
        columns=["arr"],
        dtype="datetime64[us, Asia/Kathmandu]",
    )
    tm.assert_frame_equal(expected, result)
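
The expected strings in this test follow from how the naive timestamps are interpreted: `assume_timezone` treats each one as local wall-clock time in the given zone, so midnight stays midnight and only the offset is attached. The sketch below reproduces that with the standard library alone; it assumes a fixed `+05:45` offset as a stand-in for the `Asia/Kathmandu` zone and does not use pyarrow or pandas.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +05:45 offset stands in for the "Asia/Kathmandu" zone.
kathmandu = timezone(timedelta(hours=5, minutes=45))

naive = datetime(2020, 1, 1)
local = naive.replace(tzinfo=kathmandu)   # keep the wall-clock time, attach the zone
as_utc = local.astimezone(timezone.utc)   # the same instant expressed in UTC

local_text = local.isoformat(sep=" ")
utc_text = as_utc.isoformat(sep=" ")
print(local_text)  # 2020-01-01 00:00:00+05:45
print(utc_text)    # 2019-12-31 18:15:00+00:00
```

The `None` entry in the test has no instant to convert, so it surfaces as `NaT` in the resulting column.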