Description
From 2021-06-09 dev call, new discussion of #22864
Problem: Frequency string 'D' and pd.offsets.Day are defined to be a fixed 24-hour period, since Day is a subclass of Tick.
In the context of timezones with a DST crossing, however, 'D' acts as a calendar day (23/24/25H) for the following operations:
- pd.date_range(start, end, freq="D")
- df.resample("D")...
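The mismatch described above can be reproduced with a short sketch. The timezone (America/Los_Angeles), the dates around the 2021-03-14 spring-forward transition, and all variable names are illustrative choices, not taken from the discussion.

```python
import pandas as pd

# Illustrative timezone: DST began here on 2021-03-14.
TZ = "America/Los_Angeles"

# freq="D" lands on local midnight each day, so the elapsed time
# between consecutive points is not always 24 hours.
days = pd.date_range("2021-03-13", periods=3, freq="D", tz=TZ)
gaps = days[1:] - days[:-1]  # elapsed time between consecutive midnights

# Stepping by a fixed 24-hour Timedelta instead drifts off local midnight
# once the transition has been crossed.
fixed = [days[0] + pd.Timedelta(hours=24 * i) for i in range(3)]

# resample("D") likewise bins by calendar day: the transition day
# holds one fewer hourly observation than its neighbours.
hourly = pd.Series(
    1,
    index=pd.date_range(
        "2021-03-13", "2021-03-15 23:00", freq=pd.Timedelta(hours=1), tz=TZ
    ),
)
per_day = hourly.resample("D").size()

print(gaps)
print(fixed[-1])
print(per_day)
```

Here `gaps` mixes 24-hour and 23-hour steps, while the fixed 24-hour stepping ends at 01:00 local time rather than midnight.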
Original Settled Solution: Deprecate all behavior where frequency string 'D' and pd.offsets.Day are treated as a fixed 24 hours, in favor of "24H". A private _Day offset would be used internally where appropriate and swapped out once the deprecation is enforced.
(Note: I lost steam last time catching all the warnings issued in the testing suite given the above solution touches datetimes, timedeltas, offsets, methods, etc)
Other Solutions
- Implement a new offset, e.g. "DayDST"/pd.offsets.CalendarDay, that users can migrate to.
- Deprecate Ticks (Day is a subclass) altogether since they are redundant with Timedeltas
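As a rough illustration of the redundancy mentioned in the last option, a Tick offset such as pd.offsets.Hour converts directly to a Timedelta and gives the same result in arithmetic. The timestamp and variable names below are arbitrary examples.

```python
import pandas as pd

tick = pd.offsets.Hour(2)       # a Tick offset
delta = pd.Timedelta(hours=2)   # the equivalent Timedelta

# A Tick can be passed straight to the Timedelta constructor.
as_timedelta = pd.Timedelta(tick)

# Adding either one to a timestamp gives the same result.
ts = pd.Timestamp("2021-06-09 12:00")
via_tick = ts + tick
via_delta = ts + delta

print(as_timedelta == delta, via_tick == via_delta)
```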
cc @pandas-dev/pandas-core
(@jbrockmendel) Updating with checkboxes to keep track of issues I think this would resolve: