Description
leftover from #23623
- Signature for `.to_numpy()`: @jorisvandenbossche proposed `copy=True`, which I think is good. Beyond that, we may want to control the "fidelity" of the conversion. Should `Series[datetime64[ns, tz]].to_numpy()` be an ndarray of Timestamp objects or an ndarray of datetime64[ns] normalized to UTC (by default, and should we allow that to be controlled)? Can we hope for a set of keywords appropriate for all subtypes, or do we need to allow `kwargs`? Perhaps `to_numpy(copy=True, dtype=None)` will suffice?
- Make `.array` always an ExtensionArray (via @shoyer). This gives pandas a bit more freedom going forward, since the type of `.array` will be stable if / when we flip over to Arrow arrays by default. We'll just swap out the data backing the ExtensionArray. A generic "NumpyBackedExtensionArray" is pretty easy to write (I had one in cyberpandas). My main concern here is that it makes the statement "`.array` is the actual data stored in the Series / Index" falseish, but that's OK.
- Revert the breaking changes to `Series.values` for `period` and `interval` dtype data (cc @jschendel)? I think we should do this.

In [3]: sper = pd.Series(pd.period_range('2000', periods=4))

In [4]: sper.values  # on master this is the PeriodArray
Out[4]:
array([Period('2000-01-01', 'D'), Period('2000-01-02', 'D'),
       Period('2000-01-03', 'D'), Period('2000-01-04', 'D')], dtype=object)

In [5]: sper.array
Out[5]:
<PeriodArray>
['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04']
Length: 4, dtype: period[D]

In terms of LOC, it's a simple change:

@@ -1984,6 +1984,16 @@ class ExtensionBlock(NonConsolidatableMixIn, Block):
         return blocks, mask
+class ObjectValuesExtensionBlock(ExtensionBlock):
+    """Block for Interval / Period data.
+
+    Only needed for backwards compatibility to ensure that
+    Series[T].values is an ndarray of objects.
+    """
+    def external_values(self, dtype=None):
+        return self.values.astype(object)
+
+
 class NumericBlock(Block):
     __slots__ = ()
     is_numeric = True
@@ -3004,6 +3014,8 @@ def get_block_type(values, dtype=None):
     if is_categorical(values):
         cls = CategoricalBlock
+    elif is_interval_dtype(dtype) or is_period_dtype(dtype):
+        cls = ObjectValuesExtensionBlock

There are a couple of other places (like `Series._ndarray_values`) that assume "extension dtype means `.values` is an ExtensionArray", which I've surfaced on my DatetimeArray branch. We'll need to update those to use `.array` anyway.

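
To make the `.array` point concrete, here is a small sketch of code that relies on `.array` rather than `.values` when it wants the ExtensionArray. It assumes a pandas release that already ships `Series.array`; the variable names are only illustrative, and the exact wrapper class returned for NumPy-backed data is deliberately not checked, since that is the part the proposal leaves free to change.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

# Period data: .array is the PeriodArray backing the Series.
sper = pd.Series(pd.period_range("2000", periods=4))

# Plain integer data: under the ".array is always an ExtensionArray"
# proposal, this is also an ExtensionArray (a thin wrapper around the
# underlying ndarray), not a bare ndarray.
sint = pd.Series([1, 2, 3])

period_is_ea = isinstance(sper.array, ExtensionArray)
int_is_ea = isinstance(sint.array, ExtensionArray)
period_dtype_name = str(sper.array.dtype)
```

Code written this way does not need to care what `.values` returns for a given dtype.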
- `Series.to_numpy()` signature
- `Series.array` is always an EA
- Revert breaking changes to `Series.values` for Period / Interval (API: Revert breaking `.values` changes #24163)
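
For the `to_numpy()` signature question, the two "fidelity" levels for tz-aware data can be sketched with the `dtype` keyword alone. This is only an illustration, assuming a pandas release where `Series.to_numpy(dtype=...)` is available; the time zone and variable names are arbitrary choices, not part of the proposal.

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.date_range("2000-01-01", periods=2, tz="America/Chicago"))

# Object fidelity: every element remains a tz-aware Timestamp.
as_objects = s.to_numpy(dtype=object)

# datetime64[ns] fidelity: the time zone is dropped and the values are
# normalized to UTC.
as_utc = s.to_numpy(dtype="datetime64[ns]")

first_is_timestamp = isinstance(as_objects[0], pd.Timestamp)
first_keeps_tz = as_objects[0].tz is not None
```

Here midnight in Chicago on 2000-01-01 becomes 06:00 UTC in `as_utc`, while `as_objects` keeps the original wall time and zone on each `Timestamp`.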