Commit f91b73c

itholic authored and HyukjinKwon committed
Make is_monotonic~ work properly for index (#930)
Resolves #931. (The description below is the same as the description of that issue.)

When we use `is_monotonic_decreasing` or `is_monotonic_increasing` on an index, pandas behaves as follows:

```python
>>> s = pd.Series([7, 6, 5, 4, 3, 2, 1], index=[7, 6, 5, 4, 3, 2, 1])
>>> s.index.is_monotonic_decreasing
True
```

Since the index is monotonically decreasing in its stored order, pandas returns `True`.

In our case, however:

```python
>>> s = ks.Series([7, 6, 5, 4, 3, 2, 1], index=[7, 6, 5, 4, 3, 2, 1])
>>> s.index.is_monotonic_decreasing
False
```

As seen above, it returns `False`, because our existing logic always orders the rows by the index column before checking whether each row is increasing or decreasing relative to the previous one. As a result, the check is only meaningful for a data column, not for the index column: after the `orderBy`, the index column is always sorted.

So I think it is better to fix this logic so that it follows the pandas behavior for an index.
1 parent 345c7da commit f91b73c

1 file changed, +17 -2 lines changed

databricks/koalas/base.py

@@ -27,6 +27,7 @@
 from pyspark import sql as spark
 from pyspark.sql import functions as F, Window
 from pyspark.sql.types import DoubleType, FloatType, LongType, StringType, TimestampType
+from pyspark.sql.functions import monotonically_increasing_id
 
 from databricks import koalas as ks  # For running doctests and reference resolution in PyCharm.
 from databricks.koalas.internal import _InternalFrame
@@ -299,9 +300,16 @@ def is_monotonic(self):
 
         >>> ser.rename("a").to_frame().set_index("a").index.is_monotonic
         True
+
+        >>> ser = ks.Series([5, 4, 3, 2, 1], index=[1, 2, 3, 4, 5])
+        >>> ser.is_monotonic
+        False
+
+        >>> ser.index.is_monotonic
+        True
         """
         col = self._scol
-        window = Window.orderBy(self._kdf._internal.index_scols).rowsBetween(-1, -1)
+        window = Window.orderBy(monotonically_increasing_id()).rowsBetween(-1, -1)
         return self._with_new_scol((col >= F.lag(col, 1).over(window)) & col.isNotNull()).all()
 
     is_monotonic_increasing = is_monotonic
@@ -343,9 +351,16 @@ def is_monotonic_decreasing(self):
 
         >>> ser.rename("a").to_frame().set_index("a").index.is_monotonic_decreasing
         True
+
+        >>> ser = ks.Series([5, 4, 3, 2, 1], index=[1, 2, 3, 4, 5])
+        >>> ser.is_monotonic_decreasing
+        True
+
+        >>> ser.index.is_monotonic_decreasing
+        False
         """
         col = self._scol
-        window = Window.orderBy(self._kdf._internal.index_scols).rowsBetween(-1, -1)
+        window = Window.orderBy(monotonically_increasing_id()).rowsBetween(-1, -1)
         return self._with_new_scol((col <= F.lag(col, 1).over(window)) & col.isNotNull()).all()
 
     def astype(self, dtype):
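
The effect of changing the window ordering can be illustrated without Spark. The following is only a rough pure-Python model, not the Koalas implementation: the `is_monotonic` helper, the `(position, index_label, value)` row layout, and the key functions are all hypothetical names introduced for illustration. It assumes the row position plays the role of `monotonically_increasing_id()` (that is, the ids follow the original row order), and it simply skips the first row, which has no previous value, rather than modelling Spark's null handling or the `col.isNotNull()` term.

```python
import operator

def is_monotonic(rows, order_key, column, compare):
    """Compare each value with the previous one after ordering the rows.

    Hypothetical stand-in for `compare(col, F.lag(col, 1).over(window))`
    followed by `.all()`; `order_key` models the `Window.orderBy(...)` argument.
    """
    ordered = sorted(rows, key=order_key)
    values = [column(row) for row in ordered]
    # Pair every value with its predecessor; the first row has none and is skipped.
    return all(compare(cur, prev) for prev, cur in zip(values, values[1:]))

# Rows of Series([7, 6, 5, 4, 3, 2, 1], index=[7, 6, 5, 4, 3, 2, 1])
# as (position, index_label, value) tuples.
labels = [7, 6, 5, 4, 3, 2, 1]
data = [7, 6, 5, 4, 3, 2, 1]
rows = [(pos, label, value) for pos, (label, value) in enumerate(zip(labels, data))]

def by_position(row):
    return row[0]

def by_index(row):
    return row[1]

# Old ordering: sort by the index column, then ask whether the index decreases.
old_result = is_monotonic(rows, by_index, by_index, operator.le)
# New ordering: keep the original row order, then ask the same question.
new_result = is_monotonic(rows, by_position, by_index, operator.le)

print(old_result, new_result)  # False True
```

Under these assumptions, ordering by the index makes the index look increasing no matter how it was stored, which reproduces the `False` reported in the commit message, while ordering by row position gives the pandas answer.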
