Description
Describe the bug
This is a follow-on issue to #825. It looks like the window function was fixed for nulls, except in the UNBOUNDED case.
Steps/Code to reproduce bug
test_window_aggs_for_ranges currently fails consistently against cudf-0.17. I will update the test with an xfail for those situations, pointing to this issue.
Expected behavior
This test should pass.
It looks like we are not handling unbounded properly, so in the short term we will have to disable nullable timestamp columns with unbounded preceding or following intervals.
I simplified the query to make debugging easier:
select
count(c) over
(partition by a order by cast(b as timestamp) asc
range between CURRENT ROW and UNBOUNDED following) as count_c_asc, a, b, c
from window_agg_table order by a, b, c
I dropped the length of the generated data to 100 and I see results like the following (emphasis added):
CPU:
Row(count_c_asc=*5*, a=-5831592707909023540, b=None, c=756780896),
Row(count_c_asc=4, a=-5831592707909023540, b=datetime.date(2020, 2, 29), c=-656902282),
Row(count_c_asc=3, a=-5831592707909023540, b=datetime.date(2020, 3, 18), c=-756294971),
Row(count_c_asc=2, a=-5831592707909023540, b=datetime.date(2020, 10, 10), c=2117211837),
Row(count_c_asc=1, a=-5831592707909023540, b=datetime.date(2020, 12, 25), c=110877650),
GPU:
Row(count_c_asc=*1*, a=-5831592707909023540, b=None, c=756780896),
Row(count_c_asc=4, a=-5831592707909023540, b=datetime.date(2020, 2, 29), c=-656902282),
Row(count_c_asc=3, a=-5831592707909023540, b=datetime.date(2020, 3, 18), c=-756294971),
Row(count_c_asc=2, a=-5831592707909023540, b=datetime.date(2020, 10, 10), c=2117211837),
Row(count_c_asc=1, a=-5831592707909023540, b=datetime.date(2020, 12, 25), c=110877650),
The GPU side appears to be doing what we talked about, but the CPU side does not appear to treat the null in the timestamp column specially. It looks like UNBOUNDED really means unbounded: when I switch the following bound to INTERVAL 1000 DAYS, the GPU matches the CPU results.
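To make the two behaviors concrete, here is a rough model in plain Python. It is not Spark or cudf code: the function name count_current_to_unbounded_following and the isolate_nulls flag are hypothetical, and it assumes a single partition sorted ascending with nulls first (Spark's default for ASC). With isolate_nulls=False the frame for the null row runs to the end of the partition, which reproduces the CPU counts above; with isolate_nulls=True the null rows form their own group, which reproduces the GPU counts.

```python
import datetime


def count_current_to_unbounded_following(rows, isolate_nulls=False):
    """Model count(c) over (order by b asc range between CURRENT ROW and
    UNBOUNDED FOLLOWING) for one partition of (b, c) rows.

    isolate_nulls is a hypothetical switch: when True, rows with a null
    order key only see other null-key rows (the GPU result in this issue).
    """
    def sort_key(row):
        b = row[0]
        # Nulls sort first; the placeholder date is never compared to a real one.
        return (b is not None, b if b is not None else datetime.date.min)

    ordered = sorted(rows, key=sort_key)
    result = []
    for i, (b, c) in enumerate(ordered):
        if b is None and isolate_nulls:
            # Null keys are treated as a closed group of their own.
            frame = [r for r in ordered if r[0] is None]
        else:
            # The frame starts at the first peer of the current row
            # and extends to the end of the partition.
            start = i
            while start > 0 and ordered[start - 1][0] == b:
                start -= 1
            frame = ordered[start:]
        # count(c) counts only non-null values of c inside the frame.
        result.append((sum(1 for _, cv in frame if cv is not None), b, c))
    return result


# The single partition shown in the output above (a is omitted).
rows = [
    (datetime.date(2020, 3, 18), -756294971),
    (None, 756780896),
    (datetime.date(2020, 12, 25), 110877650),
    (datetime.date(2020, 2, 29), -656902282),
    (datetime.date(2020, 10, 10), 2117211837),
]

cpu_like = count_current_to_unbounded_following(rows)
gpu_like = count_current_to_unbounded_following(rows, isolate_nulls=True)
print([r[0] for r in cpu_like])
print([r[0] for r in gpu_like])
```

The only row that differs between the two modes is the one with the null order key, which is the row emphasized in the CPU and GPU output.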