[SPARK-29110][SQL][TESTS] Port window.sql (Part 4) #26238
Signed-off-by: DylanGuedes <[email protected]>
Test build #112578 has finished for PR 26238 at commit
SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM (VALUES(1,1),(2,2),(3,(cast('nan' as int))),(4,3),(5,4)) t(a,b)
-- !query 38 schema
struct<a:int,b:int,sum(b) OVER (ORDER BY A ASC NULLS FIRST ROWS BETWEEN 1 PRECEDING AND CURRENT ROW):bigint>
@maropu I think it is related to Spark handling 'NaN' as the value zero in sums, such that:
0+1=1, 1+2=3, 2+'NaN'=2, 3+'NaN'=3, 4+3=7. Should I file a JIRA for that?
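The arithmetic above can be checked with a small simulation of the `ROWS BETWEEN 1 PRECEDING AND CURRENT ROW` frame. This is only a sketch in plain Python, not Spark code: the helper name `rolling_sum_rows` is hypothetical, and modeling `cast('nan' as int)` as a NULL that `SUM` skips is an assumption, not something confirmed in this thread. Treating the value as a literal 0 gives the same sums for this input.

```python
def rolling_sum_rows(values, preceding=1):
    """Sum each row together with up to `preceding` rows before it.

    None stands in for SQL NULL and is skipped, mirroring how SUM
    ignores NULL inputs. A frame with no non-NULL values yields None.
    """
    out = []
    for i in range(len(values)):
        frame = values[max(0, i - preceding): i + 1]
        non_null = [v for v in frame if v is not None]
        out.append(sum(non_null) if non_null else None)
    return out


# Column b from the test query, ordered by a.
# Assumption: cast('nan' as int) is modeled as None (NULL).
b_as_null = [1, 2, None, 3, 4]
# Alternative reading from the comment: 'NaN' behaves like 0.
b_as_zero = [1, 2, 0, 3, 4]

sums_null = rolling_sum_rows(b_as_null)
sums_zero = rolling_sum_rows(b_as_zero)
print(sums_null)  # [1, 3, 2, 3, 7]
print(sums_zero)  # [1, 3, 2, 3, 7]
```

Both readings reproduce the sequence 1, 3, 2, 3, 7 listed in the comment, so the observed output alone does not tell them apart.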
Ur, I see. I think it's worth filing (or do we already have a JIRA for that?). cc: @dongjoon-hyun @HyukjinKwon
Yeah, let's file a JIRA and note it here. Even if we don't want to fix it in Spark, it's better to file a JIRA and resolve it as Won't Fix for trackability.
Yea, I think so. Thanks for the suggestion!
SELECT i,SUM(v) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM (VALUES(1,1),(2,2),(3,3),(4,4)) t(i,v);

-- [SPARK-29638] Spark handles 'NaN' as 0 in sums
Thanks for the report! Can you add the query below as an example in the jira? I think that's a good reproducer.
Sure. I added it to the JIRA page.
Thanks!
Test build #112871 has finished for PR 26238 at commit
Merged to master.
What changes were proposed in this pull request?
This PR ports window.sql from PostgreSQL regression tests https://github.com/postgres/postgres/blob/REL_12_STABLE/src/test/regress/sql/window.sql#L913-L1278
The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_STABLE/src/test/regress/expected/window.out
Why are the changes needed?
To ensure compatibility with PostgreSQL.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Passes the Jenkins tests; the results were also compared with the PgSQL results.