[SPARK-49488][SQL][FOLLOWUP] Use correct MySQL datetime functions when pushing down EXTRACT #50112
ping @cloud-fan
dongjoon-hyun
left a comment
So, is this a new correctness issue caused by the original implementation of SPARK-49488, @beliefer?
BTW, may I ask why this is changed?
Although this PR adds new test coverage for YEAROFWEEK, it seems we don't need to touch this line, do we?
This change follows the change from #50101
We need test coverage for YEAR; it was not tested before.
Yes. The original PR introduced a bug: Spark's behavior does not match MySQL's.
Is it supported in all MySQL versions?
Yes. It is supported from MySQL 5.7 through MySQL 9.2 (the latest).
This looks fragile. Shall we reject pushing down such EXTRACT fields?
We can simply throw UnsupportedOperationException to reject it. This is also how we use isSupportedFunction to reject predicate pushdown.
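A minimal sketch of the rejection idea suggested above. It is not the code in this PR: `ExtractPushdownSketch`, `tryPushDown`, and the exact list of fields treated as safe are assumptions made for illustration, and the caller is assumed to treat the exception as "evaluate this in Spark instead".

```scala
// Hypothetical, simplified stand-in for a dialect's SQL builder (not Spark's real classes).
object ExtractPushdownSketch {

  // Translate EXTRACT(field FROM source) into MySQL SQL text,
  // or throw to signal that the expression must not be pushed down.
  def visitExtract(field: String, source: String): String = field match {
    case "DAY_OF_WEEK" =>
      s"(WEEKDAY($source) + 1)"
    // Assumption: these fields behave the same on both sides, so plain EXTRACT is fine.
    case "YEAR" | "QUARTER" | "MONTH" | "DAY" | "HOUR" | "MINUTE" =>
      s"EXTRACT($field FROM $source)"
    // Anything else (for example SECOND) is rejected rather than translated.
    case other =>
      throw new UnsupportedOperationException(
        s"EXTRACT($other) is not pushed down in this sketch")
  }

  // Caller side: a failed translation yields None, meaning "do not push down".
  def tryPushDown(field: String, source: String): Option[String] =
    try Some(visitExtract(field, source))
    catch { case _: UnsupportedOperationException => None }
}
```

Rejecting by exception keeps the default conservative: a field nobody has verified against MySQL is never translated by accident.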
How did we come up with 52?
This is the behavior of MySQL.
The range is 1-53 if the mode is 3.
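For readers who want to check the 1-53 range locally: as far as I know, MySQL's mode 3 (weeks start on Monday, week 1 is the first week with four or more days in the year) matches ISO 8601 week numbering, which `java.time` exposes through `IsoFields`. The sketch below relies on that assumption; `IsoWeekSketch` is a hypothetical helper, not part of this PR.

```scala
import java.time.LocalDate
import java.time.temporal.IsoFields

// Hypothetical helper: ISO 8601 week numbering, assumed to match MySQL WEEK(date, 3).
object IsoWeekSketch {
  // Week number within the week-based year, in the range 1 to 53.
  def isoWeek(d: LocalDate): Int = d.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR)

  // The week-based year, which can differ from the calendar year near January 1.
  def isoWeekYear(d: LocalDate): Int = d.get(IsoFields.WEEK_BASED_YEAR)
}
```

For example, 2021-01-01 falls in week 53 of week-based year 2020, which is why week and year-of-week results near year boundaries deserve explicit test coverage.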
The comment is outdated now.
Thanks, merging to master/4.0!
…n pushing down EXTRACT

### What changes were proposed in this pull request?
This PR proposes to use correct MySQL datetime functions when pushing down `EXTRACT`.

### Why are the changes needed?
bug fix

### Does this PR introduce _any_ user-facing change?
Yes, query result is corrected, but this bug is not released yet.

### How was this patch tested?
updated test

### Was this patch authored or co-authored using generative AI tooling?
'No'.

Closes #50112 from beliefer/SPARK-49488_followup.

Authored-by: beliefer
Signed-off-by: Wenchen Fan
(cherry picked from commit 7b39d24)
Signed-off-by: Wenchen Fan
@cloud-fan @dongjoon-hyun Thank you!
case "DAY_OF_WEEK" => s"(WEEKDAY($source) + 1)"
case _ => super.visitExtract(field, source)
case "DAY_OF_WEEK" => s"(WEEKDAY(${build(extract.source())}) + 1)"
// SECOND, MINUTE, HOUR, DAY, MONTH, QUARTER, YEAR are identical on MySQL and Spark for
EXTRACT of SECOND returns an integer on MySQL, while Spark returns decimal(8,6), so it cannot be pushed down without losing the fractional part.
Got it.
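To make the SECOND mismatch concrete, here is a small sketch that models both results for the same time value. It only imitates the two behaviors described above with `java.time`. `SecondExtractSketch` and its method names are hypothetical, and neither Spark nor MySQL is actually invoked.

```scala
import java.time.LocalTime

// Hypothetical model of the two results for EXTRACT(SECOND ...); it does not call Spark or MySQL.
object SecondExtractSketch {
  // Seconds with a microsecond fraction at scale 6, shaped like a decimal(8,6) value.
  def decimalSecond(t: LocalTime): java.math.BigDecimal = {
    val totalMicros = t.getSecond * 1000000L + t.getNano / 1000
    java.math.BigDecimal.valueOf(totalMicros, 6)
  }

  // Integer seconds only, which is what an integer-returning EXTRACT would give.
  def integerSecond(t: LocalTime): Int = t.getSecond
}
```

With `LocalTime.of(10, 30, 12, 345678000)`, the decimal form is 12.345678 while the integer form is 12, so a pushed-down filter such as `second > 12` would select different rows.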
…n pushing down EXTRACT

### What changes were proposed in this pull request?
This PR proposes to use correct MySQL datetime functions when pushing down `EXTRACT`.

### Why are the changes needed?
bug fix

### Does this PR introduce _any_ user-facing change?
Yes, query result is corrected, but this bug is not released yet.

### How was this patch tested?
updated test

### Was this patch authored or co-authored using generative AI tooling?
'No'.

Closes apache#50112 from beliefer/SPARK-49488_followup.

Authored-by: beliefer
Signed-off-by: Wenchen Fan
(cherry picked from commit c7d57d7)
Signed-off-by: Wenchen Fan
What changes were proposed in this pull request?
This PR proposes to use correct MySQL datetime functions when pushing down EXTRACT.
Why are the changes needed?
bug fix
Does this PR introduce any user-facing change?
Yes, query result is corrected, but this bug is not released yet.
How was this patch tested?
updated test
Was this patch authored or co-authored using generative AI tooling?
'No'.