[SPARK-29864][SPARK-29920][SQL] Strict parsing of day-time strings to intervals #26473
@cloud-fan I cannot easily integrate the old implementation into the new one because of #26327 (comment). What I am considering is to just keep the old implementation of
Test build #113600 has finished for PR 26473 at commit
check("-12:40", HOUR, MINUTE, "-12 hours -40 minutes")
checkFail("5 12:40", HOUR, MINUTE, "must match day-time format")
check("12:40:30.999999999", HOUR, SECOND, "12 hours 40 minutes 30.999999 seconds")
can we test 12:40:30.0123456789 as well?
+1 to keeping the old implementation and adding a legacy config to fall back to it
Test build #113624 has finished for PR 26473 at commit
Test build #113628 has finished for PR 26473 at commit
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
…-daytime-string # Conflicts: # sql/core/src/test/resources/sql-tests/results/literals.sql.out
@cloud-fan I ran
Test build #114278 has finished for PR 26473 at commit
jenkins, retest this, please
@MaxGekk do we have a simple example to demonstrate the bug of In the meanwhile, we don't add tests to
Test build #114282 has finished for PR 26473 at commit
Test build #114286 has finished for PR 26473 at commit
If this PR is merged to master, the bug can be reproduced by removing
It has already been created: https://issues.apache.org/jira/browse/SPARK-29933
Actually, it is not enough because wrong settings cause failure I could remove the tests from
For some reason, the last 2 builds finished successfully: #26473 (comment) and #26473 (comment). Probably the order of tests was changed.
…-daytime-string # Conflicts: # docs/sql-migration-guide.md
@cloud-fan Hmm, when I run all tests via
Test build #114304 has finished for PR 26473 at commit
I've figured out the problem: The Spark thrift server creates many SparkSessions to serve requests, and the thrift server serves requests using a single thread. One thread can only have one active SparkSession, so For this particular problem, a simple fix in But we should really think about how to set the active session correctly.
I don't know when this bug #26473 (comment) will be fixed, but can we merge this PR with the workaround? I just want to remind you that this PR fixes the bug #26473 (comment)
It has been a while, can you remind me what your workaround is? BTW I feel it's also OK to use my one-line fix: just add
I mean explicit set ca46f44
The dialect config has been removed. Can you try and see if the test still fails? If it does, then maybe just use my one-line fix.
…-daytime-string # Conflicts: # sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out # sql/core/src/test/resources/sql-tests/results/interval.sql.out
Test build #115122 has finished for PR 26473 at commit
check("12:40:30.999999999", HOUR, SECOND, "12 hours 40 minutes 30.999999 seconds")
check("+12:40:30.999999999", HOUR, SECOND, "12 hours 40 minutes 30.999999 seconds")
check("-12:40:30.999999999", HOUR, SECOND, "-12 hours -40 minutes -30.999999 seconds")
let's avoid using 99999. It's hard for reviewers to count the number of digits. Let's use 123456 to ease the digits counting.
I just wanted to test the maximum possible value in the fractional part. I think we need at least one test for 999999999
I replaced 999999999 by 123456789 everywhere except one check
cloud-fan left a comment
LGTM except one minor comment
Test build #115177 has finished for PR 26473 at commit
thanks, merging to master!
What changes were proposed in this pull request?
In this PR, I propose a new implementation of fromDayTimeString which strictly parses strings in day-time formats to intervals. The new implementation accepts only strings that match the pattern defined by the from and to bounds. Here is the mapping of the user's bounds to patterns:
- [+|-]D+ H[H]:m[m]:s[s][.SSSSSSSSS] for DAY TO SECOND
- [+|-]D+ H[H]:m[m] for DAY TO MINUTE
- [+|-]D+ H[H] for DAY TO HOUR
- [+|-]H[H]:m[m]:s[s][.SSSSSSSSS] for HOUR TO SECOND
- [+|-]H[H]:m[m] for HOUR TO MINUTE
- [+|-]m[m]:s[s][.SSSSSSSSS] for MINUTE TO SECOND
Closes #26327
Closes #26358
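The mapping of bounds to patterns in this description can be illustrated with plain regular expressions. The following is only a minimal sketch and not the code in IntervalUtils: the object name, the string keys for the bounds, and the matches helper are hypothetical, and it assumes that [.SSSSSSSSS] stands for an optional fraction of 1 to 9 digits.

```scala
import scala.util.matching.Regex

// Hypothetical sketch: the object, helper names, and string keys are
// illustrative and are not taken from Spark's IntervalUtils.
object DayTimePatternSketch {
  private val sign = "[+-]?"          // optional leading sign, [+|-]
  private val days = "\\d+"           // D+
  private val twoDigits = "\\d{1,2}"  // H[H], m[m], s[s]
  // Assumption: [.SSSSSSSSS] means an optional fraction of 1 to 9 digits.
  private val seconds = twoDigits + "(?:\\.\\d{1,9})?"

  // One pattern per (from, to) pair listed in the description.
  val patterns: Map[(String, String), Regex] = Map(
    ("DAY", "SECOND")    -> s"$sign$days $twoDigits:$twoDigits:$seconds".r,
    ("DAY", "MINUTE")    -> s"$sign$days $twoDigits:$twoDigits".r,
    ("DAY", "HOUR")      -> s"$sign$days $twoDigits".r,
    ("HOUR", "SECOND")   -> s"$sign$twoDigits:$twoDigits:$seconds".r,
    ("HOUR", "MINUTE")   -> s"$sign$twoDigits:$twoDigits".r,
    ("MINUTE", "SECOND") -> s"$sign$twoDigits:$seconds".r
  )

  // True only if the whole input matches the pattern for the given bounds;
  // unknown bound pairs are rejected.
  def matches(input: String, from: String, to: String): Boolean =
    patterns.get((from, to)).exists(_.pattern.matcher(input).matches())
}
```

Under these assumptions, DayTimePatternSketch.matches("-12:40", "HOUR", "MINUTE") accepts the input, while "5 12:40" with the same bounds is rejected because the day field is not part of the HOUR TO MINUTE pattern. The sketch only checks the shape of the string; it does not validate field ranges or build an interval value.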
Why are the changes needed?
Does this PR introduce any user-facing change?
Yes, before:
After:
How was this patch tested?
- IntervalUtilsSuite
- ExpressionParserSuite
- literals.sql