-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29623][SQL] do not allow multiple unit TO unit statements in interval literal syntax #26285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
retest this please |
|
test this please |
|
Test build #112787 has finished for PR 26285 at commit
|
|
retest this please |
docs/sql-migration-guide.md
Outdated
|
|
||
| - Since Spark 3.0, the `size` function returns `NULL` for the `NULL` input. In Spark version 2.4 and earlier, this function gives `-1` for the same input. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.sizeOfNull` to `true`. | ||
|
|
||
| - Since Spark 3.0, the interval literal syntax does not allow multiple unit TO unit statements anymore. For example, `SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH'` throws parser exception. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: I think its a little difficult to understand word boundaries "... multiple unit TO unit statements anymore.", so how about multiple unit-TO-unit statements anymore or multiple "unit TO unit" statements anymore
| : {ansi}? INTERVAL? intervalField+ | ||
| | {!ansi}? INTERVAL intervalField* | ||
| : {ansi}? INTERVAL? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval) | ||
| | {!ansi}? INTERVAL (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity; is that bad to just drop + and * for simple fixes?
interval
: {ansi}? INTERVAL? intervalField
| {!ansi}? INTERVAL intervalField
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intervalField can be single-unit statements like 1 year, which is allowed to repeat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Thanks!
maropu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for dropping the support. LGTM except for minor comments.
|
Test build #112805 has finished for PR 26285 at commit
|
|
FYI: the test failure is not related to this pr: #26287 |
| case _ => | ||
| throw new ParseException(s"Intervals $from TO $to are not supported.", ctx) | ||
| } | ||
| validate(interval != null, "No interval can be constructed", ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which of fromYearMonthString or fromDayTimeString can return null? They either throw an exception or return an interval.
| "Can only have a single unit TO unit statement in the interval literal syntax", | ||
| innerCtx.unitToUnitInterval) | ||
| } | ||
| Literal(visitMultiUnitsInterval(innerCtx.multiUnitsInterval)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you specify CalendarIntervalType to avoid unnecessary pattern matching for inferring the type by provided value.
|
Test build #112861 has finished for PR 26285 at commit
|
|
Test build #112886 has finished for PR 26285 at commit
|
|
Test build #112885 has finished for PR 26285 at commit
|
|
retest this please |
|
Test build #112896 has finished for PR 26285 at commit
|
| : {ansi}? INTERVAL? intervalField+ | ||
| | {!ansi}? INTERVAL intervalField* | ||
| : INTERVAL (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)? | ||
| | {ansi}? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ansi}? (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval)? is illegal as it can match anything. Note that we need to make (errorCapturingMultiUnitsInterval | errorCapturingUnitToUnitInterval) optional so that we can detect select interval and give precise error message.
|
|
||
| // Unknown FROM TO intervals | ||
| intercept("interval 10 month to second", | ||
| intercept("interval '10' month to second", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from-to unit can only handle string values ('1-2' year to month, '1 12' day to hour).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do write interval 10 month to second, then we will ask users to use string instead, see https://github.com/apache/spark/pull/26285/files#diff-4f9e28af8e9fcb40a8a99b4e49f3b9b2R612
|
Test build #112918 has finished for PR 26285 at commit
|
|
Test build #112916 has finished for PR 26285 at commit
|
|
Test build #112917 has finished for PR 26285 at commit
|
| }) | ||
|
|
||
| val e1 = intercept[AnalysisException] { | ||
| sql("select interval") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to literals.sql
|
Test build #112928 has finished for PR 26285 at commit
|
|
Test build #112949 has finished for PR 26285 at commit
|
|
retest this please |
|
Test build #113005 has finished for PR 26285 at commit
|
|
retest this please |
|
Test build #113027 has finished for PR 26285 at commit
|
|
I believe the Spark R failure is not related. Any more comments before merging it? |
|
Test build #113088 has finished for PR 26285 at commit
|
|
thanks for the review, merging to master! |
What changes were proposed in this pull request?
re-arrange the parser rules to make it clear that multiple unit TO unit statement like
SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTHis not allowed.Why are the changes needed?
This is definitely an accident that we support such a weird syntax in the past. It's not supported by any other DBs and I can't think of any use case of it. Also no test covers this syntax in the current codebase.
Does this PR introduce any user-facing change?
Yes, and a migration guide item is added.
How was this patch tested?
new tests.