[SPARK-35285][SQL] Parse ANSI interval types in SQL schema #32409
Kubernetes integration test starting

Kubernetes integration test status failure

Test build #138111 has finished for PR 32409 at commit

Kubernetes integration test starting

Kubernetes integration test status failure

Test build #138114 has finished for PR 32409 at commit

@yaooqinn @AngersZhuuuu @beliefer @cloud-fan Please review this PR.

Since this changes the type names, should we add this to the migration guide?

And there are many test case use Main code LGTM

@AngersZhuuuu The types have not been released yet. There are no versions to migrate from.
This is the name of a sub-class of interval type. It is OK to use it in test titles; see the description of PR #31810.
Sounds good.
Got it, thanks for the explanation. LGTM

Merged to master.
  Literal(Period.ofDays(2))),
  EmptyRow,
- "sequence step must be a day year-month interval if start and end values are dates")
+ "sequence step must be a day interval year to month if start and end values are dates")
@beliefer The error message confuses me slightly, especially the combination "a day interval year to month". Could you please open a PR to improve the error, with something like "... sequence step must be an interval of day granularity ..."?
I see.
### What changes were proposed in this pull request?
1. Extend Spark SQL parser to support parsing of:
- `INTERVAL YEAR TO MONTH` to `YearMonthIntervalType`
- `INTERVAL DAY TO SECOND` to `DayTimeIntervalType`
2. Assign new names to the ANSI interval types according to the SQL standard, so that the Spark SQL parser can parse the names back. To do this, override the `typeName()` method of `YearMonthIntervalType`/`DayTimeIntervalType`.
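
The round trip described in the two items above can be sketched in a few lines. This is only a toy model in plain Python, not Spark's parser or its type classes: `TYPE_NAMES`, `type_name`, and `parse_data_type` are hypothetical names introduced for illustration, and the lower-case type names are assumed from the error message touched by this PR.

```python
import re

# Hypothetical stand-ins for Spark's type objects. The names on the right are
# assumed to be what the overridden typeName() returns after this PR.
TYPE_NAMES = {
    "YearMonthIntervalType": "interval year to month",
    "DayTimeIntervalType": "interval day to second",
}

# Reverse lookup used by the toy parser: type name -> type.
BY_NAME = {name: cls for cls, name in TYPE_NAMES.items()}


def type_name(cls: str) -> str:
    """Return the SQL-standard name of a type (models typeName())."""
    return TYPE_NAMES[cls]


def parse_data_type(sql: str) -> str:
    """Map a SQL type string to a type, ignoring case and extra whitespace."""
    normalized = re.sub(r"\s+", " ", sql.strip()).lower()
    try:
        return BY_NAME[normalized]
    except KeyError:
        raise ValueError(f"unsupported data type: {sql!r}") from None
```

The point of the sketch is the invariant: for every supported type, parsing the string returned by `type_name` gives back the same type, which is what makes the names usable in SQL schemas.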
### Why are the changes needed?
To be able to use new ANSI interval types in SQL. The SQL standard requires the types to be defined according to the rules:
```
<interval type> ::= INTERVAL <interval qualifier>
<interval qualifier> ::= <start field> TO <end field> | <single datetime field>
<start field> ::= <non-second primary datetime field> [ <left paren> <interval leading field precision> <right paren> ]
<end field> ::= <non-second primary datetime field> | SECOND [ <left paren> <interval fractional seconds precision> <right paren> ]
<primary datetime field> ::= <non-second primary datetime field> | SECOND
<non-second primary datetime field> ::= YEAR | MONTH | DAY | HOUR | MINUTE
<interval fractional seconds precision> ::= <unsigned integer>
<interval leading field precision> ::= <unsigned integer>
```
Currently, Spark SQL supports only `YEAR TO MONTH` and `DAY TO SECOND` as `<interval qualifier>`.
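
To make the grammar above concrete, here is a minimal sketch of a checker for `INTERVAL <interval qualifier>` strings, written in plain Python. It is not Spark's parser. The function names are hypothetical, the `<single datetime field>` form is simplified to a bare field with an optional precision (the excerpt above does not define it), and the sketch does not check that the start field is more significant than the end field.

```python
import re

NON_SECOND = {"YEAR", "MONTH", "DAY", "HOUR", "MINUTE"}
# The only qualifiers the PR description lists as currently supported.
SUPPORTED = {("YEAR", "MONTH"), ("DAY", "SECOND")}

# A field name with an optional "(<unsigned integer>)" precision.
_FIELD = re.compile(r"^([A-Za-z]+)\s*(?:\(\s*(\d+)\s*\))?$")


def _field(text, allow_second):
    """Return (field name, has_precision) or raise ValueError."""
    m = _FIELD.match(text.strip())
    if not m:
        raise ValueError(f"malformed field: {text!r}")
    name = m.group(1).upper()
    if name not in NON_SECOND and not (allow_second and name == "SECOND"):
        raise ValueError(f"unexpected field: {text!r}")
    return name, m.group(2) is not None


def parse_interval_qualifier(sql):
    """Return the (start, end) field names of an interval type string."""
    tokens = sql.split(None, 1)
    if len(tokens) != 2 or tokens[0].upper() != "INTERVAL":
        raise ValueError(f"not an interval type: {sql!r}")
    parts = re.split(r"\s+TO\s+", tokens[1].strip(), flags=re.IGNORECASE)
    if len(parts) == 1:
        # Simplified <single datetime field>: start and end coincide.
        name, _ = _field(parts[0], allow_second=True)
        return (name, name)
    if len(parts) != 2:
        raise ValueError(f"too many TO clauses: {sql!r}")
    start, _ = _field(parts[0], allow_second=False)
    end, has_precision = _field(parts[1], allow_second=True)
    if has_precision and end != "SECOND":
        raise ValueError("as an end field, only SECOND may carry a precision")
    return (start, end)


def uses_supported_fields(sql):
    """True if the qualifier names one of the two supported field pairs.

    Precisions are ignored here; whether Spark accepts them is not claimed.
    """
    return parse_interval_qualifier(sql) in SUPPORTED
```

For example, `uses_supported_fields("INTERVAL HOUR TO MINUTE")` is `False` even though the string is valid under the grammar, which mirrors the gap between the standard and what is supported so far.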
### Does this PR introduce _any_ user-facing change?
It should not, since the types have not been released yet.
### How was this patch tested?
By running the affected tests such as:
```
$ build/sbt "sql/testOnly *SQLQueryTestSuite -- -z interval.sql"
$ build/sbt "sql/testOnly *SQLQueryTestSuite -- -z datetime.sql"
$ build/sbt "test:testOnly *ExpressionTypeCheckingSuite"
$ build/sbt "sql/testOnly *SQLQueryTestSuite -- -z windowFrameCoercion.sql"
$ build/sbt "sql/testOnly *SQLQueryTestSuite -- -z literals.sql"
```
Closes apache#32409 from MaxGekk/parse-ansi-interval-types.
Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>