-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29864][SPARK-29920][SQL] Strict parsing of day-time strings to intervals #26473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
43 commits
Select commit
Hold shift + click to select a range
800f6e5
Strict fromDayTimeString
MaxGekk 8cf38db
Add tests
MaxGekk 3a1a710
Update tests in IntervalUtilsSuite
MaxGekk 4e051d6
Fix ExpressionParserSuite
MaxGekk a1ef591
Add tests for invalid input into literals.sql
MaxGekk e88c8e0
Regen interval.sql.out
MaxGekk c87cf8c
Add the config spark.sql.legacy.fromDayTimeString.enabled
MaxGekk 3f66881
Restore previous version of fromDayTimeString
MaxGekk da9b801
Restore old tests in IntervalUtilsSuite
MaxGekk c8afd33
Enable legacy behavior for postgresql tests
MaxGekk 988d532
Regen interval.sql.out
MaxGekk ce28745
Set LEGACY_FROM_DAYTIME_STRING to true in ThriftServerQueryTestSuite
MaxGekk 58ba7b4
Address Wenchen's review comment about a test
MaxGekk 60fe0c1
Add comments for parseDayTimeLegacy
MaxGekk e95068e
Avoid unnecessary catching of exceptions
MaxGekk 880e7ed
Add comments for parseDayTime()
MaxGekk d00c95d
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk a37bad4
Regen literals.sql.out
MaxGekk 8e733c1
Remove exact ops
MaxGekk 6b5b7ef
Update the SQL migration guide
MaxGekk a2ce9ae
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk fc77452
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk 6be5f4e
truncation -> truncated
MaxGekk d253094
Make the config internal
MaxGekk dfd0dce
Use the legacy method in the PostgreSQL dialect
MaxGekk d1145cd
Update the SQL migration guide
MaxGekk c94f1df
Reorganize tests in interval.sql
MaxGekk 833c7b0
Regen interval.sql.out
MaxGekk 5b26335
Check in the PostgreSQL dialect as well
MaxGekk f401bd2
Black list interval.sql in ThriftServerQueryTestSuite
MaxGekk ca46f44
Set settings explicitly in tests
MaxGekk eadaa92
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk 8f10259
Regen interval.sql.out
MaxGekk d3d730a
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk 32b4d2f
Regen interval.sql.out
MaxGekk e39ca52
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk e012f8b
Remove explicit set
MaxGekk 73ef32f
Regen interval.sql.out
MaxGekk d27d434
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk 9d8394e
Merge remote-tracking branch 'remotes/origin/master' into strict-from…
MaxGekk ef2cbe1
Remove usage of usePostgreSQLDialect
MaxGekk f9510e3
Regen interval.sql.out
MaxGekk c16f2a7
Replace 999999999 by 123456789 in seconds fractions in tests
MaxGekk File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -20,12 +20,14 @@ package org.apache.spark.sql.catalyst.util | |
| import java.util.concurrent.TimeUnit | ||
|
|
||
| import org.apache.spark.SparkFunSuite | ||
| import org.apache.spark.sql.catalyst.plans.SQLHelper | ||
| import org.apache.spark.sql.catalyst.util.DateTimeConstants._ | ||
| import org.apache.spark.sql.catalyst.util.IntervalUtils._ | ||
| import org.apache.spark.sql.catalyst.util.IntervalUtils.IntervalUnit._ | ||
| import org.apache.spark.sql.internal.SQLConf | ||
| import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String} | ||
|
|
||
| class IntervalUtilsSuite extends SparkFunSuite { | ||
| class IntervalUtilsSuite extends SparkFunSuite with SQLHelper { | ||
|
|
||
| private def checkFromString(input: String, expected: CalendarInterval): Unit = { | ||
| assert(stringToInterval(UTF8String.fromString(input)) === expected) | ||
|
|
@@ -160,43 +162,45 @@ class IntervalUtilsSuite extends SparkFunSuite { | |
| } | ||
| } | ||
|
|
||
| test("from day-time string") { | ||
| assert(fromDayTimeString("5 12:40:30.999999999") === | ||
| new CalendarInterval( | ||
| 0, | ||
| 5, | ||
| 12 * MICROS_PER_HOUR + | ||
| 40 * MICROS_PER_MINUTE + | ||
| 30 * MICROS_PER_SECOND + 999999L)) | ||
| assert(fromDayTimeString("10 0:12:0.888") === | ||
| new CalendarInterval( | ||
| 0, | ||
| 10, | ||
| 12 * MICROS_PER_MINUTE + 888 * MICROS_PER_MILLIS)) | ||
| assert(fromDayTimeString("-3 0:0:0") === new CalendarInterval(0, -3, 0L)) | ||
|
|
||
| try { | ||
| fromDayTimeString("5 30:12:20") | ||
| fail("Expected to throw an exception for the invalid input") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("hour 30 outside range")) | ||
| } | ||
|
|
||
| try { | ||
| fromDayTimeString("5 30-12") | ||
| fail("Expected to throw an exception for the invalid input") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("must match day-time format")) | ||
| } | ||
|
|
||
| try { | ||
| fromDayTimeString("5 1:12:20", HOUR, MICROSECOND) | ||
| fail("Expected to throw an exception for the invalid convention type") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("Cannot support (interval")) | ||
| test("from day-time string - legacy") { | ||
| withSQLConf(SQLConf.LEGACY_FROM_DAYTIME_STRING.key -> "true") { | ||
| assert(fromDayTimeString("5 12:40:30.999999999") === | ||
| new CalendarInterval( | ||
| 0, | ||
| 5, | ||
| 12 * MICROS_PER_HOUR + | ||
| 40 * MICROS_PER_MINUTE + | ||
| 30 * MICROS_PER_SECOND + 999999L)) | ||
| assert(fromDayTimeString("10 0:12:0.888") === | ||
| new CalendarInterval( | ||
| 0, | ||
| 10, | ||
| 12 * MICROS_PER_MINUTE + 888 * MICROS_PER_MILLIS)) | ||
| assert(fromDayTimeString("-3 0:0:0") === new CalendarInterval(0, -3, 0L)) | ||
|
|
||
| try { | ||
| fromDayTimeString("5 30:12:20") | ||
| fail("Expected to throw an exception for the invalid input") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("hour 30 outside range")) | ||
| } | ||
|
|
||
| try { | ||
| fromDayTimeString("5 30-12") | ||
| fail("Expected to throw an exception for the invalid input") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("must match day-time format")) | ||
| } | ||
|
|
||
| try { | ||
| fromDayTimeString("5 1:12:20", HOUR, MICROSECOND) | ||
| fail("Expected to throw an exception for the invalid convention type") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains("Cannot support (interval")) | ||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
@@ -384,4 +388,61 @@ class IntervalUtilsSuite extends SparkFunSuite { | |
| val i9 = new CalendarInterval(0, 0, -3000 * MICROS_PER_HOUR) | ||
| assert(IntervalUtils.toMultiUnitsString(i9) === "-3000 hours") | ||
| } | ||
|
|
||
| test("from day-time string") { | ||
| def check(input: String, from: IntervalUnit, to: IntervalUnit, expected: String): Unit = { | ||
| withClue(s"from = $from, to = $to") { | ||
| val expectedUtf8 = UTF8String.fromString(expected) | ||
| assert(fromDayTimeString(input, from, to) === safeStringToInterval(expectedUtf8)) | ||
| } | ||
| } | ||
| def checkFail( | ||
| input: String, | ||
| from: IntervalUnit, | ||
| to: IntervalUnit, | ||
| errMsg: String): Unit = { | ||
| try { | ||
| fromDayTimeString(input, from, to) | ||
| fail("Expected to throw an exception for the invalid input") | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| assert(e.getMessage.contains(errMsg)) | ||
| } | ||
| } | ||
|
|
||
| check("12:40", HOUR, MINUTE, "12 hours 40 minutes") | ||
| check("+12:40", HOUR, MINUTE, "12 hours 40 minutes") | ||
| check("-12:40", HOUR, MINUTE, "-12 hours -40 minutes") | ||
| checkFail("5 12:40", HOUR, MINUTE, "must match day-time format") | ||
|
|
||
| check("12:40:30.999999999", HOUR, SECOND, "12 hours 40 minutes 30.999999 seconds") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can we test |
||
| check("+12:40:30.123456789", HOUR, SECOND, "12 hours 40 minutes 30.123456 seconds") | ||
| check("-12:40:30.123456789", HOUR, SECOND, "-12 hours -40 minutes -30.123456 seconds") | ||
| checkFail("5 12:40:30", HOUR, SECOND, "must match day-time format") | ||
| checkFail("12:40:30.0123456789", HOUR, SECOND, "must match day-time format") | ||
|
|
||
| check("40:30.123456789", MINUTE, SECOND, "40 minutes 30.123456 seconds") | ||
| check("+40:30.123456789", MINUTE, SECOND, "40 minutes 30.123456 seconds") | ||
| check("-40:30.123456789", MINUTE, SECOND, "-40 minutes -30.123456 seconds") | ||
| checkFail("12:40:30", MINUTE, SECOND, "must match day-time format") | ||
|
|
||
| check("5 12", DAY, HOUR, "5 days 12 hours") | ||
| check("+5 12", DAY, HOUR, "5 days 12 hours") | ||
| check("-5 12", DAY, HOUR, "-5 days -12 hours") | ||
| checkFail("5 12:30", DAY, HOUR, "must match day-time format") | ||
|
|
||
| check("5 12:40", DAY, MINUTE, "5 days 12 hours 40 minutes") | ||
| check("+5 12:40", DAY, MINUTE, "5 days 12 hours 40 minutes") | ||
| check("-5 12:40", DAY, MINUTE, "-5 days -12 hours -40 minutes") | ||
| checkFail("5 12", DAY, MINUTE, "must match day-time format") | ||
|
|
||
| check("5 12:40:30.123", DAY, SECOND, "5 days 12 hours 40 minutes 30.123 seconds") | ||
| check("+5 12:40:30.123456", DAY, SECOND, "5 days 12 hours 40 minutes 30.123456 seconds") | ||
| check("-5 12:40:30.123456789", DAY, SECOND, "-5 days -12 hours -40 minutes -30.123456 seconds") | ||
| checkFail("5 12", DAY, SECOND, "must match day-time format") | ||
|
|
||
| checkFail("5 30:12:20", DAY, SECOND, "hour 30 outside range") | ||
| checkFail("5 30-12", DAY, SECOND, "must match day-time format") | ||
| checkFail("5 1:12:20", HOUR, MICROSECOND, "Cannot support (interval") | ||
| } | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.