From 78fb74ad34170740bc53883cd320920438659205 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Wed, 20 May 2020 17:09:37 +0800 Subject: [PATCH 01/15] [SPARK-31771][SQL] Disable Narrow TextStyle for datetime pattern 'G/M/L/E/u/Q/q' --- docs/sql-ref-datetime-pattern.md | 21 ++---- .../util/DateTimeFormatterHelper.scala | 5 ++ .../util/DateTimeFormatterHelperSuite.scala | 10 +++ .../resources/sql-tests/inputs/datetime.sql | 10 +++ .../sql-tests/results/ansi/datetime.sql.out | 65 ++++++++++++++++++- .../sql-tests/results/datetime.sql.out | 65 ++++++++++++++++++- 6 files changed, 159 insertions(+), 17 deletions(-) diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md index df19b9ce4c08..c04edfc21696 100644 --- a/docs/sql-ref-datetime-pattern.md +++ b/docs/sql-ref-datetime-pattern.md @@ -30,17 +30,17 @@ Spark uses pattern letters in the following table for date and timestamp parsing |Symbol|Meaning|Presentation|Examples| |------|-------|------------|--------| -|**G**|era|text|AD; Anno Domini; A| +|**G**|era|text|AD; Anno Domini| |**y**|year|year|2020; 20| |**D**|day-of-year|number|189| -|**M/L**|month-of-year|number/text|7; 07; Jul; July; J| +|**M/L**|month-of-year|number/text|7; 07; Jul; July| |**d**|day-of-month|number|28| |**Q/q**|quarter-of-year|number/text|3; 03; Q3; 3rd quarter| |**Y**|week-based-year|year|1996; 96| |**w**|week-of-week-based-year|number|27| |**W**|week-of-month|number|4| -|**E**|day-of-week|text|Tue; Tuesday; T| -|**u**|localized day-of-week|number/text|2; 02; Tue; Tuesday; T| +|**E**|day-of-week|text|Tue; Tuesday| +|**u**|localized day-of-week|number/text|2; 02; Tue; Tuesday| |**F**|week-of-month|number|3| |**a**|am-pm-of-day|text|PM| |**h**|clock-hour-of-am-pm (1-12)|number|12| @@ -63,7 +63,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing The count of pattern letters determines the format. -- Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. Exactly 5 pattern letters will use the narrow form. Six or more letters will fail. +- Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. 5 or more letters will fail. - Number: If the count of letters is one, then the value is output using the minimum number of digits and without padding. Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. The following pattern letters have constraints on the count of letters. Only one letter 'F' can be specified. Up to two letters of 'd', 'H', 'h', 'K', 'k', 'm', and 's' can be specified. Up to three letters of 'D' can be specified. @@ -76,7 +76,7 @@ The count of pattern letters determines the format. - Year: The count of letters determines the minimum field width below which padding is used. If the count of letters is two, then a reduced two digit form is used. For printing, this outputs the rightmost two digits. For parsing, this will parse using the base value of 2000, resulting in a year within the range 2000 to 2099 inclusive. If the count of letters is less than four (but not two), then the sign is only output for negative years. Otherwise, the sign is output if the pad width is exceeded when 'G' is not present. 
-- Month: If the number of pattern letters is 3 or more, the month is interpreted as text; otherwise, it is interpreted as a number. The text form is depend on letters - 'M' denotes the 'standard' form, and 'L' is for 'stand-alone' form. The difference between the 'standard' and 'stand-alone' forms is trickier to describe as there is no difference in English. However, in other languages there is a difference in the word used when the text is used alone, as opposed to in a complete date. For example, the word used for a month when used alone in a date picker is different to the word used for month in association with a day and year in a date. In Russian, 'Июль' is the stand-alone form of July, and 'Июля' is the standard form. Here are examples for all supported pattern letters (more than 5 letters is invalid): +- Month: If the number of pattern letters is 3 or more, the month is interpreted as text; otherwise, it is interpreted as a number. The text form depends on the letters used - 'M' denotes the 'standard' form, and 'L' is for the 'stand-alone' form. The difference between the 'standard' and 'stand-alone' forms is trickier to describe as there is no difference in English. However, in other languages there is a difference in the word used when the text is used alone, as opposed to in a complete date. For example, the word used for a month when used alone in a date picker is different to the word used for month in association with a day and year in a date. In Russian, 'Июль' is the stand-alone form of July, and 'Июля' is the standard form. Here are examples for all supported pattern letters (more than 4 letters is invalid): - `'M'` or `'L'`: Month number in a year starting from 1. There is no difference between 'M' and 'L'. Month from 1 to 9 are printed without padding. ```sql spark-sql> select date_format(date '1970-01-01', "M"); 1 spark-sql> select date_format(date '1970-12-01', "L"); 12 ``` @@ -119,13 +119,6 @@ The count of pattern letters determines the format. spark-sql> select to_csv(named_struct('date', date '1970-01-01'), map('dateFormat', 'LLLL', 'locale', 'RU')); январь ``` - - `'LLLLL'` or `'MMMMM'`: Narrow textual representation of standard or stand-alone forms. Typically it is a single letter. - ```sql - spark-sql> select date_format(date '1970-07-01', "LLLLL"); - J - spark-sql> select date_format(date '1970-01-01', "MMMMM"); - J - ``` - Zone ID(V): This outputs the display the time-zone ID. Pattern letter count must be 2. @@ -147,5 +140,3 @@ More details for the text style: - Short Form: Short text, typically an abbreviation. For example, day-of-week Monday might output "Mon". - Full Form: Full text, typically the full description. For example, day-of-week Monday might output "Monday". - -- Narrow Form: Narrow text, typically a single letter. For example, day-of-week Monday might output "M". 
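Why the narrow style has to be rejected outright rather than mapped to something else: in java.time, five pattern letters select `TextStyle.NARROW`, which typically renders a single letter, so distinct values collapse to the same text and parsing cannot be unambiguous. A minimal Scala sketch against plain java.time illustrates the collision (the object name is illustrative, not part of this patch):

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

object NarrowStyleDemo extends App {
  // Five pattern letters select the narrow text style in java.time.
  val narrow = DateTimeFormatter.ofPattern("MMMMM", Locale.US)

  // June and July both format as the single letter "J", so a narrow
  // month pattern cannot round-trip through parsing.
  println(narrow.format(LocalDate.of(1970, 6, 1))) // J
  println(narrow.format(LocalDate.of(1970, 7, 1))) // J
}
```

This is the ambiguity the removed `'LLLLL'`/`'MMMMM'` doc examples above exposed (both printed "J"), and it is why the pattern compiler below rejects five-letter text forms instead of silently producing output that cannot be parsed back.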
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 05ec23f7ad47..28f1faf63e6b 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -163,6 +163,8 @@ private object DateTimeFormatterHelper { } final val unsupportedLetters = Set('A', 'c', 'e', 'n', 'N', 'p') + final val unsupportedNarrowTextStyle = + Set("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "QQQQQ", "qqqqq") /** * In Spark 3.0, we switch to the Proleptic Gregorian calendar and use DateTimeFormatter for @@ -184,6 +186,9 @@ private object DateTimeFormatterHelper { for (c <- patternPart if unsupportedLetters.contains(c)) { throw new IllegalArgumentException(s"Illegal pattern character: $c") } + for (style <- unsupportedNarrowTextStyle if patternPart.contains(style)) { + throw new IllegalArgumentException(s"Too many pattern letters: ${style.head}") + } // The meaning of 'u' was day number of week in SimpleDateFormat, it was changed to year // in DateTimeFormatter. Substitute 'u' to 'e' and use DateTimeFormatter to parse the // string. If parsable, return the result; otherwise, fall back to 'u', and then use the diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala index 817e50358432..13c67dd2458d 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala @@ -40,6 +40,16 @@ class DateTimeFormatterHelperSuite extends SparkFunSuite { val e = intercept[IllegalArgumentException](convertIncompatiblePattern(s"yyyy-MM-dd $l G")) assert(e.getMessage === s"Illegal pattern character: $l") } + unsupportedNarrowTextStyle.foreach { style => + val e1 = intercept[IllegalArgumentException] { + convertIncompatiblePattern(s"yyyy-MM-dd $style${style.head}") + } + assert(e1.getMessage === s"Too many pattern letters: ${style.head}") + val e2 = intercept[IllegalArgumentException] { + convertIncompatiblePattern(s"yyyy-MM-dd $style${style.head}") + } + assert(e2.getMessage === s"Too many pattern letters: ${style.head}") + } assert(convertIncompatiblePattern("yyyy-MM-dd uuuu") === "uuuu-MM-dd eeee") assert(convertIncompatiblePattern("yyyy-MM-dd EEEE") === "uuuu-MM-dd EEEE") assert(convertIncompatiblePattern("yyyy-MM-dd'e'HH:mm:ss") === "uuuu-MM-dd'e'HH:mm:ss") diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql index fd3325085df9..58cac765e85a 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql @@ -122,3 +122,13 @@ select to_timestamp("2019-10-06T10:11:12'12", "yyyy-MM-dd'T'HH:mm:ss''SSSS"); -- select to_timestamp("2019-10-06T10:11:12'", "yyyy-MM-dd'T'HH:mm:ss''"); -- tail select to_timestamp("'2019-10-06T10:11:12", "''yyyy-MM-dd'T'HH:mm:ss"); -- head select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss"); -- head but as single quote + +-- Unsupported narrow text style +select date_format(date '2020-05-23', 'GGGGG'); +select date_format(date '2020-05-23', 'MMMMM'); +select 
date_format(date '2020-05-23', 'LLLLL'); +select date_format(date '2020-05-23', 'EEEEE'); +select date_format(date '2020-05-23', 'uuuuu'); +select date_format(date '2020-05-23', 'QQQQQ'); +select date_format(date '2020-05-23', 'qqqqq'); + diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index aad1e5f34387..51995636e143 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 85 +-- Number of queries: 92 -- !query @@ -730,3 +730,66 @@ select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss") struct -- !query output 2019-10-06 10:11:12 + + +-- !query +select date_format(date '2020-05-23', 'GGGGG') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: G + + +-- !query +select date_format(date '2020-05-23', 'MMMMM') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: M + + +-- !query +select date_format(date '2020-05-23', 'LLLLL') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: L + + +-- !query +select date_format(date '2020-05-23', 'EEEEE') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: E + + +-- !query +select date_format(date '2020-05-23', 'uuuuu') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: u + + +-- !query +select date_format(date '2020-05-23', 'QQQQQ') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: Q + + +-- !query +select date_format(date '2020-05-23', 'qqqqq') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: q diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index a4f5b3772d2d..d578d31a865e 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 85 +-- Number of queries: 92 -- !query @@ -702,3 +702,66 @@ select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss") struct -- !query output 2019-10-06 10:11:12 + + +-- !query +select date_format(date '2020-05-23', 'GGGGG') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: G + + +-- !query +select date_format(date '2020-05-23', 'MMMMM') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: M + + +-- !query +select date_format(date '2020-05-23', 'LLLLL') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: L + + +-- !query +select date_format(date '2020-05-23', 'EEEEE') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: E + + +-- !query +select date_format(date '2020-05-23', 'uuuuu') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: u + + +-- !query +select date_format(date 
'2020-05-23', 'QQQQQ') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: Q + + +-- !query +select date_format(date '2020-05-23', 'qqqqq') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Too many pattern letters: q From e178a6b739dbfdf6e59fd464ab5d7e9f8be37a7f Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Wed, 20 May 2020 21:12:39 +0800 Subject: [PATCH 02/15] fail for parser --- .../expressions/datetimeExpressions.scala | 1 + .../util/DateTimeFormatterHelper.scala | 7 +- .../util/DateTimeFormatterHelperSuite.scala | 12 +- .../sql-tests/inputs/datetime-corrected.sql | 2 + .../sql-tests/inputs/datetime-legacy.sql | 2 + .../resources/sql-tests/inputs/datetime.sql | 7 +- .../sql-tests/results/ansi/datetime.sql.out | 84 +- .../results/datetime-corrected.sql.out | 821 ++++++++++++++++++ .../sql-tests/results/datetime-legacy.sql.out | 810 +++++++++++++++++ .../sql-tests/results/datetime.sql.out | 84 +- 10 files changed, 1792 insertions(+), 38 deletions(-) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-legacy.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index ccedcb41fc28..e622ee119d52 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -803,6 +803,7 @@ abstract class ToTimestamp legacyFormat = SIMPLE_DATE_FORMAT, needVarLengthSecondFraction = true) } catch { + case e: SparkUpgradeException => throw e case NonFatal(_) => null } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 28f1faf63e6b..ecb89963fade 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -187,7 +187,12 @@ private object DateTimeFormatterHelper { throw new IllegalArgumentException(s"Illegal pattern character: $c") } for (style <- unsupportedNarrowTextStyle if patternPart.contains(style)) { - throw new IllegalArgumentException(s"Too many pattern letters: ${style.head}") + val e = new IllegalArgumentException(s"Too many pattern letters: ${style.head}") + throw new SparkUpgradeException("3.0", s"Fail to recognize '$style' pattern in the" + + s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + + s" restore the behavior before Spark 3.0." + + s" 2) You can form a valid datetime pattern with the guide from" + + s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) } // The meaning of 'u' was day number of week in SimpleDateFormat, it was changed to year // in DateTimeFormatter. 
Substitute 'u' to 'e' and use DateTimeFormatter to parse the diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala index 13c67dd2458d..ed38194fd619 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala @@ -17,7 +17,7 @@ package org.apache.spark.sql.catalyst.util -import org.apache.spark.SparkFunSuite +import org.apache.spark.{SparkFunSuite, SparkUpgradeException} import org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper._ class DateTimeFormatterHelperSuite extends SparkFunSuite { @@ -41,14 +41,14 @@ class DateTimeFormatterHelperSuite extends SparkFunSuite { assert(e.getMessage === s"Illegal pattern character: $l") } unsupportedNarrowTextStyle.foreach { style => - val e1 = intercept[IllegalArgumentException] { - convertIncompatiblePattern(s"yyyy-MM-dd $style${style.head}") + val e1 = intercept[SparkUpgradeException] { + convertIncompatiblePattern(s"yyyy-MM-dd $style") } - assert(e1.getMessage === s"Too many pattern letters: ${style.head}") - val e2 = intercept[IllegalArgumentException] { + assert(e1.getCause.getMessage === s"Too many pattern letters: ${style.head}") + val e2 = intercept[SparkUpgradeException] { convertIncompatiblePattern(s"yyyy-MM-dd $style${style.head}") } - assert(e2.getMessage === s"Too many pattern letters: ${style.head}") + assert(e2.getCause.getMessage === s"Too many pattern letters: ${style.head}") } assert(convertIncompatiblePattern("yyyy-MM-dd uuuu") === "uuuu-MM-dd eeee") assert(convertIncompatiblePattern("yyyy-MM-dd EEEE") === "uuuu-MM-dd EEEE") diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql new file mode 100644 index 000000000000..0b2386070a9b --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql @@ -0,0 +1,2 @@ +--SET spark.sql.legacy.timeParserPolicy=CORRECTED +--IMPORT datetime.sql diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime-legacy.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime-legacy.sql new file mode 100644 index 000000000000..daec2b40a620 --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/inputs/datetime-legacy.sql @@ -0,0 +1,2 @@ +--SET spark.sql.legacy.timeParserPolicy=LEGACY +--IMPORT datetime.sql diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql index 58cac765e85a..b829fc2d95c7 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql @@ -131,4 +131,9 @@ select date_format(date '2020-05-23', 'EEEEE'); select date_format(date '2020-05-23', 'uuuuu'); select date_format(date '2020-05-23', 'QQQQQ'); select date_format(date '2020-05-23', 'qqqqq'); - +select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG'); +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE'); +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE'); +select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE'); +select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')); +select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')); diff --git 
a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 51995636e143..74171b6d6c9f 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 92 +-- Number of queries: 98 -- !query @@ -737,8 +737,8 @@ select date_format(date '2020-05-23', 'GGGGG') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: G +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -746,8 +746,8 @@ select date_format(date '2020-05-23', 'MMMMM') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: M +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -755,8 +755,8 @@ select date_format(date '2020-05-23', 'LLLLL') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: L +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -764,8 +764,8 @@ select date_format(date '2020-05-23', 'EEEEE') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: E +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -773,8 +773,8 @@ select date_format(date '2020-05-23', 'uuuuu') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: u +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -782,8 +782,8 @@ select date_format(date '2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: Q +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -791,5 +791,59 @@ select date_format(date '2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: q +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out new file mode 100644 index 000000000000..f9bcceb17dd9 --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out @@ -0,0 +1,821 @@ +-- Automatically generated by SQLQueryTestSuite +-- Number of queries: 98 + + +-- !query +select current_date = current_date(), current_timestamp = current_timestamp() +-- !query schema +struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean> +-- !query output +true true + + +-- !query +select to_date(null), to_date('2016-12-31'), to_date('2016-12-31', 'yyyy-MM-dd') +-- !query schema +struct +-- !query output +NULL 2016-12-31 2016-12-31 + + +-- !query +select to_timestamp(null), to_timestamp('2016-12-31 00:12:00'), to_timestamp('2016-12-31', 'yyyy-MM-dd') +-- !query schema +struct +-- !query output +NULL 2016-12-31 00:12:00 2016-12-31 00:00:00 + + +-- !query +select dayofweek('2007-02-03'), dayofweek('2009-07-30'), dayofweek('2017-05-27'), dayofweek(null), dayofweek('1582-10-15 13:10:15') +-- !query schema +struct +-- !query output +7 5 7 NULL 6 + + +-- !query +create temporary view ttf1 as select * from values + (1, 2), + (2, 3) + as ttf1(current_date, current_timestamp) +-- !query schema +struct<> +-- !query output + + + +-- !query +select current_date, current_timestamp from ttf1 +-- !query schema +struct +-- !query output +1 2 +2 3 + + +-- !query +create temporary view ttf2 as select * from values + (1, 2), + (2, 3) + as ttf2(a, b) +-- !query schema +struct<> +-- !query output + + + +-- !query +select current_date = current_date(), current_timestamp = current_timestamp(), a, b from ttf2 +-- !query schema +struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean,a:int,b:int> +-- !query output +true true 1 2 +true true 2 3 + + +-- !query +select a, b from ttf2 order by a, current_date +-- !query schema +struct +-- !query output +1 2 +2 3 + + +-- !query +select weekday('2007-02-03'), weekday('2009-07-30'), weekday('2017-05-27'), weekday(null), weekday('1582-10-15 13:10:15') +-- !query schema +struct +-- !query output +5 3 5 NULL 4 + + +-- !query +select year('1500-01-01'), 
month('1500-01-01'), dayOfYear('1500-01-01') +-- !query schema +struct +-- !query output +1500 1 1 + + +-- !query +select date '2019-01-01\t' +-- !query schema +struct +-- !query output +2019-01-01 + + +-- !query +select timestamp '2019-01-01\t' +-- !query schema +struct +-- !query output +2019-01-01 00:00:00 + + +-- !query +select timestamp'2011-11-11 11:11:11' + interval '2' day +-- !query schema +struct +-- !query output +2011-11-13 11:11:11 + + +-- !query +select timestamp'2011-11-11 11:11:11' - interval '2' day +-- !query schema +struct +-- !query output +2011-11-09 11:11:11 + + +-- !query +select date'2011-11-11 11:11:11' + interval '2' second +-- !query schema +struct +-- !query output +2011-11-11 + + +-- !query +select date'2011-11-11 11:11:11' - interval '2' second +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select '2011-11-11' - interval '2' day +-- !query schema +struct +-- !query output +2011-11-09 00:00:00 + + +-- !query +select '2011-11-11 11:11:11' - interval '2' second +-- !query schema +struct +-- !query output +2011-11-11 11:11:09 + + +-- !query +select '1' - interval '2' second +-- !query schema +struct +-- !query output +NULL + + +-- !query +select 1 - interval '2' second +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve '1 + (- INTERVAL '2 seconds')' due to data type mismatch: argument 1 requires timestamp type, however, '1' is of int type.; line 1 pos 7 + + +-- !query +select date'2020-01-01' - timestamp'2019-10-06 10:11:12.345678' +-- !query schema +struct +-- !query output +2078 hours 48 minutes 47.654322 seconds + + +-- !query +select timestamp'2019-10-06 10:11:12.345678' - date'2020-01-01' +-- !query schema +struct +-- !query output +-2078 hours -48 minutes -47.654322 seconds + + +-- !query +select timestamp'2019-10-06 10:11:12.345678' - null +-- !query schema +struct +-- !query output +NULL + + +-- !query +select null - timestamp'2019-10-06 10:11:12.345678' +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date_add('2011-11-11', 1Y) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1S) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1L) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1L)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1L' is of bigint type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', 1.0) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1.0BD)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1.0BD' is of decimal(2,1) type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', 1E1) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', '1') +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', '1.2') +-- !query schema +struct<> +-- !query 
output +org.apache.spark.sql.AnalysisException +The second argument of 'date_add' function needs to be an integer.; + + +-- !query +select date_add(date'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add(timestamp'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_sub(date'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(date'2011-11-11', '1') +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(date'2011-11-11', '1.2') +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +The second argument of 'date_sub' function needs to be an integer.; + + +-- !query +select date_sub(timestamp'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(null, 1) +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date_sub(date'2011-11-11', null) +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date'2011-11-11' + 1E1 +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(DATE '2011-11-11', 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 + + +-- !query +select date'2011-11-11' + '1' +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(DATE '2011-11-11', CAST('1' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('1' AS DOUBLE)' is of double type.; line 1 pos 7 + + +-- !query +select null + date '2001-09-28' +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date '2001-09-28' + 7Y +-- !query schema +struct +-- !query output +2001-10-05 + + +-- !query +select 7S + date '2001-09-28' +-- !query schema +struct +-- !query output +2001-10-05 + + +-- !query +select date '2001-10-01' - 7 +-- !query schema +struct +-- !query output +2001-09-24 + + +-- !query +select date '2001-10-01' - '7' +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_sub(DATE '2001-10-01', CAST('7' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('7' AS DOUBLE)' is of double type.; line 1 pos 7 + + +-- !query +select date '2001-09-28' + null +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date '2001-09-28' - null +-- !query schema +struct +-- !query output +NULL + + +-- !query +create temp view v as select '1' str +-- !query schema +struct<> +-- !query output + + + +-- !query +select date_add('2011-11-11', str) from v +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 + + +-- !query +select date_sub('2011-11-11', str) from v +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_sub(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 + + +-- !query +select null - date '2019-10-06' +-- !query schema +struct +-- 
!query output +NULL + + +-- !query +select date '2001-10-01' - date '2001-09-28' +-- !query schema +struct +-- !query output +3 days + + +-- !query +select to_timestamp('2019-10-06 10:11:12.', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.0', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.1 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.12 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123UTC', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 03:11:12.123 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.1234 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12345CST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 08:11:12.12345 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123456PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.123456 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234567PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('123456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.123456 + + +-- !query +select to_timestamp('223456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.[SSSSSS]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.1234 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.123 + + +-- !query +select to_timestamp('2019-10-06 10:11:12', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.12 + + +-- !query +select to_timestamp('2019-10-06 10:11', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') +-- !query schema +struct +-- !query output +2019-10-06 10:11:00 + + +-- !query +select to_timestamp("2019-10-06S10:11:12.12345", "yyyy-MM-dd'S'HH:mm:ss.SSSSSS") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.12345 + + +-- !query +select to_timestamp("12.12342019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.1234 + + +-- !query +select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.1234019-10-06S10:11", "ss.SSSSy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +0019-10-06 10:11:12.1234 + + +-- !query 
+select to_timestamp("2019-10-06S", "yyyy-MM-dd'S'") +-- !query schema +struct +-- !query output +2019-10-06 00:00:00 + + +-- !query +select to_timestamp("S2019-10-06", "'S'yyyy-MM-dd") +-- !query schema +struct +-- !query output +2019-10-06 00:00:00 + + +-- !query +select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character: e + + +-- !query +select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character: c + + +-- !query +select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuuu') +-- !query schema +struct +-- !query output +2019-10-06 Sunday + + +-- !query +select to_timestamp("2019-10-06T10:11:12'12", "yyyy-MM-dd'T'HH:mm:ss''SSSS") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.12 + + +-- !query +select to_timestamp("2019-10-06T10:11:12'", "yyyy-MM-dd'T'HH:mm:ss''") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp("'2019-10-06T10:11:12", "''yyyy-MM-dd'T'HH:mm:ss") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select date_format(date '2020-05-23', 'GGGGG') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'MMMMM') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'LLLLL') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'uuuuu') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'QQQQQ') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select date_format(date '2020-05-23', 'qqqqq') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out new file mode 100644 index 000000000000..acdbb74fe388 --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out @@ -0,0 +1,810 @@ +-- Automatically generated by SQLQueryTestSuite +-- Number of queries: 98 + + +-- !query +select current_date = current_date(), current_timestamp = current_timestamp() +-- !query schema +struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean> +-- !query output +true true + + +-- !query +select to_date(null), to_date('2016-12-31'), to_date('2016-12-31', 'yyyy-MM-dd') +-- !query schema +struct +-- !query output +NULL 2016-12-31 2016-12-31 + + +-- !query +select to_timestamp(null), to_timestamp('2016-12-31 00:12:00'), to_timestamp('2016-12-31', 'yyyy-MM-dd') +-- !query schema +struct +-- !query output +NULL 2016-12-31 00:12:00 2016-12-31 00:00:00 + + +-- !query +select dayofweek('2007-02-03'), dayofweek('2009-07-30'), dayofweek('2017-05-27'), dayofweek(null), dayofweek('1582-10-15 13:10:15') +-- !query schema +struct +-- !query output +7 5 7 NULL 6 + + +-- !query +create temporary view ttf1 as select * from values + (1, 2), + (2, 3) + as ttf1(current_date, current_timestamp) +-- !query schema +struct<> +-- !query output + + + +-- !query +select current_date, current_timestamp from ttf1 +-- !query schema +struct +-- !query output +1 2 +2 3 + + +-- !query +create temporary view ttf2 as select * from values + (1, 2), + (2, 3) + as ttf2(a, b) +-- !query schema +struct<> +-- !query output + + + +-- !query +select current_date = current_date(), current_timestamp = current_timestamp(), a, b from ttf2 +-- !query schema +struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean,a:int,b:int> +-- !query output +true true 1 2 +true true 2 3 + + +-- !query +select a, b from ttf2 order by a, current_date +-- !query schema +struct +-- !query output +1 2 +2 3 + + +-- !query +select weekday('2007-02-03'), weekday('2009-07-30'), weekday('2017-05-27'), weekday(null), weekday('1582-10-15 13:10:15') +-- !query schema +struct +-- !query output +5 3 5 NULL 4 + + +-- !query +select year('1500-01-01'), 
month('1500-01-01'), dayOfYear('1500-01-01') +-- !query schema +struct +-- !query output +1500 1 1 + + +-- !query +select date '2019-01-01\t' +-- !query schema +struct +-- !query output +2019-01-01 + + +-- !query +select timestamp '2019-01-01\t' +-- !query schema +struct +-- !query output +2019-01-01 00:00:00 + + +-- !query +select timestamp'2011-11-11 11:11:11' + interval '2' day +-- !query schema +struct +-- !query output +2011-11-13 11:11:11 + + +-- !query +select timestamp'2011-11-11 11:11:11' - interval '2' day +-- !query schema +struct +-- !query output +2011-11-09 11:11:11 + + +-- !query +select date'2011-11-11 11:11:11' + interval '2' second +-- !query schema +struct +-- !query output +2011-11-11 + + +-- !query +select date'2011-11-11 11:11:11' - interval '2' second +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select '2011-11-11' - interval '2' day +-- !query schema +struct +-- !query output +2011-11-09 00:00:00 + + +-- !query +select '2011-11-11 11:11:11' - interval '2' second +-- !query schema +struct +-- !query output +2011-11-11 11:11:09 + + +-- !query +select '1' - interval '2' second +-- !query schema +struct +-- !query output +NULL + + +-- !query +select 1 - interval '2' second +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve '1 + (- INTERVAL '2 seconds')' due to data type mismatch: argument 1 requires timestamp type, however, '1' is of int type.; line 1 pos 7 + + +-- !query +select date'2020-01-01' - timestamp'2019-10-06 10:11:12.345678' +-- !query schema +struct +-- !query output +2078 hours 48 minutes 47.654322 seconds + + +-- !query +select timestamp'2019-10-06 10:11:12.345678' - date'2020-01-01' +-- !query schema +struct +-- !query output +-2078 hours -48 minutes -47.654322 seconds + + +-- !query +select timestamp'2019-10-06 10:11:12.345678' - null +-- !query schema +struct +-- !query output +NULL + + +-- !query +select null - timestamp'2019-10-06 10:11:12.345678' +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date_add('2011-11-11', 1Y) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1S) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', 1L) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1L)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1L' is of bigint type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', 1.0) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1.0BD)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1.0BD' is of decimal(2,1) type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', 1E1) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 + + +-- !query +select date_add('2011-11-11', '1') +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add('2011-11-11', '1.2') +-- !query schema +struct<> +-- !query 
output +org.apache.spark.sql.AnalysisException +The second argument of 'date_add' function needs to be an integer.; + + +-- !query +select date_add(date'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_add(timestamp'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-12 + + +-- !query +select date_sub(date'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(date'2011-11-11', '1') +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(date'2011-11-11', '1.2') +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +The second argument of 'date_sub' function needs to be an integer.; + + +-- !query +select date_sub(timestamp'2011-11-11', 1) +-- !query schema +struct +-- !query output +2011-11-10 + + +-- !query +select date_sub(null, 1) +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date_sub(date'2011-11-11', null) +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date'2011-11-11' + 1E1 +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(DATE '2011-11-11', 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 + + +-- !query +select date'2011-11-11' + '1' +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(DATE '2011-11-11', CAST('1' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('1' AS DOUBLE)' is of double type.; line 1 pos 7 + + +-- !query +select null + date '2001-09-28' +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date '2001-09-28' + 7Y +-- !query schema +struct +-- !query output +2001-10-05 + + +-- !query +select 7S + date '2001-09-28' +-- !query schema +struct +-- !query output +2001-10-05 + + +-- !query +select date '2001-10-01' - 7 +-- !query schema +struct +-- !query output +2001-09-24 + + +-- !query +select date '2001-10-01' - '7' +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_sub(DATE '2001-10-01', CAST('7' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('7' AS DOUBLE)' is of double type.; line 1 pos 7 + + +-- !query +select date '2001-09-28' + null +-- !query schema +struct +-- !query output +NULL + + +-- !query +select date '2001-09-28' - null +-- !query schema +struct +-- !query output +NULL + + +-- !query +create temp view v as select '1' str +-- !query schema +struct<> +-- !query output + + + +-- !query +select date_add('2011-11-11', str) from v +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_add(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 + + +-- !query +select date_sub('2011-11-11', str) from v +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +cannot resolve 'date_sub(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 + + +-- !query +select null - date '2019-10-06' +-- !query schema +struct +-- 
!query output +NULL + + +-- !query +select date '2001-10-01' - date '2001-09-28' +-- !query schema +struct +-- !query output +3 days + + +-- !query +select to_timestamp('2019-10-06 10:11:12.', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.0', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123UTC', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12345CST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123456PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234567PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('123456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('223456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.[SSSSSS]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.123', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('2019-10-06 10:11', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("2019-10-06S10:11:12.12345", "yyyy-MM-dd'S'HH:mm:ss.SSSSSS") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.12342019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("12.1234019-10-06S10:11", "ss.SSSSy-MM-dd'S'HH:mm") +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp("2019-10-06S", "yyyy-MM-dd'S'") +-- !query schema +struct +-- !query output +2019-10-06 00:00:00 + + +-- !query +select to_timestamp("S2019-10-06", "'S'yyyy-MM-dd") +-- !query schema +struct +-- !query output +2019-10-06 00:00:00 + + +-- !query +select date_format(timestamp 
'2019-10-06', 'yyyy-MM-dd uuee') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character 'e' + + +-- !query +select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character 'c' + + +-- !query +select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuuu') +-- !query schema +struct +-- !query output +2019-10-06 0007 + + +-- !query +select to_timestamp("2019-10-06T10:11:12'12", "yyyy-MM-dd'T'HH:mm:ss''SSSS") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12.012 + + +-- !query +select to_timestamp("2019-10-06T10:11:12'", "yyyy-MM-dd'T'HH:mm:ss''") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp("'2019-10-06T10:11:12", "''yyyy-MM-dd'T'HH:mm:ss") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss") +-- !query schema +struct +-- !query output +2019-10-06 10:11:12 + + +-- !query +select date_format(date '2020-05-23', 'GGGGG') +-- !query schema +struct +-- !query output +AD + + +-- !query +select date_format(date '2020-05-23', 'MMMMM') +-- !query schema +struct +-- !query output +May + + +-- !query +select date_format(date '2020-05-23', 'LLLLL') +-- !query schema +struct +-- !query output +May + + +-- !query +select date_format(date '2020-05-23', 'EEEEE') +-- !query schema +struct +-- !query output +Saturday + + +-- !query +select date_format(date '2020-05-23', 'uuuuu') +-- !query schema +struct +-- !query output +00006 + + +-- !query +select date_format(date '2020-05-23', 'QQQQQ') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character 'Q' + + +-- !query +select date_format(date '2020-05-23', 'qqqqq') +-- !query schema +struct<> +-- !query output +java.lang.IllegalArgumentException +Illegal pattern character 'q' + + +-- !query +select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') +-- !query schema +struct +-- !query output +2020-05-22 00:00:00 + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct +-- !query output +2020-05-22 00:00:00 + + +-- !query +select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct +-- !query output +1590130800 + + +-- !query +select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct> +-- !query output +{"time":2015-10-26 00:00:00} + + +-- !query +select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct> +-- !query output +{"time":2015-10-26 00:00:00} diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index d578d31a865e..f9bcceb17dd9 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 92 +-- Number of queries: 98 -- !query @@ -709,8 +709,8 @@ select date_format(date '2020-05-23', 'GGGGG') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: 
G +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -718,8 +718,8 @@ select date_format(date '2020-05-23', 'MMMMM') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: M +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -727,8 +727,8 @@ select date_format(date '2020-05-23', 'LLLLL') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: L +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -736,8 +736,8 @@ select date_format(date '2020-05-23', 'EEEEE') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: E +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -745,8 +745,8 @@ select date_format(date '2020-05-23', 'uuuuu') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: u +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -754,8 +754,8 @@ select date_format(date '2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: Q +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -763,5 +763,59 @@ select date_format(date '2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output -java.lang.IllegalArgumentException -Too many pattern letters: q +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html From d5b5a9cb43397841abe56a428b800c7b8f7fcc4e Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Thu, 21 May 2020 18:27:26 +0800 Subject: [PATCH 03/15] address comments --- .../sql/catalyst/util/DateFormatter.scala | 25 +++++++++++++++--- .../util/DateTimeFormatterHelper.scala | 7 +---- .../catalyst/util/TimestampFormatter.scala | 26 ++++++++++++++++--- .../sql-tests/results/ansi/datetime.sql.out | 20 +++++++------- .../results/datetime-corrected.sql.out | 20 +++++++------- .../sql-tests/results/datetime.sql.out | 20 +++++++------- 6 files changed, 75 insertions(+), 43 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala index 0f79c1a6a751..60b69ef04c43 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala @@ -21,8 +21,11 @@ import java.text.SimpleDateFormat import java.time.{LocalDate, ZoneId} import java.util.{Date, Locale} +import scala.util.control.NonFatal + import org.apache.commons.lang3.time.FastDateFormat +import org.apache.spark.SparkUpgradeException import org.apache.spark.sql.catalyst.util.DateTimeUtils._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.internal.SQLConf.LegacyBehaviorPolicy._ @@ -40,7 +43,23 @@ class Iso8601DateFormatter( extends DateFormatter with DateTimeFormatterHelper { @transient - private lazy val formatter = getOrCreateFormatter(pattern, locale) + private lazy val formatter = { + try { + getOrCreateFormatter(pattern, locale) + } catch { + case e: IllegalArgumentException => + try { + legacyFormatter + } catch { + case NonFatal(_) => throw e + } + throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + + s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + + s" restore the behavior before Spark 3.0." 
+ + s" 2) You can form a valid datetime pattern with the guide from" + + s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) + } + } @transient private lazy val legacyFormatter = DateFormatter.getLegacyFormatter( @@ -77,14 +96,14 @@ trait LegacyDateFormatter extends DateFormatter { class LegacyFastDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter { @transient - private lazy val fdf = FastDateFormat.getInstance(pattern, locale) + private val fdf = FastDateFormat.getInstance(pattern, locale) override def parseToDate(s: String): Date = fdf.parse(s) override def formatDate(d: Date): String = fdf.format(d) } class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter { @transient - private lazy val sdf = new SimpleDateFormat(pattern, locale) + private val sdf = new SimpleDateFormat(pattern, locale) override def parseToDate(s: String): Date = sdf.parse(s) override def formatDate(d: Date): String = sdf.format(d) } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index ecb89963fade..28f1faf63e6b 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -187,12 +187,7 @@ private object DateTimeFormatterHelper { throw new IllegalArgumentException(s"Illegal pattern character: $c") } for (style <- unsupportedNarrowTextStyle if patternPart.contains(style)) { - val e = new IllegalArgumentException(s"Too many pattern letters: ${style.head}") - throw new SparkUpgradeException("3.0", s"Fail to recognize '$style' pattern in the" + - s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + - s" restore the behavior before Spark 3.0." + - s" 2) You can form a valid datetime pattern with the guide from" + - s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) + throw new IllegalArgumentException(s"Too many pattern letters: ${style.head}") } // The meaning of 'u' was day number of week in SimpleDateFormat, it was changed to year // in DateTimeFormatter. 
Substitute 'u' to 'e' and use DateTimeFormatter to parse the diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala index dc06fa9d6f1c..67b9d86632d0 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala @@ -26,8 +26,11 @@ import java.time.temporal.TemporalQueries import java.util.{Calendar, GregorianCalendar, Locale, TimeZone} import java.util.concurrent.TimeUnit.SECONDS +import scala.util.control.NonFatal + import org.apache.commons.lang3.time.FastDateFormat +import org.apache.spark.SparkUpgradeException import org.apache.spark.sql.catalyst.util.DateTimeConstants._ import org.apache.spark.sql.catalyst.util.DateTimeUtils._ import org.apache.spark.sql.catalyst.util.LegacyDateFormats.{LegacyDateFormat, LENIENT_SIMPLE_DATE_FORMAT} @@ -61,8 +64,23 @@ class Iso8601TimestampFormatter( needVarLengthSecondFraction: Boolean) extends TimestampFormatter with DateTimeFormatterHelper { @transient - protected lazy val formatter: DateTimeFormatter = - getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) + protected lazy val formatter: DateTimeFormatter = { + try { + getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) + } catch { + case e: IllegalArgumentException => + try { + legacyFormatter + } catch { + case NonFatal(_) => throw e + } + throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + + s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + + s" restore the behavior before Spark 3.0." + + s" 2) You can form a valid datetime pattern with the guide from" + + s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) + } + } @transient protected lazy val legacyFormatter = TimestampFormatter.getLegacyFormatter( @@ -143,7 +161,7 @@ class LegacyFastTimestampFormatter( zoneId: ZoneId, locale: Locale) extends TimestampFormatter { - @transient private lazy val fastDateFormat = + @transient private val fastDateFormat = FastDateFormat.getInstance(pattern, TimeZone.getTimeZone(zoneId), locale) @transient private lazy val cal = new MicrosCalendar( fastDateFormat.getTimeZone, @@ -173,7 +191,7 @@ class LegacySimpleTimestampFormatter( zoneId: ZoneId, locale: Locale, lenient: Boolean = true) extends TimestampFormatter { - @transient private lazy val sdf = { + @transient private val sdf = { val formatter = new SimpleDateFormat(pattern, locale) formatter.setTimeZone(TimeZone.getTimeZone(zoneId)) formatter.setLenient(lenient) diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 74171b6d6c9f..179baf7db479 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -782,8 +782,8 @@ select date_format(date '2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: Q -- !query @@ -791,8 +791,8 @@ select date_format(date '2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: q -- !query @@ -801,7 +801,7 @@ select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -810,7 +810,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -819,7 +819,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -828,7 +828,7 @@ select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -837,7 +837,7 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -846,4 +846,4 @@ select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/ struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out index f9bcceb17dd9..2c29ab16c92b 100644 --- a/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out @@ -754,8 +754,8 @@ select date_format(date '2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: Q -- !query @@ -763,8 +763,8 @@ select date_format(date '2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: q -- !query @@ -773,7 +773,7 @@ select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -782,7 +782,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -791,7 +791,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -800,7 +800,7 @@ select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -809,7 +809,7 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -818,4 +818,4 @@ select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/ struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index f9bcceb17dd9..2c29ab16c92b 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -754,8 +754,8 @@ select date_format(date '2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'QQQQQ' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: Q -- !query @@ -763,8 +763,8 @@ select date_format(date '2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'qqqqq' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +java.lang.IllegalArgumentException +Too many pattern letters: q -- !query @@ -773,7 +773,7 @@ select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -782,7 +782,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -791,7 +791,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -800,7 +800,7 @@ select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -809,7 +809,7 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -818,4 +818,4 @@ select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/ struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html From 1d31ed2fcd4da8baedc324af75d851ea3885e3da Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Thu, 21 May 2020 18:29:02 +0800 Subject: [PATCH 04/15] fix test --- .../sql/catalyst/util/DateTimeFormatterHelperSuite.scala | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala index ed38194fd619..caf7bdde1012 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelperSuite.scala @@ -41,14 +41,14 @@ class DateTimeFormatterHelperSuite extends SparkFunSuite { assert(e.getMessage === s"Illegal pattern character: $l") } unsupportedNarrowTextStyle.foreach { style => - val e1 = intercept[SparkUpgradeException] { + val e1 = intercept[IllegalArgumentException] { convertIncompatiblePattern(s"yyyy-MM-dd $style") } - assert(e1.getCause.getMessage === s"Too many pattern letters: ${style.head}") - val e2 = intercept[SparkUpgradeException] { + assert(e1.getMessage === s"Too many pattern letters: ${style.head}") + val e2 = intercept[IllegalArgumentException] { convertIncompatiblePattern(s"yyyy-MM-dd $style${style.head}") } - assert(e2.getCause.getMessage === s"Too many pattern letters: ${style.head}") + assert(e2.getMessage === s"Too many pattern letters: ${style.head}") } assert(convertIncompatiblePattern("yyyy-MM-dd uuuu") === "uuuu-MM-dd eeee") assert(convertIncompatiblePattern("yyyy-MM-dd EEEE") === "uuuu-MM-dd EEEE") From 1144c035fc55b78e004dd0d39c9b7df9a638e74e Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Thu, 21 May 2020 19:21:10 +0800 Subject: [PATCH 05/15] fix tests --- .../spark/sql/catalyst/expressions/DateExpressionsSuite.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala index 6e8397d12da7..12f0515c68d2 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala @@ -267,7 +267,7 @@ class DateExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { // Test escaping of format GenerateUnsafeProjection.generate( - DateFormatClass(Literal(ts), Literal("\"quote"), JST_OPT) :: Nil) + DateFormatClass(Literal(ts), Literal("\""), JST_OPT) :: Nil) // SPARK-28072 The codegen path should work checkEvaluation( From 549a1225412756a2492c4c36191d8f18b845a23f Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Thu, 21 May 2020 22:41:39 +0800 Subject: [PATCH 06/15] refine --- .../sql/catalyst/util/DateFormatter.scala | 24 +- .../util/DateTimeFormatterHelper.scala | 19 + .../catalyst/util/TimestampFormatter.scala | 22 +- .../sql/util/TimestampFormatterSuite.scala | 11 + .../sql-tests/inputs/datetime-corrected.sql | 2 - .../sql-tests/results/ansi/datetime.sql.out | 8 +- .../results/datetime-corrected.sql.out | 821 ------------------ .../sql-tests/results/datetime.sql.out | 8 +- .../apache/spark/sql/DateFunctionsSuite.scala | 17 +- 9 files changed, 58 insertions(+), 874 
deletions(-) delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala index 60b69ef04c43..28af9dccfabc 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala @@ -19,13 +19,11 @@ package org.apache.spark.sql.catalyst.util import java.text.SimpleDateFormat import java.time.{LocalDate, ZoneId} +import java.time.format.DateTimeFormatter import java.util.{Date, Locale} -import scala.util.control.NonFatal - import org.apache.commons.lang3.time.FastDateFormat -import org.apache.spark.SparkUpgradeException import org.apache.spark.sql.catalyst.util.DateTimeUtils._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.internal.SQLConf.LegacyBehaviorPolicy._ @@ -43,22 +41,10 @@ class Iso8601DateFormatter( extends DateFormatter with DateTimeFormatterHelper { @transient - private lazy val formatter = { + private lazy val formatter: DateTimeFormatter = { try { getOrCreateFormatter(pattern, locale) - } catch { - case e: IllegalArgumentException => - try { - legacyFormatter - } catch { - case NonFatal(_) => throw e - } - throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + - s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + - s" restore the behavior before Spark 3.0." + - s" 2) You can form a valid datetime pattern with the guide from" + - s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) - } + } catch checkLegacyFormatter(pattern, legacyFormatter.format(0)) } @transient @@ -96,14 +82,14 @@ trait LegacyDateFormatter extends DateFormatter { class LegacyFastDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter { @transient - private val fdf = FastDateFormat.getInstance(pattern, locale) + private lazy val fdf = FastDateFormat.getInstance(pattern, locale) override def parseToDate(s: String): Date = fdf.parse(s) override def formatDate(d: Date): String = fdf.format(d) } class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter { @transient - private val sdf = new SimpleDateFormat(pattern, locale) + private lazy val sdf = new SimpleDateFormat(pattern, locale) override def parseToDate(s: String): Date = sdf.parse(s) override def formatDate(d: Date): String = sdf.format(d) } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 28f1faf63e6b..0365e93f99ef 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -86,6 +86,25 @@ trait DateTimeFormatterHelper { throw e } } + + // When the new DateTimeFormatter fails to initialize because of an invalid datetime pattern, it + // will throw IllegalArgumentException. If the pattern can be recognized by the legacy formatter, + // it will raise SparkUpgradeException to tell users to restore the previous behavior via the + // LEGACY policy or follow our guide to correct their pattern.
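+  // For illustration (an editor's sketch mirroring the callers elsewhere in this patch, not an + // additional API), a formatter wires this helper in as the catch handler of its initialization: + //   private lazy val formatter: DateTimeFormatter = + //     try { + //       getOrCreateFormatter(pattern, locale) + //     } catch checkLegacyFormatter(pattern, legacyFormatter.format(0)) + // The second argument is passed by name, so the legacy formatter is only exercised once the new + // parser has rejected the pattern; if the legacy formatter rejects it as well, the original + // IllegalArgumentException is rethrown.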
+  protected def checkLegacyFormatter( +      pattern: String, +      tryLegacyFormatter: => Unit): PartialFunction[Throwable, DateTimeFormatter] = { +    case e: IllegalArgumentException => +      try { +        tryLegacyFormatter +      } catch { +        case _: Throwable => throw e +      } +      throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + +        s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + +        s" restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with" + +        s" the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) +  } } private object DateTimeFormatterHelper { diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala index 67b9d86632d0..ef661b0ace25 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala @@ -26,11 +26,8 @@ import java.time.temporal.TemporalQueries import java.util.{Calendar, GregorianCalendar, Locale, TimeZone} import java.util.concurrent.TimeUnit.SECONDS -import scala.util.control.NonFatal - import org.apache.commons.lang3.time.FastDateFormat -import org.apache.spark.SparkUpgradeException import org.apache.spark.sql.catalyst.util.DateTimeConstants._ import org.apache.spark.sql.catalyst.util.DateTimeUtils._ import org.apache.spark.sql.catalyst.util.LegacyDateFormats.{LegacyDateFormat, LENIENT_SIMPLE_DATE_FORMAT} @@ -63,23 +60,12 @@ class Iso8601TimestampFormatter( legacyFormat: LegacyDateFormat = LENIENT_SIMPLE_DATE_FORMAT, needVarLengthSecondFraction: Boolean) extends TimestampFormatter with DateTimeFormatterHelper { + @transient protected lazy val formatter: DateTimeFormatter = { try { getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) - } catch { - case e: IllegalArgumentException => - try { - legacyFormatter - } catch { - case NonFatal(_) => throw e - } - throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + - s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + - s" restore the behavior before Spark 3.0."
+ -        s" 2) You can form a valid datetime pattern with the guide from" + -        s" https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) + -    } + } catch checkLegacyFormatter(pattern, legacyFormatter.format(0)) } @transient protected lazy val legacyFormatter = TimestampFormatter.getLegacyFormatter( @@ -161,7 +147,7 @@ class LegacyFastTimestampFormatter( zoneId: ZoneId, locale: Locale) extends TimestampFormatter { - @transient private val fastDateFormat = + @transient private lazy val fastDateFormat = FastDateFormat.getInstance(pattern, TimeZone.getTimeZone(zoneId), locale) @transient private lazy val cal = new MicrosCalendar( fastDateFormat.getTimeZone, @@ -191,7 +177,7 @@ class LegacySimpleTimestampFormatter( zoneId: ZoneId, locale: Locale, lenient: Boolean = true) extends TimestampFormatter { - @transient private val sdf = { + @transient private lazy val sdf = { val formatter = new SimpleDateFormat(pattern, locale) formatter.setTimeZone(TimeZone.getTimeZone(zoneId)) formatter.setLenient(lenient) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala index 5d27a6b8cce1..309ff5ebd023 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala @@ -291,4 +291,15 @@ class TimestampFormatterSuite extends SparkFunSuite with SQLHelper with Matchers } } } + + test("explicitly forbidden datetime patterns") { + // not supported by the legacy formatter either + Seq("QQQQQ", "qqqqq", "A", "c", "e", "n", "N", "p").foreach { pattern => + intercept[IllegalArgumentException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0)) + } + // supported by the legacy formatter, so we raise SparkUpgradeException to guide users to a valid pattern + Seq("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu").foreach { pattern => + intercept[SparkUpgradeException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0)) + } + } } diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql deleted file mode 100644 index 0b2386070a9b..000000000000 --- a/sql/core/src/test/resources/sql-tests/inputs/datetime-corrected.sql +++ /dev/null @@ -1,2 +0,0 @@ ---SET spark.sql.legacy.timeParserPolicy=CORRECTED ---IMPORT datetime.sql diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 179baf7db479..8700f5d45891 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -680,7 +680,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character: e +Illegal pattern character 'e' -- !query @@ -689,7 +689,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character: c +Illegal pattern character 'c' -- !query @@ -783,7 +783,7 @@ select date_format(date '2020-05-23', 'QQQQQ') struct<> -- !query output java.lang.IllegalArgumentException -Too many pattern letters: Q +Illegal pattern character 'Q' -- !query @@ -792,7 +792,7 @@ select date_format(date '2020-05-23', 'qqqqq') struct<> -- !query output java.lang.IllegalArgumentException -Too many pattern letters: q +Illegal pattern character 'q' -- !query diff --git
a/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out deleted file mode 100644 index 2c29ab16c92b..000000000000 --- a/sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out +++ /dev/null @@ -1,821 +0,0 @@ --- Automatically generated by SQLQueryTestSuite --- Number of queries: 98 - - --- !query -select current_date = current_date(), current_timestamp = current_timestamp() --- !query schema -struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean> --- !query output -true true - - --- !query -select to_date(null), to_date('2016-12-31'), to_date('2016-12-31', 'yyyy-MM-dd') --- !query schema -struct --- !query output -NULL 2016-12-31 2016-12-31 - - --- !query -select to_timestamp(null), to_timestamp('2016-12-31 00:12:00'), to_timestamp('2016-12-31', 'yyyy-MM-dd') --- !query schema -struct --- !query output -NULL 2016-12-31 00:12:00 2016-12-31 00:00:00 - - --- !query -select dayofweek('2007-02-03'), dayofweek('2009-07-30'), dayofweek('2017-05-27'), dayofweek(null), dayofweek('1582-10-15 13:10:15') --- !query schema -struct --- !query output -7 5 7 NULL 6 - - --- !query -create temporary view ttf1 as select * from values - (1, 2), - (2, 3) - as ttf1(current_date, current_timestamp) --- !query schema -struct<> --- !query output - - - --- !query -select current_date, current_timestamp from ttf1 --- !query schema -struct --- !query output -1 2 -2 3 - - --- !query -create temporary view ttf2 as select * from values - (1, 2), - (2, 3) - as ttf2(a, b) --- !query schema -struct<> --- !query output - - - --- !query -select current_date = current_date(), current_timestamp = current_timestamp(), a, b from ttf2 --- !query schema -struct<(current_date() = current_date()):boolean,(current_timestamp() = current_timestamp()):boolean,a:int,b:int> --- !query output -true true 1 2 -true true 2 3 - - --- !query -select a, b from ttf2 order by a, current_date --- !query schema -struct --- !query output -1 2 -2 3 - - --- !query -select weekday('2007-02-03'), weekday('2009-07-30'), weekday('2017-05-27'), weekday(null), weekday('1582-10-15 13:10:15') --- !query schema -struct --- !query output -5 3 5 NULL 4 - - --- !query -select year('1500-01-01'), month('1500-01-01'), dayOfYear('1500-01-01') --- !query schema -struct --- !query output -1500 1 1 - - --- !query -select date '2019-01-01\t' --- !query schema -struct --- !query output -2019-01-01 - - --- !query -select timestamp '2019-01-01\t' --- !query schema -struct --- !query output -2019-01-01 00:00:00 - - --- !query -select timestamp'2011-11-11 11:11:11' + interval '2' day --- !query schema -struct --- !query output -2011-11-13 11:11:11 - - --- !query -select timestamp'2011-11-11 11:11:11' - interval '2' day --- !query schema -struct --- !query output -2011-11-09 11:11:11 - - --- !query -select date'2011-11-11 11:11:11' + interval '2' second --- !query schema -struct --- !query output -2011-11-11 - - --- !query -select date'2011-11-11 11:11:11' - interval '2' second --- !query schema -struct --- !query output -2011-11-10 - - --- !query -select '2011-11-11' - interval '2' day --- !query schema -struct --- !query output -2011-11-09 00:00:00 - - --- !query -select '2011-11-11 11:11:11' - interval '2' second --- !query schema -struct --- !query output -2011-11-11 11:11:09 - - --- !query -select '1' - interval '2' second --- !query schema -struct --- !query output -NULL - - --- !query -select 1 - interval '2' 
second --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve '1 + (- INTERVAL '2 seconds')' due to data type mismatch: argument 1 requires timestamp type, however, '1' is of int type.; line 1 pos 7 - - --- !query -select date'2020-01-01' - timestamp'2019-10-06 10:11:12.345678' --- !query schema -struct --- !query output -2078 hours 48 minutes 47.654322 seconds - - --- !query -select timestamp'2019-10-06 10:11:12.345678' - date'2020-01-01' --- !query schema -struct --- !query output --2078 hours -48 minutes -47.654322 seconds - - --- !query -select timestamp'2019-10-06 10:11:12.345678' - null --- !query schema -struct --- !query output -NULL - - --- !query -select null - timestamp'2019-10-06 10:11:12.345678' --- !query schema -struct --- !query output -NULL - - --- !query -select date_add('2011-11-11', 1Y) --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_add('2011-11-11', 1S) --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_add('2011-11-11', 1) --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_add('2011-11-11', 1L) --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1L)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1L' is of bigint type.; line 1 pos 7 - - --- !query -select date_add('2011-11-11', 1.0) --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 1.0BD)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '1.0BD' is of decimal(2,1) type.; line 1 pos 7 - - --- !query -select date_add('2011-11-11', 1E1) --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_add(CAST('2011-11-11' AS DATE), 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 - - --- !query -select date_add('2011-11-11', '1') --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_add('2011-11-11', '1.2') --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -The second argument of 'date_add' function needs to be an integer.; - - --- !query -select date_add(date'2011-11-11', 1) --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_add(timestamp'2011-11-11', 1) --- !query schema -struct --- !query output -2011-11-12 - - --- !query -select date_sub(date'2011-11-11', 1) --- !query schema -struct --- !query output -2011-11-10 - - --- !query -select date_sub(date'2011-11-11', '1') --- !query schema -struct --- !query output -2011-11-10 - - --- !query -select date_sub(date'2011-11-11', '1.2') --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -The second argument of 'date_sub' function needs to be an integer.; - - --- !query -select date_sub(timestamp'2011-11-11', 1) --- !query schema -struct --- !query output -2011-11-10 - - --- !query -select date_sub(null, 1) --- !query schema -struct --- !query output -NULL - - --- !query -select date_sub(date'2011-11-11', null) --- !query schema -struct --- !query output -NULL - - --- !query -select date'2011-11-11' + 1E1 --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 
'date_add(DATE '2011-11-11', 10.0D)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, '10.0D' is of double type.; line 1 pos 7 - - --- !query -select date'2011-11-11' + '1' --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_add(DATE '2011-11-11', CAST('1' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('1' AS DOUBLE)' is of double type.; line 1 pos 7 - - --- !query -select null + date '2001-09-28' --- !query schema -struct --- !query output -NULL - - --- !query -select date '2001-09-28' + 7Y --- !query schema -struct --- !query output -2001-10-05 - - --- !query -select 7S + date '2001-09-28' --- !query schema -struct --- !query output -2001-10-05 - - --- !query -select date '2001-10-01' - 7 --- !query schema -struct --- !query output -2001-09-24 - - --- !query -select date '2001-10-01' - '7' --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_sub(DATE '2001-10-01', CAST('7' AS DOUBLE))' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'CAST('7' AS DOUBLE)' is of double type.; line 1 pos 7 - - --- !query -select date '2001-09-28' + null --- !query schema -struct --- !query output -NULL - - --- !query -select date '2001-09-28' - null --- !query schema -struct --- !query output -NULL - - --- !query -create temp view v as select '1' str --- !query schema -struct<> --- !query output - - - --- !query -select date_add('2011-11-11', str) from v --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_add(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 - - --- !query -select date_sub('2011-11-11', str) from v --- !query schema -struct<> --- !query output -org.apache.spark.sql.AnalysisException -cannot resolve 'date_sub(CAST('2011-11-11' AS DATE), v.`str`)' due to data type mismatch: argument 2 requires (int or smallint or tinyint) type, however, 'v.`str`' is of string type.; line 1 pos 7 - - --- !query -select null - date '2019-10-06' --- !query schema -struct --- !query output -NULL - - --- !query -select date '2001-10-01' - date '2001-09-28' --- !query schema -struct --- !query output -3 days - - --- !query -select to_timestamp('2019-10-06 10:11:12.', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -NULL - - --- !query -select to_timestamp('2019-10-06 10:11:12.0', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12 - - --- !query -select to_timestamp('2019-10-06 10:11:12.1', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.1 - - --- !query -select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.12 - - --- !query -select to_timestamp('2019-10-06 10:11:12.123UTC', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 03:11:12.123 - - --- !query -select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.1234 - - --- !query -select to_timestamp('2019-10-06 10:11:12.12345CST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema 
-struct --- !query output -2019-10-06 08:11:12.12345 - - --- !query -select to_timestamp('2019-10-06 10:11:12.123456PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.123456 - - --- !query -select to_timestamp('2019-10-06 10:11:12.1234567PST', 'yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -NULL - - --- !query -select to_timestamp('123456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.123456 - - --- !query -select to_timestamp('223456 2019-10-06 10:11:12.123456PST', 'SSSSSS yyyy-MM-dd HH:mm:ss.SSSSSS[zzz]') --- !query schema -struct --- !query output -NULL - - --- !query -select to_timestamp('2019-10-06 10:11:12.1234', 'yyyy-MM-dd HH:mm:ss.[SSSSSS]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.1234 - - --- !query -select to_timestamp('2019-10-06 10:11:12.123', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.123 - - --- !query -select to_timestamp('2019-10-06 10:11:12', 'yyyy-MM-dd HH:mm:ss[.SSSSSS]') --- !query schema -struct --- !query output -2019-10-06 10:11:12 - - --- !query -select to_timestamp('2019-10-06 10:11:12.12', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') --- !query schema -struct --- !query output -2019-10-06 10:11:12.12 - - --- !query -select to_timestamp('2019-10-06 10:11', 'yyyy-MM-dd HH:mm[:ss.SSSSSS]') --- !query schema -struct --- !query output -2019-10-06 10:11:00 - - --- !query -select to_timestamp("2019-10-06S10:11:12.12345", "yyyy-MM-dd'S'HH:mm:ss.SSSSSS") --- !query schema -struct --- !query output -2019-10-06 10:11:12.12345 - - --- !query -select to_timestamp("12.12342019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") --- !query schema -struct --- !query output -2019-10-06 10:11:12.1234 - - --- !query -select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyyyy-MM-dd'S'HH:mm") --- !query schema -struct --- !query output -NULL - - --- !query -select to_timestamp("12.1232019-10-06S10:11", "ss.SSSSyy-MM-dd'S'HH:mm") --- !query schema -struct --- !query output -NULL - - --- !query -select to_timestamp("12.1234019-10-06S10:11", "ss.SSSSy-MM-dd'S'HH:mm") --- !query schema -struct --- !query output -0019-10-06 10:11:12.1234 - - --- !query -select to_timestamp("2019-10-06S", "yyyy-MM-dd'S'") --- !query schema -struct --- !query output -2019-10-06 00:00:00 - - --- !query -select to_timestamp("S2019-10-06", "'S'yyyy-MM-dd") --- !query schema -struct --- !query output -2019-10-06 00:00:00 - - --- !query -select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') --- !query schema -struct<> --- !query output -java.lang.IllegalArgumentException -Illegal pattern character: e - - --- !query -select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') --- !query schema -struct<> --- !query output -java.lang.IllegalArgumentException -Illegal pattern character: c - - --- !query -select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuuu') --- !query schema -struct --- !query output -2019-10-06 Sunday - - --- !query -select to_timestamp("2019-10-06T10:11:12'12", "yyyy-MM-dd'T'HH:mm:ss''SSSS") --- !query schema -struct --- !query output -2019-10-06 10:11:12.12 - - --- !query -select to_timestamp("2019-10-06T10:11:12'", "yyyy-MM-dd'T'HH:mm:ss''") --- !query schema -struct --- !query output -2019-10-06 10:11:12 - - --- !query -select to_timestamp("'2019-10-06T10:11:12", "''yyyy-MM-dd'T'HH:mm:ss") --- !query schema -struct --- !query output -2019-10-06 
10:11:12 - - --- !query -select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss") --- !query schema -struct --- !query output -2019-10-06 10:11:12 - - --- !query -select date_format(date '2020-05-23', 'GGGGG') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select date_format(date '2020-05-23', 'MMMMM') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select date_format(date '2020-05-23', 'LLLLL') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select date_format(date '2020-05-23', 'EEEEE') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select date_format(date '2020-05-23', 'uuuuu') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select date_format(date '2020-05-23', 'QQQQQ') --- !query schema -struct<> --- !query output -java.lang.IllegalArgumentException -Too many pattern letters: Q - - --- !query -select date_format(date '2020-05-23', 'qqqqq') --- !query schema -struct<> --- !query output -java.lang.IllegalArgumentException -Too many pattern letters: q - - --- !query -select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html - - --- !query -select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) --- !query schema -struct<> --- !query output -org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index 2c29ab16c92b..f73acdd8783a 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -652,7 +652,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character: e +Illegal pattern character 'e' -- !query @@ -661,7 +661,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character: c +Illegal pattern character 'c' -- !query @@ -755,7 +755,7 @@ select date_format(date '2020-05-23', 'QQQQQ') struct<> -- !query output java.lang.IllegalArgumentException -Too many pattern letters: Q +Illegal pattern character 'Q' -- !query @@ -764,7 +764,7 @@ select date_format(date '2020-05-23', 'qqqqq') struct<> -- !query output java.lang.IllegalArgumentException -Too many pattern letters: q +Illegal pattern character 'q' -- !query diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala index 14e6ee2b04c1..3558e0499f45 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala @@ -23,7 +23,7 @@ import java.time.{Instant, LocalDateTime, ZoneId} import java.util.{Locale, TimeZone} import java.util.concurrent.TimeUnit -import org.apache.spark.SparkException +import org.apache.spark.{SparkException, SparkUpgradeException} import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{CEST, LA} import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.sql.functions._ @@ -450,9 +450,8 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession { checkAnswer( df.select(to_date(col("s"), "yyyy-hh-MM")), Seq(Row(null), Row(null), Row(null))) - checkAnswer( - df.select(to_date(col("s"), "yyyy-dd-aa")), - Seq(Row(null), Row(null), Row(null))) + val e = intercept[SparkException](df.select(to_date(col("s"), "yyyy-dd-aa")).collect()) + assert(e.getCause.isInstanceOf[SparkUpgradeException]) // february val x1 = "2016-02-29" @@ -618,8 +617,14 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession { Row(secs(ts4.getTime)), Row(null), Row(secs(ts3.getTime)), Row(null))) // invalid format - checkAnswer(df1.selectExpr(s"unix_timestamp(x, 'yyyy-MM-dd aa:HH:ss')"), Seq( - Row(null), Row(null), Row(null), Row(null))) + val invalid = df1.selectExpr(s"unix_timestamp(x, 'yyyy-MM-dd aa:HH:ss')") + if (legacyParserPolicy == "legacy") { + checkAnswer(invalid, + Seq(Row(null), Row(null), Row(null), Row(null))) + } else { + val exception = intercept[SparkException](invalid.collect()) + assert(exception.getCause.isInstanceOf[SparkUpgradeException]) + } // february val y1 = "2016-02-29" From c877ac541beb80b3abf57658c4c5564c7412d0bf Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 01:20:05 +0800 Subject: [PATCH 07/15] initialize api --- .../apache/spark/sql/catalyst/util/DateFormatter.scala | 6 +++++- .../spark/sql/catalyst/util/DateTimeFormatterHelper.scala | 4 ++-- .../spark/sql/catalyst/util/TimestampFormatter.scala | 7 ++++++- 
.../resources/sql-tests/results/ansi/datetime.sql.out | 8 ++++---- .../src/test/resources/sql-tests/results/datetime.sql.out | 8 ++++---- 5 files changed, 21 insertions(+), 12 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala index 28af9dccfabc..bed48768da22 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala @@ -31,6 +31,7 @@ import org.apache.spark.sql.internal.SQLConf.LegacyBehaviorPolicy._ sealed trait DateFormatter extends Serializable { def parse(s: String): Int // returns days since epoch def format(days: Int): String + def initialize(): Unit = {} } class Iso8601DateFormatter( @@ -44,7 +45,7 @@ class Iso8601DateFormatter( private lazy val formatter: DateTimeFormatter = { try { getOrCreateFormatter(pattern, locale) - } catch checkLegacyFormatter(pattern, legacyFormatter.format(0)) + } catch checkLegacyFormatter(pattern, legacyFormatter.initialize) } @transient @@ -85,6 +86,7 @@ class LegacyFastDateFormatter(pattern: String, locale: Locale) extends LegacyDat private lazy val fdf = FastDateFormat.getInstance(pattern, locale) override def parseToDate(s: String): Date = fdf.parse(s) override def formatDate(d: Date): String = fdf.format(d) + override def initialize(): Unit = fdf } class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter { @@ -92,6 +94,8 @@ class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyD private lazy val sdf = new SimpleDateFormat(pattern, locale) override def parseToDate(s: String): Date = sdf.parse(s) override def formatDate(d: Date): String = sdf.format(d) + override def initialize(): Unit = sdf + } object DateFormatter { diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 0365e93f99ef..4d3d61fcd587 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -91,9 +91,9 @@ trait DateTimeFormatterHelper { // will throw IllegalArgumentException. If the pattern can be recognized by the legacy formatter // it will raise SparkUpgradeException to tell users to restore the previous behavior via LEGACY // policy or follow our guide to correct their pattern. 
- protected def checkLegacyFormatter[T1, T2]( + protected def checkLegacyFormatter( pattern: String, - block: T1 => T2): PartialFunction[Throwable, DateTimeFormatter] = { + block: => Unit): PartialFunction[Throwable, DateTimeFormatter] = { case e: IllegalArgumentException => try { block diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala index ef661b0ace25..2c056786a55e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala @@ -51,6 +51,7 @@ sealed trait TimestampFormatter extends Serializable { @throws(classOf[DateTimeException]) def parse(s: String): Long def format(us: Long): String + def initialize(): Unit = {} } class Iso8601TimestampFormatter( @@ -65,7 +66,7 @@ class Iso8601TimestampFormatter( protected lazy val formatter: DateTimeFormatter = { try { getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) - } catch checkLegacyFormatter(pattern, legacyFormatter.format(0)) + } catch checkLegacyFormatter(pattern, legacyFormatter.initialize) } @transient @@ -170,6 +171,8 @@ class LegacyFastTimestampFormatter( cal.setMicros(Math.floorMod(julianMicros, MICROS_PER_SECOND)) fastDateFormat.format(cal) } + + override def initialize(): Unit = fastDateFormat } class LegacySimpleTimestampFormatter( @@ -191,6 +194,8 @@ class LegacySimpleTimestampFormatter( override def format(us: Long): String = { sdf.format(toJavaTimestamp(us)) } + + override def initialize(): Unit = sdf } object LegacyDateFormats extends Enumeration { diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 8700f5d45891..179baf7db479 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -680,7 +680,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'e' +Illegal pattern character: e -- !query @@ -689,7 +689,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'c' +Illegal pattern character: c -- !query @@ -783,7 +783,7 @@ select date_format(date '2020-05-23', 'QQQQQ') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'Q' +Too many pattern letters: Q -- !query @@ -792,7 +792,7 @@ select date_format(date '2020-05-23', 'qqqqq') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'q' +Too many pattern letters: q -- !query diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index f73acdd8783a..2c29ab16c92b 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -652,7 +652,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uuee') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'e' +Illegal pattern character: e -- !query @@ -661,7 +661,7 @@ select date_format(timestamp '2019-10-06', 'yyyy-MM-dd uucc') struct<> -- !query output 
java.lang.IllegalArgumentException -Illegal pattern character 'c' +Illegal pattern character: c -- !query @@ -755,7 +755,7 @@ select date_format(date '2020-05-23', 'QQQQQ') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'Q' +Too many pattern letters: Q -- !query @@ -764,7 +764,7 @@ select date_format(date '2020-05-23', 'qqqqq') struct<> -- !query output java.lang.IllegalArgumentException -Illegal pattern character 'q' +Too many pattern letters: q -- !query From 15030422218ae905f5f56c2ab5467dd59a2ce099 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 10:01:42 +0800 Subject: [PATCH 08/15] tests --- .../native/stringCastAndExpressions.sql.out | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out index 8353c7e73d0b..d43e632ea63d 100644 --- a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out @@ -136,9 +136,10 @@ NULL -- !query select to_timestamp('2018-01-01', a) from t -- !query schema -struct +struct<> -- !query output -NULL +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -152,9 +153,10 @@ NULL -- !query select to_unix_timestamp('2018-01-01', a) from t -- !query schema -struct +struct<> -- !query output -NULL +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -168,9 +170,10 @@ NULL -- !query select unix_timestamp('2018-01-01', a) from t -- !query schema -struct +struct<> -- !query output -NULL +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html


-- !query

From 052bfada8d7a0469c76f0e9aa4386c8bc5a304c1 Mon Sep 17 00:00:00 2001
From: Kent Yao
Date: Fri, 22 May 2020 11:45:36 +0800
Subject: [PATCH 09/15] more test cases

---
 .../expressions/datetimeExpressions.scala     |  3 ++
 .../util/DateTimeFormatterHelper.scala        | 16 ++++--
 .../sql/util/TimestampFormatterSuite.scala    |  4 +-
 .../resources/sql-tests/inputs/datetime.sql   | 13 +++--
 .../sql-tests/results/ansi/datetime.sql.out   | 54 +++++++++++++++++--
 .../sql-tests/results/datetime-legacy.sql.out | 54 ++++++++++++++++---
 .../sql-tests/results/datetime.sql.out        | 54 +++++++++++++++++--
 7 files changed, 171 insertions(+), 27 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
index e622ee119d52..f5cb32126c04 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
@@ -985,6 +985,7 @@ case class FromUnixTime(sec: Expression, format: Expression, timeZoneId: Option[
       legacyFormat = SIMPLE_DATE_FORMAT,
       needVarLengthSecondFraction = false)
   } catch {
+    case e: SparkUpgradeException => throw e
     case NonFatal(_) => null
   }
@@ -1000,6 +1001,7 @@ case class FromUnixTime(sec: Expression, format: Expression, timeZoneId: Option[
       try {
         UTF8String.fromString(formatter.format(time.asInstanceOf[Long] * MICROS_PER_SECOND))
       } catch {
+        case e: SparkUpgradeException => throw e
         case NonFatal(_) => null
       }
     }
@@ -1017,6 +1019,7 @@ case class FromUnixTime(sec: Expression, format: Expression, timeZoneId: Option[
             needVarLengthSecondFraction = false)
             .format(time.asInstanceOf[Long] * MICROS_PER_SECOND))
       } catch {
+        case e: SparkUpgradeException => throw e
         case NonFatal(_) => null
       }
     }
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
index 4d3d61fcd587..3bd0b106af43 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
@@ -87,10 +87,18 @@ trait DateTimeFormatterHelper {
     }
   }

-  // When the new DateTimeFormatter failed to initialize because of invalid datetime pattern, it
-  // will throw IllegalArgumentException. If the pattern can be recognized by the legacy formatter
-  // it will raise SparkUpgradeException to tell users to restore the previous behavior via LEGACY
-  // policy or follow our guide to correct their pattern.
+  /**
+   * When the new DateTimeFormatter fails to initialize because of an invalid datetime pattern,
+   * it will throw IllegalArgumentException. If the pattern can be recognized by the legacy
+   * formatter, it will raise SparkUpgradeException to tell users to restore the previous
+   * behavior via the LEGACY policy or follow our guide to correct their pattern. Otherwise,
+   * the original IllegalArgumentException will be thrown.
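+   * For example, the narrow text style pattern 'MMMMM' is rejected by the new formatter but
+   * supported by the legacy ones, so for it we raise SparkUpgradeException instead of the
+   * original IllegalArgumentException.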
+   *
+   * @param pattern the date time pattern
+   * @param block a by-name block that forces a legacy datetime formatter to be initialized,
+   *              used to capture any exception it throws
+   */
+
   protected def checkLegacyFormatter(
       pattern: String,
       block: => Unit): PartialFunction[Throwable, DateTimeFormatter] = {
     case e: IllegalArgumentException =>
       try {
         block
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala
index e1480c9b7939..67ac247e3cdd 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala
@@ -323,8 +323,8 @@ class TimestampFormatterSuite extends SparkFunSuite with SQLHelper with Matchers
     Seq("QQQQQ", "qqqqq", "A", "c", "e", "n", "N", "p").foreach { pattern =>
       intercept[IllegalArgumentException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0))
     }
-    // supported by the legacy one, then we will suggest users with
-    Seq("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu").foreach { pattern =>
+    // supported by the legacy one, then we will suggest users with SparkUpgradeException
+    Seq("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "aa", "aaa").foreach { pattern =>
       intercept[SparkUpgradeException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0))
     }
   }
diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql
index b829fc2d95c7..be89478b7be0 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql
@@ -127,13 +127,18 @@ select to_timestamp("P2019-10-06T10:11:12", "'P'yyyy-MM-dd'T'HH:mm:ss"); -- head
 select date_format(date '2020-05-23', 'GGGGG');
 select date_format(date '2020-05-23', 'MMMMM');
 select date_format(date '2020-05-23', 'LLLLL');
-select date_format(date '2020-05-23', 'EEEEE');
-select date_format(date '2020-05-23', 'uuuuu');
-select date_format(date '2020-05-23', 'QQQQQ');
-select date_format(date '2020-05-23', 'qqqqq');
+select date_format(timestamp '2020-05-23', 'EEEEE');
+select date_format(timestamp '2020-05-23', 'uuuuu');
+select date_format('2020-05-23', 'QQQQQ');
+select date_format('2020-05-23', 'qqqqq');
 select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG');
 select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE');
 select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE');
 select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE');
+select from_unixtime(12345, 'MMMMM');
+select from_unixtime(54321, 'QQQQQ');
+select from_unixtime(23456, 'aaaaa');
 select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'));
+select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy'));
 select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'));
+select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy'));
diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out
index 179baf7db479..a34e0d4d3d4a 100644
--- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 98
+-- Number of queries: 103


-- !query
@@ -760,7 +760,7 @@ You may get a
different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'EEEEE') +select date_format(timestamp '2020-05-23', 'EEEEE') -- !query schema struct<> -- !query output @@ -769,7 +769,7 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'uuuuu') +select date_format(timestamp '2020-05-23', 'uuuuu') -- !query schema struct<> -- !query output @@ -778,7 +778,7 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'QQQQQ') +select date_format('2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output @@ -787,7 +787,7 @@ Too many pattern letters: Q -- !query -select date_format(date '2020-05-23', 'qqqqq') +select date_format('2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output @@ -831,6 +831,32 @@ org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +-- !query +select from_unixtime(12345, 'MMMMM') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_unixtime(54321, 'QQQQQ') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select from_unixtime(23456, 'aaaaa') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + -- !query select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema @@ -840,6 +866,15 @@ org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +-- !query +select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + -- !query select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema @@ -847,3 +882,12 @@ struct<> -- !query output org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out index acdbb74fe388..35187d9c9c4f 100644 --- a/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime-legacy.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 98 +-- Number of queries: 103 -- !query @@ -729,23 +729,23 @@ May -- !query -select date_format(date '2020-05-23', 'EEEEE') +select date_format(timestamp '2020-05-23', 'EEEEE') -- !query schema -struct +struct -- !query output Saturday -- !query -select date_format(date '2020-05-23', 'uuuuu') +select date_format(timestamp '2020-05-23', 'uuuuu') -- !query schema -struct +struct -- !query output 00006 -- !query -select date_format(date '2020-05-23', 'QQQQQ') +select date_format('2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output @@ -754,7 +754,7 @@ Illegal pattern character 'Q' -- !query -select date_format(date '2020-05-23', 'qqqqq') +select date_format('2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output @@ -794,6 +794,30 @@ struct 1590130800 +-- !query +select from_unixtime(12345, 'MMMMM') +-- !query schema +struct +-- !query output +December + + +-- !query +select from_unixtime(54321, 'QQQQQ') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select from_unixtime(23456, 'aaaaa') +-- !query schema +struct +-- !query output +PM + + -- !query select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema @@ -802,9 +826,25 @@ struct> {"time":2015-10-26 00:00:00} +-- !query +select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct> +-- !query output +{"date":2015-10-26} + + -- !query select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema struct> -- !query output {"time":2015-10-26 00:00:00} + + +-- !query +select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct> +-- !query output +{"date":2015-10-26} diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out 
b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index 2c29ab16c92b..2e6da11e10ec 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 98 +-- Number of queries: 103 -- !query @@ -732,7 +732,7 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'EEEEE') +select date_format(timestamp '2020-05-23', 'EEEEE') -- !query schema struct<> -- !query output @@ -741,7 +741,7 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'uuuuu') +select date_format(timestamp '2020-05-23', 'uuuuu') -- !query schema struct<> -- !query output @@ -750,7 +750,7 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn -- !query -select date_format(date '2020-05-23', 'QQQQQ') +select date_format('2020-05-23', 'QQQQQ') -- !query schema struct<> -- !query output @@ -759,7 +759,7 @@ Too many pattern letters: Q -- !query -select date_format(date '2020-05-23', 'qqqqq') +select date_format('2020-05-23', 'qqqqq') -- !query schema struct<> -- !query output @@ -803,6 +803,32 @@ org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +-- !query +select from_unixtime(12345, 'MMMMM') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_unixtime(54321, 'QQQQQ') +-- !query schema +struct +-- !query output +NULL + + +-- !query +select from_unixtime(23456, 'aaaaa') +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + -- !query select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema @@ -812,6 +838,15 @@ org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +-- !query +select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + -- !query select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy')) -- !query schema @@ -819,3 +854,12 @@ struct<> -- !query output org.apache.spark.SparkUpgradeException You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html + + +-- !query +select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy')) +-- !query schema +struct<> +-- !query output +org.apache.spark.SparkUpgradeException +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html From 8141ef946aed9d7fe97f1918d4f11698c3ece4c6 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 11:52:18 +0800 Subject: [PATCH 10/15] add doc --- docs/sql-ref-datetime-pattern.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md index c04edfc21696..ad535953c57b 100644 --- a/docs/sql-ref-datetime-pattern.md +++ b/docs/sql-ref-datetime-pattern.md @@ -120,6 +120,8 @@ The count of pattern letters determines the format. январь ``` +- AM/PM(a): This outputs the am-pm-of-day. Pattern letter count must be 1. + - Zone ID(V): This outputs the display the time-zone ID. Pattern letter count must be 2. - Zone names(z): This outputs the display textual name of the time-zone ID. If the count of letters is one, two or three, then the short name is output. If the count of letters is four, then the full name is output. Five or more letters will fail. 
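For the new `a` bullet above, a minimal illustration in the guide's own style (output assumes an English locale; the `date_format` usage mirrors the existing examples in this guide):

```sql
spark-sql> select date_format(timestamp '1970-01-01 12:01:00', 'a');
PM
```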
From 4491e796d1ba7712a55f90272b7f8f04e9dbb55c Mon Sep 17 00:00:00 2001
From: Kent Yao
Date: Fri, 22 May 2020 15:54:18 +0800
Subject: [PATCH 11/15] update doc and address comments

---
 docs/sql-ref-datetime-pattern.md              | 156 ++++++++++++++----
 .../sql/catalyst/util/DateFormatter.scala     |  22 ++-
 .../util/DateTimeFormatterHelper.scala        |  10 +-
 .../catalyst/util/TimestampFormatter.scala    |  23 ++-
 .../sql/util/TimestampFormatterSuite.scala    |   2 +-
 5 files changed, 153 insertions(+), 60 deletions(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index ad535953c57b..379d3e919801 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -26,42 +26,126 @@ There are several common scenarios for datetime usage in Spark:
 - Datetime functions related to convert `StringType` to/from `DateType` or `TimestampType`. For example, `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp`, `from_utc_timestamp`, `to_utc_timestamp`, etc.
 
-Spark uses pattern letters in the following table for date and timestamp parsing and formatting:
-
-|Symbol|Meaning|Presentation|Examples|
-|------|-------|------------|--------|
-|**G**|era|text|AD; Anno Domini|
-|**y**|year|year|2020; 20|
-|**D**|day-of-year|number|189|
-|**M/L**|month-of-year|number/text|7; 07; Jul; July|
-|**d**|day-of-month|number|28|
-|**Q/q**|quarter-of-year|number/text|3; 03; Q3; 3rd quarter|
-|**Y**|week-based-year|year|1996; 96|
-|**w**|week-of-week-based-year|number|27|
-|**W**|week-of-month|number|4|
-|**E**|day-of-week|text|Tue; Tuesday|
-|**u**|localized day-of-week|number/text|2; 02; Tue; Tuesday|
-|**F**|week-of-month|number|3|
-|**a**|am-pm-of-day|text|PM|
-|**h**|clock-hour-of-am-pm (1-12)|number|12|
-|**K**|hour-of-am-pm (0-11)|number|0|
-|**k**|clock-hour-of-day (1-24)|number|0|
-|**H**|hour-of-day (0-23)|number|0|
-|**m**|minute-of-hour|number|30|
-|**s**|second-of-minute|number|55|
-|**S**|fraction-of-second|fraction|978|
-|**V**|time-zone ID|zone-id|America/Los_Angeles; Z; -08:30|
-|**z**|time-zone name|zone-name|Pacific Standard Time; PST|
-|**O**|localized zone-offset|offset-O|GMT+8; GMT+08:00; UTC-08:00;|
-|**X**|zone-offset 'Z' for zero|offset-X|Z; -08; -0830; -08:30; -083015; -08:30:15;|
-|**x**|zone-offset|offset-x|+0000; -08; -0830; -08:30; -083015; -08:30:15;|
-|**Z**|zone-offset|offset-Z|+0000; -0800; -08:00;|
-|**'**|escape for text|delimiter| |
-|**''**|single quote|literal|'|
-|**[**|optional section start| | |
-|**]**|optional section end| | |
-
-The count of pattern letters determines the format.
+The following tables define how the pattern letters are used for date and timestamp parsing and formatting in Spark.
+
+## Date Fields
+
+Pattern letters to output a date:
+
+|Pattern|Count|Meaning|Presentation|Examples|
+|---|---|---|---|---|
+|**G**|1|era|text|AD|
+|**GG**|2|era|text|AD|
+|**GGG**|3|era|text|AD|
+|**GGGG**|4|era|text|Anno Domini|
+|**y**|1|year|year|2020|
+|**yy**|2|year|year|20|
+|**yyy**|3|year|year|2020|
+|**y..y**|4..n|year|year|2020; 02020|
+|**Y**|1|week-based-year|year|1996|
+|**YY**|2|week-based-year|year|96|
+|**YYY**|3|week-based-year|year|1996|
+|**Y..Y**|4..n|week-based-year|year|1996; 01996|
+|**Q**|1|quarter-of-year|number/text|3|
+|**QQ**|2|quarter-of-year|number/text|03|
+|**QQQ**|3|quarter-of-year|number/text|Q3|
+|**QQQQ**|4|quarter-of-year|number/text|3rd quarter|
+|**M**|1|month-of-year|number/text|7|
+|**MM**|2|month-of-year|number/text|07|
+|**MMM**|3|month-of-year|number/text|Jul|
+|**MMMM**|4|month-of-year|number/text|July|
+|**L**|1|month-of-year|number/text|7|
+|**LL**|2|month-of-year|number/text|07|
+|**LLL**|3|month-of-year|number/text|Jul|
+|**LLLL**|4|month-of-year|number/text|July|
+|**w**|1|week-of-week-based-year|number|1; 27|
+|**ww**|2|week-of-week-based-year|number|01; 27|
+|**W**|1|week-of-month|number|4|
+|**D**|1|day-of-year|number|1; 189|
+|**DD**|2|day-of-year|number|01; 189|
+|**DDD**|3|day-of-year|number|001; 189|
+|**d**|1|day-of-month|number|1; 28|
+|**dd**|2|day-of-month|number|01; 28|
+|**E**|1|day-of-week|text|Tue|
+|**EE**|2|day-of-week|text|Tue|
+|**EEE**|3|day-of-week|text|Tue|
+|**EEEE**|4|day-of-week|text|Tuesday|
+|**u**|1|localized day-of-week|number/text|2|
+|**uu**|2|localized day-of-week|number/text|02|
+|**uuu**|3|localized day-of-week|number/text|Tue|
+|**uuuu**|4|localized day-of-week|number/text|Tuesday|
+|**F**|1|week-of-month|number|3|
+
+## Time Fields
+
+Pattern letters to output a time:
+
+|Pattern|Count|Meaning|Presentation|Examples|
+|---|---|---|---|---|
+|**a**|1|am-pm-of-day|text|PM|
+|**h**|1|clock-hour-of-am-pm (1-12)|number|1; 12|
+|**hh**|2|clock-hour-of-am-pm (1-12)|number|01; 12|
+|**K**|1|hour-of-am-pm (0-11)|number|1; 11|
+|**KK**|2|hour-of-am-pm (0-11)|number|01; 11|
+|**k**|1|clock-hour-of-day (1-24)|number|1; 23|
+|**kk**|2|clock-hour-of-day (1-24)|number|01; 23|
+|**H**|1|hour-of-day (0-23)|number|1; 23|
+|**HH**|2|hour-of-day (0-23)|number|01; 23|
+|**m**|1|minute-of-hour|number|1; 30|
+|**mm**|2|minute-of-hour|number|01; 30|
+|**s**|1|second-of-minute|number|55|
+|**ss**|2|second-of-minute|number|55|
+|**S**|1..9|fraction-of-second|fraction|978|
+
+## Zone ID
+
+Pattern letters to output a zone ID:
+
+|Pattern|Count|Meaning|Presentation|Examples|
+|---|---|---|---|---|
+|**VV**|2|time-zone ID|zone-id|America/Los_Angeles; Z; -08:30|
+|**z**|1|time-zone name|zone-name|PST|
+|**zz**|2|time-zone name|zone-name|PST|
+|**zzz**|3|time-zone name|zone-name|PST|
+|**zzzz**|4|time-zone name|zone-name|Pacific Standard Time|
+
+## Zone Offset
+
+Pattern letters to output a zone offset:
+
+|Pattern|Count|Meaning|Presentation|Examples|
+|---|---|---|---|---|
+|**O**|1|localized zone-offset|offset-O|GMT+8|
+|**OOOO**|4|localized zone-offset|offset-O|GMT+08:00|
+|**X**|1|zone-offset 'Z' for zero|offset-X|Z; -08|
+|**XX**|2|zone-offset 'Z' for zero|offset-X|Z; -0830|
+|**XXX**|3|zone-offset 'Z' for zero|offset-X|Z; -08:30|
+|**XXXX**|4|zone-offset 'Z' for zero|offset-X|Z; -083015|
+|**XXXXX**|5|zone-offset 'Z' for zero|offset-X|Z; -08:30:15|
+|**x**|1|zone-offset|offset-x|-08|
+|**xx**|2|zone-offset|offset-x|-0830|
+|**xxx**|3|zone-offset|offset-x|-08:30|
+|**xxxx**|4|zone-offset|offset-x|-083015|
+|**xxxxx**|5|zone-offset|offset-x|-08:30:15|
+|**Z**|1|zone-offset|offset-Z|-0800|
+|**ZZ**|2|zone-offset|offset-Z|-0800|
+|**ZZZ**|3|zone-offset|offset-Z|-0800|
+|**ZZZZ**|4|zone-offset|offset-Z|GMT-08:00|
+|**ZZZZZ**|5|zone-offset|offset-Z|-08:00|
+
+## Modifiers
+
+Pattern letters that modify the rest of the pattern:
+
+|Pattern|Count|Meaning|Presentation|Examples|
+|---|---|---|---|---|
+|**'**|1|escape for text|delimiter| |
+|**''**|1|single quote|literal|'|
+|**[**|1|optional section start| | |
+|**]**|1|optional section end| | |
+
+
+- Count: The count of pattern letters determines the format. `1..n` describes the range of letters a pattern field can contain, where `n` means there is no upper limit.
 
 - Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. 5 or more letters will fail.
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala
index 3ecf3e54df28..6d7551b9310b 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala
@@ -35,7 +35,7 @@ sealed trait DateFormatter extends Serializable {
   def format(date: Date): String
   def format(localDate: LocalDate): String

-  def initialize(): Unit = {}
+  def validatePatternString(): Unit
 }

 class Iso8601DateFormatter(
@@ -46,11 +46,7 @@ class Iso8601DateFormatter(
   extends DateFormatter with DateTimeFormatterHelper {

   @transient
-  private lazy val formatter: DateTimeFormatter = {
-    try {
-      getOrCreateFormatter(pattern, locale)
-    } catch checkLegacyFormatter(pattern, legacyFormatter.initialize)
-  }
+  private lazy val formatter = getOrCreateFormatter(pattern, locale)

   @transient
   private lazy val legacyFormatter = DateFormatter.getLegacyFormatter(
@@ -77,6 +73,12 @@ class Iso8601DateFormatter(
   override def format(date: Date): String = {
     legacyFormatter.format(date)
   }
+
+  override def validatePatternString(): Unit = {
+    try {
+      formatter
+    } catch checkLegacyFormatter(pattern, legacyFormatter.validatePatternString)
+  }
 }

 trait LegacyDateFormatter extends DateFormatter {
@@ -100,7 +102,7 @@ class LegacyFastDateFormatter(pattern: String, locale: Locale) extends LegacyDat
   private lazy val fdf = FastDateFormat.getInstance(pattern, locale)
   override def parseToDate(s: String): Date = fdf.parse(s)
   override def format(d: Date): String = fdf.format(d)
-  override def initialize(): Unit = fdf
+  override def validatePatternString(): Unit = fdf
 }

 class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyDateFormatter {
@@ -108,7 +110,7 @@ class LegacySimpleDateFormatter(pattern: String, locale: Locale) extends LegacyD
   private lazy val sdf = new SimpleDateFormat(pattern, locale)
   override def parseToDate(s: String): Date = sdf.parse(s)
   override def format(d: Date): String = sdf.format(d)
-  override def initialize(): Unit = sdf
+  override def validatePatternString(): Unit = sdf
 }

@@ -128,7 +130,7 @@ object DateFormatter {
     if (SQLConf.get.legacyTimeParserPolicy == LEGACY) {
       getLegacyFormatter(pattern, zoneId, locale, legacyFormat)
     } else {
-      new Iso8601DateFormatter(pattern, zoneId, locale, legacyFormat)
+      val df = new Iso8601DateFormatter(pattern, zoneId, locale, legacyFormat)
+      df.validatePatternString()
+      df
     }
   }
diff --git
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 3bd0b106af43..66c386e9cffa 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -95,16 +95,16 @@ trait DateTimeFormatterHelper { * IllegalArgumentException will be thrown. * * @param pattern the date time pattern - * @param block a func to capture exception, identically which forces a legacy datetime formatter - * to be initialized + * @param tryLegacyFormatter a func that forces a legacy datetime formatter to be initialized, + * used to check whether the legacy formatter can accept the pattern */ protected def checkLegacyFormatter( pattern: String, - block: => Unit): PartialFunction[Throwable, DateTimeFormatter] = { + tryLegacyFormatter: => Unit): PartialFunction[Throwable, DateTimeFormatter] = { case e: IllegalArgumentException => try { - block + tryLegacyFormatter } catch { case _: Throwable => throw e } @@ -191,7 +191,7 @@ private object DateTimeFormatterHelper { final val unsupportedLetters = Set('A', 'c', 'e', 'n', 'N', 'p') final val unsupportedNarrowTextStyle = - Set("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "QQQQQ", "qqqqq") + Set("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "QQQQQ", "qqqqq") /** * In Spark 3.0, we switch to the Proleptic Gregorian calendar and use DateTimeFormatter for diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala index 4c5db173c9ac..de2fd312b7db 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala @@ -54,7 +54,7 @@ sealed trait TimestampFormatter extends Serializable { def format(us: Long): String def format(ts: Timestamp): String def format(instant: Instant): String - def initialize(): Unit = {} + def validatePatternString(): Unit } class Iso8601TimestampFormatter( @@ -65,11 +65,8 @@ class Iso8601TimestampFormatter( needVarLengthSecondFraction: Boolean) extends TimestampFormatter with DateTimeFormatterHelper { @transient - protected lazy val formatter: DateTimeFormatter = { - try { - getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) - } catch checkLegacyFormatter(pattern, legacyFormatter.initialize) - } + protected lazy val formatter: DateTimeFormatter = + getOrCreateFormatter(pattern, locale, needVarLengthSecondFraction) @transient protected lazy val legacyFormatter = TimestampFormatter.getLegacyFormatter( @@ -103,6 +100,12 @@ class Iso8601TimestampFormatter( override def format(ts: Timestamp): String = { legacyFormatter.format(ts) } + + override def validatePatternString(): Unit = { + try { + formatter + } catch checkLegacyFormatter(pattern, legacyFormatter.validatePatternString) + } } /** @@ -207,7 +210,7 @@ class LegacyFastTimestampFormatter( format(instantToMicros(instant)) } - override def initialize(): Unit = fastDateFormat + override def validatePatternString(): Unit = fastDateFormat } class LegacySimpleTimestampFormatter( @@ -238,7 +241,7 @@ class LegacySimpleTimestampFormatter( format(instantToMicros(instant)) } - override def initialize(): Unit = sdf + override def validatePatternString(): Unit = sdf } object LegacyDateFormats
extends Enumeration { @@ -263,8 +266,10 @@ object TimestampFormatter { if (SQLConf.get.legacyTimeParserPolicy == LEGACY) { getLegacyFormatter(pattern, zoneId, locale, legacyFormat) } else { - new Iso8601TimestampFormatter( + val tf = new Iso8601TimestampFormatter( pattern, zoneId, locale, legacyFormat, needVarLengthSecondFraction) + tf.validatePatternString() + tf } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala index 67ac247e3cdd..6158b942f1c9 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/util/TimestampFormatterSuite.scala @@ -324,7 +324,7 @@ class TimestampFormatterSuite extends SparkFunSuite with SQLHelper with Matchers intercept[IllegalArgumentException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0)) } // supported by the legacy one, then we will suggest users with SparkUpgradeException - Seq("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "aa", "aaa").foreach { pattern => + Seq("GGGGG", "MMMMM", "LLLLL", "EEEEE", "uuuuu", "aa", "aaa").foreach { pattern => intercept[SparkUpgradeException](TimestampFormatter(pattern, ZoneOffset.UTC).format(0)) } } From 75fdbcb10907010cf3ecccc5e61dfa5ed0ab0591 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 17:25:10 +0800 Subject: [PATCH 12/15] doc update --- docs/sql-ref-datetime-pattern.md | 164 ++++++++----------------------- 1 file changed, 40 insertions(+), 124 deletions(-) diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md index 379d3e919801..d6c8564f09e5 100644 --- a/docs/sql-ref-datetime-pattern.md +++ b/docs/sql-ref-datetime-pattern.md @@ -26,130 +26,46 @@ There are several common scenarios for datetime usage in Spark: - Datetime functions related to convert `StringType` to/from `DateType` or `TimestampType`. For example, `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp`, `from_utc_timestamp`, `to_utc_timestamp`, etc. -The following tables define how the pattern letters be used for date and timestamp parsing and formatting in Spark.
- -## Date Fields - -Pattern letters to output a date: - -|Pattern|Count|Meaning|Presentation|Examples| -|---|---|---|---|---| -|**G**|1|era|text|AD| -|**GG**|2|era|text|AD| -|**GGG**|3|era|text|AD| -|**GGGG**|4|era|text|Anno Domini| -|**y**|1|year|year|2020| -|**yy**|2|year|year|20| -|**yyy**|3|year|year|2020| -|**y..y**|4..n|year|year|2020; 02020| -|**Y**|1|week-based-year|year|1996| -|**YY**|2|week-based-year|year|96| -|**YYY**|3|week-based-year|year|1996| -|**Y..Y**|4..n|week-based-year|year|1996; 01996| -|**Q**|1|quarter-of-year|number/text|3| -|**QQ**|2|quarter-of-year|number/text|03| -|**QQQ**|3|quarter-of-year|number/text|Q3| -|**QQQQ**|4|quarter-of-year|number/text|3rd quarter| -|**M**|1|month-of-year|number/text|7| -|**MM**|2|month-of-year|number/text|07| -|**MMM**|3|month-of-year|number/text|Jul| -|**MMMM**|4|month-of-year|number/text|July| -|**L**|1|month-of-year|number/text|7| -|**LL**|2|month-of-year|number/text|07| -|**LLL**|3|month-of-year|number/text|Jul| -|**LLLL**|4|month-of-year|number/text|July| -|**w**|1|week-of-week-based-year|number|1; 27| -|**ww**|2|week-of-week-based-year|number|01; 27| -|**W**|1|week-of-month|number|4| -|**D**|1|day-of-year|number|1; 189| -|**DD**|2|day-of-year|number|01; 189| -|**DDD**|3|day-of-year|number|001; 189| -|**d**|1|day-of-month|number|1; 28| -|**dd**|2|day-of-month|number|01; 28| -|**E**|1|day-of-week|text|Tue| -|**EE**|2|day-of-week|text|Tue| -|**EEE**|3|day-of-week|text|Tue| -|**EEEE**|4|day-of-week|text|Tuesday| -|**u**|1|localized day-of-week|number/text|2| -|**uu**|2|localized day-of-week|number/text|02| -|**uuu**|3|localized day-of-week|number/text|Tue| -|**uuuu**|4|localized day-of-week|number/text|Tuesday| -|**F**|1|week-of-month|number|3| - -## Time Fields - -Pattern letters to output a time: - -|Pattern|Count|Meaning|Presentation|Examples| -|---|---|---|---|---| -|**a**|1|am-pm-of-day|text|PM| -|**h**|1|clock-hour-of-am-pm (1-12)|number|1; 12| -|**hh**|2|clock-hour-of-am-pm (1-12)|number|01; 12| -|**K**|1|hour-of-am-pm (0-11)|number|1; 11| -|**KK**|2|hour-of-am-pm (0-11)|number|01; 11| -|**k**|1|clock-hour-of-day (1-24)|number|1; 23| -|**kk**|2|clock-hour-of-day (1-24)|number|01; 23| -|**H**|1|hour-of-day (0-23)|number|1; 23| -|**HH**|2|hour-of-day (0-23)|number|01; 23| -|**m**|1|minute-of-hour|number|1; 30| -|**mm**|2|minute-of-hour|number|01; 30| -|**s**|1|second-of-minute|number|55| -|**ss**|2|second-of-minute|number|55| -|**S**|1..9|fraction-of-second|fraction|978| - -## Zone ID - -Pattern letters to output Zone ID: - -|Pattern|Count|Meaning|Presentation|Examples| -|---|---|---|---|---| -|**VV**|2|time-zone ID|zone-id|America/Los_Angeles; Z; -08:30| -|**z**|1|time-zone name|zone-name|PST| -|**zz**|2|time-zone name|zone-name|PST| -|**zzz**|3|time-zone name|zone-name|PST| -|**zzzz**|4|time-zone name|zone-name|Pacific Standard Time| - -## Zone offset - -Pattern letters to output Zone Offset: - -|Pattern|Count|Meaning|Presentation|Examples| -|---|---|---|---|---| -|**O**|1|localized zone-offset|offset-O|GMT+8| -|**OOOO**|4|localized zone-offset|offset-O|GMT+08:00| -|**X**|1|zone-offset 'Z' for zero|offset-X|Z; -08| -|**XX**|2|zone-offset 'Z' for zero|offset-X|Z; -0830| -|**XXX**|3|zone-offset 'Z' for zero|offset-X|Z; -08:30| -|**XXXX**|4|zone-offset 'Z' for zero|offset-X|Z; -083015| -|**XXXXX**|5|zone-offset 'Z' for zero|offset-X|Z; -08:30:15| -|**x**|1|zone-offset|offset-x|-08| -|**xx**|2|zone-offset|offset-x|-0830| -|**xxx**|3|zone-offset|offset-x|-08:30| -|**xxxx**|4|zone-offset|offset-x|-083015|
-|**xxxxx**|5|zone-offset|offset-x|-08:30:15| -|**Z**|1|zone-offset|offset-Z|-0800| -|**ZZ**|2|zone-offset|offset-Z|-0800| -|**ZZZ**|3|zone-offset|offset-Z|-0800| -|**ZZZZ**|4|zone-offset|offset-Z|GMT-08:00| -|**ZZZZZ**|5|zone-offset|offset-Z|-08:00| - -## Modifiers - -Pattern letters that modify the rest of the pattern: - -|Pattern|Count|Meaning|Presentation|Examples| -|---|---|---|---|---| -|**'**|1|escape for text|delimiter| | -|**''**|1|single quote|literal|'| -|**[**|1|optional section start| | | -|**]**|1|optional section end| | | - - -- Count: The count of pattern letters determines the format. `1..n` gives the range of times the pattern letter can be repeated, where `n` means there is no upper limit. - -- Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. 5 or more letters will fail. - -- Number: If the count of letters is one, then the value is output using the minimum number of digits and without padding. Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. The following pattern letters have constraints on the count of letters. Only one letter 'F' can be specified. Up to two letters of 'd', 'H', 'h', 'K', 'k', 'm', and 's' can be specified. Up to three letters of 'D' can be specified. +Spark uses pattern letters in the following table for date and timestamp parsing and formatting: + +|Symbol|Meaning|Presentation|Examples| +|------|-------|------------|--------| +|**G**|era|text|AD; Anno Domini| +|**y**|year|year|2020; 20| +|**D**|day-of-year|number(3)|189| +|**M/L**|month-of-year|month|7; 07; Jul; July| +|**d**|day-of-month|number(3)|28| +|**Q/q**|quarter-of-year|number/text|3; 03; Q3; 3rd quarter| +|**Y**|week-based-year|year|1996; 96| +|**w**|week-of-week-based-year|number(2)|27| +|**W**|week-of-month|number(1)|4| +|**E**|day-of-week|text|Tue; Tuesday| +|**u**|localized day-of-week|number/text|2; 02; Tue; Tuesday| +|**F**|week-of-month|number(1)|3| +|**a**|am-pm-of-day|am/pm|PM| +|**h**|clock-hour-of-am-pm (1-12)|number(2)|12| +|**K**|hour-of-am-pm (0-11)|number(2)|0| +|**k**|clock-hour-of-day (1-24)|number(2)|0| +|**H**|hour-of-day (0-23)|number(2)|0| +|**m**|minute-of-hour|number(2)|30| +|**s**|second-of-minute|number(2)|55| +|**S**|fraction-of-second|fraction|978| +|**V**|time-zone ID|zone-id|America/Los_Angeles; Z; -08:30| +|**z**|time-zone name|zone-name|Pacific Standard Time; PST| +|**O**|localized zone-offset|offset-O|GMT+8; GMT+08:00; UTC-08:00;| +|**X**|zone-offset 'Z' for zero|offset-X|Z; -08; -0830; -08:30; -083015; -08:30:15;| +|**x**|zone-offset|offset-x|+0000; -08; -0830; -08:30; -083015; -08:30:15;| +|**Z**|zone-offset|offset-Z|+0000; -0800; -08:00;| +|**'**|escape for text|delimiter| | +|**''**|single quote|literal|'| +|**[**|optional section start| | | +|**]**|optional section end| | | + +The count of pattern letters determines the format. + +- Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. 5 or more letters will fail. + +- Number(n): the n here represents the maximum count of letters this type of datetime pattern can be used. If the count of letters is one, then the value is output using the minimum number of digits and without padding.
Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. - Number/Text: If the count of pattern letters is 3 or greater, use the Text rules above. Otherwise use the Number rules above. From 0a76ba38480bc4ec845dd8286f858fda0d01f97e Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 18:54:56 +0800 Subject: [PATCH 13/15] fix doc and tests --- docs/sql-ref-datetime-pattern.md | 6 +++--- .../org/apache/spark/sql/DateFunctionsSuite.scala | 11 +++++++---- 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md index d6c8564f09e5..4275f03335b3 100644 --- a/docs/sql-ref-datetime-pattern.md +++ b/docs/sql-ref-datetime-pattern.md @@ -42,7 +42,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing |**E**|day-of-week|text|Tue; Tuesday| |**u**|localized day-of-week|number/text|2; 02; Tue; Tuesday| |**F**|week-of-month|number(1)|3| -|**a**|am-pm-of-day|am/pm|PM| +|**a**|am-pm-of-day|am-pm|PM| |**h**|clock-hour-of-am-pm (1-12)|number(2)|12| |**K**|hour-of-am-pm (0-11)|number(2)|0| |**k**|clock-hour-of-day (1-24)|number(2)|0| @@ -65,7 +65,7 @@ The count of pattern letters determines the format. - Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. 5 or more letters will fail. -- Number(n): the n here represents the maximum count of letters this type of datetime pattern can be used. If the count of letters is one, then the value is output using the minimum number of digits and without padding. Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. +- Number(n): The n here represents the maximum count of letters with which this type of datetime pattern can be used. If the count of letters is one, then the value is output using the minimum number of digits and without padding. Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. - Number/Text: If the count of pattern letters is 3 or greater, use the Text rules above. Otherwise use the Number rules above. @@ -120,7 +120,7 @@ The count of pattern letters determines the format. январь ``` -- AM/PM(a): This outputs the am-pm-of-day. Pattern letter count must be 1. +- am-pm: This outputs the am-pm-of-day. Pattern letter count must be 1. - Zone ID(V): This outputs the display the time-zone ID. Pattern letter count must be 2.
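To make the count rules above concrete, here is a small spark-shell sketch (illustrative only, not part of the patch; it assumes a Spark 3.0 session with the default, non-LEGACY `spark.sql.legacy.timeParserPolicy` and an English locale):

```scala
// Each line exercises one of the rules from the doc table above.
// `spark` is the SparkSession that spark-shell provides.
spark.sql("SELECT date_format(date '2020-07-08', 'd')").show()    // 8: one letter, minimum digits, no padding
spark.sql("SELECT date_format(date '2020-07-08', 'dd')").show()   // 08: two letters, zero-padded to width 2
spark.sql("SELECT date_format(date '2020-07-08', 'EEE')").show()  // Wed: fewer than 4 letters, short form
spark.sql("SELECT date_format(date '2020-07-08', 'EEEE')").show() // Wednesday: exactly 4 letters, full form
// With this patch series applied, 5 letters no longer select the narrow form:
// spark.sql("SELECT date_format(date '2020-07-08', 'EEEEE')")    // throws SparkUpgradeException
```

The commented-out line is exactly the behavior that the test updates below and the golden files in the next patch pin down.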
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala index 3558e0499f45..c12468a4e70f 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala @@ -450,8 +450,9 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession { checkAnswer( df.select(to_date(col("s"), "yyyy-hh-MM")), Seq(Row(null), Row(null), Row(null))) - val e = intercept[SparkException](df.select(to_date(col("s"), "yyyy-dd-aa")).collect()) - assert(e.getCause.isInstanceOf[SparkUpgradeException]) + val e = intercept[SparkUpgradeException](df.select(to_date(col("s"), "yyyy-dd-aa")).collect()) + assert(e.getCause.isInstanceOf[IllegalArgumentException]) + assert(e.getMessage.contains("You may get a different result due to the upgrading of Spark")) // february val x1 = "2016-02-29" @@ -622,8 +623,10 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession { checkAnswer(invalid, Seq(Row(null), Row(null), Row(null), Row(null))) } else { - val exception = intercept[SparkException](invalid.collect()) - assert(exception.getCause.isInstanceOf[SparkUpgradeException]) + val e = intercept[SparkUpgradeException](invalid.collect()) + assert(e.getCause.isInstanceOf[IllegalArgumentException]) + assert( + e.getMessage.contains("You may get a different result due to the upgrading of Spark")) } // february From 5360d888b6db2562ea7ada6ee40f1bf606fbe809 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 19:07:40 +0800 Subject: [PATCH 14/15] update tests --- .../util/DateTimeFormatterHelper.scala | 7 +++-- .../sql-tests/results/ansi/datetime.sql.out | 30 +++++++++---------- .../sql-tests/results/datetime.sql.out | 30 +++++++++---------- .../native/stringCastAndExpressions.sql.out | 6 ++-- 4 files changed, 37 insertions(+), 36 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index ea8ba2d076f4..0ea54c28cb28 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -140,9 +140,10 @@ trait DateTimeFormatterHelper { case _: Throwable => throw e } throw new SparkUpgradeException("3.0", s"Fail to recognize '$pattern' pattern in the" + - s" new parser. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to" + - s" restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with" + - s" the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", e) + s" DateTimeFormatter. 1) You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY" + + s" to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern" + + s" with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html", + e) } } diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 546e9d59e24d..5e007210def7 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -786,7 +786,7 @@ select date_format(date '2020-05-23', 'GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -795,7 +795,7 @@ select date_format(date '2020-05-23', 'MMMMM') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -804,7 +804,7 @@ select date_format(date '2020-05-23', 'LLLLL') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -813,7 +813,7 @@ select date_format(timestamp '2020-05-23', 'EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -822,7 +822,7 @@ select date_format(timestamp '2020-05-23', 'uuuuu') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -849,7 +849,7 @@ select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -858,7 +858,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -867,7 +867,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -876,7 +876,7 @@ select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -885,7 +885,7 @@ select from_unixtime(12345, 'MMMMM') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -902,7 +902,7 @@ select from_unixtime(23456, 'aaaaa') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -911,7 +911,7 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -920,7 +920,7 @@ select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'd struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -929,7 +929,7 @@ select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/ struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -938,4 +938,4 @@ select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out index 865f73644556..84179cd5e79f 100755 --- a/sql/core/src/test/resources/sql-tests/results/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/datetime.sql.out @@ -758,7 +758,7 @@ select date_format(date '2020-05-23', 'GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'GGGGG' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -767,7 +767,7 @@ select date_format(date '2020-05-23', 'MMMMM') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -776,7 +776,7 @@ select date_format(date '2020-05-23', 'LLLLL') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'LLLLL' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -785,7 +785,7 @@ select date_format(timestamp '2020-05-23', 'EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -794,7 +794,7 @@ select date_format(timestamp '2020-05-23', 'uuuuu') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'uuuuu' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -821,7 +821,7 @@ select to_timestamp('2019-10-06 A', 'yyyy-MM-dd GGGGG') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyy-MM-dd GGGGG' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -830,7 +830,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -839,7 +839,7 @@ select to_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -848,7 +848,7 @@ select unix_timestamp('22 05 2020 Friday', 'dd MM yyyy EEEEE') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd MM yyyy EEEEE' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -857,7 +857,7 @@ select from_unixtime(12345, 'MMMMM') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'MMMMM' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -874,7 +874,7 @@ select from_unixtime(23456, 'aaaaa') struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aaaaa' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -883,7 +883,7 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -892,7 +892,7 @@ select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'd struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -901,7 +901,7 @@ select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/ struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -910,4 +910,4 @@ select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'dd/MMMMM/yyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html diff --git a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out index d43e632ea63d..02944c268ed2 100644 --- a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/stringCastAndExpressions.sql.out @@ -139,7 +139,7 @@ select to_timestamp('2018-01-01', a) from t struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -156,7 +156,7 @@ select to_unix_timestamp('2018-01-01', a) from t struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query @@ -173,7 +173,7 @@ select unix_timestamp('2018-01-01', a) from t struct<> -- !query output org.apache.spark.SparkUpgradeException -You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the new parser. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html +You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'aa' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html -- !query From ee1d62a9431d61c1d6c9e83bad2b8e9237b7b216 Mon Sep 17 00:00:00 2001 From: Kent Yao Date: Fri, 22 May 2020 20:03:24 +0800 Subject: [PATCH 15/15] lazy --- .../org/apache/spark/sql/catalyst/csv/UnivocityParser.scala | 4 ++-- .../org/apache/spark/sql/catalyst/json/JacksonParser.scala | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala index 8e87a8276947..f2bb7db895ca 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala @@ -85,13 +85,13 @@ class UnivocityParser( // We preallocate it avoid unnecessary allocations. private val noRows = None - private val timestampFormatter = TimestampFormatter( + private lazy val timestampFormatter = TimestampFormatter( options.timestampFormat, options.zoneId, options.locale, legacyFormat = FAST_DATE_FORMAT, needVarLengthSecondFraction = true) - private val dateFormatter = DateFormatter( + private lazy val dateFormatter = DateFormatter( options.dateFormat, options.zoneId, options.locale, diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala index ef987931e928..c4f612172349 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala @@ -56,13 +56,13 @@ class JacksonParser( private val factory = options.buildJsonFactory() - private val timestampFormatter = TimestampFormatter( + private lazy val timestampFormatter = TimestampFormatter( options.timestampFormat, options.zoneId, options.locale, legacyFormat = FAST_DATE_FORMAT, needVarLengthSecondFraction = true) - private val dateFormatter = DateFormatter( + private lazy val dateFormatter = DateFormatter( options.dateFormat, options.zoneId, options.locale,
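The switch to `lazy val` in these two parsers is the counterpart of the eager `validatePatternString()` calls added earlier in the series: a formatter now validates its pattern when it is constructed, so the CSV/JSON parsers defer building their formatters until a datetime value is actually converted; otherwise a malformed `dateFormat`/`timestampFormat` option would fail every query, even one that never reads a date or timestamp field. A minimal standalone sketch of that deferral, using a toy class and plain `java.time` rather than Spark's internals:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Toy stand-in for a row parser whose options may carry a bad pattern.
// 'EEEEEE' (six letters) is rejected by java.time at construction time,
// but the `lazy val` means constructing the parser never touches it.
class ToyRowParser(datePattern: String) {
  private lazy val dateFormatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern(datePattern)

  // Only the first actual use forces the lazy val and surfaces the error.
  def formatDate(d: LocalDate): String = dateFormatter.format(d)
}

object LazyFormatterDemo extends App {
  val parser = new ToyRowParser("EEEEEE") // succeeds despite the bad pattern
  println("parser constructed without validating the pattern")
  // parser.formatDate(LocalDate.now()) // would throw IllegalArgumentException: Too many pattern letters: E
}
```

In Spark itself the first use goes through the `DateFormatter`/`TimestampFormatter` factories shown above, so when the legacy formatter would have accepted the pattern, what surfaces is the `SparkUpgradeException` suggestion rather than the raw `IllegalArgumentException`.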