[SPARK-30668][SQL] Support SimpleDateFormat patterns in parsing timestamps/dates strings
#27441
Changes from all commits
68f6c65
13d0ba1
3d335ff
c8bc585
af38b87
03cec7b
31675fa
8ecb318
fc25dc7
19a47c1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -789,4 +789,18 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession { | ||
| Row(Timestamp.valueOf("2015-07-24 07:00:00")), | ||
| Row(Timestamp.valueOf("2015-07-24 22:00:00")))) | ||
| } | ||
| | ||
| test("SPARK-30668: use legacy timestamp parser in to_timestamp") { | ||
| def checkTimeZoneParsing(expected: Any): Unit = { | ||
| val df = Seq("2020-01-27T20:06:11.847-0800").toDF("ts") | ||
| checkAnswer(df.select(to_timestamp(col("ts"), "yyyy-MM-dd'T'HH:mm:ss.SSSz")), | ||
| Row(expected)) | ||
| } | ||
| withSQLConf(SQLConf.LEGACY_TIME_PARSER_ENABLED.key -> "true") { | ||
| checkTimeZoneParsing(Timestamp.valueOf("2020-01-27 20:06:11.847")) | ||
| } | ||
| withSQLConf(SQLConf.LEGACY_TIME_PARSER_ENABLED.key -> "false") { | ||
| checkTimeZoneParsing(null) | ||
|
Member: Fall back to the old parser?
Member (Author): A silent fallback to the old parser can lead to mixed values in the same column: some in the combined Julian+Gregorian calendar, others in the Proleptic Gregorian calendar.
Member (Author): Strings parsed in different calendars may differ by a dozen days. |
||
| } | ||
| } | ||
| } | ||
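The behaviour this test pins down can be reproduced outside Spark with the two JDK parsers. The following is a minimal sketch, not Spark's implementation: the class `ZonePatternSketch` and its helper methods are hypothetical names, and the `xx` pattern is shown only as one `java.time` offset letter that accepts `-0800`. It assumes that Spark's legacy parser behaves like `SimpleDateFormat` and the new one like `DateTimeFormatter`, as the discussion in this PR suggests.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Optional;

// Hypothetical demo class; not part of the Spark code base.
final class ZonePatternSketch {
    // Same pattern as in the test above.
    static final String PATTERN_Z = "yyyy-MM-dd'T'HH:mm:ss.SSSz";
    // Assumption for illustration: 'xx' is the java.time letter for offsets like -0800.
    static final String PATTERN_XX = "yyyy-MM-dd'T'HH:mm:ss.SSSxx";

    // Legacy path: when parsing, SimpleDateFormat also accepts RFC 822 offsets for 'z'.
    static Optional<Instant> parseLegacy(String pattern, String text) {
        try {
            return Optional.of(new SimpleDateFormat(pattern, Locale.US).parse(text).toInstant());
        } catch (ParseException e) {
            return Optional.empty();
        }
    }

    // java.time path: 'z' means a time-zone name, so a bare "-0800" is rejected.
    static Optional<Instant> parseJavaTime(String pattern, String text) {
        try {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.US);
            return Optional.of(ZonedDateTime.parse(text, formatter).toInstant());
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String input = "2020-01-27T20:06:11.847-0800";
        System.out.println("legacy, z:     " + parseLegacy(PATTERN_Z, input));
        System.out.println("java.time, z:  " + parseJavaTime(PATTERN_Z, input));
        System.out.println("java.time, xx: " + parseJavaTime(PATTERN_XX, input));
    }
}
```

An empty `Optional` here plays the role of the `null` that `to_timestamp` returns in the test when the legacy parser is disabled.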
Is this really related to the Proleptic Gregorian calendar switch? It looks to me like we are just switching to a better pattern-string implementation.
Yes, it is related because `SimpleDateFormat` and `DateTimeFormatter` use different calendars underneath. The slightly different patterns are just a consequence of the switch.
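To make the calendar point concrete, here is a small sketch that parses the same date string with both JDK APIs and compares the resulting epoch values. The class `CalendarGapSketch` and its methods are hypothetical names chosen for illustration. It relies on `SimpleDateFormat` using the default `GregorianCalendar` (Julian before the 1582 cutover) and on `java.time` using the ISO proleptic Gregorian calendar.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; not part of the Spark code base.
final class CalendarGapSketch {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Hybrid Julian+Gregorian calendar via SimpleDateFormat, interpreted at UTC midnight.
    static long legacyEpochMillis(String date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(date).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + date, e);
        }
    }

    // Proleptic Gregorian calendar via java.time, interpreted at UTC midnight.
    static long prolepticEpochMillis(String date) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.US);
        return LocalDate.parse(date, formatter)
            .atStartOfDay(ZoneOffset.UTC)
            .toInstant()
            .toEpochMilli();
    }

    // Whole days between the two interpretations of the same string.
    static long gapInDays(String date) {
        return (legacyEpochMillis(date) - prolepticEpochMillis(date)) / MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        for (String date : new String[] {"2020-01-27", "1000-01-01"}) {
            System.out.println(date + " -> gap in days: " + gapInDays(date));
        }
    }
}
```

Modern dates come out identical under both calendars, while dates before the 1582 cutover do not, which is why a silent per-row fallback could mix the two interpretations in one column.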