[SPARK-27199][SQL] Replace TimeZone by ZoneId in TimestampFormatter API #24141
Test build #103667 has finished for PR 24141 at commit
pattern: String,
timeZone: TimeZone,
locale: Locale) extends TimestampFormatter with DateTimeFormatterHelper {
pattern: String,
nit: indent
Thanks. Just wondering why scalastyle didn't catch that.
Looks reasonable to me. Is there any behavior change? I don't think so, just checking.
Potentially we could observe a behavior change in parsing time zone id strings but because we did it twice in
Test build #103675 has started for PR 24141 at commit
jenkins, retest this, please
Test build #103693 has finished for PR 24141 at commit
Merged to master.
… and FromUnixTime

## What changes were proposed in this pull request?

SPARK-27199 introduced the use of `ZoneId` instead of `TimeZone` in a few date/time expressions. There were 3 occurrences of `ctx.addReferenceObj("zoneId", zoneId)` in that PR, which had a bug: while the `java.time.ZoneId` base type is public, the actual concrete implementation classes are not public, so using the 2-arg version of `CodegenContext.addReferenceObj` would incorrectly generate code that references non-public types (`java.time.ZoneRegion`, to be specific). The 3-arg version should be used, with the class name of the referenced object explicitly set to the public base type.

One such occurrence was caught in testing in the main PR of SPARK-27199 (#24141), for `DateFormatClass`. But the other 2 occurrences slipped through because there were no test cases that covered them.

Example of this bug in the current Apache Spark master, in a Spark Shell:

```
scala> Seq(("2016-04-08", "yyyy-MM-dd")).toDF("s", "f").repartition(1).selectExpr("to_unix_timestamp(s, f)").show
...
java.lang.IllegalAccessError: tried to access class java.time.ZoneRegion from class org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1
```

This PR fixes the codegen issues and adds the corresponding unit tests.

## How was this patch tested?

Enhanced tests in `DateExpressionsSuite` for `to_unix_timestamp` and `from_unixtime`.

Closes #24352 from rednaxelafx/fix-spark-27199.

Authored-by: Kris Mok <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
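The visibility problem described in the commit message above can be observed outside Spark with plain JDK reflection. The following is a minimal sketch, not Spark code: the object name `ZoneIdVisibilitySketch` and its fields are made up for illustration. It only shows that the runtime class of a region-based `ZoneId` is not public, which is why generated code has to name the public base type instead.

```scala
import java.lang.reflect.Modifier
import java.time.ZoneId

// Hypothetical helper, not part of Spark: inspects the class behind a ZoneId value.
object ZoneIdVisibilitySketch {
  // A region-based zone id, similar to what a session time zone would produce.
  val zoneId: ZoneId = ZoneId.of("America/Los_Angeles")

  // The concrete runtime class, which differs from the declared type ZoneId.
  val runtimeClass: Class[_] = zoneId.getClass

  // Whether generated code in another package could legally name that class.
  val runtimeClassIsPublic: Boolean = Modifier.isPublic(runtimeClass.getModifiers)

  // The public base type whose name is safe to emit in generated code.
  val publicBaseClassName: String = classOf[ZoneId].getName

  def main(args: Array[String]): Unit = {
    println(s"runtime class: ${runtimeClass.getName}, public: $runtimeClassIsPublic")
    println(s"safe class name for generated code: $publicBaseClassName")
  }
}
```

A code generator that derives the type name from `getClass` would pick up the non-public class, while passing the base type name explicitly avoids it.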
What changes were proposed in this pull request?
In the PR, I propose to use `ZoneId` instead of `TimeZone` in:
- the `apply` and `getFractionFormatter` methods of the `TimestampFormatter` object,
- implementations of the `TimestampFormatter` trait like `FractionTimestampFormatter`.

The reason for the changes is to avoid unnecessary conversion from `TimeZone` to `ZoneId`, because `ZoneId` is used in `TimestampFormatter` implementations internally and the conversion is performed via `String`, which is not free. Also, taking into account that `TimeZone` instances are converted from `String` in some cases, the worst case looks like `String` -> `TimeZone` -> `String` -> `ZoneId`. The PR eliminates the unneeded conversions.

How was this patch tested?
It was tested by `DateExpressionsSuite`, `DateTimeUtilsSuite` and `TimestampFormatterSuite`.
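The conversion chain from the description can be sketched with JDK classes only. This is an illustration under assumptions, not the Spark implementation: `ZoneConversionSketch`, `viaTimeZone` and `direct` are hypothetical names. It also shows one known difference between the two parsers, which may or may not be the behavior change discussed in the review: `TimeZone.getTimeZone` silently falls back to GMT for an unknown id, while `ZoneId.of` throws.

```scala
import java.time.ZoneId
import java.util.TimeZone
import scala.util.Try

// Hypothetical helper, not part of Spark: contrasts the two ways of obtaining a ZoneId.
object ZoneConversionSketch {
  // Longer route: String -> TimeZone -> ZoneId (per the PR description,
  // the last step goes through the zone id string again).
  def viaTimeZone(id: String): ZoneId = {
    val tz: TimeZone = TimeZone.getTimeZone(id) // unknown ids become GMT here
    tz.toZoneId
  }

  // Direct route: String -> ZoneId, with strict validation of the id.
  def direct(id: String): ZoneId = ZoneId.of(id)

  def main(args: Array[String]): Unit = {
    // Both routes agree for a valid region id.
    println(viaTimeZone("Europe/Moscow") == direct("Europe/Moscow"))
    // For an unknown id the routes diverge: fallback versus exception.
    println(viaTimeZone("Not/AZone").getId)
    println(Try(direct("Not/AZone")).isFailure)
  }
}
```

Accepting a `ZoneId` in the formatter API means the id string is parsed once, at the point where the caller builds the `ZoneId`.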