[SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone #31473
saikocat wants to merge 3 commits into apache:master
Conversation
|
CC: @cloud-fan @sarutak @skestle - regarding the updated changes to respect vendor db integrations |
|
ok to test. |
|
@saikocat Have you confirmed that all the integration tests pass?
Yup, I learned from my previous mistake. All the integration tests passed. Edit: manually loading the docker image for mssql works as well. |
Thanks. Anyway, I'll run all the integration tests before this change is merged. |
if (rawTime != null) {
  val localTimeMicro = TimeUnit.NANOSECONDS.toMicros(rawTime.toLocalTime().toNanoOfDay())
  val localTimeMillis = DateTimeUtils.microsToMillis(localTimeMicro)
  val timeZoneOffset = TimeZone.getDefault match {
In most cases, session timezone should be the same as JVM default timezone, but ideally we should use session timezone for datetime operations: SQLConf.get.sessionLocalTimeZone
val timeZoneOffset = TimeZone.getDefault match {
  case zoneInfo: ZoneInfo => zoneInfo.getOffsetsByWall(localTimeMillis, null)
  case timeZone: TimeZone => timeZone.getOffset(localTimeMillis - timeZone.getRawOffset)
}
Seems we can just do DateTimeUtils.toUTCTime(localTimeMicro, SQLConf.get.sessionLocalTimeZone)
Superb! This is really more elegant. Thanks for this suggestion!
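For readers outside the Spark codebase, here is a minimal sketch of what the suggested DateTimeUtils.toUTCTime call does: it interprets a microsecond value as a wall-clock time in a given zone and returns the corresponding UTC instant. This is plain Java using only java.time (the class and method names below are illustrative, not Spark's), so the zone-offset arithmetic it replaces in the diff above can be seen in isolation:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ToUTCTimeSketch {
    // Interpret `localMicros` (microseconds since the epoch) as wall-clock
    // time in `zone` and return the corresponding UTC instant in microseconds.
    // Approximates DateTimeUtils.toUTCTime(localMicros, zoneId) without Spark.
    public static long toUTCTimeMicros(long localMicros, ZoneId zone) {
        long secs = Math.floorDiv(localMicros, 1_000_000L);
        int nanos = (int) Math.floorMod(localMicros, 1_000_000L) * 1_000;
        // Reinterpret the micros as a local wall-clock date-time...
        LocalDateTime wallClock = LocalDateTime.ofEpochSecond(secs, nanos, ZoneOffset.UTC);
        // ...then resolve it in the target zone to get the real UTC instant.
        Instant utc = wallClock.atZone(zone).toInstant();
        return utc.getEpochSecond() * 1_000_000L + utc.getNano() / 1_000L;
    }

    public static void main(String[] args) {
        // Local noon on 1970-01-01 as wall-clock micros.
        long localNoon = 12L * 3600 * 1_000_000;
        // In Asia/Tokyo (UTC+9), local noon corresponds to 03:00 UTC.
        long utcMicros = toUTCTimeMicros(localNoon, ZoneId.of("Asia/Tokyo"));
        System.out.println(utcMicros == 3L * 3600 * 1_000_000); // prints true
    }
}
```

Handling the offset this way also avoids the ZoneInfo pattern match, which depends on a sun.util.calendar internal class.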
|
Kubernetes integration test starting |
|
Confirmed all the JDBC integration tests pass for the latest commit 0b4a344. |
|
Kubernetes integration test status success |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #134872 has finished for PR 31473 at commit
|
|
thanks, merging to master! |
|
Test build #134878 has finished for PR 31473 at commit
|
What changes were proposed in this pull request?
Due to user-experience concerns (java.sql.Time uses milliseconds while Spark uses microseconds, which confuses Spark users, and with a non-timestamp type users lose useful functions like hour() and minute() on the column), we have decided to revert to TimestampType, but this time the time portion is kept consistent across system timezones (via offset manipulation) and the date part is fixed to the zero epoch.
The full discussion with Wenchen Fan regarding this ticket is at #30902 (comment).
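The proposed mapping can be sketched end to end in plain Java (the class and method names are illustrative; the PR itself implements this in Spark's Scala JDBC layer with DateTimeUtils): take the wall-clock time carried by the java.sql.Time, pin its date part to 1970-01-01, and shift it by the session timezone's offset, so that Spark, which renders TimestampType in the session timezone, displays the same time-of-day everywhere:

```java
import java.sql.Time;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.concurrent.TimeUnit;

public class TimeToTimestampSketch {
    // Convert a JDBC TIME value to timestamp microseconds whose date part is
    // fixed to the epoch (1970-01-01) and whose time-of-day renders the same
    // in `sessionZone` regardless of where the job runs.
    public static long timeToMicros(Time rawTime, ZoneId sessionZone) {
        LocalTime localTime = rawTime.toLocalTime();
        // Microseconds since midnight; also micros since the epoch on 1970-01-01.
        long localMicros = TimeUnit.NANOSECONDS.toMicros(localTime.toNanoOfDay());
        // Treat that value as a wall-clock on the epoch date in the session
        // zone, and store the corresponding UTC instant in microseconds.
        LocalDateTime onEpochDate = LocalDateTime.ofEpochSecond(
            localMicros / 1_000_000L, (int) (localMicros % 1_000_000L) * 1_000, ZoneOffset.UTC);
        Instant utc = onEpochDate.atZone(sessionZone).toInstant();
        return utc.getEpochSecond() * 1_000_000L + utc.getNano() / 1_000L;
    }

    public static void main(String[] args) {
        Time t = Time.valueOf("15:30:00");
        // America/New_York was UTC-5 on 1970-01-01, so the stored UTC value
        // is 20:30:00, which renders back as 15:30:00 in that session zone.
        long micros = timeToMicros(t, ZoneId.of("America/New_York"));
        System.out.println(micros == 73_800_000_000L); // prints true
    }
}
```

The stored UTC microseconds differ per session timezone, but reading the value back in the same session timezone always yields the original time-of-day, which is the invariant this PR is after.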
Why are the changes needed?
A revert of, and improvement to, the java.sql.Time handling.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests and integration tests