Skip to content

[SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone#31473

Closed
saikocat wants to merge 3 commits intoapache:masterfrom
saikocat:SPARK-34357
Closed

[SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone#31473
saikocat wants to merge 3 commits intoapache:masterfrom
saikocat:SPARK-34357

Conversation

@saikocat
Copy link
Contributor

@saikocat saikocat commented Feb 4, 2021

What changes were proposed in this pull request?

Due to user-experience (confusing to Spark users - java.sql.Time using milliseconds vs Spark using microseconds; and user losing useful functions like hour(), minute(), etc on the column), we have decided to revert back to use TimestampType but this time we will enforce the hour to be consistently across system timezone (via offset manipulation) and date part fixed to zero epoch.

Full Discussion with Wenchen Fan Wenchen Fan regarding this ticket is here #30902 (comment)

Why are the changes needed?

Revert and improvement to sql.Time handling

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests and integration tests

@saikocat
Copy link
Contributor Author

saikocat commented Feb 4, 2021

CC: @cloud-fan @sarutak @skestle - regarding the updated changes to respect vendor db integrations

@sarutak
Copy link
Member

sarutak commented Feb 4, 2021

ok to test.

@sarutak
Copy link
Member

sarutak commented Feb 4, 2021

@saikocat Have you confirmed all the integration tests like PostgresIntegrationSuite pass?
Jenkins and GA don't run them so we need to confirm by ourselves.

@saikocat
Copy link
Contributor Author

saikocat commented Feb 4, 2021

@saikocat Have you confirmed all the integration tests like PostgresIntegrationSuite pass?
Jenkins and GA don't run them so we need to confirm by ourselves.

Yup, I learned from my previous mistake. All IT all passed.

Edit: manual loaded the docker image for mssql works

@sarutak sarutak changed the title [SPARK-34357] Revert JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone [SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone Feb 4, 2021
@sarutak
Copy link
Member

sarutak commented Feb 4, 2021

Yup, I learned from my previous mistake. All IT all passed.

Edit: manual loaded the docker image for mssql works

Thanks. Anyway, I'll run all the integration tests before this change merged.

if (rawTime != null) {
val localTimeMicro = TimeUnit.NANOSECONDS.toMicros(rawTime.toLocalTime().toNanoOfDay())
val localTimeMillis = DateTimeUtils.microsToMillis(localTimeMicro)
val timeZoneOffset = TimeZone.getDefault match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most cases, session timezone should be the same as JVM default timezone, but ideally we should use session timezone for datetime operations: SQLConf.get.sessionLocalTimeZone

val timeZoneOffset = TimeZone.getDefault match {
case zoneInfo: ZoneInfo => zoneInfo.getOffsetsByWall(localTimeMillis, null)
case timeZone: TimeZone => timeZone.getOffset(localTimeMillis - timeZone.getRawOffset)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we can just do DateTimeUtils.toUTCTime(localTimeMicro, SQLConf.get.sessionLocalTimeZone)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superb! This is really more elegant. Thanks for this suggestion!

@github-actions github-actions bot added the SQL label Feb 4, 2021
@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39458/

@sarutak
Copy link
Member

sarutak commented Feb 4, 2021

Confirmed all the JDBC integration tests pass for the latest commit 0b4a344.

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39458/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39461/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39461/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Test build #134872 has finished for PR 31473 at commit 15e980f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • assert(types(1).equals(\"class java.sql.Timestamp\"))
  • assert(types(5).equals(\"class java.sql.Timestamp\"))
  • assert(types(1).equals(\"class java.sql.Timestamp\"))
  • assert(types(2).equals(\"class java.sql.Timestamp\"))

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 7675582 Feb 4, 2021
@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Test build #134878 has finished for PR 31473 at commit 0b4a344.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants