ScrapCodes (Member) commented Oct 23, 2020

What changes were proposed in this pull request?

Support renaming a column in the MySQL dialect.

Why are the changes needed?

At the moment, renaming a column does not work on MySQL 5.x, so we should throw a proper exception in that case.

Does this PR introduce any user-facing change?

Yes, renaming a column through the MySQL dialect should now work correctly.

How was this patch tested?

Added tests for renaming a column and ran them successfully against both MySQL versions:

  • export MYSQL_DOCKER_IMAGE_NAME=mysql:5.7.31
  • export MYSQL_DOCKER_IMAGE_NAME=mysql:8.0

@ScrapCodes
Member Author

While testing #30038, I found that renaming a column needs some work in the MySQL dialect. More details are in the code comments.

cc @cloud-fan and @huaxingao


SparkQA commented Oct 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34806/


SparkQA commented Oct 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34807/


SparkQA commented Oct 23, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34806/


SparkQA commented Oct 23, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34807/


SparkQA commented Oct 23, 2020

Test build #130205 has finished for PR 30142 at commit 73b0550.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 23, 2020

Test build #130207 has finished for PR 30142 at commit f4b8a24.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 23, 2020

Test build #130206 has finished for PR 30142 at commit 7c9aa72.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  }

  override def testRenameColumn(tbl: String): Unit = {
    if (db.imageName.matches(".*mysql.5\\.[0-9].*")) {
Contributor

we can check the config value here.
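
For reference, here is a small sketch of what the image-name check in the line above matches, using the two image names from the PR description. The object and method names are hypothetical and only serve the illustration; the pattern is copied from the diff.

```scala
object ImageNameCheckSketch {
  // Pattern copied from the diff line under review.
  val mysql5Pattern: String = ".*mysql.5\\.[0-9].*"

  // Hypothetical helper: true when the Docker image name looks like a MySQL 5.x image.
  def isMySQL5Image(imageName: String): Boolean =
    imageName.matches(mysql5Pattern)

  def main(args: Array[String]): Unit = {
    println(isMySQL5Image("mysql:5.7.31")) // true
    println(isMySQL5Image("mysql:8.0"))    // false
  }
}
```

Note that the `.` after `mysql` is unescaped, so it matches any single character rather than only `:`. The check is therefore looser than it looks.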

  override def testRenameColumn(tbl: String): Unit = {
    if (db.imageName.matches(".*mysql.5\\.[0-9].*")) {
      sql(s"CREATE TABLE $tbl (ID STRING NOT NULL) USING _")
      // Update nullability is unsupported for mysql db.
Contributor

rename is unsupported ...

    .createWithDefault("")

  val JDBC_MYSQL_VERSION =
    buildConf("spark.sql.jdbc.mysql.version")
Contributor

What if users want to connect to both MySQL 5 and MySQL 8 in the same session? I think it's better to make it a catalog option, not a session config.
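
To make the concern concrete, here is a rough sketch of a session that registers two JDBC catalogs, one per MySQL server. The catalog names `mysql5` and `mysql8` and both URLs are hypothetical, and the catalog class name is written from memory, so treat it as an assumption. The `spark.sql.catalog.<name>` and `spark.sql.catalog.<name>.url` key shapes follow the test configuration in this PR.

```scala
object CatalogConfigSketch {
  // Assumed fully qualified name of JDBCTableCatalog, kept as a string so the
  // sketch needs no Spark dependency.
  val jdbcCatalogClass: String =
    "org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog"

  // Hypothetical catalog names and JDBC URLs, one catalog per MySQL server.
  val catalogSettings: Map[String, String] = Map(
    "spark.sql.catalog.mysql5" -> jdbcCatalogClass,
    "spark.sql.catalog.mysql5.url" -> "jdbc:mysql://legacy-host:3306",
    "spark.sql.catalog.mysql8" -> jdbcCatalogClass,
    "spark.sql.catalog.mysql8.url" -> "jdbc:mysql://new-host:3306"
  )

  // The session config proposed in the diff holds one value for the whole
  // session, so it cannot describe both servers at once.
  val sessionSetting: (String, String) = "spark.sql.jdbc.mysql.version" -> "8.0"
}
```

Each catalog has its own key namespace, which is why a per-catalog option can differ between the two servers while a single session-level key cannot.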

      tableName: String,
      columnName: String,
      newName: String): String = {
    if (SQLConf.get.jdbcMySQLVersion.matches("^8\\.[0-9].*")) {
Contributor

Instead of using a session config, we probably want to get the database version using DatabaseMetaData in JdbcUtils.alterTable and pass it in here, and only do RENAME if the version is >= 8:

conn.getMetaData.getDatabaseMajorVersion

Contributor

ah, it's even better if we can get the database version, so that users don't need to specify it.

Member Author

Thank you @huaxingao and @cloud-fan, for this suggestion.
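
The suggestion in this thread could look roughly like the sketch below. It is a minimal illustration, not the code merged in this PR: the object name, both helper names, the parameter list, and the error message are all hypothetical, and identifier quoting is omitted for brevity. The only facts relied on are that JDBC's `DatabaseMetaData.getDatabaseMajorVersion` returns the server's major version as an `Int`, and that MySQL 8.0 accepts `ALTER TABLE ... RENAME COLUMN ... TO ...`.

```scala
import java.sql.{Connection, SQLFeatureNotSupportedException}

object RenameColumnSketch {
  // Hypothetical helper: reads the server's major version from JDBC metadata,
  // so users do not have to configure it.
  def databaseMajorVersion(conn: Connection): Int =
    conn.getMetaData.getDatabaseMajorVersion

  // Hypothetical helper: builds the rename statement for MySQL 8 and above,
  // and fails with a clear exception on older servers.
  def renameColumnSql(
      table: String,
      oldName: String,
      newName: String,
      majorVersion: Int): String = {
    if (majorVersion >= 8) {
      s"ALTER TABLE $table RENAME COLUMN $oldName TO $newName"
    } else {
      throw new SQLFeatureNotSupportedException(
        s"Rename column needs MySQL 8.0 or above, found major version $majorVersion")
    }
  }
}
```

A caller would pass `databaseMajorVersion(conn)` as the last argument. Comparing the numeric major version also covers later releases, whereas a version string such as "9.0.1" would not match the `^8\\.[0-9].*` pattern from the earlier revision.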

@ScrapCodes ScrapCodes force-pushed the mysql-dialect-rename branch from f4b8a24 to 2e4fd62 Compare October 29, 2020 09:08

SparkQA commented Oct 29, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35006/


SparkQA commented Oct 29, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35006/

@ScrapCodes
Member Author

@cloud-fan and @huaxingao I have updated the PR with your suggestions and also rebased it with master. Please take a look again.

@ScrapCodes ScrapCodes force-pushed the mysql-dialect-rename branch from 55cca16 to 9e1f75a Compare October 29, 2020 10:32

SparkQA commented Oct 29, 2020

Test build #130402 has finished for PR 30142 at commit 2e4fd62.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 29, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35008/

@ScrapCodes
Member Author

Jenkins, retest this please


SparkQA commented Oct 29, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35008/

  override def sparkConf: SparkConf = super.sparkConf
    .set("spark.sql.catalog.mysql", classOf[JDBCTableCatalog].getName)
    .set("spark.sql.catalog.mysql.url", db.getJdbcUrl(dockerIp, externalPort))
  override def sparkConf: SparkConf = {
Contributor

unnecessary change


SparkQA commented Oct 29, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35011/

  }

  override def testRenameColumn(tbl: String): Unit = {
    assert( mySQLVersion > 0)
Contributor

nit: assert(mySQLVersion > 0) (remove extra space)

    val expectedSchema = new StructType().add("ID2", StringType, nullable = true)
      .add("ID1", StringType, nullable = true)
    assert(t.schema === expectedSchema)
    // Rename to already existing column
Contributor

This can be put in the test("SPARK-33034: ALTER TABLE ... rename column"), as the behavior should always be the same.

Member Author

done


SparkQA commented Oct 29, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35011/


SparkQA commented Oct 29, 2020

Test build #130404 has finished for PR 30142 at commit 9e1f75a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 29, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35012/


SparkQA commented Oct 29, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35012/


SparkQA commented Oct 29, 2020

Test build #130407 has finished for PR 30142 at commit 9e1f75a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 29, 2020

Test build #130408 has finished for PR 30142 at commit fe1e44f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Contributor

@ScrapCodes it seems you forgot to address this #30142 (comment). Everything else looks good.

@ScrapCodes ScrapCodes force-pushed the mysql-dialect-rename branch from 5b8e45e to ce70996 Compare October 30, 2020 09:32

SparkQA commented Oct 30, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35057/


SparkQA commented Oct 30, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35057/


SparkQA commented Oct 30, 2020

Test build #130452 has finished for PR 30142 at commit ce70996.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ScrapCodes
Member Author

Hi @huaxingao and @cloud-fan, please take a look.

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 6226ccc Nov 2, 2020

4 participants