
Conversation

@MaxGekk (Member) commented Jan 3, 2020

What changes were proposed in this pull request?

  1. Put all deprecated SQL configs into the map SQLConf.deprecatedSQLConfigs, together with info about when each config was deprecated and a comment that explains why it was deprecated and what a user can use instead. Here is the list of already deprecated configs:

    • spark.sql.hive.verifyPartitionPath
    • spark.sql.execution.pandas.respectSessionTimeZone
    • spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName
    • spark.sql.parquet.int64AsTimestampMillis
    • spark.sql.variable.substitute.depth
    • spark.sql.execution.arrow.enabled
    • spark.sql.execution.arrow.fallback.enabled
  2. Output a warning in set() and unset() about deprecated SQL configs (a rough sketch of this bookkeeping follows this list)
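
Conceptually, the bookkeeping looks roughly like the sketch below (based on the DeprecatedConfig case class reported by the build bot further down; the map contents and the logDeprecationWarning helper are illustrative, not the actual implementation):

case class DeprecatedConfig(key: String, version: String, comment: String)

val deprecatedSQLConfigs: Map[String, DeprecatedConfig] = Seq(
  DeprecatedConfig("spark.sql.hive.verifyPartitionPath", "3.0",
    "This config is replaced by spark.files.ignoreMissingFiles."),
  DeprecatedConfig("spark.sql.variable.substitute.depth", "2.1",
    "This config is no longer used.")
).map(cfg => cfg.key -> cfg).toMap

// Called from set()/unset(); println stands in for Spark's logWarning.
def logDeprecationWarning(key: String): Unit = {
  deprecatedSQLConfigs.get(key).foreach { cfg =>
    println(s"The SQL config '${cfg.key}' has been deprecated in Spark v${cfg.version} " +
      s"and may be removed in the future. ${cfg.comment}")
  }
}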

Why are the changes needed?

This should improve the user experience with Spark SQL by notifying users about already deprecated SQL configs.

Does this PR introduce any user-facing change?

Yes, before:

spark-sql> set spark.sql.hive.verifyPartitionPath=true;
spark.sql.hive.verifyPartitionPath	true

After:

spark-sql> set spark.sql.hive.verifyPartitionPath=true;
20/01/03 21:28:17 WARN RuntimeConfig: The SQL config 'spark.sql.hive.verifyPartitionPath' has been deprecated in Spark v3.0.0 and may be removed in the future. This config is replaced by spark.files.ignoreMissingFiles.
spark.sql.hive.verifyPartitionPath	true

How was this patch tested?

Added a new test that registers a log appender and captures all logging to check that set() and unset() log a warning for deprecated configs.
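
In outline, the test looks roughly like this sketch (assuming the log4j 1.x API Spark used at the time and an in-scope SparkSession `spark`; the config key and the asserted message are illustrative):

import org.apache.log4j.{AppenderSkeleton, Logger}
import org.apache.log4j.spi.LoggingEvent
import scala.collection.mutable.ArrayBuffer

val logAppender = new AppenderSkeleton {
  val messages = new ArrayBuffer[String]()
  // Collect every rendered log message for later inspection.
  override def append(event: LoggingEvent): Unit = messages += event.getRenderedMessage
  override def close(): Unit = {}
  override def requiresLayout(): Boolean = false
}
val rootLogger = Logger.getRootLogger
rootLogger.addAppender(logAppender)
try {
  spark.conf.set("spark.sql.variable.substitute.depth", "40")  // a deprecated config
  assert(logAppender.messages.exists(_.contains("has been deprecated")))
} finally {
  rootLogger.removeAppender(logAppender)
}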

@SparkQA commented Jan 3, 2020

Test build #116099 has finished for PR 27092 at commit 722ddaa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk (Member, Author) commented Jan 4, 2020

@HyukjinKwon Please take a look at the PR.

@MaxGekk (Member, Author) commented Jan 6, 2020

@cloud-fan @maropu Please take a look at the PR.

@SparkQA commented Jan 6, 2020

Test build #116164 has finished for PR 27092 at commit f9a2dcd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

test("log deprecation warnings") {
  val logAppender = new AppenderSkeleton {

Member:

nit: The same kind of logAppender class seems to be defined in several places (see below), so can we define a helper method for this test purpose somewhere (e.g., TestUtils)?

$ grep -nr "extends AppenderSkeleton" .
./catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveHintsSuite.scala:36:  class MockAppender extends AppenderSkeleton {
./catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala:525:    class MockAppender extends AppenderSkeleton {
./catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizerLoggingSuite.scala:42:  class MockAppender extends AppenderSkeleton {
./core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala:1766:    class TestAppender extends AppenderSkeleton {
./core/src/test/scala/org/apache/spark/sql/JoinHintSuite.scala:41:  class MockAppender extends AppenderSkeleton {
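
For illustration, a shared helper along these lines could work (a sketch only; the name withLogAppender and its placement in TestUtils are hypothetical, and it assumes log4j 1.x):

import org.apache.log4j.{Appender, Logger}

// Runs `f` with `appender` temporarily attached to the root logger,
// so a test can capture and inspect everything logged inside `f`.
def withLogAppender(appender: Appender)(f: => Unit): Unit = {
  val rootLogger = Logger.getRootLogger
  rootLogger.addAppender(appender)
  try f finally rootLogger.removeAppender(appender)
}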

Member Author (MaxGekk):

It is slightly orthogonal to this PR, but if you think it makes sense, I will do that here.

Member:

Yea, I think it's OK to do it in a follow-up.

@maropu (Member) left a comment:

This looks useful! LGTM. cc: @HyukjinKwon

val deprecatedSQLConfigs: Map[String, DeprecatedConfig] = {
  val configs = Seq(
    DeprecatedConfig(VARIABLE_SUBSTITUTE_DEPTH.key, "2.1",

Member Author (MaxGekk):

I haven't found where this config is used. We can remove it, I think.

.doc("When true, check all the partition paths under the table\'s root directory " +
"when reading data stored in HDFS. This configuration will be deprecated in the future " +
"releases and replaced by spark.files.ignoreMissingFiles.")
s"releases and replaced by ${SPARK_IGNORE_MISSING_FILES.key}.")

Member Author (MaxGekk):

@cloud-fan Regarding your comment #19868 (comment): spark.sql.hive.verifyPartitionPath can be changed at runtime, but spark.files.ignoreMissingFiles cannot. Is it a fair replacement?

Member:

If you're concerned, we can slightly reword it to say that users can use spark.files.ignoreMissingFiles, instead of calling it a replacement. I don't believe this configuration is commonly used, so it should be fine.

If users find this unreasonable, we can un-deprecate it later based on their feedback.

@SparkQA commented Jan 8, 2020

Test build #116292 has finished for PR 27092 at commit e93f587.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class DeprecatedConfig(key: String, version: String, comment: String)

@SparkQA commented Jan 8, 2020

Test build #116299 has finished for PR 27092 at commit 496ca8c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member):

@MaxGekk mind resolving conflicts?

(merge commit) …cated-sql-configs

# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@SparkQA commented Jan 9, 2020

Test build #116349 has finished for PR 27092 at commit faf88e8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk (Member, Author) commented Jan 9, 2020

jenkins, retest this, please

@SparkQA commented Jan 9, 2020

Test build #116371 has finished for PR 27092 at commit faf88e8.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member):

@viirya, mind asking the R sysadmin about it again? It has started to fail again:

* checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
  dims [product 24] do not match the length of object [0]
Execution halted

@viirya (Member) commented Jan 9, 2020

> @viirya, mind asking the R sysadmin about it again? It has started to fail again:
>
> * checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
>   dims [product 24] do not match the length of object [0]
> Execution halted

Sure, I just asked the CRAN admin for help. :)

@viirya (Member) commented Jan 9, 2020

@HyukjinKwon Although I didn't get a reply, it looks like it was fixed, because I saw a successful build on another PR.

@viirya (Member) commented Jan 9, 2020

retest this please

@SparkQA commented Jan 10, 2020

Test build #116416 has finished for PR 27092 at commit faf88e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member):

Thanks, @viirya

Merged to master.
