Member

@MaxGekk MaxGekk commented Feb 17, 2021

What changes were proposed in this pull request?

  1. Add the SQL config spark.sql.legacy.replaceDatabricksSparkAvro.enabled to the list of deprecated configs, deprecatedSQLConfigs
  2. Update docs for the Avro datasource

Screenshot 2021-02-17 at 21 04 26

Why are the changes needed?

The config has existed for long enough. We can deprecate it and recommend that users use .format("avro") instead.

Does this PR introduce any user-facing change?

It should not, except for a warning that recommends using the avro format.

How was this patch tested?

  1. By generating docs via:
$ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch
  2. By manually checking the warning:
scala> spark.conf.set("spark.sql.legacy.replaceDatabricksSparkAvro.enabled", false)
21/02/17 21:20:18 WARN SQLConf: The SQL config 'spark.sql.legacy.replaceDatabricksSparkAvro.enabled' has been deprecated in Spark v3.2 and may be removed in the future. Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.
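
The shape of the warning above can be reproduced with a small standalone model. This is only a sketch and not Spark's SQLConf code: the three constructor arguments mirror the DeprecatedConfig entry shown in this PR's diff, but the field names (key, version, comment), the object name, and the helper deprecationWarning are hypothetical names chosen for illustration.

```scala
// Standalone sketch; it does not depend on Spark.
object DeprecationWarningSketch {
  // Hypothetical model of one deprecated-config entry:
  // the config key, the version that deprecated it, and the advice to users.
  final case class DeprecatedConfig(key: String, version: String, comment: String)

  val avroConfig: DeprecatedConfig = DeprecatedConfig(
    "spark.sql.legacy.replaceDatabricksSparkAvro.enabled",
    "3.2",
    """Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.""")

  // Builds a message in the same shape as the WARN line quoted above
  // (assumed format, inferred from that log line only).
  def deprecationWarning(c: DeprecatedConfig): String =
    s"The SQL config '${c.key}' has been deprecated in Spark v${c.version} " +
      s"and may be removed in the future. ${c.comment}"

  def main(args: Array[String]): Unit =
    println(deprecationWarning(avroConfig))
}
```

Keeping the advice as a separate field means each deprecated entry can carry its own migration hint while the surrounding sentence stays uniform.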

SparkQA commented Feb 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39783/

SparkQA commented Feb 17, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39783/

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @MaxGekk.

-      s"Use '${AVRO_REBASE_MODE_IN_READ.key}' instead.")
+      s"Use '${AVRO_REBASE_MODE_IN_READ.key}' instead."),
+    DeprecatedConfig(LEGACY_REPLACE_DATABRICKS_SPARK_AVRO_ENABLED.key, "3.2",
+      """Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.""")
Member

@dongjoon-hyun dongjoon-hyun Feb 17, 2021

Ur, is this guide correct? I guess the users are using .format("avro") already.

Member Author

The config controls the automatic mapping of .format("com.databricks.spark.avro") to .format("avro"), right? If we remove the config in the future, "com.databricks.spark.avro" will no longer be mapped to the built-in avro. So, in the guide, we recommend that users change their code and use the avro format directly.
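
To make the mapping concrete, here is a minimal standalone sketch of the behavior described in this reply. It is a simplified model under stated assumptions, not Spark's actual data source lookup: the object name, the resolveFormat helper, and its boolean parameter are hypothetical stand-ins for the config value.

```scala
// Standalone sketch; it does not depend on Spark.
object AvroFormatMappingSketch {
  val LegacyAvroFormat = "com.databricks.spark.avro"
  val BuiltInAvroFormat = "avro"

  // Hypothetical resolver: the legacy name is remapped to the built-in
  // format only while the (modeled) legacy flag is enabled.
  def resolveFormat(requested: String, replaceDatabricksSparkAvro: Boolean): String =
    if (replaceDatabricksSparkAvro && requested == LegacyAvroFormat) BuiltInAvroFormat
    else requested

  def main(args: Array[String]): Unit = {
    // Flag on: the legacy name resolves to the built-in format.
    println(resolveFormat(LegacyAvroFormat, replaceDatabricksSparkAvro = true))
    // Flag off or removed: the legacy name is left as-is.
    println(resolveFormat(LegacyAvroFormat, replaceDatabricksSparkAvro = false))
    // Code that already uses "avro" does not depend on the flag.
    println(resolveFormat(BuiltInAvroFormat, replaceDatabricksSparkAvro = false))
  }
}
```

In this model, code that already calls .format("avro") is unaffected by the flag, which is why the deprecation message points users to that format name.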

Member

Thanks. I was confused about that.

Member

@dongjoon-hyun dongjoon-hyun left a comment

I have a question on the deprecation message.

SparkQA commented Feb 17, 2021

Test build #135202 has finished for PR 31578 at commit 660db04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Merged to master.
