Skip to content

Conversation

@imback82
Copy link
Contributor

@imback82 imback82 commented Sep 9, 2020

What changes were proposed in this pull request?

This PR is a follow up to #29543 (comment), which correctly points out that the check for the empty string is not necessary.

Why are the changes needed?

The unnecessary check actually could cause more confusion.

For example,

scala> Seq(1).toDF.write.option("path", "/tmp/path1").parquet("")
java.lang.IllegalArgumentException: Can not create a Path from an empty string
  at org.apache.hadoop.fs.Path.checkPathArg(Path.java:168)

even when path option is available. This PR addresses to fix this confusion.

Does this PR introduce any user-facing change?

Yes, now the above example prints the consistent exception message whether the path parameter value is empty or not.

scala> Seq(1).toDF.write.option("path", "/tmp/path1").parquet("")
org.apache.spark.sql.AnalysisException: There is a 'path' option set and save() is called with a path parameter. Either remove the path option, or call save() without the parameter. To ignore this check, set 'spark.sql.legacy.pathOptionBehavior.enabled' to 'true'.;
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:290)
  at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:856)
  ... 47 elided

How was this patch tested?

Added unit tests.

spark.readStream
.format("org.apache.spark.sql.streaming.test")
.option("path", "tmp4")
.load("tmp5")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was testing an incorrect API, so I am fixing it here.

@SparkQA
Copy link

SparkQA commented Sep 9, 2020

Test build #128464 has finished for PR 29697 at commit d2c7756.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

@imback82 thanks for the followup! can you address this comment as well? https://github.com/apache/spark/pull/29543/files#r485994670 We can use withTempPaths to generate fresh temp paths.

@cloud-fan
Copy link
Contributor

that comment is not an issue, I'm merging it to master, thanks!

@cloud-fan cloud-fan closed this in ab2fa88 Sep 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants