[SPARK-16236] [SQL] [FOLLOWUP] Add Path Option back to Load API in DataFrameReader #13965
Test build #61447 has finished for PR 13965 at commit
    def json(path: String): DataFrame = {
      // This method ensures that calls that explicit need single argument works, see SPARK-16009
      json(Seq(path): _*)
      option("path", path).json(Seq.empty: _*)
@gatorsmile setting path or using load(Seq[String]) are both fine for json, parquet, and other file formats. See https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L312
You don't need to change this file.
@zsxwing I see your point. The issue affects others who build data sources using the data source APIs. See the discussion in #13727 (comment)
Never mind, I found the issue. JSON, Parquet, and the other file formats do not affect data source API developers. :) Let me revert the changes.
Test build #61477 has finished for PR 13965 at commit
LGTM. Merging to master and 2.0. Thanks!
…FrameReader
#### What changes were proposed in this pull request?
In the Python API, we have the same issue. Thanks for identifying this issue, zsxwing! Below is an example:
```Python
spark.read.format('json').load('python/test_support/sql/people.json')
```
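To illustrate what "adding the path option back" means for a call like the one above, here is a minimal pure-Python sketch. `ToyReader` is a hypothetical stand-in written for this illustration, not Spark's `DataFrameReader`; it only assumes that a single string path passed to `load` is recorded under the `path` option so that a custom data source can find it among its options.

```python
class ToyReader:
    """Hypothetical model of a reader; not Spark code."""

    def __init__(self):
        self._format = None
        self._options = {}

    def format(self, source):
        self._format = source
        return self

    def option(self, key, value):
        self._options[key] = value
        return self

    def load(self, path=None):
        # Assumption for this sketch: a single string path is stored as
        # the "path" option rather than passed along as a separate list.
        if isinstance(path, str):
            self.option("path", path)
            paths = []
        elif path is None:
            paths = []
        else:
            paths = list(path)
        return {
            "format": self._format,
            "options": dict(self._options),
            "paths": paths,
        }


people = "python/test_support/sql/people.json"
via_load = ToyReader().format("json").load(people)
via_option = ToyReader().format("json").option("path", people).load()
```

Under that assumption, `via_load` and `via_option` describe the same read, which is the equivalence a data source implementer would rely on when looking up `path` in the options.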
#### How was this patch tested?
Existing test cases cover the changes in this PR.
Author: gatorsmile
Closes #13965 from gatorsmile/optionPaths.
(cherry picked from commit 39f2eb1)
Signed-off-by: Shixiong Zhu
### What changes were proposed in this pull request?
The PR aims to upgrade `netty` from `4.1.109.Final` to `4.1.110.Final`.

### Why are the changes needed?
- https://netty.io/news/2024/05/22/4-1-110-Final.html
  This version has brought some bug fixes and improvements, such as:
  - Fix Zstd throws Exception on read-only volumes (netty/netty#13982)
  - Add unix domain socket transport in netty 4.x via JDK16+ ([#13965](netty/netty#13965))
  - Backport #13075: Add the AdaptivePoolingAllocator ([#13976](netty/netty#13976))
  - Add no-value key handling only for form body ([#13998](netty/netty#13998))
  - Add support for specifying SecureRandom in SSLContext initialization ([#14058](netty/netty#14058))
- https://github.com/netty/netty/issues?q=milestone%3A4.1.110.Final+is%3Aclosed

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46744 from panbingkun/SPARK-48420.
Authored-by: panbingkun
Signed-off-by: yangjie01