[SPARK-31308][PySpark] Merging pyFiles to files argument for Non-PySpark applications #28077
Maybe we need a unit test too. But let me wait for some comments first.

I think it's fine. cc @vanzin and @jerryshao

Hi, @viirya. This PR title looks too broad. Could you be more specific by excluding the scope of SPARK-24377?

Test build #120622 has finished for PR 28077 at commit

Test build #120628 has finished for PR 28077 at commit

retest this please

Test build #120631 has finished for PR 28077 at commit

Merged to master. Thank you, @viirya and @HyukjinKwon.
…ark applications

### What changes were proposed in this pull request?

This PR (SPARK-31308) proposes adding Python dependencies even when the application is not a Python application.

### Why are the changes needed?

Currently, SparkSubmit adds the `pyFiles` argument to the `files` argument only for Python applications. As explained in apache#21420, "for some Spark applications, though they're a java program, they require not only jar dependencies, but also python dependencies." For the same reason, `pyFiles` should be added to `files` even for non-Python applications.

### Does this PR introduce any user-facing change?

Yes. After this change, for non-PySpark applications, the Python files specified by `pyFiles` are also added to `files`, as they are for PySpark applications.

### How was this patch tested?

Manually tested on a Jupyter notebook or by running `spark-submit` with `--verbose`.

```
Spark config:
...
(spark.files,file:/Users/dongjoon/PRS/SPARK-PR-28077/a.py)
(spark.submit.deployMode,client)
(spark.master,local[*])
```

Closes apache#28077 from viirya/pyfile.

Lead-authored-by: Liang-Chi Hsieh <[email protected]>
Co-authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR (SPARK-31308) proposes adding Python dependencies even when the application is not a Python application.
Why are the changes needed?
Currently, SparkSubmit adds the `pyFiles` argument to the `files` argument only for Python applications. As explained in #21420, "for some Spark applications, though they're a java program, they require not only jar dependencies, but also python dependencies." For the same reason, `pyFiles` should be added to `files` even for non-Python applications.
Does this PR introduce any user-facing change?
Yes. After this change, for non-PySpark applications, the Python files specified by `pyFiles` are also added to `files`, as they are for PySpark applications.
How was this patch tested?
Manually tested on a Jupyter notebook or by running `spark-submit` with `--verbose`.
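
The behavior change described above can be illustrated with a small Python sketch. It is not Spark's code: the real logic lives in SparkSubmit, and the names `merge_file_lists`, `resolve_files_before`, and `resolve_files_after` are hypothetical, chosen only to contrast merging `pyFiles` into `files` for Python applications alone with merging them for every application.

```python
from typing import Optional


def merge_file_lists(*lists: Optional[str]) -> Optional[str]:
    """Join comma-separated path lists, skipping unset or blank entries.

    Hypothetical helper for illustration; not Spark's implementation.
    """
    paths = []
    for item in lists:
        if item:
            paths.extend(p.strip() for p in item.split(",") if p.strip())
    return ",".join(paths) if paths else None


def resolve_files_before(files: Optional[str], py_files: Optional[str],
                         is_python: bool) -> Optional[str]:
    """Assumed old behavior: pyFiles join files only for Python applications."""
    if is_python:
        return merge_file_lists(files, py_files)
    return merge_file_lists(files)


def resolve_files_after(files: Optional[str], py_files: Optional[str],
                        is_python: bool) -> Optional[str]:
    """Assumed new behavior: pyFiles join files for every application type.

    `is_python` is kept in the signature only to show that it no longer
    affects the result.
    """
    return merge_file_lists(files, py_files)


if __name__ == "__main__":
    # A non-Python application submitted with only a Python dependency.
    print(resolve_files_before(None, "a.py", is_python=False))
    print(resolve_files_after(None, "a.py", is_python=False))
```

Under these assumptions, a non-Python application that passes only `pyFiles` ends up with an empty `files` value before the change and with the Python file included after it, which is what the `--verbose` check in the test plan is meant to confirm.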