[SPARK-19064][PySpark]Fix pip installing of sub components #16465
…stribution scripts to be more explicit about cleanup. Also add pypandoc to dev requirements file since we want it for publishing
Test build #70848 has finished for PR 16465 at commit
cc @JoshRosen who reviewed the original PR and is probably the most familiar.
from pyspark.sql import SparkSession
from pyspark.ml.param import Params
from pyspark.mllib.linalg import *
Is it better to import pyspark.ml.linalg or both?
This just checks one sub-component from each. We could import each with a rename, I suppose, but I'm not sure it would do much.
OK, I think this should be enough.
Gentle ping for @JoshRosen or @davies maybe?
Small follow-up ping for @JoshRosen or @davies maybe?
srowen left a comment
I'd say you can merge it yourself!
Sounds like a plan. I think this should probably be on the 2.1 branch as well, so I'll go bug someone who has done backports to make sure I do that part right :)
Holden, https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py works well for merging to master and backporting to any branch :) Unless there is a conflict, in which case a separate PR would be easier.
Have fun!
## What changes were proposed in this pull request?
Fix installation of mllib and ml sub components, and more eagerly clean up cache files during test script & make-distribution.
## How was this patch tested?
Updated sanity test script to import mllib and ml sub-components.
Author: Holden Karau <[email protected]>
Closes #16465 from holdenk/SPARK-19064-fix-pip-install-sub-components.
(cherry picked from commit 965c82d)
Signed-off-by: Holden Karau <[email protected]>
## What changes were proposed in this pull request?
Fix installation of mllib and ml sub components, and more eagerly clean up cache files during test script & make-distribution.
## How was this patch tested?
Updated sanity test script to import mllib and ml sub-components.
Author: Holden Karau <[email protected]>
Closes apache#16465 from holdenk/SPARK-19064-fix-pip-install-sub-components.
Merged into master & branch-2.1.
What changes were proposed in this pull request?
Fix installation of the mllib and ml sub-components, and clean up cache files more eagerly in the test script and make-distribution.
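The description does not say why the sub-components failed to install. One common cause of this symptom, assumed here for illustration, is a `setup.py` whose explicit `packages` list leaves out nested packages, so pip never copies them. A minimal sketch of how one might spot such omissions follows; `discover_packages`, `find_missing_packages`, and the directory layout are hypothetical helpers, not code from Spark.

```python
import os


def discover_packages(root):
    """Return dotted names of all packages (dirs with __init__.py) under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # root only contains the top-level packages; it is not one itself.
            continue
        if "__init__.py" not in filenames:
            # Not a package, so nothing below it can be imported as one.
            dirnames[:] = []
            continue
        found.append(os.path.relpath(dirpath, root).replace(os.sep, "."))
    return sorted(found)


def find_missing_packages(root, declared):
    """Return packages present on disk but absent from the declared list."""
    declared_set = set(declared)
    return [name for name in discover_packages(root) if name not in declared_set]
```

Calling `find_missing_packages("python", declared)` with the list passed to `setup(packages=...)` would return the nested packages that an install would silently drop, under the assumption stated above.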
How was this patch tested?
Updated sanity test script to import mllib and ml sub-components.
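A sanity check along these lines could be sketched as below. This is not the script from the PR; `failed_imports` and `SUBCOMPONENTS` are hypothetical names, and the module list simply mirrors the three imports quoted in the review thread.

```python
import importlib


def failed_imports(module_names):
    """Try to import each module and return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError, so a package
            # that was never installed lands here too.
            failures[name] = str(exc)
    return failures


# One module per sub-component, matching the imports discussed in the review.
SUBCOMPONENTS = ["pyspark.sql", "pyspark.ml.param", "pyspark.mllib.linalg"]
```

Running `failed_imports(SUBCOMPONENTS)` inside a fresh virtualenv after a pip install should return an empty dict if every sub-component was packaged; any entry names a module that did not make it into the install.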