-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-21685][PYTHON][ML] PySpark Params isSet state should not change after transform #18982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
BryanCutler
wants to merge
15
commits into
apache:master
from
BryanCutler:pyspark-ml-param-to-java-defaults-SPARK-21685
Closed
Changes from all commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
bbe3ef7
added regression test for preserving Param set state when transferrin…
BryanCutler 1203056
changed _transfer_params_to_java to not set param in Java unless expl…
BryanCutler 6f97a9d
Merge remote-tracking branch 'upstream/master' into pyspark-ml-param-…
BryanCutler 482c025
added transform to test
BryanCutler 1a9fa46
added python friendly method to set a single Java ParamPair
BryanCutler 088ee52
added MiMa exclude
BryanCutler fa4f974
Merge remote-tracking branch 'upstream/master' into pyspark-ml-param-…
BryanCutler bef3fb5
updated MimaExcludes
BryanCutler bbd2144
Merge remote-tracking branch 'upstream/master' into pyspark-ml-param-…
BryanCutler 339c793
explicit calls to param checks
BryanCutler 9162944
Merge remote-tracking branch 'upstream/master' into pyspark-ml-param-…
BryanCutler 037ebf7
Revert "added MiMa exclude"
BryanCutler 32ab60d
Revert "added python friendly method to set a single Java ParamPair"
BryanCutler 24a1dbf
cleanup and add test for default transfer
BryanCutler 2eaf1a7
cleanup
BryanCutler File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If java side and python side the default params are the same, do we still need to set default params for the java object? Are't they already set in java object if they are default params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My take is that while they should be the same, it's still possible they might not be. The user could extend their own classes or it's quite easy to change in Python. Although we don't really support this, if there was a mismatch the user would probably just get bad results and it would be really hard to figure out why. From the Python API, it would look like it was one value but actually using another in Scala.
If you all think it's overly cautious to do this, I can take it out. I just thought it would be cheap insurance to just set these values regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is reasonable, a few extra lines to avoid potential unwanted user surprise is worth it.