[SPARK-38775][ML] cleanup validation functions #36049
srowen
left a comment
This is just refactoring - shouldn't change behavior? seems OK
getOrElse works here too but doesn't matter
ok, will switch back to getOrElse
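The diff this exchange refers to is not shown here, so the alternative form being compared is an assumption. As a minimal sketch, the following shows why `Option.getOrElse` and an explicit pattern match are interchangeable when all that is needed is a fallback value. The object and method names are hypothetical and do not come from the PR.

```scala
object GetOrElseSketch {
  // Explicit pattern match: unwrap the value or fall back to a default.
  def viaMatch(opt: Option[Int], default: Int): Int = opt match {
    case Some(v) => v
    case None    => default
  }

  // Same behavior expressed with the standard library's Option.getOrElse.
  def viaGetOrElse(opt: Option[Int], default: Int): Int =
    opt.getOrElse(default)

  def main(args: Array[String]): Unit = {
    // Both forms agree when a value is present and when it is absent.
    assert(viaMatch(Some(3), 0) == viaGetOrElse(Some(3), 0))
    assert(viaMatch(None, 7) == viaGetOrElse(None, 7))
  }
}
```

Since the two forms return the same result, the choice between them is a matter of style, which fits the reviewer's "doesn't matter".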
dongjoon-hyun
left a comment
Hi, @zhengruifeng and @srowen (also cc @gengliangwang )
If you don't mind, could you hold on this refactoring a little bit more until @MaxGekk releases RC1?
- Currently, Apache Spark 3.3 is under active testing and unstable.
- GitHub Action PR builder also doesn't provide you the ANSI test result.
- This kind of massive refactoring increases the complexity during the Apache Spark 3.3.0 release process.
  - For example, SPARK-38490 (#35797) introduced an ANSI conf related bug and SPARK-38776 fixed it. However, SPARK-38669 (#35983) refactored master and branch-3.3 differently, and the backporting was difficult.
Hopefully, please give us more time until we finish Apache Spark 3.3, @zhengruifeng . I believe Cleanups and refactoring PRs are less urgent than the scheduled release.
@dongjoon-hyun Ok, I will hold on this PR since its target version is 3.4
Thank you so much!
We can revisit this now. Rerun tests I think?
+1 for restarting. Thank you, @zhengruifeng and @srowen .
Sure, let me update this PR
f41e956 to 87fe98f
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @zhengruifeng , @srowen , @huaxingao .
Merged to master.
Thanks all!
What changes were proposed in this pull request?
1, remove unused extractInstances and extractLabeledPoints in Predictor;
2, remove unused checkNonNegativeWeight in function;
3, move getNumClasses from Classifier to DatasetUtils;
4, move getNumFeatures from MetadataUtils to DatasetUtils;
Why are the changes needed?
to unify the methods
Does this PR introduce any user-facing change?
No
How was this patch tested?
existing testsuites
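
To illustrate the kind of consolidation described above, here is a minimal, Spark-free sketch of validation helpers grouped in a single utility object. It is not Spark's actual DatasetUtils: the object name `DatasetUtilsSketch`, the signatures over plain Scala collections, the `maxNumClasses` default, and the error messages are all assumptions made for illustration.

```scala
object DatasetUtilsSketch {
  // Hypothetical helper: infer the number of classes from raw labels.
  // Assumes labels must be non-negative integers encoded as doubles.
  def getNumClasses(labels: Seq[Double], maxNumClasses: Int = 100): Int = {
    require(labels.nonEmpty, "Cannot infer the number of classes from an empty dataset.")
    labels.foreach { l =>
      require(l >= 0 && l == l.toInt, s"Labels must be non-negative integers, but got $l.")
    }
    // Classes are assumed to be indexed from 0, so the count is max label + 1.
    val numClasses = labels.max.toInt + 1
    require(numClasses <= maxNumClasses,
      s"Inferred $numClasses classes, which exceeds the limit of $maxNumClasses.")
    numClasses
  }

  // Hypothetical helper: infer the feature dimension and check it is consistent.
  def getNumFeatures(rows: Seq[Array[Double]]): Int = {
    require(rows.nonEmpty, "Cannot infer the number of features from an empty dataset.")
    val n = rows.head.length
    require(rows.forall(_.length == n), "All feature vectors must have the same length.")
    n
  }
}
```

Keeping both checks in one object means every caller applies the same validation rules, which is the motivation the description gives for the move.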