-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sklearn kwargs #2338
Sklearn kwargs #2338
Conversation
Can you please document what happens if a parameter is provided in kwargs and as normal parameter? Does one override the other? |
Scratch that, it actually throws a TypeError because it's being provided multiple values for the same argument. I think this is actually the proper behavior. I am making adjustments accordingly. I am also adding tests to demonstrate this behavior. |
It might not be recommended to use kwargs for sklearn. see the notes on this page: http://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html. |
@gaw89 Could you please find out whether the requirement that @wxchan has pointed out is about consistency, clarity and safety (e.g., protecting against typos) required for an estimator to become an official part of sklearn (which is not a goal for xgboost), or whether using kwargs might actually break something. |
@khotilov, yes, trying to sort that out now. Unfortunately, the page that @wxchan linked doesn't give much info, so I might have to go through some Sklearn source... I may try to put in issue in for Sklearn as well to get clarification. In the meantime, I am using this branch on my machine and it's working great. Allows me to use GPU in GridSearchCV! :) |
@khotilov and @wxchan, I filed an issue with Sklearn. Here's their response: "It's the API of scikit-learn. Things are not garantied to work if you do So, it seems that it is primarily an API/support issue - they won't guarantee that things will not break if we use this. However, it does not guarantee that things will break either. From looking at the Sklearn source as it is presently, I don't see how this would cause problems, but I am not terribly familiar with the inner workings of Sklearn. I'm not sure what the best course of action is here. It seems fairly unlikely that it will make things break, but there is the risk. At the same time, individually adding the various booster params to the init could be cumbersome both for maintenance and usage. I am happy to work on updating these things individually if that is the decision. |
It seems likely that they want each parameter explicit in the init function for reflection purposes. I think we go ahead with the documentation that this is an unsupported sklearn parameter and parameters passed this way are not gauranteed to interact correctly with sklearn. It is clear that we need a way to pass a variety of parameters to the underlying xgboost model. Great work @gaw89. |
@RAMitchell, thanks! I went ahead and added a note to indicate this caveat with usage of kwargs. Then I fixed my lint errors again... Sick of Spyder adding whitespace at the end of my lines. |
Thank you! |
* Added kwargs support for Sklearn API * Updated NEWS and CONTRIBUTORS * Fixed CONTRIBUTORS.md * Added clarification of **kwargs and test for proper usage * Fixed lint error * Fixed more lint errors and clf assigned but never used * Fixed more lint errors * Fixed more lint errors * Fixed issue with changes from different branch bleeding over * Fixed issue with changes from other branch bleeding over * Added note that kwargs may not be compatible with Sklearn * Fixed linting on kwargs note
This allows users of the Sklearn API to provide all Booster params via a **kwargs dict.