Sklearn kwargs #2338

gaw89 · 2017-05-22T18:28:32Z

This allows users of the Sklearn API to provide all Booster params via a **kwargs dict.

RAMitchell · 2017-05-23T01:51:05Z

Can you please document what happens if a parameter is provided both in kwargs and as a normal parameter? Does one override the other?

gaw89 · 2017-05-23T10:11:30Z

~~@RAMitchell great idea. Adding that clarification now. As of now, kwargs will override normal parameters. Is this acceptable behavior?~~

Scratch that, it actually throws a TypeError because it's being provided multiple values for the same argument. I think this is actually the proper behavior. I am making adjustments accordingly.

I am also adding tests to demonstrate this behavior.
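A minimal sketch of the duplicate-argument behavior described above. It uses a hypothetical `SketchEstimator` class and a hypothetical `make_with_fixed_depth` helper, not the actual xgboost wrapper, and only assumes that the wrapper takes named parameters plus `**kwargs`. Python itself rejects a call that supplies the same keyword twice, which is presumably where the TypeError comes from:

```python
class SketchEstimator:
    """Hypothetical stand-in for an sklearn-style wrapper (not xgboost's class)."""

    def __init__(self, max_depth=3, learning_rate=0.1, **kwargs):
        self.max_depth = max_depth
        self.learning_rate = learning_rate
        # Anything not named explicitly is kept for the underlying booster.
        self.kwargs = kwargs

    def booster_params(self):
        # Merge the explicit parameters with the extra keyword arguments.
        params = {"max_depth": self.max_depth, "learning_rate": self.learning_rate}
        params.update(self.kwargs)
        return params


def make_with_fixed_depth(**kwargs):
    # Hypothetical helper: passes max_depth explicitly AND forwards **kwargs.
    return SketchEstimator(max_depth=3, **kwargs)


# An extra booster parameter passes straight through kwargs.
est = make_with_fixed_depth(tree_method="hist")
print(est.booster_params())

# Supplying max_depth a second time via kwargs is rejected by Python.
try:
    make_with_fixed_depth(max_depth=5)
except TypeError as err:
    print("TypeError:", err)
```

Under this sketch neither value silently overrides the other; the conflicting call fails before the estimator is constructed.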

wxchan · 2017-05-23T11:48:23Z

It might not be recommended to use kwargs for sklearn. See the notes on this page: http://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html.

khotilov · 2017-05-23T14:59:48Z

@gaw89 Could you please find out whether the requirement that @wxchan pointed out is about the consistency, clarity, and safety (e.g., protection against typos) required for an estimator to become an official part of sklearn (which is not a goal for xgboost), or whether using kwargs might actually break something?

gaw89 · 2017-05-23T15:03:28Z

@khotilov, yes, trying to sort that out now. Unfortunately, the page that @wxchan linked doesn't give much info, so I might have to go through some Sklearn source... I may also file an issue with Sklearn to get clarification.

In the meantime, I am using this branch on my machine and it's working great. Allows me to use GPU in GridSearchCV! :)

gaw89 · 2017-05-23T16:07:50Z

@khotilov and @wxchan, I filed an issue with Sklearn. Here's their response:

"It's the API of scikit-learn. Things are not garantied to work if you do
it differently. We don't support it."

So, it seems that it is primarily an API/support issue - they won't guarantee that things keep working if we use this. However, their response does not say that things will break either. From looking at the Sklearn source as it is presently, I don't see how this would cause problems, but I am not terribly familiar with the inner workings of Sklearn.

I'm not sure what the best course of action is here. It seems fairly unlikely that it will make things break, but there is the risk. At the same time, individually adding the various booster params to the init could be cumbersome both for maintenance and usage.

I am happy to work on updating these things individually if that is the decision.

RAMitchell · 2017-05-23T16:29:18Z

It seems likely that they want each parameter explicit in the init function for reflection purposes.
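A rough illustration of the reflection concern, using only the standard library. This is a sketch of the general idea (reading parameter names from the `__init__` signature), not a copy of Sklearn's implementation; `ReflectedEstimator` and `explicit_param_names` are hypothetical names:

```python
import inspect


class ReflectedEstimator:
    """Hypothetical estimator with two explicit parameters plus **kwargs."""

    def __init__(self, max_depth=3, learning_rate=0.1, **kwargs):
        self.max_depth = max_depth
        self.learning_rate = learning_rate
        self.kwargs = kwargs


def explicit_param_names(cls):
    # Collect the parameter names visible in the __init__ signature,
    # skipping `self` and the catch-all **kwargs entry.
    signature = inspect.signature(cls.__init__)
    return sorted(
        name
        for name, param in signature.parameters.items()
        if name != "self" and param.kind is not inspect.Parameter.VAR_KEYWORD
    )


est = ReflectedEstimator(tree_method="hist")
print(explicit_param_names(ReflectedEstimator))  # only the explicitly named parameters
print(est.kwargs)  # tree_method is stored, but invisible to the signature
```

Tooling that discovers parameters this way would see `max_depth` and `learning_rate` but not `tree_method`, which is one plausible reason explicit parameters are preferred.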

I think we should go ahead, with documentation stating that this is an unsupported sklearn parameter and that parameters passed this way are not guaranteed to interact correctly with sklearn.

It is clear that we need a way to pass a variety of parameters to the underlying xgboost model.

Great work @gaw89.

gaw89 · 2017-05-23T16:37:40Z

@RAMitchell, thanks! I went ahead and added a note to indicate this caveat with usage of kwargs. Then I fixed my lint errors again... Sick of Spyder adding whitespace at the end of my lines.

terrytangyuan · 2017-05-24T02:48:02Z

Thank you!

* Added kwargs support for Sklearn API
* Updated NEWS and CONTRIBUTORS
* Fixed CONTRIBUTORS.md
* Added clarification of **kwargs and test for proper usage
* Fixed lint error
* Fixed more lint errors and clf assigned but never used
* Fixed more lint errors
* Fixed more lint errors
* Fixed issue with changes from different branch bleeding over
* Fixed issue with changes from other branch bleeding over
* Added note that kwargs may not be compatible with Sklearn
* Fixed linting on kwargs note

gaw89 added 3 commits May 22, 2017 12:49

Added kwargs support for Sklearn API

af60db1

Updated NEWS and CONTRIBUTORS

8268bc2

Fixed CONTRIBUTORS.md

49f3b64

terrytangyuan requested review from RAMitchell and terrytangyuan May 23, 2017 04:15

terrytangyuan self-assigned this May 23, 2017

jayzed82 mentioned this pull request May 23, 2017

Add option to choose booster in scikit intreface (gbtree by default) #2303

Merged

gaw89 added 3 commits May 23, 2017 07:02

Added clarification of **kwargs and test for proper usage

5b791b5

Fixed lint error

93ea53d

Fixed more lint errors and clf assigned but never used

5f060db

gaw89 added 5 commits May 23, 2017 07:54

Fixed more lint errors

89bd0f9

Fixed more lint errors

9db3a24

Fixed issue with changes from different branch bleeding over

7d3a90e

Merge branch 'sklearn_kwargs' of github.com:gaw89/xgboost into sklearn_kwargs

91f6487

Fixed issue with changes from other branch bleeding over

483bfad

gaw89 added 2 commits May 23, 2017 12:33

Added note that kwargs may not be compatible with Sklearn

12b7e95

Fixed linting on kwargs note

b99e423

RAMitchell approved these changes May 24, 2017

terrytangyuan merged commit 0f3a404 into dmlc:master May 24, 2017

gaw89 mentioned this pull request May 24, 2017

Sklearn API support for histogram optimized tree grower. #2343

Closed

khotilov mentioned this pull request Jun 1, 2017

passing monotone constraints to xgbclassifier? #1953

Closed

lock bot locked as resolved and limited conversation to collaborators Jan 19, 2019
