Question about transfer learning development #407
Comments
Hi @JDE65 -- I'm a little unclear on what you're asking for here. Are you trying to construct an EBM with 3-way and possibly 4-way terms? And, to do that, would you like to use the list of pairs when deciding which 3-way interactions to consider, because the list of possible 3-way and 4-way interactions is very large?
Hi @paulbkoch
Ok, I see now what you are trying to do @JDE65. The short answer is that this is theoretically possible with the EBM model class, but to do it today you would need to modify our code and write some of your own utility functions to handle the post-process merging.

I'll first point out that, in some settings, you might want to continue shaping the original 20 features when training on the 22 features later on, but you've indicated that you only want to shape the 2 new features. That's good because it's simpler.

The first thing you need is the ability to continue training from a previous model. If you have a regression problem you can do this easily by calculating the residual error with the original EBM from the first step: subtract the predicted values from the actuals and train on what remains. For classification this doesn't work, and you would need something akin to the init_score parameter in LightGBM. We have a PR that adds this functionality (#371), but it still needs some work before merging, so it is not part of our package yet.

Using one of the methods above for either regression or classification, you would create a new EBM that has only the 2 new features. At this point you would have one EBM with 20 features and another EBM with 2 features. Next you need to combine them. Since the models are additive this works mathematically, but you need to modify all the existing attributes to accept 22 features. Mostly this involves appending the attributes of the two new features to the corresponding attributes in the EBM with 20 features. We do have a function called merge_ebms that you might want to look at (see example: https://github.com/interpretml/interpret/blob/develop/examples/python/notebooks/Merging%20EBM%20Models.ipynb). It won't do what you are looking for, since it is designed to merge EBMs that have the same set of features, but you may find some inspiration there.

One additional quirk about EBMs that I'll point out: a legal EBM can consist of only 2 attributes (ebm.bins_ and ebm.term_scores_), so you can first try getting things to work with just those two attributes and delete the rest. Of course, nice things like visualizations work better if you also have histograms and all the rest of the attributes.
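To make the regression path above concrete, here is a minimal sketch of the residual-based idea, assuming synthetic stand-in data and combining the two models at prediction time rather than by merging their internal attributes (all names here are illustrative, not from the thread):

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingRegressor

# Synthetic stand-ins: 20 original features, 2 features that arrive later, one target.
rng = np.random.default_rng(0)
X_old = rng.normal(size=(500, 20))
X_new = rng.normal(size=(500, 2))
y = X_old[:, 0] + 2 * X_new[:, 0] + rng.normal(scale=0.1, size=500)

# Step 1: train the base EBM on the original 20 features only.
ebm_base = ExplainableBoostingRegressor()
ebm_base.fit(X_old, y)

# Step 2: fit a second EBM on the 2 new features against the residuals,
# so it only picks up signal the base model did not capture.
residuals = y - ebm_base.predict(X_old)
ebm_extra = ExplainableBoostingRegressor()
ebm_extra.fit(X_new, residuals)

# Step 3: EBMs are additive, so the combined prediction is just the sum of
# the two models' outputs (the second intercept is near zero because the
# residuals average to roughly zero).
y_pred = ebm_base.predict(X_old) + ebm_extra.predict(X_new)
```

Merging the two fitted models into a single 22-feature EBM object, as described in the comment above, would still require appending the per-feature attributes (bins_, term_scores_, and the rest) by hand.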
Thanks a lot. The answer is "crystal clear". Now a dumb question that relates to the handling of missing values.
I would then take this trained model, EBM_US, and fit it again on the Canadian dataset that has 22 features (X_CA, y_CA).
Hi @JDE65 -- That won't do what you expect. In scikit-learn, calling fit a second time overwrites the previous model, so you would just get a model trained on X_CA and y_CA. But also, for EBMs, we learn from missing values. If all the values for those two extra features are missing, however, there won't be any signal in those features, so you should get a corresponding ebm.term_scores_[term_index] that is filled with zeros. Doing it like this might help you avoid some of the work of munging the two models together later, though, so I think you have a good idea here for reducing the work.
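Here is a small, hedged sketch of the padding idea being discussed: train on a 22-column matrix where the 2 extra features are entirely missing, then check that their term scores carry no signal. The data and variable names are hypothetical, and attribute names follow recent interpret releases:

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

rng = np.random.default_rng(0)
n = 500

# 20 real features plus 2 columns that are entirely missing, mimicking a
# US dataset padded out to the 22-feature Canadian schema.
X_us = np.hstack([rng.normal(size=(n, 20)), np.full((n, 2), np.nan)])
y_us = (X_us[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# interactions=0 keeps the term list to the 22 mains, so term indices 20 and 21
# correspond to the two all-missing features.
ebm_us = ExplainableBoostingClassifier(interactions=0)
ebm_us.fit(X_us, y_us)

# With no signal in those columns, their term scores should be (essentially) all zeros.
print(ebm_us.term_scores_[20])
print(ebm_us.term_scores_[21])
```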
Thanks a lot for this.
Hi @JDE65 -- Stagewise fitting is indeed something we've discussed; there's a longer discussion regarding it in another issue. I was talking with Rich Caruana the other day and he suggested a much easier way to approach your original problem. Instead of modifying our code to stagewise fit and then editing models to add new univariate features afterwards, you can get very close to the final solution by making a "partly useless pair". Imagine you made a 23rd feature that is always false. Since it's always false, if you make a pair with that feature, one axis will always be useless and nothing will be learned from it. Your "pair" is now effectively a main. So now you just need to do something like this: ebm = ExplainableBoostingClassifier(mains=[0, 1, 2, ..., 19], interactions=[(20, 22), (21, 22)]). And since our framework is set up by default to do stagewise fitting of pairs, it'll do what you want with less hassle.
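A sketch of the "partly useless pair" trick on synthetic data is below. The mains/interactions arguments follow the snippet in the comment above and may not exist, or may be named differently, in other interpret releases, so treat this as illustrative rather than version-exact:

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

rng = np.random.default_rng(0)
n = 1000

# Features 0-19 are the original mains; features 20-21 are the 2 new ones.
X = rng.normal(size=(n, 22))
y = (X[:, 0] + X[:, 20] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Feature 22 is the always-false dummy; pairing a new feature with it yields
# a "pair" whose dummy axis is useless, so it behaves like a main that gets
# fitted in the later, stagewise pairing step.
dummy = np.zeros((n, 1))
X_aug = np.hstack([X, dummy])

ebm = ExplainableBoostingClassifier(
    mains=list(range(20)),              # shape only the original 20 features as mains
    interactions=[(20, 22), (21, 22)],  # effective mains for the 2 new features
)
ebm.fit(X_aug, y)
```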
Closing this issue for now since we're tracking the sub-topics discussed in other threads already. Please re-open if you want to add more. |
The EBM will first look to minimize the error without interactions (step 1), then with first-order interactions (step 2), then with higher-order interactions (step 3).
Is it possible to replace the higher-order interaction process in step 3 (or to add a step 4?) with a newly added feature that would benefit from the parameters learned in steps 1 to 2 (or 3), so that we would achieve transfer learning?
If so, do you have any view on how to implement this?