Meaning Expected Gain #13

Open
Toekan opened this issue Apr 4, 2018 · 0 comments

Toekan commented Apr 4, 2018

Hi,

I'm thinking of using this package to get more insight into the behaviour of my xgboost models. The extra features look very interesting, and looking at interactions is something I have wanted to do for a while, so thanks for this!

However, I'm struggling to find information on how some of the extra features are defined, most importantly expected gain. The readme of this package says:

> Total gain of each feature or feature interaction weighted by the probability to gather the gain

I scanned through your code a bit, and the gain for each node seems to be scaled by the path_proba of that node. Is path_proba the number of samples involved in a node divided by the total number of samples?

If so, what is the specific reasoning behind this? As far as I understand, the original gain already scales with the number of samples involved in the node (albeit nonlinearly). Is expected gain a better representation of the gain per sample than plain gain, or does it serve a different goal?
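To make the question concrete, here is a minimal sketch of how I currently understand the calculation. It is my assumption, not something I have confirmed in the package code: the `Node` class, the `n_samples` field and the `expected_gain` function are all hypothetical names, and I assume the path probability is the product of child/parent sample ratios along the path, which works out to the node's sample count divided by the root's.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical tree node, not the package's actual data structure."""
    n_samples: int                  # samples reaching this node (assumed meaning)
    feature: Optional[str] = None   # split feature; None marks a leaf
    gain: float = 0.0               # split gain as reported by xgboost
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def expected_gain(root: Node) -> Dict[str, float]:
    """Sum gain * path probability per feature (my reading of the readme)."""
    totals: Dict[str, float] = {}

    def visit(node: Node, path_proba: float) -> None:
        if node.feature is None:    # leaves carry no split gain
            return
        totals[node.feature] = totals.get(node.feature, 0.0) + node.gain * path_proba
        for child in (node.left, node.right):
            if child is not None:
                # Assumed: probability of following this branch is the
                # fraction of the parent's samples that reach the child.
                visit(child, path_proba * child.n_samples / node.n_samples)

    visit(root, 1.0)
    return totals


# Toy tree with made-up numbers, purely for illustration.
tree = Node(100, "f0", 10.0,
            left=Node(80, "f1", 4.0, left=Node(50), right=Node(30)),
            right=Node(20, "f2", 6.0, left=Node(5), right=Node(15)))
result = expected_gain(tree)
print(result)
```

Under this reading, the root split keeps its full gain, while the split on `f1` (reached by 80 of 100 samples) is weighted by 0.8 and the split on `f2` (reached by 20 of 100) by 0.2. So a deep split with a large raw gain is discounted heavily if few samples reach it, which is the behaviour I am asking about.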

Thanks!
