Meaning Expected Gain #13

Toekan · 2018-04-04T13:01:49Z

Hi,

I'm thinking of using this package to get more insight into the behaviour of my xgboost models. The extra features look very interesting, and looking at interactions is something I have wanted to do for a while, so thanks for this!

However, I'm struggling to find information on how some of the extra features are defined, most importantly expected gain. The README of this package says:

"Total gain of each feature or feature interaction weighted by the probability to gather the gain"

I scanned through your code a bit, and the gain for each node seems to be scaled by the path_proba of that node. Is path_proba the number of samples involved in a node divided by the total number of samples?

If so, what is the reasoning behind this? As far as I understand, the original gain already scales with the number of samples involved in the node (albeit nonlinearly). Is expected gain meant to be a better measure of the gain per sample than plain gain, or does it serve a different goal?
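
To make the question concrete, here is a minimal sketch of how I currently read it. This is only my guess, not the package's confirmed definition: the toy tree, the `cover` field (the number of samples reaching a node) and the `gain_summaries` helper are all made up for illustration, and I'm assuming path_proba is the node's cover divided by the root's cover.

```python
from collections import defaultdict

# Hypothetical toy tree: one entry per split node. "cover" stands in for the
# number of samples reaching the node; the first entry is the root.
nodes = [
    {"feature": "f0", "gain": 100.0, "cover": 1000.0},  # root split
    {"feature": "f1", "gain": 40.0, "cover": 600.0},
    {"feature": "f0", "gain": 10.0, "cover": 400.0},
]


def gain_summaries(nodes):
    """Return (total gain, expected gain) per feature for one tree.

    Assumption (not confirmed by the package): path_proba of a node is
    cover(node) / cover(root), i.e. the share of samples reaching it.
    """
    root_cover = nodes[0]["cover"]
    total = defaultdict(float)
    expected = defaultdict(float)
    for node in nodes:
        path_proba = node["cover"] / root_cover
        total[node["feature"]] += node["gain"]
        # Expected gain: the node's gain weighted by the assumed probability
        # of a sample reaching that node.
        expected[node["feature"]] += node["gain"] * path_proba
    return dict(total), dict(expected)


total_gain, expected_gain = gain_summaries(nodes)
print("gain:", total_gain)
print("expected gain:", expected_gain)
```

Under this reading, splits deep in the tree that only a small share of samples reach contribute much less to expected gain than to plain gain, which is the part I'd like to understand the motivation for.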

Thanks!
