[python-package] LGBMClassifier produces empty trees #6080
Comments
Thanks for using LightGBM and for the report.
Short Answer
The existence of trees with 0 splits is not a bug in LightGBM. LightGBM comes with many settings to limit overfitting, and it's completely possible that during training, it may intentionally choose to produce a tree with 0 splits. In fact, there's even a separate thread going on here specifically about adding support for such 0-split trees in
Long Answer
For debugging activities like this, it's useful to include the logs produced by your program. I ran the following modified version of your code (with
import numpy as np
from sklearn.datasets import load_digits
from lightgbm import LGBMClassifier
np.random.seed(0)
# data | regression | binary | multi class
data_mult = load_digits(as_frame=True)
# Model
model_mult = LGBMClassifier(verbosity=1).fit(data_mult.data, data_mult.target)
for tree in model_mult._Booster.dump_model()["tree_info"]:
    if tree["num_leaves"] == 1:
        print(f"Tree {tree['tree_index']} has only one {tree['num_leaves']} leaf")
Doing that, I can see that the logs are full of this same warning:
It appears to me that in this example, LightGBM added new trees with splits for the first few iterations, and then was not able to find any additional splits satisfying its constraints for splitting. Examples of those constraints:
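As a rough illustration of how such constraints can leave a tree with no splits at all, here is a minimal sketch in plain Python. It is not LightGBM's actual implementation. The parameter names `min_data_in_leaf`, `min_sum_hessian_in_leaf`, and `min_gain_to_split` and their default values come from LightGBM's parameter documentation, and treating them as the examples meant here is an assumption. `CandidateSplit`, `split_is_allowed`, and `best_allowed_split` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateSplit:
    """Hypothetical summary of one candidate split (illustration only)."""
    left_count: int        # rows that would go to the left child
    right_count: int       # rows that would go to the right child
    left_hessian: float    # sum of hessians in the left child
    right_hessian: float   # sum of hessians in the right child
    gain: float            # improvement in the objective from splitting


def split_is_allowed(split, min_data_in_leaf=20,
                     min_sum_hessian_in_leaf=1e-3,
                     min_gain_to_split=0.0):
    """Simplified check: every constraint must hold for a split to be kept."""
    if min(split.left_count, split.right_count) < min_data_in_leaf:
        return False
    if min(split.left_hessian, split.right_hessian) < min_sum_hessian_in_leaf:
        return False
    return split.gain > min_gain_to_split


def best_allowed_split(candidates, **constraints):
    """Return the highest-gain allowed split, or None if none qualifies."""
    allowed = [s for s in candidates if split_is_allowed(s, **constraints)]
    return max(allowed, key=lambda s: s.gain) if allowed else None


candidates = [
    CandidateSplit(5, 100, 1.0, 20.0, gain=3.0),     # left child too small
    CandidateSplit(50, 55, 10.0, 11.0, gain=-0.5),   # no positive gain
]
# With the defaults, no candidate qualifies, so the tree keeps a single leaf.
no_split = best_allowed_split(candidates)
# Relaxing min_data_in_leaf lets the first candidate through.
relaxed = best_allowed_split(candidates, min_data_in_leaf=5)
```

In this sketch, a `None` result corresponds to a tree with 0 splits: every candidate was rejected by at least one constraint.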
You can find documentation on these and others here:
If you search for that warning message here and on Stack Overflow, you'll find many explanations of this:
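To check whether the single-leaf trees really do appear only after the first few iterations, the dumped model can be grouped by boosting iteration. The sketch below is one possible way to do that. It relies only on the `tree_info`, `tree_index`, and `num_leaves` keys already used in the code above, and it assumes that a multiclass model stores one tree per class per iteration, in iteration order. `single_leaf_trees_by_iteration` and `trees_per_iteration` are hypothetical names, and the small dictionary is made-up stand-in data, not real LightGBM output.

```python
from collections import defaultdict


def single_leaf_trees_by_iteration(model_dump, trees_per_iteration):
    """Map each boosting iteration to the class indices whose tree has 1 leaf.

    Assumes tree_index = iteration * trees_per_iteration + class_index.
    """
    by_iteration = defaultdict(list)
    for tree in model_dump["tree_info"]:
        if tree["num_leaves"] == 1:
            iteration, class_index = divmod(tree["tree_index"],
                                            trees_per_iteration)
            by_iteration[iteration].append(class_index)
    return dict(by_iteration)


# Stand-in for the dict returned by dump_model(): 2 iterations, 3 classes.
fake_dump = {
    "tree_info": [
        {"tree_index": i, "num_leaves": n}
        for i, n in enumerate([31, 31, 1, 1, 5, 1])
    ]
}
result = single_leaf_trees_by_iteration(fake_dump, trees_per_iteration=3)

# With the model from the example above (10 digit classes), the call would be:
# single_leaf_trees_by_iteration(model_mult._Booster.dump_model(), 10)
```

If the explanation above holds, the early iterations should be missing from the result, or have few entries, and the later ones should list most classes.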
Thanks for your quick reply and explanation.
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Description
While exploring the trees produced by LGBMClassifier, it appears that some trees have only one leaf.
This behavior seems strange, as we would expect at least one split in every tree.
Reproducible example
Environment info
LightGBM version or commit hash: 4.0.0
Command(s) you used to install LightGBM
Additional Comments
This bug was found while working on an issue in the SHAP library.