[python] support saving and loading CVBooster #3556

Closed
Tracked by #5153
samFarrellDay opened this issue Nov 11, 2020 · 6 comments · Fixed by #5160

Comments

@samFarrellDay
Contributor

The two most immediately apparent ways to save a CVBooster (to me at least) would be to use pickle, or to call save_model() as is done with basic Booster models. However, neither of these appears to work:

import pandas as pd
import numpy as np
import lightgbm as lgb
import dill

data = pd.DataFrame(np.random.rand(100, 10))
data.columns = [str(i) for i in range(len(data.columns))]
target = data.pop("9")
params = {
    'learning_rate': 0.1,
    'min_data_in_leaf':3,
    'verbose': -1
}
dtrain = lgb.Dataset(data=data, label=target)
lgbcv = lgb.cv(params, train_set=dtrain, stratified=False, return_cvbooster=True)
cvbooster = lgbcv.pop("cvbooster")

# Using save_model()
cvbooster.save_model("File.txt")

# Fails
cvbooster_from_file = lgb.CVBooster(model_file = "File.txt")

# Returns a single booster
cvbooster_from_file = lgb.Booster(model_file = "File.txt")


######## Using pickle (dill)
# Save
with open("File.pkl","wb") as f:
    dill.dump(cvbooster,f)

# Fails
with open("File.pkl","rb") as f:
    cvbooster_from_pickle = dill.load(f)

Would it be possible to get a .save_model() method for CVBooster?

@guolinke
Collaborator

A CVBooster contains a list of Booster objects, so you can save them like this:

cvbooster = lgbcv.pop("cvbooster")
for i, booster in enumerate(cvbooster.boosters):
    booster.save_model("model_{}.txt".format(i))

To restore, you can load the models one by one and append each to an empty CVBooster. Refer to:

def _append(self, booster):
    """Add a booster to CVBooster."""
    self.boosters.append(booster)

@jameslamb jameslamb reopened this Jul 6, 2021
@jameslamb
Collaborator

Re-opening this issue because I don't think it was totally resolved.

By chance, I got an email notification this week because someone left a comment on this issue and then deleted it.

First, I think it's important to include the actual text of the error messages, so that this issue can be found from search engines.

I tried running the example code given with lightgbm 3.2.1 built from the latest commit on master (ec1debc).

This code fails with the following error.


TypeError Traceback (most recent call last)
in
20
21 # Fails
---> 22 cvbooster_from_file = lgb.CVBooster(model_file = "File.txt")
23
24 # Returns a single booster

TypeError: __init__() got an unexpected keyword argument 'model_file'

cvbooster.save_model() seems to succeed because it doesn't throw an error, but what it is doing is probably not what users would want.

def __getattr__(self, name):
    """Redirect methods call of CVBooster."""
    def handler_function(*args, **kwargs):
        """Call methods with each booster, and concatenate their results."""
        ret = []
        for booster in self.boosters:
            ret.append(getattr(booster, name)(*args, **kwargs))
        return ret
    return handler_function

cvbooster.save_model() is calling .save_model() on each Booster, which means that after that call only the final booster has been stored into "File.txt".
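
The overwrite effect can be illustrated without LightGBM. The sketch below uses hypothetical stand-ins (`FakeBooster`, `RedirectingCV`, and `saved_contents` are made up for this example): each fake booster writes a label instead of a real model, and the wrapper copies the redirecting `__getattr__` shown above.

```python
import tempfile
from pathlib import Path


class FakeBooster:
    """Hypothetical stand-in for lightgbm.Booster; writes a label, not a model."""

    def __init__(self, label):
        self.label = label

    def save_model(self, filename):
        Path(filename).write_text(self.label)
        return self


class RedirectingCV:
    """Hypothetical stand-in for CVBooster with the same redirecting __getattr__."""

    def __init__(self, boosters):
        self.boosters = list(boosters)

    def __getattr__(self, name):
        def handler_function(*args, **kwargs):
            ret = []
            for booster in self.boosters:
                ret.append(getattr(booster, name)(*args, **kwargs))
            return ret
        return handler_function


def saved_contents(n_folds=3):
    """Call save_model() through the wrapper; return (number of calls, file text)."""
    cv = RedirectingCV(FakeBooster(f"fold {i}") for i in range(n_folds))
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "File.txt"
        results = cv.save_model(str(path))
        return len(results), path.read_text()


if __name__ == "__main__":
    print(saved_contents())
```

Every fold's `save_model()` is called with the same filename, so each call overwrites the previous one and only the last fold's content remains in the file.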


Saving the CVBooster object with dill and then loading it back results in the following error.


TypeError Traceback (most recent call last)
in
6 # Fails
7 with open("File.pkl","rb") as f:
----> 8 cvbooster_from_pickle = dill.load(f)

~/miniconda3/lib/python3.8/site-packages/dill/_dill.py in load(file, ignore, **kwds)
311 See :func:loads for keyword arguments.
312 """
--> 313 return Unpickler(file, ignore=ignore, **kwds).load()
314
315 def loads(str, ignore=None, **kwds):

~/miniconda3/lib/python3.8/site-packages/dill/_dill.py in load(self)
523
524 def load(self): #NOTE: if settings change, need to update attributes
--> 525 obj = StockUnpickler.load(self)
526 if type(obj).__module__ == getattr(_main_module, '__name__', '__main__'):
527 if not self._ignore:

~/miniconda3/lib/python3.8/site-packages/lightgbm/engine.py in handler_function(*args, **kwargs)
312 """Call methods with each booster, and concatenate their results."""
313 ret = []
--> 314 for booster in self.boosters:
315 ret.append(getattr(booster, name)(*args, **kwargs))
316 return ret

TypeError: 'function' object is not iterable
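
The traceback is consistent with the following explanation: when unpickling, the instance is created without calling `__init__`, so `boosters` does not exist yet. The unpickler then looks up `__setstate__`, the catch-all `__getattr__` answers with `handler_function`, and calling that iterates `self.boosters`, which itself resolves to another handler function. A minimal sketch with the standard `pickle` module and a hypothetical stand-in class (`RedirectingCV` and `round_trip_error` are not LightGBM names) reproduces the same kind of failure:

```python
import pickle


class RedirectingCV:
    """Hypothetical stand-in for CVBooster; not LightGBM code."""

    def __init__(self):
        self.boosters = []

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        def handler_function(*args, **kwargs):
            ret = []
            for booster in self.boosters:
                ret.append(getattr(booster, name)(*args, **kwargs))
            return ret
        return handler_function


def round_trip_error(obj):
    """Pickle and unpickle obj; return the TypeError message, or None on success."""
    payload = pickle.dumps(obj)
    try:
        pickle.loads(payload)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(round_trip_error(RedirectingCV()))
```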


So I think there are multiple issues here to be addressed:

  • CVBooster.save_model() should throw a warning mentioning that it is only saving one Booster.
    • and in release 4.0.0, I think we should make the breaking change to have CVBooster.save_model() raise a NotImplementedError
  • CVBooster should be pickleable and it should be possible to load a CVBooster that has been pickled
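
One possible way to approach the last point is sketched below. This is only an illustration with hypothetical stand-in classes (`FakeBooster`, `PickleSafeCV`, `round_trip`), not a proposal for the actual implementation: the idea is to have `__getattr__` raise `AttributeError` for special names and for `boosters` itself, so that pickle falls back to its default handling of the instance `__dict__`.

```python
import pickle


class FakeBooster:
    """Hypothetical stand-in for lightgbm.Booster."""

    def __init__(self, scale):
        self.scale = scale

    def predict(self, x):
        return x * self.scale


class PickleSafeCV:
    """Hypothetical CVBooster-like wrapper that survives a pickle round trip."""

    def __init__(self, boosters=()):
        self.boosters = list(boosters)

    def __getattr__(self, name):
        # Refuse special names and "boosters" itself, so probes such as
        # __setstate__ on a half-built instance get a normal AttributeError.
        if name == "boosters" or (name.startswith("__") and name.endswith("__")):
            raise AttributeError(name)

        def handler_function(*args, **kwargs):
            return [getattr(b, name)(*args, **kwargs) for b in self.boosters]
        return handler_function


def round_trip(obj):
    """Pickle and unpickle obj with the standard library."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    restored = round_trip(PickleSafeCV([FakeBooster(1), FakeBooster(2)]))
    print(restored.predict(10))
```

Ordinary method names are still redirected to every booster, so existing calls such as `predict()` keep returning one result per fold.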

@StrikerRUS what do you think?

@StrikerRUS
Collaborator

@jameslamb Seems reasonable to me.

@jameslamb jameslamb changed the title from "Is it possible to save a CVbooster instance?" to "[python] support saving and loading CVBooster" Aug 15, 2021
@jameslamb
Collaborator

I've updated the description and added this to #2302, where we store the full list of feature requests.

Closing this for now per our policy in #2302. Anyone is welcome to pursue adding this feature! Please just leave a comment here saying that you are interested, and the issue can be re-opened as a place to ask maintainers questions.

@hoangphucITJP

hoangphucITJP commented Nov 27, 2021

If you have time, please work on this feature.
CVBooster.save_model() is a lie! I used it without a second thought, then loaded the model with the basic Booster and believed the result was correct, which frustrated me!

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023

5 participants