Deepcopy lightgbm model causing missing parameters? #4085

Closed
noklam opened this issue Mar 19, 2021 · 8 comments

noklam commented Mar 19, 2021

Description

Deep-copying a LightGBM Booster causes the model to lose its parameters (the copied booster's params dict comes back empty).

Reproducible example

import numpy as np
import lightgbm as lgb
from copy import deepcopy

params = {
    'objective': 'regression',
    'verbose': -1,
    'num_leaves': 3
}

X = np.random.rand(100, 2)
Y = np.ravel(np.random.rand(100, 1))
lgbm = lgb.train(params, lgb.Dataset(X, label=Y), num_boost_round=1)

print(lgbm.params)

## Deep copy drops the params
new_model = deepcopy(lgbm)
print(new_model.params)

Output:

{'objective': 'regression', 'verbose': -1, 'num_leaves': 3}
Finished loading model, total used 1 iterations
{}

Environment info

Windows 10, Python 3.7.5
LightGBM version or commit hash: tested with both lightgbm==2.3.1 and lightgbm==3.1.1 (latest pip version)



noklam commented Mar 19, 2021

After digging into the source code, this seems to be the root cause: __deepcopy__ only creates a fresh Booster from the model string, so nothing else (including params) is carried over.

    def __deepcopy__(self, _):
        model_str = self.model_to_string(num_iteration=-1)
        booster = Booster(model_str=model_str)
        return booster

That commit is quite old already (2+ years); I wonder why a custom implementation of __deepcopy__ is necessary in the first place.
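
To double-check the root cause, rebuilding a Booster from the model string alone reproduces the empty params. This is a small added check (not in the original comment), reusing the lgbm booster from the reproducible example above:

rebuilt = lgb.Booster(model_str=lgbm.model_to_string(num_iteration=-1))
print(rebuilt.params)  # {} -- the model-string round trip does not carry params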

StrikerRUS (Collaborator) commented:

Hi @noklam!
Indeed, params are not preserved at the Python wrapper layer after the save/load round trip. However, the underlying model is correct and this doesn't affect model behavior.
We already have a feature request for displaying the original params used to train a model: #2613.
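
(For illustration, not from the original comment: the behavior claim can be checked by comparing predictions of the original and the deep-copied booster from the reproducible example above.)

print(np.allclose(lgbm.predict(X), new_model.predict(X)))  # expected: True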

> I wonder why a custom implementation of __deepcopy__ is necessary in the first place.

Custom __deepcopy__ is needed to make the Booster class picklable.
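
(Another added check, not part of the original comment: a pickle round trip can be verified directly, again reusing the lgbm booster and X from the example above.)

import pickle

restored = pickle.loads(pickle.dumps(lgbm))
print(np.allclose(lgbm.predict(X), restored.predict(X)))  # expected: True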


noklam commented Mar 19, 2021

If I understand correctly, the fix should be released in the next version and the parameters could then be saved with the model?

May I ask what makes it not picklable? Just trying to learn more about the issue.

StrikerRUS (Collaborator) commented:

Unfortunately, there is no fix for that yet.


noklam commented Mar 19, 2021

It seems picklable even without the __deepcopy__ method? I tried removing __deepcopy__: the parameters are preserved and I can still pickle the model.

import numpy as np
from copy import deepcopy

import lightgbm as lgb
from lightgbm import Booster

# Remove the custom __deepcopy__ and see whether params survive a deep copy.
del Booster.__deepcopy__

params = {
    'objective': 'regression',
    'verbose': -1,
    'num_leaves': 3
}

X = np.random.rand(100, 2)
Y = np.ravel(np.random.rand(100, 1))
lgbm = lgb.train(params, lgb.Dataset(X, label=Y), num_boost_round=1)

deepcopy_lgbm = deepcopy(lgbm)
print(lgbm.params, deepcopy_lgbm.params)

[screenshot: both lgbm.params and deepcopy_lgbm.params show the original parameters]

StrikerRUS (Collaborator) commented:

Please refer to the PR where the custom __deepcopy__ was introduced (#151) and feel free to open a PR if you think that this code can be removed.

StrikerRUS (Collaborator) commented:

But I think that without the custom __deepcopy__, the cloned Booster would point to the same underlying C++ object as the original one.
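
(A toy sketch of the concern, my illustration rather than LightGBM's actual code: a wrapper around a raw ctypes handle typically can't be pickled or deep-copied by Python's default machinery, so it needs either a custom __deepcopy__ or __getstate__/__setstate__ hooks that rebuild the native object instead of copying the raw pointer.)

import ctypes
from copy import deepcopy

class ToyBooster:
    """Stand-in for a wrapper around a native handle; not LightGBM code."""
    def __init__(self, params=None):
        self.params = dict(params or {})
        self._handle = ctypes.c_void_p(id(self))  # pretend native handle

    # Drop the raw handle when pickling/deep-copying; everything else in
    # __dict__ (including params) is copied as usual.
    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop('_handle')
        return state

    # Recreate a fresh "native" object for the copy instead of sharing one.
    def __setstate__(self, state):
        self.__dict__.update(state)
        self._handle = ctypes.c_void_p(id(self))

orig = ToyBooster(params={'num_leaves': 3})
clone = deepcopy(orig)
print(clone.params)                               # params survive the copy
print(clone._handle.value != orig._handle.value)  # True: independent handle

If the real Booster defines similar pickle hooks, that would also explain why deleting __deepcopy__ in the experiment above still produced an independent copy with the params intact.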

github-actions (bot) commented:

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023