Decouple boosting types #3128

Closed
candalfigomoro opened this issue May 29, 2020 · 6 comments · Fixed by #4827

Comments

@candalfigomoro

With xgboost you can build (a sort of) Random Forest by setting num_parallel_tree>1 and nrounds=1.

In LightGBM we can build (a sort of) Random Forest by setting boosting='rf'.

Since the num_parallel_tree and nrounds params are decoupled in xgboost, what you can do is to set num_parallel_tree>1 and nrounds>1 to build a "boosted random forest". This is not possible in LightGBM afaik.
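For concreteness, the setups described above might be written as the parameter dictionaries below. This is only a sketch: the sampling rates, tree counts, and the `total_trees` helper are illustrative assumptions, and `nrounds` corresponds to `num_boost_round` in the Python APIs.

```python
# Illustrative parameter sets; all numeric values are assumptions, not recommendations.

# xgboost random forest: many parallel trees, a single boosting round.
XGB_FOREST = {
    "booster": "gbtree",
    "num_parallel_tree": 100,  # trees grown per boosting round
    "subsample": 0.8,          # row subsampling, needed for forest-style randomness
    "colsample_bynode": 0.8,   # column subsampling at each split
    "learning_rate": 1.0,      # no shrinkage when there is only one round
}
XGB_FOREST_ROUNDS = 1

# xgboost "boosted random forest": a forest per round, several rounds.
XGB_BOOSTED_FOREST = {**XGB_FOREST, "learning_rate": 0.3}
XGB_BOOSTED_FOREST_ROUNDS = 10

# LightGBM random forest: the forest mode is selected through the boosting type.
LGB_FOREST = {
    "boosting": "rf",
    "bagging_freq": 1,         # rf mode requires bagging to be enabled
    "bagging_fraction": 0.8,
    "feature_fraction": 0.8,
}


def total_trees(num_parallel_tree, rounds):
    """Trees in the final model for a single-output task (hypothetical helper)."""
    return num_parallel_tree * rounds
```

These dictionaries would typically be passed as `xgb.train(XGB_BOOSTED_FOREST, dtrain, num_boost_round=XGB_BOOSTED_FOREST_ROUNDS)` and `lgb.train(LGB_FOREST, train_set, num_boost_round=100)`. Because `boosting` holds a single value, `LGB_FOREST` has no equivalent of the second xgboost setup.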

The fact that boosting types are mutually exclusive also makes it impossible to create other combinations such as dart+goss (see #2991).

The "rf" and "goss" modes should be decoupled from the boosting type. Maybe I want to use dart+rf or gbrt+rf (like in xgboost) to build boosted random forests. Maybe I want to use dart+goss.
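One way to picture the decoupling is to treat row sampling as a pluggable strategy that any booster can use. The sketch below is a toy illustration in pure Python, not LightGBM's C++ code: the class names, the `sample` signature, and the `describe` helper are all hypothetical. The GOSS part follows the usual description of the technique: keep the rows with the largest absolute gradients, randomly sample from the rest, and up-weight the sampled rows by (1 - top_rate) / other_rate.

```python
import random
from abc import ABC, abstractmethod


class SampleStrategy(ABC):
    """Chooses the rows a tree is trained on, independently of the booster."""

    @abstractmethod
    def sample(self, gradients, rng):
        """Return (row_indices, per-row gradient multipliers)."""


class Bagging(SampleStrategy):
    def __init__(self, fraction):
        self.fraction = fraction

    def sample(self, gradients, rng):
        n = len(gradients)
        k = max(1, int(n * self.fraction))
        rows = sorted(rng.sample(range(n), k))
        return rows, [1.0] * k  # bagged rows keep their original gradients


class Goss(SampleStrategy):
    def __init__(self, top_rate, other_rate):
        self.top_rate = top_rate
        self.other_rate = other_rate

    def sample(self, gradients, rng):
        n = len(gradients)
        # Order rows by absolute gradient, largest first.
        order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
        top_k = int(n * self.top_rate)
        other_k = int(n * self.other_rate)
        top = order[:top_k]
        rest = rng.sample(order[top_k:], other_k)
        # Up-weight the sampled small-gradient rows to compensate for the rows dropped.
        weight = (1.0 - self.top_rate) / self.other_rate
        return top + rest, [1.0] * len(top) + [weight] * len(rest)


def describe(boosting, strategy):
    """Label a booster/strategy pairing, e.g. 'dart+goss'."""
    return f"{boosting}+{type(strategy).__name__.lower()}"
```

With this split, a pairing such as `describe("dart", Goss(0.2, 0.1))` is just another combination instead of a new boosting type.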

@guolinke Maybe this can be considered for LightGBM 3 (#3071)?

@guolinke
Collaborator

guolinke commented Aug 6, 2020

I think it is a useful feature, but it is not trivial to implement.
We will need to refactor the whole boosting part, and I don't have much time at the moment.
@shiyu1994 has been helping me with the LightGBM project recently, but he is also busy this month.
We can start refactoring in the next couple of months.

@candalfigomoro
Author

I think this would be a huge win in the long term because the code would be much more modular, but I understand that the required effort is big.

@StrikerRUS
Collaborator

Closed in favor of being in #2302. We decided to keep all feature requests in one place.

Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.

@yiwiz-sai

This feature is very useful; I hope the LightGBM team can support it.

num_parallel_tree is important to avoid overfitting; I think that's an advantage of xgboost.

@StrikerRUS
Collaborator

Active work happens in #4827.

@StrikerRUS StrikerRUS reopened this Jun 12, 2022
shiyu1994 added a commit that referenced this issue Dec 28, 2022
* add parameter data_sample_strategy

* abstract GOSS as a sample strategy (GOSS1), together with original GOSS (Normal Bagging has not been abstracted, so do NOT use it now)

* abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept

* fix some variables

* remove GOSS(as boost) and Bagging logic in GBDT

* rename GOSS1 to GOSS(as sample strategy)

* add warning about use GOSS as boosting_type

* a little ; bug

* remove CHECK when "gradients != nullptr"

* rename DataSampleStrategy to avoid confusion

* remove and add some comments, following convention

* fix bug about GBDT::ResetConfig (ObjectiveFunction inconsistencty bet…

* add std::ignore to avoid compiler warnings (and potential fails)

* update Makevars and vcxproj

* handle constant hessian

move resize of gradient vectors out of sample strategy

* mark override for IsHessianChange

* fix lint errors

* rerun parameter_generator.py

* update config_auto.cpp

* delete redundant blank line

* update num_data_ when train_data_ is updated

set gradients and hessians when GOSS

* check bagging_freq is not zero

* reset config_ value

merge ResetBaggingConfig and ResetGOSS

* remove useless check

* add tests in test_engine.py

* remove whitespace in blank line

* remove arguments verbose_eval and evals_result

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <[email protected]>


* Update src/boosting/sample_strategy.cpp

modify warning about setting goss as `boosting_type`

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_engine.py

replace load_boston() with make_regression()

remove value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Update tests/python_package_test/test_engine.py

add value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Modify warning about using goss as boosting type

* Update tests/python_package_test/test_engine.py

add random_state=42 for make_regression()

reduce the threshold of mean_square_error

* Update src/boosting/sample_strategy.cpp

Co-authored-by: James Lamb <[email protected]>

* remove goss from boosting types in documentation

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <[email protected]>

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <[email protected]>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <[email protected]>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <[email protected]>

* rename GOSS with GOSSStrategy

* update doc

* address comments

* fix table in doc

* Update include/LightGBM/config.h

Co-authored-by: Nikita Titov <[email protected]>

* update documentation

* update test case

* revert useless change in test_engine.py

* add tests for evaluation results in test_sample_strategy_with_boosting

* include <string>

* change to assert_allclose in test_goss_boosting_and_strategy_equivalent

* more tolerance in result checking, due to minor difference in results of gpu versions

* change == to np.testing.assert_allclose

* fix test case

* set gpu_use_dp to true

* change --report to --report-level for rstcheck

* use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent

* revert unexpected changes of non-ascii characters

* revert unexpected changes of non-ascii characters

* remove useless changes

* allocate gradients_pointer_ and hessians_pointer when necessary

* add spaces

* remove redundant virtual

* include <LightGBM/utils/log.h> for USE_CUDA

* check for  in test_goss_boosting_and_strategy_equivalent

* check for identity in test_sample_strategy_with_boosting

* remove cuda  option in test_sample_strategy_with_boosting

* Update tests/python_package_test/test_engine.py

Co-authored-by: Nikita Titov <[email protected]>

* Update tests/python_package_test/test_engine.py

Co-authored-by: James Lamb <[email protected]>

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after bagging

* remove useless code

* check objective_function_ instead of gradients

* enable rf with goss

simplify params in test cases

* remove useless changes

* allow rf with feature subsampling alone

* change position of ResetGradientBuffers

* check for dask

* add parameter types for data_sample_strategy

Co-authored-by: Guangda Liu <[email protected]>
Co-authored-by: Yu Shi <[email protected]>
Co-authored-by: GuangdaLiu <[email protected]>
Co-authored-by: James Lamb <[email protected]>
Co-authored-by: Nikita Titov <[email protected]>
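
After this change, the sampling method is chosen separately from the boosting type. The sketch below builds such a parameter dictionary. The `data_sample_strategy` name comes from the commit message above; the accepted values (`bagging`, `goss`), the remaining boosting types, and the numeric rates are assumptions based on the released LightGBM parameter documentation, and `lightgbm_params` is a hypothetical helper, not part of the library.

```python
# Assumed value sets; check them against the LightGBM version in use.
ALLOWED_BOOSTING = {"gbdt", "rf", "dart"}
ALLOWED_SAMPLING = {"bagging", "goss"}


def lightgbm_params(boosting, data_sample_strategy):
    """Build a params dict that picks booster and row sampling independently."""
    if boosting not in ALLOWED_BOOSTING:
        raise ValueError(f"unknown boosting type: {boosting}")
    if data_sample_strategy not in ALLOWED_SAMPLING:
        raise ValueError(f"unknown sample strategy: {data_sample_strategy}")

    params = {"boosting": boosting, "data_sample_strategy": data_sample_strategy}
    if data_sample_strategy == "goss":
        # GOSS keeps the top_rate share of large-gradient rows and samples
        # an other_rate share of the remaining rows.
        params.update(top_rate=0.2, other_rate=0.1)
    else:
        # Plain bagging: resample rows every iteration at the given fraction.
        params.update(bagging_freq=1, bagging_fraction=0.8)
    return params
```

For example, `lgb.train(lightgbm_params("dart", "goss"), train_set)` would request the dart+goss combination asked for at the top of this issue, and `lightgbm_params("rf", "goss")` corresponds to the "enable rf with goss" item in the commit.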
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023