Regularized Greedy Forest is not in the lightgbm #315

Closed
defaultRobot opened this issue Feb 24, 2017 · 11 comments

Comments

@defaultRobot

I heard that Regularized Greedy Forest (RGF) has been used to win some Kaggle competitions and that it works better than gradient boosting in some situations. As far as I can tell, RGF is not yet included in LightGBM. I hope LightGBM can support it as soon as possible.

@gugatr0n1c

Voting for this as well

@avannaldas

Hi, I can take this up. I'll read about RGF and get back in a couple of days.

@avannaldas

avannaldas commented Apr 4, 2017

Hi @guolinke ,

I went through the paper "Learning Nonlinear Functions Using Regularized Greedy Forest" (arXiv), which originally proposed RGF.

The authors of the paper propose three improvements over gradient boosting:

  1. Introduce regularization, to counter the overfitting that can result from (2).
  2. Update the coefficients/weights repeatedly at every step, rather than waiting for an individual tree to be built completely. This is a more aggressive form of learning.
  3. Perform a greedy search over the forest's nodes, based on the forest structure.
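
To make point (2) concrete, here is a minimal sketch of a "fully-corrective" weight update, assuming squared loss, a fixed forest structure, and a plain L2 penalty on leaf weights. It is not the paper's exact algorithm or its regularizers, and the names `leaf_ids`, `weights`, and `fully_corrective_pass` are hypothetical, chosen only for illustration.

```python
def predict(leaf_ids, weights, i):
    """Forest prediction for sample i: sum of the weights of the leaves it falls into."""
    return sum(weights[t][leaf_ids[t][i]] for t in range(len(weights)))


def objective(y, leaf_ids, weights, l2):
    """Squared loss plus an L2 penalty over every leaf weight in the forest."""
    loss = 0.5 * sum((y[i] - predict(leaf_ids, weights, i)) ** 2 for i in range(len(y)))
    penalty = 0.5 * l2 * sum(w * w for tree in weights for w in tree)
    return loss + penalty


def fully_corrective_pass(y, leaf_ids, weights, l2):
    """Re-optimize every leaf weight in every tree once (coordinate-wise Newton steps).

    leaf_ids[t][i] is the leaf of tree t that sample i falls into.
    weights[t][leaf] is updated in place.
    """
    n = len(y)
    preds = [predict(leaf_ids, weights, i) for i in range(n)]
    for t, tree in enumerate(weights):
        for leaf in range(len(tree)):
            members = [i for i in range(n) if leaf_ids[t][i] == leaf]
            if not members:
                continue
            # Gradient and Hessian of the regularized objective w.r.t. this weight.
            grad = -sum(y[i] - preds[i] for i in members) + l2 * tree[leaf]
            hess = len(members) + l2
            delta = -grad / hess
            tree[leaf] += delta
            for i in members:
                preds[i] += delta
    return weights


# Toy data: two trees with two leaves each, four samples.
y = [1.0, 2.0, 3.0, 4.0]
leaf_ids = [[0, 0, 1, 1], [0, 1, 0, 1]]
weights = [[0.0, 0.0], [0.0, 0.0]]
print("objective before:", objective(y, leaf_ids, weights, l2=1.0))
fully_corrective_pass(y, leaf_ids, weights, l2=1.0)
print("objective after: ", objective(y, leaf_ids, weights, l2=1.0))
```

The contrast with plain gradient boosting is that weights of earlier trees are revisited here, instead of being frozen once their tree is finished.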

Though I've understood RGF in a broad sense, I think I'll have to spend some more time on the paper to understand the intricacies accurately.

Also, the original C++ implementation of RGF is available as a compressed archive here, and a GitHub user has also uploaded it under a separate folder in this repo. I haven't looked into this implementation yet.

I've cloned the LightGBM repo and opened it in VS2015. I get an upgrade prompt, which looks optional, so I skipped the upgrade. I'm unable to build the project because it targets an older version of the build tools. Is there a way to build the project in VS2015? Should I upgrade the project and proceed for now?

Also, could you please help me understand LightGBM and share your guidance on how we should take this forward?

Thanks,
Abhijit Annaldas

@guolinke
Collaborator

guolinke commented Apr 5, 2017

Thanks @avannaldas!
You can upgrade the project to VS2015 locally, but please don't push the updated .sln and .vcproj files to the online repo.

For the development guide, see https://github.com/Microsoft/LightGBM/blob/master/docs/development.md .

It seems RGF is an improvement on gradient boosting, so I think you can just inherit from the gbdt class (https://github.com/Microsoft/LightGBM/blob/master/src/boosting/gbdt.h).

Let me know if you have more questions.

@avannaldas

Thanks for the info @guolinke. I'll spend some more time understanding RGF in depth and go through the existing gbdt implementation in LightGBM. I'll get back with questions by Monday.

@avannaldas

Hi @guolinke , I've sent you an email to your Microsoft alias. Thanks!

@Laurae2
Contributor

Laurae2 commented Jul 15, 2017

@avannaldas Any news?

@avannaldas

I spent some time understanding how this can be done, and I had multiple conversations with Guolin about it. Guolin thinks that RGF may not give a significant performance improvement over existing GBDT implementations. @guolinke can share more on this.

@guolinke
Collaborator

guolinke commented Aug 3, 2017

Thanks @avannaldas .

Actually, LightGBM already contains many of the ideas in RGF, e.g. best-first tree growth, the Newton step, and so on, so there isn't much value in implementing it. The remaining difference is RGF's various regularization methods. Maybe we can add these regularization methods to LightGBM.

However, contributions are still welcome.
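
For readers unfamiliar with the two ideas named above, here is a rough sketch of a Newton-step leaf value and best-first (leaf-wise) leaf selection, using the standard second-order formulas with an L2 term. It is an illustration only, not LightGBM's actual code; the function names are hypothetical and constant scale factors in the gain are omitted.

```python
def newton_leaf_value(grads, hess, l2):
    """Newton step for one leaf: -sum(g) / (sum(h) + l2)."""
    return -sum(grads) / (sum(hess) + l2)


def leaf_score(grads, hess, l2):
    """Objective reduction achieved by a leaf at its Newton-optimal value (up to a constant)."""
    g = sum(grads)
    return g * g / (sum(hess) + l2)


def split_gain(grads_left, hess_left, grads_right, hess_right, l2):
    """Gain of splitting a leaf into left/right children versus keeping it whole."""
    parent = leaf_score(grads_left + grads_right, hess_left + hess_right, l2)
    return (leaf_score(grads_left, hess_left, l2)
            + leaf_score(grads_right, hess_right, l2)
            - parent)


def pick_next_leaf(best_gain_per_leaf):
    """Best-first growth: split the leaf whose best candidate split has the largest gain."""
    return max(best_gain_per_leaf, key=best_gain_per_leaf.get)


# Squared loss with current prediction 0: g_i = -y_i, h_i = 1.
grads, hess = [-1.0, -3.0], [1.0, 1.0]
print(newton_leaf_value(grads, hess, l2=0.0))
print(split_gain([-1.0], [1.0], [-3.0], [1.0], l2=0.0))
print(pick_next_leaf({"leaf_a": 0.5, "leaf_b": 2.0}))
```

With squared loss and no regularization, the Newton leaf value reduces to the mean of the targets in the leaf, which is a quick sanity check on the formula.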

@StrikerRUS
Collaborator

For those who come here from search engines looking for a working implementation of Regularized Greedy Forest (RGF): you can use the Python wrapper of the original C++ implementations (both the vanilla and the fast ones), which were written by the authors of the paper.
https://github.com/fukatani/rgf_python
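
A minimal usage sketch for that wrapper, assuming it is installed (e.g. `pip install rgf_python`) and exposes the scikit-learn style `RGFClassifier` in `rgf.sklearn` as described in the project's README. The parameter values are arbitrary and the helper name `train_rgf_classifier` is hypothetical; please check the README of your installed version for the current API.

```python
def train_rgf_classifier(X_train, y_train, max_leaf=1000, l2=0.1):
    """Fit an RGF classifier through the rgf_python scikit-learn interface.

    The import is done lazily so that this module can be loaded even when
    rgf_python is not installed.
    """
    from rgf.sklearn import RGFClassifier  # provided by the rgf_python package

    model = RGFClassifier(
        max_leaf=max_leaf,    # total number of leaves allowed in the forest
        algorithm="RGF_Sib",  # one of the RGF variants offered by the wrapper
        l2=l2,                # strength of the L2 regularization
    )
    model.fit(X_train, y_train)
    return model


# Example usage (requires rgf_python and scikit-learn):
#
#   from sklearn.datasets import load_iris
#   X, y = load_iris(return_X_y=True)
#   model = train_rgf_classifier(X, y)
#   print(model.predict(X[:5]))
```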

@StrikerRUS
Collaborator

Closed in favor of being in #2302. We decided to keep all feature requests in one place.

You are welcome to contribute this feature! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.
