How you are using LightGBM?

LightGBM component: R-package

Environment info

(...)
Other: applies to any R version / compiler combination
LightGBM version or commit hash: 0e3509c
Error message and / or logs

Training a model with integer labels seems to provide wrong results and/or to change LightGBM's behavior.

Training matrix:

Expected prediction (should predict "1 0"):

Results:

Increasing the number of iterations did not improve the results.

Changing the labels to (1, 2) instead of (0, 1) leads to:

Full logs:
```r
> # Data Int / Labels Int
> train_mat <- matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2)
> train_labels <- c(0L, 1L)
> dtrain <- lgb.Dataset(train_mat, label = train_labels)
> model <- lgb.train(
+     params = list(objective = "regression", metric = "l2")
+     , data = dtrain
+     , nrounds = 1L
+     , min_data = 1L
+     , learning_rate = 1.0
+     , verbose = -1
+ )
> round(predict(model, train_mat), digits = 10)
[1] 0 0
> round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)
[1] 1
>
> # Data Num / Labels Int
> train_mat <- matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2)
> train_labels <- c(0L, 1L)
> dtrain <- lgb.Dataset(train_mat, label = train_labels)
> model <- lgb.train(
+     params = list(objective = "regression", metric = "l2")
+     , data = dtrain
+     , nrounds = 1L
+     , min_data = 1L
+     , learning_rate = 1.0
+     , verbose = -1
+ )
> round(predict(model, train_mat), digits = 10)
[1] 0 1
> round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)
[1] 0
>
> # Data Int / Labels Num
> train_mat <- matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2)
> train_labels <- c(0, 1)
> dtrain <- lgb.Dataset(train_mat, label = train_labels)
> model <- lgb.train(
+     params = list(objective = "regression", metric = "l2")
+     , data = dtrain
+     , nrounds = 1L
+     , min_data = 1L
+     , learning_rate = 1.0
+     , verbose = -1
+ )
> round(predict(model, train_mat), digits = 10)
[1] 0 0
> round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)
[1] 1
>
> # Data Num / Labels Num
> train_mat <- matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2)
> train_labels <- c(0, 1)
> dtrain <- lgb.Dataset(train_mat, label = train_labels)
> model <- lgb.train(
+     params = list(objective = "regression", metric = "l2")
+     , data = dtrain
+     , nrounds = 1L
+     , min_data = 1L
+     , learning_rate = 1.0
+     , verbose = -1
+ )
> round(predict(model, train_mat), digits = 10)
[1] 0 1
> round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)
[1] 0
```
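As a side note (not part of the original report): the four cases in the logs differ only in the storage mode that R assigns to the matrix and the label vector. A small illustration with base R functions:

```r
# Illustration only: how the "Int" and "Num" variants above differ in R.
storage.mode(matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2))  # "integer"
storage.mode(matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2))      # "double"
typeof(c(0L, 1L))  # "integer"
typeof(c(0, 1))    # "double"
```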
Reproducible example(s)

Steps to reproduce

Run the following code in R:

```r
library(lightgbm)

# Data Int / Labels Int
train_mat <- matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2)
train_labels <- c(0L, 1L)
dtrain <- lgb.Dataset(train_mat, label = train_labels)
model <- lgb.train(
    params = list(objective = "regression", metric = "l2")
    , data = dtrain
    , nrounds = 1L
    , min_data = 1L
    , learning_rate = 1.0
    , verbose = -1
)
round(predict(model, train_mat), digits = 10)  # Must be 0, 1
round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)  # Must be 0

# Data Num / Labels Int
train_mat <- matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2)
train_labels <- c(0L, 1L)
dtrain <- lgb.Dataset(train_mat, label = train_labels)
model <- lgb.train(
    params = list(objective = "regression", metric = "l2")
    , data = dtrain
    , nrounds = 1L
    , min_data = 1L
    , learning_rate = 1.0
    , verbose = -1
)
round(predict(model, train_mat), digits = 10)  # Must be 0, 1
round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)  # Must be 0

# Data Int / Labels Num
train_mat <- matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2)
train_labels <- c(0, 1)
dtrain <- lgb.Dataset(train_mat, label = train_labels)
model <- lgb.train(
    params = list(objective = "regression", metric = "l2")
    , data = dtrain
    , nrounds = 1L
    , min_data = 1L
    , learning_rate = 1.0
    , verbose = -1
)
round(predict(model, train_mat), digits = 10)  # Must be 0, 1
round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)  # Must be 0

# Data Num / Labels Num
train_mat <- matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2)
train_labels <- c(0, 1)
dtrain <- lgb.Dataset(train_mat, label = train_labels)
model <- lgb.train(
    params = list(objective = "regression", metric = "l2")
    , data = dtrain
    , nrounds = 1L
    , min_data = 1L
    , learning_rate = 1.0
    , verbose = -1
)
round(predict(model, train_mat), digits = 10)  # Must be 0, 1
round(sum(abs(predict(model, train_mat) - train_labels)), digits = 10)  # Must be 0
```
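A possible workaround, sketched here as an editorial note rather than something stated in the original report: the logs above show that both "Data Num" cases predict correctly regardless of label type, which suggests the integer storage mode of the feature matrix is the trigger. Coercing the matrix (and, optionally, the labels) to double before building the lgb.Dataset would then sidestep the problem; all other calls follow the reproducible example above.

```r
library(lightgbm)

# Workaround sketch (assumption drawn from the logs above, not a confirmed fix):
# force double storage on the feature matrix before constructing the Dataset.
train_mat <- matrix(c(0L, 1L, 1L, 0L), nrow = 2, ncol = 2)
train_labels <- c(0L, 1L)

storage.mode(train_mat) <- "double"       # integer matrix -> double matrix
train_labels <- as.numeric(train_labels)  # optional: integer labels alone were not the trigger in the logs

dtrain <- lgb.Dataset(train_mat, label = train_labels)
model <- lgb.train(
    params = list(objective = "regression", metric = "l2")
    , data = dtrain
    , nrounds = 1L
    , min_data = 1L
    , learning_rate = 1.0
    , verbose = -1
)
round(predict(model, train_mat), digits = 10)  # matches the "Data Num" cases above: 0 1
```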
Wow thank you for the detailed write-up! I will look into this.
Closed by #3140, thanks to @mayer79.
This issue has been automatically locked since there has not been any recent activity after it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.