
[LightGBM] [Warning] Met 'abs(label) < 1', will convert them to '1' in MAPE objective and metric #3608

Closed
michael135 opened this issue Nov 29, 2020 · 13 comments

michael135 commented Nov 29, 2020

I'm doing regression with MAPE (Mean Absolute Percentage Error) as the objective and eval_metric.
What is the meaning of the following warning?

[LightGBM] [Warning] Met 'abs(label) < 1', will convert them to '1' in MAPE objective and metric
@jameslamb (Collaborator)

Thanks for using LightGBM @michael135 !

There are values in your target variable which have an absolute value < 1. MAPE is unstable under such conditions, so LightGBM converts those values to 1.0 before evaluation. This warning is telling you that that's happening.

The code where this rounding happens:

```cpp
inline static double LossOnPoint(label_t label, double score, const Config&) {
  // The denominator is clamped to at least 1.0, so any |label| < 1 is divided by 1 instead
  return std::fabs((label - score)) / std::max(1.0f, std::fabs(label));
}
```

To see this instability, consider the following MAPE values, each computed on a single prediction.

| actual | pred  | error | MAPE  |
|--------|-------|-------|-------|
| 2.100  | 2.200 | 0.100 | 0.048 |
| 2.105  | 2.200 | 0.095 | 0.045 |
| 1.100  | 1.200 | 0.100 | 0.091 |
| 1.105  | 1.200 | 0.095 | 0.086 |
| 0.100  | 0.200 | 0.100 | 1.000 |
| 0.105  | 0.200 | 0.095 | 0.905 |
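
Here's a minimal Python sketch (not LightGBM's actual implementation, just the same formulas) that reproduces the MAPE column above and adds the clamped version LightGBM computes:

```python
# Raw vs. clamped single-point MAPE, mirroring LossOnPoint above
def raw_mape(label, pred):
    return abs(label - pred) / abs(label)

def clamped_mape(label, pred):
    # LightGBM clamps the denominator to at least 1.0
    return abs(label - pred) / max(1.0, abs(label))

cases = [(2.100, 2.200), (2.105, 2.200),
         (1.100, 1.200), (1.105, 1.200),
         (0.100, 0.200), (0.105, 0.200)]
for label, pred in cases:
    print(f"actual={label:.3f} pred={pred:.3f} "
          f"raw={raw_mape(label, pred):.3f} clamped={clamped_mape(label, pred):.3f}")
```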

Rounding to 1 like this also avoids the case where calculating MAPE fails with a divide-by-zero error because the target is exactly 0.

If you want to use MAPE as an evaluation metric and are uncomfortable with the way that LightGBM is using rounding to calculate it in this setting, I recommend altering your target variable to ensure that all the values are > 1 in absolute value.

I think this warning could be made a bit clearer, and I'll open a pull request to clarify it. Thanks for pointing it out!

michael135 commented Dec 1, 2020

Thank you @jameslamb, for detailed explanation.

Why is MAPE unstable when MAPE < 1?
(Is there any reference where I can read more about it?)

P.S.
I'm familiar with the MAPE stability issue when actual = 0 or close to zero.

@jameslamb (Collaborator)

> Why is MAPE unstable when MAPE < 1?

It's not that it is unstable when MAPE < 1. This metric is unstable when the actual value of the target is less than 1 (in absolute value).

In the table I posted above, look at the MAPE values for the case where the target has values like 0.1, and compare those to the MAPE for target values like 1.1 and 2.1.

Does that clarify it? If not, could you reword your question?

michael135 commented Dec 1, 2020

I think it's clearer to me now.

So it will change this row:

| actual | pred  | error | MAPE  |
|--------|-------|-------|-------|
| 0.100  | 0.200 | 0.100 | 1.000 |

Instead of |0.1 - 0.2| / 0.1 = 1.0,

we will get |0.1 - 0.2| / 1 = 0.1 ?

@jameslamb (Collaborator)

yep, that's right! So it's not perfect, but it's at least better.

You can see that the larger errors still result in a larger penalty:

|0.1 - 0.2| / 1 = 0.1
|0.1 - 0.3| / 1 = 0.2
|0.1 - 0.4| / 1 = 0.3

but with this formulation, the effect of having a target value close to 0 is softened a lot, so it's less likely to distort the MAPE. Since the MAPE is a mean over all absolute percentage errors, it's sensitive to extreme values.
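
To make "sensitive to extreme values" concrete, here's a quick sketch (plain Python, just for illustration) where a single near-zero target dominates the raw mean but not the clamped one:

```python
labels = [2.1, 1.1, 0.001]
preds  = [2.2, 1.2, 0.101]

# Raw MAPE: the 0.001 target contributes an error of 100 all by itself
raw = [abs(l - p) / abs(l) for l, p in zip(labels, preds)]

# Clamped MAPE: that same point contributes only 0.1
clamped = [abs(l - p) / max(1.0, abs(l)) for l, p in zip(labels, preds)]

print(sum(raw) / len(raw))          # ~33.38, dominated by one point
print(sum(clamped) / len(clamped))  # ~0.08
```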

If you have any concerns about this, you can use a metric that doesn't rely on percentages, like mae.

guolinke commented Dec 2, 2020

@michael135 MAPE is not a perfect metric, as it can divide by zero when the label is zero.
Therefore, we need a pre-defined threshold to avoid dividing by zero. We simply use 1, as we don't want data with small labels to be over-emphasized as having large errors, which would conflict with MAPE's design.

For example, when the label is 1e-5 and the prediction is 1, the error would be close to 1e5 (|1e-5 - 1| / 1e-5 ≈ 1e5), which is very large.

michael135 commented Dec 2, 2020

Thank you @jameslamb and @guolinke for the clarifications!

I absolutely agree for the case when label = 1e-5.

But when label = 0.1 (or even 0.01), I'm a bit confused.
Is there an easy option to define (provide) the threshold below which a label is rounded up to 1.0?

Regarding MAPE: I know it's not perfect, but for the current problem I'm trying to solve it's the least bad option for me. I don't have zeroes, and when label = 5 and prediction = 4.9 (mae = 0.1), it's not the same as when label = 0.5 and prediction = 0.4 (same mae = 0.1); MAPE will "emphasize" the difference.
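
A quick illustrative check of that arithmetic with the raw MAPE formula:

```python
for label, pred in [(5.0, 4.9), (0.5, 0.4)]:
    mae  = abs(label - pred)                 # 0.1 in both cases
    mape = abs(label - pred) / abs(label)    # raw, unclamped
    print(f"label={label}: mae={mae:.2f}, mape={mape:.2f}")
# label=5.0: mae=0.10, mape=0.02
# label=0.5: mae=0.10, mape=0.20
```

Though note that with LightGBM's clamping, the label = 0.5 case would be computed as 0.1 / 1 = 0.1.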

guolinke commented Dec 2, 2020

@michael135 you can simply multiply your labels by 10 (for 0.1), or 100 (for 0.01).

@michael135 (Author)

@guolinke, what is the difference between multiplying the labels myself and letting LightGBM "multiply" them?

guolinke commented Dec 3, 2020

@michael135 I was suggesting a workaround for your question, "Is there an easy option to define (provide) the threshold below which a label is rounded up to 1.0?"
Theoretically, multiplying all labels by a constant will not change the value of the MAPE metric, so it is safe. And with a larger label range, the effective threshold (relative to your original scale) is reduced accordingly.
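
A minimal sketch of that workaround (illustrative only; assumes the scikit-learn interface, synthetic data, and a hypothetical scale factor of 100):

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 0.1 + 0.05 * rng.random(500)  # all labels in (0.1, 0.15), so |label| < 1

SCALE = 100.0  # any constant that pushes every |label| above 1

# Train on scaled labels so the MAPE clamp never triggers
model = lgb.LGBMRegressor(objective="mape")
model.fit(X, y * SCALE)

# Scaling by a constant doesn't change MAPE, so just divide back
preds = model.predict(X) / SCALE
```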

@michael135 (Author)

@guolinke, now I've got it.
This workaround is fine (multiply the labels, then divide the predictions back).

Thank you again @jameslamb and @guolinke.

@jameslamb (Collaborator)

Glad we could help! If you don't mind, could you please accept my original answer on your Stack Overflow post, so it will be marked as answered?

https://stackoverflow.com/questions/65075804/lightgbm-warning-met-abslabel-1

@github-actions (bot)

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023