Add log error metrics (MALE and MSLE) #2621

Open
chris-mcdo opened this issue Feb 7, 2023 · 2 comments · May be fixed by #2634
Labels: enhancement (New feature or request)

Comments

@chris-mcdo

Description

@lostella suggested you might be interested in having the mean absolute log error (MALE) and mean squared log error (MSLE) metrics in GluonTS.

Here's a quick summary of MALE and MSLE:

  • Both metrics are based on the “log error”, LE = log(f / y), where f is the forecast and y is the observed value.
  • Like other “relative” metrics (MAPE, sMAPE), they only really make sense for strictly positive data, and they assume the data have a meaningful zero point.
  • Within that setting, they have some nice properties:
    • Minimizing the expected MALE yields the median of the true distribution, and minimizing the expected MSLE yields its geometric mean; by contrast, MAPE and sMAPE produce biased estimates in general.
    • They can be interpreted as the average “relative” error, i.e., the multiplicative factor by which you expect to be off.
    • Like other relative metrics, they can easily be compared and aggregated across time series.
    • They avoid the main problems of MAPE and sMAPE, such as MAPE's asymmetric penalization of over- and under-forecasts.
  • They are essentially the MAE and MSE on the log scale (see the sketch after this list).
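
For concreteness, here is a minimal NumPy sketch of both metrics as defined above (the function names are mine, not GluonTS's):

```python
import numpy as np

def male(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute log error: mean of |log(f / y)|."""
    return float(np.mean(np.abs(np.log(y_pred / y_true))))

def msle(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared log error: mean of log(f / y) ** 2."""
    return float(np.mean(np.log(y_pred / y_true) ** 2))

y = np.array([1.0, 2.0, 4.0])  # observed values (strictly positive)
f = np.array([2.0, 2.0, 2.0])  # forecasts
print(male(y, f))  # (|log 2| + 0 + |log 2|) / 3 ~= 0.462
print(msle(y, f))  # ((log 2)^2 + 0 + (log 2)^2) / 3 ~= 0.320
```

Note this is the log-ratio definition from [2], not the log1p variant used by e.g. scikit-learn's mean_squared_log_error.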

I wrote a longer (~10 min) article making the case for these metrics; see [1].

References

[1] “Mean Absolute Log Error (MALE): A Better Relative Performance Metric.” Towards Data Science. https://towardsdatascience.com/mean-absolute-log-error-male-a-better-relative-performance-metric-a8fd17bc5f75
[2] Tofallis, C. (2015). “A Better Measure of Relative Prediction Accuracy for Model Selection and Model Estimation.” Journal of the Operational Research Society, 66, 1352–1362. Available at SSRN: https://ssrn.com/abstract=2635088

@chris-mcdo added the enhancement label Feb 7, 2023
@chris-mcdo (Author)

I'd be happy to write a PR for it.

Here are a few questions / thoughts I have about implementation:

  1. I notice that there are two model evaluation packages, gluonts.ev and gluonts.evaluation. Should I implement the metrics in both, or just one?
  2. The (expected) MSLE is minimized by the geometric mean of the true distribution, so it would make sense to compute the MSLE with respect to the geometric mean of the forecast distribution (see Inconsistent error definition #2272). But none of the Forecast classes have a method for computing the geometric mean, so for now maybe we could use the median instead? (A sketch of computing it from samples follows below.)
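
For sample-based forecasts, the geometric mean is cheap to compute from the samples array. A minimal sketch, not an existing Forecast method:

```python
import numpy as np

def forecast_geometric_mean(samples: np.ndarray) -> np.ndarray:
    """Per-timestep geometric mean of a (num_samples, prediction_length)
    samples array. Assumes strictly positive samples; illustrative only,
    not part of the Forecast API."""
    return np.exp(np.mean(np.log(samples), axis=0))
```

(scipy.stats.gmean(samples, axis=0) computes the same thing.)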

@jaheba (Contributor) commented Feb 9, 2023

Hi Christopher,

> I notice that there are two model evaluation packages, gluonts.ev and gluonts.evaluation. Should I implement the metrics in both, or just one?

evaluation is a convoluted mess, and ev is an attempt at a more consistent interface. ev is not used yet, but we plan to deprecate evaluation in favour of it.

I would start by implementing the metrics in ev, and then we can think about whether it also makes sense to have them in evaluation.
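
As a rough illustration of what the per-entry statistics could look like in that style (assuming ev's convention of plain functions over a dict of numpy arrays; the exact interface here is a guess, not the actual gluonts.ev API):

```python
import numpy as np
from typing import Dict

def absolute_log_error(
    data: Dict[str, np.ndarray], forecast_type: str = "0.5"
) -> np.ndarray:
    # |log(f / y)| per entry; assumes strictly positive labels and forecasts
    return np.abs(np.log(data[forecast_type] / data["label"]))

def squared_log_error(
    data: Dict[str, np.ndarray], forecast_type: str = "0.5"
) -> np.ndarray:
    # log(f / y)^2 per entry; the median ("0.5") stands in for the
    # geometric mean, per the discussion above
    return np.log(data[forecast_type] / data["label"]) ** 2
```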

chris-mcdo added a commit to chris-mcdo/gluonts that referenced this issue Feb 9, 2023
@chris-mcdo chris-mcdo linked a pull request Feb 9, 2023 that will close this issue