
Improve GMARE metric computation formula #2087

Closed
KickItLikeShika opened this issue Jun 28, 2021 · 9 comments · Fixed by #2096

@KickItLikeShika
Contributor

Months ago I tried to update GMRAE to be compatible with DDP, but I hit a very strange issue: the distributed tests were failing. I thought I had done something wrong (even though I had updated 10+ metrics for DDP by then), so I set it aside at the time. Today I tried again to update it and send the PR, but I hit the same issue: the distributed tests are failing.

distributed_context_single_node_gloo = {'local_rank': 1, 'rank': 1, 'world_size': 2}

    @pytest.mark.distributed
    @pytest.mark.skipif(not idist.has_native_dist_support, reason="Skip if no native dist support")
    def test_distrib_gloo_cpu_or_gpu(distributed_context_single_node_gloo):
    
        device = idist.device()
>       _test_distrib_compute(device)


metric_device = device(type='cpu')

    def _test(metric_device):
        metric_device = torch.device(metric_device)
        m = GeometricMeanRelativeAbsoluteError(device=metric_device)
        torch.manual_seed(10 + rank)
    
        y_pred = torch.rand(size=(100,), device=device)
        y = torch.rand(size=(100,), device=device)
    
        m.update((y_pred, y))
    
        y_pred = idist.all_gather(y_pred)
        y = idist.all_gather(y)
    
        np_y = y.cpu().numpy()
        np_y_pred = y_pred.cpu().numpy()
    
        np_gmrae = np.exp(np.log(np.abs(np_y - np_y_pred) / np.abs(np_y - np_y.mean())).mean())
    
        # sum_y = 0
        # num_examples = 0
        # sum_of_errors = 0
        # np_gmrae = 0
        # sum_y += np_y.sum()
        # num_examples += np_y.shape[0]
        # y_mean = sum_y / num_examples
        # numerator = np.abs(y.view_as(y_pred) - y_pred)
        # denominator = np.abs(y.view_as(y_pred) - y_mean)
        # sum_of_errors += np.log(numerator / denominator).sum()
        # np_gmrae += np.exp((sum_of_errors / num_examples).mean())
    
>       assert m.compute() == pytest.approx(np_gmrae)
E       assert 1.3195266723632812 == 1.3320785760879517 ± 1.3e-06
E         +1.3195266723632812
E         -1.3320785760879517 ± 1.3e-06


FAILED tests/ignite/contrib/metrics/regression/test_geometric_mean_relative_absolute_error.py::test_distrib_gloo_cpu_or_gpu - assert 1.3195266723632812 == 1.3320785760879517 ± 1.3e-06

And here are the code and the tests:
master...KickItLikeShika:improve-GMRAE-ddp

@sdesrozis
Contributor

This metric is similar to the Geometric Mean Absolute Error, which works fine in DDP. I think the error is in the computation of the relative term.

At line 147 of the test, np_y.mean() is computed on the complete y vector gathered over the procs. However, at line 48 of the metric, the mean is computed on y separately on each proc.
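
To make the mismatch concrete, here is a minimal sketch (hypothetical values, not the actual test) showing that per-proc means yield a different denominator than the global mean:

    import torch

    torch.manual_seed(0)
    y0, y1 = torch.rand(100), torch.rand(100)  # y on proc 0 and on proc 1
    y_all = torch.cat([y0, y1])                # what idist.all_gather(y) returns

    # Denominator built from the global mean (what the test computes):
    global_denom = torch.abs(y_all - y_all.mean())

    # Denominator built from per-proc means (what the metric currently computes):
    local_denom = torch.cat([torch.abs(y0 - y0.mean()), torch.abs(y1 - y1.mean())])

    print(torch.allclose(global_denom, local_denom))  # False in general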

@KickItLikeShika What do you think ?

@KickItLikeShika
Contributor Author

Interesting! Thanks for the explanation @sdesrozis. But I'm wondering: isn't that mismatch in the results confusing for the user? I don't know exactly which way is the right one for computing this metric (computing the mean on each proc separately, like the metric itself does, or the way it's done in the tests). So I don't know whether I should edit the way we compute the metric, or this test.

@sdesrozis
Contributor

sdesrozis commented Jun 29, 2021

The answer is in the GMRAE definition 😊

The actual implementation sounds weird to me: the results will differ depending on the number of procs used. I didn't look into the details, and I think you should in order to fix this issue. I can help if needed.

IMO the correct implementation is the one in the test.

@KickItLikeShika
Contributor Author

KickItLikeShika commented Jun 29, 2021

Thanks @sdesrozis for the clarifications!
OK, I will change the way we compute this metric. Does this sound OK to you? @sdesrozis @vfdev-5

@sdesrozis
Contributor

sdesrozis commented Jun 29, 2021

From the reference paper p14 :

Normalization by variability of actuals: N3 = (A − Ā)^(−c)
Normalization by variability of actuals includes division of the error by the difference between the actual value and the mean value of all actuals. For the magnitude of error and absolute error c = 1, and for the squared error c = 2. Also, for the absolute distance error, absolute actuals are used.
Inclusion of the actuals mean Ā is intended to lower the risk of division-by-zero situations. The actuals mean is implemented in R packages (e.g. in MLmetrics, metrics, rminer). In the general case, normalization can use an error from a benchmark method (usually naïve forecasting) (Hyndman, 2006).

So the key point is to understand exactly what the mean of all actuals means. It sounds to me that the mean over all y, as in the test, is correct.
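
Spelled out (this is my reading of the definition above, and it matches the test's np_gmrae computation):

    \mathrm{GMRAE} = \exp\left( \frac{1}{n} \sum_{t=1}^{n} \ln \frac{|A_t - F_t|}{|A_t - \bar{A}|} \right), \qquad \bar{A} = \frac{1}{n} \sum_{t=1}^{n} A_t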

On the other hand, it seems that GMRAE is rather a metric for forecasting time series. In that setting, Ā stands for A at the previous time step... See https://www.rdocumentation.org/packages/rminer/versions/1.4.6/topics/mmetric or https://support.numxl.com/hc/en-us/articles/115001223403-GMRAE-Geometric-Mean-Relative-Absolute-Error

I don't know whether this metric is relevant...

HTH

@KickItLikeShika changed the title from "GMARE compatibility with DDP" to "Improve GMARE metric computation formula" on Jun 30, 2021
@KickItLikeShika
Contributor Author

Thanks to @sdesrozis for providing help!
After more digging I understood what's going on here. I tried to update this metric to be like the others, so I implemented something like this:

    @reinit__is_reduced
    def reset(self) -> None:
        self._num_examples = 0
        self._sum_of_errors = torch.tensor(0.0, device=self._device)

    def _update(self, output: Tuple[torch.Tensor, torch.Tensor]) -> None:
        y_pred, y = output[0].detach(), output[1].detach()
        # Accumulate log-relative-errors; note that y.mean() here is the mean
        # of the current batch only, not of all targets seen so far.
        errors = torch.log(torch.abs(y - y_pred) / torch.abs(y - y.mean()))
        self._sum_of_errors += torch.sum(errors).to(self._device)
        self._num_examples += y.shape[0]

    @sync_all_reduce("_sum_of_errors", "_num_examples")
    def compute(self) -> float:
        if self._num_examples == 0:
            raise NotComputableError(
                "GeometricMeanRelativeAbsoluteError must have at least one example before it can be computed."
            )
        return torch.exp(self._sum_of_errors / self._num_examples).item()

It worked well, but not for the integration tests, since there the data is fed in batches (which is why the integration tests for this metric differed from the other metrics'): the mean gets computed over a single batch instead of over all of y, and that's the cause of the misleading results in both the integration and the distributed cases.
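
For reference, the global-mean definition can be made exact regardless of batching by deferring all computation to the end. A minimal sketch, not the Ignite metric API:

    import torch

    class GMRAEAccumulator:
        # Sketch only: exact global-mean GMRAE by storing all outputs.
        # It illustrates why the global mean forces a compute-at-the-end design.

        def __init__(self) -> None:
            self._y_pred: list = []
            self._y: list = []

        def update(self, y_pred: torch.Tensor, y: torch.Tensor) -> None:
            # Defer all computation: the denominator needs the mean of *all* y.
            self._y_pred.append(y_pred.detach())
            self._y.append(y.detach())

        def compute(self) -> float:
            y_pred = torch.cat(self._y_pred)
            y = torch.cat(self._y)
            # In DDP, y and y_pred would additionally be all-gathered here.
            errors = torch.log(torch.abs(y - y_pred) / torch.abs(y - y.mean()))
            return torch.exp(errors.mean()).item()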
So I have an idea how to update this metric and make it work for DDP too; have a look here: https://gist.github.com/bshishov/5dc237f59f019b26145648e2124ca1c9#file-forecasting_metrics-py-L240
GMRAE is used for forecasting, so the gist above provides a seasonality/benchmark option. I'm thinking about providing this option and using it instead of the mean. Please let me know what you think about that, thanks!
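
In that formulation the denominator uses a naive seasonal forecast instead of the mean. A sketch along the lines of the gist (the function name and seasonality parameter follow the gist, not Ignite's API):

    import numpy as np

    def gmrae(actual: np.ndarray, predicted: np.ndarray, seasonality: int = 1) -> float:
        # Naive benchmark forecast: predict A_{t - seasonality} for A_t.
        benchmark = actual[:-seasonality]
        a, f = actual[seasonality:], predicted[seasonality:]
        rae = np.abs(a - f) / np.abs(a - benchmark)
        return float(np.exp(np.mean(np.log(rae))))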

@sdesrozis
Contributor

@KickItLikeShika Thanks for exploring the usage of this metric.

We don't have any time-series use case. Why not provide such an example to illustrate GMRAE usage, and the other regression metrics as well? Although I don't have any experience on this topic.

@KickItLikeShika
Contributor Author

I have been looking for examples or anything that shows the usage of the metric, but I have found nothing solid yet. The link you sent above does somewhat clarify the usage: https://support.numxl.com/hc/en-us/articles/115001223403-GMRAE-Geometric-Mean-Relative-Absolute-Error

@sdesrozis
Contributor

Yep, I cited this article in my previous post. Nothing solid, I agree.
