list states leak (Tensor) memory #2492

Closed
dominicgkerr opened this issue Apr 6, 2024 · 1 comment · Fixed by #2493
Labels
bug / fix Something isn't working help wanted Extra attention is needed

Comments

@dominicgkerr
Contributor

🐛 Bug

Hello!

I've recently tracked a CPU (possibly GPU also) memory leak to list[Tensor] states. For example:

import torch
from torchmetrics import Metric

class DummyListMetric(Metric):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.add_state("x", default=[])

    def update(self, x=None):
        x = torch.tensor(1) if x is None else x
        self.x.append(x)

When (the parent) Metric.reset() is called, self.x is simply overwritten with an empty list [1]. Unfortunately, this doesn't guarantee that the contents of self.x are deleted, so the Tensor elements are not always freed.
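
The difference between rebinding a list and clearing it in place can be shown in plain Python. This is only a minimal sketch: `FakeTensor` is a hypothetical stand-in for `torch.Tensor`, and the `alias` variable is an assumed second reference to the old list (the issue does not say what actually holds it).

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor, so the sketch has no dependencies."""


def element_survives(reset_by_clearing: bool) -> bool:
    """Return True if the list element is still alive after the 'reset'."""
    state = [FakeTensor()]
    alias = state                   # assumed second reference to the same list
    probe = weakref.ref(state[0])   # lets us observe whether the element was freed

    if reset_by_clearing:
        state.clear()               # empties the one shared list object
    else:
        state = []                  # rebinds the name; the old list lives on via alias

    gc.collect()
    alive = probe() is not None
    del alias
    return alive
```

Under these assumptions, `element_survives(False)` reports the element as still alive, while `element_survives(True)` reports it as freed. Note that clearing in place also empties the list for every other holder of it.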

After some investigation, I found that my custom metrics (subclasses of Metric) no longer caused my (work) system to run out of memory once I added the following override:

def reset(self):
    for attr, default in self._defaults.items():
        if isinstance(default, list):
            getattr(self, attr).clear()

    return super().reset()

The same fix can be applied directly inside torchmetrics with a single-line change, modifying [2] to:

getattr(self, attr).clear()
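
To see the override's effect without installing anything, here is a rough toy model. `MiniMetric` is a hypothetical class that only mimics the behaviour described above (a `_defaults` dict, and a `reset()` that rebinds list states); it is not the real `torchmetrics.Metric`. `FakeTensor` and the `stale` reference are likewise assumptions for illustration.

```python
import weakref


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor."""


class MiniMetric:
    """Toy model of the reported behaviour, not the real torchmetrics.Metric."""

    def __init__(self):
        self._defaults = {}

    def add_state(self, name, default):
        self._defaults[name] = default
        setattr(self, name, list(default) if isinstance(default, list) else default)

    def reset(self):
        # Mirrors the report: list states are rebound to a new empty list.
        for name, default in self._defaults.items():
            setattr(self, name, [] if isinstance(default, list) else default)


class ClearingMetric(MiniMetric):
    def reset(self):
        # The override from the issue: empty list states in place first.
        for name, default in self._defaults.items():
            if isinstance(default, list):
                getattr(self, name).clear()
        return super().reset()


def leaked_after_reset(metric_cls) -> bool:
    """Return True if an appended element outlives reset()."""
    metric = metric_cls()
    metric.add_state("x", default=[])
    metric.x.append(FakeTensor())
    stale = metric.x                 # assumed lingering reference to the old list
    probe = weakref.ref(metric.x[0])
    metric.reset()
    leaked = probe() is not None
    del stale
    return leaked
```

In this toy setting, `leaked_after_reset(MiniMetric)` is true and `leaked_after_reset(ClearingMetric)` is false, which matches the behaviour reported in the issue.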

Looking at other open issues, this issue might be related to #2481 (which also references list states). I'd be very happy to open a PR if the above sounds reasonable. Many thanks!

Environment

  • TorchMetrics version: 1.3.2
  • Python & PyTorch Version: 3.11.8 & 2.2.1

dominicgkerr added the bug / fix and help wanted labels on Apr 6, 2024

github-actions bot commented Apr 6, 2024

Hi! Thanks for your contribution! Great first issue!
