Aim UI does not scale well when logging many metrics (multitask) #3215

Open
jorenretel opened this issue Sep 1, 2024 · 0 comments
Labels
help wanted Extra attention is needed type / bug Issue type: something isn't working


🐛 Bug

Hi,
first of all, thanks for this great tool. It is a pleasure to use.

I have a project, though, where I am training a network on a very large number of tasks (thousands). What works really well is logging a distribution of metrics (say, the distribution of correlation coefficients over the different tasks).

What does NOT work well: I want to log individual metrics for these tasks so that I can easily find out which tasks train well and which have more problems. I don't get that information from the distribution plot. I don't necessarily want to look at the individual metric curves for every task, but storing the numbers in Aim is really useful, because it is great to be able to access them programmatically.
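As a stopgap for spotting the problem tasks, something like the following could pick out the k lowest-scoring tasks per epoch, so that only those are logged individually. This is a minimal sketch in plain Python; `worst_tasks` and the example task names are made up for illustration and are not part of Aim. It does not replace storing every task's value.

```python
import heapq


def worst_tasks(scores, k=10):
    """Return the k (task, score) pairs with the lowest scores, worst first."""
    return heapq.nsmallest(k, scores.items(), key=lambda item: item[1])


# Hypothetical per-task correlation coefficients for one epoch.
scores = {"task_a": 0.91, "task_b": 0.12, "task_c": 0.55, "task_d": -0.03}
print(worst_tasks(scores, k=2))
```

The returned pairs could then be tracked as a handful of individual metrics, while the full set of values stays in the distribution.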

Logging many tasks to Aim does not seem to be a problem in itself. But when I open the UI, it completely stops working (especially if I "accidentally" click the Metrics tab). There are of course some metrics (like aggregated metrics and the loss curve) that I do want to see, so clicking that tab is not entirely "accidental".

I did see the existing issues about performance problems with a very large number of runs. But I think this is orthogonal to those in some sense, so I am opening a separate issue.

To reproduce

Create a new Aim repo:

aim init

Run this Python script, which simulates logging 5 epochs with 5000 tasks:

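import aim
import math

n_tasks = 5000
run = aim.Run()  # new run in the repo created by `aim init`
for epoch in range(5):
    for i in range(n_tasks):
        # each task gets its own metric name, so 5000 distinct metrics in total
        run.track(math.sin(i), name=f'metric_task_{i}', epoch=epoch)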

Spin up the UI and click around a bit:

aim up

The runs window loads really slowly and has problems displaying the table. The Metrics tab basically blocks completely.

Expected behavior

Either:

  1. The UI would somehow be able to deal with this number of metrics (by lazy loading or something similar, which the Aim UI already seems to do in many places, but apparently not enough).

or:

  2. Let me declare while logging that some metrics are aggregated (just a normal metric) and that others are part of a collection in which each individual scalar belongs to one task, basically a vector of metrics. This could help decide how to treat them in the UI. Note that, at least in my use case, these tasks have names, not just a position in the vector, so it is really more like a dict.

I realize that option 2 is a feature request rather than a bug report. If so, my apologies.
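One way to picture the second option is a call that takes a dict of named per-task values in one go. The sketch below is only an illustration: `track_collection` and `fake_track` are hypothetical names, not existing Aim functions. The helper fans the dict out into a single metric name with a per-task `context`, relying only on the `context` keyword that `Run.track` already accepts. Whether the UI copes any better with this layout is not something the sketch demonstrates.

```python
def track_collection(track, values, name, epoch):
    """Log a dict of {task_name: value} under one metric name.

    `track` is any callable accepting the keyword arguments used below;
    with Aim it would be `run.track` (assumption: Aim is installed).
    """
    for task, value in values.items():
        track(value, name=name, epoch=epoch, context={"task": task})


# Stand-in tracker that records calls, so the sketch runs without Aim.
calls = []


def fake_track(value, name, epoch, context):
    calls.append((name, context["task"], epoch, value))


track_collection(fake_track, {"task_a": 0.9, "task_b": 0.1}, name="corr", epoch=0)
print(calls)
```

With a real run, `fake_track` would be swapped for `run.track`, and the task name would then be available as a context key rather than being baked into the metric name.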

Environment

  • Aim Version: v3.24.0
  • Python version: 3.12.5
  • pip version: 24.2
  • OS: linux/OSX
  • Any other relevant information