Side by side comparison of two models #1327

Open
yassersouri opened this issue Feb 16, 2025 · 5 comments

@yassersouri

Is there a way to get a side by side view comparing two models on the same tasks?
For example seeing which instances both models got wrong or when one got something wrong while the other didn't, etc.

Thanks for the great library.

@dragonstyle
Collaborator

Right now there isn't an automated way to do this (though it is on our to-do list). You can approximate it with the Inspect VS Code extension by opening two logs side by side and comparing them visually (tedious, but perhaps helpful).
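
Until a built-in view exists, the manual comparison can also be scripted. The following is a minimal sketch, not part of Inspect: it assumes per-sample correctness for each model has already been extracted into plain dicts keyed by sample id. The function name `compare_runs` and the sample data are hypothetical. With Inspect logs, those dicts would presumably be built from each log's per-sample scores (for instance after loading the log with `read_eval_log` from `inspect_ai.log`), but that extraction step depends on the scorer and is not shown here.

```python
def compare_runs(a: dict[str, bool], b: dict[str, bool]) -> dict[str, list[str]]:
    """Bucket sample ids present in both runs by which model answered correctly."""
    buckets: dict[str, list[str]] = {
        "both_correct": [],
        "both_wrong": [],
        "only_a_correct": [],
        "only_b_correct": [],
    }
    # Only samples that appear in both runs can be compared.
    for sample_id in sorted(a.keys() & b.keys()):
        if a[sample_id] and b[sample_id]:
            key = "both_correct"
        elif a[sample_id]:
            key = "only_a_correct"
        elif b[sample_id]:
            key = "only_b_correct"
        else:
            key = "both_wrong"
        buckets[key].append(sample_id)
    return buckets


# Hypothetical per-sample results (True = correct) for two models on the same task.
model_a = {"q1": True, "q2": False, "q3": True, "q4": False}
model_b = {"q1": True, "q2": True, "q3": False, "q4": False}

for name, ids in compare_runs(model_a, model_b).items():
    print(f"{name}: {ids}")
```

The four buckets correspond to the cases raised in the question: instances both models got wrong, and instances where one model was wrong while the other was not.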

dragonstyle self-assigned this Feb 18, 2025
@tadamcz
Contributor

tadamcz commented Feb 18, 2025

We would also find this a very interesting feature for the log viewer. I had kind of assumed this would be too much work to be worth it, as it seems to invalidate a fundamental assumption of the log viewer. But great to hear it's on the todo list! Maybe it's easier than I thought.

@dragonstyle
Collaborator

It isn't easy, unfortunately, but I think we'd still like to do it!

@yassersouri
Author

yassersouri commented Feb 19, 2025 via email

@dragonstyle
Collaborator

Thanks, this is a great suggestion. I'm right in the middle of quite a bit of refactoring of the main application logic, which is necessary to tackle a feature like this (and will cause a lot of merge issues as I work through it). Let me get through this work and then I'll report back here with some suggestions once I see a good path!
