LB customization is difficult #2104
Comments
I think it's not entirely trivial what the default behaviour should be. What would you expect it to do when all domains are deleted? Should it then just include all tasks in the task list? I was thinking about adding a custom benchmark tab; from the user's perspective it would make quite a bit of sense. On the other hand, it might prove technically challenging, since the leaderboard, as it is right now, relies heavily on the selected benchmark (which speeds things up a lot). Can you provide a scenario where you would be interested in performance on a single task but don't necessarily know which benchmark that task belongs to? I'm just wondering what the exact use case is here; based on that we can figure out a sensible way to do this.
I think for something like the attached I would not expect it to be empty? 🤔 Especially since selecting only "jpn" works, and the list only becomes empty once "zho" is added. (Attachment: tmp2.mov)
Hmm, yeah, this seems odd. We definitely have "jpn" tasks in:
produces using:
@x-tabdeveloping I feel like this would have worked previously. My guess is that this is probably a bug?
I can't reproduce it either; it seems to work as intended.
@Mateleo didn't the dropdown selection for language work?
Some people just care about selecting specific tasks, but if I clear out all domains and then try to select specific tasks, they do not appear, I guess because the domain selection is empty? Also, adding a domain clears my current task selection 🤔
Maybe it is worth having a separate field under the prebuilt benchmarks, just called Custom, that starts with nothing selected and makes it super easy to create one's own custom benchmark.
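To make the request concrete, here is a rough sketch of the filter semantics I have in mind. It is only an illustration and not the leaderboard's actual code: the `Task` class, the `filter_tasks` function, and the example tasks are all hypothetical. The assumption is that an empty filter means "no restriction", so clearing all domains would still leave every task selectable, and that picking several languages matches a task if it covers any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical stand-in for a leaderboard task; not the real task class.
    name: str
    domains: frozenset
    languages: frozenset


def filter_tasks(tasks, domains=(), languages=(), names=()):
    """Return tasks matching every non-empty filter.

    Assumption: an empty filter imposes no restriction, and a task matches
    a domain/language filter if it overlaps with ANY selected value.
    """
    domains, languages, names = set(domains), set(languages), set(names)
    selected = []
    for task in tasks:
        if domains and not (task.domains & domains):
            continue
        if languages and not (task.languages & languages):
            continue
        if names and task.name not in names:
            continue
        selected.append(task)
    return selected


# Made-up example tasks, purely for illustration.
TASKS = [
    Task("TaskA", frozenset({"News"}), frozenset({"jpn"})),
    Task("TaskB", frozenset({"Web"}), frozenset({"zho"})),
    Task("TaskC", frozenset({"News"}), frozenset({"eng"})),
]

# No domains selected: every task stays available for manual selection.
all_tasks = [t.name for t in filter_tasks(TASKS)]

# Adding a second language widens the result instead of emptying it.
jpn_only = [t.name for t in filter_tasks(TASKS, languages=["jpn"])]
jpn_zho = [t.name for t in filter_tasks(TASKS, languages=["jpn", "zho"])]

# A specific task can be picked without choosing any domain first.
picked = [t.name for t in filter_tasks(TASKS, names=["TaskC"])]
```

With these semantics a "Custom" entry could simply start from an empty task-name selection and let the user add tasks one by one, without the domain field resetting anything.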