Batch integration result metrics are not in proper range #692
https://openproblems.bio/benchmarks/batch_integration_embed/immune_batch/

It seems the silhouette score is greater than 1 for some methods? How is that possible if scib is rescaling it to [0, 1]?

Also, is there a total score being computed to rank the methods? It would be great if that total score were included.
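For reference, scib maps the average silhouette width from its native [-1, 1] range into [0, 1], so a value above 1 cannot be a scib-scaled silhouette. Below is a minimal sketch of that scaling, assuming the usual (s + 1) / 2 transform; the function name is illustrative, not the scib API:

```python
import numpy as np
from sklearn.metrics import silhouette_score

def scaled_asw(X: np.ndarray, labels: np.ndarray) -> float:
    """Average silhouette width rescaled from [-1, 1] to [0, 1].

    Illustrative only; scib applies the equivalent (asw + 1) / 2
    scaling when its scale option is enabled.
    """
    asw = silhouette_score(X, labels)  # native range: [-1, 1]
    return (asw + 1) / 2               # rescaled range: [0, 1]
```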
Thanks for the issue @adamgayoso. In Open Problems we are currently rescaling the metrics to a range between a random baseline and a perfect baseline. On Tuesday we were discussing how best to document this on the website; one thought was to enable a toggle between raw and scaled values in the tables. Do you have any thoughts on this? This rescaling is also used to compute a total score by simple mean aggregation. Displaying that would indeed be a good idea.
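A minimal sketch of the rescaling described here, assuming each metric has known random and perfect baseline values; the names and data layout are assumptions, not the Open Problems API:

```python
import numpy as np

def rescale(raw: float, random_baseline: float, perfect_baseline: float) -> float:
    """Map a raw metric so the random baseline scores 0 and the perfect baseline scores 1."""
    return (raw - random_baseline) / (perfect_baseline - random_baseline)

def mean_score(raw_scores: dict, baselines: dict) -> float:
    """Total score: a simple mean of the baseline-rescaled metrics."""
    scaled = [
        rescale(raw, *baselines[name])  # baselines[name] = (random, perfect)
        for name, raw in raw_scores.items()
    ]
    return float(np.mean(scaled))

# e.g. mean_score({"asw": 0.7}, {"asw": (0.5, 1.0)}) -> 0.4
```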
I like the idea of a toggle and of including the mean score (as well as how it's computed; are you doing the 0.6/0.4 split?).
We're not currently doing the 60/40 split, in order to be consistent across tasks. I do think we should, but weights are not currently something that can be set per task.
To me it makes sense to maybe have a straight average, a bio-biased average, and a batch-biased average? (50/50, 60/40, 40/60)
Hmm... I wonder if that would just over-complicate things. Either way, it's not generalizable across tasks, so it would require an exception in the ranking. It should definitely be discussed, though.
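For concreteness, the three proposals above differ only in the weight put on the bio-conservation versus batch-correction metric groups. A hypothetical sketch, where the grouping into bio and batch scores is an assumption about the task setup:

```python
import numpy as np

def weighted_total(bio_scores, batch_scores, bio_weight=0.5):
    """Aggregate scaled metrics with a bio/batch weighting.

    bio_weight=0.5 gives the straight average, 0.6 the bio-biased
    average, and 0.4 the batch-biased average (50/50, 60/40, 40/60).
    """
    bio = float(np.mean(bio_scores))
    batch = float(np.mean(batch_scores))
    return bio_weight * bio + (1 - bio_weight) * batch
```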
It still seems like the metrics are in different ranges, yet a simple average of the metrics is being used. For example, the isolated label silhouette is >1 for many methods and seems to contribute a lot to the ranking right now.
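To illustrate why this matters: with three metrics scored 0.6, 0.7, and 1.8 (the out-of-range silhouette), the simple mean is about 1.03 and is driven mostly by the unscaled value; with the last metric properly in [0, 1] (say 1.0), the mean would instead be about 0.77.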
See #685 |
This score is displayed as "Mean score". It's the first column. |
Closing as duplicate |