Is your feature request related to a problem? Please describe.
Have a standard pipeline for evaluating annotation sets (against a human annotator or against a gold standard).
Describe the solution you'd like
Where to store the evaluations

Two types of evaluation:
- benchmarking (human vs gold standard, vtc vs human)
- agreement (human vs human, vtc vs another automated system)

Benchmarking:
- where: extra/benchmarking
- Given that it describes the "quality" of the whole set, it would make sense to include benchmarking results within the set folder; however, that sounds messy, so we decided to put them in extra and to signal this information in the metadata for the sets.
What format:
- a visual format for a human who wants a general overview: a PDF containing the confusion matrices, precision, recall, and f-scores
- a YAML file with the parameters (e.g. which files were compared, dataset versions); see the sketch below
- csv1: the non-normalized confusion matrix (raw counts also give an idea of how much data was annotated); see the generation sketch below
- csv2: the f-scores
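A minimal sketch of what the parameter YAML could look like; the file name and every field name below are hypothetical, not an agreed schema:

```yaml
# extra/benchmarking/vtc_vs_gold.yaml -- illustrative example only, schema to be decided
reference: gold_standard          # set used as the reference annotation
hypothesis: vtc                   # set being evaluated
dataset_version: "2020.1"         # e.g. a git tag or commit of the dataset
recordings:                       # which files were compared
  - rec_001.wav
  - rec_002.wav
outputs:
  confusion_matrix: vtc_vs_gold_confusion.csv   # csv1
  fscores: vtc_vs_gold_fscores.csv              # csv2
  report: vtc_vs_gold_report.pdf
```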
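And a rough sketch of how csv1 and csv2 could be produced from two aligned label sequences; the function name, label set, and output file names are assumptions:

```python
# Hypothetical generation of csv1 (raw confusion matrix) and csv2 (per-class scores).
import pandas as pd
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

def benchmark(reference, hypothesis, labels, out_prefix):
    """Write the non-normalized confusion matrix and precision/recall/f-score to CSV."""
    # csv1: raw counts, so the amount of data that was compared stays visible
    cm = confusion_matrix(reference, hypothesis, labels=labels)
    pd.DataFrame(cm, index=labels, columns=labels).to_csv(f"{out_prefix}_confusion.csv")

    # csv2: precision, recall and f-score per class
    precision, recall, fscore, support = precision_recall_fscore_support(
        reference, hypothesis, labels=labels, zero_division=0
    )
    pd.DataFrame(
        {"precision": precision, "recall": recall, "fscore": fscore, "support": support},
        index=labels,
    ).to_csv(f"{out_prefix}_fscores.csv")

# e.g. benchmark(gold_labels, vtc_labels, labels=["CHI", "OCH", "FEM", "MAL"], out_prefix="vtc_vs_gold")
```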
How these could be used:
- get f-scores for all annotators by grabbing them from all sets
- get f-scores per child, which would mean grabbing them from all sets based on the information in the YAML; see the aggregation sketch below
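A rough sketch of the aggregation side, assuming the hypothetical layout and YAML keys used above (extra/benchmarking/, outputs.fscores, hypothesis/reference):

```python
# Hypothetical aggregation of per-set f-scores; all paths and keys are assumptions.
from pathlib import Path

import pandas as pd
import yaml

def collect_fscores(dataset_root):
    """Gather the f-score tables referenced by every YAML in extra/benchmarking/."""
    tables = []
    for params_file in Path(dataset_root, "extra", "benchmarking").glob("*.yaml"):
        params = yaml.safe_load(params_file.read_text())
        fscores = pd.read_csv(params_file.parent / params["outputs"]["fscores"])
        fscores["hypothesis"] = params["hypothesis"]
        fscores["reference"] = params["reference"]
        tables.append(fscores)
    return pd.concat(tables, ignore_index=True)

# e.g. f-scores per annotator, pooled over all sets of a dataset:
# collect_fscores(".").groupby("hypothesis")["fscore"].mean()
```

Per-child f-scores would additionally require joining the recordings listed in the YAML against the dataset's children metadata.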
!! This feature should probably come after #454