Evaluation treats multiple categories too leniently #91
Comments
One subtlety is that, because
[F The] [H [P [C service] ] ... [D poor] [U ...] ] [F is]
Should they be removed? I.e.:
[F The] [H [P service ] ... [D poor] [U ...] ] [F is]
Scoring
Yes, I think normalization (including C-flattening) should occur again after moving Fs.
Should moving all Fs be part of normalization? For structures like [S [F the] [C xyz]] it would make it more transparent that xyz is evoking a scene.
Also: the confusion matrix code should match the F-score computation.
Evaluation is by spans: if the predicted and gold category sets for a span have a non-empty intersection, the span is considered correct. This is a problem because a parser can predict many unary edges or multi-category edges over the same span and never be penalized for the extra categories: https://github.com/danielhers/ucca/blob/master/ucca/evaluation.py#L102
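To make the leniency concrete, here is a minimal sketch. It is not the code in `evaluation.py`: the function names, the dict-of-sets representation, and the toy spans are all hypothetical, chosen only to contrast "any shared category counts" with counting each (span, category) pair separately.

```python
# Hypothetical sketch, not the ucca API: spans are (start, end) token offsets,
# and each span maps to the set of categories assigned to it.

def f1(matches, n_pred, n_gold):
    """F1 from raw counts, returning 0.0 when undefined."""
    p = matches / n_pred if n_pred else 0.0
    r = matches / n_gold if n_gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def lenient_f1(pred, gold):
    """A span is correct if predicted and gold categories share any category."""
    matches = sum(1 for span, cats in pred.items() if cats & gold.get(span, set()))
    return f1(matches, len(pred), len(gold))

def strict_f1(pred, gold):
    """Every (span, category) pair is scored on its own."""
    pred_pairs = {(span, c) for span, cats in pred.items() for c in cats}
    gold_pairs = {(span, c) for span, cats in gold.items() for c in cats}
    return f1(len(pred_pairs & gold_pairs), len(pred_pairs), len(gold_pairs))

# Toy data (made up): the "parser" assigns all three categories to every span.
gold = {(0, 1): {"A"}, (1, 3): {"P"}, (3, 4): {"D"}}
pred = {span: {"A", "P", "D"} for span in gold}

lenient = lenient_f1(pred, gold)
strict = strict_f1(pred, gold)
print(f"lenient F1 = {lenient:.2f}, strict F1 = {strict:.2f}")
```

On this toy input the lenient score is 1.0 even though two thirds of the predicted categories are wrong, while the pair-based score drops to 0.5 because precision is 1/3. Whether per-pair scoring is the right fix for the real evaluator is a separate question; the sketch only illustrates the problem.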
@omriabnd @nschneid