Ner label re alignment #10568
Conversation
input_ids = tokens["input_ids"].cpu().numpy()[0]
score = np.exp(entities) / np.exp(entities).sum(-1, keepdims=True)
labels_idx = score.argmax(axis=-1)
Because we are going to set the labels according to the strategy, we need the scores for all labels, especially when using the "average" strategy.
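For illustration, a minimal sketch of why the full score distribution matters for an "average" strategy (illustrative only, not the PR's code; `subword_logits` is a made-up example):

```python
import numpy as np

# logits for the sub-words of one word, shape (num_subwords, num_labels)
subword_logits = np.array([[2.0, 0.5, 0.1],
                           [0.2, 1.8, 0.3]])

# softmax over the label axis, as in the diff above
score = np.exp(subword_logits) / np.exp(subword_logits).sum(-1, keepdims=True)

# "average" strategy: average the per-label scores across sub-words, then argmax;
# this only works if the scores for *all* labels were kept per token
label_idx = score.mean(axis=0).argmax()
```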
(idx, label_idx)
for idx, label_idx in enumerate(labels_idx)
if (self.model.config.id2label[label_idx] not in self.ignore_labels) and not special_tokens_mask[idx]
idx for idx in range(score.shape[0]) if not special_tokens_mask[idx]
At this step we can only filter on special_tokens_mask, because we don't yet have the labels for the other tokens.
| "word": word, | ||
| "score": score[idx][label_idx].item(), | ||
| "entity": self.model.config.id2label[label_idx], | ||
| "score": score[idx], |
We need the score for all labels.
entities += [entity]
if self.subword_label_re_alignment:
We set the labels according to the strategy; if subword_label_re_alignment == False we leave the labels as they were predicted.
def sub_words_label(sub_words: List[dict]) -> dict:
    score = np.stack([sub["score"] for sub in sub_words])
    if strategy == "default":
As @joshdevins said: "If training with padded sub-words/labels for the first sub-word only, e.g. Max Mustermann → Max Must ##erman ##n → B-PER I-PER X X, use the label from the first sub-word (default)."
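A tiny sketch of that behaviour (a hypothetical helper, assuming each sub-word dict already carries a predicted `entity` and `score`; not the PR's actual code):

```python
from typing import List

def first_subword_label(sub_words: List[dict]) -> dict:
    # "default" strategy: the whole word inherits the prediction of its
    # first sub-word; the remaining sub-word predictions are ignored
    word = "".join(
        sub["word"][2:] if sub["word"].startswith("##") else sub["word"]
        for sub in sub_words
    )
    first = sub_words[0]
    return {"word": word, "entity": first["entity"], "score": first["score"]}
```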
| task: str = "", | ||
| grouped_entities: bool = False, | ||
| subword_label_re_alignment: Union[bool, str] = False, | ||
| ignore_subwords: bool = False, |
Should the ignore_subwords flag be removed then?
I think so, we should remove the ignore_subwords flag. @LysandreJik, @Narsil I left some of the old code just to pass the tests (😅 I'm new to tests). Can you help me with the tests? (For example, should I remove this test or change it somehow?)
Yeah, I would think the test can be repurposed for the new flag. It would also be good to assert correctness, besides execution (it looks like the current test doesn't check the resulting output?). I'm happy to contribute directly to your branch, if it helps. Let me know.
It would be nice to keep it for backwards compatibility purposes. Can the capabilities enabled by that flag be achieved with the new flag introduced in this PR?
| ignore_labels=["O"], | ||
| task: str = "", | ||
| grouped_entities: bool = False, | ||
| subword_label_re_alignment: Union[bool, str] = False, |
I wonder if aggregate_subwords would be a more suitable name?
I would understand aggregate_subwords better than subword_label_re_alignment
Or would aggregate_strategy be even better, as we're actually prompting for a strategy?
Having this accept enum values would be great, similar to what we do with PaddingStrategy:
transformers/src/transformers/file_utils.py (lines 1657 to 1665 in 9f4e0c2)

class PaddingStrategy(ExplicitEnum):
    """
    Possible values for the ``padding`` argument in :meth:`PreTrainedTokenizerBase.__call__`. Useful for tab-completion
    in an IDE.
    """

    LONGEST = "longest"
    MAX_LENGTH = "max_length"
    DO_NOT_PAD = "do_not_pad"
def set_subwords_label(self, entities: List[dict], strategy: str) -> dict:
    def sub_words_label(sub_words: List[dict]) -> dict:
        score = np.stack([sub["score"] for sub in sub_words])
        if strategy == "default":
What happens if strategy is set to True?
Sorry, this is my bad, you're right. When strategy is True we should use the "default" strategy. I'll fix this.
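The fix being described could be as simple as this (a sketch, not the exact patch):

```python
# a bare True falls back to the "default" strategy
if strategy is True:
    strategy = "default"
```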
Thank you for addressing this! I left some minor comments/questions.
elk-cloner left a comment
@francescorubbo if you can fix the tests it would be great; let me know if you have any questions about my code.
Ensure existing behavior for `ignore_subwords` and `grouped_entities` arguments is preserved for backward compatibility.
Restore compatibility with existing NER pipeline tests
The refactor addresses bugs for corner cases uncovered when testing each scenario of label re-alignment with or without ignore_subwords.
Refactor label re-alignment in NER pipeline and add tests
@LysandreJik I think this is ready for review now.
Hey @elk-cloner, @francescorubbo! That's amazing work you've done here. The added tests are a wonderful addition and will ensure the pipeline is as robust as it can be. To make reviews easier, could you please fill in the PR description or add a comment mentioning the changes? For example:
And optionally, if you have the time to:
If you don't have time to do any of that, that's perfectly fine - just let me know and I'll take care of it as soon as I have a bit of availability. Thanks again for the great work you've done here!
This looks good. I'm wondering if you can add some tests to verify the expected behaviour of two other scenarios from the bug report. Specifically, the tests in the PR seem to ensure: ...but do not make assertions for mixed B/I/O labels in the same word: ...or inner entity labels surrounded by O labels:
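For illustration, the two extra scenarios could look roughly like this (hypothetical label sequences, not the ones from the bug report):

```python
# mixed B/I/O labels within the sub-words of a single word
mixed_bio_word = [
    {"word": "Must", "entity": "B-PER"},
    {"word": "##erman", "entity": "I-PER"},
    {"word": "##n", "entity": "O"},
]

# an inner entity label surrounded by O labels
inner_entity_word = [
    {"word": "New", "entity": "O"},
    {"word": "##found", "entity": "I-LOC"},
    {"word": "##land", "entity": "O"},
]

# a test would feed sentences producing such patterns through the pipeline and
# assert that each word resolves to exactly one label after re-alignment
```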
Subwords can be skipped independently of label realignment.
@joshdevins Thank you for suggesting we test those additional scenarios. Testing for those helped me identify some bugs in the previous implementation. I believe the new test should cover all three scenarios now.
@LysandreJik I'll add the requested notes here, as I don't seem to have permissions to edit the PR description. Maybe @elk-cloner can transfer some of the info there.
label realignment: token predictions for subwords can be realigned with 4 different strategies
New flag
Thank you, @francescorubbo, I added them to the PR.
* finish quicktour * fix import * fix print * explain config default better * Update docs/source/quicktour.rst Co-authored-by: Sylvain Gugger <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]>
* fix docs for decoder_input_ids * revert the changes for bart and mbart
I did merge master (see 031f3ef). Shouldn't it address that?
I think originally there was also mention of saving the strategy used at training/evaluation time to the model config.
@cceyda Yes, this was my original proposal, but I think it might be too much for one PR. I would not close the original issue (#10263) until the other items are addressed, but perhaps a new/smaller PR can address saving the strategy used at training/evaluation time to the model config file.
* Update min versions in README and add Flax * Adapt index
…xt_pair` parameter (huggingface#11486) * Update tokenization_utils_base.py * add assertion * check batch len * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <[email protected]> * add error message Co-authored-by: Sylvain Gugger <[email protected]>
The `aggregation_strategy` argument can be either a string or an AggregationStrategy enum. If a string, we attempt to cast it into the corresponding AggregationStrategy enum.
Given that the label realignment is now only applied when subwords are ignored, the default strategy does not need to reset the score for all subwords.
…r/transformers into ner_label_re_alignment
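A sketch of the string-to-enum handling described in the `aggregation_strategy` commit above, assuming an AggregationStrategy enum like the one sketched earlier (not the PR's exact code):

```python
def resolve_aggregation_strategy(aggregation_strategy):
    # accept either an AggregationStrategy member or its string value
    if isinstance(aggregation_strategy, str):
        aggregation_strategy = AggregationStrategy(aggregation_strategy.lower())
    return aggregation_strategy
```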
ugh... this ^ is why I hate rebasing on big project repos...
@LysandreJik @sgugger Is there more work needed for this PR? If the rebase is an issue, I can create a new PR with only the relevant changes, but we would lose the commit history.
We can't see the diff of the PR anymore after the rebase, so you should close this one and open a new one from the same branch, please. (GitHub completely sucks at properly showing rebases, unless you force push after the rebase.)
What does this PR do?
Fixes #10263
Who can review?
What capabilities have been added?
label realignment: token predictions for subwords can be realigned with 4 different strategies (a rough usage sketch follows at the end of this description)
What are the expected changes from the current behavior?
Example use cases with code sample enabled by the PR
Previous use cases with code sample that see the behavior changes
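As a rough usage sketch (a hypothetical invocation based on the `aggregation_strategy` naming discussed in this thread, not the original code sample from this PR description):

```python
from transformers import pipeline

# re-align sub-word predictions by averaging their label scores
ner = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    aggregation_strategy="average",
)
print(ner("Max Mustermann lives in Berlin."))
```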