
@elk-cloner elk-cloner commented Mar 6, 2021

What does this PR do?

Fixes #10263


What capabilities have been added?

label realignment: token predictions for subwords can be realigned with 4 different strategies

  • default: reset all subword token predictions except for the first token
  • first: the prediction for the first token in the word is assigned to all subword tokens
  • max: the highest-confidence prediction among the subword tokens is assigned to all subword tokens
  • average: the average pool of the predictions over all subwords is assigned to all subword tokens

ignore subwords: separately, subwords can be ignored by merging subword tokens into whole words
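As a rough illustration of the four strategies, the per-subword score matrix of a single word could be realigned as sketched below (a standalone sketch, not the PR's implementation; the helper name and array shapes are assumptions):

```python
import numpy as np

def realign_word_scores(scores: np.ndarray, strategy: str) -> np.ndarray:
    """Return one realigned score row per subword of a single word.

    scores: (n_subwords, n_labels) softmax scores for the word's subwords.
    """
    if strategy == "default":
        # reset all subword predictions except for the first token
        out = np.zeros_like(scores)
        out[0] = scores[0]
        return out
    if strategy == "first":
        # assign the first subword's prediction to all subword tokens
        return np.tile(scores[0], (scores.shape[0], 1))
    if strategy == "max":
        # assign the highest-confidence subword prediction to all subword tokens
        best = scores.max(axis=-1).argmax()
        return np.tile(scores[best], (scores.shape[0], 1))
    if strategy == "average":
        # assign the average pool of all subword predictions to all subword tokens
        return np.tile(scores.mean(axis=0), (scores.shape[0], 1))
    raise ValueError(f"Unknown strategy: {strategy}")
```

After realignment, taking `argmax` per row yields one consistent label per word, which is what the merging step relies on.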

What are the expected changes from the current behavior?

  • New flag subword_label_re_alignment enables realignment.
  • Already existing flag ignore_subwords actually enables merging subwords.

Example use cases with code sample enabled by the PR

ner = transformers.pipeline(
    'ner',
    model='elastic/distilbert-base-cased-finetuned-conll03-english',
    tokenizer='elastic/distilbert-base-cased-finetuned-conll03-english',
    ignore_labels=[],
    ignore_subwords=False,
    subword_label_re_alignment='average'
)
ner('Mark Musterman')
[
    {
        'word': 'Mark',
        'score': 0.999686598777771,
        'index': 1,
        'start': 0,
        'end': 4,
        'is_subword': False,
        'entity': 'B-PER'
    },
    {
        'word': 'Must',
        'score': 0.9995412826538086,
        'index': 2,
        'start': 5,
        'end': 9,
        'is_subword': False,
        'entity': 'I-PER'
    },
    {
        'word': '##erman',
        'score': 0.9996127486228943,
        'index': 3,
        'start': 9,
        'end': 14,
        'is_subword': True,
        'entity': 'I-PER'
    }
]
ner = transformers.pipeline(
    'ner',
    model='elastic/distilbert-base-cased-finetuned-conll03-english',
    tokenizer='elastic/distilbert-base-cased-finetuned-conll03-english',
    ignore_labels=[],
    ignore_subwords=True,
    subword_label_re_alignment='average'
)
ner('Mark Musterman')
[
    {
        'word': 'Mark',
        'score': 0.999686598777771,
        'index': 1,
        'start': 0,
        'end': 4,
        'is_subword': False,
        'entity': 'B-PER'
    },
    {
        'word': 'Musterman',
        'score': 0.9995412826538086,
        'index': 2,
        'start': 5,
        'end': 9,
        'is_subword': False,
        'entity': 'I-PER'
    }
]

Previous use cases with code sample that see the behavior changes

ner = transformers.pipeline(
    'ner',
    model='elastic/distilbert-base-cased-finetuned-conll03-english',
    tokenizer='elastic/distilbert-base-cased-finetuned-conll03-english',
    ignore_labels=[],
    ignore_subwords=True
)
ner('Mark Musterman')
[
    {
        'word': 'Mark',
        'score': 0.999686598777771,
        'entity': 'B-PER',
        'index': 1,
        'start': 0,
        'end': 4
    },
    {
        'word': 'Must',
        'score': 0.9995412826538086,
        'entity': 'I-PER',
        'index': 2,
        'start': 5,
        'end': 9
    },
    {
        'word': '##erman',
        'score': 0.9996127486228943,
        'entity': 'I-PER',
        'index': 3,
        'start': 9,
        'end': 14
    }
]

input_ids = tokens["input_ids"].cpu().numpy()[0]

score = np.exp(entities) / np.exp(entities).sum(-1, keepdims=True)
labels_idx = score.argmax(axis=-1)
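The softmax in the snippet above can be checked on a toy logit matrix (illustrative values only; the variant here subtracts the row max first, a standard trick for numerical stability that does not change the result):

```python
import numpy as np

# toy logits for 2 tokens over 3 labels (illustrative values only)
entities = np.array([[2.0, 1.0, 0.1],
                     [0.5, 2.5, 0.5]])

# softmax as in the snippet above, shifted by the row max for numerical stability
shifted = entities - entities.max(-1, keepdims=True)
score = np.exp(shifted) / np.exp(shifted).sum(-1, keepdims=True)

# each row now sums to 1; argmax picks the most likely label per token,
# while the full score distribution is kept for the "average" strategy
labels_idx = score.argmax(axis=-1)
```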
Contributor Author

Because we are going to set the labels according to the strategy, we need the scores for all labels, especially when using the "average" strategy.

(idx, label_idx)
for idx, label_idx in enumerate(labels_idx)
if (self.model.config.id2label[label_idx] not in self.ignore_labels) and not special_tokens_mask[idx]
idx for idx in range(score.shape[0]) if not special_tokens_mask[idx]
Contributor Author

At this step we can only filter on special_tokens_mask, because we don't yet have the labels for the other tokens.

"word": word,
"score": score[idx][label_idx].item(),
"entity": self.model.config.id2label[label_idx],
"score": score[idx],
Contributor Author

We need the scores for all labels.


entities += [entity]

if self.subword_label_re_alignment:
Contributor Author

We are going to set the labels according to the strategy; if subword_label_re_alignment == False, we leave the labels as they were predicted.


def sub_words_label(sub_words: List[dict]) -> dict:
score = np.stack([sub["score"] for sub in sub_words])
if strategy == "default":
Contributor Author

as @joshdevins said: "If training with padded sub-words/label for first sub-word only, e.g. Max Mustermann → Max Must ##erman ##n → B-PER I-PER X X
Use the label from the first sub-word (default)"

task: str = "",
grouped_entities: bool = False,
subword_label_re_alignment: Union[bool, str] = False,
ignore_subwords: bool = False,
Contributor

Should the ignore_subwords flag be removed then?

Contributor Author

I think so; we should remove the ignore_subwords flag. @LysandreJik, @Narsil, I left some of the old code in just to pass the tests (😅 I'm new to tests). Can you help me with the tests? (For example, should I remove this test or change it somehow?)

Contributor

Yeah, I would think the test can be repurposed for the new flag. It would also be good to assert correctness besides execution (it looks like the current test doesn't check the resulting output?). I'm happy to contribute directly to your branch, if it helps. Let me know.

Member

It would be nice to keep it for backwards compatibility purposes. Can the capabilities enabled by that flag be achieved with the new flag introduced in this PR?

ignore_labels=["O"],
task: str = "",
grouped_entities: bool = False,
subword_label_re_alignment: Union[bool, str] = False,
Contributor

I wonder if aggregate_subwords would be a more suitable name?

Member

I would understand aggregate_subwords better than subword_label_re_alignment

Member

Or would aggregate_strategy be even better, as we're actually prompting for a strategy?

Member

Having this accept enum parameters as value would be great, similar to what we do with PaddingStrategy:

class PaddingStrategy(ExplicitEnum):
"""
Possible values for the ``padding`` argument in :meth:`PreTrainedTokenizerBase.__call__`. Useful for tab-completion
in an IDE.
"""
LONGEST = "longest"
MAX_LENGTH = "max_length"
DO_NOT_PAD = "do_not_pad"
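Following the PaddingStrategy pattern, the strategy argument could be exposed as an enum. The sketch below uses the stdlib enum module for self-containment (transformers' ExplicitEnum mainly adds a friendlier error message); the name AggregationStrategy and its members are only the suggestion from this thread, not settled API:

```python
from enum import Enum

class AggregationStrategy(str, Enum):
    """Possible values for the subword aggregation strategy.
    Useful for tab-completion in an IDE."""
    DEFAULT = "default"
    FIRST = "first"
    MAX = "max"
    AVERAGE = "average"

# a plain string can be cast into the corresponding member,
# mirroring how the padding argument accepts both forms
strategy = AggregationStrategy("average")
```

Subclassing str keeps equality with plain strings working, so call sites that pass `"average"` and call sites that pass `AggregationStrategy.AVERAGE` behave the same.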

def set_subwords_label(self, entities: List[dict], strategy: str) -> dict:
def sub_words_label(sub_words: List[dict]) -> dict:
score = np.stack([sub["score"] for sub in sub_words])
if strategy == "default":
Contributor

What happens if strategy is set to True?

Contributor Author

Sorry, my bad, you're right. When strategy is True we should fall back to the "default" strategy. I'll fix this.

@francescorubbo (Contributor)

Thank you for addressing this! I left some minor comments/questions.

@elk-cloner elk-cloner left a comment

@francescorubbo it would be great if you could fix the tests; let me know if you have any questions about my code.


Ensure existing behavior for `ignore_subwords` and `grouped_entities`
arguments is preserved for backward compatibility.
elk-cloner and others added 4 commits March 21, 2021 09:39
Restore compatibility with existing NER pipeline tests
The refactor addresses bugs for corner cases uncovered when testing each
scenario of label re-alignment with or without ignore_subwords.
Refactor label re-alignment in NER pipeline and add tests
@francescorubbo (Contributor)

@LysandreJik I think this is ready for review now.

@LysandreJik (Member)

Hey @elk-cloner, @francescorubbo! That's amazing work you've done here. The added tests are a wonderful addition and will ensure the pipeline is as robust as it can be.

To make reviews easier, could you please fill in the PR description or add a comment mentioning the changes? For example:

  • What capabilities have been added
  • What are the expected changes from the current behavior

And optionally, if you have the time to:

  • Example use cases with code sample enabled by the PR
  • Previous use cases with code sample that see the behavior changes

If you don't have time to do any of that, that's perfectly fine - just let me know and I'll take care of it as soon as I have a bit of availability.

Thanks again for the great work you've done here!

@elk-cloner elk-cloner closed this Mar 30, 2021
@elk-cloner elk-cloner reopened this Mar 30, 2021

joshdevins commented Mar 30, 2021

This looks good. I'm wondering if you can add some tests to verify the expected behaviour of two other scenarios from the bug report.

Specifically, the tests in the PR seem to ensure:
Accenture → A ##cc ##ent ##ure → B-ORG O O O → Accenture (ORG)

...but does not make assertions for mixed B/I/O labels in the same word:
Max Mustermann → Max Must ##erman ##n → B-PER I-PER I-PER O → Max Mustermann (PER)

...or inner entity labels surrounded by O labels:
Elasticsearch → El ##astic ##sea ##rch → O O I-MISC O → Elasticsearch (MISC)
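The three scenarios above can be captured as pure-function test cases against a word-level resolver. The helper below is only a hypothetical sketch, not the PR's code; it keeps the first non-O entity type seen among a word's subwords, which is enough to satisfy all three expected outcomes:

```python
from typing import List, Optional

def resolve_word_entity(subword_labels: List[str]) -> Optional[str]:
    """Collapse per-subword BIO labels into one entity type for the word,
    or None if every subword is tagged O."""
    for label in subword_labels:
        if label != "O":
            return label.split("-", 1)[1]  # strip the B-/I- prefix
    return None

# the three scenarios from the bug report
assert resolve_word_entity(["B-ORG", "O", "O", "O"]) == "ORG"          # Accenture
assert resolve_word_entity(["B-PER", "I-PER", "I-PER", "O"]) == "PER"  # Mustermann
assert resolve_word_entity(["O", "O", "I-MISC", "O"]) == "MISC"        # Elasticsearch
```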


francescorubbo commented Apr 4, 2021

@joshdevins Thank you for suggesting to test those additional scenarios. Testing for those helped me identify some bugs in the previous implementation. I believe the new test should cover all three scenarios now.


francescorubbo commented Apr 4, 2021

@LysandreJik I'll add the requested notes here, as I don't seem to have permissions to edit the PR description. Maybe @elk-cloner can transfer some of the info there.


@elk-cloner (Contributor, Author)

Thank you, @francescorubbo, I added them to the PR.

Hamel Husain and others added 2 commits April 27, 2021 10:04
* finish quicktour

* fix import

* fix print

* explain config default better

* Update docs/source/quicktour.rst

Co-authored-by: Sylvain Gugger <[email protected]>

Co-authored-by: Sylvain Gugger <[email protected]>
* fix docs for decoder_input_ids

* revert the changes for bart and mbart

francescorubbo commented Apr 28, 2021

There was a new release of the black library which touched a lot of files, so you will need to rebase your PR on master to have the quality tests pass again.

I did merge master (see 031f3ef). Shouldn't it address that?


cceyda commented Apr 28, 2021

I think originally there was also mention of saving the aggregation_strategy to the model config?
since it makes the most sense to use the same strategy the model was trained on, ignoring subwords or else.

@joshdevins (Contributor)

I think originally there was also mention of saving the aggregation_strategy to the model config?
since it makes the most sense to use the same strategy the model was trained on, ignoring subwords or else.

@cceyda Yes, this was my original proposal, but I think it might be too much for one PR. I would not close the original issue (#10263) until the other items are addressed, but perhaps a new/smaller PR can address saving the strategy used at training/evaluation time to the model config file.

sgugger and others added 19 commits April 28, 2021 09:10
* Update min versions in README and add Flax

* Adapt index
…xt_pair` parameter (huggingface#11486)

* Update tokenization_utils_base.py

* add assertion

* check batch len

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add error message

Co-authored-by: Sylvain Gugger <[email protected]>
Ensure existing behavior for `ignore_subwords` and `grouped_entities`
arguments is preserved for backward compatibility.
The refactor addresses bugs for corner cases uncovered when testing each
scenario of label re-alignment with or without ignore_subwords.
Subwords can be skipped independently of label realignment.
The `aggregation_strategy` argument can be either string
or an AggregationStrategy enum. If a string, we attempt to cast
into the corresponding AggregationStrategy enum.
Given that the label realignment is now only applied
when subwords are ignored, the default strategy does
not need to reset the score for all subwords.
@francescorubbo (Contributor)

ugh...this ^ is why I hate rebasing on big project repos...
@sgugger from a cursory look the 215 (!) file diffs look legit, please let me know if this PR needs any more work before you can merge.

@francescorubbo (Contributor)

@LysandreJik @sgugger Is there more work needed for this PR? If the rebase is an issue, I can create a new PR with only the relevant changes, but we would lose the commit history.


sgugger commented May 6, 2021

We can't see the diff of the PR anymore after the rebase, so you should close this one and open a new one from the same branch please. (GitHub completely sucks at properly showing rebases, unless you force push after the rebase.)

Linked issue: NER label re-alignment always expects B labelled first sub-words (#10263)