
Add Doc Test for BERT #16523

Merged
ydshieh merged 14 commits into huggingface:main from vumichien:doc-test-sprint
Apr 11, 2022

Conversation

@vumichien
Contributor

What does this PR do?

Add doc tests for BERT, as part of issue #16292.
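
For context, the kind of runnable sample this sprint adds to the docstrings looks roughly like the sketch below (illustrative only; the checkpoint and expected value here mirror the bert-base-uncased masked-LM result discussed later in this thread):

>>> from transformers import BertTokenizer, BertForMaskedLM
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
>>> model = BertForMaskedLM.from_pretrained("bert-base-uncased")

>>> inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> # locate the [MASK] token and decode the highest-scoring prediction
>>> mask_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
>>> tokenizer.decode(logits[0, mask_index].argmax(axis=-1))
'paris'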

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@patrickvonplaten, @ydshieh
Documentation: @sgugger

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 31, 2022

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh self-requested a review March 31, 2022 16:18
@ydshieh
Collaborator

ydshieh commented Apr 1, 2022

Hi, @vumichien,

Thank you for this PR. I have only had a quick look so far, and overall it looks good 😊.

I saw that you use different checkpoints for the PyTorch & TensorFlow BERT models in some cases. I understand why: some checkpoints only exist in a PyTorch version.

After an internal discussion, although we think it is totally fine, we would prefer to use the same checkpoints for the PyTorch & TensorFlow models. I will therefore try to convert some of the PyTorch checkpoints you used in this PR and upload them to the Hugging Face Hub. Once that is done, we can change the checkpoints in this PR and update the expected values (which should just be a matter of copying the values from the PyTorch side).

I will keep you updated 😊!

@vumichien
Contributor Author

vumichien commented Apr 1, 2022

@ydshieh Thank you very much. I am looking forward to your update.
Besides that, could you please take a look at modeling_mobilebert.py? When I run make fixup, it asks me to run make fix-copies. If I follow that, the model checkpoint, expected_output, expected_loss, etc. will be copied from bert to mobilebert, which leads to unwanted results.
I think it is due to these # Copied from comments:

# Copied from transformers.models.bert.modeling_bert.BertForQuestionAnswering with Bert->MobileBert all-casing
class MobileBertForQuestionAnswering(MobileBertPreTrainedModel):
    _keys_to_ignore_on_load_unexpected = [r"pooler"]

# Copied from transformers.models.bert.modeling_bert.BertForMultipleChoice with Bert->MobileBert all-casing
class MobileBertForMultipleChoice(MobileBertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)

# Copied from transformers.models.bert.modeling_bert.BertForTokenClassification with Bert->MobileBert all-casing
class MobileBertForTokenClassification(MobileBertPreTrainedModel):
    _keys_to_ignore_on_load_unexpected = [r"pooler"]

So what should I do to avoid this problem?

@ydshieh
Collaborator

ydshieh commented Apr 1, 2022

Copied from transformers.models.bert.modeling_bert.BertForQuestionAnswering with Bert->MobileBert all-casing

Hi, this is quite tricky, and I need to discuss it with the team.
The # Copied from comment is used to keep track of the origin of some code blocks, and it's much better to keep as many of them as possible. Thank you for pointing out this issue!
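
For readers unfamiliar with the mechanism, the sketch below illustrates roughly what the with Bert->MobileBert all-casing directive does (simplified; the real check lives in utils/check_copies.py and also re-formats the code before diffing):

def apply_all_casing(source: str, old: str = "Bert", new: str = "MobileBert") -> str:
    """Re-derive the copied block by renaming the model in every casing variant."""
    for o, n in ((old, new), (old.lower(), new.lower()), (old.upper(), new.upper())):
        source = source.replace(o, n)
    return source

# `make fix-copies` overwrites the target block with this derived version,
# which is why hand-edited values in the mobilebert file get clobbered.
print(apply_all_casing('checkpoint="bert-base-uncased",'))
# -> checkpoint="mobilebert-base-uncased",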

@ydshieh
Collaborator

ydshieh commented Apr 1, 2022

@vumichien

I uploaded the TensorFlow checkpoints here:

For TFBertForSequenceClassification:

"ydshieh/bert-base-uncased-yelp-polarity"

For TFBertForQuestionAnswering:

"ydshieh/bert-base-cased-squad2"

Whenever you have time, could you change the corresponding places in modeling_tf_bert.py to use these checkpoints?
Ping me if you encounter any difficulty.

Thanks!

@vumichien
Contributor Author

@ydshieh Could you please check your checkpoint ydshieh/bert-base-cased-squad2 for TFBertForQuestionAnswering again? Every time I run it, the output is different (maybe the head weights are still randomly initialized).

@ydshieh
Collaborator

ydshieh commented Apr 4, 2022

@ydshieh Could you please check your checkpoint ydshieh/bert-base-cased-squad2 for TFBertForQuestionAnswering again? Every time I run it, the output is different (maybe the head weights are still randomly initialized).

OK, I will take a look today.

@vumichien

I checked it, and it turns out that I loaded the PyTorch (QA) checkpoint into a TF model for another task type!
Sorry about this, I will upload the corrected TF checkpoint today, and keep you updated!

Uploaded the correct TF checkpoint 🙂 and tested it myself. The result is now always the same (and the same as the PT result).
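
As an aside, a quick way to check for this kind of problem is to load the checkpoint twice and compare the logits (a hypothetical snippet, not part of the PR; from_pretrained also logs a warning listing any newly initialized weights):

import numpy as np
from transformers import BertTokenizer, TFBertForQuestionAnswering

checkpoint = "ydshieh/bert-base-cased-squad2"
tokenizer = BertTokenizer.from_pretrained(checkpoint)
inputs = tokenizer("Who was Jim Henson?", "Jim Henson was a nice puppet", return_tensors="tf")

# Two independent loads: identical logits mean the QA head weights come from
# the checkpoint; differing logits mean they were randomly initialized.
logits = [
    TFBertForQuestionAnswering.from_pretrained(checkpoint)(**inputs).start_logits.numpy()
    for _ in range(2)
]
print("deterministic:", np.allclose(logits[0], logits[1]))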

@ydshieh
Collaborator

ydshieh commented Apr 4, 2022

Hi, @vumichien ,

Regarding the fix-copies issue you mentioned in a previous comment: after some discussion, we think the best approach is:

  • In Bert model file:

    • at the beginning part of the model file, set something like
      • EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION = ...
      • EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION = ...
    • In add_code_sample_docstrings for BertForTokenClassification, use
      • expected_output=EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION
      • expected_loss=EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION
  • In MobileBert model file:

    • at the beginning part of the model file, set something like
      • EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION = ... (set to empty string if you don't want to work on doctest for MobileBert)
      • EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION = ... (set to empty string if you don't want to work on doctest for MobileBert)
    • In add_code_sample_docstrings for MobileBertForTokenClassification, use
      • expected_output=EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION
      • expected_loss=EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION

And if you don't provide the values for MobileBert, just don't add it to documentation_tests.txt.

This should avoid the issue coming from the copies. Let us know if you encounter any difficulty, thanks!
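
A condensed sketch of this layout, assuming the constant names above (the imports and surrounding code are simplified, and the import path may differ across versions):

# modeling_bert.py (sketch)
from transformers.utils.doc import add_code_sample_docstrings
from transformers.modeling_outputs import TokenClassifierOutput

_CHECKPOINT_FOR_DOC = "bert-base-uncased"
_CONFIG_FOR_DOC = "BertConfig"

# Module-level values keep the class bodies identical between bert and
# mobilebert, so `make fix-copies` has nothing to flag.
EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION = "['O', 'I-ORG', 'I-ORG', ...]"
EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION = 0.01


class BertForTokenClassification:
    @add_code_sample_docstrings(
        checkpoint=_CHECKPOINT_FOR_DOC,
        output_type=TokenClassifierOutput,
        config_class=_CONFIG_FOR_DOC,
        expected_output=EXPECTED_OUTPUT_FOR_TOKEN_CLASSIFICATION,
        expected_loss=EXPECTED_LOSS_FOR_TOKEN_CLASSIFICATION,
    )
    def forward(self, input_ids=None, labels=None):
        ...

# modeling_mobilebert.py defines the same two names with its own values
# (or empty strings to skip the doctest), keeping the copied class body
# byte-for-byte identical.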

@vumichien
Contributor Author

Thank you very much @ydshieh. I will update the doc tests following your instructions.

@vumichien
Contributor Author

vumichien commented Apr 5, 2022

Hi @ydshieh

  • I have followed your instructions (define-expected-output) but the problem isn't solved. Please help me check whether I did something wrong.
  • In this PR, I think it's better if I do both bert and mobilebert at the same time, but it still requires fix-copies.
  • I am still waiting for your update on "The loss is always 0.0 (or nan) because the shape of input_ids is [1, 14]" with the QuestionAnswering task 🙂. It leads to problems with MobileBertForQuestionAnswering as well.

@ydshieh
Collaborator

ydshieh commented Apr 5, 2022

Hi @vumichien, let me check both the fix-copies and the loss-0.0 issues.

@ydshieh
Collaborator

ydshieh commented Apr 5, 2022

@vumichien

Regarding fix-copies, I probably should have been more thorough in my previous comment.
One reason the current PR version doesn't work:

In BertForSequenceClassification: you have

checkpoint="textattack/bert-base-uncased-yelp-polarity",

however, in MobileBertForSequenceClassification, you have

checkpoint="lordtt13/emo-mobilebert",

And this difference will be detected by make fix-copies.

In general, these situations could be solved using the same method:

  • set _CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION = ... at the beginning of the model file
  • set checkpoint=_CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION in xxxForSequenceClassification model.

The same approach applies to the other tasks, e.g. xxxForQuestionAnswering; see the sketch below.
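
In other words, a minimal sketch using the two checkpoints quoted above:

# modeling_bert.py
_CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION = "textattack/bert-base-uncased-yelp-polarity"

# modeling_mobilebert.py
_CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION = "lordtt13/emo-mobilebert"

# Inside the copied class, both files then carry the identical line
#     checkpoint=_CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION,
# so the checkpoints can differ while `make fix-copies` stays green.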

Could you try it and let me know if you still have problems with this, please?

In the meantime, I am going to check the loss 0.0.

@ydshieh
Collaborator

ydshieh commented Apr 5, 2022

@patrickvonplaten

I think we need to update PT_QUESTION_ANSWERING_SAMPLE.

The indices (hard-coded) below

>>> target_start_index, target_end_index = torch.tensor([14]), torch.tensor([15])

actually depend on the tokenizer. See the code snippet below comparing Albert vs. Roberta.
Let me know your thoughts on this, thanks!

Code Snippet

from transformers import AlbertTokenizer, RobertaTokenizer

albert_checkpoint = "twmkn9/albert-base-v2-squad2"
roberta_checkpoint = "deepset/roberta-base-squad2"

albert_tokenizer = AlbertTokenizer.from_pretrained(albert_checkpoint)
roberta_tokenizer = RobertaTokenizer.from_pretrained(roberta_checkpoint)

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

albert_input_ids = albert_tokenizer(question, text, return_tensors="pt").input_ids.numpy().tolist()[0]
roberta_input_ids = roberta_tokenizer(question, text, return_tensors="pt").input_ids.numpy().tolist()[0]

albert_decoded_tokens = albert_tokenizer.convert_ids_to_tokens(albert_input_ids)
roberta_decoded_tokens = roberta_tokenizer.convert_ids_to_tokens(roberta_input_ids)

# Albert
print(f"Albert: tokens = {albert_decoded_tokens}")
print(f"Albert: num_tokens = {len(albert_decoded_tokens)}")
print(f"Albert: position of `▁nice`: {albert_decoded_tokens.index('▁nice')}\n")

# Roberta
print(f"Roberta: tokens = {roberta_decoded_tokens}")
print(f"Roberta: num_tokens = {len(roberta_decoded_tokens)}")
print(f"Roberta: position of `Ġnice`: {roberta_decoded_tokens.index('Ġnice')}")

Outputs

Albert: tokens = ['[CLS]', '▁who', '▁was', '▁jim', '▁henson', '?', '[SEP]', '▁jim', '▁henson', '▁was', '▁a', '▁nice', '▁puppet', '[SEP]']
Albert: num_tokens = 14
Albert: position of `▁nice`: 11

Roberta: tokens = ['<s>', 'Who', 'Ġwas', 'ĠJim', 'ĠH', 'enson', '?', '</s>', '</s>', 'Jim', 'ĠH', 'enson', 'Ġwas', 'Ġa', 'Ġnice', 'Ġpuppet', '</s>']
Roberta: num_tokens = 17
Roberta: position of `Ġnice`: 14

@vumichien
Contributor Author

vumichien commented Apr 5, 2022

@ydshieh Thank you very much for the clear explanation. Now I understand why we should do it this way to get around the fix-copies problem 🙏

@ydshieh
Collaborator

ydshieh commented Apr 6, 2022

Regarding the issue mentioned here, one solution is to pass target_start_index and target_end_index to the sample defined in doc.py (we would need to update the sample to accept these).

I don't think there is a very good heuristic to determine these targets in the sample directly. Even with these two tokenizers, we already have '▁nice' vs. 'Ġnice'.

Let's wait for Patrick's response.
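
In the meantime, a sketch of what that doc.py change could look like (illustrative only, not necessarily what was eventually implemented):

# Illustrative sketch: the QA sample takes the targets as template fields
# instead of hard-coding torch.tensor([14]) and torch.tensor([15]).
PT_QUESTION_ANSWERING_SAMPLE = """
>>> target_start_index = torch.tensor([{qa_target_start_index}])
>>> target_end_index = torch.tensor([{qa_target_end_index}])
>>> outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
>>> round(outputs.loss.item(), 2)
{expected_loss}
"""


def render_qa_sample(qa_target_start_index=14, qa_target_end_index=15, expected_loss=0.0):
    # The docstring decorator would format the template with per-model values.
    return PT_QUESTION_ANSWERING_SAMPLE.format(
        qa_target_start_index=qa_target_start_index,
        qa_target_end_index=qa_target_end_index,
        expected_loss=expected_loss,
    )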

@patrickvonplaten
Contributor

Hey @ydshieh and @vumichien,

IMO the best thing we can and should do here is to let the user pass the label idx.

@ydshieh ydshieh mentioned this pull request Apr 7, 2022
@ydshieh
Collaborator

ydshieh commented Apr 8, 2022

@vumichien You can ignore the failed run_tests_hub job. I will check the PR later.

@vumichien
Contributor Author

@ydshieh Thank you very much

Collaborator

@ydshieh ydshieh left a comment

Hi, @vumichien

Very high quality PR :-)
I left a few comments and marked a few places to be addressed.
I think we are very close to merging this PR 💯

I haven't checked the part regarding mobilebert, but if my comments on bert also apply to mobilebert, please address them there too :-)

Thanks a lot.

checkpoint=_CHECKPOINT_FOR_DOC,
output_type=CausalLMOutputWithCrossAttentions,
config_class=_CONFIG_FOR_DOC,
)
Collaborator

super!

the model is configured as a decoder.
encoder_attention_mask (`torch.FloatTensor` of shape `(batch_size, sequence_length)`, *optional*):
Mask to avoid performing attention on the padding token indices of the encoder input. This mask is used in
the cross-attention if the model is configured as a decoder. Mask values selected in `[0, 1]`:
Collaborator

More than super! Thanks for the fix

Comment on lines +1219 to +1223
>>> input_ids = tokenizer("Hello, my dog is cute", add_special_tokens=True, return_tensors="tf")
>>> # Batch size 1

>>> outputs = model(input_ids)
>>> prediction_scores, seq_relationship_scores = outputs[:2]
>>> prediction_logits, seq_relationship_logits = outputs[:2]
Collaborator

Very nice :-)

Contributor Author

thank you :D

Comment on lines 1307 to 1308
expected_output="'P a r i s'",
expected_loss=0.81,
Collaborator

I think it's better to change _CHECKPOINT_FOR_DOC from "bert-base-cased" to "bert-base-uncased", so it matches the one defined and used in the PyTorch Bert file.

And the result should then be the same as in PyTorch Bert: paris and 0.88.

Contributor Author

I have changed _CHECKPOINT_FOR_DOC as you suggested.

Comment on lines 1859 to 1863
checkpoint=_CHECKPOINT_FOR_TOKEN_CLASS,
output_type=TFTokenClassifierOutput,
config_class=_CONFIG_FOR_DOC,
expected_output="['O', 'I-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'O', 'O', 'O', 'I-LOC', 'O', 'I-LOC', 'I-LOC']",
expected_loss=0.01,
Collaborator

Probably it's not necessary here, but since you already defined

# TokenClassification docstring
_CHECKPOINT_FOR_TOKEN_CLASS = "dbmdz/bert-large-cased-finetuned-conll03-english"
_TOKEN_CLASS_EXPECTED_OUTPUT = (
    "['O', 'I-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'O', 'O', 'O', 'I-LOC', 'O', 'I-LOC', " "'I-LOC'] "
)
_TOKEN_CLASS_EXPECTED_LOSS = 0.01

let's just reuse those variables :-)
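
i.e. the suggested change amounts to replacing lines 1859-1863 with (sketch):

checkpoint=_CHECKPOINT_FOR_TOKEN_CLASS,
output_type=TFTokenClassifierOutput,
config_class=_CONFIG_FOR_DOC,
expected_output=_TOKEN_CLASS_EXPECTED_OUTPUT,
expected_loss=_TOKEN_CLASS_EXPECTED_LOSS,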

Contributor Author

OMG, I forgot to use them. Thank you for pointing them out ^^!

Comment on lines 1954 to 1955
expected_output="'a nice puppet'",
expected_loss=7.41,
Collaborator

We can reuse the variables you defined at the beginning

Contributor Author

Yeah, I totally agree

Collaborator

@sgugger sgugger left a comment

Very nice PR indeed, thanks a lot!

Collaborator

@ydshieh ydshieh left a comment

Hi @vumichien, this is ready to merge. I suggested some final changes (you can commit them directly, in a batch):

  • Use _CHECKPOINT_FOR_SEQUENCE_CLASSIFICATION instead of _CHECKPOINT_FOR_SEQ_CLASS, etc. (just a style preference, not a big deal).

  • p a r i s -> paris: there is a bug in TF_MASKED_LM_SAMPLE, which is fixed in #16698. (No need to rebase or merge it into this PR.)

@sgugger Could you give a final approval when you have some time, thanks.
(Just saw you approved!)

Thanks a lot for this PR, @vumichien. Let's merge it once the suggestions are committed!

vumichien and others added 4 commits April 11, 2022 21:41
@ydshieh
Collaborator

ydshieh commented Apr 11, 2022

All good :-) I'll merge it now. Thank you again ❤️ @vumichien

@ydshieh ydshieh merged commit 2831826 into huggingface:main Apr 11, 2022
@vumichien
Contributor Author

@ydshieh You, too. Thanks a lot for helping with this PR 🙏

elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>