How to run inference after fine-tuning a classifier in the BERT script? #662
Comments
I was able to reproduce the problem. Also, I noticed a warning:
Looks like the test.tsv file for CoLA only has two columns, so this index is a typo (which should have been 1): https://github.com/dmlc/gluon-nlp/blob/master/scripts/bert/dataset.py#L301
@szha So how do I solve it?
@szha When I looked at test.tsv in the CoLA dataset, I found that the index of the sentence column is 1, so I changed the code in the COLADataset class:
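The edit is roughly of this kind (a plain-Python sketch rather than the actual COLADataset class, assuming the standard CoLA layout: test.tsv has a header row plus index and sentence columns, while train/dev keep the label in column 1 and the sentence in column 3):

```python
# Sketch only: read CoLA splits with the sentence taken from column 1 for
# test.tsv and from column 3 for train/dev.
import csv

def read_cola(path, segment):
    """Return [sentence] rows for test, [sentence, label] rows otherwise."""
    with open(path, encoding='utf-8') as f:
        reader = csv.reader(f, delimiter='\t')
        if segment == 'test':
            next(reader)                                    # skip header row
            return [[line[1]] for line in reader]           # sentence only
        return [[line[3], line[1]] for line in reader]      # sentence, label
```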
But it did not work.
@Gpwner this is the fix that I was alluding to in my previous comment, and in my case it fixed the problem. Are you using the class that you edited, and it still does not work?
@szha Yes, it still does not work. Here is my preprocess_data():
@Gpwner thanks for raising this issue. It looks like the test set does not have any label, so the implementation of the dataset transform needs to be changed to handle label-less records, and a separate transform should be used just for the test set.
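A minimal sketch of that idea (not the script's actual transform class; `tokenize_fn` here is an assumed callable that maps a sentence to `(input_ids, valid_length, segment_ids)`):

```python
# Sketch: one transform with a has_label switch, so train/dev keep the label
# while the unlabeled test split skips it.
class ClassifierTransform:
    def __init__(self, tokenize_fn, has_label=True):
        self._tokenize_fn = tokenize_fn
        self._has_label = has_label

    def __call__(self, *record):
        sentence = record[0]
        input_ids, valid_length, segment_ids = self._tokenize_fn(sentence)
        if self._has_label:
            label = record[-1]
            return input_ids, valid_length, segment_ids, label
        return input_ids, valid_length, segment_ids
```

Train/dev would then use `ClassifierTransform(tokenize_fn, has_label=True)`, while the test set gets its own instance with `has_label=False`.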
@eric-haibin-lin I tried your solution, but it still does not work. finetune_classifier.py:
dataset.py:
Here is my output:
@szha @eric-haibin-lin For now, my solution is to add fake labels to the test data and change the init function of the CoLA class in dataset.py, and it works.
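Sketched, that workaround looks roughly like this (assuming test.tsv has a header row followed by index and sentence columns; the field indices in the dataset class then have to match the rewritten file):

```python
# Write a copy of test.tsv with a dummy label column appended, so a
# training-style transform that expects a label field can parse it.
import csv

def add_fake_labels(src_path, dst_path, fake_label='0'):
    with open(src_path, encoding='utf-8') as src, \
         open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        reader = csv.reader(src, delimiter='\t')
        writer = csv.writer(dst, delimiter='\t')
        next(reader)                          # drop the test.tsv header row
        for index, sentence in reader:
            writer.writerow([index, sentence, fake_label])
```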
Hi @Gpwner sorry about that. I found another bug in the Transform class: it misses the last entry when the label is not present. The following code should work end to end:
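The gist of that fix, sketched as a small stand-alone helper (illustrative only, not the actual end-to-end code from the comment):

```python
# If the transform unconditionally drops the last field as the "label", an
# unlabeled test row loses its only input field; guarding on has_label keeps
# the whole row intact.
def split_fields(record, has_label):
    """Return (input_fields, label); label is None for the test split."""
    if has_label:
        return record[:-1], record[-1]
    return record, None

# split_fields(['a sentence', '1'], has_label=True)  -> (['a sentence'], '1')
# split_fields(['a sentence'], has_label=False)      -> (['a sentence'], None)
```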
and also you need to update the idx for COLADataset as mentioned before:
This gives the following result with batch_size = 1:
I'll make a PR for this fix shortly. Thanks a lot for reporting this issue!
@eric-haibin-lin I see that your default BERT model is bert_12_768_12. I wonder whether the accuracy of bert_24_1024_16 is better than bert_12_768_12? I have tried the two different models on my own dataset, but the accuracies I got are about the same.
Good question. I have not tried BERT large specifically on CoLA. Did you try multiple seeds? What accuracy do you get currently? On MRPC the performance of BERT large has large variance (which is also reported in the paper) and multiple random seeds are needed to get a good result on the dev set.
@eric-haibin-lin Thanks for the reply. I have tried BERT large on my dataset (it is not CoLA, but it is similar to CoLA), but the accuracy of BERT base and BERT large is about the same.
@Gpwner I'll include the fix in PR #682. BTW - since you have worked on a CoLA-like dataset, would you be interested in contributing the CoLA fine-tuning script command/logs to gluon-nlp, just like RTE and SST in http://gluon-nlp.mxnet.io/model_zoo/bert/index.html#bert-for-sentence-classification-on-glue-tasks?
@eric-haibin-lin it seems that I do not have the access rights to edit http://gluon-nlp.mxnet.io/model_zoo/bert/index.html#bert-for-sentence-classification-on-glue-tasks.
@eric-haibin-lin I just don't know why I get a nan loss when I reach epoch 2:
Thanks! @Gpwner you can edit scripts/bert/index.rst, which contains the content for the website. The nan loss looks strange. Did you try a smaller learning rate? Did the nan loss always happen at epoch 2? |
@eric-haibin-lin I tried several learning rates, including 2e-5, 3e-5, 5e-5, and 10e-5, but I always get the nan loss. I remember the old code did not produce a nan loss. Since there is no big difference between the BERT large model and the BERT base model, I deleted the old code from my disk and am using the official code instead.
@eric-haibin-lin I am using mxnet-cu100==1.5.0b20190427.
@Gpwner that version likely has the regression. The change in mxnet that caused the regression has been reverted recently. Could you try the version
Relevant issues: #690, apache/mxnet#14864
@Gpwner the related issues have been resolved. Let us know in case you still have some trouble with it. |
When I finished the fine-tuning job on CoLA, I wanted to run inference on test.tsv,
but I got zero test samples.
Here is my output:
Here is my command:
python finetune_classifier.py --task_name CoLA --epochs 4 --batch_size 16 --optimizer bertadam --gpu --lr 2e-5 --log_interval 500
Thanks in advance!