The accuracy of bert_spc.py is low #27

Closed
YHTtTtao opened this issue Mar 31, 2019 · 17 comments

@YHTtTtao

I ran the bert_spc.py model and got an accuracy of 65.8% and an F1 of 36.5%. Why is the accuracy not higher? I used the restaurant data. Is it because of the dataset?

@songyouwei
Owner

Note that BERT is very sensitive to hyperparameters on small datasets.

My own experiments show that learning rates of 5e-5, 3e-5, and 2e-5 perform well.
BERT-SPC on the Restaurant dataset: Acc==0.8446, F1==0.7698
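For reference, a minimal sketch of how a learning rate in that range might be wired into the optimizer (plain PyTorch; `model`, `learning_rate`, and `l2reg` are illustrative names, not taken verbatim from train.py):

import torch

# Assumed: `model` is a BERT-based classifier built elsewhere.
# On small datasets BERT is sensitive to the learning rate, so values
# in the 2e-5 to 5e-5 range are usually the first ones to try.
learning_rate = 2e-5
l2reg = 0.01

trainable = filter(lambda p: p.requires_grad, model.parameters())
optimizer = torch.optim.Adam(trainable, lr=learning_rate, weight_decay=l2reg)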


@songyouwei
Owner

Please check the latest committed version.

@YHTtTtao
Author

OK, thank you, I will try it again.

@floAlpha

floAlpha commented Apr 1, 2019

Nice, the expert solved it in the end after all.

songyouwei added the good first issue label on Apr 1, 2019
@fhamborg

fhamborg commented Nov 5, 2019

@songyouwei What are the parameters with which you achieve accuracy values of over 0.80? I'm using ABSA with aen_bert and the default parameters (which are defined in the argparser in train.py), but the accuracy only goes up to 0.59, where it converges. These are the parameters I use:

> training arguments:
>>> model_name: aen_bert
>>> dataset: restaurant
>>> optimizer: <class 'torch.optim.adam.Adam'>
>>> initializer: <function xavier_uniform_ at 0x7f4f29cca7a0>
>>> learning_rate: 2e-05
>>> dropout: 0.1
>>> l2reg: 0.01
>>> num_epoch: 10
>>> batch_size: 16
>>> log_step: 5
>>> embed_dim: 300
>>> hidden_dim: 300
>>> bert_dim: 768
>>> pretrained_bert_name: bert-base-uncased
>>> max_seq_len: 80
>>> polarities_dim: 3
>>> hops: 3
>>> device: cpu
>>> seed: None
>>> valset_ratio: 0
>>> local_context_focus: cdm
>>> SRD: 3
>>> model_class: <class 'models.aen.AEN_BERT'>
>>> dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
>>> inputs_cols: ['text_raw_bert_indices', 'aspect_bert_indices']
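For orientation, the dump above corresponds to argparse defaults roughly like the following (a sketch only; the flag names are inferred from the printed keys and may not match train.py exactly):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--model_name', default='aen_bert', type=str)
parser.add_argument('--dataset', default='restaurant', type=str)
parser.add_argument('--learning_rate', default=2e-5, type=float)
parser.add_argument('--dropout', default=0.1, type=float)
parser.add_argument('--l2reg', default=0.01, type=float)
parser.add_argument('--num_epoch', default=10, type=int)
parser.add_argument('--batch_size', default=16, type=int)
parser.add_argument('--max_seq_len', default=80, type=int)
parser.add_argument('--pretrained_bert_name', default='bert-base-uncased', type=str)
parser.add_argument('--seed', default=None, type=int)
opt = parser.parse_args()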

@songyouwei
Owner

songyouwei commented Nov 5, 2019

@fhamborg Maybe it's because the batch size is small? What is the performance of this set of parameters on bert_spc? I'll check it later.

@fhamborg

fhamborg commented Nov 5, 2019

Hi, thanks for getting back! With the same parameters but on bert_spc, the performance I get on the restaurants set is a bit higher than for aen_bert, but still not in the 80% range:

> val_acc: 0.6455, val_f1: 0.4145
>> test_acc: 0.6670, test_f1: 0.3756

FYI, I attached the full console output: https://gist.github.com/fhamborg/dade525af54a158982967383444fade4

@yangheng95
Contributor

Hello, I just cloned the latest repository and checked the code after the latest PR. I set the parameters consistent with yours, and aen_bert's accuracy on Restaurant reaches 81+. Here is my training log.
training log.txt

@yangheng95
Contributor

Hi guys, I've also tested all of the BERT-based models modified by my latest PR; here are the logs, and they work really well. I hope this helps.
bert_spc training log.txt
lcf_bert training log.txt

@fhamborg

fhamborg commented Nov 6, 2019

Hey @yangheng95, thanks for the logs! I still haven't figured out what exactly the difference is; the only more or less plausible explanation is the random initialization of a few components in PyTorch and transformers. Would you be so kind as to post your bert_spc log with the same parameters as before, but also setting --seed 1337? That would allow me a better comparison. Thank you =)
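As an aside, for the seed to pin down the random initialization it has to reach every RNG involved; a minimal sketch of what --seed 1337 would typically do (an assumption about the script, not a quote from it):

import random
import numpy
import torch

def set_seed(seed):
    # Seed Python, NumPy, and PyTorch (CPU and all GPUs) so that weight
    # initialization and data shuffling are reproducible across runs.
    random.seed(seed)
    numpy.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(1337)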

@fhamborg

fhamborg commented Nov 6, 2019

Also, could you post the log when running aen_glove? Thank you very much in advance!

@yangheng95
Contributor

Hello @fhamborg, I trained the bert_spc model with 1337 as the seed and the result is still very good: >> test_acc: 0.8402, test_f1: 0.7692. I think cloning and referring to the latest code after the PR may solve your problem. Due to a busy schedule, I may not have time to adapt and train AEN-GloVe, but you can run it by adding the aen_glove model to train.py, just as the other models are added.
bert_spc.training.log.seed1337.txt
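For anyone following along, registering a GloVe-based model in train.py usually amounts to adding it to the lookup tables; a sketch assuming dictionaries named model_classes and input_colses and an AEN class in models/aen.py (all of these names are assumptions):

from models.aen import AEN  # GloVe-based AEN variant; class name assumed

model_classes = {
    # ... existing entries such as 'bert_spc', 'aen_bert', 'lcf_bert' ...
    'aen_glove': AEN,
}

input_colses = {
    # ... existing entries ...
    'aen_glove': ['text_raw_indices', 'aspect_indices'],  # assumed input columns
}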

@fhamborg

fhamborg commented Nov 7, 2019

Hi @yangheng95, thanks for your reply and the verification with seed 1337. I'm using the latest repo, i.e., including the PR to migrate to transformers. However, I tried it on another machine and the results only went up to roughly 70% (plus or minus) for all the approaches.

Also, I managed to train aen_glove, but in contrast to the results reported in the paper, I was only able to get roughly 50% on the validation and test sets. Do you have any idea where the difference for GloVe could come from?

@songyouwei
Owner

songyouwei commented Nov 7, 2019

@fhamborg Thank you for reporting this issue.
I just looked into this.
There might be something wrong with the recent release of the pretrained BERT package from https://github.com/huggingface/transformers, named transformers.

I installed it with pip install transformers, replaced the pytorch_transformers imports with transformers, and reproduced this issue.

Try reinstalling and using the previous release, pytorch_transformers, with pip install pytorch-transformers.
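Concretely, the workaround is to pin the older package and keep its import path; a minimal sketch (the package names are the published ones, the rest is illustrative):

# pip install pytorch-transformers   (instead of: pip install transformers)

# Older package, as used by this repository at the time:
from pytorch_transformers import BertModel

# The newer package would be imported as `from transformers import BertModel` instead.

bert = BertModel.from_pretrained('bert-base-uncased')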

@fhamborg

fhamborg commented Nov 7, 2019

Thanks, you're right, I was using transformers instead of pytorch_transformers. I will check it out now :-)

@fhamborg

fhamborg commented Nov 7, 2019

Awesome, with pytorch_transformers I get much higher performance than with transformers, e.g.:

> val_acc: 0.8536, val_f1: 0.7924

Thanks for the hint, @songyouwei! Do you have any idea what might be causing this significant difference between pytorch_transformers and transformers?

@shuzhinian

Hello @songyouwei, how do I train aen_glove? Do I need to modify the training code?
