
Problem with demo code #1

Open
vumichien opened this issue Apr 22, 2024 · 3 comments

Comments

@vumichien
Hi, thank you for your great work. I tried to replicate your demo but got the result shown below, with no correction suggestions. Can you take a look and point out the problem?
```python
import torch
from transformers import BertJapaneseTokenizer
from bertjsc import predict_of_json
from bertjsc.lit_model import LitBertForMaskedLM

# Tokenizer & model declaration.
tokenizer = BertJapaneseTokenizer.from_pretrained("cl-tohoku/bert-base-japanese-whole-word-masking")
model = LitBertForMaskedLM("cl-tohoku/bert-base-japanese-whole-word-masking")

# Load the model downloaded in Step 2.
model.load_state_dict(torch.load('models/lit-bert-for-maskedlm-230313.pth'), strict=False)

# Set computing device to GPU if available, else CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Inference
result = predict_of_json(model, tokenizer, device, "日本語校正してみす。")
print(result)
```

{0: {'token': '日本語', 'score': 0.999364}, 1: {'token': '校', 'score': 0.891721}, 2: {'token': '正', 'score': 0.54799}, 3: {'token': 'し', 'score': 0.99691}, 4: {'token': 'て', 'score': 0.999236}, 5: {'token': 'み', 'score': 0.998868}, 6: {'token': 'す', 'score': 0.925406}, 7: {'token': '。', 'score': 0.999986}}
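For what it's worth, the returned dict can be inspected directly (a minimal sketch over the output pasted above; the 0.95 threshold is arbitrary, and note that none of the entries carries a correction field):

```python
# The prediction result pasted above, keyed by token position.
result = {
    0: {'token': '日本語', 'score': 0.999364},
    1: {'token': '校', 'score': 0.891721},
    2: {'token': '正', 'score': 0.54799},
    3: {'token': 'し', 'score': 0.99691},
    4: {'token': 'て', 'score': 0.999236},
    5: {'token': 'み', 'score': 0.998868},
    6: {'token': 'す', 'score': 0.925406},
    7: {'token': '。', 'score': 0.999986},
}

# Flag the tokens the model is least confident about
# (arbitrary threshold, just to make the scores easier to scan).
suspicious = {i: e for i, e in result.items() if e['score'] < 0.95}
for i, e in suspicious.items():
    print(i, e['token'], e['score'])
```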

@er-ri (Owner)

er-ri commented Apr 22, 2024

What is your transformers version? The model was trained on Python 3.9 and transformers 4.24.0. Other, newer packages may also impact the prediction result.
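One quick way to confirm the installed versions against the training environment (a minimal sketch using only the standard library; `transformers` is the package name as published on PyPI):

```python
import sys
from importlib.metadata import version, PackageNotFoundError

# Print the interpreter version and the installed transformers version,
# so they can be compared with the training setup (Python 3.9 / transformers 4.24.0).
print("python:", sys.version.split()[0])
try:
    print("transformers:", version("transformers"))
except PackageNotFoundError:
    print("transformers is not installed")
```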

@vumichien (Author)

vumichien commented Apr 22, 2024

I used the transformers version from your requirements.txt file, which says transformers>=4.36.0. I tried to downgrade the version, but the results are almost the same:

{0: {'token': '日本語', 'score': 0.999364}, 1: {'token': '校', 'score': 0.891721}, 2: {'token': '正', 'score': 0.54799}, 3: {'token': 'し', 'score': 0.99691}, 4: {'token': 'て', 'score': 0.999236}, 5: {'token': 'み', 'score': 0.998868}, 6: {'token': 'す', 'score': 0.925406}, 7: {'token': '。', 'score': 0.999986}}

@er-ri
Copy link
Owner

er-ri commented Apr 22, 2024

Sorry, I forgot. The demo in README.md was produced by a different, previously trained model. It seems the current model cannot handle the demo sentence.
