Replies: 1 comment
-
Left padding can cause numerical instability (e.g. overflow) when fine-tuning several models.
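A quick way to see whether left padding is affecting a given model is to compare the last-token logits of an unpadded sequence against the same sequence with pad tokens prepended. This is only an illustrative sketch: the model name is a placeholder, and it assumes a Hugging Face causal LM; a large discrepancy does not prove overflow, just that pad tokens are influencing the computation.

```python
# Sketch: compare last-token logits with and without left padding.
# "gpt2" is a placeholder; substitute the model actually being fine-tuned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

text = "Left padding should not change the prediction for the last token."

# Unpadded reference forward pass
plain = tok(text, return_tensors="pt")
with torch.no_grad():
    ref = model(**plain).logits[0, -1]

# Same sequence with pad tokens prepended (left padding)
tok.padding_side = "left"
padded = tok(text, padding="max_length",
             max_length=plain.input_ids.shape[1] + 8,
             return_tensors="pt")
with torch.no_grad():
    out = model(**padded).logits[0, -1]

# A large difference here means the pad tokens are leaking into the
# computation (e.g. via absolute positions or a missing attention mask).
print(torch.max(torch.abs(ref - out)))
```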
-
I understand that using right padding makes evaluation and prediction incorrect, but what is wrong with using left padding throughout the whole process? I have modified the code to add evaluation during training, like most other packages do, and I am running an evaluation per epoch without seeing any major issue. Is there something I am missing? I did try right padding for the whole process and it did not work, by the way. The only problem I have is that if I run a final evaluation with the best model loaded, the result never matches any of the earlier ones; I am not sure whether this is a generation parameter issue or something else. It is always off by a few percentage points, sometimes better and sometimes worse.
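For reference, the pattern commonly used with Hugging Face causal LMs is right padding during training and left padding for generation-style evaluation, so that generation starts directly after the last real token of every sequence in the batch. The sketch below is illustrative only (the model name and texts are placeholders, not from this repo); using greedy decoding (`do_sample=False`) also removes sampling variance, which is one possible reason evaluation scores differ between runs.

```python
# Sketch: right padding while fine-tuning, left padding for generation/eval.
# "gpt2" and the example texts are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Training: right padding, so real tokens sit at the start of each row.
tok.padding_side = "right"
train_batch = tok(["example one", "a longer training example"],
                  padding=True, return_tensors="pt")
# ... forward/backward passes on `train_batch` go here ...

# Evaluation/prediction: left padding, so generate() continues from the
# last real token of every sequence in the batch.
tok.padding_side = "left"
eval_batch = tok(["example one", "a longer eval prompt"],
                 padding=True, return_tensors="pt")
generated = model.generate(**eval_batch, max_new_tokens=16,
                           do_sample=False, pad_token_id=tok.pad_token_id)
print(tok.batch_decode(generated, skip_special_tokens=True))
```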