Replies: 1 comment
-
Left padding can cause numerical instability (e.g. overflow) when fine-tuning several models.
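A quick way to see whether left padding is affecting a given model is to compare the last-token logits of an unpadded sequence against the same sequence with pad tokens prepended. This is only an illustrative sketch: the model name is a placeholder, and it assumes a Hugging Face causal LM; a large discrepancy does not prove overflow, just that pad tokens are influencing the computation.

```python
# Sketch: compare last-token logits with and without left padding.
# "gpt2" is a placeholder; substitute the model actually being fine-tuned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

text = "Left padding should not change the prediction for the last token."

# Unpadded reference forward pass
plain = tok(text, return_tensors="pt")
with torch.no_grad():
    ref = model(**plain).logits[0, -1]

# Same sequence with pad tokens prepended (left padding)
tok.padding_side = "left"
padded = tok(text, padding="max_length",
             max_length=plain.input_ids.shape[1] + 8,
             return_tensors="pt")
with torch.no_grad():
    out = model(**padded).logits[0, -1]

# A large difference here means the pad tokens are leaking into the
# computation (e.g. via absolute positions or a missing attention mask).
print(torch.max(torch.abs(ref - out)))
```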
-
I understand that using right padding makes evaluation and prediction incorrect, but what is wrong with using left padding throughout the whole process? I have modified the code to add evaluation during training, like most other packages do, and I am running an evaluation per epoch without seeing any major issue. Is there something I am missing? I did try right padding for the whole process and it did not work, by the way. The only problem I have is that if I run a final evaluation with the best model loaded, the result never matches any of the earlier ones; I am not sure whether this is a generation parameter issue or something else. It is always off by a few percentage points, sometimes better and sometimes worse.
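For reference, the pattern commonly used with Hugging Face causal LMs is right padding during training and left padding for generation-style evaluation, so that generation starts directly after the last real token of every sequence in the batch. The sketch below is illustrative only (the model name and texts are placeholders, not from this repo); using greedy decoding (`do_sample=False`) also removes sampling variance, which is one possible reason evaluation scores differ between runs.

```python
# Sketch: right padding while fine-tuning, left padding for generation/eval.
# "gpt2" and the example texts are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Training: right padding, so real tokens sit at the start of each row.
tok.padding_side = "right"
train_batch = tok(["example one", "a longer training example"],
                  padding=True, return_tensors="pt")
# ... forward/backward passes on `train_batch` go here ...

# Evaluation/prediction: left padding, so generate() continues from the
# last real token of every sequence in the batch.
tok.padding_side = "left"
eval_batch = tok(["example one", "a longer eval prompt"],
                 padding=True, return_tensors="pt")
generated = model.generate(**eval_batch, max_new_tokens=16,
                           do_sample=False, pad_token_id=tok.pad_token_id)
print(tok.batch_decode(generated, skip_special_tokens=True))
```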