[1/9] padding-free #15
Conversation
open_instruct/finetune.py
Outdated
from accelerate.logging import get_logger
from accelerate.utils import InitProcessGroupKwargs, set_seed
from huggingface_hub import HfApi
from padding_free_collator import TensorDataCollatorWithFlattening
should be open_instruct.padding_free
that makes it more robust w/r/t the cwd the script is launched from, I guess?
done
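For reference, a sketch of the package-qualified import being discussed (assuming the collator module ends up at open_instruct/padding_free.py; the exact module name here is an assumption):

# Importing via the package path resolves the collator regardless of the
# working directory the script is launched from.
from open_instruct.padding_free import TensorDataCollatorWithFlattening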
dangxuanhong left a comment
Thanks and LGTM
This PR adds the ability to train with padding-free collation, thereby removing all padding from training examples when per_device_train_batch_size > 1. The HF model must have proper support for padding-free training, otherwise the model outputs will be silently wrong. Llama and Bamba (as of huggingface/transformers#35861) are examples of models with padding-free training support.

To use padding-free, just specify
--padding-free True
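For intuition only, here is a minimal sketch of what a flattening collator does (this is not the PR's TensorDataCollatorWithFlattening, just an illustrative assumption about the general technique): instead of padding every example to a common length, the batch is concatenated into one long sequence, position_ids restart at 0 at each example boundary, and the first label of each example is masked so the loss never crosses a boundary.

import torch

def flatten_batch(features):
    # features: list of dicts, each with "input_ids" and "labels" lists
    input_ids, position_ids, labels = [], [], []
    for feat in features:
        ids = feat["input_ids"]
        input_ids.extend(ids)
        # positions restart at 0 for every example in the flattened sequence
        position_ids.extend(range(len(ids)))
        # mask the first label of each example so the loss never crosses a boundary
        labels.extend([-100] + list(feat["labels"][1:]))
    return {
        "input_ids": torch.tensor([input_ids]),        # shape (1, total_len), no pad tokens
        "position_ids": torch.tensor([position_ids]),
        "labels": torch.tensor([labels]),
    }

With inputs in this shape, a model whose attention implementation supports padding-free batches (using position_ids to recover example boundaries) should produce the same per-example loss as with padded batches, without spending compute on pad tokens.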