
Conversation

@garrett361 (Owner) commented Jun 27, 2025:

This PR adds the ability to train with padding-free collation, which removes all padding from training batches when per_device_train_batch_size > 1. The HF model must have proper support for padding-free training; otherwise, the model outputs will be silently wrong. Llama and Bamba (as of huggingface/transformers#35861) are examples of models with padding-free support.
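For context, padding-free collation concatenates every example in a batch into a single unpadded row and uses position_ids that restart at 0 for each example to mark sequence boundaries. Below is a minimal sketch of the idea; the function name and details are illustrative only, not the PR's actual TensorDataCollatorWithFlattening:

```python
import torch

def flattening_collate(features):
    """Minimal sketch of padding-free collation (illustrative only).

    Every example in the batch is concatenated into a single row of
    shape (1, total_len) with no pad tokens; position_ids restart at 0
    for each example so a model with proper padding-free support can
    recover the sequence boundaries (typically via flash-attention's
    variable-length kernels).
    """
    input_ids, position_ids, labels = [], [], []
    for feature in features:
        ids = list(feature["input_ids"])
        input_ids += ids
        position_ids += list(range(len(ids)))
        # Mask the first label of each example so the loss never
        # crosses a sequence boundary.
        labels += [-100] + ids[1:]
    return {
        "input_ids": torch.tensor([input_ids]),
        "position_ids": torch.tensor([position_ids]),
        "labels": torch.tensor([labels]),
    }
```

A model without this support silently treats the flattened row as one long sequence, which is the failure mode noted above.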

To use padding-free collation, just pass --padding-free True.

| # | PR | Title |
|---|----|-------|
| 1 | #15 | padding-free |
| 2 | #16 | clean_checkpoints_at_end |
| 3 | #17 | final_lr_ratio |
| 4 | #18 | add_seed_and_date_to_run_name |
| 5 | #19 | additional_model_arguments |
| 6 | #20 | sync_each_batch=True grad acc |
| 7 | #21 | no grad acc averaging for sum losses |
| 8 | #22 | extra reporting |
| 9 | #23 | local_main_process_first when building dataset |

```python
from accelerate.logging import get_logger
from accelerate.utils import InitProcessGroupKwargs, set_seed
from huggingface_hub import HfApi
from padding_free_collator import TensorDataCollatorWithFlattening
```
Collaborator commented:

should be open_instruct.padding_free

@garrett361 (Owner, author) replied:

that makes it more robust w/r/t the cwd the script is launched from, I guess?

@garrett361 (Owner, author) replied:

done
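Presumably the committed fix is the package-qualified import below; the exact module path is assumed from the review comment rather than confirmed:

```python
# Assumed form of the fix: package-qualified so the import resolves
# regardless of the working directory the script is launched from.
from open_instruct.padding_free_collator import TensorDataCollatorWithFlattening
```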

@dangxuanhong (Collaborator) left a comment:

Thanks and LGTM

@garrett361 merged commit 872404b into main on Jun 27, 2025 (2 checks passed).
@fabianlim deleted the padding-free-squashing-1 branch June 27, 2025 20:35, then restored it June 27, 2025 20:40.