Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support #4792
Yes, as @albertvillanova says, I advocated for this change in #3906. It is an important change, so I would like to hear the opinions of @lewtun, @edbeeching, and @kashif, and to make sure we are aligned on this decision for both KTO and DPO.
trl/experimental/kto/kto_trainer.py
      pad_token_id=processing_class.pad_token_id,
      label_pad_token_id=args.label_pad_token_id,
    - is_encoder_decoder=self.is_encoder_decoder,
    + is_encoder_decoder=False,
you could probably remove this
Done here and in other places.
Yes, I agree it is best to streamline the repo and cut features that (presumably) are not widely used.
qgallouedec left a comment
a few suggestions, otherwise lgtm!
Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support.
This cleanup significantly simplifies the KTO trainer and makes subsequent refactoring much easier.
Part of:
Coordinated with the DPO refactoring, as discussed with @qgallouedec:
Key Changes
KTOConfig
- Removed the is_encoder_decoder parameter and its documentation
- Removed the max_completion_length parameter (because it is specific to encoder-decoder models) and its documentation
KTOTrainer
Initialization:
Data Processing:
Model Forward Pass:
Reference Model Computation:
Error Handling:
Tests
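To illustrate the Error Handling item above: once encoder-decoder support is gone, a trainer may want to fail early when handed such a model. The snippet below is only a sketch of that idea and is not taken from this PR. The helper name ensure_decoder_only and the stand-in model objects are hypothetical; the only thing assumed from the real library is that a model's config exposes an is_encoder_decoder flag.

```python
from types import SimpleNamespace


def ensure_decoder_only(model):
    """Hypothetical guard: raise if the model reports an encoder-decoder architecture."""
    config = getattr(model, "config", None)
    # Treat a missing config or missing flag as "not encoder-decoder".
    if getattr(config, "is_encoder_decoder", False):
        raise ValueError(
            "Encoder-decoder models are not supported; "
            "use a decoder-only (causal LM) model."
        )
    return model


# Stand-in objects so the sketch runs without loading real models.
causal_lm = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=False))
seq2seq = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=True))

ensure_decoder_only(causal_lm)  # passes silently
try:
    ensure_decoder_only(seq2seq)
except ValueError as err:
    print(f"rejected: {err}")
```

Whether the actual trainer raises, warns, or silently ignores the flag is not shown in this description, so treat the behavior above as one possible choice.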