Update tokenizer apply_chat_template with return_dict=True default #4448
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Pass explicitly
return_dict=Truetoapply_chat_templateand get itsinput_idskey.This PR fixes the issue:
Fix #4447.
Note that
transformershas recently setreturn_dict=Trueas the default value:This PR updates the tokenization logic in the
tokenize_fnfunction oftrl/trainer/reward_trainer.pyto improve compatibility with the default output format ofapply_chat_template. Instead of assumingreturn_dict=Falseand directly returning the result, it now requests a dictionary and extracts only theinput_idsfield.Tokenization logic update:
apply_chat_templatemethod is now called withreturn_dict=True, and only theinput_idsfrom the returned dictionary are used for both thechosenandrejectedexamples.