Accept processor in get_training_chat_template#5560
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aa61a92ea2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 6700975. Configure here.
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
albertvillanova
left a comment
There was a problem hiding this comment.
CI is red:
AttributeError: 'NoneType' object has no attribute 'chat_template'
|
@qgallouedec, should we consider |
|
currently, it is considered public. I don't have strong opinion about it. I was a bit uncertain in the first place, see #5090. |

get_training_chat_templatealready works on bothPreTrainedTokenizerandProcessorMixin. The type hint, parameter name, and docstring were misleading by saying "tokenizer" only. Call sites (e.g.GRPOTrainer,SFTTrainer) already passprocessing_class.Note
Low Risk
Low risk API tweak: adds
processing_classas the primary parameter with a deprecating alias fortokenizer, so breakage is unlikely but callers passing both args will now error and deprecation warnings may surface.Overview
get_training_chat_templatenow takes aprocessing_class(tokenizer orProcessorMixin) instead of a tokenizer-only parameter, aligning the type hints and docs with existing processor support.For backward compatibility it adds a deprecated
tokenizeralias that emits aFutureWarningand rejects passing both arguments, and updates internal checks/error messaging to reference the generic processing class/chat template rather than “tokenizer” specifically.Reviewed by Cursor Bugbot for commit bb6d08d. Bugbot is set up for automated code reviews on this repo. Configure here.