Separate chat templates into a single file #33957
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Interesting! I think we would need to be extra careful as this can influence quite a lot of other libs, to which we need to open PRs before merging this one! FYI @LysandreJik and @Narsil |
b4cc131 to 0216160
Also cc @zucchini-nlp |
Oh, wow, this will affect a lot of other projects. Also keep in mind that the HF API exposes the chat templates through the I will release a chat template editor (Gradio space) soon that lets you easily view, modify, test and submit PRs (just to address a couple of OP bullet-points), so thanks to @Rocketknight1 for pointing me towards this PR, I will need to keep a close eye on changes. :) |
@CISC yes, I'm aware that this could be disruptive! When this PR is ready we'll discuss it internally and with the community, and maybe we might end up aborting it at that point (and I'll try not to let that delay the inverse templates PR too much either way) |
2b7c7ca to 3488357
This is ready for review! There's one major question: Right now I'm saving We could consider always using Other than that, this should be ready for review! (cc some possibly affected people @zucchini-nlp @Narsil @xenova @LysandreJik) Failing tests in UDOP seem to be unrelated and I can't reproduce them locally |
Seems good to me, but for processors now we'll have three ways of saving and loading templates. Some people (including repos maintained by us) still have it the old way, as part of processor_config.json. Introducing one more template file might be confusing for users, so we have to maybe add this somewhere in the docs. I mean when we finally decide to roll out saving/loading from chat_template.jinja files by default
@zucchini-nlp - agreed, but hopefully we can deprecate the others! I hope when we're finally finished, we just have |
aef89c6 to 0c164ab
Update: This should be ready for final review now! The failing UDOP tests are not related to this PR, and I have another PR open to fix them here #34180 To be clear, we will not move the chat templates when this PR is merged. This PR just adds support for saving/loading to the new location, but we will not save to the new file locations by default. This will give the many other frameworks that load chat templates time to adapt, and give users time to update to versions that support the new locations. |
Thanks for the ping @Rocketknight1! I'll check what it implies server-side to make the parsing of the chat template available in the API. |
High level feedback: to avoid any confusion in the future, shouldn't the long-term files be called Having a longer and more explicit final name would avoid confusion in my opinion. |
I think that makes sense, yes! It'll be a bit longer but very explicit about which files affect which classes. I'll make that change if other reviewers agree as well. |
Hi @Rocketknight1 @zucchini-nlp Getting back to this topic as I'd like to understand more the reasoning behind this file structure. I have a few questions:
Without knowing all the technical details, I'd advocate for a truly single file chat template if possible. If that's the case, we would support things differently server-side, typically having a single And sorry if all of this has already been discussed elsewhere! |
100% |
Hi @Wauplin, 1. is a good question! The main reason I wanted separate templates is because I thought the side effects would be unpredictable. For example, many processors have |
I've published the initial version of Chat Template Editor, I've started work related to this and the inverse template PR, but it's currently disabled/hidden until finalized.
Found community Pixtral which was very helpful, added support for |
Let's go with a single-file system and define a good standard for it. If the chat template stays as it is today, it does not have to be "in" the tokenizer: it's a new form of pre-processing that comes before the tokenizer, and as such we don't need different files.
Let's sync on the best format to use; the simpler the better.
ef6a33c to a6ff538
I got waylaid by a couple of urgent tasks, but this should finally be ready! The summary is:
|
a6ff538 to 1187047
LGTM but the PR description needs to be updated!
PR description updated, and merging! |
Thanks @Rocketknight1 ! Is there a first repo example on the Hub to check that? Asking since we'll have to support this new (and definitive!) file in the server-side parsing |
cc @Wauplin not yet, but I'll make one in |
* Initial draft
* Add .jinja file loading for processors
* Add processor saving of naked chat template files
* make fixup
* Add save-load test for tokenizers
* Add save-load test for tokenizers
* stash commit
* Try popping the file
* make fixup
* Pop the arg correctly
* Pop the arg correctly
* Add processor test
* Fix processor code
* stash commit
* Processor clobbers child tokenizer's chat template
* Processor clobbers child tokenizer's chat template
* make fixup
* Split processor/tokenizer files to avoid interactions
* fix test
* Expand processor tests
* Rename arg to "save_raw_chat_template" across all classes
* Update processor warning
* Move templates to single file
* Move templates to single file
* Improve testing for processor/tokenizer clashes
* Improve testing for processor/tokenizer clashes
* Extend saving test
* Test file priority correctly
* make fixup
* Don't pop the chat template file before the slow tokenizer gets a look
* Remove breakpoint
* make fixup
* Fix error
We have several issues with chat templates because they're stored as single lines in the JSON config files:
processor templates in chat_template.json and tokenizer templates in tokenizer_config.json causing confusion
The solution:
chat_template.jinja file in the repo
Processor classes, so processors should always be able to save their template as a raw Jinja file. In general, we'll be gently deprecating multiple templates in future.
If a chat_template.jinja file is present, it overrides the JSON files. If a tokenizer is loaded with both Jinja and JSON chat templates and resaved, it should save only the Jinja file, and not have any chat_template entry in tokenizer_config.json.
For now, we continue saving in the old format by default. I'll probably keep it this way for several versions before making the new format the default, to ensure that most users are able to load the new format before it becomes common. Until then, the new format should mostly be used for testing, to make sure it's ready for deployment when we do the switch.
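To make the priority and resave rules above concrete, here is a minimal standalone sketch. It is not the code in this PR and does not use the transformers API: the helpers `load_chat_template` and `save_chat_template` and the `as_jinja` flag are hypothetical names invented for illustration, and deleting a stale Jinja file on an old-format save is an assumption of the sketch. It only mirrors the described behaviour: the raw Jinja file wins on load, and a Jinja save leaves no chat_template entry in tokenizer_config.json.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

JINJA_FILE = "chat_template.jinja"
CONFIG_FILE = "tokenizer_config.json"


def load_chat_template(repo_dir: Path) -> Optional[str]:
    """Hypothetical loader: the raw Jinja file overrides the JSON entry."""
    jinja_path = repo_dir / JINJA_FILE
    if jinja_path.is_file():
        return jinja_path.read_text(encoding="utf-8")
    config_path = repo_dir / CONFIG_FILE
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        return config.get("chat_template")
    return None


def save_chat_template(repo_dir: Path, template: str, as_jinja: bool = False) -> None:
    """Hypothetical saver: old JSON format by default, raw Jinja on request."""
    config_path = repo_dir / CONFIG_FILE
    config = {}
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    if as_jinja:
        # New format: template lives in its own file, no JSON entry remains.
        (repo_dir / JINJA_FILE).write_text(template, encoding="utf-8")
        config.pop("chat_template", None)
    else:
        # Old format: single-line JSON string. Assumption of this sketch:
        # drop any stale Jinja file so it cannot shadow the JSON entry.
        config["chat_template"] = template
        (repo_dir / JINJA_FILE).unlink(missing_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")


def demo() -> dict:
    """Load a repo that has both formats, then resave it in the new format."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / CONFIG_FILE).write_text(
            json.dumps({"chat_template": "old json template"}), encoding="utf-8"
        )
        (repo / JINJA_FILE).write_text(
            "{% for m in messages %}\n{{ m['content'] }}\n{% endfor %}",
            encoding="utf-8",
        )
        loaded = load_chat_template(repo)
        save_chat_template(repo, loaded, as_jinja=True)
        config_after = json.loads((repo / CONFIG_FILE).read_text(encoding="utf-8"))
        return {"loaded": loaded, "config_after": config_after}
```

Calling `demo()` loads the Jinja template rather than the JSON one, and the resaved config no longer contains a chat_template key. Keeping `as_jinja=False` as the default reflects the plan to keep saving in the old format for now.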
Extremely draft PR for now, since it'll probably break lots of things!
TODO: