@Rocketknight1
Member

This is basically PR #41626 again! Some of it got clobbered in the tokenizer refactor, but it's just as good the second time.

@Rocketknight1 Rocketknight1 marked this pull request as ready for review December 2, 2025 16:50
@github-actions
Contributor

github-actions bot commented Dec 2, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: blenderbot, bloom, cohere, gpt2, gpt_sw3

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1
Member Author

cc @LysandreJik - this was one of the V5 PRs before, do I need to do anything special with this one, or can we just merge it to main?

Member

@zucchini-nlp zucchini-nlp left a comment

great, it was already approved once so lgtm 😄

Comment on lines +3268 to +3269
if not tokenize:
    return_dict = False  # dicts are only returned by the tokenizer anyway
Member

makes me wonder, do we need to support a combination of tokenize=True, return_dict=False or can we deprecate/remove return_dict over time? Can't think of cases when users want a list of tokens as output

Member Author

Maybe we can get rid of it over time, but I think it's fine as a backward compatibility flag for now!

Member

sure, i meant after v5 + several more minor releases, and if users are fine with it
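
For readers following the thread, here is a minimal sketch of how the two flags discussed above interact. It is not the library's implementation: `resolve_output` and its `rendered` / `token_ids` arguments are hypothetical stand-ins for the templated string and its tokenization, and the `True` defaults are an assumption based on the PR description. Only the `if not tokenize` guard is taken from the reviewed lines.

```python
from typing import Union


def resolve_output(
    rendered: str,
    token_ids: list[int],
    tokenize: bool = True,
    return_dict: bool = True,  # assumed default after this PR
) -> Union[str, list[int], dict[str, list[int]]]:
    """Hypothetical stand-in showing which output shape each flag combination gives."""
    if not tokenize:
        return_dict = False  # dicts are only returned by the tokenizer anyway
        return rendered  # plain templated string, no tokenization
    if return_dict:
        # Dict-style output, mirroring what a tokenizer call returns
        return {"input_ids": token_ids, "attention_mask": [1] * len(token_ids)}
    # Backward compatibility path: a bare list of token ids
    return token_ids
```

Under these assumptions, `tokenize=True, return_dict=False` is the only combination that yields a bare list of token ids, which is the case the review comment suggests could eventually be deprecated.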

@Rocketknight1 Rocketknight1 merged commit ce53cc0 into main Dec 4, 2025
24 checks passed
@Rocketknight1 Rocketknight1 deleted the v5_chat_template_return_type branch December 4, 2025 14:44
sarathc-cerebras pushed a commit to sarathc-cerebras/transformers that referenced this pull request Dec 7, 2025
…again (huggingface#42567)

* Flip the default return type for `apply_chat_template` to match the underlying tokenizer

* Remove test_tokenization_for_chat tests, which no longer do anything useful

* Remove test_tokenization_for_chat tests, which no longer do anything useful

* Fix test_encode_message tests

* Fix test_encode_message tests

* nit fix

* Trigger tests

* Remove test_tokenization_for_chat

* make fixup

* Add a little test to make sure that doesn't happen again

* make fixup
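
Since the first commit above flips the default return type, calling code that previously expected a bare list of token ids may now receive a dict-like object. A minimal sketch of a caller-side helper that tolerates both shapes follows. `get_input_ids` is a hypothetical name, not part of the library, and the sketch assumes the dict-like output exposes an `input_ids` key and behaves as a standard mapping.

```python
from collections.abc import Mapping


def get_input_ids(output):
    """Hypothetical helper: return token ids from either output shape."""
    if isinstance(output, Mapping):
        # Dict-style output (assumed to carry an "input_ids" entry)
        return list(output["input_ids"])
    # Older list-style output: already a sequence of token ids
    return list(output)
```

Passing `return_dict=False` explicitly, as discussed in the review thread, is the other way to keep the list-style behavior.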
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
…again (huggingface#42567)

3 participants