Skip to content

Conversation

@albertvillanova
Copy link
Member

@albertvillanova albertvillanova commented Nov 4, 2025

Pass explicitly return_dict=True to apply_chat_template and get its input_ids key.

This PR fixes the issue:

RuntimeError: Could not infer dtype of dict

Fix #4447.

Note that transformers has recently set return_dict=True as the default value:

This PR updates the tokenization logic in the tokenize_fn function of trl/trainer/reward_trainer.py to improve compatibility with the default output format of apply_chat_template. Instead of assuming return_dict=False and directly returning the result, it now requests a dictionary and extracts only the input_ids field.

Tokenization logic update:

  • The apply_chat_template method is now called with return_dict=True, and only the input_ids from the returned dictionary are used for both the chosen and rejected examples.

@albertvillanova albertvillanova changed the title Pass explicitly return_dict=True to apply_chat_template Update tokenizer apply_chat_template with return_dict=True default Nov 4, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@qgallouedec qgallouedec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting more, cool!

@qgallouedec qgallouedec merged commit 8b0a3ce into huggingface:main Nov 4, 2025
10 checks passed
qgallouedec added a commit to Harras3/trl that referenced this pull request Nov 4, 2025
commit 7a9592b
Author: Quentin Gallouédec <[email protected]>
Date:   Tue Nov 4 14:32:04 2025 -0700

    🐍 Drop Python 3.9 (huggingface#4183)

commit 7f15a7f
Author: Harras Mansoor <[email protected]>
Date:   Wed Nov 5 02:06:31 2025 +0500

    Removed outdated warning about batch contamination (huggingface#4423)

commit 8b0a3ce
Author: Albert Villanova del Moral <[email protected]>
Date:   Tue Nov 4 21:37:39 2025 +0100

    Update tokenizer apply_chat_template with return_dict=True default (huggingface#4448)

commit d9f9e2b
Author: Pramodith Ballapuram <[email protected]>
Date:   Tue Nov 4 19:56:58 2025 +0000

    Support casting to fp32 when word embeddings are tied to lm_head (huggingface#4446)

commit 4e138ab
Author: Sergio Paniego Blanco <[email protected]>
Date:   Tue Nov 4 15:15:23 2025 +0100

    Upload notebook with T4 selected (huggingface#4449)
qgallouedec added a commit that referenced this pull request Nov 4, 2025
commit 4677cf2
Author: Harras Mansoor <[email protected]>
Date:   Wed Nov 5 04:06:13 2025 +0500

    Removed Sentiment Tuning Examples (#4424)

commit 7a9592b
Author: Quentin Gallouédec <[email protected]>
Date:   Tue Nov 4 14:32:04 2025 -0700

    🐍 Drop Python 3.9 (#4183)

commit 7f15a7f
Author: Harras Mansoor <[email protected]>
Date:   Wed Nov 5 02:06:31 2025 +0500

    Removed outdated warning about batch contamination (#4423)

commit 8b0a3ce
Author: Albert Villanova del Moral <[email protected]>
Date:   Tue Nov 4 21:37:39 2025 +0100

    Update tokenizer apply_chat_template with return_dict=True default (#4448)

commit d9f9e2b
Author: Pramodith Ballapuram <[email protected]>
Date:   Tue Nov 4 19:56:58 2025 +0000

    Support casting to fp32 when word embeddings are tied to lm_head (#4446)

commit 4e138ab
Author: Sergio Paniego Blanco <[email protected]>
Date:   Tue Nov 4 15:15:23 2025 +0100

    Upload notebook with T4 selected (#4449)

commit 43253b2
Author: Pramodith Ballapuram <[email protected]>
Date:   Mon Nov 3 21:07:31 2025 +0000

    Add On-Policy Distillation from thinking labs to paper index. (#4410)

    Co-authored-by: Quentin Gallouédec <[email protected]>

commit 6f41b18
Author: Behrooz Azarkhalili <[email protected]>
Date:   Mon Nov 3 10:57:51 2025 -0800

    fix: Remove chat template setting from non-SFT trainer scripts (#4437)

    Co-authored-by: Quentin Gallouédec <[email protected]>
    Co-authored-by: Quentin Gallouédec <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI fails with dev dependencies: RuntimeError: Could not infer dtype of dict

3 participants