Fix Mistral chat template (mistral_v7_tekken)#2710
Conversation
WalkthroughThe "mistral_v7_tekken" chat template in the codebase was updated to better handle system and message content that can be either strings or lists of blocks. The logic now checks content types and extracts text accordingly, improving robustness for various message formats without altering public interfaces. Changes
Poem
Note ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms (6)
🔇 Additional comments (1)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅ 📢 Thoughts on this report? Let us know! |
Per https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/commit/4b8dd8aae705887db5295fcbff4aedbb92d682eb#d2h-482763
Description
The EOS token isn't set properly in the old unfixed chat template included in axolotl.
Motivation and Context
You can't train Mistral instruction models properly without this change since the EOS token is not inserted after the assistant response.
How has this been tested?
Ran on a dataset and observed the output of data preprocessing.
Summary by CodeRabbit