fix: normalize messages before chat template application #240
Thump604 merged 3 commits into waybarrios:main
Conversation
Add _normalize_messages() to server.py and call it in all request paths before apply_chat_template. Maps non-standard roles (developer -> system, per the OpenAI Responses API) and merges consecutive same-role messages.

Fixes agent crashes from:
- OpenAI Responses API sending role="developer" (unrecognized by the Qwen3.5 template)
- OpenCode sending [system, system, user, user] (rejected by alternating-role templates)

Applied in create_chat_completion (both MLLM and LLM paths), create_anthropic_message, and _stream_anthropic_messages.
janhilgard
left a comment
Reviewed the diff. Clean implementation — _normalize_messages() correctly maps developer -> system and merges consecutive same-role messages with \n\n separator. Good guard on list content (multimodal payloads preserved). All 4 request paths covered (MLLM, LLM, Anthropic, Anthropic streaming). Tests are thorough — edge cases for None content, multimodal, and 3+ consecutive messages.
This fixes real crashes we see with OpenCode and agent frameworks that send [system, system, user, user] or developer role messages.
Status ping: this PR has been open for 7 days with no review activity. I kept the scope intentionally narrow: normalize messages before applying the chat template.
Already approved on my side. CI green, branch mergeable, scope is minimal and well-contained. @waybarrios — this one is ready to go whenever you get a chance.
Incorporates 53 upstream commits including:
- O(1) state-machine reasoning parser (PR waybarrios#234)
- Resumable model download (PR waybarrios#77)
- Block-aware prefix cache (PR waybarrios#217)
- Message normalization (PR waybarrios#240)
- Full sampling params (PR waybarrios#258)
- ThinkRouter for Anthropic streaming
- 22 new test files
- License file, docs updates

Conflict resolution: preserved production features (frequency_penalty conversion, tool markup safety nets, openai_to_anthropic import) while adopting upstream improvements (Gemma4 parser rewrite, cleaner logging, _model_name in streaming chunks).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- _normalize_messages() in server.py: maps non-standard roles (developer -> system, per the OpenAI Responses API) and merges consecutive same-role messages
- Called before apply_chat_template in all four request paths: create_chat_completion MLLM path, create_chat_completion LLM path, create_anthropic_message, _stream_anthropic_messages
- Fixes crashes from the developer role (the Qwen3.5 template rejects unknown roles) and from consecutive same-role messages (e.g. OpenCode sends [system, system, user, user])
- Split from #224 for easier review. The other parts of #224 are in #NEW-hybrid-batching and #NEW-scheduler.
Behavior
Only merges when both adjacent messages have string content. Messages with list content (multimodal image/video payloads) are left as-is to preserve attachments.
Test plan
- role: "developer" messages do not crash
- [system, system, user, user] format normalizes to [system, user]