[Frontend] Merge developer instructions into tool block for Harmony prompt rendering#34951
[Frontend] Merge developer instructions into tool block for Harmony prompt rendering#34951beom115 wants to merge 4 commits intovllm-project:mainfrom
Conversation
…Harmony Signed-off-by: 전상범 <beom115@naver.com>
…r Harmony Signed-off-by: 전상범 <beom115@naver.com>
There was a problem hiding this comment.
Code Review
The pull request correctly identifies the need to merge system/developer instructions into the tool block for Harmony-based models to ensure compatibility with specific chat templates. However, the current implementation has a critical logic flaw that leads to the loss of user-provided instructions when the VLLM_GPT_OSS_HARMONY_SYSTEM_INSTRUCTIONS environment variable is enabled. Additionally, the code uses unsafe dictionary access on message objects which could lead to runtime errors if non-dictionary types are encountered.
|
|
||
| chat_messages = request.messages | ||
| merged_instructions: str | None = None | ||
| if chat_messages and chat_messages[0]["role"] in ("system", "developer"): |
There was a problem hiding this comment.
Accessing ["role"] directly is unsafe here. ChatCompletionMessageParam is a union that includes OpenAIHarmonyMessage (a class), and even for dict-based messages, it is safer to use .get("role") to avoid potential KeyError or TypeError. Given that harmony_utils.py explicitly handles non-dict inputs, this part should be equally robust to prevent crashes if objects are passed programmatically.
| if request.tools or merged_instructions: | ||
| dev_msg = get_developer_message( | ||
| tools=request.tools if should_include_tools else None # type: ignore[arg-type] | ||
| instructions=merged_instructions, | ||
| tools=request.tools if should_include_tools else None, # type: ignore[arg-type] | ||
| ) | ||
| messages.append(dev_msg) |
There was a problem hiding this comment.
This implementation causes data loss when VLLM_GPT_OSS_HARMONY_SYSTEM_INSTRUCTIONS is enabled. The get_developer_message function in harmony_utils.py ignores the instructions argument if that environment variable is set. Since get_system_message was already called (at line 1946) without these instructions, the user-provided system/developer message content will be completely discarded from the final prompt. To fix this, the extraction logic should be moved before the get_system_message call so instructions can be passed to the appropriate block based on the environment configuration.
Purpose
Current Harmony chat implementation renders the system/developer instructions and tool definitions in two separate
<|start|>developerblocks. However, models that follow the OpenAIgpt-oss-20b/120bchat template (based on the chat_template.jinja) expect a single developer block containing both# Instructionsand# Toolssections.This PR modifies
_make_request_with_harmonyto:Extract the initial system or developer message.
Merge it into the same developer block as the tool definitions under the # Instructions header.
This ensures compatibility with models that require a unified developer message block.
Test Plan
Test Result
Before (Redundant blocks):
After (Merged block - Correct for gpt-oss):
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.