[Fix] pass chat_template_kwargs to get_system_message in gpt-oss by seunggil1 · Pull Request #30873 · vllm-project/vllm

seunggil1 · 2025-12-17T14:32:01Z

Purpose

This PR addresses the issue where chat_template_kwargs (specifically model_identity) is ignored when using gpt-oss models.

Currently, gpt-oss models use a custom request handling method _make_request_with_harmony instead of the standard apply_hf_chat_template. Inside this method, the get_system_message function is called without passing chat_template_kwargs. As a result, the model_identity parameter defaults to a hardcoded value ("You are ChatGPT..."), preventing users from dynamically customizing the system prompt per request.

This fix passes chat_template_kwargs to get_system_message in vllm/entrypoints/openai/serving_chat.py, allowing users to override the default model identity and other parameters supported by the harmony library.
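The behavior change can be sketched in isolation. This is a minimal illustration, not the actual vLLM code: `build_system_message` is a hypothetical stand-in for `get_system_message`, and its single parameter and the default string are assumptions made for the example.

```python
from typing import Any, Optional

# Hypothetical default; the real default text comes from the harmony library.
DEFAULT_IDENTITY = "You are ChatGPT..."


def build_system_message(model_identity: Optional[str] = None) -> str:
    """Hypothetical stand-in for get_system_message."""
    return model_identity if model_identity is not None else DEFAULT_IDENTITY


def make_request_before(chat_template_kwargs: Optional[dict[str, Any]]) -> str:
    # Before the fix: the per-request kwargs are never consulted,
    # so the default identity is always used.
    return build_system_message()


def make_request_after(chat_template_kwargs: Optional[dict[str, Any]]) -> str:
    # After the fix: forward the identity when the caller supplied one.
    kwargs = chat_template_kwargs or {}
    return build_system_message(model_identity=kwargs.get("model_identity"))
```

With this stand-in, a request carrying `{"model_identity": "I am a custom AI assistant."}` yields the default string through the first path and the custom string through the second.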

This fix implements the solution discussed in #23975 (comment)

Partially Fixes #23015
Fixes #23975

Test Plan

Environment:

GPU: NVIDIA H100
vLLM Version: v0.12.0 (Docker image vllm/vllm-openai:v0.12.0)

Start the vLLM server with a gpt-oss model (e.g., gpt-oss-20b):

docker run --rm --gpus all \
  -e VLLM_LOGGING_LEVEL=DEBUG \
  -v /path/to/gpt-oss-20b:/app/model \
  vllm/vllm-openai:v0.12.0 \
  --model /app/model \
  --served-model-name vllm/gpt-oss-20b \
  --enable-log-requests

Send a chat completion request with a custom model_identity in chat_template_kwargs:

curl --location 'http://localhost:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "vllm/gpt-oss-20b",
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ],
    "chat_template_kwargs":{
        "model_identity":"I am a custom AI assistant."
    },
    "temperature": 0
}'
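The same request can be sent from Python. The sketch below uses only the standard library and assumes the endpoint URL and served model name from the test plan above; `build_payload` and `send_chat_request` are hypothetical helper names introduced for this example.

```python
import json
import urllib.request
from typing import Any

# Assumed endpoint and model name, copied from the test plan above.
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "vllm/gpt-oss-20b"


def build_payload(model_identity: str, question: str = "Who are you?") -> dict[str, Any]:
    """Build the chat completion body with a per-request model identity."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": question}],
        "chat_template_kwargs": {"model_identity": model_identity},
        "temperature": 0,
    }


def send_chat_request(payload: dict[str, Any], url: str = API_URL) -> dict[str, Any]:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_chat_request(build_payload("I am a custom AI assistant."))` requires the server from the previous step to be running.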

Test Result

Before Fix:
The model ignores the model_identity and responds with the default identity.

{
    "object": "chat.completion",
    "created": 1765981514,
    "model": "vllm/gpt-oss-20b",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I’m ChatGPT, a large language model created by OpenAI. I’m designed to understand and generate natural language, so I can help answer questions, brainstorm ideas, explain concepts, draft text, and more. I don’t have personal experiences or feelings—just a lot of patterns learned from text. How can I assist you today?",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning": "The user asks \"Who are you?\" We need to respond as ChatGPT. The user might want a brief introduction. We should mention that we are ChatGPT, a large language model trained by OpenAI, etc. Also mention that we can help with many tasks. We should keep it concise.",
                "reasoning_content": "The user asks \"Who are you?\" We need to respond as ChatGPT. The user might want a brief introduction. We should mention that we are ChatGPT, a large language model trained by OpenAI, etc. Also mention that we can help with many tasks. We should keep it concise."
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null,
            "token_ids": null
        }
    ],
    "usage": {
        "prompt_tokens": 73,
        "total_tokens": 211,
        "completion_tokens": 138,
        "prompt_tokens_details": null
    }
}

After Fix:
The model correctly adopts the model_identity provided in chat_template_kwargs.

{
    "id": "chatcmpl-9fb8cfe7a67f083a",
    "object": "chat.completion",
    "created": 1765981838,
    "model": "vllm/gpt-oss-20b",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I’m your custom AI assistant—an advanced language model built on OpenAI’s GPT‑4 architecture, fine‑tuned to help you with information, ideas, and tasks. I’m here to answer questions, offer suggestions, and chat about almost anything you’re curious about. Just let me know what you need!",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning": "We need to respond as the custom AI assistant. The user asks \"Who are you?\" We should answer in a friendly manner, describing ourselves as a custom AI assistant. Probably mention that we are a language model trained by OpenAI, but customized. We should keep it concise.",
                "reasoning_content": "We need to respond as the custom AI assistant. The user asks \"Who are you?\" We should answer in a friendly manner, describing ourselves as a custom AI assistant. Probably mention that we are a language model trained by OpenAI, but customized. We should keep it concise."
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null,
            "token_ids": null
        }
    ],
    "usage": {
        "prompt_tokens": 66,
        "total_tokens": 196,
        "completion_tokens": 130,
        "prompt_tokens_details": null
    }
}

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: seunggil1 <ksgg1navercom@gmail.com>

chatgpt-codex-connector · 2025-12-17T14:32:09Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

gemini-code-assist

Code Review

This pull request aims to pass chat_template_kwargs to get_system_message for gpt-oss models. However, the current implementation introduces a critical bug that will cause a TypeError at runtime. The get_system_message function does not accept arbitrary keyword arguments, and the proposed change can lead to passing unexpected or duplicate arguments. I've provided a comment with a suggested fix to make this change safe.

vllm/entrypoints/openai/serving_chat.py
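The failure mode described in the review can be reproduced with a stand-in function. The sketch below shows one possible way to guard against it by filtering user-supplied kwargs against the callee's signature; it is not necessarily the suggestion made in the review. `get_system_message_stub` and `filter_supported_kwargs` are hypothetical names, and the stub's parameters are assumptions rather than the real signature.

```python
import inspect
from typing import Any, Callable, Optional


def get_system_message_stub(
    model_identity: Optional[str] = None,
    reasoning_effort: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical stand-in; the real function's parameters may differ."""
    return {"model_identity": model_identity, "reasoning_effort": reasoning_effort}


def filter_supported_kwargs(
    func: Callable[..., Any],
    kwargs: Optional[dict[str, Any]],
    reserved: frozenset[str] = frozenset(),
) -> dict[str, Any]:
    """Keep only keys that func accepts by name and that are not reserved."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in keyword_kinds
    }
    return {
        key: value
        for key, value in (kwargs or {}).items()
        if key in accepted and key not in reserved
    }


user_kwargs = {
    "model_identity": "I am a custom AI assistant.",
    "enable_thinking": True,    # unknown to the stub: would raise TypeError
    "reasoning_effort": "low",  # also passed explicitly below: duplicate
}

# Arguments the server sets itself are reserved so they cannot be duplicated.
safe_kwargs = filter_supported_kwargs(
    get_system_message_stub, user_kwargs, reserved=frozenset({"reasoning_effort"})
)
message = get_system_message_stub(reasoning_effort="high", **safe_kwargs)
```

Unpacking `user_kwargs` directly into the call would raise `TypeError`, either for the unexpected `enable_thinking` key or for the duplicated `reasoning_effort` argument.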

cjackal · 2025-12-17T14:39:16Z

Seems like a duplicate of #30247


mergify · 2025-12-17T14:56:52Z

Hi @seunggil1, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

github-actions · 2025-12-17T14:57:05Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀


seunggil1 · 2025-12-17T15:05:11Z

Quoting cjackal: "Seems like a duplicate of #30247"

I missed that PR—it wasn't there when I first checked, so it must be recent. Thanks for pointing it out!

Since the objective is the same, I'll take a quick look at the implementation differences. If I find that the approaches are identical or if this PR is redundant, I'll go ahead and close it.

Thanks again!

mergify · 2025-12-21T15:44:11Z

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @seunggil1.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

[Fix] pass chat_template_kwargs to get_system_message in gpt-oss

7f431f4

Signed-off-by: seunggil1 <ksgg1navercom@gmail.com>

seunggil1 requested review from aarnphm and chaunceyjiang as code owners December 17, 2025 14:32

mergify bot added the frontend and gpt-oss (Related to GPT-OSS models) labels Dec 17, 2025

github-project-automation bot added this to gpt-oss Issues & Enhancements Dec 17, 2025

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Dec 17, 2025

gemini-code-assist bot reviewed Dec 17, 2025


fix : TypeError when passing chat_template_kwargs to get_system_message

a03caeb

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: seung <38664481+seunggil1@users.noreply.github.com>

[refactor] apply pre-commit

bdcd953

Signed-off-by: seunggil1 <ksgg1navercom@gmail.com>

Merge branch 'main' into fix/gpt-oss-model-identity

672edf3

mergify bot added the needs-rebase label Dec 21, 2025

lyuwen mentioned this pull request Dec 22, 2025

[gpt-oss] Add model_identity to system message retrieval for harmony chat template #30247

Open

5 tasks

seunggil1 closed this Jan 1, 2026

github-project-automation bot moved this from To Triage to Done in gpt-oss Issues & Enhancements Jan 1, 2026

seunggil1 wants to merge 4 commits into vllm-project:main from seunggil1:fix/gpt-oss-model-identity
