[Fix] pass chat_template_kwargs to get_system_message in gpt-oss #30873

Closed
seunggil1 wants to merge 4 commits into vllm-project:main from seunggil1:fix/gpt-oss-model-identity

Conversation

@seunggil1

@seunggil1 seunggil1 commented Dec 17, 2025

Purpose

This PR fixes an issue where chat_template_kwargs (specifically model_identity) are ignored when using gpt-oss models.

Currently, gpt-oss models use a custom request handling method _make_request_with_harmony instead of the standard apply_hf_chat_template. Inside this method, the get_system_message function is called without passing chat_template_kwargs. As a result, the model_identity parameter defaults to a hardcoded value ("You are ChatGPT..."), preventing users from dynamically customizing the system prompt per request.

This fix passes chat_template_kwargs to get_system_message in vllm/entrypoints/openai/serving_chat.py, allowing users to override the default model identity and other parameters supported by the harmony library.
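
The forwarding described above can be sketched roughly as follows. This is not the PR's actual diff: get_system_message here is a simplified stand-in that returns a string, its two parameters are assumptions, and build_system_message is a hypothetical helper. The default identity string is left truncated exactly as it appears in this description.

```python
# Truncated on purpose: the PR text only quotes the default as "You are ChatGPT...".
DEFAULT_IDENTITY = "You are ChatGPT..."


def get_system_message(model_identity=None, reasoning_effort=None):
    """Stand-in for vLLM's helper (assumed signature); returns a plain string."""
    identity = model_identity or DEFAULT_IDENTITY
    effort = reasoning_effort or "medium"
    return f"{identity}\nReasoning: {effort}"


def build_system_message(reasoning_effort=None, chat_template_kwargs=None):
    """Hypothetical request-side helper that forwards per-request kwargs."""
    kwargs = dict(chat_template_kwargs or {})
    # The explicit request field wins; dropping the key avoids passing it twice.
    kwargs.pop("reasoning_effort", None)
    return get_system_message(reasoning_effort=reasoning_effort, **kwargs)


# Before the fix: nothing is forwarded, so the default identity is used.
before = build_system_message()
# After the fix: the per-request identity replaces the default.
after = build_system_message(
    chat_template_kwargs={"model_identity": "I am a custom AI assistant."}
)
```

In this sketch an unknown key in chat_template_kwargs would still raise a TypeError, so a real implementation would likely need to restrict which keys are forwarded.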

This fix implements the solution discussed in #23975 (comment)

Partially Fixes #23015
Fixes #23975

Test Plan

Environment:

  • GPU: NVIDIA H100
  • vLLM Version: v0.12.0 (Docker image vllm/vllm-openai:v0.12.0)

  1. Start the vLLM server with a gpt-oss model (e.g., gpt-oss-20b):

    docker run --rm --gpus all \
      -e VLLM_LOGGING_LEVEL=DEBUG \
      -p 8000:8000 \
      -v /path/to/gpt-oss-20b:/app/model \
      vllm/vllm-openai:v0.12.0 \
      --model /app/model \
      --served-model-name vllm/gpt-oss-20b \
      --enable-log-requests
  2. Send a chat completion request with a custom model_identity in chat_template_kwargs:

    curl --location 'http://localhost:8000/v1/chat/completions' \
    --header 'Content-Type: application/json' \
    --data '{
        "model": "vllm/gpt-oss-20b",
        "messages": [
            {
                "role": "user",
                "content": "Who are you?"
            }
        ],
        "chat_template_kwargs":{
            "model_identity":"I am a custom AI assistant."
        },
        "temperature": 0
    }'
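
The same request can also be built from Python using only the standard library. This is a sketch under the assumptions of the curl command above (server at http://localhost:8000, served model name vllm/gpt-oss-20b); build_chat_request is a hypothetical helper name, and the send step is left commented out because it needs the server from step 1.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content, model_identity):
    """Build a POST request carrying a custom model_identity in chat_template_kwargs."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "chat_template_kwargs": {"model_identity": model_identity},
        "temperature": 0,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",
    "vllm/gpt-oss-20b",
    "Who are you?",
    "I am a custom AI assistant.",
)

# Sending requires a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```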

Test Result

Before Fix:
The model ignores the model_identity and responds with the default identity.

{
    "object": "chat.completion",
    "created": 1765981514,
    "model": "vllm/gpt-oss-20b",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I’m ChatGPT, a large language model created by OpenAI. I’m designed to understand and generate natural language, so I can help answer questions, brainstorm ideas, explain concepts, draft text, and more. I don’t have personal experiences or feelings—just a lot of patterns learned from text. How can I assist you today?",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning": "The user asks \"Who are you?\" We need to respond as ChatGPT. The user might want a brief introduction. We should mention that we are ChatGPT, a large language model trained by OpenAI, etc. Also mention that we can help with many tasks. We should keep it concise.",
                "reasoning_content": "The user asks \"Who are you?\" We need to respond as ChatGPT. The user might want a brief introduction. We should mention that we are ChatGPT, a large language model trained by OpenAI, etc. Also mention that we can help with many tasks. We should keep it concise."
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null,
            "token_ids": null
        }
    ],
    "usage": {
        "prompt_tokens": 73,
        "total_tokens": 211,
        "completion_tokens": 138,
        "prompt_tokens_details": null
    }
}

After Fix:
The model correctly adopts the model_identity provided in chat_template_kwargs.

{
    "id": "chatcmpl-9fb8cfe7a67f083a",
    "object": "chat.completion",
    "created": 1765981838,
    "model": "vllm/gpt-oss-20b",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I’m your custom AI assistant—an advanced language model built on OpenAI’s GPT‑4 architecture, fine‑tuned to help you with information, ideas, and tasks. I’m here to answer questions, offer suggestions, and chat about almost anything you’re curious about. Just let me know what you need!",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning": "We need to respond as the custom AI assistant. The user asks \"Who are you?\" We should answer in a friendly manner, describing ourselves as a custom AI assistant. Probably mention that we are a language model trained by OpenAI, but customized. We should keep it concise.",
                "reasoning_content": "We need to respond as the custom AI assistant. The user asks \"Who are you?\" We should answer in a friendly manner, describing ourselves as a custom AI assistant. Probably mention that we are a language model trained by OpenAI, but customized. We should keep it concise."
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null,
            "token_ids": null
        }
    ],
    "usage": {
        "prompt_tokens": 66,
        "total_tokens": 196,
        "completion_tokens": 130,
        "prompt_tokens_details": null
    }
}

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: seunggil1 <ksgg1navercom@gmail.com>
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to pass chat_template_kwargs to get_system_message for gpt-oss models. However, the current implementation introduces a critical bug that will cause a TypeError at runtime. The get_system_message function does not accept arbitrary keyword arguments, and the proposed change can lead to passing unexpected or duplicate arguments. I've provided a comment with a suggested fix to make this change safe.
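
The suggested fix itself is not quoted in this thread. One common way to guard against both failure modes (unexpected keys and duplicated arguments) is to filter the user-supplied kwargs against the callee's signature. The sketch below assumes a simplified get_system_message signature; filter_supported_kwargs is a hypothetical helper, not part of vLLM.

```python
import inspect


def filter_supported_kwargs(func, kwargs, reserved=()):
    """Keep keys that func accepts by name and that the caller does not already pass."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in keyword_kinds
    }
    return {
        key: value
        for key, value in (kwargs or {}).items()
        if key in accepted and key not in reserved
    }


def get_system_message(model_identity=None, reasoning_effort=None):
    """Stand-in with an assumed signature; not vLLM's real function."""
    return {"model_identity": model_identity, "reasoning_effort": reasoning_effort}


user_kwargs = {
    "model_identity": "I am a custom AI assistant.",
    "reasoning_effort": "low",  # would collide with the explicit argument below
    "unknown": 1,               # would raise TypeError if forwarded
}
safe = filter_supported_kwargs(
    get_system_message, user_kwargs, reserved={"reasoning_effort"}
)
message = get_system_message(reasoning_effort="high", **safe)
```

Filtering silently drops unsupported keys; rejecting them with a validation error is an equally reasonable design if callers should be told about typos.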

@cjackal
Contributor

cjackal commented Dec 17, 2025

Seems like a duplicate of #30247

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: seung <38664481+seunggil1@users.noreply.github.com>
@mergify

mergify bot commented Dec 17, 2025

Hi @seunggil1, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Signed-off-by: seunggil1 <ksgg1navercom@gmail.com>
@seunggil1
Author

Seems like a duplicate of #30247

I missed that PR—it wasn't there when I first checked, so it must be recent. Thanks for pointing it out!

Since the objective is the same, I'll take a quick look at the implementation differences. If I find that the approaches are identical or if this PR is redundant, I'll go ahead and close it.

Thanks again!

@mergify

mergify bot commented Dec 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @seunggil1.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


Labels

frontend, gpt-oss (Related to GPT-OSS models), needs-rebase

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

  • [gpt-oss]: Ability to set model_identity dynamically which is used in building the system prompt
  • [Bug]: Is the chat template for gpt-oss hard-coded?

2 participants