fix: reject non-text content in system/developer messages by veeceey · Pull Request #33981 · vllm-project/vllm

veeceey · 2026-02-06T09:21:02Z

Summary

Per the OpenAI API specification, system and developer role messages should only accept text content type. Previously, vLLM allowed multimodal content (e.g. image_url, input_audio, video_url) in system messages without any validation, which diverges from the OpenAI API behavior.

Changes

vllm/entrypoints/chat_utils.py: Added a _validate_text_only_content() function that checks content parts for system/developer messages and raises a ValueError when non-text content types (e.g. image_url, input_audio, video_url) are found. The validation runs inside _parse_chat_message_content() before content parts are parsed, ensuring both sync and async code paths are covered. The ValueError is caught by the serving layer's existing error handling and returned as a proper error response.
tests/entrypoints/test_chat_utils.py: Added parametrized tests covering:
- Rejection of image_url, input_audio, and video_url content in both system and developer roles
- Acceptance of text content (both list-of-parts and plain string forms) for system/developer roles
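For readers who want a concrete picture of the pre-scan described above, here is a minimal standalone sketch. It is not the PR's code: the function signature and the contents of `_TEXT_CONTENT_TYPES` and `_TEXT_ONLY_ROLES` are assumptions made for illustration, and only the error wording follows the diff quoted in the review further down.

```python
# Assumed values; the PR's actual constants are not shown in this thread.
_TEXT_CONTENT_TYPES = frozenset({"text"})
_TEXT_ONLY_ROLES = frozenset({"system", "developer"})


def validate_text_only_content(role, content):
    """Raise ValueError if a system/developer message has a non-text part."""
    if role not in _TEXT_ONLY_ROLES or isinstance(content, str):
        # Other roles are unrestricted; a plain string is always text.
        return
    for part in content:
        if isinstance(part, str):
            continue
        part_type = part.get("type")
        # Only an explicit, non-text "type" is rejected in this version.
        if part_type is not None and part_type not in _TEXT_CONTENT_TYPES:
            raise ValueError(
                f"Content part type '{part_type}' is not supported "
                f"in '{role}' messages. Only text content is accepted "
                f"for '{role}' role messages."
            )
```

Running the check before any part is parsed means a rejected request fails early, without touching the multimodal parsing path.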

Test plan

test_system_message_rejects_non_text_content -- verifies ValueError is raised for image_url, input_audio, video_url in system/developer messages
test_system_message_accepts_text_content -- verifies text content parts are accepted
test_system_message_accepts_string_content -- verifies plain string content is accepted
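The shape of this test plan can be sketched as a table-driven check. The PR describes its tests as parametrized (presumably with pytest); the version below uses a plain loop so it runs without extra dependencies. `validate` is a hypothetical stand-in for the function under test, not the vLLM implementation.

```python
import itertools

TEXT_ONLY_ROLES = ("system", "developer")
NON_TEXT_TYPES = ("image_url", "input_audio", "video_url")


def validate(role, content):
    """Hypothetical stand-in for the validation under test."""
    if role not in TEXT_ONLY_ROLES or isinstance(content, str):
        return
    for part in content:
        if isinstance(part, dict) and part.get("type", "text") != "text":
            raise ValueError(
                f"'{part['type']}' is not supported in '{role}' messages"
            )


def is_rejected(role, content):
    """Return True if validation raises ValueError for this input."""
    try:
        validate(role, content)
    except ValueError:
        return True
    return False


def run_cases():
    """Return the names of failing cases (an empty list means all passed)."""
    failures = []
    # Every non-text type must be rejected for every text-only role.
    for role, part_type in itertools.product(TEXT_ONLY_ROLES, NON_TEXT_TYPES):
        if not is_rejected(role, [{"type": part_type, part_type: {}}]):
            failures.append(f"reject {part_type} in {role}")
    # Text parts and plain strings must be accepted.
    for role in TEXT_ONLY_ROLES:
        if is_rejected(role, [{"type": "text", "text": "Be concise."}]):
            failures.append(f"accept text part in {role}")
        if is_rejected(role, "Be concise."):
            failures.append(f"accept plain string in {role}")
    return failures
```

Crossing roles with content types keeps the case list short while still covering each role and type pair once.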

mergify · 2026-02-06T09:29:10Z

Hi @veeceey, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

gemini-code-assist

Code Review

This pull request correctly identifies and addresses a deviation from the OpenAI API specification by restricting system and developer messages to text-only content. The implementation adds a new validation function and comprehensive tests to ensure compliance. While the approach is sound, I've found a critical flaw in the validation logic. It doesn't account for simplified multimodal content formats (e.g., without an explicit type key), which allows the validation to be bypassed. My review includes a suggested fix to make the validation more robust by leveraging existing parsing logic.

gemini-code-assist · 2026-02-06T09:36:47Z

vllm/entrypoints/chat_utils.py

+    for part in content:
+        if isinstance(part, str):
+            continue
+        part_type = part.get("type")
+        if part_type is not None and part_type not in _TEXT_CONTENT_TYPES:
+            raise ValueError(
+                f"Content part type '{part_type}' is not supported "
+                f"in '{role}' messages. Only text content is accepted "
+                f"for '{role}' role messages."
+            )


The validation logic here is incomplete as it only checks for an explicit type key in a content part. This allows non-text content to be accepted if specified in a simplified format (e.g., {"image_url": "..."}) or when a uuid is present, which alters type inference. This bypasses the intended validation, potentially leading to unexpected behavior.

To ensure the validation is robust, it should infer the content type using the same logic as _parse_chat_message_content_mm_part. Reusing this parsing logic for type inference will make the validation more accurate and prevent this bypass.

    for part in content:
        if isinstance(part, str):
            continue
        # We must use the same part type inference logic from
        # `_parse_chat_message_content_mm_part` to correctly validate
        # all possible input formats. This includes simplified formats where
        # the 'type' key is omitted, or where a 'uuid' can override the type.
        try:
            part_type, _ = _parse_chat_message_content_mm_part(part)
        except ValueError:
            # If the part is malformed, let the main parsing logic handle it.
            # For this validation, we can assume it's not a non-text part.
            continue
        if part_type not in _TEXT_CONTENT_TYPES:
            raise ValueError(
                f"Content part type '{part_type}' is not supported "
                f"in '{role}' messages. Only text content is accepted "
                f"for '{role}' role messages."
            )

Good catch! Fixed. The validation now also checks for the presence of known multimodal dict keys (image_url, audio_url, video_url, input_audio, image_pil, image_embeds, audio_embeds) in addition to the explicit type field. This means content like {"image_url": "..."} without a type field will now be correctly rejected for system/developer roles.

I added a _MULTIMODAL_CONTENT_KEYS frozenset and the inline check uses set(part.keys()) & _MULTIMODAL_CONTENT_KEYS to detect these cases. New tests have been added to cover all the no-type-key scenarios.
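A minimal sketch of the key-intersection check described in this reply might look like the following. The key names are the ones listed above; the helper name `non_text_label` and the contents of `_TEXT_CONTENT_TYPES` are assumptions for illustration, not taken from the PR.

```python
# Key names as listed in the reply above.
_MULTIMODAL_CONTENT_KEYS = frozenset({
    "image_url", "audio_url", "video_url", "input_audio",
    "image_pil", "image_embeds", "audio_embeds",
})
_TEXT_CONTENT_TYPES = frozenset({"text"})  # assumed contents


def non_text_label(part):
    """Return a label for a non-text part, or None if it looks like text.

    Hypothetical helper: checks the explicit "type" first, then falls back
    to looking for known multimodal keys in the dict.
    """
    if isinstance(part, str):
        return None
    part_type = part.get("type")
    if part_type is not None and part_type not in _TEXT_CONTENT_TYPES:
        return part_type
    found = set(part.keys()) & _MULTIMODAL_CONTENT_KEYS
    if found:
        # Sort so the reported key is deterministic when several match.
        return sorted(found)[0]
    return None
```

Note that the key check also runs when `type` is `"text"`, so a part such as `{"type": "text", "image_url": "..."}` is still flagged in this sketch. Whether the PR behaves the same way is not stated in the thread.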

chaunceyjiang · 2026-02-06T09:47:56Z

vllm/entrypoints/chat_utils.py

+
+    See: https://platform.openai.com/docs/api-reference/chat/create
+    """
+    for part in content:


Could we move this into another existing for-loop to avoid introducing an extra loop?

Done! I've removed the separate _validate_text_only_content() function and its dedicated loop entirely. The validation now happens inline inside _parse_chat_message_content_part(), which is already called for each part in the existing for-loop in _parse_chat_message_content_parts(). This avoids introducing an extra iteration over the content parts.

The role parameter is now passed through so the per-part function can check text-only constraints before proceeding with multimodal parsing.
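The restructuring described here, threading `role` into the per-part function so that validation shares the existing loop, can be sketched roughly as below. `parse_part` and `parse_parts` are simplified, hypothetical stand-ins for `_parse_chat_message_content_part()` and `_parse_chat_message_content_parts()`; the real functions do considerably more, and inference of the type from dict keys is omitted here.

```python
_TEXT_ONLY_ROLES = frozenset({"system", "developer"})  # assumed


def parse_part(part, role):
    """Hypothetical per-part parser: validate first, then parse."""
    if isinstance(part, str):
        return {"type": "text", "text": part}
    # Simplification: a missing "type" is treated as text.
    part_type = part.get("type", "text")
    if role in _TEXT_ONLY_ROLES and part_type != "text":
        raise ValueError(
            f"Content part type '{part_type}' is not supported "
            f"in '{role}' messages. Only text content is accepted "
            f"for '{role}' role messages."
        )
    return dict(part)


def parse_parts(role, parts):
    """Single pass over the parts: no separate validation loop."""
    return [parse_part(part, role) for part in parts]
```

One consequence of validating inline is that parts preceding the offending one have already been parsed by the time the error is raised, which is harmless here because the whole request is rejected.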

Address reviewer feedback on PR vllm-project#33981:

1. Merge the separate `_validate_text_only_content()` pre-scan loop into the existing per-part loop inside `_parse_chat_message_content_part()`, eliminating the extra iteration over content parts.
2. Detect multimodal content even when the `type` key is absent by checking for known multimodal dict keys (image_url, audio_url, video_url, input_audio, image_pil, image_embeds, audio_embeds). This closes the gap where `{"image_url": "..."}` (without a `type` field) would bypass the validation.

Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Signed-off-by: veeceey <veeceey@users.noreply.github.com>

veeceey · 2026-02-06T11:55:34Z

Manual test results for system/developer message validation

Ran 22 manual tests against the validation logic in vllm/entrypoints/chat_utils.py (extracted and tested directly since torch/GPU deps aren't available locally). All constants and logic verified against the actual source.

Results: 22 passed, 0 failed, 22 total
============================================================
  [PASS] reject image_url in system (explicit type)
  [PASS] reject image_url in developer (explicit type)
  [PASS] reject input_audio in system (explicit type)
  [PASS] reject input_audio in developer (explicit type)
  [PASS] reject video_url in system (explicit type)
  [PASS] reject video_url in developer (explicit type)
  [PASS] reject image_url in system (no type key)
  [PASS] reject audio_url in system (no type key)
  [PASS] reject video_url in system (no type key)
  [PASS] reject input_audio in system (no type key)
  [PASS] reject image_url in developer (no type key)
  [PASS] reject audio_url in developer (no type key)
  [PASS] reject video_url in developer (no type key)
  [PASS] reject input_audio in developer (no type key)
  [PASS] accept text part in system
  [PASS] accept text part in developer
  [PASS] accept plain string in system
  [PASS] accept plain string in developer
  [PASS] allow image_url in user role
  [PASS] allow input_audio in user role
  [PASS] allow image_url in user role (no type key)
  [PASS] error message format is correct

What was tested:

Multimodal content (image_url, input_audio, video_url, audio_url) correctly rejected in both system and developer roles
Both with explicit type field and without (just the multimodal key present)
Text content (structured {"type": "text", ...} and plain string) correctly accepted for system/developer
User role still allows multimodal content (no regression)
Error message format matches expected pattern

Looks good.

veeceey · 2026-02-10T07:39:56Z

Hi @chaunceyjiang, friendly ping — I've addressed your feedback by moving the validation into the existing for-loop (no extra loop introduced) and also handling inferred multimodal types. All CI checks are passing. Would you be able to take another look? Thank you!

Per the OpenAI API spec, system and developer messages only accept text content. Add validation inside the existing per-part parsing loop to reject multimodal content (image_url, audio_url, video_url, input_audio, etc.) for these roles. Handles both explicit type fields and inferred types from dict keys, preventing bypasses via simplified multimodal formats.

Fixes vllm-project#33925

Signed-off-by: Varun Chawla <varun_6april@hotmail.com>

DarkLight1337 · 2026-02-17T02:11:35Z

I thought we agreed in #34072 that we make this a warning only.

veeceey · 2026-02-18T08:34:35Z

Thanks @DarkLight1337, you're right! I'll update this PR to use a warning instead of raising an error, consistent with what we settled on in #34072. Will push the update shortly.

Instead of raising a ValueError when system/developer messages contain non-text content, issue a logger.warning and skip the part. This is consistent with the decision in vllm-project#34072.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

veeceey · 2026-02-20T05:58:49Z

Good catch — changed the validation to issue a warning instead of raising an error, consistent with the decision in #34072.

DarkLight1337 · 2026-02-20T06:01:17Z

vllm/entrypoints/chat_utils.py

+                "for '%s' role messages. Skipping this content part.",
+                label, role, role,
+            )
+            return None


We should still allow it. But we can log a warning that it is outside of OpenAI spec.

So I think this PR isn't really needed as #34072 does that already
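To make the distinction in this review comment concrete, a warn-but-allow check could look roughly like the sketch below. This is not the code from #34072, which is not shown in this thread; the function name, role set, and log wording are all assumptions. Unlike the diff quoted above, it never returns `None` to drop the part.

```python
import logging

logger = logging.getLogger(__name__)

_TEXT_ONLY_ROLES = frozenset({"system", "developer"})  # assumed


def warn_if_outside_spec(part, role):
    """Log a warning for non-text parts in text-only roles; never reject.

    Returns True if a warning was logged. The caller continues parsing the
    part either way.
    """
    if role not in _TEXT_ONLY_ROLES or isinstance(part, str):
        return False
    part_type = part.get("type")
    if part_type is None or part_type == "text":
        return False
    logger.warning(
        "Content part type '%s' in '%s' messages is outside the OpenAI "
        "API spec; accepting it anyway.",
        part_type,
        role,
    )
    return True
```

Keeping the part preserves behaviour for existing users who already send multimodal system messages, while the warning still flags the divergence from the OpenAI spec.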

mergify · 2026-02-20T06:03:12Z

Hi @veeceey, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Add missing blank lines after import statements and split long function arguments across multiple lines per project style guide.

Signed-off-by: Varun Chawla <varun_6april@hotmail.com>

veeceey · 2026-02-20T06:19:00Z

Thanks @DarkLight1337 — you're right that #34072 covers the warning-only approach. I've just pushed a commit that fixes the pre-commit formatting issues (missing blank lines after import and splitting long args).

However, if #34072 already fully handles this, I'm happy to close this PR. Could you confirm whether #34072 covers the same validation paths (both explicit type field and inferred multimodal keys like {"image_url": "..."}), or if there's still value in keeping the more comprehensive detection from this PR? If it's fully redundant, I'll close this out.

veeceey · 2026-02-20T06:33:14Z

Thanks @DarkLight1337 for confirming. Since #34072 covers the same validation with the warning-only approach, I'll go ahead and close this PR to avoid duplication. Appreciate the guidance!

veeceey · 2026-02-20T06:34:28Z

Closing as this is superseded by #34072. Thanks @DarkLight1337 for confirming!

veeceey requested review from DarkLight1337, NickLucche, aarnphm, chaunceyjiang and robertgshaw2-redhat as code owners February 6, 2026 09:21

mergify bot added the frontend label Feb 6, 2026

veeceey force-pushed the fix/issue-33925-system-message-validation branch from 6de61df to cebb571 Compare February 6, 2026 09:24

gemini-code-assist bot reviewed Feb 6, 2026

chaunceyjiang reviewed Feb 6, 2026

veeceey force-pushed the fix/issue-33925-system-message-validation branch from 0616598 to 9d8cabc Compare February 6, 2026 10:19

veeceey force-pushed the fix/issue-33925-system-message-validation branch from 9d8cabc to d1831c1 Compare February 16, 2026 19:30

DarkLight1337 reviewed Feb 20, 2026

style: fix pre-commit formatting issues

584013b


veeceey closed this Feb 20, 2026
