[Bugfix][Structured Output] Fix structural_tag bitmask not applied on reasoning models by CatherineSue · Pull Request #37388 · vllm-project/vllm

CatherineSue · 2026-03-18T05:06:14Z

Purpose

Fix structural_tag constraints being silently ignored on reasoning models (e.g., gpt-oss with openai_gptoss reasoning parser).

When a reasoning model uses a structural_tag constraint (e.g., triggered_tags format), should_fill_bitmask() and should_advance() in StructuredOutputManager return False during the reasoning phase. This happens because reasoning_ended is computed once from prompt tokens (which do not contain the reasoning end sequence) and never updated during generation. As a result, the grammar bitmask is never applied, the grammar state is never advanced, triggers never fire, and the constraint is completely ignored.

structural_tag grammars handle reasoning/content boundaries internally via triggers — the grammar itself knows when to allow free text (reasoning) and when to constrain output. The external reasoning_ended gate prevents this from working.

Fix: Add an early return in both should_fill_bitmask() and should_advance() when the request uses a structural_tag constraint, bypassing the reasoning_ended gate. Other constraint types (json_schema, regex, etc.) are unaffected and keep the existing behavior.
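
The gate described above can be sketched in isolation. This is a minimal illustration, not the actual vLLM code: `ConstraintKind`, `RequestState`, and the free-function form of `should_fill_bitmask` / `should_advance` are hypothetical stand-ins, and the real methods on StructuredOutputManager take different arguments.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ConstraintKind(Enum):
    # Hypothetical enumeration of constraint types, for illustration only.
    JSON_SCHEMA = auto()
    REGEX = auto()
    STRUCTURAL_TAG = auto()


@dataclass
class RequestState:
    # Hypothetical stand-in for per-request structured-output state.
    kind: ConstraintKind
    # Computed once from the prompt; per the description it stays False
    # for the whole generation on reasoning models.
    reasoning_ended: bool = False


def should_fill_bitmask(state: RequestState, has_reasoner: bool = True) -> bool:
    """Decide whether the grammar bitmask is applied to this request."""
    if not has_reasoner:
        return True
    # Proposed early return: structural_tag handles the reasoning/content
    # boundary through its own triggers, so skip the external gate.
    if state.kind is ConstraintKind.STRUCTURAL_TAG:
        return True
    return state.reasoning_ended


def should_advance(state: RequestState, has_reasoner: bool = True) -> bool:
    """Decide whether the grammar state is advanced with generated tokens."""
    if not has_reasoner:
        return True
    if state.kind is ConstraintKind.STRUCTURAL_TAG:
        return True
    return state.reasoning_ended
```

With this sketch, a structural_tag request passes both checks while `reasoning_ended` is still False, whereas a json_schema request keeps waiting for the gate.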

Test Plan

Tested with gpt-oss-120b (tp=4) via vllm serve HTTP Chat Completions with a structural_tag response_format:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/raid/models/openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "List exactly 2 fruits."}],
    "response_format": {
      "type": "structural_tag",
      "format": {
        "type": "triggered_tags",
        "triggers": ["<|channel|>analysis", "<|channel|>final"],
        "tags": [
          {"begin": "<|channel|>analysis<|message|>", "content": {"type": "any_text"}, "end": "<|end|>"},
          {"begin": "<|channel|>final<|constrain|>json<|message|>", "content": {"type": "json_schema", "json_schema": {"type": "object", "properties": {"items": {"type": "array", "items": {"type": "string"}, "minItems": 2, "maxItems": 2}}, "required": ["items"], "additionalProperties": false}}, "end": ""}
        ],
        "at_least_one": true,
        "stop_after_first": false
      }
    },
    "temperature": 0
  }'
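
For scripted runs, the same request can be built in Python. This is a sketch using only the standard library; the URL, model path, and payload are copied from the curl command above, while the helper names `build_payload` and `send` are made up for illustration.

```python
import json
import urllib.request

URL = "http://localhost:8080/v1/chat/completions"
MODEL = "/raid/models/openai/gpt-oss-120b"

# JSON schema for the final channel: an object with exactly two string items.
ITEMS_SCHEMA = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 2,
            "maxItems": 2,
        }
    },
    "required": ["items"],
    "additionalProperties": False,
}


def build_payload() -> dict:
    """Return the Chat Completions body used in the test plan."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": "List exactly 2 fruits."}],
        "response_format": {
            "type": "structural_tag",
            "format": {
                "type": "triggered_tags",
                "triggers": ["<|channel|>analysis", "<|channel|>final"],
                "tags": [
                    {
                        "begin": "<|channel|>analysis<|message|>",
                        "content": {"type": "any_text"},
                        "end": "<|end|>",
                    },
                    {
                        "begin": "<|channel|>final<|constrain|>json<|message|>",
                        "content": {"type": "json_schema", "json_schema": ITEMS_SCHEMA},
                        "end": "",
                    },
                ],
                "at_least_one": True,
                "stop_after_first": False,
            },
        },
        "temperature": 0,
    }


def send(payload: dict, url: str = URL) -> dict:
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_payload())` against a running server returns the decoded response; nothing is sent until `send` is called.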

Test Result

Before — structural_tag trigger never fires, Harmony markers leak into content:

{
    "message": {
        "content": "<|channel|>final<|constrain|>json<|message|>{\"items\":[\"Apple\",\"Banana\"]}",
        "reasoning": "The user asks: \"List exactly 2 fruits.\" So we need to output exactly two fruit names..."
    },
    "finish_reason": "stop",
    "usage": {"completion_tokens": 98}
}

After — grammar bitmask applied, trigger fires, clean constrained JSON:

{
    "message": {
        "content": "{\"items\":[\"Apple\",\"Banana\"]}",
        "reasoning": "The user asks: \"List exactly 2 fruits.\" So we need to output exactly two fruit names..."
    },
    "finish_reason": "stop",
    "usage": {"completion_tokens": 78}
}
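
The before/after difference can be checked mechanically. The sketch below is an illustrative check, not part of the PR: the marker list and the helper name `is_clean_constrained` are assumptions, and the two sample strings are the `content` values shown above.

```python
import json

# Harmony markers that should not appear in the final content field.
HARMONY_MARKERS = ("<|channel|>", "<|constrain|>", "<|message|>", "<|end|>")


def is_clean_constrained(content: str) -> bool:
    """True if content is bare JSON of the form {"items": [str, str]}."""
    if any(marker in content for marker in HARMONY_MARKERS):
        return False
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or set(data) != {"items"}:
        return False
    items = data["items"]
    return (
        isinstance(items, list)
        and len(items) == 2
        and all(isinstance(item, str) for item in items)
    )


before = '<|channel|>final<|constrain|>json<|message|>{"items":["Apple","Banana"]}'
after = '{"items":["Apple","Banana"]}'
```

Here `is_clean_constrained(before)` is False because the markers leak into the content, and `is_clean_constrained(after)` is True.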

Essential Elements of an Effective PR Description Checklist

The purpose of the PR: Fix structural_tag constraint silently ignored on reasoning models
The test plan: Tested with gpt-oss-120b via HTTP Chat Completions with structural_tag response_format
The test results: Before/after comparison with token count difference
(Optional) Documentation update: N/A
(Optional) Release notes: N/A

gemini-code-assist

Code Review

The pull request effectively addresses the bug where structural_tag constraints were being ignored in reasoning models. The changes correctly bypass the reasoning_ended gate for structural_tag grammars in both should_fill_bitmask and should_advance methods, ensuring that the grammar bitmask is always applied and the grammar state is advanced as intended. The inline comments clearly explain the rationale behind these changes. The provided test plan and results confirm the fix and demonstrate the intended behavior, including the reduction in completion tokens due to correct constraint application.

…okens on reasoning models

When a reasoning model (e.g., gpt-oss with openai_gptoss parser) uses a structural_tag constraint, should_advance() returns False during the reasoning phase because reasoning_ended is never set to True from prompt tokens alone. This prevents the grammar from tracking generated tokens, so it never sees the trigger sequence and the constraint silently fails.

structural_tag grammars (triggered_tags format) need to track all tokens from the start to maintain trigger state. For example, gpt-oss triggers on <|channel|>final, which comes after <|channel|>analysis...reasoning... <|end|><|start|>assistant; the grammar must have seen all preceding tokens to know it should fire.

Add an early return in should_advance() when the request uses a structural_tag constraint, so the grammar tracks tokens during reasoning without constraining them. should_fill_bitmask() is left unchanged: the grammar only constrains output after reasoning_ended is set, keeping reasoning tokens free.

Signed-off-by: Chang Su <chang.s.su@oracle.com>
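
Why the grammar must be advanced during reasoning can be shown with a toy model. This is only a sketch: `TriggerTracker` and `run` are hypothetical, match on plain text rather than token IDs, and do not reflect how the real grammar backend detects triggers.

```python
from typing import Iterable, Optional


class TriggerTracker:
    """Toy trigger matcher: fires when the text seen so far ends with a trigger."""

    def __init__(self, triggers: Iterable[str]) -> None:
        self.triggers = tuple(triggers)
        self.seen = ""
        self.fired: Optional[str] = None

    def advance(self, token_text: str) -> None:
        # Accumulate every token so triggers spanning several tokens match.
        self.seen += token_text
        if self.fired is None:
            for trigger in self.triggers:
                if self.seen.endswith(trigger):
                    self.fired = trigger
                    break


def run(tokens, triggers, advance_during_reasoning: bool) -> Optional[str]:
    """Feed tokens to the tracker, optionally gated on reasoning_ended."""
    tracker = TriggerTracker(triggers)
    reasoning_ended = False  # assumed never updated, as the message describes
    for token in tokens:
        if advance_during_reasoning or reasoning_ended:
            tracker.advance(token)
    return tracker.fired


TOKENS = [
    "<|channel|>", "analysis", "<|message|>", "thinking", "<|end|>",
    "<|start|>", "assistant", "<|channel|>", "final",
]
```

With the gate in place (`advance_during_reasoning=False`) the tracker never receives a token, so `run(TOKENS, ["<|channel|>final"], False)` returns None; with the bypass it returns the trigger string.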

CatherineSue · 2026-03-18T05:57:30Z

I realized that I can use enable_in_reasoning: True. So closing this PR now.

CatherineSue requested review from aarnphm, benchislett, mgoin and russellb as code owners March 18, 2026 05:06

mergify bot added the structured-output, v1, and bug labels Mar 18, 2026

github-project-automation bot added this to Structured Output Mar 18, 2026

gemini-code-assist bot reviewed Mar 18, 2026

CatherineSue force-pushed the fix/structural-tag-reasoning-bitmask branch from 47880d5 to f370be6 Compare March 18, 2026 05:41

CatherineSue force-pushed the fix/structural-tag-reasoning-bitmask branch from f370be6 to d461e4c Compare March 18, 2026 05:42

CatherineSue closed this Mar 18, 2026

github-project-automation bot moved this to Done in Structured Output Mar 18, 2026

will-deines mentioned this pull request Mar 18, 2026

[Responses API] Structured output + reasoning via structural tag embedding #35904

Open

6 tasks

CatherineSue wants to merge 1 commit into vllm-project:main from CatherineSue:fix/structural-tag-reasoning-bitmask
