
[Bug]: CI flake - v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output #15944

@chaunceyjiang

Description

Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

🐛 Describe the bug


[2025-04-02T06:06:31Z] =================================== FAILURES ===================================
[2025-04-02T06:06:31Z] _ test_structured_output[mistralai/Ministral-8B-Instruct-2410-guidance:disable-any-whitespace-auto] _
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7f6c3b4dcfb0>
[2025-04-02T06:06:31Z] sample_json_schema = {'properties': {'age': {'type': 'integer'}, 'name': {'type': 'string'}, 'skills': {'items': {'type': 'string'}, 'type'...ition'], 'type': 'object'}, 'type': 'array'}}, 'required': ['name', 'age', 'skills', 'work_history'], 'type': 'object'}
[2025-04-02T06:06:31Z] unsupported_json_schema = {'properties': {'email': {'pattern': '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$', 'type': 'string'}, 'grade': ...[a-z]{1,10}$', 'type': 'string'}, 'type': 'array'}}, 'required': ['score', 'grade', 'email', 'tags'], 'type': 'object'}
[2025-04-02T06:06:31Z] sample_sql_ebnf = '\nroot ::= select_statement\nselect_statement ::= "SELECT" column "from" table "where" condition\ncolumn ::= "col_1" | "col_2"\ntable ::= "table_1" | "table_2"\ncondition ::= column "=" number\nnumber ::= "1" | "2"\n'
[2025-04-02T06:06:31Z] sample_sql_lark = '\nstart: select_statement\nselect_statement: "SELECT" column "from" table "where" condition\ncolumn: "col_1" | "col_2"\ntable: "table_1" | "table_2"\ncondition: column "=" number\nnumber: "1" | "2"\n'
[2025-04-02T06:06:31Z] sample_regex = '((25[0-5]|(2[0-4]|1\\d|[1-9]|)\\d)\\.){3}(25[0-5]|(2[0-4]|1\\d|[1-9]|)\\d)'
[2025-04-02T06:06:31Z] sample_guided_choice = ['Python', 'Java', 'JavaScript', 'C++', 'C#', 'PHP', ...]
[2025-04-02T06:06:31Z] guided_decoding_backend = 'guidance:disable-any-whitespace'
[2025-04-02T06:06:31Z] tokenizer_mode = 'auto', model_name = 'mistralai/Ministral-8B-Instruct-2410'
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]     @pytest.mark.skip_global_cleanup
[2025-04-02T06:06:31Z]     @pytest.mark.parametrize("model_name, guided_decoding_backend, tokenizer_mode",
[2025-04-02T06:06:31Z]                              PARAMS_MODELS_BACKENDS_TOKENIZER_MODE)
[2025-04-02T06:06:31Z]     def test_structured_output(
[2025-04-02T06:06:31Z]         monkeypatch: pytest.MonkeyPatch,
[2025-04-02T06:06:31Z]         sample_json_schema: dict[str, Any],
[2025-04-02T06:06:31Z]         unsupported_json_schema: dict[str, Any],
[2025-04-02T06:06:31Z]         sample_sql_ebnf: str,
[2025-04-02T06:06:31Z]         sample_sql_lark: str,
[2025-04-02T06:06:31Z]         sample_regex: str,
[2025-04-02T06:06:31Z]         sample_guided_choice: str,
[2025-04-02T06:06:31Z]         guided_decoding_backend: str,
[2025-04-02T06:06:31Z]         tokenizer_mode: str,
[2025-04-02T06:06:31Z]         model_name: str,
[2025-04-02T06:06:31Z]     ):
[2025-04-02T06:06:31Z]         monkeypatch.setenv("VLLM_USE_V1", "1")
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         # Use a single LLM instance for several scenarios to
[2025-04-02T06:06:31Z]         # speed up the test suite.
[2025-04-02T06:06:31Z]         llm = LLM(model=model_name,
[2025-04-02T06:06:31Z]                   enforce_eager=True,
[2025-04-02T06:06:31Z]                   max_model_len=1024,
[2025-04-02T06:06:31Z]                   guided_decoding_backend=guided_decoding_backend,
[2025-04-02T06:06:31Z]                   tokenizer_mode=tokenizer_mode)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         #
[2025-04-02T06:06:31Z]         # Test 1: Generate JSON output based on a provided schema
[2025-04-02T06:06:31Z]         #
[2025-04-02T06:06:31Z]         sampling_params = SamplingParams(
[2025-04-02T06:06:31Z]             temperature=1.0,
[2025-04-02T06:06:31Z]             max_tokens=1000,
[2025-04-02T06:06:31Z]             guided_decoding=GuidedDecodingParams(json=sample_json_schema))
[2025-04-02T06:06:31Z]         outputs = llm.generate(prompts=[
[2025-04-02T06:06:31Z]             f"Give an example JSON for an employee profile "
[2025-04-02T06:06:31Z]             f"that fits this schema: {sample_json_schema}"
[2025-04-02T06:06:31Z]         ] * 2,
[2025-04-02T06:06:31Z]                                sampling_params=sampling_params,
[2025-04-02T06:06:31Z]                                use_tqdm=True)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         assert outputs is not None
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         for output in outputs:
[2025-04-02T06:06:31Z]             assert output is not None
[2025-04-02T06:06:31Z]             assert isinstance(output, RequestOutput)
[2025-04-02T06:06:31Z]             prompt = output.prompt
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]             generated_text = output.outputs[0].text
[2025-04-02T06:06:31Z]             assert generated_text is not None
[2025-04-02T06:06:31Z]             if 'disable-any-whitespace' in guided_decoding_backend:
[2025-04-02T06:06:31Z]                 assert "\n" not in generated_text
[2025-04-02T06:06:31Z]             print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
[2025-04-02T06:06:31Z] >           output_json = json.loads(generated_text)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] v1/entrypoints/llm/test_struct_output_generate.py:100:
[2025-04-02T06:06:31Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/__init__.py:346: in loads
[2025-04-02T06:06:31Z]     return _default_decoder.decode(s)
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/decoder.py:338: in decode
[2025-04-02T06:06:31Z]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
[2025-04-02T06:06:31Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] self = <json.decoder.JSONDecoder object at 0x7f6c45eab710>
[2025-04-02T06:06:31Z] s = '{"name":"{1}","age":125,"skills":["^{are an elective course}","^{remote support}","| التهاب Dass{or}{internalutt}{aur...on reasons, very simplified and doesn\'t fully undergo proper schema check according to specific rules yet appropriate'
[2025-04-02T06:06:31Z] idx = 0
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]     def raw_decode(self, s, idx=0):
[2025-04-02T06:06:31Z]         """Decode a JSON document from ``s`` (a ``str`` beginning with
[2025-04-02T06:06:31Z]         a JSON document) and return a 2-tuple of the Python
[2025-04-02T06:06:31Z]         representation and the index in ``s`` where the document ended.
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         This can be used to decode a JSON document from a string that may
[2025-04-02T06:06:31Z]         have extraneous data at the end.
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         """
[2025-04-02T06:06:31Z]         try:
[2025-04-02T06:06:31Z] >           obj, end = self.scan_once(s, idx)
[2025-04-02T06:06:31Z] E           json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 3079 (char 3078)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/decoder.py:354: JSONDecodeError


See https://buildkite.com/vllm/ci/builds/16776#0195f4d4-1d2a-42c1-98b2-9b05d879956c
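For context, the `Unterminated string` error at ~3 kB of output is consistent with the constrained generation being cut off when it hit `max_tokens=1000` before the grammar could close the JSON object. Below is a rough standalone sketch, not the CI test itself, that reproduces the scenario and guards the `json.loads` call on `finish_reason`. The reduced schema and the `finish_reason` guard are illustrative assumptions; the model, backend, and sampling settings are copied from the failing parametrization above.

```python
import json
import os

os.environ["VLLM_USE_V1"] = "1"  # the CI test runs against the V1 engine

from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

# Reduced schema for illustration only; the CI fixture uses a larger
# employee-profile schema with nested work_history objects.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

llm = LLM(model="mistralai/Ministral-8B-Instruct-2410",
          enforce_eager=True,
          max_model_len=1024,
          guided_decoding_backend="guidance:disable-any-whitespace")

sampling_params = SamplingParams(
    temperature=1.0,
    max_tokens=1000,
    guided_decoding=GuidedDecodingParams(json=schema))

outputs = llm.generate(
    prompts=[f"Give an example JSON that fits this schema: {schema}"],
    sampling_params=sampling_params)

for output in outputs:
    completion = output.outputs[0]
    # Hypothetical guard (not in the current test): if generation stopped
    # because it hit max_tokens, the grammar never got to emit the closing
    # quotes/braces, so json.loads raises the same "Unterminated string"
    # error seen in the CI log.
    if completion.finish_reason == "length":
        print(f"Truncated output, not valid JSON: {completion.text[:80]!r}...")
        continue
    print(json.loads(completion.text))
```

If the truncation theory holds, possible mitigations are raising `max_tokens` for this case or having the test skip/retry outputs whose `finish_reason` is `"length"`. If outputs with `finish_reason == "stop"` still fail to parse, the problem would instead point at the guidance backend's grammar enforcement.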

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working), ci/build, v1, stale (Over 90 days of inactivity)