
[Bug]: CI flake - v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output #15944

@chaunceyjiang

Description

Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

🐛 Describe the bug


[2025-04-02T06:06:31Z] =================================== FAILURES ===================================
[2025-04-02T06:06:31Z] _ test_structured_output[mistralai/Ministral-8B-Instruct-2410-guidance:disable-any-whitespace-auto] _
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7f6c3b4dcfb0>
[2025-04-02T06:06:31Z] sample_json_schema = {'properties': {'age': {'type': 'integer'}, 'name': {'type': 'string'}, 'skills': {'items': {'type': 'string'}, 'type'...ition'], 'type': 'object'}, 'type': 'array'}}, 'required': ['name', 'age', 'skills', 'work_history'], 'type': 'object'}
[2025-04-02T06:06:31Z] unsupported_json_schema = {'properties': {'email': {'pattern': '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$', 'type': 'string'}, 'grade': ...[a-z]{1,10}$', 'type': 'string'}, 'type': 'array'}}, 'required': ['score', 'grade', 'email', 'tags'], 'type': 'object'}
[2025-04-02T06:06:31Z] sample_sql_ebnf = '\nroot ::= select_statement\nselect_statement ::= "SELECT" column "from" table "where" condition\ncolumn ::= "col_1" | "col_2"\ntable ::= "table_1" | "table_2"\ncondition ::= column "=" number\nnumber ::= "1" | "2"\n'
[2025-04-02T06:06:31Z] sample_sql_lark = '\nstart: select_statement\nselect_statement: "SELECT" column "from" table "where" condition\ncolumn: "col_1" | "col_2"\ntable: "table_1" | "table_2"\ncondition: column "=" number\nnumber: "1" | "2"\n'
[2025-04-02T06:06:31Z] sample_regex = '((25[0-5]|(2[0-4]|1\\d|[1-9]|)\\d)\\.){3}(25[0-5]|(2[0-4]|1\\d|[1-9]|)\\d)'
[2025-04-02T06:06:31Z] sample_guided_choice = ['Python', 'Java', 'JavaScript', 'C++', 'C#', 'PHP', ...]
[2025-04-02T06:06:31Z] guided_decoding_backend = 'guidance:disable-any-whitespace'
[2025-04-02T06:06:31Z] tokenizer_mode = 'auto', model_name = 'mistralai/Ministral-8B-Instruct-2410'
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]     @pytest.mark.skip_global_cleanup
[2025-04-02T06:06:31Z]     @pytest.mark.parametrize("model_name, guided_decoding_backend, tokenizer_mode",
[2025-04-02T06:06:31Z]                              PARAMS_MODELS_BACKENDS_TOKENIZER_MODE)
[2025-04-02T06:06:31Z]     def test_structured_output(
[2025-04-02T06:06:31Z]         monkeypatch: pytest.MonkeyPatch,
[2025-04-02T06:06:31Z]         sample_json_schema: dict[str, Any],
[2025-04-02T06:06:31Z]         unsupported_json_schema: dict[str, Any],
[2025-04-02T06:06:31Z]         sample_sql_ebnf: str,
[2025-04-02T06:06:31Z]         sample_sql_lark: str,
[2025-04-02T06:06:31Z]         sample_regex: str,
[2025-04-02T06:06:31Z]         sample_guided_choice: str,
[2025-04-02T06:06:31Z]         guided_decoding_backend: str,
[2025-04-02T06:06:31Z]         tokenizer_mode: str,
[2025-04-02T06:06:31Z]         model_name: str,
[2025-04-02T06:06:31Z]     ):
[2025-04-02T06:06:31Z]         monkeypatch.setenv("VLLM_USE_V1", "1")
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         # Use a single LLM instance for several scenarios to
[2025-04-02T06:06:31Z]         # speed up the test suite.
[2025-04-02T06:06:31Z]         llm = LLM(model=model_name,
[2025-04-02T06:06:31Z]                   enforce_eager=True,
[2025-04-02T06:06:31Z]                   max_model_len=1024,
[2025-04-02T06:06:31Z]                   guided_decoding_backend=guided_decoding_backend,
[2025-04-02T06:06:31Z]                   tokenizer_mode=tokenizer_mode)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         #
[2025-04-02T06:06:31Z]         # Test 1: Generate JSON output based on a provided schema
[2025-04-02T06:06:31Z]         #
[2025-04-02T06:06:31Z]         sampling_params = SamplingParams(
[2025-04-02T06:06:31Z]             temperature=1.0,
[2025-04-02T06:06:31Z]             max_tokens=1000,
[2025-04-02T06:06:31Z]             guided_decoding=GuidedDecodingParams(json=sample_json_schema))
[2025-04-02T06:06:31Z]         outputs = llm.generate(prompts=[
[2025-04-02T06:06:31Z]             f"Give an example JSON for an employee profile "
[2025-04-02T06:06:31Z]             f"that fits this schema: {sample_json_schema}"
[2025-04-02T06:06:31Z]         ] * 2,
[2025-04-02T06:06:31Z]                                sampling_params=sampling_params,
[2025-04-02T06:06:31Z]                                use_tqdm=True)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         assert outputs is not None
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         for output in outputs:
[2025-04-02T06:06:31Z]             assert output is not None
[2025-04-02T06:06:31Z]             assert isinstance(output, RequestOutput)
[2025-04-02T06:06:31Z]             prompt = output.prompt
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]             generated_text = output.outputs[0].text
[2025-04-02T06:06:31Z]             assert generated_text is not None
[2025-04-02T06:06:31Z]             if 'disable-any-whitespace' in guided_decoding_backend:
[2025-04-02T06:06:31Z]                 assert "\n" not in generated_text
[2025-04-02T06:06:31Z]             print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
[2025-04-02T06:06:31Z] >           output_json = json.loads(generated_text)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] v1/entrypoints/llm/test_struct_output_generate.py:100:
[2025-04-02T06:06:31Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/__init__.py:346: in loads
[2025-04-02T06:06:31Z]     return _default_decoder.decode(s)
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/decoder.py:338: in decode
[2025-04-02T06:06:31Z]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
[2025-04-02T06:06:31Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] self = <json.decoder.JSONDecoder object at 0x7f6c45eab710>
[2025-04-02T06:06:31Z] s = '{"name":"{1}","age":125,"skills":["^{are an elective course}","^{remote support}","| التهاب Dass{or}{internalutt}{aur...on reasons, very simplified and doesn\'t fully undergo proper schema check according to specific rules yet appropriate'
[2025-04-02T06:06:31Z] idx = 0
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]     def raw_decode(self, s, idx=0):
[2025-04-02T06:06:31Z]         """Decode a JSON document from ``s`` (a ``str`` beginning with
[2025-04-02T06:06:31Z]         a JSON document) and return a 2-tuple of the Python
[2025-04-02T06:06:31Z]         representation and the index in ``s`` where the document ended.
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         This can be used to decode a JSON document from a string that may
[2025-04-02T06:06:31Z]         have extraneous data at the end.
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z]         """
[2025-04-02T06:06:31Z]         try:
[2025-04-02T06:06:31Z] >           obj, end = self.scan_once(s, idx)
[2025-04-02T06:06:31Z] E           json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 3079 (char 3078)
[2025-04-02T06:06:31Z]
[2025-04-02T06:06:31Z] /usr/lib/python3.12/json/decoder.py:354: JSONDecodeError


See https://buildkite.com/vllm/ci/builds/16776#0195f4d4-1d2a-42c1-98b2-9b05d879956c
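For context, the `Unterminated string` error at ~3 kB of output is consistent with the constrained generation being cut off when it hit `max_tokens=1000` before the grammar could close the JSON object. Below is a rough standalone sketch, not the CI test itself, that reproduces the scenario and guards the `json.loads` call on `finish_reason`. The reduced schema and the `finish_reason` guard are illustrative assumptions; the model, backend, and sampling settings are copied from the failing parametrization above.

```python
import json
import os

os.environ["VLLM_USE_V1"] = "1"  # the CI test runs against the V1 engine

from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

# Reduced schema for illustration only; the CI fixture uses a larger
# employee-profile schema with nested work_history objects.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

llm = LLM(model="mistralai/Ministral-8B-Instruct-2410",
          enforce_eager=True,
          max_model_len=1024,
          guided_decoding_backend="guidance:disable-any-whitespace")

sampling_params = SamplingParams(
    temperature=1.0,
    max_tokens=1000,
    guided_decoding=GuidedDecodingParams(json=schema))

outputs = llm.generate(
    prompts=[f"Give an example JSON that fits this schema: {schema}"],
    sampling_params=sampling_params)

for output in outputs:
    completion = output.outputs[0]
    # Hypothetical guard (not in the current test): if generation stopped
    # because it hit max_tokens, the grammar never got to emit the closing
    # quotes/braces, so json.loads raises the same "Unterminated string"
    # error seen in the CI log.
    if completion.finish_reason == "length":
        print(f"Truncated output, not valid JSON: {completion.text[:80]!r}...")
        continue
    print(json.loads(completion.text))
```

If the truncation theory holds, possible mitigations are raising `max_tokens` for this case or having the test skip/retry outputs whose `finish_reason` is `"length"`. If outputs with `finish_reason == "stop"` still fail to parse, the problem would instead point at the guidance backend's grammar enforcement.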

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working), ci/build, v1, stale (Over 90 days of inactivity)