[Feature] New structural tag support#10691
Conversation
Summary of ChangesHello @DarkSharpness, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the structural tag support within the system. It introduces a new, more flexible structural tag format while carefully preserving compatibility with the existing legacy format. This is achieved through strategic renaming, the addition of a format detection utility, and updated dispatch logic in the grammar backends. The changes also include improved type safety in the OpenAI protocol and comprehensive new tests to validate the new functionality. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for a new structural tag format, ensuring backward compatibility with the old format. The changes are well-structured, including updates to data protocols, backend logic, and the addition of new tests. My review focuses on improving the robustness of the validation logic for structural tags and fixing a potential unhandled exception. I've also pointed out a minor issue in the new test file's documentation for better maintainability.
|
Note: this PR needs a newer release of xgrammar |
There was a problem hiding this comment.
Awesome. Thank you so much for your support.
Could you also add some introductory text to https://docs.sglang.ai/advanced_features/structured_outputs.html#id4? I believe #10497 has updated the doc on examples, but would be great to also talk about what this is and link to the structural tag docs at https://xgrammar.mlc.ai/docs/tutorials/structural_tag.html
|
New version of xgrammar is released at https://github.com/mlc-ai/xgrammar/releases/tag/v0.1.25 |
CatherineSue
left a comment
There was a problem hiding this comment.
Can we remove the legacy structural tag for all? Or do we need to keep it for llguidace_backend?
I'm not familar with |
|
@CatherineSue Thanks for the comment! I think the old structural tag is considered legacy because its functionality is fully covered by the new structural tag. I think we can keep it for a while for compatibility and consider removing it later. |
|
It seems the Execute Notebook is failing. Can you take a look? @DarkSharpness |
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds explicit support and examples for structural_tag formats: introduces legacy vs. latest detection, updates backends to dispatch accordingly, refines OpenAI protocol types and sampling parameters, adjusts function-call parsing to legacy format, and adds tests validating structural_tag behavior via the xgrammar backend. Documentation notebook includes extensive usage examples. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client as Client
participant OpenAIAPI as OpenAI-compatible Server
participant Protocol as Protocol Layer<br/>(ToolCallConstraint)
participant Parser as FunctionCall Parser
participant Backend as Constrained Backend<br/>(xgrammar/llguidance)
participant Compiler as Grammar Compiler
Client->>OpenAIAPI: chat.completions (response_format=structural_tag)
OpenAIAPI->>Protocol: Normalize & type-check request
Protocol-->>OpenAIAPI: ToolCallConstraint (structural_tag or json_schema)
OpenAIAPI->>Parser: get_structure_constraint(tool_choice)
Parser-->>OpenAIAPI: ("structural_tag", LegacyStructuralTagResponseFormat)
OpenAIAPI->>Backend: dispatch_structural_tag(structural_tag)
alt Legacy format (structures+triggers)
Backend->>Compiler: compile_structural_tag(tags, triggers)
else New key string format
Backend->>Compiler: compile_structural_tag(key_string)
end
Compiler-->>Backend: Compiled grammar or INVALID
Backend-->>OpenAIAPI: Constraint object or INVALID
OpenAIAPI->>OpenAIAPI: Generate with constraints
OpenAIAPI-->>Client: Completion (tool-call constrained)
sequenceDiagram
autonumber
participant Client
participant llguidance as llguidance_backend
participant Util as utils.is_legacy_structural_tag
Client->>llguidance: dispatch_structural_tag(structural_tag)
llguidance->>Util: is_legacy_structural_tag?
alt Non-legacy
llguidance-->>Client: INVALID_GRAMMAR_OBJ (assertion path)
else Legacy
llguidance-->>Client: Construct StructTag objects (legacy flow)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
Co-authored-by: Aaron Yee <110548922+aaronyeeio@users.noreply.github.com>
17de942 to
ba71eef
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/advanced_features/structured_outputs.ipynb (1)
252-256: Syntax error: split string literal breaks the tool schemaThe "description" for state is split across two lines, leaving a stray string literal and causing a SyntaxError when executing the notebook. Merge into a single line.
- "description": "the two-letter abbreviation for the state that the city is" - " in, e.g. 'CA' which would mean 'California'", + "description": "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'",python/sglang/srt/constrained/llguidance_backend.py (1)
164-176: Avoid assert for input validation; validate triggers handling
- Do not use assert for request validation; it can be disabled. Raise ValueError instead.
- LLGuidance path uses only triggers[0]; either validate a single trigger or support multiple explicitly.
- structural_tag = json.loads(key_string) - assert is_legacy_structural_tag(structural_tag) + structural_tag = json.loads(key_string) + if not is_legacy_structural_tag(structural_tag): + raise ValueError( + "llguidance backend supports only legacy structural_tag (structures+triggers). " + "Use XGrammar backend for the new 'format' variant." + ) + if not structural_tag.get("triggers") or len(structural_tag["triggers"]) != 1: + raise ValueError( + "llguidance backend currently requires exactly one trigger." + ) tags = [ StructTag( begin=structure["begin"], grammar=structure["schema"], end=structure["end"], - trigger=structural_tag["triggers"][0], # TODO? + trigger=structural_tag["triggers"][0], ) for structure in structural_tag["structures"] ] g = StructTag.to_grammar(tags) return self._from_serialized(g) - except Exception as e: + except (ValueError, Exception) as e: logging.error(f"Hit invalid structural_tag: {key_string=}, {e=}") return INVALID_GRAMMAR_OBJ
♻️ Duplicate comments (2)
python/sglang/srt/constrained/xgrammar_backend.py (1)
245-261: Handle ValueError, and guard optional schemaCatching ValueError keeps this aligned with stricter validation. Also default schema to {} if missing.
- structural_tag = json.loads(key_string) + structural_tag = json.loads(key_string) if is_legacy_structural_tag(structural_tag): tags = [ StructuralTagItem( begin=structure["begin"], - schema=json.dumps(structure["schema"]), + schema=json.dumps(structure.get("schema") or {}), end=structure["end"], ) for structure in structural_tag["structures"] ] ctx = self.grammar_compiler.compile_structural_tag( tags, structural_tag["triggers"] ) else: ctx = self.grammar_compiler.compile_structural_tag(key_string) - except (RuntimeError, json.decoder.JSONDecodeError) as e: + except (RuntimeError, json.decoder.JSONDecodeError, ValueError) as e: logging.error(f"Hit invalid structural_tag: {key_string=}, {e=}") return INVALID_GRAMMAR_OBJpython/sglang/srt/constrained/utils.py (1)
7-12: Replace asserts with explicit validation (robust under -O)Use explicit checks and raise ValueError for malformed structural_tag objects. This avoids disabled-assert pitfalls.
- if obj.get("structures", None) is not None: - assert obj.get("triggers", None) is not None - return True - else: - assert obj.get("format", None) is not None - return False + if obj.get("structures") is not None: + if obj.get("triggers") is None: + raise ValueError( + "Legacy structural_tag with 'structures' must also include 'triggers'." + ) + return True + if obj.get("format") is None: + raise ValueError("New structural_tag must include a 'format' key.") + return False
🧹 Nitpick comments (5)
docs/advanced_features/structured_outputs.ipynb (1)
641-690: Add timeout to requests.postExamples should include a reasonable timeout to avoid hanging in docs.
- response = requests.post(f"http://localhost:{port}/generate", json=payload) + response = requests.post(f"http://localhost:{port}/generate", json=payload, timeout=30)python/sglang/srt/function_call/function_call_parser.py (1)
154-160: Deterministic trigger orderingSets are unordered; convert to a sorted list to avoid nondeterministic payloads and flaky tests.
- return LegacyStructuralTagResponseFormat( + return LegacyStructuralTagResponseFormat( type="structural_tag", structures=tool_structures, - triggers=list(tool_trigger_set), + triggers=sorted(tool_trigger_set), )python/sglang/srt/entrypoints/openai/protocol.py (2)
639-642: Robust serialization for StructuralTag (dict/dataclass/Pydantic)Don’t assume .model_dump; support dict or dataclass objects too.
- sampling_params["structural_tag"] = convert_json_schema_to_str( - self.response_format.model_dump(by_alias=True) - ) + obj = self.response_format + payload = ( + obj.model_dump(by_alias=True) if hasattr(obj, "model_dump") + else (obj if isinstance(obj, dict) else getattr(obj, "__dict__", obj)) + ) + sampling_params["structural_tag"] = convert_json_schema_to_str(payload)
654-662: Apply same robust handling to tool_call_constraintMirror the above when injecting structural_tag from tool_call_constraint.
- if constraint_type == "structural_tag": - sampling_params[constraint_type] = convert_json_schema_to_str( - constraint_value.model_dump(by_alias=True) - ) + if constraint_type == "structural_tag": + obj = constraint_value + payload = ( + obj.model_dump(by_alias=True) if hasattr(obj, "model_dump") + else (obj if isinstance(obj, dict) else getattr(obj, "__dict__", obj)) + ) + sampling_params[constraint_type] = convert_json_schema_to_str(payload)test/srt/openai_server/features/test_structural_tag.py (1)
82-122: Consider validating additional schema constraints.The test validates the types of returned fields but doesn't verify:
- The pattern constraint
^[\\w]+$on thenamefield- That no additional properties exist (as specified by
additionalProperties: False)- That required fields are actually present
While the current type checks are sufficient for basic validation, adding these assertions would provide more comprehensive test coverage for the structural_tag feature.
Example enhancement:
js_obj = json.loads(text) # Validate required fields self.assertIn("name", js_obj) self.assertIn("population", js_obj) # Validate types self.assertIsInstance(js_obj["name"], str) self.assertIsInstance(js_obj["population"], int) # Validate pattern constraint import re self.assertRegex(js_obj["name"], r"^[\w]+$") # Validate no additional properties self.assertEqual(set(js_obj.keys()), {"name", "population"})
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/advanced_features/structured_outputs.ipynb(3 hunks)python/sglang/srt/constrained/llguidance_backend.py(2 hunks)python/sglang/srt/constrained/utils.py(1 hunks)python/sglang/srt/constrained/xgrammar_backend.py(2 hunks)python/sglang/srt/entrypoints/openai/protocol.py(6 hunks)python/sglang/srt/function_call/function_call_parser.py(4 hunks)test/srt/openai_server/features/test_structural_tag.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
test/srt/openai_server/features/test_structural_tag.py (2)
python/sglang/srt/utils/common.py (1)
kill_process_tree(967-1003)python/sglang/test/test_utils.py (2)
CustomTestCase(1611-1619)popen_launch_server(501-642)
python/sglang/srt/constrained/xgrammar_backend.py (1)
python/sglang/srt/constrained/utils.py (1)
is_legacy_structural_tag(4-12)
python/sglang/srt/constrained/llguidance_backend.py (1)
python/sglang/srt/constrained/utils.py (1)
is_legacy_structural_tag(4-12)
python/sglang/srt/function_call/function_call_parser.py (1)
python/sglang/srt/entrypoints/openai/protocol.py (4)
LegacyStructuralTagResponseFormat(133-136)StructuresResponseFormat(126-129)Tool(419-423)ToolChoice(432-436)
🪛 Ruff (0.13.3)
docs/advanced_features/structured_outputs.ipynb
355-355: Probable use of requests call without timeout
(S113)
388-388: Probable use of requests call without timeout
(S113)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
- GitHub Check: performance-test-1-gpu-part-3
- GitHub Check: accuracy-test-1-gpu
- GitHub Check: unit-test-frontend
- GitHub Check: unit-test-backend-2-gpu (1)
- GitHub Check: performance-test-1-gpu-part-1
- GitHub Check: performance-test-1-gpu-part-2
- GitHub Check: unit-test-backend-2-gpu (0)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 11)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 4)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 9)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 1)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 2)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 8)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 10)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 6)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 3)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 0)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 5)
- GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 7)
- GitHub Check: mla-test-1-gpu-amd (linux-mi325-gpu-1)
- GitHub Check: bench-test-2-gpu-amd (linux-mi325-gpu-2)
- GitHub Check: unit-test-backend-2-gpu-amd (linux-mi325-gpu-2)
- GitHub Check: unit-test-sgl-kernel-amd (linux-mi325-gpu-1)
- GitHub Check: performance-test-1-gpu-part-2-amd (linux-mi325-gpu-1)
- GitHub Check: performance-test-1-gpu-part-1-amd (linux-mi325-gpu-1)
- GitHub Check: accuracy-test-2-gpu-amd (linux-mi325-gpu-2)
- GitHub Check: accuracy-test-1-gpu-amd (linux-mi325-gpu-1)
- GitHub Check: run-all-notebooks
- GitHub Check: vllm-dependency-test
- GitHub Check: build-test (all)
- GitHub Check: lint
🔇 Additional comments (9)
docs/advanced_features/structured_outputs.ipynb (2)
352-394: Latest structural_tag example looks correctUses begin/end and triggered_tags per XGrammar v0.1.25.
Ensure your runtime has xgrammar >= 0.1.25 (begin field). Based on learnings
922-971: LGTM: SRT latest structural_tag exampleAPI shape matches triggered_tags format and mirrors the OpenAI path.
python/sglang/srt/function_call/function_call_parser.py (1)
126-160: LGTM: Legacy structural_tag constructionType hints and return path align with LegacyStructuralTagResponseFormat and ToolCallConstraint.
python/sglang/srt/entrypoints/openai/protocol.py (2)
596-597: to_sampling_params signature change LGTMThe ToolCallConstraint type integration is consistent across callers.
1157-1158: Updated MessageProcessingResult type LGTMtool_call_constraint now typed precisely.
test/srt/openai_server/features/test_structural_tag.py (4)
5-18: LGTM!All imports are appropriate and necessary for the test implementation.
21-37: LGTM!The setup helper correctly configures and launches the test server with the specified grammar backend.
40-51: LGTM!The test class setup follows proper unittest patterns with appropriate resource management. The use of
Anytype for the process attribute is acceptable given the subprocess.Popen return type complexity.
53-80: LGTM!The test effectively validates const_string constraint enforcement. Using an intentionally incorrect answer is a good design choice to verify that the structural_tag constraint is respected regardless of semantic correctness.
|
Will approve after all CI passes |
|
@DarkSharpness @hnyls2002 The PR is merged even fails the lint check https://github.com/sgl-project/sglang/actions/runs/18649166067/job/53162947593 :( |
|
@airMeng That is because the lint rules on main have been updated over the last two days. Thanks. |
|
Thank you so much @DarkSharpness @CatherineSue @hnyls2002! |
Motivation
structural tag
Modifications
We support the new structural tag while maintaining backward compatibility with the legacy one.
StructuralTagResponseFormattoLegacyStructuralTagResponseFormat.StructuralTag.Accuracy Tests
Benchmarking and Profiling
Checklist
Summary by CodeRabbit
New Features
Documentation
Tests