[Spec][Ngram] 5/N: Store and advance anchor match state across decode steps by kpham-sgl · Pull Request #21243 · sgl-project/sglang

kpham-sgl · 2026-03-24T00:39:25Z

Motivation

Part of Ngram refactoring series #21052
Following #21225

Previously, match() was O(D^2) at every decode step, where D is max_trie_depth. Because decoding only ever appends to the sequence, we can store a list of MatchState entries (each corresponding to an anchor) from the previous decode step and advance each anchor in O(1).
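To illustrate the idea, here is a minimal Python sketch of stateless versus incremental anchor matching. It is not the C++ code in this PR: `Node`, `insert`, `match_stateless`, and `advance` are hypothetical names chosen for the example, and the trie is assumed to stay unchanged between decode steps.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)  # compare nodes by identity, not by contents
class Node:
    depth: int = 0
    children: Dict[int, "Node"] = field(default_factory=dict)


def insert(root: Node, tokens: List[int], max_depth: int) -> None:
    """Insert every window of up to max_depth tokens into the trie."""
    for start in range(len(tokens)):
        node = root
        for tok in tokens[start:start + max_depth]:
            node = node.children.setdefault(tok, Node(depth=node.depth + 1))


def match_stateless(root: Node, tokens: List[int], max_depth: int) -> List[Node]:
    """Walk each suffix of length 1..D from the root: O(D^2) per call."""
    anchors = []
    for k in range(1, min(max_depth, len(tokens)) + 1):
        node = root
        for tok in tokens[-k:]:
            node = node.children.get(tok)
            if node is None:
                break
        if node is not None:
            anchors.append(node)
    return anchors


def advance(root: Node, anchors: List[Node], tok: int, max_depth: int) -> List[Node]:
    """Advance cached anchors by one appended token: O(1) per anchor."""
    nxt = []
    child = root.children.get(tok)
    if child is not None:
        nxt.append(child)  # the new length-1 suffix starts at the root
    for node in anchors:
        if node.depth < max_depth:
            child = node.children.get(tok)
            if child is not None:
                nxt.append(child)  # suffix of length k becomes length k + 1
    return nxt
```

A suffix of length k + 1 can only match after the append if the suffix of length k matched before it, which is why carrying the previous anchors forward yields the same set as rewalking every suffix from the root.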

Modifications

Keep per-request NGRAM anchor state across decode steps instead of rebuilding every suffix match from trie root each time.
Add MatchState plus versioned NodeRef so cached anchors can be advanced safely across decode steps and invalidated correctly after eviction or reset.
Make Trie::match() stateful: infer the appended suffix from the current tail and total_len, advance cached anchors when valid, and rebuild when state is stale.
Preserve existing BFS / PROB draft construction behavior while swapping anchor collection to the new stateful matcher.
Move per-request match-state ownership into Ngram, keyed by req_id, with explicit cleanup on request finish/reset.
Simplify the public matching API to batchMatch(req_ids, tokens, total_lens) / batch_get(req_ids, batch_tokens, total_lens) by removing explicit appended-token plumbing.
Simplify NGRAMWorker integration so it only passes the trimmed tail and full request length, without mutating Req or keeping extra Python-side match state.
Add regression coverage for incremental-vs-stateless equivalence, leaf-anchor expansion, and stale-state rebuild after eviction.
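The stateful matching and invalidation described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the PR's C++ implementation: the class layouts, the `evict` helper, and the `rebuilds` counter are hypothetical, and the real eviction policy is not shown in this description. The sketch only mirrors three ideas from the list: a versioned `NodeRef`, inferring the appended suffix from the tail and `total_len`, and rebuilding when cached state is stale.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Node:
    depth: int = 0
    version: int = 0  # bumped when the node is evicted (assumed scheme)
    children: Dict[int, "Node"] = field(default_factory=dict)


@dataclass(frozen=True, eq=False)
class NodeRef:
    node: Node
    version: int  # node.version at the time the reference was taken

    def is_valid(self) -> bool:
        return self.node.version == self.version


@dataclass
class MatchState:
    anchors: List[NodeRef] = field(default_factory=list)
    total_len: int = -1  # request length at the last match; -1 means no state


class Trie:
    def __init__(self, max_depth: int):
        self.max_depth = max_depth
        self.root = Node()
        self.rebuilds = 0  # counts full rebuilds, for illustration only

    def insert(self, tokens: List[int]) -> None:
        for start in range(len(tokens)):
            node = self.root
            for tok in tokens[start:start + self.max_depth]:
                node = node.children.setdefault(tok, Node(depth=node.depth + 1))

    def evict(self, first_token: int) -> None:
        """Hypothetical eviction: drop one root subtree and bump versions."""
        sub = self.root.children.pop(first_token, None)
        stack = [sub] if sub is not None else []
        while stack:
            node = stack.pop()
            node.version += 1  # cached NodeRefs to this node become stale
            stack.extend(node.children.values())

    def _step(self, anchors: List[Node], tok: int) -> List[Node]:
        # The root is included so a new length-1 suffix can start each step.
        out = []
        for node in [self.root] + anchors:
            child = node.children.get(tok)
            if child is not None:
                out.append(child)
        return out

    def match(self, state: MatchState, tail: List[int], total_len: int) -> List[Node]:
        appended = total_len - state.total_len  # inferred, not passed in
        usable = (
            state.total_len >= 0
            and 0 < appended <= len(tail)
            and all(ref.is_valid() for ref in state.anchors)
        )
        if usable:
            nodes = [ref.node for ref in state.anchors]
            new_tokens = tail[-appended:]
        else:
            self.rebuilds += 1  # stale or missing state: rematch from the root
            nodes = []
            new_tokens = tail[-self.max_depth:]
        for tok in new_tokens:
            nodes = self._step(nodes, tok)
        state.anchors = [NodeRef(n, n.version) for n in nodes]
        state.total_len = total_len
        return nodes
```

In this sketch a caller would keep one `MatchState` per request (for example in a dict keyed by `req_id`) and delete the entry when the request finishes, which corresponds to the ownership and cleanup items in the list.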

Accuracy Tests

Passed python3 -m pytest -q test/registered/spec/utils/test_ngram_corpus.py and python3 -m pytest -q test/registered/spec/test_ngram_speculative_decoding.py

Benchmarking and Profiling

[TODO]

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-03-24T00:39:30Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

…-decode-steps

Resolve merge conflicts adapting the stateful ngram MatchState feature to the upstream's TVM FFI binding (replacing pybind11):
- trie.h/trie.cpp: keep HEAD's MatchState, NodeRef, and incremental anchor advancement logic (auto-resolved)
- ngram.h/ngram.cpp: add stateless batchMatch(tokens) overload for FFI, switch stateful overload to int64_t keys (FFI-compatible)
- ngram_corpus_ffi.cpp: add batch_match_stateful and erase_match_state FFI methods
- jit_kernel/ngram_corpus.py: add match_stateful and erase_states to the FFI wrapper
- srt/speculative/cpp_ngram/ngram_corpus.py: use get_ngram_corpus_cls() with a req_id-to-state_id mapping layer for the stateful path
- Remove old pybind11 ngram_corpus_binding.cpp (superseded by FFI)
- test_ngram_corpus.py: resolve to use _batch_get helper, remove stale _raw_batch_match

Made-with: Cursor

hnyls2002 · 2026-04-06T04:40:28Z

/rerun-test test_ngram_corpus test_ngram_speculative_decoding

github-actions · 2026-04-06T04:40:51Z

❌ test_ngram_corpus: No test file found matching test_ngram_corpus under test/registered/.

❌ test_ngram_speculative_decoding: No test file found matching test_ngram_speculative_decoding under test/registered/.

hnyls2002 · 2026-04-06T04:41:35Z

/rerun-test test/registered/spec/utils/test_ngram_corpus.py test/registered/spec/test_ngram_speculative_decoding.py

github-actions · 2026-04-06T04:42:13Z

✅ 1-gpu-5090: View workflow run

cd test/ && python3 registered/spec/utils/test_ngram_corpus.py

✅ 1-gpu-h100: View workflow run

cd test/ && python3 registered/spec/test_ngram_speculative_decoding.py

hnyls2002 · 2026-04-06T04:51:21Z

/tag-and-rerun-ci

Adapt SAM external corpus feature to the new ngram architecture:
- Move suffix_automaton.{h,cpp} to jit_kernel/csrc/ngram_corpus/
- Use TVM FFI instead of pybind11 for SAM loading methods
- Change match_state_ key from std::string to int64_t state_ids
- Add SAM budget splitting to stateful batchMatch overload
- Wire external corpus params through FFI constructor
- Keep both stateless and stateful batchMatch overloads

kpham-sgl added 6 commits March 23, 2026 19:59

remove min max match window

4926b37

lint

d30835f

misc

36ed24c

increment anchor after every decode steps instead of rematching

0d1d60f

lint

1951002

nit

0e2d349

kpham-sgl requested review from Ying1123, hnyls2002 and merrymercy as code owners March 24, 2026 00:39

github-actions bot added the documentation, lora, and speculative-decoding labels Mar 24, 2026

This was referenced Mar 24, 2026

[Roadmap] Further Ngram Speculative Decoding Support #21052

Open

[Spec][Ngram] 6/N: Load an external corpus and construct a Suffix Automaton #21425

Merged

kpham-sgl requested review from BBuf, DarkSharpness, HydraQYH, celve and yuan-luo as code owners April 4, 2026 00:37

github-actions bot added the jit-kernel label Apr 4, 2026

tiny fix lint

67140c5

github-actions bot added the run-ci label Apr 6, 2026

hnyls2002 merged commit b2008bf into sgl-project:main Apr 6, 2026
101 of 156 checks passed

hnyls2002 mentioned this pull request Apr 6, 2026

[Spec][Ngram] Followup fixes for MatchState incremental advance #22180

Merged

JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026

[Spec][Ngram] 5/N: Store and advance anchor match state across decode…

574f6d7

… steps (sgl-project#21243)

Fridge003 pushed a commit that referenced this pull request Apr 7, 2026

[Spec][Ngram] 5/N: Store and advance anchor match state across decode…

7aae012

… steps (#21243)

hnyls2002 merged 8 commits into sgl-project:main from kpham-sgl:kp/maintain-per-anchor-matching-state-across-decode-steps
