[HiCache] Add synchronization for context parallelism by vladnosiv · Pull Request #20460 · sgl-project/sglang

vladnosiv · 2026-03-12T14:55:35Z

HiCache previously synchronized state only within tp_group, which is no longer sufficient after the CP split.
This could cause different CP ranks to make different decisions about prefetch completion/revoke, write-through ack handling, and host-cache updates.

This change passes attn_cp / attn_tp groups into HiCache and switches the relevant sync points to CP-aware reductions/barriers, including storage-prefetch synchronization.
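To illustrate why a reduction limited to the TP group is not enough once ranks are also split by CP, here is a minimal pure-Python simulation. It is not the HiCache code: the rank layout, the token counts, the threshold, and the helper names `reduce_min_within` and `decide` are all assumptions made for illustration. The `min` over a group stands in for a `torch.distributed.all_reduce(..., op=ReduceOp.MIN, group=...)` call.

```python
from typing import List

# Hypothetical layout: 4 ranks arranged as 2 CP ranks x 2 TP ranks.
TP_GROUPS: List[List[int]] = [[0, 1], [2, 3]]
CP_GROUPS: List[List[int]] = [[0, 2], [1, 3]]


def reduce_min_within(groups: List[List[int]], values: List[int]) -> List[int]:
    """Simulate an all-reduce(MIN) inside each group.

    Every rank ends up holding the minimum value of the group it belongs to.
    """
    out = list(values)
    for group in groups:
        group_min = min(values[rank] for rank in group)
        for rank in group:
            out[rank] = group_min
    return out


def decide(values: List[int], threshold: int) -> List[bool]:
    """Per-rank decision: treat the prefetch as complete if enough tokens arrived."""
    return [v >= threshold for v in values]


# Tokens each rank has finished prefetching so far (made-up numbers).
completed = [128, 96, 64, 128]
THRESHOLD = 80

# Old behaviour: synchronize only within the TP group.
tp_only = reduce_min_within(TP_GROUPS, completed)
tp_only_decisions = decide(tp_only, THRESHOLD)

# CP-aware behaviour: reduce within TP, then across CP, so all ranks agree.
tp_then_cp = reduce_min_within(CP_GROUPS, tp_only)
cp_aware_decisions = decide(tp_then_cp, THRESHOLD)

if __name__ == "__main__":
    print("TP-only decisions: ", tp_only_decisions)
    print("CP-aware decisions:", cp_aware_decisions)
```

With the TP-only reduction, the two CP ranks reach different conclusions about the same prefetch, which is the kind of divergence described above. Reducing across the CP group as well makes every rank act on the same value.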

Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>


libratiger · 2026-04-02T08:45:20Z

@vladnosiv does this feature improve the performance?

vladnosiv · 2026-04-02T08:56:30Z

> @vladnosiv does this feature improve the performance?

Hi! No. In my observations, without this change CP + HiCache (L2) may fail after some time (up to 30-40 minutes).
So it's for reliability.

libratiger · 2026-04-02T09:21:12Z

> > @vladnosiv does this feature improve the performance?
>
> Hi ! No, according to my observations, without this change, CP + HiCache (L2) may fail after some time (up to 30-40 minutes according to my observations) So it's for reliability.

Thanks. Is there any script or command to reproduce this situation?

vladnosiv · 2026-04-02T09:36:35Z

> Thanks, is there any script or command to reproduce this situation.

I think that such a launch + some cache-heavy traffic like the mooncake dataset in bench_serving should reproduce the crash:

python3 -m sglang.launch_server \
      --model-path deepseek-ai/DeepSeek-V3.2 \
      --trust-remote-code \
      --tp-size 8 \
      --attn-cp-size 8 \
      --enable-nsa-prefill-context-parallel \
      --chat-template examples/chat_template/tool_chat_template_deepseekv32.jinja \
      --mem-fraction-static 0.8 \
      --enable-hierarchical-cache \
      --hicache-ratio 2.0 &> sglang.out
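For the "cache-heavy traffic" part, a rough stand-in is sketched below. It is not the mooncake dataset and not bench_serving: it only sends prompts that share long prefixes to the server's `/generate` endpoint so that cached prefixes are reused and evicted repeatedly. The base URL (port 30000), the prefix counts and lengths, and the helper names `build_prompts` and `send` are assumptions for illustration.

```python
import json
import urllib.request
from typing import List


def build_prompts(num_prefixes: int, requests_per_prefix: int, prefix_words: int) -> List[str]:
    """Build prompts that share long prefixes, interleaved round-robin.

    Interleaving makes the server alternate between prefixes, so earlier
    prefixes may be pushed out of GPU memory and later fetched back.
    """
    prefixes = [
        " ".join(f"doc{p}-tok{i}" for i in range(prefix_words))
        for p in range(num_prefixes)
    ]
    prompts = []
    for q in range(requests_per_prefix):
        for p, prefix in enumerate(prefixes):
            prompts.append(f"{prefix} Question {q}: summarize document {p}.")
    return prompts


def send(prompt: str, base_url: str = "http://127.0.0.1:30000") -> dict:
    """POST one prompt to the /generate endpoint and return the parsed JSON reply."""
    payload = json.dumps(
        {"text": prompt, "sampling_params": {"max_new_tokens": 16, "temperature": 0}}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Sizes are arbitrary; raise them until the prefixes no longer fit on the GPU.
    for prompt in build_prompts(num_prefixes=64, requests_per_prefix=50, prefix_words=4000):
        send(prompt)
```

This sends requests one at a time for simplicity. Real traffic is concurrent, so a synthetic run like this may take longer to trigger the problem, or may not trigger it at all.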

In my case, the repro setup was P/D + CP on prefills + HiCache with real production traffic. After 30-40 minutes, the prefill instances stopped responding to generation requests entirely, without any obvious failure.
I can confirm that with this commit I haven't seen any problems for weeks.

I also saw that @whybeyoung worked on CP + PP + P/D + HiCache, maybe he has more information.


vladnosiv · 2026-04-16T13:53:23Z

@hzh0425 done


vladnosiv · 2026-04-17T09:20:05Z

Fixed the --moe-dp-size 2 flag, which was lost when the test was moved.

hzh0425 · 2026-04-18T17:38:19Z

/rerun-test test_qwen35_hicache.py test_hicache_storage_mooncake_backend.py test_hicache_storage_file_backend.py

github-actions · 2026-04-18T17:38:55Z

✅ 4-gpu-h100 (1 test): View workflow run

cd test/ && python3 registered/4-gpu-models/test_qwen35_hicache.py

✅ 2-gpu-h100 (2 tests): View workflow run

cd test/ && python3 registered/hicache/test_hicache_storage_mooncake_backend.py
cd test/ && python3 registered/hicache/test_hicache_storage_file_backend.py


vladnosiv · 2026-04-22T15:03:19Z

/rerun-failed-ci

vladnosiv · 2026-04-22T15:13:11Z

The stage-b test crashed a second time due to an HTTP 429 error from HF.

ShangmingCai · 2026-04-23T03:16:18Z

/rerun-failed-ci

ShangmingCai · 2026-04-23T03:17:19Z

/rerun-test test_hicache_storage_mooncake_backend.py test_hicache_storage_file_backend.py

hzh0425 · 2026-04-23T06:05:01Z

HiCache's CI has been temporarily removed due to some incompatibilities with the CUDA 13 environment.
We might need to wait until the CI is restored before proceeding further.

ShangmingCai · 2026-04-27T05:25:52Z

Please address the conflict.

# Conflicts: # python/sglang/srt/mem_cache/hi_mamba_radix_cache.py

vladnosiv · 2026-04-27T10:40:11Z

vladnosiv · 2026-04-27T16:50:47Z

CI is done


support cp in hicache

614b781


vladnosiv requested review from Ying1123, hanming-lu, hnyls2002, hzh0425, ispobock, merrymercy, xiezhq-hermann and yizhang2077 as code owners March 12, 2026 14:55

ShangmingCai assigned xiezhq-hermann Mar 13, 2026

vladnosiv added 2 commits March 17, 2026 13:53

Merge branch 'main' into hicache-and-cp

d4849ed

Merge branch 'main' into hicache-and-cp

ce27bde

whybeyoung added a commit to whybeyoung/sglang that referenced this pull request Mar 20, 2026

support cp in hicache (cherry-pick from PR sgl-project#20460)

81d6931


vladnosiv mentioned this pull request Mar 20, 2026

[HiCache] Add CP support for HiCache #20977

Merged

5 tasks

Merge branch 'main' into hicache-and-cp

4ba927a

voipmonitor added a commit to voipmonitor/sglang that referenced this pull request Apr 1, 2026

PR sgl-project#20460: HiCache synchronization for context parallelism

f741ea7

Merge branch 'main' into hicache-and-cp

1640288

hzh0425 mentioned this pull request Apr 2, 2026

[Roadmap]: SGLang Distributed KVCache System For Agentic Workload #21846

Open

25 tasks

Merge branch 'main' into hicache-and-cp

81ded49

xiezhq-hermann added the high priority label Apr 7, 2026

xiezhq-hermann assigned hzh0425 and whybeyoung Apr 7, 2026

vladnosiv added 2 commits April 7, 2026 18:21

Merge branch 'main' into hicache-and-cp

794c777

Merge branch 'main' into hicache-and-cp

4bd0278

voipmonitor pushed a commit to voipmonitor/sglang that referenced this pull request Apr 12, 2026

PR sgl-project#20460

30bf86b

replace tests

2792423


hzh0425 added the run-ci label Apr 17, 2026

hzh0425 and others added 2 commits April 17, 2026 10:47

Merge branch 'main' into hicache-and-cp

0a5747d

fix missed moe-dp-size 2 in cp test

fa6499c


xiezhq-hermann approved these changes Apr 17, 2026


Merge branch 'main' into hicache-and-cp

b273628

hzh0425 approved these changes Apr 18, 2026


vladnosiv added 4 commits April 20, 2026 15:53

Merge remote-tracking branch 'origin/main' into hicache-and-cp

d2d05e6

Merge branch 'main' into hicache-and-cp

5a2a4ff

Merge branch 'main' into hicache-and-cp

643b083

fix bug in conflict resolve

c6f6cb3


ShangmingCai approved these changes Apr 23, 2026


huangtingwei9988 approved these changes Apr 23, 2026


sgl-project deleted a comment from github-actions Bot Apr 23, 2026

Merge branch 'main' into hicache-and-cp

49a3cb5

# Conflicts: # python/sglang/srt/mem_cache/hi_mamba_radix_cache.py

ShangmingCai merged commit 28ee08c into sgl-project:main Apr 27, 2026
211 of 232 checks passed

vguduruTT pushed a commit to vguduruTT/sglang that referenced this pull request May 2, 2026

[HiCache] Add synchronization for context parallelism (sgl-project#20460)

d8e184f

Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>

icepoint666 mentioned this pull request May 9, 2026

[Bugfix][HiCache]: Use GroupCoordinator.all_gather_object for creating custom group #24744

Open

5 tasks
