Commit 26b3b60

docs: change sglang hicache example to use hicache-ratio (#2582)
1 parent bc290e7 commit 26b3b60

1 file changed: +2, -2 lines changed

components/backends/sglang/docs/sgl-hicache-example.md

Lines changed: 2 additions & 2 deletions
@@ -15,15 +15,15 @@ python -m dynamo.sglang \
 --host 0.0.0.0 --port 8000 \
 --page-size 64 \
 --enable-hierarchical-cache \
---hicache-size 30 \
+--hicache-ratio 2 \
 --hicache-write-policy write_through \
 --hicache-storage-backend nixl \
 --log-level debug \
 --skip-tokenizer-init
 ```
 
 - **--enable-hierarchical-cache**: Enables hierarchical KV cache/offload
-- **--hicache-size**: HiCache capacity in GB of pinned host memory (upper bound of offloaded KV to CPU)
+- **--hicache-ratio**: The ratio of the size of host KV cache memory pool to the size of device pool. Lower this number if your machine has less CPU memory.
 - **--hicache-write-policy**: Write policy (e.g., `write_through` for synchronous host writes)
 - **--hicache-storage-backend**: Host storage backend for HiCache (e.g., `nixl`). NIXL selects the concrete store automatically; see [PR #8488](https://github.com/sgl-project/sglang/pull/8488)
 
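
As a rough sizing aid for the `--hicache-ratio` description in the diff, the sketch below estimates the host memory implied by a given ratio and the largest ratio that fits a host memory budget. It relies only on the relationship stated in the flag description (host pool size = ratio × device pool size). The function names and the 15 GB device pool figure are hypothetical; the real device pool size depends on the model and on SGLang's memory settings, so treat the numbers as illustrative.

```python
def host_pool_gb(device_pool_gb: float, hicache_ratio: float) -> float:
    """Host KV pool size implied by the ratio (host = ratio * device)."""
    if device_pool_gb <= 0 or hicache_ratio <= 0:
        raise ValueError("device_pool_gb and hicache_ratio must be positive")
    return device_pool_gb * hicache_ratio


def max_ratio_for_budget(device_pool_gb: float, host_budget_gb: float) -> float:
    """Largest ratio whose implied host pool fits within host_budget_gb."""
    if device_pool_gb <= 0 or host_budget_gb <= 0:
        raise ValueError("device_pool_gb and host_budget_gb must be positive")
    return host_budget_gb / device_pool_gb


if __name__ == "__main__":
    device_gb = 15.0  # hypothetical device KV pool size, not taken from the commit
    print(host_pool_gb(device_gb, 2))             # host memory implied by --hicache-ratio 2
    print(max_ratio_for_budget(device_gb, 24.0))  # ratio ceiling for a 24 GB host budget
```

If the implied host pool exceeds the CPU memory you can spare, lowering the ratio is the adjustment the flag description recommends.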
