[Nixl][PD] Lease renewal TTL KV blocks on P #38027
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -126,9 +126,18 @@ python tests/v1/kv_connector/nixl_integration/toy_proxy_server.py \ | |
| - Set when prefiller and decoder are on different machines | ||
| - Connection info is passed via KVTransferParams from prefiller to decoder for handshake | ||
|
|
||
| - `VLLM_NIXL_ABORT_REQUEST_TIMEOUT`: Timeout (in seconds) for automatically releasing the prefiller’s KV cache for a particular request. (Optional) | ||
| - Default: 480 | ||
| - If a request is aborted and the decoder has not yet read the KV-cache blocks through the nixl channel, the prefill instance will release its KV-cache blocks after this timeout to avoid holding them indefinitely. | ||
| - `VLLM_NIXL_KV_LEASE_DURATION`: Initial lease duration (in seconds) for KV blocks on the prefiller. (Optional) | ||
|
Member
See #25700 - let's model this as config properly. Probably The "lease_extension" and "heartbeat_interval" configs are particularly niche - we would probably be fine with a hard-coded |
||
| - Default: 30 | ||
| - After a prefill completes, the prefiller holds KV blocks for this duration. The decoder sends periodic heartbeats to extend the lease. | ||
| - If no heartbeat is received within the lease duration, blocks are released. | ||
|
|
||
| - `VLLM_NIXL_KV_LEASE_EXTENSION`: Lease extension (in seconds) granted per heartbeat. (Optional) | ||
| - Default: 20 | ||
| - Each heartbeat from the decoder extends the lease by this amount. | ||
|
NickLucche marked this conversation as resolved.
|
||
|
|
||
| - `VLLM_NIXL_KV_HEARTBEAT_INTERVAL`: Interval (in seconds) at which the decoder sends heartbeats. (Optional) | ||
| - Default: 5 | ||
| - Should be less than `VLLM_NIXL_KV_LEASE_EXTENSION` to ensure timely renewal. | ||
|
|
||
| ## Multi-Instance Setup | ||
|
|
||
|
|
||
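The lease settings documented in the diff above can be sketched as a small model. This is a minimal illustration, not the connector's implementation: the `KVLease` class and `validate_lease_settings` helper are hypothetical names, and the sketch assumes each heartbeat moves the expiry to "arrival time plus extension" without ever shortening the current lease, which the docs do not spell out. Only the three defaults (30, 20, 5) come from the documented variables.

```python
from dataclasses import dataclass

# Defaults taken from the documented environment variables.
LEASE_DURATION = 30.0      # VLLM_NIXL_KV_LEASE_DURATION
LEASE_EXTENSION = 20.0     # VLLM_NIXL_KV_LEASE_EXTENSION
HEARTBEAT_INTERVAL = 5.0   # VLLM_NIXL_KV_HEARTBEAT_INTERVAL


def validate_lease_settings(heartbeat_interval: float, extension: float) -> None:
    """Reject settings where a lease could lapse between two heartbeats."""
    if heartbeat_interval >= extension:
        raise ValueError(
            "heartbeat interval must be less than the lease extension"
        )


@dataclass
class KVLease:
    """Hypothetical prefiller-side lease for one request's KV blocks."""

    expires_at: float

    @classmethod
    def grant(cls, now: float, duration: float = LEASE_DURATION) -> "KVLease":
        # The initial lease starts when the prefill completes.
        return cls(expires_at=now + duration)

    def is_expired(self, now: float) -> bool:
        return now >= self.expires_at

    def heartbeat(self, now: float, extension: float = LEASE_EXTENSION) -> bool:
        """Extend the lease; return False if it has already lapsed."""
        if self.is_expired(now):
            return False
        # Assumption: the extension counts from the heartbeat's arrival
        # and never shortens a lease that already runs longer.
        self.expires_at = max(self.expires_at, now + extension)
        return True


validate_lease_settings(HEARTBEAT_INTERVAL, LEASE_EXTENSION)

# Prefill completes at t=0; the decoder heartbeats every 5 s until t=60.
lease = KVLease.grant(now=0.0)
t = HEARTBEAT_INTERVAL
while t <= 60.0:
    lease.heartbeat(now=t)
    t += HEARTBEAT_INTERVAL

print(lease.expires_at)
```

Under these assumptions, once the decoder goes silent the prefiller keeps the blocks for at most one extension period after the last heartbeat, then releases them.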
Just thinking about ... "is it ok to just drop this, what about compatibility?" ... which reminds me ...
We need P and D to upgrade to this in lockstep, so that means we need to update `NIXL_CONNECTOR_VERSION`.
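To illustrate why bumping the version forces a lockstep upgrade, here is a rough sketch of a handshake-time compatibility check. Everything in it is an assumption for illustration: the `HandshakeMetadata` class, the `check_compatible` helper, and the version value are hypothetical, and the real connector's handshake may be structured differently.

```python
from dataclasses import dataclass

# Placeholder value for illustration only; not the real constant's value.
NIXL_CONNECTOR_VERSION = 2


@dataclass(frozen=True)
class HandshakeMetadata:
    """Hypothetical subset of what one side advertises during the handshake."""

    connector_version: int


def check_compatible(
    remote: HandshakeMetadata,
    local_version: int = NIXL_CONNECTOR_VERSION,
) -> None:
    """Fail fast when prefiller and decoder speak different protocol versions."""
    if remote.connector_version != local_version:
        raise RuntimeError(
            f"NIXL connector version mismatch: local={local_version}, "
            f"remote={remote.connector_version}"
        )


# A peer on the same version passes; a peer on an older version is rejected.
check_compatible(HandshakeMetadata(connector_version=2))
try:
    check_compatible(HandshakeMetadata(connector_version=1))
except RuntimeError as exc:
    print(exc)
```

With a check of this kind, a decoder that expects lease heartbeats cannot silently pair with a prefiller that still relies on the old abort timeout.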