Closed
15 changes: 12 additions & 3 deletions docs/features/nixl_connector_usage.md
@@ -126,9 +126,18 @@ python tests/v1/kv_connector/nixl_integration/toy_proxy_server.py \
- Set when prefiller and decoder are on different machines
- Connection info is passed via KVTransferParams from prefiller to decoder for handshake

- `VLLM_NIXL_ABORT_REQUEST_TIMEOUT`: Timeout (in seconds) for automatically releasing the prefiller’s KV cache for a particular request. (Optional)
- Default: 480
- If a request is aborted and the decoder has not yet read the KV-cache blocks through the nixl channel, the prefill instance will release its KV-cache blocks after this timeout to avoid holding them indefinitely.
Review comment (Member):

Just thinking about ... "is it ok to just drop this, what about compatibility?" ... which reminds me ...

We need P and D to upgrade to this in lockstep, so that means we need to update NIXL_CONNECTOR_VERSION

- `VLLM_NIXL_KV_LEASE_DURATION`: Initial lease duration (in seconds) for KV blocks on the prefiller. (Optional)
Review comment (Member):

See #25700 - let's model this as config properly. Probably kv_connector_extra_config is the place

The "lease_extension" and "heartbeat_interval" configs are particularly niche - we would probably be fine with a hard-coded lease_extension = lease_duration * 2 / 3 and heartbeat_interval = lease_duration / 6. And if anyone really needs to tweak these, we can add that later 👍
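
To make the suggestion above concrete, here is a minimal sketch of deriving both values from a single lease duration. The names `LeaseTimings` and `derive_lease_timings` are hypothetical and not part of vLLM or this PR; only the two ratios come from the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseTimings:
    """Hypothetical container for the three timing values (all in seconds)."""
    lease_duration: float
    lease_extension: float
    heartbeat_interval: float


def derive_lease_timings(lease_duration: float = 30.0) -> LeaseTimings:
    """Derive extension and heartbeat interval from the lease duration.

    Uses the ratios proposed in the review comment:
    lease_extension = lease_duration * 2 / 3
    heartbeat_interval = lease_duration / 6
    """
    if lease_duration <= 0:
        raise ValueError("lease_duration must be positive")
    return LeaseTimings(
        lease_duration=lease_duration,
        lease_extension=lease_duration * 2 / 3,
        heartbeat_interval=lease_duration / 6,
    )


timings = derive_lease_timings(30.0)
print(timings.lease_extension, timings.heartbeat_interval)
```

With a lease duration of 30 seconds, these ratios give an extension of 20 seconds and a heartbeat interval of 5 seconds, which matches the defaults documented in this diff.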

- Default: 30
- After a prefill completes, the prefiller holds KV blocks for this duration. The decoder sends periodic heartbeats to extend the lease.
- If no heartbeat is received within the lease duration, blocks are released.

- `VLLM_NIXL_KV_LEASE_EXTENSION`: Lease extension (in seconds) granted per heartbeat. (Optional)
- Default: 20
- Each heartbeat from the decoder extends the lease by this amount.
NickLucche marked this conversation as resolved.

- `VLLM_NIXL_KV_HEARTBEAT_INTERVAL`: Interval (in seconds) at which the decoder sends heartbeats. (Optional)
- Default: 5
- Should be less than `VLLM_NIXL_KV_LEASE_EXTENSION` to ensure timely renewal.
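
The lease behavior described by the three variables above can be sketched as a small model. This is an illustration under stated assumptions, not the connector's implementation: the `KVLease` class is hypothetical, timestamps are passed in explicitly so the example is deterministic, and a heartbeat is assumed to move the expiry to `now + lease_extension` without ever shortening the remaining lease.

```python
class KVLease:
    """Hypothetical model of a prefiller-side lease on a request's KV blocks."""

    def __init__(self, now: float, lease_duration: float = 30.0,
                 lease_extension: float = 20.0) -> None:
        self.lease_extension = lease_extension
        # The initial lease starts when the prefill completes.
        self.expires_at = now + lease_duration

    def is_expired(self, now: float) -> bool:
        # Once expired, the prefiller would release the KV blocks.
        return now >= self.expires_at

    def heartbeat(self, now: float) -> bool:
        """Apply a decoder heartbeat; return False if the lease already lapsed."""
        if self.is_expired(now):
            return False
        # Assumption: extend relative to the heartbeat arrival time,
        # but never shorten a lease that already runs longer.
        self.expires_at = max(self.expires_at, now + self.lease_extension)
        return True


# Decoder heartbeats every 5 seconds (the documented default interval).
lease = KVLease(now=0.0)
for t in range(5, 65, 5):
    assert lease.heartbeat(float(t))
print(lease.expires_at)  # last heartbeat at t=60 plus a 20 s extension

# Without heartbeats, the blocks become releasable after the initial lease.
idle = KVLease(now=0.0)
print(idle.is_expired(29.0), idle.is_expired(30.0))
```

Under this model, a heartbeat interval shorter than the extension means several heartbeats can be lost before the lease lapses.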

## Multi-Instance Setup
