🐛 Fix tiered-prefix-cache CrashLoopBackOff: num_cpu_blocks → cpu_bytes_to_use by clubanderson · Pull Request #768 · llm-d/llm-d

clubanderson · 2026-02-14T01:53:21Z

Summary

- Fixes the nightly E2E tiered-prefix-cache/cpu workflow CrashLoopBackOff on OpenShift.
- vLLM v0.14.1 (PR vllm-project/vllm#24498, merged Jan 12 2026) replaced `num_cpu_blocks` with `cpu_bytes_to_use` in the OffloadingConnector's CPUOffloadingSpec.
- The guide manifest and nightly workflow still used the old `num_cpu_blocks` parameter, causing: `Exception: cpu_bytes_to_use must be specified in kv_connector_extra_config`
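As a rough illustration of the failure mode described above, the sketch below builds a KV-transfer config that carries the new key and runs a stand-in check that rejects a config still using the old one. The `kv_role` value and both helper functions are assumptions made for illustration; `check_extra_config` is a hypothetical stand-in, not vLLM's actual validation code.

```python
import json


def build_kv_transfer_config(cpu_bytes_to_use: int) -> str:
    """Return a JSON string for a KV-transfer config that uses the new key."""
    config = {
        "kv_connector": "OffloadingConnector",
        "kv_role": "kv_both",  # assumption: this PR does not state the role
        "kv_connector_extra_config": {"cpu_bytes_to_use": cpu_bytes_to_use},
    }
    return json.dumps(config)


def check_extra_config(extra_config: dict) -> None:
    """Hypothetical stand-in for the startup check (not vLLM's real code)."""
    if "cpu_bytes_to_use" not in extra_config:
        raise ValueError(
            "cpu_bytes_to_use must be specified in kv_connector_extra_config"
        )


if __name__ == "__main__":
    # The new key passes the stand-in check.
    new_style = json.loads(build_kv_transfer_config(10737418240))
    check_extra_config(new_style["kv_connector_extra_config"])

    # The old key alone is rejected, mirroring the crash described above.
    try:
        check_extra_config({"num_cpu_blocks": 4000})
    except ValueError as err:
        print(err)
```

A config that only sets `num_cpu_blocks` is not silently ignored in this sketch; it fails fast, which matches the pods crashing immediately at startup.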

Changes

- Guide manifest (`offloading-connector/kustomization.yaml`): `num_cpu_blocks: 41000` → `cpu_bytes_to_use: 107374182400` (100 GiB)
- Nightly slim patch (`nightly-e2e-tiered-prefix-cache.yaml`): `num_cpu_blocks: 4000` → `cpu_bytes_to_use: 10737418240` (10 GiB)
- Benchmark docs (`README.md`): updated the parameter reference
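The byte values in the list above are whole binary gigabytes (multiples of 1024³ bytes). A minimal sketch of the arithmetic, using a hypothetical `gib_to_bytes` helper that is not part of the repository:

```python
GIB = 1024 ** 3  # bytes in one binary gigabyte (GiB)


def gib_to_bytes(gib: int) -> int:
    """Convert a whole number of GiB to bytes."""
    return gib * GIB


guide_bytes = gib_to_bytes(100)   # value used in the guide manifest
nightly_bytes = gib_to_bytes(10)  # value used in the nightly slim patch

print(guide_bytes)    # 107374182400
print(nightly_bytes)  # 10737418240
```

Deriving the number from a GiB count rather than typing it by hand makes it harder to drop or add a digit in the manifest.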

Test plan

- Nightly E2E tiered-prefix-cache workflow passes on OpenShift (no more CrashLoopBackOff)
- Manual `kubectl apply -k guides/tiered-prefix-cache/cpu/manifests/vllm/offloading-connector` deploys successfully
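Alongside the checks above, one could also scan manifest text for leftover uses of the old key before deploying. The following is only a sketch of that idea; `find_stale_lines` is a hypothetical helper and is not part of this PR or of the nightly workflow.

```python
import re

# Matches the old parameter name as a whole word.
STALE_KEY = re.compile(r"\bnum_cpu_blocks\b")


def find_stale_lines(text: str) -> list[int]:
    """Return 1-based line numbers that still mention num_cpu_blocks."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if STALE_KEY.search(line)
    ]


if __name__ == "__main__":
    sample = (
        'kv_connector_extra_config:\n'
        '  num_cpu_blocks: 4000\n'
        '  cpu_bytes_to_use: 10737418240\n'
    )
    print(find_stale_lines(sample))  # [2]
```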

…s_to_use

vLLM v0.14.1 (PR #24498, merged Jan 12 2026) replaced the `num_cpu_blocks` config key with `cpu_bytes_to_use` in the OffloadingConnector's CPUOffloadingSpec. This causes model server pods to crash immediately with:

Exception: cpu_bytes_to_use must be specified in kv_connector_extra_config

Update all references:
- Guide manifest: 41000 blocks (~100GB) → 107374182400 bytes (100GB)
- Nightly slim patch: 4000 blocks (~10GB) → 10737418240 bytes (10GB)
- README benchmark section: updated parameter name

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Andrew Anderson <andy@clubanderson.com>

clubanderson · 2026-02-14T01:56:06Z

@Gregory-Pereira ptal!

clubanderson · 2026-02-14T01:58:02Z

this was reverted - will open new pr

clubanderson merged commit 7e80a18 into main Feb 14, 2026
23 of 24 checks passed

clubanderson added a commit that referenced this pull request Feb 14, 2026

Revert "🐛 Fix tiered-prefix-cache CrashLoopBackOff: num_cpu_blocks → …

3a09306

…cpu_bytes_to_use (#768)" This reverts commit 7e80a18.

clubanderson mentioned this pull request Feb 14, 2026

⚠️ Revert: Fix tiered-prefix-cache CrashLoopBackOff (#768) #769

Merged

clubanderson added a commit that referenced this pull request Feb 14, 2026

Revert "🐛 Fix tiered-prefix-cache CrashLoopBackOff: num_cpu_blocks → …

647a589

…cpu_bytes_to_use (#768)" (#769) This reverts commit 7e80a18.

clubanderson mentioned this pull request Feb 26, 2026

Governance, CI, and nightly E2E hygiene tracker (Feb 12–26) #853

Open

24 tasks

clubanderson merged 1 commit into main from fix/tpc-cpu-bytes-to-use
