From 17643dd988a8a1f539fa4f12595cbadf8540565e Mon Sep 17 00:00:00 2001
From: Yuanheng Zhao
Date: Mon, 2 Mar 2026 23:38:19 +0800
Subject: [PATCH] fix doc link

Signed-off-by: Yuanheng Zhao
---
 docs/configuration/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/configuration/README.md b/docs/configuration/README.md
index 02440b95dce..e2073a96ae9 100644
--- a/docs/configuration/README.md
+++ b/docs/configuration/README.md
@@ -2,7 +2,7 @@
 
 This section lists the most common options for running vLLM-Omni.
 
-For options within a vLLM Engine. Please refer to [vLLM Configuration](https://docs.vllm.ai/en/v0.14.0/configuration/index.html)
+For options within a vLLM Engine. Please refer to [vLLM Configuration](https://docs.vllm.ai/en/v0.16.0/configuration/index.html)
 
 Currently, the main options are maintained by stage configs for each model.
 
@@ -19,3 +19,4 @@ For introduction, please check [Introduction for stage config](./stage_configs.m
 - **[TeaCache Configuration](../user_guide/diffusion/teacache.md)** - Enable TeaCache adaptive caching for DiT models to achieve 1.5x-2.0x speedup with minimal quality loss
 - **[Cache-DiT Configuration](../user_guide/diffusion/cache_dit_acceleration.md)** - Enable Cache-DiT as cache acceleration backends for DiT models
 - **[Parallelism Configuration](../user_guide/diffusion/parallelism_acceleration.md)** - Enable parallelism (e.g., sequence parallelism) for for DiT models
+- **[CPU Offloading](../user_guide/diffusion/cpu_offload_diffusion.md)** - Enable CPU offloading (model-level and layerwise) for for DiT models