
[feature] : add cache-dit for stable-audio-open-1.0#1341

Merged
SamitHuang merged 15 commits into vllm-project:main from akshatvishu:cache-dit-sao
Apr 11, 2026

Conversation

@akshatvishu (Contributor) commented Feb 11, 2026

Part of #1217

Purpose

Add cache-dit support for Stable Audio Open 1.0.

Test Plan

omni = Omni(
    model=MODEL_PATH,
    dtype="float16",
    num_workers=1,
    cache_backend=cache_backend,
    cache_config=cache_config,
)

sampling_params = OmniDiffusionSamplingParams(
    num_inference_steps=100,
    guidance_scale=7.0,
    seed=42,
    extra_args={"audio_end_in_s": 10.0}
)

outputs = omni.generate(
    {"prompt": "The sound of a hammer hitting a wooden surface", "negative_prompt": "Low quality, noisy"},
    sampling_params
)
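For completeness, here is a hedged sketch of the two names the snippet above leaves undefined. The `"cache_dit"` backend identifier string and the dict shape of `cache_config` are assumptions on my part; the key names and values mirror `configA_balanced` from the results below.

```python
# Hypothetical definitions for the undefined names in the test-plan snippet.
# "cache_dit" as the backend identifier is an assumption; the config keys
# mirror the configA_balanced values reported in the results.
cache_backend = "cache_dit"
cache_config = {
    "Fn_compute_blocks": 2,            # blocks always computed at the front of the net
    "Bn_compute_blocks": 0,            # blocks always computed at the back
    "max_warmup_steps": 4,             # steps before caching may kick in
    "residual_diff_threshold": 0.22,   # cache hit if relative residual drift is below this
    "max_continuous_cached_steps": 3,  # cap on consecutive cached steps
    "enable_taylorseer": True,
    "taylorseer_order": 1,
}
```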

Full comprehensive testing can be found in this kaggle_notebook.

Test Result

  • Device: cuda
  • GPU: NVIDIA Tesla T4
  • Prompt: "The sound of a hammer hitting a wooden surface"
  • num_inference_steps=100
  • guidance_scale=7.0
  • max_audio_length = 10 seconds

Baseline:

| Configuration | Time | Speed-up (vs baseline) | File (mp3) |
|---|---|---|---|
| Baseline (OMNI) | 25.91 s | – | baseline.mp3 |
| Baseline (HF Diffusers) | 27.78 s | – | baseline_hf.mp3 |

Config1:

    "configA_balanced": {
        "Fn_compute_blocks": 2,
        "Bn_compute_blocks": 0,
        "max_warmup_steps": 4,
        "residual_diff_threshold": 0.22,
        "max_continuous_cached_steps": 3,
        "enable_taylorseer": True,
        "taylorseer_order": 1,
    },
| Configuration | Time | Speed-up (vs baseline) | File (mp3) |
|---|---|---|---|
| OMNI | 22.08 s | 1.17x | configA_balanced.mp3 |
| HF Diffusers + cache-dit | 24.69 s | 1.13x | configA_balanced_hf.mp3 |

Config2:

    "configB_aggressive": {
        "Fn_compute_blocks": 1,
        "Bn_compute_blocks": 0,
        "max_warmup_steps": 3,
        "residual_diff_threshold": 0.30,
        "max_continuous_cached_steps": 5,
        "enable_taylorseer": True,
        "taylorseer_order": 1,
    },
| Configuration | Time | Speed-up (vs baseline) | File (mp3) |
|---|---|---|---|
| OMNI | 20.15 s | 1.29x | configB_aggressive.mp3 |
| HF Diffusers + cache-dit | 24.05 s | 1.16x | configB_aggressive_hf.mp3 |

Config3:

    "configC_ultra": {
        "Fn_compute_blocks": 1,
        "Bn_compute_blocks": 0,
        "max_warmup_steps": 2,
        "residual_diff_threshold": 0.35,
        "max_continuous_cached_steps": 6,
        "enable_taylorseer": True,
        "taylorseer_order": 2,
    }
}
| Configuration | Time | Speed-up (vs baseline) | File (mp3) |
|---|---|---|---|
| OMNI | 19.16 s | 1.35x | configC_ultra.mp3 |
| HF Diffusers + cache-dit | 20.90 s | 1.33x | configC_ultra_hf.mp3 |
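As a quick sanity check on the tables above, the speed-up column is simply the baseline time divided by the config time (OMNI timings shown):

```python
# Recompute the reported OMNI speed-ups from the raw timings in the tables.
baseline_s = 25.91
config_times_s = {
    "configA_balanced": 22.08,
    "configB_aggressive": 20.15,
    "configC_ultra": 19.16,
}
speedups = {name: round(baseline_s / t, 2) for name, t in config_times_s.items()}
# speedups == {"configA_balanced": 1.17, "configB_aggressive": 1.29, "configC_ultra": 1.35}
```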

Files are in .mp3 format because GitHub doesn't support .wav attachments in comments.

Note :

  • Stable Audio Open 1.0 exhibits high natural step-to-step drift (median residual ≈ 0.34, as reported by cache-dit.summary() when running the same config as vllm-omni in the HF Diffusers + cache-dit setup). To achieve significant speedups on T4 hardware, residual_diff_threshold must sit near or above this drift value: a conservative threshold such as 0.12 yielded a 1.00x speedup (or even a slowdown) because the cache missed on nearly every step, leaving only the management overhead without any compute savings.

  • The vllm-omni orchestrator performs a 1-step dummy warmup run during server initialization. If a user provides an SCM (Step Computation Masking) policy, the engine crashes with the following error:

AssertionError: Only total_steps=4 or 6 is supported for predefined masks while total_steps < 8. Got total_steps=1.

Thus, I am wondering whether we should add a guard condition like the one below, or whether this is acceptable behavior.

def refresh_cache_context(pipeline: Any, num_inference_steps: int, verbose: bool = True) -> None:
    """
    Refresh cache context.
    Guards against the 1-step dummy warmup causing SCM mask generation errors.
    """
    # Disable the SCM policy for the 1-step dummy warmup to prevent the AssertionError
    effective_mask_policy = cache_config.scm_steps_mask_policy if num_inference_steps > 1 else None
  • Also added the missing _repeated_blocks = ["StableAudioDiTBlock"] to StableAudioDiTModel to enable regional compilation and backend patching.
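To illustrate the first note above: a minimal, framework-free sketch of the drift metric being described, i.e. the mean absolute change between consecutive step residuals relative to the previous step. This approximates cache-dit's relative-residual check for intuition only; it is not the library's actual implementation.

```python
def median_residual_drift(residual_steps):
    """Median relative L1 drift between consecutive per-step residuals.

    residual_steps: list of flat lists of floats, one entry per diffusion step.
    Approximates the metric that residual_diff_threshold is compared against:
    mean|r_t - r_{t-1}| / mean|r_{t-1}|.
    """
    drifts = []
    for prev, cur in zip(residual_steps, residual_steps[1:]):
        num = sum(abs(c - p) for c, p in zip(cur, prev)) / len(prev)
        den = sum(abs(p) for p in prev) / len(prev)
        drifts.append(num / den)
    drifts.sort()
    mid = len(drifts) // 2
    return drifts[mid] if len(drifts) % 2 else (drifts[mid - 1] + drifts[mid]) / 2
```

If this median comes out near 0.34, as the note reports for Stable Audio Open 1.0, a threshold of 0.12 will almost never trigger a cache hit.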
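On the last point, the `_repeated_blocks` change is just a class attribute naming the transformer block class that repeats through the network, so compilation/patching backends can treat those instances uniformly. A simplified stand-in (not the real diffusers class):

```python
class StableAudioDiTBlock:  # stand-in for the repeated transformer block
    pass


class StableAudioDiTModel:  # simplified stand-in for the diffusers model class
    # Names of block classes that repeat through the network; backends use this
    # to apply regional compilation / patching to each repeated instance.
    _repeated_blocks = ["StableAudioDiTBlock"]
```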


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf50517d5d


Comment thread vllm_omni/diffusion/cache/cache_dit_backend.py
@SamitHuang added the ready label (triggers buildkite CI) on Feb 12, 2026
@hsliuustc0106 (Collaborator)

fix DCO please

@hsliuustc0106 removed the ready label (triggers buildkite CI) on Feb 12, 2026
@akshatvishu (Contributor, Author)

@hsliuustc0106 Sorry! I've updated it!

@lishunyang12 (Collaborator) left a comment

Good addition -- the backend code looks correct after the Pattern_3 + cache_config fixes, but the docs table has a column count mismatch that will render broken.

Comment thread docs/user_guide/diffusion_acceleration.md Outdated
Comment thread docs/user_guide/diffusion_acceleration.md Outdated
Comment thread vllm_omni/diffusion/cache/cache_dit_backend.py Outdated
Comment thread vllm_omni/diffusion/cache/cache_dit_backend.py
@hsliuustc0106 (Collaborator)

@vllm-omni-reviewer

@Gaohan123 (Collaborator)

@akshatvishu Please resolve reviews and conflicts. Thanks!

@Gaohan123 added this to the v0.18.0 milestone on Mar 17, 2026
@linyueqian (Collaborator)

@akshatvishu any updates?

@akshatvishu (Contributor, Author)

@linyueqian Ready to go from my side! Happy to run more tests if needed!

@akshatvishu (Contributor, Author)

The mkdocs CI was failing due to:

ERROR - mkdocstrings: Couldn't load inventory https://psutil.readthedocs.io/en/stable/objects.inv through handler 'python': HTTP Error 404: Not Found

Since main already has the fix, I pulled the latest changes.

@linyueqian added the ready label (triggers buildkite CI) on Mar 24, 2026
@hsliuustc0106 (Collaborator)

Please resolve conflicts.

@akshatvishu (Contributor, Author)

@hsliuustc0106 done!

@SamitHuang (Collaborator) left a comment

LGTM

@SamitHuang merged commit c9e8411 into vllm-project:main on Apr 11, 2026
7 of 8 checks passed
@akshatvishu (Contributor, Author)

thanks for all the reviews @linyueqian !

daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request Apr 13, 2026
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
