
[Bugfix] Set separate CFG flag in Helios for CacheDiT#3756

Merged
hsliuustc0106 merged 1 commit into
vllm-project:mainfrom
alex-jw-brooks:helios_fix
May 22, 2026

Conversation

@alex-jw-brooks
Contributor

Purpose

Related: #2527

It looks like Helios should have has_separate_cfg=True for Cache-DiT, since we always handle the positive and negative embeds separately, for both standard CFG and CFG Zero* (here). You can also see this is the case in the Cache-DiT adapter for Helios (here).
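To illustrate why the flag matters, here is a toy sketch of a step cache that is shared between the conditional and unconditional forward passes. `ToyBlockCache` and `run` are hypothetical names invented for this illustration and are not Cache-DiT's real API. The sketch only assumes that separate CFG means the forward calls alternate between the positive and negative branch, and that a cache unaware of this keeps a single slot for both.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlockCache:
    """Hypothetical stand-in for a DiT block cache (not Cache-DiT's real API)."""

    has_separate_cfg: bool
    calls: int = 0
    slots: dict = field(default_factory=dict)

    def _slot(self) -> str:
        # Assumption: with separate CFG, forward calls alternate
        # cond, uncond, cond, uncond, ... so odd calls are the uncond branch.
        if self.has_separate_cfg and self.calls % 2 == 1:
            return "uncond"
        return "cond"

    def forward(self, compute, refresh: bool) -> float:
        slot = self._slot()
        self.calls += 1
        if refresh or slot not in self.slots:
            self.slots[slot] = compute()  # full compute, then store
        return self.slots[slot]  # otherwise reuse the cached value


def run(has_separate_cfg: bool, scale: float = 5.0) -> list[float]:
    cache = ToyBlockCache(has_separate_cfg)
    outputs = []
    for step in range(2):
        refresh = step == 0  # compute on step 0, reuse the cache on step 1
        cond = cache.forward(lambda: 1.0, refresh)
        uncond = cache.forward(lambda: 0.0, refresh)
        # Standard CFG combination of the two branches.
        outputs.append(uncond + scale * (cond - uncond))
    return outputs


print(run(has_separate_cfg=True))
print(run(has_separate_cfg=False))
```

In this toy model, the single-slot cache hands the stale unconditional value to the conditional branch on the cached step, so the guidance term collapses to zero. The real failure mode in Cache-DiT may differ in detail, but the sketch shows the general hazard of sharing cache state across the two branches.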

Currently, with Cache-DiT enabled, the output of some models, e.g., Helios base, comes out fuzzy:

For example, ground truth:

seed_131_gt.mp4

Example with Cache DiT on:

helios_cache_bug.mp4

Result after fix:

post_fix.mp4

Test Plan

Added Helios to the explicit checks for has_separate_cfg=True (which were previously added for the same issue on Longcat/ltx2 in #2860).
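As a rough sketch of what such a check could look like, the snippet below verifies that every model expected to use separate CFG has the flag set. The registry `ADAPTER_SETTINGS`, the tuple `SEPARATE_CFG_MODELS`, and the helper `models_missing_separate_cfg` are hypothetical names for illustration; the actual test in the repository may be structured differently.

```python
# Hypothetical registry: model name -> Cache-DiT adapter settings.
ADAPTER_SETTINGS = {
    "helios": {"has_separate_cfg": True},
    "longcat": {"has_separate_cfg": True},
    "ltx2": {"has_separate_cfg": True},
}

# Models that run the positive and negative CFG branches as separate passes.
SEPARATE_CFG_MODELS = ("helios", "longcat", "ltx2")


def models_missing_separate_cfg(settings: dict, required: tuple) -> list:
    """Return the required models whose settings lack has_separate_cfg=True."""
    return [
        name
        for name in required
        if not settings.get(name, {}).get("has_separate_cfg", False)
    ]


# An empty list means every listed model sets the flag.
print(models_missing_separate_cfg(ADAPTER_SETTINGS, SEPARATE_CFG_MODELS))
```

Listing the models explicitly means the check also catches a regression where the flag is silently dropped for a model that has no other test coverage.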

Signed-off-by: Alex Brooks <albrooks@redhat.com>

@hsliuustc0106
Collaborator

Could we make this model support real-time streaming output?

@hsliuustc0106
Collaborator

I am trying to optimize this model.

@alex-jw-brooks
Contributor Author

@hsliuustc0106 Yes, I think Helios should be able to support real-time, at least if it's one request at a time, since it was originally supposed to get up to ~19 FPS on H100!

This PR is just a correctness fix, though, since the outputs get destroyed when CFG isn't separated correctly.

Collaborator

@hsliuustc0106 hsliuustc0106 left a comment


LGTM - well-justified fix with visual evidence and test coverage.

@hsliuustc0106
Collaborator

> @hsliuustc0106 Yes, I think Helios should be able to support real-time, at least if it's one request at a time, since it was originally supposed to get up to ~19 FPS on H100!
>
> This PR is just a correctness fix though since the outputs get destroyed by not separating CFG correctly

Please check #3737.

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label May 22, 2026
@hsliuustc0106 hsliuustc0106 merged commit 23e2433 into vllm-project:main May 22, 2026
7 of 8 checks passed
zengchuang-hw pushed a commit to zengchuang-hw/vllm-omni that referenced this pull request Jun 1, 2026
