[Feature]: Add CFG param to online serving by gDINESH13 · Pull Request #824 · vllm-project/vllm-omni

gDINESH13 · 2026-01-17T07:17:17Z

closes #777

Purpose

Make cfg_parallel_size, already available in offline inference, configurable in online serving for diffusion models.

Test Plan

My device doesn't have enough hardware to run model inference tests, but I have verified that the
parameter plumbing works as expected by running the script below.

# test_cfg_parallel_local.py
from vllm_omni.entrypoints.openai.protocol.images import ImageGenerationRequest
from vllm_omni.diffusion.data import DiffusionParallelConfig

print("Test 1: API Schema")
req = ImageGenerationRequest(prompt="test", cfg_parallel_size=2)
assert req.cfg_parallel_size == 2
print("PASSED")

print("\nTest 2: Configuration Flow")
config = DiffusionParallelConfig(ulysses_degree=1, ring_degree=1, cfg_parallel_size=2)
assert config.cfg_parallel_size == 2
assert config.world_size == 2
print("PASSED")

print("\nTest 3: Backward Compatibility")
req_default = ImageGenerationRequest(prompt="test")
assert req_default.cfg_parallel_size is None
print("PASSED")

Test Result

Result of test script execution.

Test 1: API Schema
PASSED

Test 2: Configuration Flow
PASSED

Test 3: Backward Compatibility
PASSED
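
For readers who do not have vllm_omni installed, the check in Test 2 can be approximated with a standalone stand-in. This is only a sketch under stated assumptions: `ParallelConfigSketch` is a hypothetical dataclass, not the real `DiffusionParallelConfig`, and it assumes `world_size` is simply the product of the three degrees used in the test script. The real class may include further factors (for example tensor parallelism).

```python
from dataclasses import dataclass


@dataclass
class ParallelConfigSketch:
    """Hypothetical stand-in for DiffusionParallelConfig (not the real class)."""

    ulysses_degree: int = 1
    ring_degree: int = 1
    cfg_parallel_size: int = 1

    def __post_init__(self) -> None:
        # Reject non-positive degrees early so the product below stays meaningful.
        for name in ("ulysses_degree", "ring_degree", "cfg_parallel_size"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be >= 1")

    @property
    def world_size(self) -> int:
        # Assumption: each parallel dimension multiplies the number of ranks.
        return self.ulysses_degree * self.ring_degree * self.cfg_parallel_size


# Mirrors the values used in Test 2 of the script above.
config = ParallelConfigSketch(ulysses_degree=1, ring_degree=1, cfg_parallel_size=2)
assert config.cfg_parallel_size == 2
assert config.world_size == 2
```

Under this assumption, enabling CFG parallelism with a size of 2 doubles the number of ranks a stage needs, which is why the device-count logic has to take it into account.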

Signed-off-by: Dinesh G <G.Dinesh@ibm.com> Signed-off-by: gDINESH13 <dinesh13g@gmail.com>

hsliuustc0106 · 2026-01-17T07:20:11Z

please update the text-to-image.py under examples
please add it into api tests if necessary

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9ddc72391

chatgpt-codex-connector · 2026-01-17T07:21:51Z

+            cfg_parallel_size = kwargs.get("cfg_parallel_size") or 1
            if sequence_parallel_size is None:
                sequence_parallel_size = ulysses_degree * ring_degree
-            num_devices = sequence_parallel_size * tensor_parallel_size
+            num_devices = sequence_parallel_size * tensor_parallel_size * cfg_parallel_size


Account for CFG parallel GPUs in stage device locking

The new cfg_parallel_size factor increases the diffusion stage’s device list (num_devices now multiplies by cfg_parallel_size), but the stage worker’s lock calculation still only uses TP/PP/DP/SP (vllm_omni/entrypoints/omni_stage.py lines 470–499). When cfg_parallel_size > 1 and multiple stages/processes initialize concurrently, the extra CFG GPUs won’t be locked, so another stage can initialize on them at the same time, defeating the “lock ALL devices” guarantee and risking memory-calculation/OOM races. Consider including cfg_parallel_size in num_devices_per_stage (or otherwise locking all CUDA_VISIBLE_DEVICES) to keep the lock coverage consistent with the new device list.
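
One way to keep the device list and the lock coverage consistent is to derive both from a single helper. The following is a minimal sketch, not the code in omni_stage.py: `num_devices_for_stage` and `device_ids_to_lock` are hypothetical names, and the sketch assumes devices are numbered contiguously from 0. It only reuses the arithmetic shown in the diff above, including the `or 1` fallback for an unset cfg_parallel_size.

```python
from __future__ import annotations


def num_devices_for_stage(
    tensor_parallel_size: int = 1,
    ulysses_degree: int = 1,
    ring_degree: int = 1,
    sequence_parallel_size: int | None = None,
    cfg_parallel_size: int | None = None,
) -> int:
    """Hypothetical helper: device count for one diffusion stage."""
    # Same fallback as in the diff: None (or 0) is treated as "no CFG parallelism".
    cfg = cfg_parallel_size or 1
    if sequence_parallel_size is None:
        sequence_parallel_size = ulysses_degree * ring_degree
    return sequence_parallel_size * tensor_parallel_size * cfg


def device_ids_to_lock(**parallel_kwargs: int | None) -> list[int]:
    """Hypothetical helper: lock exactly as many devices as the stage will use."""
    # Assumption: devices are numbered contiguously starting at 0.
    return list(range(num_devices_for_stage(**parallel_kwargs)))


# With TP=2 and CFG parallel size 2, four devices are used and four are locked.
assert num_devices_for_stage(tensor_parallel_size=2, cfg_parallel_size=2) == 4
assert device_ids_to_lock(tensor_parallel_size=2, cfg_parallel_size=2) == [0, 1, 2, 3]
```

Because both the stage's device list and the lock set come from the same function, adding a new parallel dimension later cannot silently leave part of the device list unlocked.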

gDINESH13 · 2026-01-17T08:15:37Z

Hello @hsliuustc0106, good day. Thank you for your comments.
I have some questions:

text-to-image.py: The file already has a cfg_parallel_size parameter (lines 84-90, 147, 169).
- Should I improve the help text documentation?
- Should I remove the choices=[1, 2] restriction?
- Is there something else you'd like updated?
API tests: Should I add:
- Integration tests for the /v1/images/generations endpoint with cfg_parallel_size, like the ones already in text_image_server.py?
I missed the bot's suggestion in my earlier commit, but I think it is the right thing to do. Can you share your thoughts on it?

hsliuustc0106 · 2026-01-17T08:39:20Z

For online serving, I think we have not supported it in the examples before.
For tests, using test_image_server.py is OK.

gDINESH13 · 2026-01-17T11:42:19Z

Hey @hsliuustc0106, I've updated the examples for online serving. Please take a look when you get a chance. Thanks.

hsliuustc0106 · 2026-01-17T11:37:57Z

-            type=int,
-            default=1,
-            help="Number of GPUs for CFG parallel computation"
+            "--cfg-parallel-size", type=int, default=1, help="Number of GPUs for CFG parallel computation"


keep it as before

Do you want me to remove this? That would make this param unavailable for configuration when starting the server, right?

No, I mean keep the line breaks as before.

It fails the pre-commit formatting check if I keep the line breaks.
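
The disagreement in this thread is about layout only: to argparse, the one-line and multi-line forms of the call are identical. The sketch below illustrates that with a hypothetical `build_parser` function (the real server's parser setup is not shown in this thread); only the flag name, type, default, and help text are taken from the diff above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser used only to illustrate the --cfg-parallel-size flag."""
    parser = argparse.ArgumentParser(description="Illustration only")
    # Multi-line layout; a one-line add_argument(...) call behaves the same.
    parser.add_argument(
        "--cfg-parallel-size",
        type=int,
        default=1,
        help="Number of GPUs for CFG parallel computation",
    )
    return parser


# argparse converts the dashes in the flag name to underscores in the namespace.
args = build_parser().parse_args(["--cfg-parallel-size", "2"])
assert args.cfg_parallel_size == 2

# Omitting the flag falls back to the default of 1 (no CFG parallelism).
assert build_parser().parse_args([]).cfg_parallel_size == 1
```

Whichever layout the formatter accepts, the flag stays available when starting the server.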

hsliuustc0106 · 2026-01-17T11:38:57Z

        le=20.0,
        description="True CFG scale (model-specific parameter, may be ignored if not supported)",
    )
-    cfg_parallel_size: int | None = Field(


keep it as before

removed this

hsliuustc0106 · 2026-01-17T11:44:56Z

 bash run_server.sh
 ```

+### Start with CFG Parallelism


@wtomin can CFG be applied to all models now? I think we need to wait for the refactoring, right? If so, maybe we should keep the examples as they are until the refactoring is finished.

hsliuustc0106 · 2026-01-17T11:46:12Z

@gDINESH13 I realized that CFG parallel is not applicable to all models at this stage. Maybe we should keep the examples as before for now.

gDINESH13 · 2026-01-17T12:15:27Z

Just to confirm the scope: since we're keeping the existing example as-is, please let me know if there's anything else you want included or excluded.

hsliuustc0106 · 2026-01-17T12:50:25Z

yes

gDINESH13 · 2026-01-17T14:15:15Z

@hsliuustc0106 I have removed the changes I made to the example files. I hope we are now on the same page.

david6666666 · 2026-01-19T03:55:27Z

lgtm

david6666666 · 2026-01-19T04:26:17Z

Thanks for your contribution.

Add CFG param to online serving

c9ddc72

Signed-off-by: Dinesh G <G.Dinesh@ibm.com> Signed-off-by: gDINESH13 <dinesh13g@gmail.com>

gDINESH13 requested a review from hsliuustc0106 as a code owner January 17, 2026 07:17

chatgpt-codex-connector Bot reviewed Jan 17, 2026

Implemented review comments and pre-commit formatting

d9ba570

Signed-off-by: gDINESH13 <dinesh13g@gmail.com>

gDINESH13 force-pushed the 777_add_cfg_param branch from 5785c42 to d9ba570 Compare January 17, 2026 11:39

hsliuustc0106 reviewed Jan 17, 2026

Removed changes done in examples, as per comments

da4b024

Signed-off-by: gDINESH13 <dinesh13g@gmail.com>

gDINESH13 requested a review from hsliuustc0106 January 17, 2026 14:15

david6666666 added the ready label (label to trigger buildkite CI) Jan 19, 2026

david6666666 approved these changes Jan 19, 2026

david6666666 enabled auto-merge (squash) January 19, 2026 03:56

david6666666 merged commit 5e7035e into vllm-project:main Jan 19, 2026
6 of 7 checks passed

with1015 pushed a commit to with1015/vllm-omni that referenced this pull request Jan 20, 2026

[Feature]: Add CFG param to online serving (vllm-project#824)

e5e941e

Signed-off-by: Dinesh G <G.Dinesh@ibm.com> Signed-off-by: gDINESH13 <dinesh13g@gmail.com>
