[BugFix]: Fix multi-stage cfg bug#2801
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
@natureofnature PTAL |
|
Blocker scan:
OVERALL: 3 BLOCKERS FOUND VERDICT: REQUEST_CHANGES Issues:
Also, the load_format: dummy addition to test configs is unrelated to the bugfix — split into separate PR if it's for CI resource optimization. |
|
Blocker scan:
OVERALL: NO BLOCKERS VERDICT: REQUEST_CHANGES The fix is reasonable (empty string for default negative prompt, early return if neg_prompt is empty), but a few issues:
Minor: load_format: dummy change in CI configs is unrelated to the bugfix. Split into a separate PR if it's intentional. |
@hsliuustc0106 Can you help approve it |
Signed-off-by: princepride <wangzhipeng628@gmail.com>
ba46a24 to
f1b230e
Compare
|
@hsliuustc0106 conflict resolved |
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
lishunyang12
left a comment
There was a problem hiding this comment.
Review: [BugFix]: Fix multi-stage cfg bug
Verdict: Approve
The core logic change is correct and well-reasoned. For text2img, the original BAGEL model uses an empty KV cache (0 tokens) as the text-unconditional CFG branch. The previous code was sending <|im_start|><|im_end|> through the LLM stage to produce a KV cache for this branch, which was incorrect in the multi-stage setup. The fix correctly uses the default empty NaiveCache instead, preserving CFG guidance with the right unconditional baseline.
What I verified
-
_get_negative_promptreturning"": This correctly causesexpand_cfg_prompts/expand_cfg_prompts_thinkto skip creating a companion request for text2img when no user-specified negative prompt exists. The empty KV cache (NaiveCache) inpipeline_bagel.pyserves as the text-unconditional branch, matching reference BAGEL behavior. -
cfg_img_kvfallback change (if cfg_img_kv is Noneinstead ofif cfg_img_kv is None and cfg_text_kv is not None): This is correct. For text2img, cfg_img should always fall back to the gen KV cache (injected_kv) regardless of whether cfg_text_kv was received. The old guard incorrectly skipped this assignment when cfg_text_kv was also None. -
Removal of
cfg_parallel_contractfallback: The old code disabled CFG entirely (scales=1.0) when no companion KV caches were received. This was wrong for text2img because text2img legitimately has no cfg_text companion -- its unconditional branch is an empty cache. The removal is correct. -
img2img path is unaffected: The early return
if not neg_prompt: return []is correctly placed only inside the"image" in modalitiesbranch, not the"img2img"branch. For img2img, even an empty negative prompt is meaningful because the companion still carries the image data. -
Test updates: Reference pixel values changed, which is expected since the CFG is now actually being applied (previously it was silently disabled, producing degraded results).
Minor nit (non-blocking)
The docstring of _get_negative_prompt still reads:
"An empty string is treated the same as absent (falls through to the Bagel default token pair), because an empty negative prompt is not meaningful for CFG guidance."
This is now stale -- the function returns "" rather than a token pair. Consider updating the docstring to reflect the new behavior, e.g. "Returns an empty string when no negative prompt is configured, which signals the caller to skip creating a CFG text companion."
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Fix indentation error in pipeline_bagel.py Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
| pass | ||
|
|
||
| if cfg_img_kv is None: | ||
| cfg_img_kv = injected_kv |
There was a problem hiding this comment.
With the else, cfg_img_kv = injected_kv is dead — cfg_img_kv isn't read after this block. Was the intent to also populate cfg_img_context with injected_kv when it's None? The single-stage path populates cfg_img_context with the positive prompt KV via forward_cache_update_text, and for text2img multi-stage injected_kv is the equivalent. If leaving cfg_img_context empty here is intentional, drop the assignment.
| @@ -300,4 +304,4 @@ def _get_negative_prompt( | |||
| if neg: | |||
There was a problem hiding this comment.
Docstring above (line 293-296) is stale — still says "falls through to the Bagel default token pair" but we now return "". Please update.
Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.
Purpose
In bagel, cfg_text use [] rather then use "<|im_start|><|im_end|>" as the default negative prompt.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model. Please runmkdocs serveto sync the documentation editions to./docs.BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)