[Model] Add edit preprocessor for HunyuanImage3 #1644
Wire HunyuanImage3 into the /images/edit path with full conditional-image preprocessing and request plumbing.

- add get_hunyuan_image_3_pre_process_func and register it for HunyuanImage3ForCausalMM
- normalize edit inputs from PIL/ndarray/tensor, resize/crop for VAE, and build VAE+ViT JointImageInfo payloads
- serialize/deserialize conditional image info so async RPC transport remains compatible
- propagate batch_cond_image_info through forward -> prepare_model_inputs
- make vae_encode accept 3D/4D image tensors by normalizing to (B, C, T, H, W)
- declare HunyuanImage3Pipeline.support_image_input = True
- implement LightProjector.forward to unblock vision aligner calls during edit generation
- extend module discovery/layerwise hints for Hunyuan model offload path
- add regression tests for preprocess payload construction and LightProjector callability

Co-authored-by: Codex <codex@openai.com>
Signed-off-by: Jeff Cook <jeff@jeffcook.io>
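The commit message says vae_encode now accepts 3D/4D image tensors by normalizing them to (B, C, T, H, W). A minimal sketch of that shape rule follows. It is not the PR's code: the function name `to_bcthw_shape` is hypothetical, and the assumed input layouts are (C, H, W) for 3D and (B, C, H, W) for 4D.

```python
from typing import Tuple


def to_bcthw_shape(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the 5D (B, C, T, H, W) shape for a 3D, 4D, or 5D image shape.

    Hypothetical helper for illustration only. Assumes 3D inputs are
    (C, H, W) and 4D inputs are (B, C, H, W).
    """
    if len(shape) == 3:
        # Single image: add a batch axis and a singleton time axis.
        c, h, w = shape
        return (1, c, 1, h, w)
    if len(shape) == 4:
        # Batched images: add a singleton time axis after the channel axis.
        b, c, h, w = shape
        return (b, c, 1, h, w)
    if len(shape) == 5:
        # Already (B, C, T, H, W): leave unchanged.
        return tuple(shape)
    raise ValueError(f"expected a 3D, 4D, or 5D shape, got {len(shape)}D")
```

On a PyTorch tensor `x`, the same rule would correspond to `x.unsqueeze(0).unsqueeze(2)` for a 3D input and `x.unsqueeze(2)` for a 4D input, again under the layout assumptions above.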
Prevent stage IPC serialization failures on image edit requests by coercing numpy scalar metadata (e.g. np.int64) into plain Python scalars before attaching conditional image payloads to prompts.

- coerce ImageInfo payload scalar fields via helper
- normalize target/base/ratio values to int during preprocess
- handle numpy scalar values in payload decode helper
- extend preprocess regression test to cover numpy int64 metadata

Co-authored-by: Codex <codex@openai.com>
Signed-off-by: Jeff Cook <jeff@jeffcook.io>
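To illustrate the idea of coercing numeric metadata before it crosses a process boundary, here is a minimal standard-library sketch. It is not the helper from this PR: `to_plain_scalars` is a hypothetical name, and it relies on the `numbers` abstract base classes (which NumPy registers its integer and floating scalar types with) rather than on `np.generic`.

```python
import numbers
from typing import Any


def to_plain_scalars(value: Any) -> Any:
    """Recursively convert numeric scalars in a payload to plain int/float.

    Hypothetical helper for illustration. Tuples are returned as lists,
    which is what JSON-style transports produce anyway.
    """
    if isinstance(value, dict):
        return {key: to_plain_scalars(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain_scalars(item) for item in value]
    if isinstance(value, bool):
        # bool is a subclass of int; keep it as a bool.
        return value
    if isinstance(value, numbers.Integral):
        return int(value)
    if isinstance(value, numbers.Real):
        return float(value)
    # Strings, None, and anything non-numeric pass through untouched.
    return value
```

Applying such a function to the whole payload, rather than to individual fields, is one way to guard against a stray non-native scalar slipping into nested metadata. Types that are not registered with the `numbers` ABCs would still need separate handling.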
Signed-off-by: Jeff Cook <jeff@jeffcook.io>
620eb35 to d1aeb33
PR #1644 Review: Add edit preprocessor for HunyuanImage3

📊 Overall Assessment: 8.5/10

This is a well-structured PR that adds image editing support to the HunyuanImage3 model. The implementation is comprehensive, handles edge cases properly, and includes good test coverage.

✅ Strengths

1. Complete End-to-End Implementation

2. Robust Image Input Handling

    def _to_pil_image(image: Any) -> PILImage.Image:
        # Handles PIL, str, ndarray, tensor
        # Normalizes dtype, channels, dimensions

✅ Good: Comprehensive conversion logic handles multiple input formats gracefully.

3. IPC Serialization Fix

    def _to_python_scalar(value: Any) -> Any:
        if isinstance(value, np.generic):
            return value.item()
        return value

✅ Excellent: Critical fix for numpy scalar serialization in multi-process environments. This prevents IPC failures that would be hard to debug.

4. Proper Batch Validation

    if any(has_cond_image) and not all(has_cond_image):
        raise ValueError(
            "When batching Hunyuan image editing requests, "
            "every prompt must include input image(s)."
        )

✅ Good: Clear error message for inconsistent batch inputs.

5. Test Coverage

🔍 Code Review

Main Components

1. Image Preprocessing (
- Add image editing capability for HunyuanImage3 model
- Document conditional image preprocessing pipeline
- Note IPC serialization fix for numpy scalars

Source: vllm-project/vllm-omni#1644
lishunyang12
left a comment
LGTM — solid edit preprocessor with good test coverage.
20b9f89 to 0135a4f
Looks good, thanks Jeff.

@Semmer2 @usberkeley @nussejzz please also take a look 😊

Please provide at least one offline inference script and one online serving script. Please also include the generated images, VRAM usage, and e2e latency in your PR body. You may need to update the documents.

Please resolve these conflicts.
    assert request.sampling_params.height == 16


def test_hunyuan_image3_light_projector_is_callable():
Signed-off-by: Samit <285365963@qq.com>
Can you create a doc?

Please check PR #3107, as it duplicates this work.

Closed since #3107 merged.
Purpose
Allows HunyuanImage-3.0-Instruct to be used to edit images.
Test Plan
Some tests have been added, and I have also tested manually with the ComfyUI extension. All seems to be working well.
See the new tests/diffusion/test_hunyuan_image3_edit_preprocess.py.
Test Result
All seems to be working well.
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.