[WIP] Prototype/refactor dynamic node params #16
base: dynamic-custom-node-followup
| logger.setLevel(logging.DEBUG) | ||
| @dataclass(frozen=True) |
I moved ModularMellonNodeRegistry from diffusers to Mellon, so this registry gets absorbed into it: https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/mellon_node_utils.py#L703
| } | ||
| QwenImageEdit_NODE_TYPES_PARAMS_MAP = { |
We should configure params here for each model_type we support; this one is for QwenImage-Edit.
Each key represents a node type in the Mellon UI. Setting a value to None (as for controlnet) indicates that the pipeline does not support that node type.
QwenImageEdit_NODE_TYPES_PARAMS_MAP = {
"controlnet": None, # No ControlNet support for Qwen Edit
"denoise": {...},
"vae_encoder": {...},
"text_encoder": {...},
"decoder": {...},
}
| "latents", | ||
| "doc", | ||
| ], | ||
| "block_names": ["denoise"], |
The "block_names": ["denoise"] field maps to the actual pipeline block in the modular pipeline. For Qwen Edit, this corresponds to the "denoise" block in EDIT_BLOCKS:
EDIT_BLOCKS = InsertableDict([
("text_encoder", QwenImageEditVLEncoderStep()),
("vae_encoder", QwenImageEditVaeEncoderStep()),
("input", QwenImageEditInputStep()),
("prepare_latents", QwenImagePrepareLatentsStep()),
("set_timesteps", QwenImageSetTimestepsStep()),
("prepare_rope_inputs", QwenImageEditRoPEInputsStep()),
("denoise", QwenImageEditDenoiseStep()), # ← Maps here
("decode", QwenImageDecodeStep()),
])
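As a rough illustration of that mapping, here is a minimal sketch. It assumes a plain `OrderedDict` with placeholder strings in place of `InsertableDict` and the real step classes, and `resolve_blocks` is a hypothetical helper, not something that exists in diffusers or Mellon.

```python
from collections import OrderedDict

# Stand-in for EDIT_BLOCKS: placeholder strings replace the real step classes.
EDIT_BLOCKS = OrderedDict([
    ("text_encoder", "QwenImageEditVLEncoderStep"),
    ("vae_encoder", "QwenImageEditVaeEncoderStep"),
    ("input", "QwenImageEditInputStep"),
    ("prepare_latents", "QwenImagePrepareLatentsStep"),
    ("set_timesteps", "QwenImageSetTimestepsStep"),
    ("prepare_rope_inputs", "QwenImageEditRoPEInputsStep"),
    ("denoise", "QwenImageEditDenoiseStep"),
    ("decode", "QwenImageDecodeStep"),
])


def resolve_blocks(node_config, blocks):
    """Hypothetical helper: pick the pipeline blocks named in block_names."""
    names = node_config["block_names"]
    missing = [name for name in names if name not in blocks]
    if missing:
        raise KeyError(f"unknown block names: {missing}")
    # Preserve the order given in block_names.
    return OrderedDict((name, blocks[name]) for name in names)


denoise_node = {"block_names": ["denoise"]}
selected = resolve_blocks(denoise_node, EDIT_BLOCKS)
print(list(selected))
```

Failing loudly on an unknown name is a design choice for the sketch; the real registry may handle this differently.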
| QwenImageEdit_NODE_TYPES_PARAMS_MAP = { | ||
| "controlnet": None, | ||
| "denoise": { |
For each node_type, you have to configure inputs/model_inputs/outputs/block_names.
inputs/model_inputs/outputs are used to compose the final params. I structured it this way to mirror the modular pipeline structure (which has inputs/components/outputs), so that in the future we can support direct conversion between Mellon node definitions and modular pipeline block definitions.
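To make the composition step concrete, here is a minimal sketch. `compose_params` is a hypothetical helper, and the `model_inputs` and `outputs` values in the map are illustrative guesses rather than the exact contents of this PR.

```python
def compose_params(node_config):
    """Hypothetical: flatten inputs, model_inputs and outputs into one list."""
    if node_config is None:
        # None means the pipeline does not support this node type.
        return []
    params = []
    for section in ("inputs", "model_inputs", "outputs"):
        params.extend(node_config.get(section, []))
    return params


# Illustrative map; the values are assumptions, not the PR's full config.
NODE_TYPES_PARAMS_MAP = {
    "controlnet": None,
    "denoise": {
        "inputs": ["embeddings", "seed", "num_inference_steps", "guidance_scale"],
        "model_inputs": ["unet"],
        "outputs": ["latents", "doc"],
        "block_names": ["denoise"],
    },
}

denoise_params = compose_params(NODE_TYPES_PARAMS_MAP["denoise"])
controlnet_params = compose_params(NODE_TYPES_PARAMS_MAP["controlnet"])
print(denoise_params)
```

block_names is deliberately left out of the composed list, since it points at pipeline blocks rather than UI params.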
| "controlnet": None, | ||
| "denoise": { | ||
| "inputs": [ | ||
| "embeddings", |
This map is used together with a common schema we maintain, currently hosted here: https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/mellon_node_utils.py#L29
If a parameter is given as a string here, like "embeddings", we fetch its UI configuration from the common schema:
"embeddings": {
"label": "Text Embeddings",
"display": "input",
"type": "embeddings",
}
This is automatically resolved at runtime without needing explicit configuration in the node map.
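A minimal sketch of that lookup, assuming the common schema is a plain dict keyed by parameter name. `COMMON_SCHEMA` and `resolve_param` are hypothetical names used for illustration only; the actual resolution logic lives in mellon_node_utils.py and may differ.

```python
# Hypothetical one-entry copy of the common schema.
COMMON_SCHEMA = {
    "embeddings": {
        "label": "Text Embeddings",
        "display": "input",
        "type": "embeddings",
    },
}


def resolve_param(param, schema=COMMON_SCHEMA):
    """Hypothetical resolver: look up string names, pass dicts through."""
    if isinstance(param, str):
        if param not in schema:
            raise KeyError(f"{param!r} is not in the common schema")
        # Copy so callers cannot mutate the shared schema entry.
        return {"name": param, **schema[param]}
    return dict(param)


resolved = resolve_param("embeddings")
print(resolved)
```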
| "seed", | ||
| "num_inference_steps", | ||
| "guidance_scale", | ||
| MellonParam(name="image_latents", label="Image Latents", type="latents", display="input"), |
You can also customize a parameter using MellonParam; e.g., here we override the definition of image_latents.
The common one is this:
"image_latents": {
"label": "Image Latents",
"type": "latents",
"display": "input",
"onChange": {False: ["height", "width"], True: ["strength"]}, # ← Shows strength when connected
}
But since qwen-image-edit does not need strength, we don't need the onChange part.
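The difference between the two definitions can be sketched as follows. `ParamOverride` is a hypothetical stand-in for MellonParam that only carries the fields visible in this PR; the real class may have more fields and different behavior.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ParamOverride:
    """Hypothetical stand-in for MellonParam (fields taken from this PR)."""
    name: str
    label: str
    type: str
    display: str
    onChange: Optional[dict] = None

    def to_ui(self):
        # Drop the name and any unset optional fields.
        return {
            key: value
            for key, value in asdict(self).items()
            if key != "name" and value is not None
        }


# The common definition, including the onChange rule that reveals strength.
COMMON_IMAGE_LATENTS = {
    "label": "Image Latents",
    "type": "latents",
    "display": "input",
    "onChange": {False: ["height", "width"], True: ["strength"]},
}

# The qwen-image-edit override: same fields, no onChange.
override = ParamOverride(
    name="image_latents", label="Image Latents", type="latents", display="input"
)
ui_config = override.to_ui()
print(ui_config)
```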
Also, you may notice that we did not list height/width here for qwen-edit, since we don't want users to set a custom size for qwen-edit;
these two parameters are listed for qwen-image.
| "target": "skip_image_size", | ||
| "target": "model_type", | ||
| # "data": SIGNAL_DATA, # YiYi Notes: not working | ||
| "data": { |
I'm not able to use a variable for data here. I created a function to build this map dynamically, but I run into an issue if I call the function here. We need to avoid hard-coding this.
get_model_type_signal_data()
This is one of the requests I made about signals; we should be able to do it soon.
| width=None, | ||
| height=None, | ||
| skip_image_size=False, | ||
| **kwargs, |
We should refactor the execute method too, to take kwargs, so that it can accept the parameters dynamically defined in params, similar to how we implemented the dynamic ControlNet node. That will be the next step, and I will help.
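One possible shape for that refactor, as a sketch only. `DenoiseNodeSketch` is a hypothetical class that does not derive from the real Mellon node base class, and treating unknown names as errors is an assumption about the desired behavior.

```python
class DenoiseNodeSketch:
    """Hypothetical node whose accepted parameters are defined at runtime."""

    def __init__(self, param_names):
        # In the real node these names would come from the composed params.
        self.param_names = list(param_names)

    def execute(self, **kwargs):
        unknown = sorted(set(kwargs) - set(self.param_names))
        if unknown:
            raise TypeError(f"unexpected parameters: {unknown}")
        # Parameters that were not supplied default to None.
        return {name: kwargs.get(name) for name in self.param_names}


node = DenoiseNodeSketch(["seed", "num_inference_steps", "guidance_scale"])
call_args = node.execute(seed=0, num_inference_steps=25)
print(call_args)
```

With this shape, adding or removing a parameter for a pipeline only changes the list passed to the constructor, not the execute signature.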
| "default": "", | ||
| "hidden": True # Hidden field to receive signal data | ||
| }, | ||
| "unet": { |
The Denoise node must have a unet input to receive signals; the rest should all be dynamic.
Goes along with this PR: huggingface/diffusers#12560
This PR is a POC for generating Mellon node params completely dynamically, based on the selected pipeline type. Currently they are hardcoded to support multiple pipelines, which will become unmaintainable as we expand support to more pipelines.
We already have a dynamic ControlNet. I refactored the Denoise node in this PR; a similar pattern could also be applied to other node types.
The code is only a reference implementation, feel free to do what's best to reach the goal.
I made comments next to the code, but here is a summary of how it works