
About Annotator Resolution #924

Closed
lllyasviel opened this issue Apr 20, 2023 · 11 comments

Comments

@lllyasviel
Collaborator

lllyasviel commented Apr 20, 2023

How to set the Annotator Resolution is always difficult, and users are likely to get frustrating results if their Annotator Resolution is not correct.

For example, to diffuse 1024 × 1024 images, the Annotator Resolution should be 1024 rather than the default 512 (except when using depth as the annotator).

However, because multiple resizing methods are available, the Annotator Resolution also depends on "Crop and Resize" vs. "Resize and Fill", and the correct Annotator Resolution becomes really difficult to reason about:

For example, if the A1111 resolution is 640 × 512 and the input control image is 512 × 768, and we use "Crop and Resize", then the control image will first be resized by ControlNet to (512 × 640/512) × (768 × 640/512) = 640 × 960, and then it will be cropped to 640 × 512.

In this case, if we want the annotator (say canny) to be pixel-perfect, we need to use the short side of 640 × 960, i.e. 640 (not the short side of 640 × 512, which is 512!), and then work out in our heads that this number should be snapped to its closest multiple of 64: 64 × round(640/64) = 64 × 10 = 640. Luckily, it is still 640.

In this way, the final correct Annotator Resolution is 640. What the heck. Who is able to do such a computation in their head? I am also confused from time to time.
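The arithmetic described above can be sketched in a few lines of Python. This is a minimal sketch of the computation, not the extension's actual implementation; the function and parameter names are hypothetical:

```python
def pixel_perfect_resolution(raw_w, raw_h, target_w, target_h,
                             resize_mode="Crop and Resize"):
    """Compute the annotator resolution as described above: take the
    short side of the control image after the first resize pass, then
    snap it to the nearest multiple of 64.
    """
    k_w = target_w / raw_w
    k_h = target_h / raw_h
    if resize_mode == "Crop and Resize":
        k = max(k_w, k_h)  # scale up so the image covers the target, then crop
    else:  # "Resize and Fill"
        k = min(k_w, k_h)  # scale down so the image fits inside the target
    short_side = min(raw_w * k, raw_h * k)
    return int(64 * round(short_side / 64))
```

For the example above, `pixel_perfect_resolution(512, 768, 640, 512)` gives 640, matching the hand computation.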

I think we should have a solution to this, but it is a bad idea to force a correct value, because we also want to allow users to control the resolution as they want.

Perhaps a better idea is to add some hints, but I am not sure where to add them, and sometimes users may be annoyed by an overly crowded UI. (But if it can be implemented in Gradio, I can have a try.)

Does anyone have ideas?

@lllyasviel
Collaborator Author

What about adding a toggle, say "pixel-perfect", next to "guess mode", so that when it is selected, we automatically compute the annotator resolution?

@OedoSoldier

The annotator's resolution needs to match the height of the resulting image; otherwise, it may cause displacement. I've seen this when using multiple ControlNet units with different annotator resolutions.

Also, can we add a "copy img to all cnet" option? It can become quite tiresome when working with multiple units, especially in text-to-image scenarios where you have to duplicate the reference image across all cnet tabs.

@lllyasviel
Collaborator Author

lllyasviel commented Apr 20, 2023

It is not the height; it is the short side of the control map after the first resize pass, before snapping to the nearest multiple of 64.
In the case where the image is taller than wide and "Crop and Resize" is used, it is likely to be the height, but not always.

This is just super difficult to understand, and people, including me, cannot set correct values.
I am working on a pixel-perfect branch.
Unity has a pixel-perfect mode. Blender has a pixel-perfect mode. UE and some others all have pixel-perfect modes.

ControlNet also needs a pixel-perfect mode.
I think I can finish it today, hopefully.

https://github.com/Mikubill/sd-webui-controlnet/tree/pixel-perfect

see also commit #926

@OedoSoldier

Good news! Thank you.

@Ratinod

Ratinod commented Apr 20, 2023

@lllyasviel
I am glad that you did not remove the old feature but added a new one. This is a good approach to the problem.
For example, in automatic1111 the "Extra Networks -> [ ] Apply Lora to outputs rather than inputs when possible (experimental)" option has completely disappeared, and no equivalent option was added under "Compatibility".

@lllyasviel
Collaborator Author

supported

@ljleb
Collaborator

ljleb commented Apr 20, 2023

> Also, can we include an option to "copy img to all cnet" feature? It can become quite tiresome when working with multi-cnet, especially in text-to-text scenarios where you have to duplicate the reference image across all cnet tabs.

Is this undocumented? You can leave the cnet input image empty and it will fall back to the img2img init image.

@OedoSoldier

> Is this undocumented? You can leave the cnet input image empty and it will fallback to the img2img init image.

Ha! Never noticed it!

@OedoSoldier

> Is this undocumented? You can leave the cnet input image empty and it will fallback to the img2img init image.

Well, I tested it with CNet0: lineart anime, CNet1: ZoeDepth, CNet2: softedge pidinet, with only CNet0 having an input image, and it does not seem to work. If I disable CNet1 and CNet2, it starts working again.

@ljleb
Collaborator

ljleb commented Apr 20, 2023

Sorry if I was not clear; this only works in img2img when you have an init image.

@OedoSoldier

> Sorry if I was not clear, this only works with img2img when you have an init image.

Okay... I think for txt2img we need a similar function, e.g. use the input for CNet-0 as the default if no other image was provided.
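The proposed fallback could be sketched as follows. This is a hypothetical illustration, not the extension's actual API; the function name and the list-of-images representation are assumptions:

```python
def resolve_unit_images(unit_images, init_image=None):
    """Fill in missing per-unit control images.

    unit_images: one entry per enabled ControlNet unit; None means the
    user left that unit's input image empty.

    In img2img, empty slots fall back to the init image (current
    behavior). In txt2img (no init image), they would fall back to the
    first unit's image, per the proposal above.
    """
    default = init_image
    if default is None:
        # txt2img case: use the first unit that has an image, if any
        default = next((img for img in unit_images if img is not None), None)
    return [img if img is not None else default for img in unit_images]
```

For example, with three units where only CNet-0 has an image, all three units would receive that image in txt2img, while in img2img the empty slots would receive the init image instead.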
