`Image` component requests #466

pngwn · 2022-01-19T15:57:34Z

Is your feature request related to a problem? Please describe.

We are getting a lot of feature requests for the different interactive Image variants. Historically we had numerous different tools to handle the different kinds of Image editing functionality. This has improved somewhat, but the Image component is less feature-rich than it used to be. Sadly, over time the new (kinda) Image component has become difficult to maintain and extend, and it needs substantial refactoring to realise its full potential.

We are also finding the current signatures of pre- and post-process limiting. The challenge here is that the sensible thing would be to issue a breaking change and make a clean break with the past, but we don't want to introduce too much churn for users.

This issue will collate all feedback we have received so far (assuming I can find it all) and act as a single place to discuss features and design of a new unified image component.

overview

Broadly speaking, the Image component (as an interactive input) has two key parts: the source of the image and the editing capabilities. The rewrite that will stem from this issue will preserve the different inputs (and maybe there are more that people would like to see) but unify the editing tools into a single, simple (hopefully) GUI. The high-level thinking is that the Gradio developer will be able to toggle and constrain the various features one by one if necessary (with defaults and templates making this even simpler for users who do not need such granular control).

References to "Gradio API" refer to controlling the feature via the Gradio Python API when creating the app.
References to "GUI" refer to end users controlling the feature in the browser when interacting with the tool.

inputs/source

none - blank background on which to sketch (other uses?)
- Gradio API - Can be constrained to a specific background colour (removing the config from the GUI), or a set of valid colours, or left open.
- GUI - Can be changed by users if left open or limited to a set of colours. No config if a single background colour constraint is set in Gradio API.
webcam - Webcam image snapshot can be taken to be used as a background image.
upload - Background image can be uploaded from the user's computer (either by browsing or drag+drop)
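
To make the three colour modes above (open, a single colour, a set of colours) a bit more concrete, here is a minimal sketch of how a Gradio API value could be mapped to what the GUI shows. Everything in it is an assumption for discussion: `colour_control` and its `allowed` argument are hypothetical names, not existing Gradio parameters.

```python
def colour_control(allowed=None):
    """Decide what the GUI should show for a colour option.

    `allowed` is a hypothetical argument (not an existing Gradio parameter):
      None           -> open: the end user may pick any colour
      "#ffffff"      -> locked to one colour, so no control in the GUI
      ["#000", ...]  -> the end user picks from a fixed set
    """
    if allowed is None:
        return {"mode": "open", "choices": None, "show_in_gui": True}
    if isinstance(allowed, str):
        return {"mode": "locked", "choices": [allowed], "show_in_gui": False}
    choices = list(allowed)
    if not choices:
        raise ValueError("an empty set of colours leaves nothing to choose")
    if len(choices) == 1:
        # A set with one entry behaves exactly like a single locked colour.
        return colour_control(choices[0])
    return {"mode": "set", "choices": choices, "show_in_gui": True}


print(colour_control())                       # open picker
print(colour_control("#ffffff"))              # locked, hidden in the GUI
print(colour_control(["#000000", "#ff0000"])) # restricted palette
```

The same three-way rule could apply to any colour-like option (background, brush, mask), which is part of the appeal of treating the constraint as one shared concept.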

One thing that has crossed my mind is allowing multiple inputs, this would allow Gradio app authors to be very flexible with what kinds of Image sources are set which will work well for some more general models. Could be controllable via the GUI (defaulting to blank background but with buttons/ toggles to enable different source modes).

Note: source=canvas would be deprecated as everything can be a canvas in the new world.

Are there other possible inputs we should consider here?

editor tools

Currently the Gradio Image component is a simple raster/bitmap graphics editor but there is no reason we cannot support certain vector features. I would be wary of attempting any kind of comprehensive vector tools (specifically things like modifying paths + curves, creating new shapes from the intersection/union of multiple shapes, etc.) but we could support some simple shape tools with transforms (translate/rotate/resize). It would probably make sense to start with rasters only because combining vectors and raster images introduces some complexities that we have no choice but to push onto the user (such as needing to rasterise vectors and flatten layers in order for image filters to work as expected).

We have had lots of requests so here goes:

transforms
- crop
- rotate
- resize
- perspective (???)
filters/effects/colour correction
- noise
- blur
- brightness/Contrast
- hue/Saturation
- curves (???)
- are there other filters/effects that would be useful? in some ways these kinds of thing are relatively straightforward although will only work on raster images (or flattened/rasterised vectors)
other
- text tools
sketching
- brush size
- brush colour - Gradio API: open, locked to a specific colour, locked to a set of colours
- brush texture (???)
- lazy brush - currently a feature, makes it easier to draw smooth paths with a mouse, can be frustrating. remove or make toggleable.
- path smooth - on/off - paths are always lazy (although not technically smoothed) with the current sketch tool, if the lazy brush is removed or made configurable we could add path smoothing
- masking - Gradio API: open, locked to a specific colour, locked to a set of colours.
- eraser - design unknown
- fill region (bucket tool)
- shapes (???)

general

Some more general things need to be handled better. The main thing I can think of here is the size of the canvas. It is a little better today than it was yesterday but still not ideal.

fullscreen mode

We have never really had this, the previous 'full screen' wasn't really full screen but we should add this.

canvas size

I'm not 100% sure what the best way to approach the canvas size is. I definitely think we need to respect any options passed into the Gradio API, so app authors can set the most appropriate canvas size (and ratio) for their model, but I'm not sure about other cases.

Currently we size the canvas based on either the 'source' or, if there isn't one, the screen size. So if a user uploads a 500x500 image, then that will be the size of the canvas (scaled in the browser to account for device pixel ratio), but this might not be ideal as very large images could slow down the predictions. We could accept a max width/height and never go above that to ensure we aren't unwittingly sending huge images back to the server to be processed.

Would love to get people's thoughts on this one.
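
For the max width/height idea, something along these lines is what I have in mind. This is only a sketch under assumptions: `clamp_canvas_size`, `max_width` and `max_height` are hypothetical names, and the real sizing would happen in the browser rather than in Python.

```python
def clamp_canvas_size(width, height, max_width=None, max_height=None):
    """Shrink (width, height) to fit the given limits, keeping the aspect ratio.

    Hypothetical helper: `max_width` / `max_height` are illustrative names,
    not existing Gradio parameters. Images are never scaled up.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = 1.0
    if max_width is not None:
        scale = min(scale, max_width / width)
    if max_height is not None:
        scale = min(scale, max_height / height)
    # Keep at least one pixel per side so extreme ratios stay valid.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(clamp_canvas_size(500, 500, max_width=1024, max_height=1024))
print(clamp_canvas_size(4000, 3000, max_width=1024, max_height=1024))
```

A small upload passes through untouched, while a 4000x3000 image is scaled by the tighter of the two limits so that neither side exceeds its maximum.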

performance

Performance of the current component is OK, but there are some performance issues which are the result of a number of things. They can be addressed in a rewrite, as we will almost certainly need to switch to WebGL to implement some of these features in a performant manner (while maintaining good UX). Calling them out here for posterity; there is not a great deal to discuss.

pre and post-process

These signatures need to change: they aren't working right now and things aren't going to get any better. This is a pretty significant breaking change because the Image component is our most used component. We will have to discuss how we manage this.

I think the Image component should switch to always returning a dictionary with a series of keys. We have had numerous requests about returning certain layers separately and others together, so we can discuss the specifics in this thread, but something like:

{
  "image": "background_image.whatever",
  "mask": "mask_image.whatever",
  "sketch": "...",
  ...
}

There are questions around the exact shape of this, what if we have multiple masks? Should that be a list/array on the mask key or should every layer have its own key? Should the return be a list of dicts instead, containing meta information about that layer? How does an app author figure out what each layer is for (take the example of three masks again)? Should we also return a composite image in addition to the separate layers?

Would be good to get people's thoughts on this one as well.
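
To give the discussion something to poke at, here is one possible answer to the questions above: a background image, a list of layers that each carry a little metadata, and an optional composite. This is a sketch only. The class names, fields and file names are all hypothetical and nothing here is a decided API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Layer:
    name: str  # label chosen by the app author, e.g. "foreground"
    kind: str  # "mask" or "sketch"
    path: str  # where the layer image lives (placeholder values below)


@dataclass
class ImagePayload:
    image: Optional[str]                 # background image, if any
    layers: List[Layer] = field(default_factory=list)
    composite: Optional[str] = None      # flattened result, if requested

    def by_kind(self, kind: str) -> List[Layer]:
        """Return every layer of one kind, in drawing order."""
        return [layer for layer in self.layers if layer.kind == kind]


# The "three masks" case: foreground, background and ambiguous regions.
payload = ImagePayload(
    image="background_image.whatever",
    layers=[
        Layer("foreground", "mask", "mask_0.whatever"),
        Layer("background", "mask", "mask_1.whatever"),
        Layer("ambiguous", "mask", "mask_2.whatever"),
    ],
)
print([layer.name for layer in payload.by_kind("mask")])
```

Naming each layer is one way for an app author to tell three masks apart without relying on their position in the list.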

Issues

Features

Feature requests but should be custom components?

Bugs

omerXfaruq · 2022-04-12T17:44:32Z

this issue looks very old, what's its status?

pngwn · 2022-04-12T19:25:18Z

Still important just not as important as other things.

charlesfrye · 2022-06-07T17:59:49Z

Thanks for your hard work on an awesome tool! I just wanted to chime in on why this is important to me, as a user.

The image editor was one of my favorite Gradio 2.x features. It allowed me to "play" with my computer vision models in the same way that NLP folks have been able to play with theirs. I used it very fruitfully to probe and understand the failure modes of an OCR model. This makes it a killer feature to combine with flagging as part of an "exploratory model analysis" workflow, where Gradio can shine as a central component.

Without the full editor, I have much less reason to prefer Gradio for this over other libraries for rapid model-centric app development, like Streamlit. I'd also like to register that it was very confusing to see the documentation for the editor choice in the inputs.Image class's tool kwarg totally unchanged, still referring to a "full-screen editor", even though the feature was intentionally removed.

Cheers, and thanks for making a really useful library!

abidlabs · 2022-06-07T18:08:54Z

Thanks @charlesfrye for the very useful feedback! We are definitely planning on bringing it back, but most likely using our own implementation so that we have more control over it. Would you be able to tell us which parts of the editor were most useful for you? Blurring / cropping / coloring / etc.?

hysts · 2022-06-07T22:15:35Z

I also found the image editing function very useful to modify input images to check the robustness of models, and I'm glad to hear there's a plan to bring it back.

In my case, rotating, flipping, blurring, cropping, changing aspect ratios, and adding noise were useful for checking the performance of object detection models, image classification models, etc. Drawing tools were also useful for partially or completely occluding objects in images.

Some of the features that were missing in the previous image editor and that I wanted are discussed in the following issues.
#1020 #1410
A little while ago, I was thinking of making an app for MatteFormer, but image matting task requires three-colored mask to specify foreground, background, and ambiguous areas, and it was impossible to create such masks even with gradio v2 image editor, so I decided not to. Also, I recently made an app for Text2Human, and it would be better if we could edit label images directly. It's possible with the GUI app in the original repo, but it's not with image editor with gradio.

charlesfrye · 2022-06-08T01:19:16Z

@abidlabs Happy to help! The most useful transformations were adding noise, blurring, adding text, and erasing/drawing.

Adding noise and blurring are nice generic robustness tests, but they are relatively easy to do in a library like torchvision. Erasing, drawing, and adding text, on the other hand, are much harder to automate and so aren't as readily available in existing modeling libraries.

For "gradio as a tool for exploring models", I think it is generally the case that those more interactive editing tools would be highest value-add.

Rotations and flips were less useful, but that may be specific to the use case I spent the most time with -- the OCR model expected text to be mostly oriented correctly.

pngwn · 2022-09-20T10:39:30Z

I have updated the parent issue to try to capture the various requests we have had and start a conversation about how we design this. Please take a look and provide any feedback, it would be very much appreciated!

abidlabs · 2022-09-20T14:37:45Z

This looks great @pngwn and definitely captures the vast majority of user feedback that I've heard. A few thoughts:

One thing that has crossed my mind is allowing multiple inputs, this would allow Gradio app authors to be very flexible with what kinds of Image sources are set which will work well for some more general models. Could be controllable via the GUI (defaulting to blank background but with buttons/ toggles to enable different source modes).

This is something we've heard a lot. In the Python API, if users pass in a list for the source parameter, it would be nice if the GUI allowed them to toggle between these options.

editor tools

LGTM. One additional request we've heard is the ability to type text onto an image. This is useful for OCR-type models. See @charlesfrye's comments in the thread above, for example.

pre and post-process

It seems that some users strongly prefer dealing with a single image, while others require separate layers for the image, mask, and sketch. I think we should actually provide this as an option that can be controlled via the Python API. The Image component could take in a parameter (something like collapse_layers), which, if set to True, would return a single image to the backend function. If False, it would return a dictionary with separate images for the keys image, mask, and sketch.
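
Roughly, the behaviour could look like the sketch below. It is illustrative only: images are stood in for by 2D lists of pixel values with None meaning "transparent", `preprocess` is a hypothetical function, and painting every layer (including the mask) onto the image when collapsing is an assumption that is still up for discussion.

```python
def preprocess(layers, collapse_layers=False):
    """Return separate layers, or one flattened image if collapse_layers is True.

    `layers` is a dict with the keys "image", "mask" and "sketch". Each value
    is a 2D list of pixels (None = transparent) or None if the layer is unused.
    """
    if not collapse_layers:
        return {key: layers.get(key) for key in ("image", "mask", "sketch")}

    # Copy the background so the caller's data is not modified.
    flat = [row[:] for row in layers["image"]]
    for key in ("mask", "sketch"):  # later layers are painted on top
        overlay = layers.get(key)
        if overlay is None:
            continue
        for y, row in enumerate(overlay):
            for x, pixel in enumerate(row):
                if pixel is not None:
                    flat[y][x] = pixel
    return flat


example = {
    "image": [[0, 0], [0, 0]],
    "mask": [[None, 1], [None, None]],
    "sketch": [[None, None], [2, None]],
}
print(preprocess(example, collapse_layers=True))
print(sorted(preprocess(example)))
```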

I didn't follow what you meant about the "example of three masks".

pngwn · 2022-09-20T17:11:26Z

@abidlabs regarding the three masks, it is an example pulled from this comment:

A little while ago, I was thinking of making an app for MatteFormer, but image matting task requires three-colored mask to specify foreground, background, and ambiguous areas, and it was impossible to create such masks even with gradio v2 image editor, so I decided not to.

collapse_layers kwarg, sounds like a good idea.

I'll add text to the feature list.

GalaxyTimeMachine · 2022-09-30T09:18:21Z

I came here to add a request for a mask eraser. It's sometimes a pain to have to reverse and delete the whole mask when you only want to be able to erase a small part of it.

johko · 2022-10-04T20:43:52Z

Hi, I would love to see the possibility to have an example for a masked image input in a space.
Even just being able to put in an empty mask would already help in my opinion, as mostly the really important thing for examples is to have an image to start from without having to upload anything.

Starhkz · 2022-10-24T17:23:58Z

I recently started using gradio, and it's been really helpful. My major challenge is reducing the brush size. Fortunately the issue was mentioned earlier.
Are there any updates on any of these?

sketching

brush size

brush colour - Gradio API: open, locked to a specific colour, locked to a set of colours

brush texture (???)

pngwn · 2023-01-08T12:21:17Z

#2903

Alchete · 2023-01-27T19:56:04Z

@pngwn I'm strongly in favor of improving this component as, whether intended or not, it's become the de-facto interface for Stable Diffusion and folks are currently zooming their browser windows to see what they're masking. BTW, I'd also recommend looking at InvokeAI's implementation of its unified canvas feature.

Even just adding shortcuts and a functioning zoom to the existing Image component would go a long way toward filling the gap short term. Since I come from the Desktop UI world, can you or someone explain if the Image component currently supports "focus" and "keyboard" listeners? And if not, which package one might use to support those features that would be acceptable on the Gradio-side? I'd be willing to tinker with this on my own. Many thanks.

anapnoe · 2023-03-28T10:45:26Z

I don't know if this is the right place to ask, but why does the image editing component use 5 canvases instead of one?
It seems very inefficient. Maybe someone can explain to me why we need to allocate 4x the memory, which is not free for large canvases.
This component is very useful; it should be optimized first before anything else is added on top:
[x] remove unnecessary canvases
[x] a proper undo/redo option, with an attribute field to constrain the memory footprint (history)

Then any tool added is the cherry on the pie 😎
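
The history constraint could work along these lines. This is only a rough sketch in Python to show the idea, since the actual component runs in the browser; `BoundedHistory` and `max_steps` are made-up names, not anything that exists in Gradio.

```python
from collections import deque


class BoundedHistory:
    """Undo/redo history that keeps at most `max_steps` earlier states."""

    def __init__(self, initial, max_steps=20):
        self.current = initial
        # A deque with maxlen drops the oldest snapshot automatically.
        self._undo = deque(maxlen=max_steps)
        self._redo = []

    def commit(self, state):
        """Record a new state; any redo branch is discarded."""
        self._undo.append(self.current)
        self.current = state
        self._redo.clear()

    def undo(self):
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()
        return self.current

    def redo(self):
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()
        return self.current


history = BoundedHistory("a", max_steps=2)
for state in ("b", "c", "d"):
    history.commit(state)
print(history.undo(), history.undo(), history.undo())
```

With `max_steps=2` the oldest state is forgotten, so the third undo has nowhere further to go and the memory used by snapshots stays capped.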

pngwn · 2023-03-28T13:25:35Z

@anapnoe there are reasons but they aren't particularly good ones. This will be addressed in the rewrite. The performance of the current sketch tool is quite poor currently, especially with large images.

missionfloyd · 2023-06-24T07:32:28Z

How about a None tool option? Sometimes we just need to upload images.

cceyda · 2023-06-26T08:29:04Z

On Windows using Chrome I can drag & drop an image from another tab/window into Gradio. But on Mac this doesn't work.
I really like doing this for quick testing of things, like I can search for cat pics on one tab and drag drop the images to see if my animal classifier works.

abidlabs assigned pngwn Jan 19, 2022

abidlabs added the enhancement New feature or request label Jan 29, 2022

pngwn mentioned this issue May 17, 2022

Image editor did not show edit window #1306

Closed

1 task

pngwn mentioned this issue Jun 7, 2022

[ImageEditor] - Support resizing images without cropping #1451

Closed

1 task

omerXfaruq added this to the 3.x milestone Jun 7, 2022

abidlabs assigned pngwn and unassigned pngwn Jul 11, 2022

abidlabs mentioned this issue Jul 25, 2022

Support of *.tiff files #32

Closed

pngwn mentioned this issue Sep 20, 2022

Sketching + Inpainting Capabilities to Gradio #2144

Merged

pngwn mentioned this issue Sep 27, 2022

[ImageEditor] - Gradio support for ordered points as an output to sketch. #1922

Closed

1 task

jmp909 mentioned this issue Oct 5, 2022

Feature Request: Erase brush for mask AUTOMATIC1111/stable-diffusion-webui#1745

Closed

pngwn added the 🖼️ image Image component label Feb 22, 2023

pngwn changed the title ~~refreshed image input component~~ Image Feb 22, 2023

pngwn pinned this issue Feb 22, 2023

abidlabs removed this from the 3.x milestone Mar 17, 2023

abidlabs changed the title ~~Image~~ Image component requests Mar 29, 2023

pngwn mentioned this issue Mar 30, 2023

Fix sketch tool gr.Image not filling up the entire component size #3649

Merged

7 tasks

abidlabs added this to the Component Cleanup milestone Jul 9, 2023

hannahblair assigned pngwn and unassigned pngwn Jul 31, 2023

pngwn mentioned this issue Aug 1, 2023

Image proposal #5055

Closed

1 task

abidlabs modified the milestones: Component Cleanup, 4.0 Aug 10, 2023

abidlabs unpinned this issue Sep 29, 2023

abidlabs modified the milestones: 4.0, 4.0-image Oct 23, 2023

pngwn mentioned this issue Nov 15, 2023

Image editor #6169

Merged

pngwn closed this as completed in #6169 Nov 18, 2023
