Alpha-transparency support #8
Really interesting question! My gut says your workaround of pre/post-processing is going to be the easiest solution here. As you say, tackling it upstream would mean re-training a model from scratch to support 4 channels instead of 3 (similar logic to why you need different models for different scale sizes).

If you do want to go down that path: all of the models are trained using this Python implementation of ESRGAN (this repo's Javascript code is model-agnostic and built to support other implementations and algorithms, but I haven't converted any other implementations yet). This particular Python implementation seems to explicitly support only three channels (here's a related issue I found). However, there's no theoretical reason (I think!) that transparency couldn't be supported, as it's just another channel. One option would be to look for an alternative Python implementation that supports alpha transparency and convert it to TFJS. I left some notes on how to go about picking an implementation here if you're interested in that route, but the TLDR is: look for Tensorflow implementations (ideally without custom layers). Alternatively, you could modify the existing implementation to accept a fourth channel yourself.

You'd also need a dataset of images with alpha transparency. A Google search led me to some datasets that seem more aimed at training a matting model, but you may be able to re-purpose them for this use case. Alternatively, I believe you could take a bunch of images (for instance, start with the Flickr faces dataset), matte out the backgrounds, and make them transparent. You might even have good luck randomly turning parts of images transparent, or setting certain colors to transparent - this would be a really interesting research avenue to explore! The original Python implementation (as well as the models in this repo) was trained on the DIV2K dataset, which has 800 images (plus, I think, 200 for validation / test), so I'd shoot for something in that ballpark.
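The "setting certain colors to transparent" idea could be prototyped along these lines. This is a minimal numpy sketch of my own; the function name, key color, and tolerance are illustrative assumptions, not anything from the ESRGAN training code:

```python
import numpy as np

def color_key_alpha(rgb, key=(0, 255, 0), tol=30):
    """Derive an alpha channel from an (H, W, 3) uint8 image by
    making pixels within `tol` of a key color fully transparent.
    A hypothetical augmentation step for building RGBA training data."""
    diff = np.abs(rgb.astype(np.int16) - np.array(key, dtype=np.int16))
    transparent = diff.max(axis=-1) <= tol
    alpha = np.where(transparent, 0, 255).astype(np.uint8)
    return np.dstack([rgb, alpha])  # (H, W, 4) RGBA
```

Running this over an existing 3-channel dataset (or over matted images) would be one cheap way to get 4-channel training pairs.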
The Python repo has good information on how to train. Once you've got a trained model, you'd need to make some changes to this repo, specifically around the input and output tensors; as of now, it assumes three channels. It'd be neat to have a per-model config that describes channels (I could also see single-channel black-and-white being useful), for which I would welcome a PR, or help you with one! I'm also not 100% sure whether the 4-channel tensor -> canvas image conversion would need some massaging as well, but if so, that'd be a straightforward fix.

In general, the Javascript changes should be fairly easy to make; the model-training-from-scratch (and dataset collection) I'd imagine to be significantly more work. Hope that helps. It's a very interesting problem!
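To make the channel assumption concrete, here's a rough numpy sketch of splitting a 4-channel input for a 3-channel model and re-interleaving the output into the RGBA byte order a canvas-style pixel buffer expects. The helper names are hypothetical, not the repo's actual API:

```python
import numpy as np

def split_for_three_channel_model(x):
    """Split an (H, W, 4) RGBA array into the (H, W, 3) tensor the
    current models expect, plus the alpha channel handled separately."""
    assert x.shape[-1] == 4, "expected an RGBA input"
    return x[..., :3], x[..., 3]

def tensor_to_rgba_bytes(rgb, alpha):
    """Re-interleave model output with alpha into the flat RGBA byte
    layout used by canvas ImageData-style buffers."""
    rgba = np.dstack([rgb, alpha]).astype(np.uint8)
    return rgba.tobytes()
```

A true 4-channel model would skip the split and produce the RGBA tensor directly, which is where a per-model channels config would come in.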
Thank you for the helpful detailed reply, @thekevinscott!
I appreciate the validation on the approach! As my use case is upscaling a class of images that often has 4 channels (Memoji, stickers), I've gone ahead and implemented a pre-processing step to address this issue for now. Images are written to
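For anyone else hitting this, a minimal numpy sketch of the kind of pre/post-processing workaround discussed above: composite onto a solid background before upscaling, then re-attach a separately upscaled alpha channel afterwards. The helper names and nearest-neighbour alpha resize are my own assumptions, not code from Upscaler:

```python
import numpy as np

def composite_on_background(rgba, background=(255, 255, 255)):
    """Flatten an (H, W, 4) RGBA image onto a solid background so a
    3-channel model can process it (alpha is discarded here)."""
    rgb = rgba[..., :3].astype(np.float32)
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0
    bg = np.array(background, dtype=np.float32)
    out = rgb * alpha + bg * (1.0 - alpha)
    return out.round().astype(np.uint8)

def reattach_alpha(upscaled_rgb, alpha, scale):
    """Nearest-neighbour upscale of the original alpha channel,
    re-attached after the model has upscaled the RGB channels."""
    big_alpha = np.repeat(np.repeat(alpha, scale, axis=0), scale, axis=1)
    return np.dstack([upscaled_rgb, big_alpha])
```

Letting the user pick `background` covers the customizable-background idea; the nearest-neighbour alpha resize keeps hard transparency edges crisp, though a bilinear resize may look better for soft mattes.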
This is very helpful. Thank you! https://thekevinscott.com/super-resolution-with-js/#hearing-it-through-the-grapevine is a great starting point for me to dig into deeper.
I greatly appreciate the pointer to
That's a great tip. If the matting datasets don't turn out to be sufficiently close, it shouldn't be a huge effort to create a new dataset using Google image search with the transparency filter.
I'll be sure to report back here if I get as far as a trained model that appears to work sufficiently. Once again, thank you for the pointers!
Great work on Upscaler, @thekevinscott! I've recently been using it for a small side project. When processed through the library, PNG images with an alpha channel (i.e. transparent images) appear to get a solid black background. This appears to happen with all models.
I wanted to ask if there was a small fix possible for this upstream in Upscaler. My alternative workaround would likely involve pre- or post-processing (e.g. allowing the user to customize the solid background color), but preserving the input as-is would of course be ideal if possible :)
Below is an example of a transparent PNG that demonstrates this behavior with the demo:
Demo images
Input:
![grass](https://user-images.githubusercontent.com/110953/104148331-b3ab4080-5386-11eb-96ad-ab9490b28039.png)
Upscaler output:
![download (2)](https://user-images.githubusercontent.com/110953/104148335-b86ff480-5386-11eb-812b-1f2abc4b4fcc.png)