Add new community pipeline for 'Adaptive Mask Inpainting', introduced in [ECCV2024] ComA #9228


jellyheadandrew
Contributor

@jellyheadandrew jellyheadandrew commented Aug 20, 2024

Add new community pipeline for 'Adaptive Mask Inpainting' introduced in [ECCV2024] Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models.

What does this PR do?

This PR implements the 'Adaptive Mask Inpainting' algorithm introduced in [ECCV2024] Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models.

The code is borrowed from the author's repository (https://github.com/snuvclab/coma), slightly modified to serve as a usage demo.

In a nutshell, this pipeline provides a way to insert a human into a scene image without altering the background, by inpainting with an adaptive mask.

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

… in [ECCV2024] Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 19, 2024
@asomoza
Member

asomoza commented Sep 19, 2024

Hi @jellyheadandrew, this seems really interesting, can you please post some example generations of this?

@jellyheadandrew
Contributor Author

jellyheadandrew commented Sep 19, 2024

@asomoza Yeah sure! I completely forgot to update the changes.

[Video attachment: tmp_cache.3.mp4]

[Image attachment: image (2)]

Member

@asomoza asomoza left a comment

Thanks, if you're still interested in adding it to diffusers I left a couple of comments.


if __name__ == "__main__":
"""
Download Necessary Files
Member

@asomoza asomoza Sep 19, 2024

This whole example part should go inside the community pipelines README. This file can't go in the root of the project and, in general, we don't use a separate example file for each community pipeline.

Comment on lines 94 to 98
beta_start=0.00085,
beta_end=0.012,
beta_schedule="scaled_linear",
clip_sample=False,
set_alpha_to_one=False
Member

you can simplify this by just changing the scheduler, no need to add the same config manually.

Contributor Author

Sorry, I'm kinda new to contributing to huggingface. What do you mean by changing the scheduler? Do you mean changing the config?

Member

oh, no problem. Normally we can swap the scheduler for any other scheduler like this:

from diffusers import DDIMScheduler
....
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

which is cleaner and simpler.

Member

and if it has a different config (which doesn't seem to be the case here) you can just add the changed param as an additional kwarg

Comment on lines 125 to 132
adaptive_mask_model = PointRendPredictor(
    pointrend_thres=0.2,
    device="cuda" if torch.cuda.is_available() else "cpu",
    use_visualizer=use_visualizer,
    config_pth="pointrend_rcnn_R_50_FPN_3x_coco.yaml",
    weights_pth="model_final_edd263.pkl",
)
pipeline.register_adaptive_mask_model(adaptive_mask_model)
Member

@asomoza asomoza Sep 19, 2024

you can probably remove this from here and add it as code inside the pipeline to make it easier for the user. But it's your decision; if you do, you should probably download the files inside the pipeline too.

Comment on lines 137 to 152
adaptive_mask_settings = EasyDict(
    dict(
        dilate_scheduler=MaskDilateScheduler(
            max_dilate_num=20,
            num_inference_steps=num_steps,
            schedule=[20] * step_num + [10] * step_num + [5] * step_num + [4] * step_num + [3] * step_num + [2] * step_num + [1] * step_num + [0] * final_step_num
        ),
        dilate_kernel=np.ones((3, 3), dtype=np.uint8),
        provoke_scheduler=ProvokeScheduler(
            num_inference_steps=num_steps,
            schedule=list(range(2, 10 + 1, 2)) + list(range(12, 40 + 1, 2)) + [45],
            is_zero_indexing=False,
        ),
    )
)
pipeline.register_adaptive_mask_settings(adaptive_mask_settings)
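
To make the two schedule expressions in the snippet above easier to read, here is a small pure-Python sketch that evaluates them. The values of num_steps, step_num and final_step_num are not shown in this thread, so the ones below are assumptions chosen only so that the dilation schedule has one entry per inference step. The helper piecewise_schedule is hypothetical and not part of the pipeline. The apparent reading (not confirmed here) is that the dilation amount shrinks as denoising progresses, while the second list names the steps at which the mask is refreshed.

```python
def piecewise_schedule(levels, step_num, final_step_num):
    """Repeat each level `step_num` times; the last level fills `final_step_num` steps.

    Hypothetical helper that reproduces the `[20] * step_num + ... + [0] * final_step_num`
    expression from the snippet above.
    """
    *head, last = levels
    schedule = []
    for level in head:
        schedule += [level] * step_num
    schedule += [last] * final_step_num
    return schedule


# Assumed values (not given in the thread).
num_steps = 50
step_num = 5
final_step_num = num_steps - 7 * step_num  # the 7 non-zero levels take 35 steps, leaving 15

dilate_schedule = piecewise_schedule([20, 10, 5, 4, 3, 2, 1, 0], step_num, final_step_num)

# Same expression as in the snippet: even steps from 2 to 40, then step 45.
provoke_schedule = list(range(2, 10 + 1, 2)) + list(range(12, 40 + 1, 2)) + [45]

print(len(dilate_schedule))   # 50
print(dilate_schedule[:6])    # [20, 20, 20, 20, 20, 10]
print(provoke_schedule[-3:])  # [38, 40, 45]
```

Note that the two range calls in the provoke schedule are contiguous, so the list is equivalent to list(range(2, 41, 2)) + [45].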
Member

same as before

Contributor Author

Thanks for the thorough feedback! I've applied the suggestions and will update the PR.

(Please check minor questions above)

@asomoza asomoza removed the stale Issues that haven't received updates label Sep 19, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@jellyheadandrew
Contributor Author

Hi @asomoza, I have updated the files based on your feedback. Waiting for more feedback, or a merge!

@asomoza
Member

asomoza commented Oct 23, 2024

Hi @jellyheadandrew, sorry it took me so long to get back to you. Let's finish this so we can merge it.

First, can you please merge the main branch into your branch? I can see a couple of outdated imports that will probably fail.

Also, can you run make style and make quality?

@jellyheadandrew
Contributor Author

@asomoza Hi, I've updated my branch! Could you check?

@asomoza
Member

asomoza commented Oct 25, 2024

@jellyheadandrew thanks, you merged main but we still need make style and make quality, as described in the contributing guide. Without that the tests will fail. You also have some unused imports, which that will fix.

@jellyheadandrew
Contributor Author

Thanks @asomoza, I've run make style and make quality.

@yiyixuxu yiyixuxu merged commit e2b3c24 into huggingface:main Nov 6, 2024
8 checks passed
@yiyixuxu
Collaborator

yiyixuxu commented Nov 6, 2024

@jellyheadandrew merged! thanks for your contribution!

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
… in [ECCV2024] ComA (#9228)

* Add new community pipeline for 'Adaptive Mask Inpainting', introduced in [ECCV2024] Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models