@RezaYazdaniAminabadi
Contributor

What does this PR do?

This PR changes the allocation of a tensor in modeling_clip.py so that it happens on the device side. This makes it possible to use CUDA Graphs with DeepSpeed-Inference, which can improve inference performance for the Stable Diffusion model. Here is the PR that includes the optimization for improving the SD performance.
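To illustrate the idea, here is a minimal sketch of host-side versus device-side allocation for a causal attention mask. The function names, signatures, and the choice of a causal mask are assumptions made for illustration and are not taken from the diff of this PR. The host-side variant builds the tensor on the CPU and then copies it over, and that kind of host-to-device transfer can get in the way of CUDA Graph capture.

```python
import torch


def build_causal_mask_host(bsz, seq_len, dtype, device):
    # Hypothetical "before" version: allocate on the host (CPU),
    # fill the mask there, then copy the result to the target device.
    mask = torch.empty(bsz, seq_len, seq_len, dtype=dtype)
    mask.fill_(torch.finfo(dtype).min)
    mask.triu_(1)  # zero out the diagonal and everything below it
    return mask.unsqueeze(1).to(device)


def build_causal_mask_device(bsz, seq_len, dtype, device):
    # Hypothetical "after" version: allocate directly on the target
    # device, so no host-to-device copy is needed afterwards.
    mask = torch.empty(bsz, seq_len, seq_len, dtype=dtype, device=device)
    mask.fill_(torch.finfo(dtype).min)
    mask.triu_(1)  # zero out the diagonal and everything below it
    return mask.unsqueeze(1)


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    mask = build_causal_mask_device(2, 4, torch.float32, device)
    print(mask.shape, mask.device)
```

Both functions return the same values; the only difference is where the memory is first allocated.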

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@patrickvonplaten, @stas00

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 11, 2022

The documentation is not available anymore as the PR was closed or merged.

Contributor

@stas00 stas00 left a comment

LGTM - thank you, Reza.

As the failed CI indicated, you needed to run make fix-copies to update another model that copies this model's forward method. I have run it and pushed the change. It's all good now.

Member

@LysandreJik LysandreJik left a comment

LGTM, thank you @RezaYazdaniAminabadi!

@LysandreJik LysandreJik merged commit f6fa0f0 into huggingface:main Oct 12, 2022
4 participants