[Falcon] Set use_cache=False before creating presents which relies on use_cache
#26328
presents=None when use_cache is set to False for activation ckpt
ArthurZucker
left a comment
That's a good catch!
        "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`..."
    )
    use_cache = False
    presents = None
This should be done outside the loop, similar to this:
    if self.gradient_checkpointing and self.training:
        if use_cache:
            logger.warning_once(
                "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`..."
            )
            use_cache = False
    present_key_values = () if use_cache else None
    all_self_attentions = () if output_attentions else None
    all_hidden_states = () if output_hidden_states else None
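A minimal, framework-free sketch of why the ordering matters may help here. The two helper functions below are hypothetical and are not part of the Falcon modeling code; they only mimic the order of the two statements discussed in this thread, assuming that `presents` is initialized from `use_cache` exactly once.

```python
import logging

logger = logging.getLogger(__name__)

WARNING_MSG = (
    "`use_cache=True` is incompatible with gradient checkpointing. "
    "Setting `use_cache=False`..."
)


def init_presents_before_override(use_cache, gradient_checkpointing, training):
    """Hypothetical helper: creates `presents` first, then overrides `use_cache`."""
    # `presents` is decided from the caller's value of `use_cache`.
    presents = () if use_cache else None
    if gradient_checkpointing and training and use_cache:
        logger.warning(WARNING_MSG)
        # Too late: `presents` is already an empty tuple instead of None.
        use_cache = False
    return use_cache, presents


def init_presents_after_override(use_cache, gradient_checkpointing, training):
    """Hypothetical helper: overrides `use_cache` first, then creates `presents`."""
    if gradient_checkpointing and training and use_cache:
        logger.warning(WARNING_MSG)
        use_cache = False
    # `presents` now reflects the final value of `use_cache`.
    presents = () if use_cache else None
    return use_cache, presents


if __name__ == "__main__":
    print(init_presents_before_override(True, True, True))
    print(init_presents_after_override(True, True, True))
```

With the first ordering, `use_cache` ends up `False` while `presents` stays an empty tuple, so the two values disagree. With the second ordering, both are consistent, which is what the suggestion above achieves by moving the check ahead of the initialization.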
updated, thanks!
presents=None when use_cache is set to False for activation ckpt
use_cache=False before creating presents which relies on use_cache
younesbelkada
left a comment
Great catch! Can you run the styling checks?
    make fixup
Then we can merge, I think.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Hi @yundai424, thanks a lot for iterating. In order to move forward with the PR, could you merge your branch with the main branch?

Hi @younesbelkada are you referring to merging to HF main? 🤔

Hi @yundai424
    git fetch upstream
    git merge upstream/main
    git push

oh cool i see what you mean.. merged, thanks! @younesbelkada
younesbelkada
left a comment
Clean to me, thanks!
LysandreJik
left a comment
Great, thanks @yundai424!
What does this PR do?
Fixes #26327
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.