Cache: don't show warning in forward passes when past_key_values is None
#33541
Conversation
past_key_values = DynamicCache.from_legacy_cache(past_key_values)
logger.warning_once(
    "We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and "
    "will be removed in v4.47. Please use an appropriate `Cache` class "
Bumped the deprecation to v4.47, as some key models like T5 are still missing.
next_cache = next_decoder_cache if use_cache else None
if return_legacy_cache:
    next_cache = next_cache.to_legacy_cache()
copy/paste from llama
(on some models, this pattern was slightly different)
    "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
)
return_legacy_cache = False
if use_cache and not isinstance(past_key_values, Cache):
Note: `not self.training` was removed.
If we are training and we pass `past_key_values` as a tuple of tuples, we definitely want to see the warning, since that code path will break in the near future.
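To make the new guard concrete, here is a minimal sketch of the updated logic. This is an assumption-laden simplification, not the actual transformers code: `Cache` and `DynamicCache` below are toy stand-ins for the real classes, and `prepare_cache` is a hypothetical helper condensing what each model's forward pass does. The key behavior from this PR: warn on a legacy tuple of tuples regardless of training mode, but stay silent when `past_key_values` is `None`, because there is nothing to convert.

```python
import warnings


class Cache:
    """Toy stand-in for transformers.Cache (not the real class)."""


class DynamicCache(Cache):
    """Toy stand-in for transformers.DynamicCache."""

    def __init__(self, layers=None):
        self.layers = list(layers) if layers else []

    @classmethod
    def from_legacy_cache(cls, past_key_values):
        return cls(past_key_values)


def prepare_cache(past_key_values, use_cache=True):
    """Hypothetical condensation of the per-model forward-pass logic."""
    return_legacy_cache = False
    if use_cache and not isinstance(past_key_values, Cache):
        return_legacy_cache = True
        if past_key_values is None:
            # Nothing to convert -- no warning (the fix in this PR).
            past_key_values = DynamicCache()
        else:
            # A real legacy cache was passed: warn, then convert.
            warnings.warn(
                "Passing `past_key_values` as a tuple of tuples is deprecated; "
                "use a `Cache` class instead.",
                FutureWarning,
            )
            past_key_values = DynamicCache.from_legacy_cache(past_key_values)
    return past_key_values, return_legacy_cache
```

With this shape, `prepare_cache(None)` silently builds an empty cache, while passing an actual tuple of tuples still triggers the deprecation warning, whether or not the model is in training mode.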
zucchini-nlp
left a comment
Thanks for fixing, this is much better than checking for `self.training`!
LysandreJik
left a comment
Thanks Joao!
logger.warning_once(
    "We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and "
    "will be removed in v4.47. Please use an appropriate `Cache` class "
    "(https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
(nit: not really related to the PR, but to the link, which was already here before)
Linking to the `Cache` class is cool, but you have to scroll down a bit to see an example. Would it be possible to link to a migration doc/example showcasing how previously written code with past key values as a tuple of tuples can be adapted before being sent to the model?
The more copy-pastable the example, the less friction there will be here
@LysandreJik good point!
I've added a tiny section to our cache docs about the legacy cache and how to convert it to/from the new format, with an example (cc @zucchini-nlp). This warning now points to that section in the docs.
(will merge after confirming the docs with the doc builder)
EDIT: for some reason, the doc builder is not updating its contents, despite the doc job being successful 🤔 I'm going to merge and double-check the merged results
EDIT2: it worked :) https://huggingface.co/docs/transformers/main/en/kv_cache#legacy-cache-format
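The docs section linked above covers converting between the legacy tuple-of-tuples format and the new cache objects. A toy sketch of that round trip follows; `MiniDynamicCache` is a hypothetical, simplified stand-in for `transformers.DynamicCache` (which exposes the real `from_legacy_cache` / `to_legacy_cache` methods), and plain tuples stand in for the key/value tensors.

```python
class MiniDynamicCache:
    """Simplified stand-in for transformers.DynamicCache."""

    def __init__(self):
        self.key_cache = []    # one entry per layer
        self.value_cache = []  # one entry per layer

    @classmethod
    def from_legacy_cache(cls, past_key_values):
        # Legacy format: a tuple (one element per layer) of (key, value) pairs.
        cache = cls()
        for key, value in past_key_values:
            cache.key_cache.append(key)
            cache.value_cache.append(value)
        return cache

    def to_legacy_cache(self):
        # Rebuild the tuple-of-tuples layout from the per-layer lists.
        return tuple(zip(self.key_cache, self.value_cache))


# Round trip: legacy -> cache object -> legacy, losslessly.
legacy = ((("k0",), ("v0",)), (("k1",), ("v1",)))  # two "layers"
cache = MiniDynamicCache.from_legacy_cache(legacy)
assert cache.to_legacy_cache() == legacy
```

The real conversion works the same way at the format level, with tensors in place of the placeholder tuples.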
What does this PR do?
Because of the transition from a tuple of tuples to `Cache` instances, we were throwing a warning when converting `past_key_values` to the new cache format in the forward passes. One of those situations was when `use_cache=True` and `past_key_values is None`... but there is nothing to convert there. In fact, most of the time, the user didn't even specify the argument (see test script below). Moreover, after the transition is complete, we want to keep the default `past_key_values=None` argument. As such, this PR removes the warning when `past_key_values=None`.

Fixes #33489
Test script:
Before:
Now: no warning :)