
Conversation

@zucchini-nlp
Member

What does this PR do?

Fixes #34033 and enables tests for VLMs. Previously, the tests were all skipped because we had a hard check for the CausalLM mapping.
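For context, a minimal sketch of the call pattern this PR enables for decoder-only VLMs; the checkpoint, image URL, and prompt format below are illustrative, not taken from the PR:

import requests
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

checkpoint = "HuggingFaceM4/idefics2-8b"  # illustrative VLM checkpoint
processor = AutoProcessor.from_pretrained(checkpoint)
model = AutoModelForVision2Seq.from_pretrained(checkpoint)

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)  # placeholder URL
inputs = processor(text="<image> Describe the image.", images=[image], return_tensors="pt")

# Pass pre-computed text embeddings to generate instead of input_ids.
input_ids = inputs.pop("input_ids")
inputs_embeds = model.get_input_embeddings()(input_ids)
output_ids = model.generate(inputs_embeds=inputs_embeds, **inputs, max_new_tokens=20)
print(processor.batch_decode(output_ids, skip_special_tokens=True))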

# generating the first new token or not, and we only want to use the embeddings for the first new token)
if not self.config.is_encoder_decoder and model_input_name == "inputs_embeds":
    model_kwargs["use_cache"] = True
    generation_config.use_cache = True
Member Author


The subsequent step of preparing the cache checks the generation config, so without this we miss the step of adding a Cache class. This fails only for IDEFICS, because it doesn't prepare the cache in forward when use_cache is set, but rather assumes that a correct cache has already been provided.
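To illustrate the interaction, here is a minimal sketch of a cache-preparation step; this is a simplified, hypothetical helper, not the exact generate internals:

from transformers import DynamicCache

def prepare_cache(generation_config, model_kwargs):
    # Hypothetical, simplified stand-in for generate's cache-preparation step.
    # It reads generation_config.use_cache, not model_kwargs["use_cache"], so if only
    # model_kwargs were updated above, no Cache object would be created here. IDEFICS
    # then fails because its forward expects an already-prepared cache instead of
    # building one itself when use_cache is True.
    if generation_config.use_cache and model_kwargs.get("past_key_values") is None:
        model_kwargs["past_key_values"] = DynamicCache()
    return model_kwargs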

Contributor


This change updates generation_config, which means we would have two variables containing the source of truth for this flag (generation_config.use_cache and model_kwargs["use_cache"]). More than one source of truth usually leads to bugs 🐛 We also know that we need model_kwargs["use_cache"] for the forward pass.

To avoid multiple sources of truth, I'd suggest one of the following:

  1. Let's use model_kwargs["use_cache"] in all places after this if/else, instead of also using generation_config.use_cache.
  2. (I have a preference for this one) Let's use generation_config.use_cache everywhere, removing model_kwargs["use_cache"] from most places in the generate function. Before calling the decoding methods, let's add model_kwargs["use_cache"] = generation_config.use_cache, since we need it for the forward pass and generation_config barely gets used in the decoding methods (sketched below).

(I know this issue predates your change 🤗 But since we're touching it, let's do the most sustainable change)
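A minimal sketch of option 2, written as a hypothetical helper so the flow is visible in one place; this is not the actual generate implementation:

def _sync_use_cache(self, generation_config, model_kwargs, model_input_name):
    # Hypothetical helper illustrating the suggestion above, not real library code.
    # Single source of truth: only generation_config.use_cache is flipped here.
    if not self.config.is_encoder_decoder and model_input_name == "inputs_embeds":
        generation_config.use_cache = True
    # Just before dispatching to the decoding method, mirror the flag into model_kwargs,
    # since the model's forward pass still reads use_cache from its kwargs.
    model_kwargs["use_cache"] = generation_config.use_cache
    return model_kwargs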

Collaborator

@ArthurZucker ArthurZucker left a comment


It's always the same models causing problems 👁️ @leot13 👁️

@zucchini-nlp
Member Author

zucchini-nlp commented Oct 10, 2024

I am trying to make sure that the IDEFICS models start using library standards, and hopefully, once we enable generation tests, it will be easier to catch these bugs when adding a model 🤗

Contributor

@gante gante left a comment


LGTM, except for that nit in the generate body :)

Thank you for fixing 💪


@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp zucchini-nlp merged commit d087165 into huggingface:main Oct 16, 2024
@zucchini-nlp zucchini-nlp mentioned this pull request Oct 28, 2024
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
* support embeds

* use cache from config

* style...

* fix tests after rebase


Development

Successfully merging this pull request may close these issues.

IDEFICS can't use inputs_embeds in generate function
