fix past_key_values in GPTNeoXForCausalLM.prepare_inputs_for_generation #20621
Conversation
The documentation is not available anymore as the PR was closed or merged.
Will let @gante comment here as he's the specialist for `generate`. Make sure to run `make style` on your branch to fix the code quality issue.
After doing some more testing, I noticed another issue that might or might not be a bug. Currently, it's not possible to use anything else than …, which leads to … Is that expected behavior? I can fix it by creating multiple prompts per input (see below), but it seems unintuitive, and I don't see anything about it in the docs. Perhaps the docs should simply mention that.
Hey @ValeKnappich 👋 Thank you for the addition. I really think we should do this for all models, for a better interface. In fact, the argument should be `past_key_values`. As for …
fix past_key_values in GPTNeoXForCausalLM.prepare_inputs_for_generation (huggingface#20621)
* fix past_key_values in GPTNeoXForCausalLM.prepare_inputs_for_generation
* fix formatting
Hi, has this issue been resolved? I tried running the code snippet above, and it returned with … Is this a different error?
@ardywibowo the script I paste below works. But keep in mind that it is probably not doing what you expect when passing `past_key_values` this way. To understand why, you would have to dive into this blog post and into our `generate` code.

```python
import torch
from transformers import GPTNeoXForCausalLM, AutoTokenizer

# Load model and tokenizer
s = "NinedayWang/PolyCoder-160M"
model = GPTNeoXForCausalLM.from_pretrained(s)
tokenizer = AutoTokenizer.from_pretrained(s, pad_token="<|PAD|>")

# Create a random cache: after the permute/split below, this becomes a tuple
# with one entry per layer, each holding key and value tensors of shape
# (batch, num_heads, seq_len, head_dim)
N_TOKENS = 100
BATCH_SIZE = 1
pkv = (
    torch.rand(
        (
            BATCH_SIZE,  # batch size
            N_TOKENS,  # number of cached tokens
            2 * model.config.num_hidden_layers,  # a key and a value per layer
            model.config.num_attention_heads,
            model.config.hidden_size // model.config.num_attention_heads,  # head dim
        )
    )
    .permute([2, 0, 3, 1, 4])
    .split(2)
)

# Tokenize. The attention mask must cover the N_TOKENS cached tokens plus one
# input token (with a cache present, only the last input token is fed to forward).
enc = tokenizer("Hello world", return_tensors="pt")
enc["attention_mask"] = torch.ones((1, N_TOKENS + 1))

# Generate
print(
    tokenizer.decode(
        model.generate(
            **enc,
            past_key_values=pkv,
            max_new_tokens=100,
            pad_token_id=tokenizer.pad_token_id,
            do_sample=True,
        )[0],
        skip_special_tokens=True,
    )
)
```
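As a sanity check, one could append the following to the script above (my own addition, not from the original comment) to confirm the cache layout GPT-NeoX expects: one entry per hidden layer, indexable as keys and values of shape `(batch, num_heads, seq_len, head_dim)`.

```python
# Sanity check, appended to the script above (an illustrative addition, not
# part of the original comment). After permute + split, pkv holds one entry
# per hidden layer; layer_past[0] are the keys, layer_past[1] the values.
head_dim = model.config.hidden_size // model.config.num_attention_heads
assert len(pkv) == model.config.num_hidden_layers
for layer_past in pkv:
    keys, values = layer_past[0], layer_past[1]
    assert keys.shape == (BATCH_SIZE, model.config.num_attention_heads, N_TOKENS, head_dim)
    assert keys.shape == values.shape
```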
What does this PR do?
@gante @sgugger
Fixes `past_key_values` in `GPTNeoXForCausalLM.prepare_inputs_for_generation`. Passing `past_key_values` to `model.generate` had no effect whatsoever, since the argument was swallowed. Described in issue #20347 (note that the validation bug was fixed in PR #20353, but the argument was still not passed along to the forward method).

The attached commit fixes the issue on my end, i.e. I now get different results when passing `past_key_values` to `generate`, as opposed to before.
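For readers skimming the thread: the change boils down to returning the cache from `prepare_inputs_for_generation` instead of dropping it. Below is a minimal sketch of the idea, paraphrased from the description above rather than the exact diff; the method body details are assumptions.

```python
# Paraphrased sketch of the fix, not the verbatim diff from this PR.
def prepare_inputs_for_generation(self, input_ids, past_key_values=None, attention_mask=None, **kwargs):
    if attention_mask is None:
        attention_mask = input_ids.new_ones(input_ids.shape)
    # When a cache is present, only the last token has to be fed to forward.
    if past_key_values is not None:
        input_ids = input_ids[:, -1:]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        # Previously this key was dropped, so generate() never forwarded the cache.
        "past_key_values": past_key_values,
    }
```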