Fix FalconMambaIntegrationTests
#38566
Conversation
| ("cuda", 7): [ | ||
| ' I will be talking about the “Theory of Relativity” by Albert Einstein.\nThe', | ||
| ' I will be talking about the importance of the internet in our lives.\nThe internet is a global', | ||
| ], | ||
| ("cuda", 8): [ | ||
| ' I am going to talk about the “Theory of Relativity” by Albert Einstein.\n', | ||
| ' I will be talking about the importance of the internet in our lives.\nThe internet is a global' | ||
| ], |
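For context, a minimal sketch of how such device-keyed expected outputs are typically consumed in transformers tests, assuming the Expectations helper from transformers.testing_utils (the keys are (device type, CUDA compute capability major), matching the diff above):

    from transformers.testing_utils import Expectations

    # Expected generations differ slightly across GPU generations, so the
    # test stores one entry per (device type, compute capability major).
    EXPECTED_OUTPUTS = Expectations(
        {
            ("cuda", 7): [...],  # values as in the diff above
            ("cuda", 8): [...],
        }
    )
    # Picks the entry matching the hardware the test is currently running on.
    expected = EXPECTED_OUTPUTS.get_expectation()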
@gante I guess it's normal that, with inputs_embeds, we don't have the prompt included, right?
Imo, that doesn't seem correct. It would be weird to expect different behaviour here since we generate from the same "prompt". That might be a regression somewhere.
I am not an expert (that is why I pinged @gante), but I think it's normal. For the prompt part, we only pass embeddings, not token ids, and we can't recover the token ids from the embeddings. That is why decoding only gives the part that was generated.
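To make the point concrete, a hedged sketch of the two generation paths (the checkpoint id here is an assumption for illustration):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "tiiuae/falcon-mamba-7b"  # assumed checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

    inputs = tokenizer("Today I am going to talk about", return_tensors="pt").to("cuda")

    # Generating from token ids: the returned sequence includes the prompt ids.
    out_ids = model.generate(**inputs, max_new_tokens=20)

    # Generating from embeddings only: there are no prompt token ids to prepend,
    # so the returned sequence contains just the newly generated tokens.
    embeds = model.get_input_embeddings()(inputs.input_ids)
    out_embeds = model.generate(inputs_embeds=embeds, max_new_tokens=20)

    print(tokenizer.decode(out_ids[0]))     # prompt + continuation
    print(tokenizer.decode(out_embeds[0]))  # continuation only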
It would be nice to know if the original commit passed this test, or at least whether, when generating from input embeds, the same issue (no prompt) persisted.
Both behaviours are justifiable imo.
It was already failing when the test was written 😢
(That's why I always say it's important to use run-slow on PRs.)
You are right, my bad, I moved too fast and got the wrong results. I'm on it: I will check the commit from the day before to see what happened.
I confirmed that this test was already failing when it was added on 2025/06/19 - I checked out that commit and ran the test.
(Around that time, several CIs were triggered manually on different commits, so it's hard to check on the Slack channels.)
Sorry to be so picky, but is it also failing on (at least) no prefix being returned 👀 If yes, I think this is fine to merge.
Yes, same failure reason. Not picky at all, it's fine. I am also happy to wait for Joao's response. Not urgent to merge at all :-)
Gentle ping @gante
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
vasqu left a comment
I would wait for Joao on the input embeds side because that seems really weird to me. Otherwise, mostly nits.
def tearDown(self):
    cleanup(torch_device, gc_collect=True)

# On T4, get `NotImplementedError: Cannot copy out of meta tensor; no data!`
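For context, a rough sketch of what a cleanup helper like this typically does between tests (an assumption about the helper's behaviour, not the actual transformers source):

    import gc
    import torch

    def cleanup(device: str, gc_collect: bool = False):
        # Drop Python-side references first, then release cached GPU memory,
        # so one test's allocations don't push the next test into OOM.
        if gc_collect:
            gc.collect()
        if "cuda" in device:
            torch.cuda.empty_cache()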
Might be too little memory or some CPU offloading issue?
Yeah, I guess so too. But if I remove device_map="auto" and add .to(torch_device) instead, i.e.
model = AutoModelForCausalLM.from_pretrained(self.model_id, torch_dtype=torch.float16).to(torch_device)
the generation runs without OOM.
So it's a bit strange to me why "auto" would cause a problem.
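A minimal sketch of the two loading variants being compared (the checkpoint id is an assumption):

    import torch
    from transformers import AutoModelForCausalLM

    model_id = "tiiuae/falcon-mamba-7b"  # assumed checkpoint

    # Fails on T4 with "Cannot copy out of meta tensor; no data!":
    # model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

    # Reportedly runs without OOM on the same T4:
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")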
Maybe someone to cc here? It's not a deal breaker, but it might be interesting to investigate if someone wants to / has the time.
I will open an issue and ping
| ("cuda", 7): [ | ||
| ' I will be talking about the “Theory of Relativity” by Albert Einstein.\nThe', | ||
| ' I will be talking about the importance of the internet in our lives.\nThe internet is a global', | ||
| ], | ||
| ("cuda", 8): [ | ||
| ' I am going to talk about the “Theory of Relativity” by Albert Einstein.\n', | ||
| ' I will be talking about the importance of the internet in our lives.\nThe internet is a global' | ||
| ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo, that doesn't seem correct to me. It would be weird to expect different behaviour here since we generate from the same "prompt". That might be a regression somewhere.
|
Although it is not urgent, let's merge. If @gante has any comments that would lead to further changes, we can do them in a follow-up PR. Keeping failing tests around has some risk: if other merged commits cause trouble, it won't be detected, which makes further fixes more difficult to handle. (I have been in this situation several times ... trust me, the pain.)
What does this PR do?
All tests pass on A10. One fails on T4 with
    NotImplementedError: Cannot copy out of meta tensor; no data!
so I decided to just skip it on T4.
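A hedged sketch of how such a T4 skip could look (the helper and test name are assumptions; T4 GPUs report CUDA compute capability (7, 5)):

    import unittest
    import torch

    def is_t4() -> bool:
        # T4 GPUs report CUDA compute capability (7, 5).
        return torch.cuda.is_available() and torch.cuda.get_device_capability() == (7, 5)

    class FalconMambaIntegrationTests(unittest.TestCase):
        def test_generation_with_device_map_auto(self):  # hypothetical test name
            if is_t4():
                self.skipTest("On T4: NotImplementedError: Cannot copy out of meta tensor; no data!")
            ...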