[fix] Use last_hidden_state key from get_image_features for llama4#43882
[fix] Use last_hidden_state key from get_image_features for llama4#43882vasqu merged 1 commit intohuggingface:mainfrom
fix] Use last_hidden_state key from get_image_features for llama4#43882Conversation
|
[For maintainers] Suggested jobs to run (before merge) run-slow: llama4 |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
zucchini-nlp
left a comment
There was a problem hiding this comment.
Btw, do we not have llama4 fast tests running to check this?
| return BaseModelOutput( | ||
| return BaseModelOutputWithPooling( | ||
| last_hidden_state=hidden_state, | ||
| hidden_states=hidden_states, |
There was a problem hiding this comment.
i think the last hidden state is the one after layernorm_post, and pooler state is after adapter. Though it'll be a breaking change...
Fine with leaving it as is, thanks
There was a problem hiding this comment.
You're right :/ Damn, I wish I spotted that before v5, as it's indeed breaking if we improve it. I did make roughly this change for a few other architectures pre-v5.
No, I was surprised as well. We only have slow tests. |
|
super weird tbh. The model arch is pretty straightforward so there should be no reason to skip the |
vasqu
left a comment
There was a problem hiding this comment.
Thanks, yea we don't have fast tests for llama4 (which I also wondered about but I think it was done in a rush so it happens)
|
Yep we had an insane rush it was the first BIG BIG model 😢 |
|
I see, would be great to add it but I don;;t want to block this PR. Will add in my todo notes, maybe one day 🙃 |
## Summary <!--- This is a required section; please describe the main purpose of this proposed code change. ---> Fix Convergence Tests for Transformers v5, including updating some mismatched variables, increasing the tolerance for some tests, ...etc. The only one error remained is expected and should be fixed in [Transformers#43882](huggingface/transformers#43882). <!--- ## Details This is an optional section; is there anything specific that reviewers should be aware of? ---> ## Related Issues & PRs - #978 ## Testing Done <!--- This is a required section; please describe how this change was tested. ---> <!-- Replace BLANK with your device type. For example, A100-80G-PCIe Complete the following tasks before sending your PR, and replace `[ ]` with `[x]` to indicate you have done them. --> - Hardware Type: H100 - [ ] run `make test` to ensure correctness - [x] run `make checkstyle` to ensure code style - [x] run `make test-convergence` to ensure convergence on v4 & v5 --------- Co-authored-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
…ma4 (huggingface#43882) Use last_hidden_state key from get_image_features for llama4
What does this PR do?
Resolves #42564 (comment)
#42564 updated
get_image_featuresfor Llama4, but it erroneously started usingpooler_outputinstead of the previouslast_hidden_state. Additionally, the Llama4VisionModel output has been updated toBaseModelOutputWithPoolingto match many vision models.Reproducer
On
main:On this PR:
(Gibberish obviously as this is a tiny-random model, but it works now!)
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@zucchini-nlp @vasqu
Thank you to @Mecoli1219 for reporting this.