Fix tie_word_embedding issue for llava_onevision model#43617
Closed
kaixuanliu wants to merge 1 commit into huggingface:main from
Conversation
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Contributor
[For maintainers] Suggested jobs to run (before merge) run-slow: llava_next_video, llava_onevision
Member
zucchini-nlp
left a comment
Hey @kaixuanliu, thanks a lot for catching this. I am fixing all config default value issues in #43592 instead of fixing them per model in separate PRs.
I'll add the llava models in that PR to keep everything in one place.
Contributor
Author
Great! Will close this PR once #43592 is merged.
Member
PR merged and will be in the next patch release, probably on Monday
@zucchini-nlp pls help review, thx! We have to add back the changes from #42523. For the llava_onevision model, in its checkpoint config file, the model's `tie_word_embeddings` is False while the text_config's `tie_word_embeddings` is True (L171). As a result, when the model is loaded, `lm_head.weight` goes missing, which produces completely wrong output. You can reproduce it with this unit test: `RUN_SLOW=1 pytest -rA tests/models/llava_onevision/test_modeling_llava_onevision.py::LlavaOnevisionForConditionalGenerationIntegrationTest::test_small_model_integration_test_multi_image_nested`
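The mismatch described above can be illustrated with a minimal sketch. The nested dict mirrors the shape of a Hugging Face config JSON with the flag values from the description; the helper function name is hypothetical and not part of the transformers API:

```python
# Minimal sketch of the tie_word_embeddings mismatch described above.
# The dict mimics the relevant part of the checkpoint's config.json;
# `has_tie_mismatch` is a hypothetical helper for illustration only.
checkpoint_config = {
    "tie_word_embeddings": False,      # top-level flag in the checkpoint
    "text_config": {
        "tie_word_embeddings": True,   # nested text_config flag (L171)
    },
}

def has_tie_mismatch(cfg: dict) -> bool:
    """Return True when the top-level flag disagrees with text_config's."""
    top = cfg.get("tie_word_embeddings")
    nested = cfg.get("text_config", {}).get("tie_word_embeddings")
    return nested is not None and top != nested

print(has_tie_mismatch(checkpoint_config))  # True for this checkpoint
```

When the two flags disagree, loading code that trusts the nested value skips materializing `lm_head.weight` (expecting it to be tied to the input embeddings), which matches the missing-weight symptom reported here.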