
Fix tie_word_embedding issue for llava_onevision model #43617

Closed
kaixuanliu wants to merge 1 commit into huggingface:main from kaixuanliu:llava_onevision_fix

Conversation

@kaixuanliu (Contributor)

@zucchini-nlp pls help review, thx! We have to add back the changes from #42523. For the llava_onevision model, in its checkpoint config file, the model's top-level tie_word_embeddings is False while the text_config's tie_word_embeddings is True (L171). As a result, lm_head.weight goes missing when the model is loaded, which produces completely wrong output. You can reproduce it with this unit test:
RUN_SLOW=1 pytest -rA tests/models/llava_onevision/test_modeling_llava_onevision.py::LlavaOnevisionForConditionalGenerationIntegrationTest::test_small_model_integration_test_multi_image_nested
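The mismatch can be sketched in plain Python. This is a hypothetical illustration of the failure mode described above, not the actual transformers loading logic; the dict keys mirror the checkpoint's config.json fields:

```python
# Hedged sketch (assumption: the loader consults the top-level flag while the
# checkpoint was saved according to the nested text_config flag).
checkpoint_config = {
    "tie_word_embeddings": False,                   # top-level flag in config.json
    "text_config": {"tie_word_embeddings": True},   # nested flag used at save time
}

def weights_were_tied_at_save(config: dict) -> bool:
    # The language-model weights belong to the text model, so the nested flag
    # decides whether lm_head.weight was stored in the checkpoint at all.
    return config["text_config"]["tie_word_embeddings"]

def loader_reties_weights(config: dict) -> bool:
    # A loader that only consults the top-level flag will not re-tie lm_head
    # to the input embeddings.
    return config["tie_word_embeddings"]

# Weights were tied at save time (so lm_head.weight is absent from the
# checkpoint), but the loader never re-ties them: lm_head ends up missing
# and the model produces garbage output.
lm_head_missing = (
    weights_were_tied_at_save(checkpoint_config)
    and not loader_reties_weights(checkpoint_config)
)
print(lm_head_missing)  # True -> the symptom described in this PR
```

With the two flags in agreement (both True or both False), `lm_head_missing` would be False, which is the behavior the fix restores.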

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: llava_next_video, llava_onevision

@zucchini-nlp (Member) left a comment


Hey @kaixuanliu, thanks a lot for catching this. I am fixing all config default-value issues in #43592 instead of fixing them per model in separate PRs.

I'll add llava models in the PR to keep it in one place

@kaixuanliu (Contributor, Author)

Great! I will close this PR once #43592 is merged.

@zucchini-nlp (Member)

PR merged; it will be in the next patch release, probably on Monday.
