Fix CLIPVisionModel hidden_states issue in Llava convergence test for transformers v5 #1061
Conversation
Remove importlib.reload(modeling_clip) from revert_liger_kernel_to_llava() as it breaks CLIPVisionModel's output_hidden_states functionality in transformers v5. Liger kernel does not patch modeling_clip when model=None, so reloading it is unnecessary and causes hidden_states to return None.
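For context, the revert helpers in the convergence tests undo Liger's monkey-patching by re-importing the patched transformers modules with `importlib.reload`. The pattern can be sketched self-containedly with a stdlib module standing in for a patched modeling module (`json` and the lambda here are illustrative, not the actual liger-kernel code):

```python
import importlib
import json

# Simulate a Liger-style module-level patch: swap out a function.
json.dumps = lambda obj, **kwargs: "patched"
assert json.dumps({}) == "patched"

# Reloading re-executes the module source, restoring the original
# definitions -- this is the mechanism revert helpers rely on.
importlib.reload(json)
assert json.dumps({}) == "{}"
```

The catch, as this PR shows, is that reloading a module the kernel never patched is not a no-op: it can disturb state other code set up at import time.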
LGTM. btw, do we need to reload llama? I think liger kernel doesn't patch modeling_llama here.
@Tcc0403 I've tested it already. Feel free to approve and merge it.
Yes, we need to reload modeling_llama. In the Llava convergence test, the Llava patch applies the Llama patch as well, so reverting only modeling_llava would leave modeling_llama patched.
Yes, as @yukiu00 explained above, there's a llama patch inside the llava patch. Thanks for the fix, but do you know the root cause of CLIPVisionModel.forward() returning hidden_states=None after the reload?
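To illustrate the nesting (a toy sketch, not the actual liger-kernel internals; all names here are hypothetical): the llava patch pulls in the llama patch, so a revert that only reloaded modeling_llava would leave the llama-side patch in place.

```python
# Hypothetical sketch of nested patching; names are illustrative.
patched_modules = set()

def apply_llama_patch():
    patched_modules.add("modeling_llama")

def apply_llava_patch():
    patched_modules.add("modeling_llava")
    # The multimodal patch also patches the text backbone.
    apply_llama_patch()

apply_llava_patch()
assert patched_modules == {"modeling_llava", "modeling_llama"}
# Reverting therefore has to reload both modules, not just modeling_llava.
```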
I just checked the recent transformers changes. Update: the pivot PR is huggingface/transformers#41546
Thanks for the deep dive into the root cause! I've just opened an issue on the transformers repo. Here is the link: huggingface/transformers#43761. I referenced this PR and the context you provided. Hopefully, they can clarify whether this was intended or a bug.
@Tcc0403 FYI: I tested the transformers fix branch (…).
Can you check whether huggingface/transformers#43765 resolves the issue?
@Tcc0403 Confirmed. transformers main (5.2.0.dev0, includes #43765) resolves the issue.
Thanks! |
Fixes #1011
Summary
- Fixes the `TypeError: 'NoneType' object is not subscriptable` error in the Llava convergence test, caused by `image_outputs.hidden_states` returning `None`
- Removes `importlib.reload(modeling_clip)` from the `revert_liger_kernel_to_llava()` function

Problem
In transformers v5, calling `importlib.reload(modeling_clip)` breaks `CLIPVisionModel`'s `output_hidden_states` functionality. When the Llava convergence test runs:

1. The test calls `revert_liger_kernel_to_llava()`, which reloads `modeling_clip`
2. After the reload, `CLIPVisionModel.forward()` returns `hidden_states=None` even when `output_hidden_states=True` is passed

This causes the `TypeError` at the point where `image_outputs.hidden_states` is indexed.
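Independent of the transformers-side root cause, `importlib.reload` has a general pitfall that matches this symptom: reloading re-executes the module, so every class in it is rebound to a new object, and any wiring captured against the old class objects at import time (such as decorator or registry state) no longer matches. A self-contained sketch (the `fake_clip` module is illustrative, not transformers code):

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module on disk, standing in for modeling_clip.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "fake_clip.py"), "w") as f:
    f.write("class VisionModel:\n    pass\n")
sys.path.insert(0, moddir)

import fake_clip
model = fake_clip.VisionModel()  # instance of the ORIGINAL class object

importlib.reload(fake_clip)  # re-executes the file: VisionModel is a NEW class

# Identity-based dispatch set up before the reload now silently misses:
assert type(model) is not fake_clip.VisionModel
assert not isinstance(model, fake_clip.VisionModel)
```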
Solution
Remove `importlib.reload(modeling_clip)` from `revert_liger_kernel_to_llava()`. This is safe because:

- Liger kernel does not patch `modeling_clip` when `model=None` (which is the case in convergence tests)
- Only `modeling_llava` and `modeling_llama` need to be reloaded to revert Liger patches

Test plan
- `python -m pytest test/convergence/bf16/test_mini_models_multimodal.py -k llava` passes
- Verified that `hidden_states` is no longer `None` after the fix