Avoid overwriting existing local implementation when loading remote custom model (#38474)

Isotr0py merged 2 commits into huggingface:main
Conversation
Hi @Isotr0py, it's a little weird but this is expected behaviour! When the user sets […]. The solution is for users to remove […]
In fact, we found this issue during investigation when we tried removing […]. A more comprehensive reproduction with pure Transformers:

```python
import torch
from transformers import AutoModel

# initialize a library model
model = AutoModel.from_pretrained('Qwen/Qwen2-0.5B-Instruct')
print(model.__class__)
# <class 'transformers.models.qwen2.modeling_qwen2.Qwen2Model'>

# initialize a remote model with `trust_remote_code=True`
model = AutoModel.from_pretrained('Alibaba-NLP/gte-Qwen2-1.5B-instruct', trust_remote_code=True)
print(model.__class__)
# <class 'transformers_modules.Alibaba-NLP.gte-Qwen2-1.5B-instruct.a9af15a6372d7d6b25e9fb07c2ccb9e1fe645644.modeling_qwen.Qwen2Model'>

# re-initialize a library model without `trust_remote_code=True`
model = AutoModel.from_pretrained('Qwen/Qwen2-0.5B-Instruct')
print(model.__class__)
# <class 'transformers_modules.Alibaba-NLP.gte-Qwen2-1.5B-instruct.a9af15a6372d7d6b25e9fb07c2ccb9e1fe645644.modeling_qwen.Qwen2Model'>
```

The library Qwen2 model can't be loaded properly anymore (even without `trust_remote_code=True`).
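The failure mode above can be modeled with a toy name-to-class registry. This is a hypothetical sketch, not the actual Transformers internals: the point is that when remote custom code is registered under the same key as a library class, an unguarded assignment clobbers the local implementation, while a "keep the existing local implementation" guard (the behaviour this PR aims for) preserves it.

```python
# Hypothetical sketch (NOT actual Transformers internals) of how a shared
# name->class registry lets remote code clobber a library implementation,
# and how an "existing local implementation wins" guard prevents it.

registry = {}

def register(name, cls):
    # Guard: keep an already-registered (library) implementation instead of
    # silently overwriting it with remote custom code.
    if name in registry:
        return
    registry[name] = cls

class LocalQwen2Model: ...
class RemoteQwen2Model: ...

# The library registers its implementation first.
register("Qwen2Model", LocalQwen2Model)

# Unguarded assignment (the buggy behaviour): remote code wins ...
registry["Qwen2Model"] = RemoteQwen2Model
print(registry["Qwen2Model"].__name__)  # RemoteQwen2Model

# ... and every later lookup, even without trust_remote_code, gets it.
registry["Qwen2Model"] = LocalQwen2Model  # reset for the guarded case

# Guarded registration: the local class survives the remote load.
register("Qwen2Model", RemoteQwen2Model)
print(registry["Qwen2Model"].__name__)  # LocalQwen2Model
```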
BTW, given the reproduction outputs with this PR, it seems that remote code sharing a config name with a library model can still be loaded with this PR, though it's quite weird to me too...
ArthurZucker left a comment:

cc @Rocketknight1 It makes sense to try and protect, no? Or to prevent someone from updating a model on the Hub in a way that would overwrite our code. But the use case makes complete sense!
Yes, this is a good point, and I misunderstood part of the issue. You're absolutely correct that we should avoid overwriting the local implementation when loading a remote code model with the same class name. I'll review this again and see if we can achieve that without breaking other parts of remote code loading.
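As a side note, one plain-Python way to tell whether a resolved class came from the library or from cached remote code is its module path: dynamically loaded custom code is imported under the `transformers_modules` package (visible in the class names printed in the reproduction above), while library implementations live under `transformers.models.*`. A small self-contained check; the helper name is ours, not a Transformers API:

```python
def is_remote_code(model_cls) -> bool:
    # Dynamically fetched custom code is imported under `transformers_modules`,
    # library implementations under `transformers.models.*`.
    return model_cls.__module__.split(".", 1)[0] == "transformers_modules"

# Stand-in classes so the snippet runs without downloading anything.
class _RemoteLike: ...
_RemoteLike.__module__ = "transformers_modules.Alibaba-NLP.modeling_qwen"

class _LocalLike: ...
_LocalLike.__module__ = "transformers.models.qwen2.modeling_qwen2"

print(is_remote_code(_RemoteLike))  # True
print(is_remote_code(_LocalLike))   # False
```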
Force-pushed from af89886 to f7c8a7e.
Rocketknight1 left a comment:

Reviewed again and this change looks good, now that I understand what it's for! I made one suggestion to update the comment to be very clear about what that block is for.
Force-pushed from 34abfb1 to 1ba9933.
…tom model (huggingface#38474)

* avoid overwrite existing local implementation when loading custom remote model

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comments

Signed-off-by: Isotr0py <2037008807@qq.com>
What does this PR do?
Fixes vllm-project/vllm#18720 (comment)
After loading `Alibaba-NLP/gte-Qwen2-1.5B-instruct` with `trust_remote_code=True`, the local Qwen2 implementation in HF is overwritten by its custom implementation; initializing any original Qwen2 model again will use the custom module, which causes unexpected results even if setting `trust_remote_code=False` in pure Transformers. Reproduce code:
Output without this PR
Output with this PR
Before submitting

- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.