[RFC]: Model Deprecation Policy #9669
Comments
Let's give a buffer of four weeks, as some PRs have taken considerably longer than that. Meanwhile, we can simply disable any tests related to that model so it won't impact our CI.
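For context, a hypothetical sketch of what "disabling tests" could look like in a pytest-based suite follows; the test name, model ID, and skip reason are illustrative assumptions, not actual vLLM test code:

```python
# Hypothetical sketch: temporarily disabling a broken model's tests so
# they do not block CI while waiting to hear from the vendor.
# The test name and reason string below are made up for illustration.
import pytest

@pytest.mark.skip(reason="xverse tokenizer is broken with latest "
                         "transformers; pending vendor response (#9669)")
def test_xverse_generation():
    ...  # the model-specific test body would go here
```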
I can add another bar around the reported usage of the model. I pulled the data for …
I generally agree that it makes sense to have a deprecation policy in place, so we don't need to decide ad hoc what to do whenever something like this xverse issue pops up. 4 weeks seems reasonable to me.
Overall this policy makes sense to me, but we might need to think about the definition of "broken" models. It seems that we're going to rely on …
I would define it as: it does not work with the latest vLLM (which usually uses the latest transformers / PyTorch) out of the box.
According to the feedback, the decision is: if we find a model is broken (cannot run directly with the latest transformers version) and we cannot hear from the model vendor for 4 weeks, then we will remove the model from vLLM.
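To make that definition concrete, "cannot run directly" can be read as a simple smoke check like the sketch below, using the xverse model discussed in this thread; the prompt and token budget are arbitrary assumptions, not an official acceptance test:

```python
# Minimal smoke check for "broken": does the model load and generate
# with the latest vLLM out of the box? If this raises (e.g. a tokenizer
# error), the model would fall under the deprecation policy.
from vllm import LLM, SamplingParams

llm = LLM(model="xverse/XVERSE-7B-Chat", trust_remote_code=True)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=8))
print(outputs[0].outputs[0].text)
```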
Motivation.
Usually, we accept model contributions from model vendors, as long as they can verify that the model output is correct.
When a new model is added to vLLM, vLLM maintainers need to maintain the code and update it when necessary.
However, we find that sometimes the model vendor is not responsive, and the model can become obsolete and even be broken for new transformers versions. As stated in https://docs.vllm.ai/en/latest/models/supported_models.html#model-support-policy , some models are community-driven, and vLLM maintainers do not actively keep them up-to-date.
Here, I want to go one step further: if a model is broken (cannot run directly with the latest transformers version), and we cannot hear from the model vendor for a period of time, then we will remove the model from vLLM.
An example: the xverse model added by #3610. The huggingface repo https://huggingface.co/xverse/XVERSE-7B-Chat/tree/main has not been updated in one year, and the tokenizer is broken in recent transformers, leading to an error similar to huggingface/transformers#31789. In fact, when I added torch.compile support for this model in #9641, I found that I had to use the tokenizer from meta-llama/Llama-2-7b-chat-hf in order to run the model.
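For reference, the workaround described above (borrowing the Llama-2 tokenizer) looks roughly like this in vLLM's Python API; treat it as a sketch of the stopgap, not a supported configuration:

```python
# Sketch of the tokenizer workaround from this issue: load the xverse
# model but substitute a working tokenizer. Only a stopgap; the model
# itself is still considered broken under the proposed policy.
from vllm import LLM

llm = LLM(
    model="xverse/XVERSE-7B-Chat",
    tokenizer="meta-llama/Llama-2-7b-chat-hf",  # substitute tokenizer
    trust_remote_code=True,
)
```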
Proposed Change.
If we find a model is broken (cannot run directly with the latest transformers version), and we cannot hear from the model vendor for a period of time, then we will remove the model from vLLM.
Please comment and vote on the period for deprecation:
Feedback Period.
1 week (10/24 - 10/31, both inclusive)
CC List.
No response
Any Other Things.
No response