[RFC]: Model Deprecation Policy #9669

Closed · youkaichao opened this issue Oct 24, 2024 · 7 comments

@youkaichao (Member) commented Oct 24, 2024

Motivation.

Usually, we accept model contributions from model vendors, as long as they can verify that the model output is correct.

When a new model is added to vLLM, vLLM maintainers need to maintain the code and update it when necessary.

However, we find that model vendors are sometimes unresponsive, and the model can become obsolete or even break with newer transformers versions.

As stated in https://docs.vllm.ai/en/latest/models/supported_models.html#model-support-policy, some models are community-driven, and vLLM maintainers do not actively keep them up to date.

Here, I want to go one step further: if a model is broken (cannot run directly with the latest transformers version), and we cannot hear from the model vendor for a period of time, then we will remove the model from vLLM.

An example: the xverse model added in #3610. The Hugging Face repo https://huggingface.co/xverse/XVERSE-7B-Chat/tree/main has not been updated in a year, and its tokenizer is broken with recent transformers, leading to an error similar to huggingface/transformers#31789. In fact, when I added torch.compile support for this model in #9641, I found that I had to use the tokenizer from meta-llama/Llama-2-7b-chat-hf in order to run the model (sketched below).
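For reference, a minimal sketch of that workaround, assuming a recent vLLM release and access to the meta-llama/Llama-2-7b-chat-hf tokenizer (not an official recipe):

```python
# Sketch of the workaround above: run the xverse model but point vLLM at the
# Llama-2 tokenizer, since the model's own tokenizer fails to load with recent
# transformers. Assumes a GPU with enough memory for the 7B model.
from vllm import LLM, SamplingParams

llm = LLM(
    model="xverse/XVERSE-7B-Chat",
    tokenizer="meta-llama/Llama-2-7b-chat-hf",  # substitute tokenizer
    trust_remote_code=True,
)
outputs = llm.generate(["Hello, how are you?"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```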

Proposed Change.

If we find a model is broken (cannot run directly with the latest transformers version), and we cannot hear from the model vendor for a period of time, then we will remove the model from vLLM.

Please comment and vote on the deprecation period:

  1. one week
  2. two weeks
  3. four weeks

Feedback Period.

1 week (10/24 - 10/31, both inclusive)

CC List.

No response

Any Other Things.

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
youkaichao added the RFC label Oct 24, 2024
@DarkLight1337 (Member)

Let's give a buffer of four weeks as some PRs have taken considerably longer than that. Meanwhile, we can simply disable any tests related to that model so it won't impact our CI.

@robertgshaw2-neuralmagic (Collaborator)

  • I think we should give a point release with a deprecation warning as well, so we can solicit community feedback beforehand
  • We can also suggest that users leverage the plugins if they want to keep using the model (see the sketch below)
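A rough illustration of that plugin route (a sketch under assumptions, not an endorsed migration path): a user could keep a removed model working by maintaining their own copy of the model code and registering it with vLLM's out-of-tree model registry. The my_models.xverse module below is hypothetical.

```python
# Hedged sketch: register a user-maintained copy of a removed model class so
# vLLM can still resolve the "XverseForCausalLM" architecture out of tree.
from vllm import ModelRegistry

from my_models.xverse import XverseForCausalLM  # hypothetical user-maintained copy

ModelRegistry.register_model("XverseForCausalLM", XverseForCausalLM)

# After registration, the model loads by its usual name, e.g.
#   LLM(model="xverse/XVERSE-7B-Chat", trust_remote_code=True)
```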

@simon-mo (Collaborator)

I can add another bar around the reported usage of the model. I pulled the data for XverseForCausalLM, and it seems to be actively used by the same group of users, but they are on version 0.4.0.

@tlrmchlsmth (Collaborator)

I generally agree that it makes sense to have a deprecation policy in place, so we don't need to decide ad hoc what to do whenever something like this xverse issue pops up.

4 weeks seems reasonable to me

@ywang96 (Member) commented Oct 24, 2024

I think overall this policy makes sense to me, but we might need to think about the definition of "broken" models.

It seems that we're going to rely on transformers to determine this, which I have no issue with, but it is something we should explicitly define and state in our documentation.

@youkaichao (Member, Author)

definition of "broken" models

I would define it as: the model does not work out of the box with the latest vLLM (which usually uses the latest transformers / PyTorch).
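To make that criterion concrete, here is a minimal illustration (reusing the xverse example from this thread; a sketch, not an official test):

```python
# Illustration of the "out of the box" criterion: on the latest vLLM release,
# simply loading and running the model should work without manual fixes. If a
# script like this fails (e.g. the xverse tokenizer error mentioned earlier),
# the model would count as "broken" under the proposed policy.
from vllm import LLM, SamplingParams

llm = LLM(model="xverse/XVERSE-7B-Chat", trust_remote_code=True)
out = llm.generate(["Hello"], SamplingParams(max_tokens=8))
print(out[0].outputs[0].text)
```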

@youkaichao (Member, Author)

According to the feedback, the decision is:

If we find that a model is broken (it cannot run directly with the latest transformers and vLLM versions) and we cannot hear from the model vendor for 4 weeks, then we will remove the model from vLLM.
