
Allow loading of .safetensors through GPTQ-for-LLaMa #529

Merged 1 commit into oobabooga:main on Mar 25, 2023

Conversation

@EyeDeck (Contributor) commented Mar 24, 2023

Quantized models were hardcoded to load only .pt files, but qwopqwop200's repo already works with .safetensors if we just pass in the other file extension.

With this PR, it looks for models in the order:

  • models\model-#bit.safetensors
  • models\subfolder\model-#bit.safetensors
  • models\model-#bit.pt
  • models\subfolder\model-#bit.pt

Seems to work fine, but I've only tested it on one janky model that I ran through the quantizer myself.

@Ph0rk0z (Contributor) commented Mar 24, 2023

Yes, except we all have to re-quantize 100 GB of models for this.

@EyeDeck (Contributor, Author) commented Mar 24, 2023

Why would that be necessary? I mean, it's going to need to be done to support newer versions of GPTQ-for-LLaMa at some point, but that's not related to this PR.

@oobabooga (Owner) commented

Thanks, this is handy! More updates on GPTQ will come after #530.

@oobabooga merged commit 3da633a into oobabooga:main on Mar 25, 2023
Ph0rk0z pushed a commit to Ph0rk0z/text-generation-webui-testing that referenced this pull request on Apr 17, 2023: "Allow loading of .safetensors through GPTQ-for-LLaMa"