
Mismatch in vocab_size between .bin files and .safetensors files #43

Open
noahboegli opened this issue May 26, 2024 · 0 comments

Hey!

I'm sorry if this isn't a real issue and it's just me misunderstanding the problem; I'm a novice in this field rather than an expert.

I'm trying to deploy the project according to your deployment guide.
However, since I don't have enough memory for the -70B version of the model, I want to use the --load-8bit parameter to enable model compression. (Note that I run the model on the CPU, via the --device cpu flag.)

When I use this, I get the following error:

ValueError: Trying to set a tensor of shape torch.Size([32000, 8192]) in "weight" (which has shape torch.Size([32017, 8192])), this look incorrect

If I look at the upload history on Hugging Face, I see that there were two main uploads of the model:

  • The first one with the .bin files, with vocab_size set to 32000
  • The second one with the .safetensors files, with vocab_size set to 32017

My understanding is that enabling model compression requires the .bin files, which no longer match the model configuration.

This is supported by the fact that manually editing the config.json file to set vocab_size back to 32000 allows the model to load properly with --load-8bit.
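For reference, a minimal sketch of that manual edit, assuming the model has been downloaded to a local directory containing config.json (the demo below writes a throwaway config.json with made-up fields rather than touching a real model directory):

```python
import json
import os
import tempfile

def set_vocab_size(config_path: str, vocab_size: int = 32000) -> None:
    """Rewrite vocab_size in config.json so it matches the .bin checkpoints."""
    with open(config_path) as f:
        config = json.load(f)
    config["vocab_size"] = vocab_size
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)

# Demo on a throwaway config.json; in practice, point set_vocab_size at the
# config.json inside the downloaded model directory instead.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "config.json")
with open(path, "w") as f:
    json.dump({"vocab_size": 32017, "hidden_size": 8192}, f)

set_vocab_size(path)
with open(path) as f:
    print(json.load(f)["vocab_size"])  # prints 32000
```

Note that this only papers over the mismatch: if the .safetensors checkpoints really do contain 17 extra tokens, loading the .bin files with vocab_size = 32000 means using the older embedding table.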
