I'm sorry if this is not an actual issue and it's just me misunderstanding the problem; I'm not an expert in this field, rather a novice.
I'm trying to deploy the project according to your deployment guide.
However, since I don't have enough memory for the -70B version of the model, I want to use the --load-8bit parameter to enable model compression. (I should mention that I run the model on the CPU, with the --device cpu flag.)
When I use this, I get the following error:
ValueError: Trying to set a tensor of shape torch.Size([32000, 8192]) in "weight" (which has shape torch.Size([32017, 8192])), this look incorrect
If I look at the upload history of the HF repo, I see that there were two main uploads of the model:
- The first one with the .bin files, whose config had the vocab_size value set to 32000
- The second one with the .safetensors files, whose config had the vocab_size value set to 32017
My understanding is that model compression requires the .bin files, which no longer match the model configuration.
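Something along these lines should make the mismatch visible (just a sketch; the model path is a placeholder, and I'm assuming the usual HF sharded checkpoint naming and that the embedding lives in the first shard):

```python
import glob
import json

import torch

model_dir = "/path/to/the/model"  # placeholder: local snapshot of the HF repo

# vocab_size according to the current config.json (from the safetensors upload)
with open(f"{model_dir}/config.json") as f:
    print("config vocab_size:", json.load(f)["vocab_size"])  # expect 32017

# row count of the embedding matrix inside the first .bin shard
shard = sorted(glob.glob(f"{model_dir}/pytorch_model-*.bin"))[0]
state_dict = torch.load(shard, map_location="cpu")
for name, tensor in state_dict.items():
    if name.endswith("embed_tokens.weight"):
        print(name, tuple(tensor.shape))  # expect (32000, 8192)
```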
This is supported by the fact that manually editing the config.json file to set vocab_size back to 32000 allows the model to load properly with --load-8bit.
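For anyone hitting the same error, the workaround amounts to this one-line change (the path is a placeholder):

```python
import json

config_path = "/path/to/the/model/config.json"  # placeholder

with open(config_path) as f:
    config = json.load(f)

config["vocab_size"] = 32000  # match the embedding shape in the .bin files

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```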