[Bugfix] Fix incorrect vocal embedding shards for GGUF model in tensor parallelism (vllm-project#7954)
Isotr0py authored and Jeffwan committed Sep 19, 2024
1 parent 823660c commit 5e2ccd2
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion vllm/model_executor/layers/vocab_parallel_embedding.py
@@ -351,7 +351,10 @@ def weight_loader(self, param: Parameter, loaded_weight: torch.Tensor):
             param.weight_type = loaded_weight.item()
             return
         elif isinstance(param, UninitializedParameter):
-            param.materialize(loaded_weight.shape, dtype=loaded_weight.dtype)
+            shape = list(loaded_weight.shape)
+            if output_dim is not None:
+                shape[output_dim] = shape[output_dim] // self.tp_size
+            param.materialize(tuple(shape), dtype=loaded_weight.dtype)

         # If parameter does not have output dim, then it should
         # be copied onto all gpus (e.g. g_idx for act_order gptq).
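
The shape arithmetic in the added lines can be sketched in plain Python, without torch. This is only an illustration under stated assumptions: `sharded_shape` is a hypothetical helper name that does not exist in vLLM, and the vocabulary size, hidden size, and tensor-parallel size below are made-up values. It shows why the uninitialized parameter should be materialized with the per-rank shape rather than the full shape of the loaded weight.

```python
from typing import Optional, Sequence, Tuple


def sharded_shape(
    full_shape: Sequence[int],
    output_dim: Optional[int],
    tp_size: int,
) -> Tuple[int, ...]:
    """Return the per-rank shape for a weight split along output_dim.

    Hypothetical helper mirroring the logic added in the diff: when the
    parameter has an output dimension, that dimension is divided evenly
    across tp_size ranks; otherwise the full shape is kept, because the
    weight is replicated on every rank.
    """
    shape = list(full_shape)
    if output_dim is not None:
        # Integer division, as in the diff; assumes the dimension is
        # already padded to a multiple of tp_size.
        shape[output_dim] = shape[output_dim] // tp_size
    return tuple(shape)


# Illustrative sizes (assumptions): a 32000 x 4096 embedding on 4 ranks.
per_rank = sharded_shape((32000, 4096), output_dim=0, tp_size=4)
replicated = sharded_shape((32000, 4096), output_dim=None, tp_size=4)
print(per_rank, replicated)
```

Before the change, the parameter was materialized with the full loaded shape on every rank, so with these illustrative sizes each rank would have allocated all 32000 rows instead of its 8000-row shard.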