Commit e106c4e

venkywonka authored and dominicshanshan committed
[https://nvbugs/5463720][fix] tp-split the inferred mlp_hidden_size for nemotron-nas (NVIDIA#7231)
Signed-off-by: Venky Ganesh <[email protected]> Signed-off-by: Wangshanshan <[email protected]>
1 parent 88c9eda commit e106c4e

File tree

1 file changed: +2 −1 lines changed

tensorrt_llm/_torch/model_config.py

Lines changed: 2 additions & 1 deletion

@@ -494,7 +494,8 @@ def get_bindings_model_config(self,
         architectures = self.pretrained_config.architectures
         if len(architectures
                ) == 1 and architectures[0] == "DeciLMForCausalLM":
-            mlp_hidden_size = self._infer_nemotron_ffn_mult()
+            mlp_hidden_size = self._infer_nemotron_ffn_mult(
+            ) // self.mapping.tp_size
         else:
             raise ValueError(
                 f"Inferring mlp hidden size for model architecture: {architectures} isn't supported yet"
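The one-line fix divides the inferred MLP hidden size by the tensor-parallel world size, so each rank reports only its local shard of the FFN width rather than the full unsharded size. A minimal sketch of the idea; the helper names, the `ffn_mult` scaling, and the rounding rule below are illustrative assumptions, not TensorRT-LLM's actual implementation:

```python
# Hypothetical sketch of tensor-parallel sharding of an inferred MLP width.
# The inference rule (hidden_size * ffn_mult, rounded up) is an assumption
# for illustration, not TensorRT-LLM's real _infer_nemotron_ffn_mult logic.

def round_up(x: int, multiple: int) -> int:
    """Round x up to the nearest multiple."""
    return ((x + multiple - 1) // multiple) * multiple

def infer_mlp_hidden_size(hidden_size: int, ffn_mult: float,
                          multiple_of: int = 256) -> int:
    # Full (unsharded) FFN width inferred from the config's ffn_mult.
    return round_up(int(hidden_size * ffn_mult), multiple_of)

def shard_for_tp(mlp_hidden_size: int, tp_size: int) -> int:
    # The essence of this commit's fix: divide the inferred width by
    # tp_size so each rank sees only its shard.
    return mlp_hidden_size // tp_size

full = infer_mlp_hidden_size(hidden_size=4096, ffn_mult=2.0)  # 8192
per_rank = shard_for_tp(full, tp_size=4)                      # 2048
```

Without the division, every rank would size its buffers for the whole FFN, which is what the bug fixed here amounted to.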
