Merged
3 changes: 3 additions & 0 deletions vllm/model_executor/layers/layernorm.py
@@ -510,6 +510,7 @@ def __init__(
         norm_before_gate: bool = False,
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
+        activation: str = "swish",
Contributor
high

The `activation` parameter should have a type annotation to improve readability and maintainability; it also helps static analysis catch type-related errors.

        dtype: torch.dtype | None = None,
        activation: str = "swish",

     ):
         """Initialize RMSNormGated.

@@ -524,10 +525,12 @@ def __init__(
             If False and z is provided: out = norm(x * silu(z))
             device: Device to create parameters on
             dtype: Data type for parameters
+            activation: Activation function name for gating
         """
Comment on lines +528 to 529
Contributor

high

Adding a description for the `activation` parameter to the docstring improves the documentation and makes the parameter's purpose clear to users.

            device: Device to create parameters on
            dtype: Data type for parameters
            activation: Activation function name for gating
        """

     factory_kwargs = {"device": device, "dtype": dtype}
     super().__init__()
     self.eps = eps
+    self.activation = activation
Comment on lines 532 to +533
Contributor

critical

Initializing `self.activation` with the provided `activation` value ensures the attribute is set during object creation. This fixes the `AttributeError` reported in the issue.

        super().__init__()
        self.eps = eps
        self.activation = activation

     self.weight = nn.Parameter(torch.empty(hidden_size, **factory_kwargs))
     self.register_parameter("bias", None)
     self.group_size = group_size