Merged
2 changes: 1 addition & 1 deletion docker/Dockerfile.rocm_base
@@ -12,7 +12,7 @@ ARG PYTORCH_REPO="https://github.com/pytorch/pytorch.git"
 ARG PYTORCH_VISION_REPO="https://github.com/pytorch/vision.git"
 ARG FA_BRANCH="1a7f4dfa"
 ARG FA_REPO="https://github.com/Dao-AILab/flash-attention.git"
-ARG AITER_BRANCH="5a77249"
+ARG AITER_BRANCH="c1debd8"
 ARG AITER_REPO="https://github.com/ROCm/aiter.git"

FROM ${BASE_IMAGE} AS base
3 changes: 1 addition & 2 deletions vllm/model_executor/layers/fused_moe/layer.py
@@ -376,10 +376,9 @@ def process_weights_after_loading(self, layer: torch.nn.Module) -> None:
             shuffle_weights)

         if self.rocm_aiter_moe_enabled:
-            # use 2stage ck moe layout
             shuffled_w13, shuffled_w2 = shuffle_weights(layer.w13_weight.data,
                                                         layer.w2_weight.data,
-                                                        layout=(32, 32))
+                                                        layout=(16, 16))
Collaborator:
Can this be made into something static that clarifies what it is (e.g. AITER_XXX=16)? The constant in the middle of the function is just confusing.

Contributor Author @vllmellm (May 19, 2025):
@robertgshaw2-redhat I updated the documentation for shuffle_weights as below, which explains and clarifies the function's arguments.

https://github.com/EmbeddedLLM/vllm/blob/85a7151d180de411d9593f7831a5f1d8c437685f/vllm/model_executor/layers/fused_moe/rocm_aiter_fused_moe.py#L351-L372

Since the optimal layout for all kernels is currently the same, (16, 16), we kept that as the default. If a future update changes the layout for any kernel, we can introduce per-kernel constants in rocm_aiter_fused_moe.py and use them where needed.
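The per-kernel-constant idea from the reply could look like the sketch below. `AITER_MOE_SHUFFLE_LAYOUT` is an assumed name (not from this PR), and the body is a dependency-free stand-in that only validates shapes; the real shuffle_weights reorders tensor memory for the AITER kernels.

```python
# Assumed constant name; a future change could add per-kernel variants here.
AITER_MOE_SHUFFLE_LAYOUT = (16, 16)


def shuffle_weights(*shapes, layout=AITER_MOE_SHUFFLE_LAYOUT):
    """Check that each weight shape tiles evenly by the AITER block layout.

    Stand-in for the real shuffle_weights: this sketch only checks
    divisibility so it stays dependency-free, while the actual function
    permutes tensor data into the blocked layout.
    """
    block_n, block_k = layout
    for rows, cols in shapes:
        if rows % block_n or cols % block_k:
            raise ValueError(
                f"shape {(rows, cols)} is not divisible by layout {layout}")
    return shapes
```

With a named default, the call site in layer.py would no longer carry a bare `(16, 16)` literal, addressing the reviewer's readability concern.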


             layer.w13_weight.data = shuffled_w13
             layer.w2_weight.data = shuffled_w2