
Cherry-picks to enable Llama4 Maverick#882

Merged
wpyszka merged 4 commits into vllm-project:releases/v0.14.1 from rsmyrek:llama4_maverick_enablement on Jan 28, 2026
Conversation

rsmyrek and others added 4 commits January 27, 2026 01:18
Following the reasoning stated in PR vllm-project#616.

Signed-off-by: Radoslaw Smyrek <radoslawx.smyrek@intel.com>
…ect#837)

Signed-off-by: linoy buchnik <lbuchnik@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
…llm-project#855)

For `max_model_len > 32k`, Llama4 enables temperature adjustment:
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama4.py#L719.
The enabled adjustment changes the shape of tensor `q` from 2D to 3D:
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama4.py#L307.
This tensor is passed to `UnquantizedFusedMoEMethod.forward`:
https://github.com/vllm-project/vllm-gaudi/blob/main/vllm_gaudi/ops/hpu_fused_moe.py#L163,
causing an invalid reshape: we were trying to return a 3D `output.view` based
on a 2D output tensor.
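The shape mismatch described above can be illustrated with a minimal sketch (NumPy stand-in for the actual PyTorch/HPU code; `moe_forward` and its body are hypothetical, not the real `UnquantizedFusedMoEMethod` implementation): a MoE forward that flattens its input to 2D for the expert kernels must restore the caller's original shape rather than assume the input was 2D.

```python
import numpy as np

# Hypothetical sketch of the fix: the hidden-states tensor may arrive
# as 2D (tokens, hidden) or, with temperature adjustment enabled, as
# 3D (batch, seq, hidden). The forward flattens to 2D for the expert
# computation, then restores whatever rank the caller passed in.
def moe_forward(hidden_states: np.ndarray) -> np.ndarray:
    orig_shape = hidden_states.shape            # remember 2D or 3D shape
    hidden_dim = orig_shape[-1]
    flat = hidden_states.reshape(-1, hidden_dim)  # expert kernels expect 2D
    out = flat * 1.0                              # stand-in for expert compute
    return out.reshape(orig_shape)                # restore the original rank

# Works for both ranks; reshaping to a hard-coded 2D shape here would
# fail (or silently mangle) the 3D case.
x3d = np.ones((2, 4, 8))
x2d = np.ones((6, 8))
print(moe_forward(x3d).shape, moe_forward(x2d).shape)
```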

Found that the following PRs introduced the bug: vllm-project#680 and vllm-project#684.

Cherry-picked from `releases/v0.13.0`

---------

Signed-off-by: Artur Fierka <artur.fierka@intel.com>
Signed-off-by: Radoslaw Smyrek <radoslawx.smyrek@intel.com>
@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

This is a Draft PR. Please mark it as 'Ready for Review' to trigger the CI.

@rsmyrek rsmyrek marked this pull request as ready for review January 26, 2026 23:54
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
d7de043d55d1dd629554467e23874097e1c48993

Collaborator

@afierka-intel afierka-intel left a comment


LGTM

Collaborator

@wpyszka wpyszka left a comment


0.14.1 approved

@wpyszka wpyszka merged commit 9ab497d into vllm-project:releases/v0.14.1 Jan 28, 2026
64 of 65 checks passed
wpyszka pushed a commit that referenced this pull request Jan 29, 2026
Reverts part of: #882

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
slokesha pushed a commit to libinta/vllm-gaudi that referenced this pull request Jan 29, 2026
1. vllm-project#805
2. vllm-project#837
3. vllm-project#855
4. vllm-project#862

---------

Signed-off-by: Radoslaw Smyrek <radoslawx.smyrek@intel.com>
Signed-off-by: linoy buchnik <lbuchnik@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Artur Fierka <artur.fierka@intel.com>
Co-authored-by: Linoy Buchnik <linoybu@gmail.com>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Artur Fierka <artur.fierka@intel.com>
Signed-off-by: slokesha <slokeshappa@habana.ai>
slokesha pushed a commit to libinta/vllm-gaudi that referenced this pull request Jan 29, 2026
Reverts part of: vllm-project#882

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Signed-off-by: slokesha <slokeshappa@habana.ai>
@rsmyrek rsmyrek deleted the llama4_maverick_enablement branch February 13, 2026 04:03


6 participants