GPT OSS Integration Code #771
Signed-off-by: Himangshu Lahkar <hlahkar@habana.ai>
Pull request overview
This PR integrates support for the GPT OSS model type, adding model-specific expert routing logic, bias support in MoE layers, and attention sink mechanisms for improved inference.
- Adds GPT OSS-specific expert routing and softmax handling in the MoE forward pass
- Implements bias support throughout the MoE pipeline
- Introduces attention sink functionality across attention backends and operations
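For context, here is a minimal sketch of top-k-then-softmax expert routing of the kind the overview describes: experts are selected first, and softmax is applied only over the selected logits. Whether this matches the PR's exact routing logic is an assumption, and all names below are illustrative, not the PR's actual identifiers.

```python
# Sketch of GPT OSS-style routing (assumed: top-k selection, then softmax
# over only the selected router logits). Illustrative names throughout.
import torch


def gpt_oss_route(router_logits: torch.Tensor, top_k: int):
    """router_logits: [num_tokens, num_experts] -> (weights, expert_ids)."""
    # Pick the k highest-scoring experts per token.
    topk_logits, topk_ids = torch.topk(router_logits, top_k, dim=-1)
    # Softmax over the selected logits only, unlike the common
    # softmax-then-topk ordering.
    topk_weights = torch.softmax(topk_logits, dim=-1)
    return topk_weights, topk_ids


# Example: 4 tokens routed across 8 experts, 2 experts per token.
logits = torch.randn(4, 8)
weights, ids = gpt_oss_route(logits, top_k=2)
assert torch.allclose(weights.sum(-1), torch.ones(4))
```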
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Summary per file:
| File | Description |
|---|---|
| vllm_gaudi/v1/worker/hpu_model_runner.py | Increases sliding window block size calculation by 1 |
| vllm_gaudi/ops/hpu_fused_moe.py | Adds GPT OSS model type detection, bias handling in MoE operations, and model-specific expert routing |
| vllm_gaudi/extension/utils.py | Adds sinks parameter support to forward pass |
| vllm_gaudi/extension/ops.py | Implements sink attention mechanism in pipelined and naive attention functions, adds bias support to MoE operations |
| vllm_gaudi/attention/backends/hpu_attn.py | Adds sinks parameter and dtype consistency checks in attention implementation |
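As a reference point for the ops changes listed above, the following is a hedged sketch of how an attention sink can be folded into a naive attention implementation: a learned per-head sink logit joins the softmax (absorbing attention mass) but contributes no value vector. This is a sketch under assumed shapes, not the actual `vllm_gaudi/extension/ops.py` code.

```python
# Illustrative sink attention: one "virtual" key position per head whose
# probability mass is discarded after the softmax. Names are assumptions.
import torch


def naive_attention_with_sinks(q, k, v, sinks, scale):
    """q: [heads, q_len, d], k/v: [heads, kv_len, d], sinks: [heads]."""
    scores = torch.matmul(q, k.transpose(-1, -2)) * scale  # [h, q_len, kv_len]
    # Append one sink logit per head as an extra key column.
    sink_col = sinks.view(-1, 1, 1).expand(-1, scores.shape[1], 1)
    probs = torch.softmax(torch.cat([scores, sink_col], dim=-1), dim=-1)
    # Drop the sink column: its mass simply dampens the remaining weights.
    return torch.matmul(probs[..., :-1], v)


h, ql, kl, d = 2, 3, 5, 4
out = naive_attention_with_sinks(
    torch.randn(h, ql, d), torch.randn(h, kl, d),
    torch.randn(h, kl, d), torch.zeros(h), d ** -0.5)
assert out.shape == (h, ql, d)
```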
🚧 CI Blocked: The main CI workflow was not started for the following reason:

/run-gaudi-tests
/run-gaudi-tests

Only codeowners and testowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, adobrzyn, mgawarkiewicz-intel, afierka-intel, michalkuligowski, iboiko-habana, kamil-kaczor, ksmusz, PatrykWo, kfojcik-intel, wuxun-zhang, attafosu, ulivne, Kacper-Pietkun, jkaniecki, jbyczkow, wpyszka
🚧 CI Blocked: The main CI workflow was not started for the following reason:

/run-gaudi-tests
…llm-project#855) For `max_model_len > 32k`, Llama4 enables temperature adjustment: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama4.py#L719. The enabled adjustment changes tensor `q` from 2D to 3D: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama4.py#L307. This tensor is passed to `UnquantizedFusedMoEMethod -> forward`: https://github.com/vllm-project/vllm-gaudi/blob/main/vllm_gaudi/ops/hpu_fused_moe.py#L163, causing invalid reshaping: we try to return a 3D `output.view` based on a 2D output tensor. Found that the following PRs introduced the bug: vllm-project#680 and vllm-project#684. Cherry-picked from `releases/v0.13.0` --------- Signed-off-by: Artur Fierka <artur.fierka@intel.com>
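To make the shape bug concrete, here is a minimal sketch of the flatten-to-2D-then-restore pattern that avoids the invalid `output.view` when the caller passes a 3D tensor; `run_fused_moe` is a hypothetical stand-in, not the real HPU fused-MoE kernel.

```python
# Shape-safe MoE forward sketch (assumed pattern, illustrative names):
# flatten any leading dims to 2D for the kernel, then restore the
# caller's original shape so view() is valid for 2D and 3D inputs alike.
import torch


def run_fused_moe(x: torch.Tensor) -> torch.Tensor:
    return x  # placeholder for the real fused-MoE computation


def moe_forward(hidden_states: torch.Tensor) -> torch.Tensor:
    orig_shape = hidden_states.shape                 # may be 2D or 3D
    x2d = hidden_states.reshape(-1, orig_shape[-1])  # [tokens, hidden]
    out2d = run_fused_moe(x2d)                       # kernel sees 2D only
    return out2d.view(orig_shape)                    # restore caller shape


x3d = torch.randn(2, 3, 8)  # e.g. [batch, seq, hidden] from Llama4
assert moe_forward(x3d).shape == x3d.shape
```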
…vllm-project#852) Signed-off-by: Dudi Lester <dlester@habana.ai> Co-authored-by: Kamil Kaczor <kamil.kaczor@intel.com>
Reverts vllm-project#780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Force-pushed from 8416f2f to f3a4560.
🚧 CI Blocked: The main CI workflow was not started for the following reason:

/run-gaudi-tests
PR is taken care of through #887.
No description provided.