
[CI/Build][Intel] Add new performance benchmarks for Intel Gaudi 3 #31025

Merged
jikunshang merged 4 commits into vllm-project:main from simonreginis:sreginis/new_benchmarks_dec on Mar 3, 2026

Conversation

@simonreginis (Contributor) commented on Dec 19, 2025

Purpose

Add new performance benchmarks for the Intel Gaudi 3 accelerator, including latency, throughput, and serving test suites for the DeepSeek-R1, Llama-4-Maverick-17B-128E-Instruct-FP8, and Qwen-3-8B models, with HPU-specific optimizations. A sketch of the test-file format is shown below.
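For context, these suites are driven by JSON test files (latency-tests-hpu.json, throughput-tests-hpu.json, serving-tests-hpu.json). The sketch below is a minimal illustration, not the contents of this PR: the field names follow the pattern of vLLM's existing latency-tests.json, and the test name and values are assumptions. A latency entry might look like:

```json
[
  {
    "test_name": "latency_qwen3_8b_tp1",
    "parameters": {
      "model": "Qwen/Qwen3-8B",
      "tensor_parallel_size": 1,
      "load_format": "dummy",
      "num_iters_warmup": 5,
      "num_iters": 15
    }
  }
]
```

In the upstream suite, the benchmark harness expands each key under "parameters" into a CLI flag for the latency benchmark script, so adding a model or an HPU-specific knob is a data-only change.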

Test Plan

Models tested: DeepSeek-R1, Llama-4-Maverick-17B-128E-Instruct-FP8, Qwen-3-8B
Scenarios: throughput, latency, and serving

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

The mergify bot added the ci/build and performance labels on Dec 19, 2025.
@gemini-code-assist bot left a comment


Code Review

This pull request adds new performance benchmarks for Intel Gaudi 3, specifically for DeepSeek-R1, Llama-4-Maverick-17B-128E-Instruct-FP8, and Qwen-3-8B models. The changes look good overall, but I've found a few critical issues in the benchmark configuration files that could cause failures or incorrect results. These include missing quantization settings for FP8 models, incorrect model names for client tokenizers, and typos in model identifiers. Please see the detailed comments for suggestions on how to fix them.
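To make the flagged failure modes concrete, here is a hypothetical serving entry, not the PR's actual diff; the field names mirror the shape of vLLM's existing serving-tests.json, and all values are illustrative assumptions. The server and client model names must agree:

```json
{
  "test_name": "serving_llama4_maverick_fp8_tp8",
  "qps_list": [1, 4, 16, "inf"],
  "server_parameters": {
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "tensor_parallel_size": 8,
    "disable_log_stats": ""
  },
  "client_parameters": {
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "backend": "vllm",
    "dataset_name": "sharegpt",
    "num_prompts": 200
  }
}
```

A mismatched model name under "client_parameters" makes the benchmark client load the wrong tokenizer (or fail outright), skewing the reported token counts; a pre-quantized FP8 checkpoint may additionally need an explicit quantization setting on HPU, which is the class of issue listed above.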

Comment threads:
- .buildkite/performance-benchmarks/tests/latency-tests-hpu.json (1 thread)
- .buildkite/performance-benchmarks/tests/serving-tests-hpu.json (4 threads, 1 outdated)
- .buildkite/performance-benchmarks/tests/throughput-tests-hpu.json (2 threads)
@simonreginis force-pushed the sreginis/new_benchmarks_dec branch from cc5e80a to d4e41d9 on December 19, 2025 11:32
@github-actions bot commented

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, a small and essential subset of CI tests that quickly catches errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@simonreginis force-pushed the sreginis/new_benchmarks_dec branch 3 times, most recently from 7f3ad25 to 9513d3b on December 23, 2025 11:11
@simonreginis force-pushed the sreginis/new_benchmarks_dec branch from 9513d3b to f737407 on February 4, 2026 12:12
- DeepSeek-R1
- Llama-4-Maverick-17B-128E-Instruct-FP8
- Qwen-3-8B

Signed-off-by: Szymon Reginis <sreginis@habana.ai>
@simonreginis force-pushed the sreginis/new_benchmarks_dec branch from f737407 to 92c454c on February 4, 2026 12:56
@simonreginis marked this pull request as ready for review on February 4, 2026 12:58
@simonreginis (Contributor, Author) commented

This is a continuation of @jakub-sochacki's PR #26919.

The related PR in pytorch-integration-testing is merged:
pytorch/pytorch-integration-testing#121

@xuechendi @jikunshang Please review.

@PatrykWo (Contributor) commented

@huydhn 👆

@jikunshang (Collaborator) left a comment


LGTM! Triggering HPU CI to check status.

@jikunshang added the ready label on Feb 24, 2026.
@jikunshang merged commit 4beebfd into vllm-project:main on Mar 3, 2026.
13 checks passed
Copilot AI pushed a commit to machov/vllm that referenced this pull request on Mar 10, 2026:
[CI/Build][Intel] Add new performance benchmarks for Intel Gaudi 3 (vllm-project#31025)
Signed-off-by: Szymon Reginis <sreginis@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>

avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request on Mar 12, 2026:
[CI/Build][Intel] Add new performance benchmarks for Intel Gaudi 3 (vllm-project#31025)
Signed-off-by: Szymon Reginis <sreginis@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>

wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request on Mar 18, 2026:
[CI/Build][Intel] Add new performance benchmarks for Intel Gaudi 3 (vllm-project#31025)
Signed-off-by: Szymon Reginis <sreginis@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>

Labels

ci/build, performance, ready
