[Platform] Add current_platform.num_compute_units interface#35042
vllm-bot merged 10 commits into vllm-project:main from ...
Conversation
Code Review
The pull request introduces a new get_num_sm interface to abstract the retrieval of SM (Streaming Multiprocessor) counts across different platforms (CUDA, ROCm, XPU). This change unifies the way SM counts are obtained, making the codebase cleaner and more extensible for non-CUDA hardware. The existing torch.cuda.get_device_properties().multi_processor_count calls are replaced with the new platform-agnostic interface. This is a good step towards improving platform independence and maintainability.
Thanks @jikunshang! LGTM, but I wonder if we should name it something more generic, since SM is a CUDA term AFAIK. Maybe ...
Hi @jikunshang, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.
@njhill seems ...
njhill left a comment
I prefer to rename to get_num_compute_units, thoughts?
How about just compute_unit_count, since there is already device_count?
Or if not, then I think just num_compute_units would be better.
```diff
 if allspark_supported:
-    properties = torch.cuda.get_device_properties(b.device.index)
-    sm_count = properties.multi_processor_count
+    sm_count = current_platform.get_num_compute_units(b.device.index)
```
Probably doesn't make sense to change it in places like this, which are CUDA-specific anyway.
I think it's OK to always use the current_platform API here. We should avoid/reduce using torch.cuda APIs, since we are proposing this RFC (#30679) (though non-CUDA platforms will never fall into the current code path).

My understanding is that the property is used later to check CUDA capability. I feel we could also refactor that part into something like current_platform.supported_arch.
I just think that in places like this we are already using torch.cuda.get_device_properties (the line above), so it's better to keep the code consistent (like you say, it can always be refactored for portability in the future if/when appropriate).

Similarly, if it's in CUDA-, ROCm-, or XPU-specific files/code, then I don't think there's a need to use the current_platform version; in fact, it may be better not to, since it implies the code could be cross-platform, which could be misleading.
Got it, reverted :)
I feel `count` is for something countable where you can choose whether to use it, while we always want to use all compute units. So let's use ...
Also remove ...
Nice catch, done!
Did this PR break Distributed Tests (2 GPUs)?
torch.cuda.current_device() returns an int directly, not a device object with an .index attribute. This was introduced in vllm-project#35042.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Oh yes, sorry for that, and thanks @LucasWilkinson for fixing!
…ject#35042) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
…ect#35042 (vllm-project#37764) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Purpose

There are some torch.cuda.get_device_properties().multi_processor_count calls across the vLLM code base. We can unify them into a current_platform.num_compute_units interface to make the code clean and extensible for non-CUDA hardware like XPU and NPU.

Test Plan
Test Result