6 changes: 5 additions & 1 deletion vllm_ascend/ops/triton/gdn_chunk_meta.py
@@ -16,6 +16,8 @@
 import torch
 from vllm.triton_utils import tl, triton
 
+from vllm_ascend.utils import is_310p
+
Contributor

high

Suggested PR Title:

[Ops][BugFix] Fix Ascend310P3 Triton compilation error in GDN chunk meta

Suggested PR Summary:

### What this PR does / why we need it?

This PR bypasses Triton kernel compilation for the `_build_final_chunk_indices` operation on Ascend 310P3 hardware. Since `bishengir-compile` does not currently support targeting Ascend 310P3, attempting to compile Triton kernels results in a CI error: `Cannot find option named 'Ascend310P3'`. A PyTorch-based fallback implementation is used instead.

Fixes #7756

### Does this PR introduce _any_ user-facing change?

No. This is a backend-specific fix for Ascend 310P3 hardware compatibility.

### How was this patch tested?

Verified that the logic correctly falls back to the PyTorch implementation when `is_310p()` is true, avoiding the Triton compilation error in CI.

This PR does not follow the Repository Style Guide for PR Title and Summary format. Please update the PR title and description accordingly.

References
  1. The PR title and summary must follow a specific format including [Branch][Module][Action] prefixes and a structured summary body.



 def _cdiv(x: int, y: int) -> int:
     triton_cdiv = getattr(triton, "cdiv", None)
@@ -156,7 +158,9 @@ def _build_final_chunk_indices(
     out_final_chunk_indices: torch.Tensor,
 ) -> None:
     num_seqs = chunk_counts.shape[0]
-    if hasattr(_build_final_chunk_indices_kernel, "__getitem__"):
+    # 310P does not support Triton kernel compilation (bishengir-compile
+    # cannot target Ascend310P), so always use the PyTorch fallback path.
+    if not is_310p() and hasattr(_build_final_chunk_indices_kernel, "__getitem__"):
         block_size = 256
         grid = (_cdiv(num_seqs, block_size),)
         _build_final_chunk_indices_kernel[grid](
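The guard introduced in this diff can be illustrated in isolation. The following is a minimal sketch, not the code from `gdn_chunk_meta.py`: `FakeKernel`, `choose_path`, and `run` are hypothetical names, the hardware check is passed in as a plain boolean rather than calling `is_310p()`, and the PyTorch fallback is reduced to a returned label. It only shows the ordering of the two conditions and how the 1-D grid size is derived by ceiling division.

```python
from typing import Callable, List, Tuple


def cdiv(x: int, y: int) -> int:
    """Ceiling division, used here to size a 1-D launch grid."""
    return (x + y - 1) // y


class FakeKernel:
    """Hypothetical stand-in for a launchable kernel (supports kernel[grid](...))."""

    def __init__(self) -> None:
        self.launches: List[Tuple[int, ...]] = []

    def __getitem__(self, grid: Tuple[int, ...]) -> Callable[..., None]:
        def launch(*args: object) -> None:
            # Record the grid instead of running anything on a device.
            self.launches.append(grid)

        return launch


def choose_path(kernel: object, on_310p: bool) -> str:
    # The hardware check comes first, so on 310P the kernel object is
    # never subscripted and no compilation can be triggered.
    if not on_310p and hasattr(kernel, "__getitem__"):
        return "triton"
    return "torch_fallback"


def run(kernel: object, num_seqs: int, on_310p: bool, block_size: int = 256) -> str:
    path = choose_path(kernel, on_310p)
    if path == "triton":
        grid = (cdiv(num_seqs, block_size),)
        kernel[grid](num_seqs)
    # In the real module the fallback branch would do the work in PyTorch;
    # this sketch only reports which branch was selected.
    return path
```

With `num_seqs = 1000` and `block_size = 256`, the sketch launches a grid of 4 blocks when the hardware flag is false, and launches nothing when it is true.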