
[fix] [whisper] ensure inputs are moved to the correct device before processing.#22293

Merged
yhyang201 merged 3 commits into sgl-project:main from AgainstEntropy:fix/whisper-device
Apr 8, 2026

Conversation

@AgainstEntropy
Collaborator

Motivation

Whisper was not covered by the lazy device transfer introduced in #22038, which caused a bug where the input features may not be transferred to the correct device.
The bug can be reproduced with the manual test here: https://github.com/sgl-project/sglang/blob/main/test/manual/test_whisper_cuda_graph.py

python test/manual/test_whisper_cuda_graph.py
  • main branch

    ...
    inputs_embeds = torch.nn.functional.gelu(self.conv1(input_features))
    ...
    return F.conv1d(
    ...
    RuntimeError: Expected all tensors to be on the same device, but got weight is on cuda:0, different from other tensors on cpu (when checking argument in method wrapper_CUDA___slow_conv2d_forward)
  • this PR
    everything good
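A minimal, self-contained sketch of the failure mode reported above (module sizes and tensor names here are illustrative, not the actual Whisper encoder): a Conv1d whose weights were moved to CUDA receives input features that were left on the CPU, and PyTorch raises the same "Expected all tensors to be on the same device" error until the inputs are moved first.

```python
import torch

# Stand-in for the encoder's first conv layer; shapes are illustrative only.
conv1 = torch.nn.Conv1d(in_channels=80, out_channels=4, kernel_size=3, padding=1)
input_features = torch.randn(1, 80, 10)  # created on CPU, as in the bug report

if torch.cuda.is_available():
    conv1 = conv1.cuda()  # weights now on cuda:0, inputs still on CPU
    try:
        torch.nn.functional.gelu(conv1(input_features))
    except RuntimeError as e:
        # Raises the device-mismatch RuntimeError shown in the traceback above.
        print("device mismatch:", e)
    # The fix in this PR: move the inputs to the module's device before use.
    input_features = input_features.to(conv1.weight.device)

out = torch.nn.functional.gelu(conv1(input_features))
print(out.shape)  # torch.Size([1, 4, 10])
```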

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.


@yhyang201
Collaborator

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Apr 8, 2026
        position_ids: torch.Tensor,
        forward_batch: ForwardBatch,
    ):
        device = self.conv1.weight.device
Collaborator


nit: device = next(self.parameters()).device is more robust
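A hedged sketch of the two device-lookup idioms being compared, on a toy encoder stand-in (the class and layer names here are illustrative, not the actual Whisper model). Reading the device off a specific submodule's weight works, but the reviewer's suggestion of taking it from any parameter survives refactors that rename or replace that submodule:

```python
import torch

class ToyEncoder(torch.nn.Module):
    """Illustrative stand-in for an encoder with a leading conv layer."""

    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv1d(80, 16, kernel_size=3, padding=1)

    def forward(self, input_features: torch.Tensor) -> torch.Tensor:
        # Idiom used in the PR: read the device off one submodule's weight.
        device = self.conv1.weight.device
        # Reviewer's suggestion: any parameter works, so this line does not
        # break if conv1 is later renamed or restructured.
        device = next(self.parameters()).device
        input_features = input_features.to(device)
        return torch.nn.functional.gelu(self.conv1(input_features))

enc = ToyEncoder()
out = enc(torch.randn(1, 80, 10))
print(out.shape)  # torch.Size([1, 16, 10])
```

Both idioms return the same device when the whole module lives on one device, which is the case after a plain `.cuda()` or `.to(device)` call on the model.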

JustinTong0323 added a commit to JustinTong0323/sglang that referenced this pull request Apr 8, 2026
Cherry-picked from sgl-project#22293. Ensures input features and position IDs are
moved to the correct device before encoder processing.
@yhyang201
Collaborator

All CUDA CI passed.

@yhyang201 yhyang201 merged commit ae8da14 into sgl-project:main Apr 8, 2026
441 of 527 checks passed
