Merged
2 changes: 1 addition & 1 deletion python/sglang/srt/layers/sampler.py
@@ -91,7 +91,7 @@ def forward(
)
else:
batch_next_token_ids = top_k_top_p_sampling_from_probs(
- probs,
+ probs.contiguous(),
Contributor

medium

Calling .contiguous() here is a good safeguard to ensure the probs tensor meets the memory layout requirements of the top_k_top_p_sampling_from_probs kernel, which is likely a custom C++/CUDA kernel from sgl_kernel that expects contiguous inputs. This prevents potential runtime errors or incorrect behavior if probs becomes non-contiguous due to upstream operations.

To improve clarity and maintainability, consider adding a brief inline comment explaining why this call is necessary. This helps future developers understand the requirement, especially since .contiguous() returns the tensor unchanged when it is already contiguous but allocates and copies the data when it is not, which adds overhead on that path.

For example:

# Ensure probs is contiguous, as required by the underlying sgl_kernel
# for the 'flashinfer' backend to prevent potential errors.
probs.contiguous(),
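
To make the layout requirement concrete, here is a minimal pure-Python sketch of what "contiguous" means in terms of shapes and strides. It is only an illustrative model, not PyTorch's or sgl_kernel's implementation: the helper names (`row_major_strides`, `is_contiguous`, `transpose_2d`, `make_contiguous`) are hypothetical, strides are counted in elements, and edge cases such as empty tensors are ignored.

```python
def row_major_strides(shape):
    """Strides (in elements) of a densely packed row-major tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_contiguous(shape, strides):
    """True when the strides match the dense row-major layout.

    Dimensions of size 1 are skipped, since their stride never
    affects which element is addressed.
    """
    expected = row_major_strides(shape)
    return all(
        dim == 1 or actual == want
        for dim, actual, want in zip(shape, strides, expected)
    )


def transpose_2d(shape, strides):
    """Swap both axes as a view: only metadata changes, no data moves."""
    return (shape[1], shape[0]), (strides[1], strides[0])


def make_contiguous(shape, strides):
    """Model the contract of .contiguous() (assumed simplification).

    Returns (shape, strides, copied): a no-op when the layout is
    already dense, otherwise a repacked layout that implies a copy.
    """
    if is_contiguous(shape, strides):
        return shape, strides, False
    return shape, row_major_strides(shape), True


# A (4, 8) tensor starts out dense; transposing it yields a strided view.
shape, strides = (4, 8), row_major_strides((4, 8))
t_shape, t_strides = transpose_2d(shape, strides)
print(is_contiguous(shape, strides))      # dense layout
print(is_contiguous(t_shape, t_strides))  # strided view
print(make_contiguous(t_shape, t_strides))
```

A kernel that indexes its input as if it were densely packed would read the wrong elements from the transposed view, which is the failure mode the `.contiguous()` call guards against. In this model, the copy is paid only on the non-dense path.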

sampling_info.top_ks,
sampling_info.top_ps,
filter_apply_order="joint",