Fix apply_bitmask logit for both CPU and triton versions when shape and stride doesn't match by Jialin · Pull Request #390 · mlc-ai/xgrammar

Jialin · 2025-08-09T00:23:45Z

Motivation

In vLLM, PR #19565 worked around Issue #19493 by switching from the Triton version to the torch.compile version.

Root cause

With some investigation, we found that the error would happen when logits.stride()[0] != logits.shape[-1].
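To make the condition concrete, here is a minimal pure-Python sketch of the addressing problem. It is not xgrammar code: the flat `buffer`, the `shape` and `stride` tuples, and both helper functions are hypothetical stand-ins for a 2-D logits view whose rows are padded in memory, so that the row stride exceeds the number of columns.

```python
# Hypothetical flat storage for a 2 x 3 logits view with one padding slot
# per row, so the row stride (4) differs from the row length (3).
buffer = [0.0, 0.1, 0.2, -1.0,   # row 0, then padding
          1.0, 1.1, 1.2, -1.0]   # row 1, then padding
shape = (2, 3)    # (rows, columns) of the logical view
stride = (4, 1)   # elements to skip per row / per column


def element_by_shape(row, col):
    # Assumes rows are densely packed; wrong when stride[0] != shape[-1].
    return buffer[row * shape[-1] + col]


def element_by_stride(row, col):
    # Uses the actual memory layout of the view.
    return buffer[row * stride[0] + col * stride[1]]


wrong = element_by_shape(1, 0)    # lands on row 0's padding slot
right = element_by_stride(1, 0)   # lands on the first element of row 1
```

With this layout, the shape-based offset for row 1 is 3, which still lies inside row 0's storage, while the stride-based offset is 4.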

Changes

Fix the CPU and Triton versions of apply_grammar_bitmask for this scenario
- use stride[0] instead of shape[-1] when accessing rows
Add a unit test to reproduce the errors
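The change above can be sketched in pure Python as follows. This is an illustration rather than the code in this PR: `apply_bitmask_strided` and its parameters are hypothetical, and the bitmask layout (32-bit words per row, a set bit meaning the token is allowed, masked logits set to negative infinity) is an assumption made for the example.

```python
import math

BITS_PER_WORD = 32  # assumed packing: one bit per token, 32 tokens per word


def apply_bitmask_strided(buffer, shape, row_stride, bitmask):
    """Mask disallowed tokens in place, locating rows via row_stride."""
    num_rows, vocab_size = shape
    for row in range(num_rows):
        row_start = row * row_stride  # the fix: not row * vocab_size
        for token in range(vocab_size):
            word = bitmask[row][token // BITS_PER_WORD]
            allowed = (word >> (token % BITS_PER_WORD)) & 1
            if not allowed:
                buffer[row_start + token] = -math.inf
    return buffer


# A 2 x 3 view stored with row stride 4 (one padding slot per row).
logits = [0.0, 0.1, 0.2, -1.0,
          1.0, 1.1, 1.2, -1.0]
# Row 0 allows tokens 0 and 2; row 1 allows only token 1.
masked = apply_bitmask_strided(logits, (2, 3), 4, [[0b101], [0b010]])
```

A test in the spirit of the one added here would build such a padded view, apply the mask, and check that only the disallowed tokens of each row change while the padding slots stay untouched.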

Tests

We've verified:

New unit tests fail on trunk
New unit tests pass with the change (with no new failures; however, many tests already fail on trunk: 47 failed in tests/python/test_token_bitmask_operations.py)

Followup

We will follow up with a benchmark comparison to decide whether to bring back the Triton version or use the new CUDA version on the vLLM side.

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Ubospica

Thank you so much for your contribution!

Strides were not considered previously. It's good that this PR fixes the issue.

In the future we will further use nanobind's ArrayList to simplify the logic of passing torch tensor to C++. See https://github.com/mlc-ai/xgrammar/blob/main/cpp/nanobind/nanobind.cc#L45. This can avoid the complex logic of passing shape, stride, and dtype, etc. separately to C++.

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Jialin · 2025-08-11T17:20:43Z

@Ubospica The pre-commit failure is addressed. We also tried to fix the CUDA implementation in #394.

Ubospica · 2025-08-11T22:57:23Z

@Jialin Thank you so much for your contribution!

Jialin added 3 commits August 8, 2025 14:45

Add unit test when shape[1] != stride[0] for apply_token_bitmask

2c8a55e

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Try to fix cpu impl

85eceb1

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Remove debug loggings

1498a47

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Ubospica approved these changes Aug 10, 2025


Fix Ruff check

1c8d915

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Ubospica merged commit c3da9db into mlc-ai:main Aug 11, 2025
38 checks passed

Jialin deleted the bitmask_mismatch branch August 13, 2025 04:17

Ubospica merged 4 commits into mlc-ai:main from Jialin:bitmask_mismatch