[FlexAttention] allow custom mask mod#37692

Merged
zou3519 merged 1 commit into vllm-project:main from liangel-02:flex
Mar 24, 2026

Conversation

@liangel-02
Contributor

@liangel-02 liangel-02 commented Mar 20, 2026

Updates the FlexAttention implementation to accept a custom mask mod from users.
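For context, a FlexAttention mask mod is a callable over (batch, head, query, key) indices that returns whether a query position may attend to a key position. A minimal sketch of the convention, written with plain integers for illustration (the real PyTorch API passes index tensors; `prefix_lm_mask_mod` is a hypothetical example, not part of this PR):

```python
# Sketch of the mask-mod convention used by FlexAttention:
# fn(b, h, q_idx, kv_idx) -> bool, True meaning "q_idx may attend to kv_idx".
# Plain ints are used here for illustration; the real API passes index tensors.

def causal_mask_mod(b, h, q_idx, kv_idx):
    # Standard decoder mask: a token attends to itself and earlier tokens.
    return q_idx >= kv_idx

def prefix_lm_mask_mod(prefix_len):
    # Hypothetical custom mask mod: bidirectional within a prefix,
    # causal afterwards (a prefix-LM attention pattern).
    def mask_mod(b, h, q_idx, kv_idx):
        return kv_idx < prefix_len or q_idx >= kv_idx
    return mask_mod

mm = prefix_lm_mask_mod(prefix_len=2)
assert mm(0, 0, 0, 1)       # inside the prefix: may look ahead
assert not mm(0, 0, 2, 3)   # outside the prefix: causal only
```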

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new block_sparsity_hint parameter to the FlexAttentionMetadata class and modifies the attention mechanism to allow for custom mask modifications. The changes aim to provide more flexibility in defining attention patterns, including support for custom sparsity hints. The code has been reviewed and a critical issue has been identified.

# (causal mask for decoder or bidirectional mask for encoder)
if self.causal:
has_custom_mask = self.logical_mask_mod is not causal_mask_mod
if self.causal or has_custom_mask:
Contributor


critical

The condition self.causal or has_custom_mask will always evaluate to True if has_custom_mask is True. This means that the code will always use self.get_causal_mask_mod() when a custom mask is present, regardless of the value of self.causal. This might not be the intended behavior, as the user might want to use a bidirectional mask with a custom modification. This could lead to unexpected or incorrect attention patterns.

To fix this, the logic should ensure that self.causal is only considered when a custom mask is not present. If a custom mask is present, it should override the causal mask behavior.

Suggested change
if self.causal or has_custom_mask:
if has_custom_mask:
mask_mod = self.logical_mask_mod
elif self.causal:
mask_mod = self.get_causal_mask_mod()
else:
mask_mod = self.get_bidirectional_mask_mod()
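The precedence the suggestion encodes can be exercised in isolation. A toy sketch (stub mask mods and a free function standing in for vLLM's `FlexAttentionMetadata`; names mirror the diff but are not the actual implementation) showing that a custom mask mod overrides the causal flag:

```python
# Toy model of the suggested selection logic: a custom logical_mask_mod
# takes precedence over the causal flag.

def causal(b, h, q, kv):
    return q >= kv

def bidirectional(b, h, q, kv):
    return True

def select_mask_mod(logical_mask_mod, is_causal):
    # Mirrors the suggested change: custom mod wins, even when is_causal.
    has_custom_mask = logical_mask_mod is not causal
    if has_custom_mask:
        return logical_mask_mod
    if is_causal:
        return causal
    return bidirectional

custom = lambda b, h, q, kv: abs(q - kv) <= 1  # e.g. a local-window mod
assert select_mask_mod(custom, is_causal=True) is custom
assert select_mask_mod(causal, is_causal=True) is causal
assert select_mask_mod(causal, is_causal=False) is bidirectional
```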

Collaborator

@LucasWilkinson LucasWilkinson left a comment


@drisspg do you think you can help review this?

causal_sliding_window = self.sliding_window and self.causal
custom_hint = self.block_sparsity_hint is not None

if causal_sliding_window or custom_hint:
Contributor


nit: looking at this again, do we even need causal to be true? I take it it always is, but if we have a lookback window the same logical truncation applies :thinking:
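The point above can be sketched directly: a sliding window bounds how far apart query and key may be, whether or not the mask is also causal, so the same block-level truncation applies either way. A plain-integer sketch (illustrative only, not vLLM's implementation):

```python
# Sketch: a sliding window truncates attention independently of causality.

def sliding_window_mask_mod(window, is_causal):
    def mask_mod(b, h, q_idx, kv_idx):
        if is_causal:
            # Causal variant: look back at most `window` positions.
            return 0 <= q_idx - kv_idx <= window
        # Bidirectional variant: truncate in both directions.
        return abs(q_idx - kv_idx) <= window
    return mask_mod

causal_sw = sliding_window_mask_mod(window=2, is_causal=True)
bidir_sw = sliding_window_mask_mod(window=2, is_causal=False)
assert causal_sw(0, 0, 5, 3) and not causal_sw(0, 0, 5, 2)
# Even without causal, far-apart block pairs are still fully masked,
# so the same sparsity shortcut would apply.
assert bidir_sw(0, 0, 3, 5) and not bidir_sw(0, 0, 2, 5)
```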

self.mask_mod = self.get_mask_mod()
self.transformed_score_mod = self.get_transformed_score_mod()

if self.direct_build and self.causal:
Contributor


Confirming: this is intentional, right?

Contributor Author


Yeah, instead of getting built in the post-init, I moved it to the forward pass to avoid needing to rebuild it if it's different per layer for custom mask mods.

Contributor

@drisspg drisspg left a comment


A few things:

  1. Describe the sparsity hint in more detail (its shape, attributes, etc.). Maybe make it a NamedTuple.
  2. Add a small test showing how it is used, it seems like both at the per-layer and per-model level.

Can you confirm where direct_build gets set these days? Is it expected that it will always work for custom mask mods?
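One way to act on point 1 is a small NamedTuple. The sketch below is hypothetical: the field names and semantics are assumptions for illustration, not vLLM's actual `block_sparsity_hint` definition.

```python
from typing import Callable, NamedTuple, Optional

class BlockSparsityHint(NamedTuple):
    """Hypothetical container for a block-sparsity hint; field names
    here are illustrative guesses, not vLLM's actual definition."""
    num_q_blocks: int    # number of query blocks in the hint grid
    num_kv_blocks: int   # number of key/value blocks
    block_size: int      # tokens per block
    # Optional predicate over (q_block, kv_block) saying whether a block
    # pair can contain any unmasked entries.
    block_visible: Optional[Callable[[int, int], bool]] = None

hint = BlockSparsityHint(
    num_q_blocks=4, num_kv_blocks=4, block_size=128,
    block_visible=lambda qb, kvb: qb >= kvb,  # block-causal pattern
)
assert hint.block_size == 128
assert hint.block_visible(3, 1) and not hint.block_visible(1, 3)
```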

@liangel-02 liangel-02 force-pushed the flex branch 2 times, most recently from c489993 to 907d60d Compare March 24, 2026 01:57
@liangel-02 liangel-02 force-pushed the flex branch 4 times, most recently from 0ae34bf to 49cc770 Compare March 24, 2026 02:10
device = torch.device("cuda")

vllm_config = create_vllm_config(
model_name="meta-llama/Meta-Llama-3-8B",
Contributor

@drisspg drisspg Mar 24, 2026


Probably use a smaller one for CI. Well, I guess it's never actually run, so choose whatever makes a small config.

Collaborator


+1, it's not clear to me what this is for

Contributor Author


I need vllm_config for FlexAttentionMetadataBuilder, but the size of the model doesn't affect the actual test since it's never loaded. I changed it to a smaller config.

Contributor

@drisspg drisspg left a comment


Looks good

Collaborator

@zou3519 zou3519 left a comment


LGTM minus the testing nit

@zou3519 zou3519 added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) Mar 24, 2026
Signed-off-by: Angel Li <liangel@meta.com>
@zou3519 zou3519 merged commit 8c47fdf into vllm-project:main Mar 24, 2026
57 checks passed
RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request Mar 26, 2026
Signed-off-by: Angel Li <liangel@meta.com>
HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: Angel Li <liangel@meta.com>
malaiwah pushed a commit to malaiwah/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: Angel Li <liangel@meta.com>
Signed-off-by: Michel Belleau <michel.belleau@malaiwah.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: Angel Li <liangel@meta.com>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: Angel Li <liangel@meta.com>
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
nithinvc pushed a commit to nithinvc/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: Angel Li <liangel@meta.com>

Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026
Signed-off-by: Angel Li <liangel@meta.com>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026
Signed-off-by: Angel Li <liangel@meta.com>
Signed-off-by: Vinay Damodaran <vrdn@hey.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1


4 participants