[LV] Add initial support for partial alias masking #177599

Draft
MacDue wants to merge 3 commits into llvm:main from MacDue:whilewr_lv_v2

Conversation

Member

@MacDue MacDue commented Jan 23, 2026

This patch adds initial support for partial alias masking, which allows entering the vector loop even when there is aliasing within a single vector iteration. It does this by clamping the VF to the safe distance between pointers. This allows the runtime VF to be anywhere from 2 to the "static" VF.

Conceptually, this transform looks like:

  // `c` and `b` may alias.
  for (int i = 0; i < n; i++) {
    c[i] = a[i] + b[i];
  }

->

  svbool_t alias_mask = loop.dependence.war.mask(b, c);
  int num_active = num_active_lanes(alias_mask);
  if (num_active >= 2) {
    for (int i = 0; i < n; i += num_active) {
      // ... vector loop masked with `alias_mask`
    }
  }
  // ... scalar tail
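The `num_active` value above corresponds to clamping the static VF to the safe distance between the read and write pointers. A minimal C sketch of that clamp follows; the function name and the WHILEWR-style write-after-read semantics are assumptions for illustration, not the patch's actual code:

```c
#include <stddef.h>

/* Hypothetical sketch (not the actual LLVM implementation): derive the
 * runtime lane count from the byte distance between the read and write
 * pointers, assuming WHILEWR-like write-after-read semantics. A lane is
 * safe while the write does not land on an element that a higher lane
 * still needs to read within the same vector iteration. */
size_t partial_alias_lanes(const char *read_p, const char *write_p,
                           size_t elem_size, size_t vf) {
  ptrdiff_t diff = write_p - read_p;
  if (diff <= 0)
    return vf; /* write is at or before the read: no WAR hazard */
  size_t safe = (size_t)diff / elem_size;
  return safe < vf ? safe : vf; /* clamp the static VF to the safe distance */
}
```

When the safe distance drops below 2 lanes, the `num_active >= 2` guard in the transform above routes execution to the scalar tail instead of the vector loop.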

Alias masking can be used both with and without tail folding; however, the current patch has a few limitations:

  • Currently, the mask and transform are only valid for IC = 1
    • Some recipes may not handle the "ClampedVF" correctly at IC > 1
    • Note: On AArch64, we also only have native alias mask instructions for IC = 1
  • Reverse iteration is not supported
    • The mask reversal logic is not correct for the alias mask (or clamped ALM)
  • This style of vectorization is not enabled by default/costed
    • It can be enabled with -force-partial-aliasing-vectorization
    • When enabled, alias masking is used instead of the standard diff checks (when legal to do so)

This PR supersedes #100579 (closes #100579).


Labels

llvm:analysis (value tracking, cost tables and constant folding), llvm:transforms, vectorizers


6 participants