
[TRITON] unified attention improvements #1259

Merged
cagrikymk merged 1 commit into 355_wip_triton from cagri/unif_attn_improvements on Oct 24, 2025
Conversation

@cagrikymk
Contributor

@cagrikymk cagrikymk commented Oct 24, 2025

This PR:

  • Reorganizes the unified attention configuration-related code
  • Introduces KV cache tiling (following vLLM)
  • Fixes configuration issues on MI300
  • Changes tl.exp to tl.exp2 with the corresponding scaling
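The tl.exp to tl.exp2 change in the last bullet relies on the identity exp(x) = 2^(x · log2 e); on GPUs, exp2 typically lowers to a cheaper hardware instruction than exp, and the log2(e) factor can be folded into the softmax scale once per kernel. A minimal NumPy sketch of the idea (an illustration of the math, not the actual Triton kernel; the variable names here are hypothetical):

```python
import numpy as np

# Identity: exp(x) == 2**(x * log2(e)). Attention kernels can
# pre-multiply the existing softmax scale by log2(e) so the inner
# loop only ever calls exp2.
LOG2_E = 1.4426950408889634  # log2(e)

rng = np.random.default_rng(0)
q = rng.random((4, 8)).astype(np.float32)  # toy query block
k = rng.random((4, 8)).astype(np.float32)  # toy key block
sm_scale = 1.0 / np.sqrt(q.shape[-1])      # standard 1/sqrt(d) scale

scores = q @ k.T

# Reference path: plain exp with the softmax scale.
ref = np.exp(scores * sm_scale)

# exp2 path: fold log2(e) into the scale once, then use exp2.
out = np.exp2(scores * (sm_scale * LOG2_E))

print(np.allclose(ref, out, atol=1e-6))  # True
```

In the real kernel the same folding applies to the running-max subtraction in the online softmax, so numerical behavior is preserved while every transcendental call becomes an exp2.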

@cagrikymk cagrikymk changed the title [TRITON] unified attn. reorg., fixes, exp2 update [TRITON] unified attention improvements Oct 24, 2025
@cagrikymk cagrikymk merged commit 4cc7998 into 355_wip_triton Oct 24, 2025
5 checks passed
@cagrikymk cagrikymk deleted the cagri/unif_attn_improvements branch October 24, 2025 16:10
@sunway513
Collaborator

@cagrikymk can you also submit PRs to AITER main branch?

@cagrikymk
Contributor Author

@sunway513 We are working on upstreaming Triton-related changes via the 355_wip_triton branch.
