
Commit ce0bbf2

[PYTORCHDGQ-6865] Added RoPE support in Chunk prefill
1. This version computes RoPE on GMEM (global memory) data
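
For context on the change: RoPE (rotary position embedding) rotates each consecutive pair of elements of a Q or K head vector by an angle that depends on the token position. The commit applies this rotation to GMEM-resident data inside the chunk-prefill kernel; the snippet below is only a minimal host-side C++ sketch of the rotation itself, not the kernel code, and names such as apply_rope_gmem, head_dim, and theta_base are illustrative rather than taken from the commit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sketch only: applies RoPE in place to one head vector stored
// in a plain buffer, for token position `pos`. The actual commit performs the
// equivalent rotation on GMEM data inside the chunk-prefill kernel.
void apply_rope_gmem(float* head_vec, std::size_t head_dim,
                     std::size_t pos, float theta_base = 10000.0f) {
  for (std::size_t i = 0; i < head_dim; i += 2) {
    // Angle for this element pair: pos * theta_base^(-i / head_dim)
    float freq  = std::pow(theta_base, -static_cast<float>(i) / head_dim);
    float angle = static_cast<float>(pos) * freq;
    float c = std::cos(angle), s = std::sin(angle);
    float x0 = head_vec[i], x1 = head_vec[i + 1];
    head_vec[i]     = x0 * c - x1 * s;   // rotate the (x0, x1) pair
    head_vec[i + 1] = x0 * s + x1 * c;
  }
}

int main() {
  std::vector<float> q(64, 1.0f);                 // one Q head, head_size_qk = 64
  apply_rope_gmem(q.data(), q.size(), /*pos=*/7); // rotate it for position 7
  return 0;
}
```

Doing the rotation as the Q/K tiles are read from global memory, rather than pre-rotating them in a separate pass, presumably avoids an extra kernel launch and an extra read/write of Q and K, which appears to be the intent behind "compute RoPE on GMEM data".
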
1 parent: 9da9cbd

examples/06_bmg_flash_attention/06_bmg_chunk_prefill.cpp

Lines changed: 4 additions & 4 deletions
@@ -38,15 +38,15 @@
      See https://arxiv.org/pdf/2307.08691 for details of Flash Attention V2 algorithm

      To run this example:
-       $ ./examples/sycl/06_bmg_flash_attention_cachedKV/06_bmg_prefill_attention_cachedKV --seq_len_qo=512
-         --seq_len_kv=512 --seq_len_kv_cache=512 --head_size_vo=128 --head_size_qk=128
+       $ ./examples/06_bmg_flash_attention/06_bmg_chunk_prefill_hdim64 --seq_len_qo=512
+         --seq_len_kv=512 --seq_len_kv_cache=512 --head_size_vo=64 --head_size_qk=64

      Causal masking of the first matrix multiplication is supported (`--is_causal`)

      To build & run this example (from your build dir):

-       $ ninja 06_bmg_prefill_attention_cachedKV
-       $ ./examples/sycl/06_bmg_flash_attention_cachedKV/06_bmg_prefill_attention_cachedKV
+       $ ninja 06_bmg_chunk_prefill_hdim64
+       $ ./examples/06_bmg_flash_attention/06_bmg_chunk_prefill_hdim64

      Call with `--help` for information about available options
  */
