Commit de91c5c

[Bugfix] rocm shared memory issue on MI250 (#16901)
* [Bugfix] rocm shared memory issue on MI250
1 parent da56c89 commit de91c5c

1 file changed: +4 -1 lines changed

python/tvm/dlight/gpu/gemv.py

Lines changed: 4 additions & 1 deletion
@@ -469,7 +469,10 @@ def apply(
                         TS, TR = 2, 64
             elif target.kind.name == "rocm":
                 VEC_C = 4
-                LOAD_V_SHARED = True
+                # TODO: set LOAD_V_SHARED = False for now
+                # rocm might have some issues when load/store of shared do not belong to same data type
+                # and only works for certain vector lens, our commonly useful vector lens are in 4
+                LOAD_V_SHARED = False
                 LOAD_V_VEC = 8
                 UNROLL = 256
                 if isinstance(len_S, int):
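
The effect of this hunk on the rocm tuning knobs can be sketched in isolation. The snippet below is only an illustration and is not code from gemv.py: the `GemvTuning` class, the `rocm_gemv_tuning` function, and its `workaround` flag are hypothetical names. Only the four values are taken from the diff above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GemvTuning:
    """Hypothetical container for the knobs set in the rocm branch."""

    vec_c: int
    load_v_shared: bool
    load_v_vec: int
    unroll: int


def rocm_gemv_tuning(workaround: bool = True) -> GemvTuning:
    # Values copied from the rocm branch shown in the diff.
    # `workaround` is an illustrative flag, not an option gemv.py exposes:
    # True models the state after this commit, False the state before it.
    return GemvTuning(
        vec_c=4,
        load_v_shared=not workaround,
        load_v_vec=8,
        unroll=256,
    )


before = rocm_gemv_tuning(workaround=False)
after = rocm_gemv_tuning()

# List the knobs whose value differs between the two states.
changed = [
    f.name for f in fields(GemvTuning)
    if getattr(before, f.name) != getattr(after, f.name)
]
print(changed)  # ['load_v_shared']
```

Only LOAD_V_SHARED changes. VEC_C, LOAD_V_VEC, and UNROLL keep their previous values for rocm.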
