[FRONTEND] fix matmul int8 overflow issue #2297
Previously, when the inputs to matmul were int8, the output was also int8. This commit fixes the resulting overflow by producing int32 output.
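To make the overflow concrete, here is a minimal pure-Python sketch of a single dot product (one element of a matmul result). It is not the code from this PR: the helper names `wrap_int8`, `dot_int8_out`, and `dot_int32_out` are hypothetical, and the int8 case assumes two's-complement wraparound when the result is stored.

```python
def wrap_int8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127] (assumed two's-complement storage)."""
    return (value + 128) % 256 - 128


def dot_int8_out(a, b):
    """Dot product whose result is stored back as int8 (simulates the old output type)."""
    return wrap_int8(sum(x * y for x, y in zip(a, b)))


def dot_int32_out(a, b):
    """Dot product whose result is kept in a wider integer (int32 in the PR description)."""
    total = sum(x * y for x, y in zip(a, b))
    # Sanity check that the value fits in int32 for these small inputs.
    assert -2**31 <= total < 2**31
    return total


# All inputs are valid int8 values, but the exact sum (4 * 100 * 100 = 40000) is not.
a = [100, 100, 100, 100]
b = [100, 100, 100, 100]

print(dot_int8_out(a, b))   # 64: 40000 wrapped into the int8 range
print(dot_int32_out(a, b))  # 40000
```

Even a reduction over four elements exceeds the int8 range here, which is why the output type matters for integer matmul.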
Not sure this is how it should be handled. Generally, you would expect a binary operation on two operands of the same type to produce a result of that same type (or, if the operands have two different types, the lower-precision type should be promoted to the higher one, and the result should have the higher type). This change would make the output type confusing. Maybe a better option would be to handle the type of
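The convention described in this comment can be sketched as follows. The `RANK` table and `promoted_type` helper are hypothetical and for illustration only; they are not Triton's actual promotion rules.

```python
# Hypothetical precision ranking for a few integer types (illustration only).
RANK = {"int8": 0, "int16": 1, "int32": 2, "int64": 3}


def promoted_type(lhs, rhs):
    """Return the higher-precision of two operand types, per the convention above."""
    return lhs if RANK[lhs] >= RANK[rhs] else rhs


print(promoted_type("int8", "int8"))   # int8: same-type operands keep their type
print(promoted_type("int8", "int32"))  # int32: the lower-precision operand is promoted
```

Under this convention an int8 by int8 operation yields int8, so an int32 result from int8 inputs is the surprise the comment is pointing at.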
Thanks for quick and detailed review.
So the idea with This does limit the number of options we could support, because it is quite possible that you want to accumulate in, e.g., What are the user-level pitfalls you had in mind? Re computation & memory pressure: if you're thinking that the additional cast from a type to itself would introduce some overhead, I'm quite sure that Triton can optimize that away. (Also, just wanted to clarify that I'm not an official reviewer, nor affiliated with the maintainers of this repository. I'm just following it closely and noticed that this might cause unexpected semantics.)
On user-level pitfalls, I was wrong: a manual override requires a specific datatype anyway. On computation & memory overhead, I'm seeing quite a lot of overhead for int <-> fp casting with Tradeoff between proposed code vs. ( Thank you for your opinion!
I'm gonna merge that. I don't feel super strongly either way, since triton.ops.matmul isn't really part of the language and is just used for testing.
Previously on matmul, if inputs are int8, output was also int8. This commit fixes the overflow problem with int32 output. triton-lang#2296