Testing triton when C matrix has some initial values. Possible? #170

navdeepkk · 2021-07-31T10:07:09Z

Hi, is it possible to benchmark Triton when the C matrix is not assumed to be all zeros, i.e., when it is loaded from global memory rather than the accumulator tile being initialized with zeros?

Thanks!


navdeepkk · 2021-08-03T12:10:59Z

@ptillet any help would be appreciated. Thanks!

ptillet · 2021-08-03T17:18:32Z

@navdeepkk Sorry for the delay. I was double-checking something. Right now it seems that initializing the matmul accumulator with loaded values is buggy. What you can do instead is add the accumulation after the matmul loop:

acc = 0
for k in range(K, 0, -BLOCK_K):
  ...
D = ... # construct pointer to the other tensor
acc += tl.load(D)

I will work on fixing bugs so that one can do

D = ... # construct pointer to the other tensor
acc = tl.load(D)
for k in range(K, 0, -BLOCK_K):
  ...

instead
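To check that the two orderings above should give the same result, here is a minimal NumPy sketch of a single output tile. It is plain NumPy, not Triton code, and it says nothing about the Triton bug itself. The function names `matmul_then_add` and `matmul_init_with_d` and the value of `BLOCK_K` are hypothetical, and the loop counts upward rather than downward as in the snippets above.

```python
import numpy as np

BLOCK_K = 16  # hypothetical K-tile size, for illustration only


def matmul_then_add(a, b, d):
    """Workaround ordering: zero accumulator, K loop, then add D."""
    acc = np.zeros((a.shape[0], b.shape[1]), dtype=np.float32)
    for k in range(0, a.shape[1], BLOCK_K):
        acc += a[:, k:k + BLOCK_K] @ b[k:k + BLOCK_K, :]
    acc += d.astype(np.float32)
    return acc


def matmul_init_with_d(a, b, d):
    """Intended ordering: accumulator starts from D, then K loop."""
    acc = d.astype(np.float32)  # astype returns a copy, so d is not modified
    for k in range(0, a.shape[1], BLOCK_K):
        acc += a[:, k:k + BLOCK_K] @ b[k:k + BLOCK_K, :]
    return acc


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 64)).astype(np.float32)
b = rng.standard_normal((64, 8)).astype(np.float32)
d = rng.standard_normal((8, 8)).astype(np.float32)

out_after = matmul_then_add(a, b, d)
out_init = matmul_init_with_d(a, b, d)
```

The two results differ only by floating-point rounding, because the additions happen in a different order, so compare them with a tolerance rather than for exact equality.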

navdeepkk · 2021-08-06T09:10:20Z

Thanks! I'll try this.

ptillet · 2021-08-06T09:14:24Z

Actually I just checked and this doesn't seem to work anymore now ... But I'll be working on it over the next few days.

navdeepkk · 2021-08-06T09:17:04Z

Oh Okay, I have some matmul kernels generated, which load C from global memory. I wanted to compare against Triton. I'll get on that when this is fixed. Thanks!

ptillet · 2021-08-30T19:10:30Z

Hey. This should work now on top of master :) Just add, for example, c = c + tl.load(c_ptrs, mask=c_mask) here: https://github.com/openai/triton/blob/master/python/tutorials/03-matrix-multiplication.py#L253. You can also initialize the accumulator here: https://github.com/openai/triton/blob/master/python/tutorials/03-matrix-multiplication.py#L228, but that should be less efficient, and for now it will only work if you explicitly cast the loaded values to float32. Let me know if you have any more issues.
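As a rough illustration of what the masked epilogue add does on a boundary tile, here is a small NumPy emulation. It is not Triton code. The helper names `masked_tile_load` and `add_c_epilogue` are hypothetical, and filling masked-off lanes with 0 is an assumption of this sketch; those lanes are never written back, so their value does not affect the result.

```python
import numpy as np


def masked_tile_load(c, row0, col0, block_m, block_n):
    """Emulate a masked tile load; out-of-bounds lanes are set to 0 here."""
    M, N = c.shape
    rows = row0 + np.arange(block_m)
    cols = col0 + np.arange(block_n)
    mask = (rows[:, None] < M) & (cols[None, :] < N)
    # Clamp indices so the gather stays in bounds, then zero the masked lanes.
    safe_rows = np.minimum(rows, M - 1)
    safe_cols = np.minimum(cols, N - 1)
    tile = c[safe_rows[:, None], safe_cols[None, :]]
    return np.where(mask, tile, 0).astype(np.float32), mask


def add_c_epilogue(acc, c, row0, col0):
    """Add the existing C tile to a float32 accumulator after the K loop."""
    tile, mask = masked_tile_load(c, row0, col0, *acc.shape)
    return acc + tile, mask


# A 5x7 matrix with 4x4 tiles: the tile at (4, 4) has only 1x3 valid lanes.
c = np.arange(35, dtype=np.float16).reshape(5, 7)
acc = np.ones((4, 4), dtype=np.float32)
out, mask = add_c_epilogue(acc, c, 4, 4)
```

Only the lanes where `mask` is true correspond to real elements of C; a kernel would store just those lanes.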


This was referenced Aug 22, 2021

Dot Product Computes Wrong Values #217

Closed

Broadcast causes buggy behaviour #236

Closed

ptillet closed this as completed Aug 30, 2021

B1tway pushed a commit to B1tway/triton that referenced this issue Apr 3, 2023

Merge pull request triton-lang#170 from binarman/triton-translate/fix…

30762f0

…_options_and_report [FRONTEND] Fix triton-translate default options
