Implement SAMPLED_TEXTURE_ARRAY_NON_UNIFORM_INDEXING #369
bors[bot] merged 1 commit into gfx-rs:master
#extension GL_EXT_nonuniform_qualifier : require

layout(location = 0) in vec2 v_TexCoord;
layout(location = 1) flat in int v_Index; // dynamically non-uniform
This is actually uniform across the wave, since it's flat-interpolated. Is it supposed to be non-uniform in a different sense?
If you consider how GPUs run fragment shaders, it may not even be coherent within the wave. They take 2x2 blocks of pixels and pack N of them together into a wave. These blocks may not come from the same primitive, so the value may differ across the wave. This 2x2 granularity is also why GPUs are bad at rendering small polygons: a polygon always occupies a full 2x2 block, even if it only covers 1x1. (edit: src; also this comment, which is really helpful)
Maybe there is confusion between uniform (the same over the draw call) and coherent (the same over the wave)?
My understanding is that 2x2 blocks are always from the same primitive. If some of these pixels lie outside, they become "ghost" threads, but regardless - flat is there precisely to inform the compiler that this is constant across the wave.
The link you provided is very familiar to me, although I'll make sure to rehash that part in my mind.
The pixels within a 2x2 block are always from the same primitive, with lanes masked out as needed, but when 8 of those blocks are combined into a wave, each of the 8 could come from a different primitive. If that weren't allowed, a 1x1 triangle would waste an entire wave of processing power, not just a 2x2 block.
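The packing argument above can be sketched as a toy model in Rust. This is only an illustration of the claim, not a description of any particular GPU: the `Quad` type, the in-order packing policy in `pack_into_waves`, and the 8-quads-per-wave (32-lane) size are all assumptions made for the example.

```rust
/// One 2x2 quad of fragment-shader invocations. All four lanes belong to the
/// same primitive, so they share the same flat-interpolated value.
#[derive(Clone, Copy, Debug)]
struct Quad {
    /// Value of a `flat` input (such as `v_Index`) for this quad's primitive.
    flat_index: i32,
    /// How many of the 4 lanes cover real pixels; the rest are masked out.
    active_lanes: u32,
}

const LANES_PER_QUAD: u32 = 4;
/// Assumption: 8 quads per wave, i.e. a 32-lane wave.
const QUADS_PER_WAVE: usize = 8;

/// Packs quads into waves in submission order (a hypothetical policy),
/// without caring which primitive each quad came from.
fn pack_into_waves(quads: &[Quad]) -> Vec<Vec<Quad>> {
    quads.chunks(QUADS_PER_WAVE).map(|c| c.to_vec()).collect()
}

/// True if every quad in the wave carries the same flat value.
fn is_wave_uniform(wave: &[Quad]) -> bool {
    wave.windows(2).all(|w| w[0].flat_index == w[1].flat_index)
}

/// Fraction of a full wave's lanes that do useful work.
fn lane_utilization(wave: &[Quad]) -> f64 {
    let active: u32 = wave.iter().map(|q| q.active_lanes).sum();
    active as f64 / (QUADS_PER_WAVE as u32 * LANES_PER_QUAD) as f64
}

fn main() {
    // Eight 1x1 triangles, each with its own index, share one wave.
    let quads: Vec<Quad> = (0..8)
        .map(|i| Quad { flat_index: i, active_lanes: 1 })
        .collect();
    for wave in pack_into_waves(&quads) {
        println!(
            "uniform across wave: {}, lane utilization: {:.3}",
            is_wave_uniform(&wave),
            lane_utilization(&wave)
        );
    }
}
```

In this model a wave filled with eight single-pixel triangles is not uniform in `flat_index`, while a wave restricted to a single such triangle would use only 1 of its 32 lanes.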
kvark left a comment
I guess we are still missing the logic in the example to select the proper shader
Alright, I think this one is finally ready for review.
This implements gfx-rs/wgpu#715 in wgpu-rs. I haven't changed the example yet, as I want to think up a better one and didn't want that to block this. It will change in the future, however.