Fix IR layout of 3-element vectors in cbuffers for -fvk-use-dx-layout #7282
Fixes shader-slang#6921 D3D cbuffers have slightly different packing rules that allow packing vectors into a 16-byte slot at element alignments, except when a field would cross a 16-byte boundary. In that case, we need to realign the field to the next 16-byte boundary. In particular, this impacts vec3s, which are not a power of two in size and thus require slightly different alignment logic, compared to std430 and std140. (Example: a float and float3 should fit together in that order in a single slot.) Also adds a test case.
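The rule in the description can be sketched as a small offset calculation. This is only an illustration under stated assumptions: `alignUp` and `d3dCbufferOffset` are hypothetical names, not functions from the Slang source, and the sketch covers only scalars and vectors of at most 16 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `alignment`.
std::int64_t alignUp(std::int64_t value, std::int64_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical helper (not part of Slang): byte offset at which a scalar or
// vector of `size` bytes lands in a D3D cbuffer when the running offset is
// `currentOffset`. The field is first aligned to its element alignment; it is
// pushed to the next 16-byte slot only if it would straddle a slot boundary.
std::int64_t d3dCbufferOffset(
    std::int64_t currentOffset,
    std::int64_t size,
    std::int64_t elementAlignment)
{
    assert(size > 0);
    const std::int64_t kSlotSize = 16;
    std::int64_t offset = alignUp(currentOffset, elementAlignment);
    const std::int64_t firstSlot = offset / kSlotSize;
    const std::int64_t lastSlot = (offset + size - 1) / kSlotSize;
    if (firstSlot != lastSlot)
        offset = alignUp(offset, kSlotSize);
    return offset;
}
```

With a `float` at offset 0, a following `float3` (12 bytes) gives `d3dCbufferOffset(4, 12, 4) == 4`, so both share one slot, while a following `float4` gives `d3dCbufferOffset(4, 16, 4) == 16`.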
This update introduces functions to determine if a struct or constant buffer requires scalar layout based on offset alignment rules. The GLSL source emitter now checks for scalar layout requirements when emitting parameter groups, ensuring proper alignment for various field types. Additionally, a new test case has been added to validate the changes in layout handling for constant buffers.
This update modifies the emitStructDeclarationsBlock function to include a new parameter, forceScalarOffsets, allowing for more precise control over struct field layout. The changes ensure that scalar offsets can be enforced when necessary, improving the handling of struct field attributes across various emitters (C-like, GLSL, HLSL, and WGSL). Additionally, adjustments were made to related function signatures to accommodate this new parameter, enhancing consistency and flexibility in struct emission.
I think the SPIRV target is probably doing what it's supposed to at this point. The GLSL target is the part that I'm still working my way through. In particular, it would be good to get feedback on how best to handle what Other issue is whether to just go ahead and promote everything that leaves a gap (vs std140/std430 rules) to either use
source/slang/slang-ir-layout.cpp
        (int)(element.size * count),
        (int)(element.size * countForAlignment));
}
virtual void adjustAlignmentForStructOffset(IRSizeAndAlignment& element, IRIntegerValue offset)
Is offset the offset of the element within the struct?
The logic is not easy to follow; can you add more comments to explain it?
From my understanding, if the element does not cross a 16-byte boundary, you do nothing.
And if it does cross the boundary, you align it to 16 bytes. Does that mean there will be padding after the previous element? If so, the offset of this element changes, right?
I feel like this logic should be in the existing method adjustOffsetForNextAggregateMember.
The problem with putting it in adjustOffsetForNextAggregateMember is that we need to know the size of the next member in order to know whether or not to adjust the offset, and we won't know that until we iterate to the next field.
For the case of struct { float a; float3 b; }, at the point where we call adjustOffsetForNextAggregateMember we only know the offset and alignment of a, so we don't know the size of b yet and thus don't know whether or not the offset should be adjusted. For example, if b is a float4, it needs to be adjusted, but if it's a float3 it doesn't.
Hm, maybe you're suggesting I move adjustOffsetForNextAggregateMember up to where I've put adjustAlignmentForStructOffset and have it take the current element size as an additional argument. Let me give that a try.
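The ordering issue discussed in this thread can be illustrated with a minimal layout loop. It is a sketch under assumptions, not the code in slang-ir-layout.cpp: `FieldInfo` and `layoutCbufferFields` are hypothetical names, and only scalar and vector fields of at most 16 bytes are modeled. The point is that the boundary check runs when a field is visited, because that is the first moment its size is known.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one field: its size and element alignment in bytes.
struct FieldInfo
{
    std::int64_t size;
    std::int64_t alignment;
};

// Assign an offset to each field in declaration order.
std::vector<std::int64_t> layoutCbufferFields(const std::vector<FieldInfo>& fields)
{
    const std::int64_t kSlotSize = 16;
    std::vector<std::int64_t> offsets;
    std::int64_t cursor = 0; // first free byte after the previous field
    for (const FieldInfo& field : fields)
    {
        assert(field.size > 0 && field.alignment > 0);
        std::int64_t offset =
            (cursor + field.alignment - 1) / field.alignment * field.alignment;
        // This decision depends on the size of the field being placed, so it
        // cannot be made right after the previous field has been laid out.
        if (offset / kSlotSize != (offset + field.size - 1) / kSlotSize)
            offset = (offset + kSlotSize - 1) / kSlotSize * kSlotSize;
        offsets.push_back(offset);
        cursor = offset + field.size;
    }
    return offsets;
}
```

For `struct { float a; float3 b; }` the input `{{4, 4}, {12, 4}}` yields offsets 0 and 4; replacing `b` with a `float4` (`{16, 4}`) yields 0 and 16.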
Split test case in two, so we can check that the struct only uses layout(scalar) when necessary. Strips out additional alignment tests that aren't pertinent to 3-element vector offset calculation.
Consolidate constant buffer logic with the existing logic for adjusting offset for aggregate members.
When calculating offsets in the IR, take into account packOffset statements from HLSL. Necessary for SPIRV target to generate correct offset for any undecorated fields following the one with packOffset.
Revert changes to force GLSL to use scalar output. (Was incomplete)
Tests were originally named in a way that reflects the GLSL target. Changing names to reflect what the specific cases are instead of how a particular target interprets them.
Add a test for the case where a packoffset decoration leaves a gap. When calculating offsets in the IR, we need to take this into account or else we can end up assigning multiple items to the same offset.
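The gap case can be sketched as follows. This illustrates the idea in the commit message only and makes two assumptions that the source does not confirm: an undecorated field continues from the end of the preceding field, and an explicit offset is taken as given. `CbufferField`, `explicitOffset`, and `layoutWithPackOffsets` are hypothetical names, and `-1` is an arbitrary marker for "no packoffset".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical field description. `explicitOffset` is the byte offset taken
// from a packoffset decoration, or -1 when the field is undecorated.
struct CbufferField
{
    std::int64_t size;
    std::int64_t alignment;
    std::int64_t explicitOffset;
};

std::vector<std::int64_t> layoutWithPackOffsets(const std::vector<CbufferField>& fields)
{
    const std::int64_t kSlotSize = 16;
    std::vector<std::int64_t> offsets;
    std::int64_t cursor = 0;
    for (const CbufferField& field : fields)
    {
        assert(field.size > 0 && field.alignment > 0);
        std::int64_t offset;
        if (field.explicitOffset >= 0)
        {
            // Decorated field: use the requested offset as-is.
            offset = field.explicitOffset;
        }
        else
        {
            // Undecorated field: element alignment plus the 16-byte boundary rule.
            offset = (cursor + field.alignment - 1) / field.alignment * field.alignment;
            if (offset / kSlotSize != (offset + field.size - 1) / kSlotSize)
                offset = (offset + kSlotSize - 1) / kSlotSize * kSlotSize;
        }
        offsets.push_back(offset);
        // Advance past this field even when its offset was explicit. Without
        // this step the next undecorated field would be placed at the stale
        // cursor and could overlap an earlier field.
        cursor = offset + field.size;
    }
    return offsets;
}
```

With three `float` fields where the second is placed at byte 32, the input `{{4, 4, -1}, {4, 4, 32}, {4, 4, -1}}` produces offsets 0, 32, and 36; if the cursor were not advanced past the decorated field, the third field would land at 4 instead.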
This reverts commit 8077a94.
This change would need a matching one in the AST/front-end in order to work properly.
You need to take care of the Falcor test failure, but address the comment first; I think they might be related.
I think the Falcor failure might be spurious? Updating the branch to trigger another validation run.
Just to add some commentary breadcrumbs, the patch fixes the SPIR-V target, but the GLSL target will still be broken for this case. GLSL uses a base alignment of 4N for a 3-component vector and the final value becomes padding, so it's not possible to directly translate this packing from D3D to GLSL. Fixing the GLSL target for the general case would likely involve using a 4-element vector and then some reinterpret casts to extract the elements. Vulkan has laxer rules on offsets and padding when the Regarding the
tests/bugs/byte-address-buffer-interlocked-add-f32.slang (vk)
tests/ir/loop-unroll-0.slang.1 (vk)
tests/hlsl-intrinsic/texture/float-atomics.slang (vk)
tests/hlsl/cbuffer-float3-offsets-aligned.slang.2 (vk)
Why do we add a test that is expected to fail? Any plan to fix this?
These succeed on Vulkan using the SPIRV target, but do not succeed on Vulkan using the GLSL target. Kai told me to add the tests here to disable error reporting for Vulkan with GLSL target. It sounds like that's not correct; will fix.
@@ -0,0 +1,115 @@
//TEST:SIMPLE(filecheck=SPIRV): -target spirv -profile cs_6_2 -entry computeMain -line-directive-mode none -fvk-use-dx-layout
//TEST(compute):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-slang -compute -dx12 -use-dxil -profile cs_6_2 -Xslang... -Xdxc -fvk-use-dx-layout -Xdxc -enable-16bit-types -X. -output-using-type
//TEST(compute):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-slang -compute -vk -profile cs_6_2 -Xslang... -fvk-use-dx-layout -X. -output-using-type
If this test is failing under the -emit-spirv-via-glsl path, add the -emit-spirv-directly option here to prevent the GLSL test failure.
Thanks, will do.
…shader-slang#7282) * Better handling for 16-byte boundary of d3d cbuffer Fixes shader-slang#6921 D3D cbuffers have slightly different packing rules that allow packing vectors into a 16-byte slot at element alignments, except when a field would cross a 16-byte boundary. In that case, we need to realign the field to the next 16-byte boundary. In particular, this impacts vec3s, which are not a power of two in size and thus require slightly different alignment logic, compared to std430 and std140. (Example: a float and float3 should fit together in that order in a single slot.) Adds test cases. Adds documentation page for GLSL target
Results of these tests had been marked ignored, because they failed on VK with the GLSL backend. This change removes them from the expected-failure.txt file and adds the correct command line option to avoid using the GLSL target. Addresses concern raised on #7282
Adhere better to the packing rules used by DXC with -fvk-use-dx-layout option.
The packing rules for D3D cbuffers are slightly different from how std140 and std430 handle packing, in particular around how they handle the alignment of vec3 types in structs. As an example, "struct { float a; float3 b; };" will get packed into 16 bytes in HLSL, while the std140 and std430 packing rules would align the float3 variable to an offset of 16. The front-end/AST already handled this correctly, but the IR layout code did not.
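The difference described above can be made concrete by comparing the two placement rules for 4-byte components. The function names below are hypothetical and the code is a simplified sketch, not Slang's layout code: it models only the vector base alignments of std140/std430 (N, 2N, 4N, 4N for 1 to 4 components) against the D3D boundary rule.

```cpp
#include <cassert>
#include <cstdint>

// std140/std430 base alignment of a vector with 4-byte components:
// scalar = N, vec2 = 2N, vec3 and vec4 = 4N, where N = 4 bytes.
std::int64_t glslVectorAlignment(int componentCount)
{
    assert(componentCount >= 1 && componentCount <= 4);
    const std::int64_t n = 4;
    if (componentCount == 1)
        return n;
    if (componentCount == 2)
        return 2 * n;
    return 4 * n;
}

// Offset of the vector under std140/std430, given the first free byte `cursor`.
std::int64_t glslVectorOffset(std::int64_t cursor, int componentCount)
{
    const std::int64_t alignment = glslVectorAlignment(componentCount);
    return (cursor + alignment - 1) / alignment * alignment;
}

// Offset of the same vector under the D3D cbuffer rule: align to the 4-byte
// element, then move to the next 16-byte slot only if the vector would
// otherwise straddle a slot boundary.
std::int64_t d3dVectorOffset(std::int64_t cursor, int componentCount)
{
    assert(componentCount >= 1 && componentCount <= 4);
    const std::int64_t size = 4 * componentCount;
    std::int64_t offset = (cursor + 3) / 4 * 4;
    if (offset / 16 != (offset + size - 1) / 16)
        offset = (offset + 15) / 16 * 16;
    return offset;
}
```

For `struct { float a; float3 b; }`, `b` follows `a` at cursor 4: `glslVectorOffset(4, 3)` is 16, while `d3dVectorOffset(4, 3)` is 4, which is why the struct occupies a single 16-byte slot in HLSL.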
Also adds a documentation page for the GLSL target, which has some shortcomings relative to the SPIR-V target.
Fixes #6921.