Description
ResourceDescriptorHeaps are an incredibly flexible way of accessing descriptors, but they imply that all descriptors are the same size, which is untrue more or less universally across vendors.
As a simple example, an unformatted buffer is generally going to look like an address and a size, whereas an image has significantly more data associated with it.
Vulkan Working Group members (including AMD) have looked at preliminary data suggesting that keeping all descriptors in a homogeneous array results in a notable slowdown (high single-digit to low double-digit percentages) for some apps compared to using separate parallel arrays. This was verified by experiments with VK_EXT_mutable_descriptor_type.
Some mitigation strategies exist for drivers, but generally speaking we'd like to see developers given tools to manage this better themselves as we move to increasingly "bindless" ways of managing descriptors (and thus less scope for driver intervention).
A key idea we've identified is that if we could separate buffers (and acceleration structures) out from the main resource heap and provide them in a separate array, developers would have the option to pull them from a more tightly packed array with limited shader changes. We believe this can recover most, or possibly all, of the performance lost by using a single flat array.
Ideally, we'd like to see this addressed in a way that works for both Vulkan and DirectX, without requiring any API changes.
One potential path we see is to use buffer addresses for this purpose.
In both Vulkan and DX12 it's possible to obtain a GPU address for a buffer resource. In Vulkan it's further possible to use this address in the shader in lieu of a buffer descriptor, though in HLSL this is exposed in the vk namespace as a simple load from an address, and it doesn't fit very neatly with the rest of HLSL.
We also want to avoid adding pointers to HLSL: doing so is an enormous task, and it may be undesirable because HLSL is a largely "robust" language, which significant pointer support could compromise.
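For reference, the existing Vulkan-only path looks roughly like the sketch below when compiling HLSL to SPIR-V with DXC. The `PushData` struct and its field names are made up for illustration; the addresses are assumed to come from the host (e.g. via `vkGetBufferDeviceAddress`) and to be 16-byte aligned.

```hlsl
// Hypothetical push-constant block carrying two buffer device addresses.
struct PushData
{
    uint64_t inputAddress;   // assumed to point at an array of float4
    uint64_t outputAddress;  // assumed to point at an array of float4
};

[[vk::push_constant]] PushData pushData;

[numthreads(32, 1, 1)]
void CSMain(uint3 tid : SV_DispatchThreadID)
{
    // Address arithmetic is manual: each float4 element is 16 bytes.
    uint64_t offset = uint64_t(tid.x) * 16;

    // Typed load straight from a GPU address; no buffer descriptor involved.
    float4 value = vk::RawBufferLoad<float4>(pushData.inputAddress + offset, 16);

    // Matching store to a second address.
    vk::RawBufferStore<float4>(pushData.outputAddress + offset, value * 2.0f, 16);
}
```

Note that nothing here is bounds-checked and the element stride is spelled out by hand, which is part of why it sits awkwardly next to `StructuredBuffer`-style access.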
An option we'd like to entertain is providing a way to load a buffer from another buffer in the same way as one would load a buffer from the ResourceDescriptorHeap, roughly as follows:
```hlsl
// Root buffer acting as a heap
StructuredBuffer<uint64_t> BufferHeap;

[numthreads(32,1,1)]
void CSMain()
{
    // Resource Heap syntax
    RWStructuredBuffer<float4> myBuffer = ResourceDescriptorHeap[0];

    // Rough proposed new syntax
    RWStructuredBuffer<float4> myBuffer2 = BufferHeap[0];
}
```

There are questions to ask about what limitations might be reasonable on this (e.g. can that uint64_t value be anywhere?), and/or whether there should be a more explicit cast than is provided from ResourceDescriptorHeap or some other syntax.
It might also be reasonable to have uint64_t pairs (address and size) rather than just an address, to provide stronger robustness guarantees, possibly as an opt-in.
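To illustrate what an address-and-size pair could buy, here is a minimal sketch that emulates the bounds check by hand using today's `vk::RawBufferLoad`. The `BufferRef` struct, its field names, and the `LoadFloat4Checked` helper are all hypothetical; under the proposal the compiler or driver would presumably do the equivalent behind the typed buffer syntax.

```hlsl
// Hypothetical heap entry: device address plus size in bytes.
struct BufferRef
{
    uint64_t address;
    uint64_t sizeInBytes;
};

// Root buffer acting as a heap of (address, size) pairs.
StructuredBuffer<BufferRef> BufferHeap;

// Bounds-checked float4 load. Out-of-range elements read as zero,
// similar in spirit to robust buffer access on descriptors.
float4 LoadFloat4Checked(BufferRef ref, uint index)
{
    uint64_t offset = uint64_t(index) * 16;  // 16 bytes per float4
    if (offset + 16 > ref.sizeInBytes)
    {
        return float4(0.0f, 0.0f, 0.0f, 0.0f);
    }
    return vk::RawBufferLoad<float4>(ref.address + offset, 16);
}
```

With only an address in the heap, the `sizeInBytes` check has nothing to compare against, so an out-of-range index would simply read whatever memory follows the buffer.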
This would give developers access to buffer addresses, in a way that is hopefully in keeping with existing HLSL syntax.
Additionally, it avoids introducing pointers to the language, but still gives developers a way to describe fairly complex recursive data structures if they so wish, or pull data in ways that were previously not possible in HLSL.
There's some danger of providing developers with too much rope here, but we believe this should be close to the minimum viable proposal for adding this functionality.
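As a rough picture of the kind of recursive structure this enables, the sketch below walks a linked list whose nodes store the GPU address of the next node. It is written against the existing `vk::RawBufferLoad` intrinsic rather than the proposed syntax, and the node layout (a float at offset 0, a 64-bit next address at offset 8, zero meaning end of list) is an assumption made purely for illustration.

```hlsl
// Assumed node layout in memory (16 bytes, 8-byte aligned):
//   offset 0: float    value
//   offset 4: uint     padding
//   offset 8: uint64_t next   (0 terminates the list)

float SumList(uint64_t headAddress, uint maxNodes)
{
    float sum = 0.0f;
    uint64_t node = headAddress;

    // Cap the walk so a corrupt or cyclic list cannot hang the shader.
    for (uint i = 0; i < maxNodes && node != 0; ++i)
    {
        sum += vk::RawBufferLoad<float>(node, 4);
        node = vk::RawBufferLoad<uint64_t>(node + 8, 8);
    }
    return sum;
}
```

The explicit iteration cap is one example of the guard rails developers would have to supply themselves when chasing addresses through memory.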
So some questions:
- Is this of interest to ISVs?
- Is this of interest to Microsoft?
- Are there any constraints which should be considered with this?
We could work to add this to the vk namespace, but if this is interesting to those targeting DX12, it might make sense as a feature in the next Shader Model so that it works for any target API.