# Objective
- Make the meshlet fill cluster buffers pass slightly faster
- Address #15920 for meshlets
- Add PreviousGlobalTransform as a required meshlet component to avoid
extra archetype moves, slightly alleviating
#14681 for meshlets
- Enforce that MeshletPlugin::cluster_buffer_slots is not greater than
2^25 (glitches will occur otherwise). Strictly speaking, this field
controls the post-LOD/culling cluster count, while the limit applies to
the pre-LOD/culling cluster count, but the check is still valid today
and will become more accurate in the future.
Needs to be merged after #15846
and #15886
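
The cap on MeshletPlugin::cluster_buffer_slots described above can be sketched as a simple validation step. This is a minimal illustration, not the code from this PR: the constant name, the function name, and the choice to return a `Result` are all assumptions made for the example.

```rust
/// Upper bound on cluster buffer slots, taken from the PR description (2^25).
/// The constant name is hypothetical.
const MAX_CLUSTER_BUFFER_SLOTS: u32 = 1 << 25;

/// Hypothetical validation helper: returns the slot count unchanged if it is
/// within the limit, or an error message if it exceeds 2^25.
fn validate_cluster_buffer_slots(slots: u32) -> Result<u32, String> {
    if slots > MAX_CLUSTER_BUFFER_SLOTS {
        Err(format!(
            "cluster_buffer_slots must not be greater than 2^25 ({}), got {}",
            MAX_CLUSTER_BUFFER_SLOTS, slots
        ))
    } else {
        Ok(slots)
    }
}
```

For scale, the stress scene mentioned under Testing (1041 instances of 32217 meshlets each) contains 1041 × 32217 = 33,537,897 clusters, which is just below 2^25 = 33,554,432.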
## Solution
- Old pass dispatched a thread per cluster, and did a binary search over
the instances to find which instance the cluster belongs to, and what
meshlet index within the instance it is.
- New pass dispatches a workgroup per instance, and has the workgroup
loop over all meshlets in the instance in order to write out the cluster
data.
- Use a push constant instead of arrayLength to fix the linked bug
- Remap 1d->2d dispatch for software raster only if actually needed to
save on spawning excess workgroups
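
To make the difference between the old and new fill passes concrete, here is a CPU-side model in Rust. It is only a sketch under stated assumptions: the real passes are GPU compute shaders, and the function names, the `(instance_id, meshlet_index)` output layout, and the prefix-sum helper are hypothetical and not taken from the PR.

```rust
/// Hypothetical helper: exclusive prefix sum over per-instance meshlet counts.
/// `starts[i]` is the index of the first cluster belonging to instance `i`.
fn cluster_starts(meshlet_counts: &[u32]) -> (Vec<u32>, u32) {
    let mut starts = Vec::with_capacity(meshlet_counts.len());
    let mut total = 0u32;
    for &count in meshlet_counts {
        starts.push(total);
        total += count;
    }
    (starts, total)
}

/// Model of the old pass: one thread per cluster, each doing a binary search
/// over the instances to find its owner and its meshlet index within it.
fn fill_per_cluster(meshlet_counts: &[u32]) -> Vec<(u32, u32)> {
    let (starts, total) = cluster_starts(meshlet_counts);
    (0..total)
        .map(|cluster_id| {
            // Last instance whose first cluster index is <= cluster_id.
            let instance = starts.partition_point(|&s| s <= cluster_id) - 1;
            (instance as u32, cluster_id - starts[instance])
        })
        .collect()
}

/// Model of the new pass: one workgroup per instance; the threads of the
/// workgroup stride over that instance's meshlets and write the cluster data.
fn fill_per_instance(meshlet_counts: &[u32], workgroup_size: u32) -> Vec<(u32, u32)> {
    let (starts, total) = cluster_starts(meshlet_counts);
    let mut out = vec![(0u32, 0u32); total as usize];
    for (instance, &count) in meshlet_counts.iter().enumerate() {
        for local_thread in 0..workgroup_size {
            let mut meshlet = local_thread;
            while meshlet < count {
                out[(starts[instance] + meshlet) as usize] = (instance as u32, meshlet);
                meshlet += workgroup_size;
            }
        }
    }
    out
}
```

Both functions produce the same output. The per-instance version drops the per-cluster search, because each workgroup already knows which instance it is working on.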
## Testing
- Did you test these changes? If so, how?
  - Ran the meshlet example, and an example with 1041 instances of 32217
    meshlets per instance. Profiled the second scene with Nsight: the pass
    went from 0.55ms -> 0.40ms (roughly a 27% reduction). Small savings.
    We're pretty much VRAM bandwidth bound at this point.
- How can other people (reviewers) test your changes? Is there anything
  specific they need to know?
  - Run the meshlet example
## Changelog (non-meshlets)
- PreviousGlobalTransform now implements the Default trait
 @group(0) @binding(0) var<storage, read> meshlet_cluster_meshlet_ids: array<u32>; // Per cluster
 @group(0) @binding(1) var<storage, read> meshlet_bounding_spheres: array<MeshletBoundingSpheres>; // Per meshlet
 @group(0) @binding(2) var<storage, read> meshlet_simplification_errors: array<u32>; // Per meshlet
 @group(0) @binding(3) var<storage, read> meshlet_cluster_instance_ids: array<u32>; // Per cluster
 @group(0) @binding(4) var<storage, read> meshlet_instance_uniforms: array<Mesh>; // Per entity instance
 @group(0) @binding(5) var<storage, read> meshlet_view_instance_visibility: array<u32>; // 1 bit per entity instance, packed as a bitmask
 @group(0) @binding(6) var<storage, read_write> meshlet_second_pass_candidates: array<atomic<u32>>; // 1 bit per cluster, packed as a bitmask
-@group(0) @binding(7) var<storage, read_write> meshlet_software_raster_indirect_args: DispatchIndirectArgs; // Single object shared between all workgroups/clusters/triangles
-@group(0) @binding(8) var<storage, read_write> meshlet_hardware_raster_indirect_args: DrawIndirectArgs; // Single object shared between all workgroups/clusters/triangles
-@group(0) @binding(9) var<storage, read_write> meshlet_raster_clusters: array<u32>; // Single object shared between all workgroups/clusters/triangles
+@group(0) @binding(7) var<storage, read_write> meshlet_software_raster_indirect_args: DispatchIndirectArgs; // Single object shared between all workgroups
+@group(0) @binding(8) var<storage, read_write> meshlet_hardware_raster_indirect_args: DrawIndirectArgs; // Single object shared between all workgroups
+@group(0) @binding(9) var<storage, read_write> meshlet_raster_clusters: array<u32>; // Single object shared between all workgroups
 @group(0) @binding(10) var depth_pyramid: texture_2d<f32>; // From the end of the last frame for the first culling pass, and from the first raster pass for the second culling pass