perf(SSGI): eliminate radiance copy via direct UAV write#2127
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughTemporary radiance texture now supports direct UAV writes for mip 0; the disocclusion compute writes mip 0 into the temp UAV, and the prefilter reads mip 0 from the temp SRV while writing the mip-chain back into the main radiance texture. Changes
Sequence Diagram(s)sequenceDiagram
participant Disocc as RadianceDisoccCompute
participant TempTex as texRadianceTemp (UAV / SRV)
participant Radiance as texRadiance (UAV chain)
participant Prefilter as PrefilterCompute
Disocc->>TempTex: write mip0 (UAV)
Note right of TempTex: mip0 stored in temp UAV (created in SetupResources)
Prefilter->>TempTex: read mip0 (SRV)
Prefilter->>Radiance: write/read other mips (per-mip UAVs)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
No actionable suggestions for changed features. |
45b5391 to
5b344a7
Compare
|
@coderabbitai please carefully review the code changes (discounting any comments) and determine if it should result in performance gains and explain why. Please identify all changes and potential regressions with the approach. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 3573 🏁 Script executed: Length of output: 127 🏁 Script executed: Length of output: 196 🏁 Script executed: Length of output: 761 🏁 Script executed: Length of output: 132 🏁 Script executed: Length of output: 11495
Performance AnalysisYes — this PR should produce a real, measurable per-frame GPU saving.The old pipeline (3 steps): The new pipeline (2 steps): Why the gains are real:
For a full-res All Code Changes
|
| Before | After | |
|---|---|---|
texRadianceTemp bind flags |
D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET |
D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS |
texRadianceTemp UAV |
Not created | New: DXGI_FORMAT_R11G11B10_FLOAT, Texture2D, MipSlice=0 |
DrawSSGI — radianceDisocc pass
- uavs.at(0) = texRadiance->uav.get();
+ uavs.at(0) = texRadianceTemp->uav.get();radianceDisocc now writes outRadianceDisocc at u0 directly into texRadianceTemp.
DrawSSGI — prefilterRadiance pass
- context->CopySubresourceRegion(
- texRadianceTemp->resource.get(), 0, 0, 0, 0,
- texRadiance->resource.get(), 0, nullptr);The entire copy is removed. The srvs.at(0) = texRadianceTemp->srv.get() binding and all uavRadiance[0..4] bindings are unchanged.
Correctness / Hazard Analysis
Why the original copy existed: prefilterRadiance writes mip 0 of texRadiance (uavRadiance[0]) but also needs mip 0 as its SRV input — a read/write conflict on the same resource. texRadianceTemp was the intermediate buffer to break that hazard.
This hazard is fully preserved in the new approach: prefilterRadiance still reads texRadianceTemp (SRV) and writes to texRadiance (UAV) — they are different resources.
UAV→SRV barrier: The resetViews() call before the prefilterRadiance bindings clears slot u0, which D3D11 uses to enforce the UAV flush before CSSetShaderResources binds the same resource as an SRV. This pattern was already present before this PR.
Format UAV store support: DXGI_FORMAT_R11G11B10_FLOAT is valid for typed UAV stores (RWTexture2D<float3> writes) in D3D11 Feature Level 11.0. This was already proven working for texRadiance->uav in the same dispatch.
Potential Regressions
-
texRadiance->uavis now orphaned. The default/full-resource UAV ontexRadiancewas used in the oldradianceDisoccpass but is no longer bound anywhere. It remains allocated in memory but is harmless — just a minor resource leak worth noting. -
D3D11_BIND_RENDER_TARGETremoved fromtexRadianceTemp. If any code path outside this diff attempts to usetexRadianceTempas a render target,CreateRenderTargetViewwould fail or silently produce an invalid view. No such usage is visible in the shader or diff, so this appears safe. -
Content of
texRadiancemip 0 between passes. Before this PR,texRadiancemip 0 held the raw disocclusion output fromradianceDisoccuntilprefilterRadianceoverwrote it with the filtered version. After this PR,texRadiancemip 0 holds stale data from the previous frame untilprefilterRadianceruns. If any intermediate pass readstexRadiancemip 0 beforeprefilterRadiancecompletes — which is not visible in this diff — it would receive one-frame-stale radiance. Based on the visible pipeline ordering this does not appear to be the case, but it is worth confirming no pass samplestexRadiancemip 0 between these two dispatches. -
D3D11 debug layer warnings. If
resetViews()does not null out all UAV slots before the prefilterRadiance bindings (e.g., if it only resets SRV slots), the D3D11 debug layer may warn abouttexRadianceTempbeing simultaneously bound as UAV (from the prior dispatch's residual state) and SRV. This was a pre-existing concern, not introduced here.
Overall verdict: The optimization is sound and the performance gain is genuine. The data flow and read/write hazard avoidance are maintained correctly. The main item to double-check is regression point 3 — confirming no intermediate pass reads texRadiance mip 0 in the window between radianceDisocc completing and prefilterRadiance completing.
🧠 Learnings used
Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 2003
File: features/Upscaling/Shaders/Upscaling/DepthUpscalePS.hlsl:55-61
Timestamp: 2026-03-24T04:30:32.809Z
Learning: In `features/Upscaling/Shaders/Upscaling/DepthUpscalePS.hlsl` (doodlum/skyrim-community-shaders PR `#2003`), the shader is dispatched per-eye using per-eye dimensions (DstWidth/DstHeight). The UV (`input.TexCoord`) is derived from `dispatchID / DstDim`, which is already eye-local — not full SBS/packed stereo coordinates. The per-eye UV clamping logic (isRight, halfScale) operates entirely in eye-local space, and there is no stereo seam issue. Do not flag SBS seam concerns for this shader.
Learnt from: alandtse
Repo: doodlum/skyrim-community-shaders PR: 2084
File: src/ShaderCache.cpp:1646-1649
Timestamp: 2026-04-10T07:34:27.849Z
Learning: In `src/ShaderCache.cpp` (doodlum/skyrim-community-shaders PR `#2084`), `GetImagespaceShaderDescriptor` intentionally returns `false` for VR when a shader is not reverse-engineered. Returning `false` causes `ResolveImageSpaceDescriptor` to return `false`, which makes all callers (GetVertexShader, GetPixelShader, GetComputeShader, CompileShader) skip custom shader handling and defer to the vanilla shader. This is the correct pattern for un-RE'd imagespace shaders on VR — do not flag it as "skipping all custom VR handling" or suggest gating on HDR feature state.
Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 1982
File: package/Shaders/Lighting.hlsl:3089-3095
Timestamp: 2026-03-16T13:13:11.490Z
Learning: In `package/Shaders/Lighting.hlsl` (doodlum/skyrim-community-shaders), the `TREE_ANIM` alpha test path intentionally uses a hardcoded `0.1` floor check (`if (alpha < 0.1) discard;`) and `AlphaTestRefRS` rather than `SharedData::VRAlphaTestThreshold`. `VRAlphaTestThreshold` is wired into `DistantTree.hlsl` and `RunGrass.hlsl` only. The threshold wiring for `Lighting.hlsl`'s `TREE_ANIM` path is deferred to a separate VR foliage PR.
Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 1950
File: package/Shaders/Lighting.hlsl:2008-2012
Timestamp: 2026-03-07T00:53:21.267Z
Learning: In doodlum/skyrim-community-shaders PR `#1950`, the maintainer (Dlizzio) confirmed that per-axis normal reorientation for triplanar-projected normals/detail in package/Shaders/Lighting.hlsl is not required for their use case; the current approach intentionally mixes samples without axis-specific swizzle/sign correction.
Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 2000
File: src/WeatherEditor/EditorWindow.cpp:1271-1274
Timestamp: 2026-03-22T18:40:49.633Z
Learning: Repo: doodlum/skyrim-community-shaders — Maintainer preference (PR `#2000` on 2026-03-22): In src/WeatherEditor/EditorWindow.cpp within EditorWindow::RenderUI, keep `resetLayout = false;` at the end of the frame (after PaletteWindow::Draw). The reset is intentionally one-shot for windows rendered that frame; closed windows are allowed to reopen with their previous geometry. Do not move the clear later or broadcast resets to closed windows.
Learnt from: jiayev
Repo: doodlum/skyrim-community-shaders PR: 1947
File: package/Shaders/DeferredCompositeCS.hlsl:234-243
Timestamp: 2026-03-14T08:35:42.651Z
Learning: In `package/Shaders/DeferredCompositeCS.hlsl` (doodlum/skyrim-community-shaders), the fallback non-IBL SKYLIGHTING specular path intentionally double-weights with `skylightingSpecular`: `dalcScaled` is computed as `IrradianceToGamma(IrradianceToLinear(directionalAmbientColorSpecular) * skylightingSpecular)`, and the final `finalIrradiance` is then `lerp(specularIrradiance, specularIrradianceReflections, skylightingSpecular)`. This is original engine logic preserved from before PR `#1947` and should not be flagged as a double-attenuation bug.
Learnt from: jiayev
Repo: doodlum/skyrim-community-shaders PR: 0
File: :0-0
Timestamp: 2025-08-03T18:37:19.690Z
Learning: ISReflectionsRayTracing.hlsl and ISWorldMap.hlsl in the skyrim-community-shaders repository are image-space post-processing shaders that perform color sampling and blending operations that need proper linear color space handling for the linear lighting system. ISReflectionsRayTracing handles screen-space reflections and should use conditional Color::IrradianceToLinear/Gamma conversions similar to ISCompositeLensFlareVolumetricLighting.hlsl. ISWorldMap performs 7x7 color accumulation that should be done in linear space similar to the pattern used in ISSAOComposite.hlsl.
Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 2002
File: package/Shaders/VRStereoOptimizations/StencilCS.hlsl:97-143
Timestamp: 2026-03-24T05:34:21.730Z
Learning: In `package/Shaders/VRStereoOptimizations/StencilCS.hlsl` (doodlum/skyrim-community-shaders PR `#2002`), `EdgeWidth` has been fully removed as a setting. The edge detection search width is hardcoded to `kInnerWidth = 2` with `maxWidth = kInnerWidth`. Do not flag EdgeWidth as a dead/unused parameter — it no longer exists in `VRStereoOptParams` or `cbuffers.hlsli` in this PR.
Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 1982
File: src/Features/VR/StereoBlend.cpp:98-109
Timestamp: 2026-03-16T13:13:31.383Z
Learning: In `src/Features/VR/StereoBlend.cpp` (doodlum/skyrim-community-shaders), debug modes 4/5 ("Overwrite", "Overwrite Eye1") in the `StereoBlendDebugMode` selector are NOT separate shader variants. They use the same `stereoBlendOverwriteCS` shader with different cbuffer parameters. When `vrStereoOptActive` is true, `stereoBlendOverwriteCS` is always selected unconditionally and `StereoBlendDebugMode` values 4/5 do not map to any distinct code path.
Learnt from: jiayev
Repo: doodlum/skyrim-community-shaders PR: 1947
File: features/Dynamic Cubemaps/Shaders/DynamicCubemaps/DynamicCubemaps.hlsli:107-115
Timestamp: 2026-03-14T08:35:02.326Z
Learning: In `features/Dynamic Cubemaps/Shaders/DynamicCubemaps/DynamicCubemaps.hlsli`, the pattern of scaling `directionalAmbientColorSpecular` by `skylightingSpecular` (via `dalcScaled`) before the final `lerp(specularIrradiance, specularIrradianceReflections, skylightingSpecular)` is intentional pre-existing behavior preserved from the original implementation. Do not flag this as double-attenuation; the combined attenuation is by design.
radianceDisocc now writes mip 0 directly to texRadianceTemp, which prefilterRadiance reads as SRV while writing the mip chain back to texRadiance. Removes the per-frame CopySubresourceRegion that was previously used to resolve the read/write hazard on texRadiance. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
5b344a7 to
fe572cd
Compare
|
✅ A pre-release build is available for this PR: |
…haders#2127) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
radianceDisoccpreviously wrote mip 0 totexRadiance, then aCopySubresourceRegioncopied it totexRadianceTempbeforeprefilterRadiancecould safely read it (avoiding the read/write hazard on the same texture). This PR eliminates that intermediate copy by havingradianceDisoccwrite directly totexRadianceTempas a UAV.texRadianceTempgains a UAV view (D3D11_BIND_UNORDERED_ACCESSreplacesD3D11_BIND_RENDER_TARGET)radianceDisoccbindstexRadianceTemp->uavat slot 0 instead oftexRadiance->uavprefilterRadiancecontinues to readtexRadianceTempas SRV and write the mip chain back totexRadiance— unchangedCopySubresourceRegioncall is removed entirelyNo behaviour change, no quality impact. Removes one full-texture GPU copy per frame.
Test plan
prefilterRadiancereceives correct mip 0 input after the change🤖 Generated with Claude Code
Summary by CodeRabbit