Skip to content

revert: "perf: optimise DeferredComposite"#2169

Closed
alandtse wants to merge 1 commit into
devfrom
revert-2150-ps-opt
Closed

revert: "perf: optimise DeferredComposite"#2169
alandtse wants to merge 1 commit into
devfrom
revert-2150-ps-opt

Conversation

@alandtse
Copy link
Copy Markdown
Collaborator

@alandtse alandtse commented Apr 21, 2026

Reverts #2150

closes #2168

Summary by CodeRabbit

  • Optimizations
    • Improved rendering performance through graphics pipeline enhancements
    • Enhanced VR stereo rendering optimization for better visual quality
    • Refined surface normal encoding for improved rendering fidelity
    • Optimized resource handling across screen-space rendering features

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough

Walkthrough

The pull request refactors deferred compositing from a graphics pipeline (vertex/pixel shaders) to a compute shader-based approach, introduces octahedral normal encoding/decoding, remaps IBL texture register bindings, and updates render target references across dependent features. The compute shader replaces full-screen triangle rendering with thread-group dispatch, uses UAVs for output, and adds VR stereo mode texture binding.

Changes

Cohort / File(s) Summary
Deferred Composite Pipeline Refactor
package/Shaders/DeferredCompositeCS.hlsl, package/Shaders/DeferredCompositeVS.hlsl, src/Deferred.cpp, src/Deferred.h
Converted deferred compositing from graphics pipeline (PS/VS with render target binding) to compute shader with numthreads(8,8,1) entry point. Replaced SRV model with UAV outputs (MainRW, NormalTAAMaskSpecularMaskRW, MotionVectorsRW) and added VR stereo mode texture binding at t16. Removed vertex shader entirely and transitioned from PSSetShaderResources to CSSetShaderResources with 16-slot SRV array. Removed D3DStateBackup and related graphics state (blend, depth-stencil, rasterizer) management. Updated shader cache to track compute shaders instead of PS/VS.
Normal Encoding Algorithm Update
package/Shaders/Common/GBuffer.hlsli, package/Shaders/Tests/TestGBuffer.hlsl
Replaced standard normal encoding with octahedral encoding using half precision. Added OctWrap(half2) helper, new EncodeNormal(half3)->half2 with octahedral reprojection and wrapping, and DecodeNormal(half2)->half3 with reconstruction. Introduced EncodeNormalVanilla(half3)->half2 preserving prior math. Updated test suite to validate octahedral encoding roundtrip with half precision, added TestOctWrap and TestVanillaNormalEncoding.
IBL Texture Register Remapping
features/IBL/Shaders/IBL/IBL.hlsli
Shifted deferred-path texture register bindings: EnvIBLTexture moved from t15 to t14, SkyIBLTexture moved from t16 to t15 under IBL_DEFERRED conditional.
Render Target Reference Updates
src/Features/ScreenSpaceGI.cpp, src/Features/SubsurfaceScattering.cpp
Updated render target access from globals::deferred->normalRoughnessRT to NORMALROUGHNESS macro in SSGI and SSS compute dispatches for normal/roughness SRV binding.
Utility and Documentation
src/Utils/D3DStateBackup.h, src/Features/VRStereoOptimizations.h
Removed entire D3DStateBackup header (state capture/restore utility). Updated VRStereoOptimizations::GetModeTextureSRV() documentation to reference compute shader consumer instead of pixel shader.

Sequence Diagram(s)

sequenceDiagram
    participant AppThread as Application Thread
    participant CS as Compute Shader<br/>(DeferredCompositeCS)
    participant SRV as SRV Resources<br/>(16 slots)
    participant UAV as UAV Outputs<br/>(3 buffers)
    participant CB as Constant Buffer<br/>(Per-frame)

    AppThread->>AppThread: Compute dispatchCount<br/>from texture dimensions
    AppThread->>CS: Bind compute shader
    AppThread->>CB: Set per-frame & VR CB
    AppThread->>SRV: Bind 16 SRV slots<br/>(Depth, Normals, etc.)
    AppThread->>SRV: Optionally bind VR<br/>mode texture at t16
    AppThread->>UAV: Bind MainRW (u0)<br/>NormalTAAMask (u1)<br/>MotionVectors (u2)
    
    AppThread->>CS: Dispatch compute<br/>numthreads(8,8,1)<br/>with dispatchCount
    
    activate CS
    CS->>SRV: Read dispatchID.xy<br/>from SRV resources
    CS->>CS: Check stereo eye<br/>early-out if needed
    CS->>CS: Sample depth, albedo,<br/>normals, SSGI, etc.
    CS->>CS: Compute deferred<br/>lighting & color
    CS->>UAV: Write color to MainRW
    CS->>UAV: Write normal-encoded &<br/>mask to NormalTAAMask
    CS->>UAV: Write motion vectors<br/>to MotionVectorsRW
    deactivate CS
    
    AppThread->>AppThread: Unbind mode texture SRV
    AppThread->>AppThread: Clear SRV/UAV bindings
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • davo0411
  • doodlum

Poem

🐰 Compute threads now dance where shaders once ran,
Octahedrons twist normals in a cunning plan,
From vertex and pixel to dispatch we leapt,
UAVs replace targets where data is swept,
A rabbit's delight—such refactoring feat!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately reflects the main change: reverting a previous performance optimization commit (PR #2150) for DeferredComposite.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch revert-2150-ps-opt

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/Deferred.cpp (1)

135-155: Remove the unused descriptor block.

These descriptors are populated but never used to create or assign a resource, so the block is dead code and makes SetupResources() look incomplete.

♻️ Proposed cleanup
-	{
-		D3D11_TEXTURE2D_DESC texDesc;
-		auto mainTex = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN];
-		mainTex.texture->GetDesc(&texDesc);
-
-		texDesc.Format = DXGI_FORMAT_R11G11B10_FLOAT;
-		texDesc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
-
-		D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {
-			.Format = texDesc.Format,
-			.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D,
-			.Texture2D = {
-				.MostDetailedMip = 0,
-				.MipLevels = 1 }
-		};
-		D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {
-			.Format = texDesc.Format,
-			.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D,
-			.Texture2D = { .MipSlice = 0 }
-		};
-	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Deferred.cpp` around lines 135 - 155, Remove the unused descriptor block
in SetupResources: the temporary variables mainTex, texDesc, srvDesc and uavDesc
are populated but never used to create views or assign resources, so delete that
entire block (the braces containing the D3D11_TEXTURE2D_DESC modification and
the D3D11_SHADER_RESOURCE_VIEW_DESC / D3D11_UNORDERED_ACCESS_VIEW_DESC
initializations). If you intend to actually create SRV/UAVs instead of removing
it, replace the block with proper calls that use texDesc/ srvDesc/ uavDesc
(e.g., CreateShaderResourceView/CreateUnorderedAccessView) and assign the
resulting views to the renderer/runtime structures referenced by
renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN].
package/Shaders/Tests/TestGBuffer.hlsl (1)

34-110: Tighten the new tests so they catch direction regressions.

Line 57 only proves decoded is unit length; it does not prove it still points near original. The OctWrap checks are also broad enough for many incorrect formulas to pass.

🧪 Proposed test-strengthening sketch
 		// 2. Decoded normal is normalized (unit length)
 		half length = sqrt(decoded.x * decoded.x + decoded.y * decoded.y + decoded.z * decoded.z);
 		ASSERT(IsTrue, abs(length - 1.0h) < 0.02h);  // Relaxed tolerance for half precision
+
+		// 3. Decoded normal still points in the original direction
+		ASSERT(IsTrue, dot(original, decoded) > 0.98h);
 	}
 }
@@
-	half2 v1 = half2(0.5h, 0.5h);
+	half2 v1 = half2(0.8h, 0.3h);
 	half2 wrapped1 = GBuffer::OctWrap(v1);
-	ASSERT(IsTrue, wrapped1.x >= 0.0h && wrapped1.x <= 1.0h);
-	ASSERT(IsTrue, wrapped1.y >= 0.0h && wrapped1.y <= 1.0h);
+	ASSERT(IsTrue, abs(wrapped1.x - 0.7h) < 0.01h);
+	ASSERT(IsTrue, abs(wrapped1.y - 0.2h) < 0.01h);
@@
-	half2 v2 = half2(-0.3h, 0.7h);
+	half2 v2 = half2(-0.8h, 0.3h);
 	half2 wrapped2 = GBuffer::OctWrap(v2);
-	ASSERT(IsTrue, wrapped2.x >= -1.0h && wrapped2.x <= 1.0h);
-	ASSERT(IsTrue, wrapped2.y >= -1.0h && wrapped2.y <= 1.0h);
+	ASSERT(IsTrue, abs(wrapped2.x + 0.7h) < 0.01h);
+	ASSERT(IsTrue, abs(wrapped2.y - 0.2h) < 0.01h);
@@
-	half2 v3 = half2(0.2h, -0.8h);
+	half2 v3 = half2(0.8h, -0.3h);
 	half2 wrapped3 = GBuffer::OctWrap(v3);
-	ASSERT(IsTrue, wrapped3.x >= -1.0h && wrapped3.x <= 1.0h);
-	ASSERT(IsTrue, wrapped3.y >= -1.0h && wrapped3.y <= 1.0h);
+	ASSERT(IsTrue, abs(wrapped3.x - 0.7h) < 0.01h);
+	ASSERT(IsTrue, abs(wrapped3.y + 0.2h) < 0.01h);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package/Shaders/Tests/TestGBuffer.hlsl` around lines 34 - 110, The tests
currently only assert normalization/range but not directional fidelity; in
TestNormalEncodingAngledNormals after DecodeNormal(encoded) add a directional
check comparing decoded to original using dot(original, decoded) and assert it
exceeds a relaxed threshold (e.g., > 0.95) so regressions that flip or rotate
the normal are caught (refer to TestNormalEncodingAngledNormals,
GBuffer::EncodeNormal, GBuffer::DecodeNormal). For TestOctWrap tighten checks in
TestOctWrap: assert outputs are within [-1,1] (already present) but also assert
sign/hemisphere preservation for inputs (e.g., if input.x >= 0 then wrapped.x >=
0, same for y) and that magnitude/clamping behavior is consistent (e.g.,
abs(wrapped) <= 1.0) to rule out incorrect wrapping semantics (refer to
TestOctWrap and GBuffer::OctWrap). For TestVanillaNormalEncoding add a
directional sanity check for EncodeNormalVanilla outputs by reconstructing or
comparing to expected encoded values for canonical axes (upNormal and
axis-aligned normals) with tighter thresholds (e.g., abs(encoded - expected) <
0.1) referencing TestVanillaNormalEncoding and GBuffer::EncodeNormalVanilla.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@package/Shaders/DeferredCompositeCS.hlsl`:
- Around line 333-334: The normal is being encoded with
GBuffer::EncodeNormalVanilla when downstream decoding uses the octahedral
GBuffer::DecodeNormal; update the write to use the matching octahedral encoder
(e.g. GBuffer::EncodeNormal) for NormalTAAMaskSpecularMaskRW[dispatchID.xy] so
the stored normal pairs correctly with GBuffer::DecodeNormal and downstream
consumers get correct normals.

---

Nitpick comments:
In `@package/Shaders/Tests/TestGBuffer.hlsl`:
- Around line 34-110: The tests currently only assert normalization/range but
not directional fidelity; in TestNormalEncodingAngledNormals after
DecodeNormal(encoded) add a directional check comparing decoded to original
using dot(original, decoded) and assert it exceeds a relaxed threshold (e.g., >
0.95) so regressions that flip or rotate the normal are caught (refer to
TestNormalEncodingAngledNormals, GBuffer::EncodeNormal, GBuffer::DecodeNormal).
For TestOctWrap tighten checks in TestOctWrap: assert outputs are within [-1,1]
(already present) but also assert sign/hemisphere preservation for inputs (e.g.,
if input.x >= 0 then wrapped.x >= 0, same for y) and that magnitude/clamping
behavior is consistent (e.g., abs(wrapped) <= 1.0) to rule out incorrect
wrapping semantics (refer to TestOctWrap and GBuffer::OctWrap). For
TestVanillaNormalEncoding add a directional sanity check for EncodeNormalVanilla
outputs by reconstructing or comparing to expected encoded values for canonical
axes (upNormal and axis-aligned normals) with tighter thresholds (e.g.,
abs(encoded - expected) < 0.1) referencing TestVanillaNormalEncoding and
GBuffer::EncodeNormalVanilla.

In `@src/Deferred.cpp`:
- Around line 135-155: Remove the unused descriptor block in SetupResources: the
temporary variables mainTex, texDesc, srvDesc and uavDesc are populated but
never used to create views or assign resources, so delete that entire block (the
braces containing the D3D11_TEXTURE2D_DESC modification and the
D3D11_SHADER_RESOURCE_VIEW_DESC / D3D11_UNORDERED_ACCESS_VIEW_DESC
initializations). If you intend to actually create SRV/UAVs instead of removing
it, replace the block with proper calls that use texDesc/ srvDesc/ uavDesc
(e.g., CreateShaderResourceView/CreateUnorderedAccessView) and assign the
resulting views to the renderer/runtime structures referenced by
renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN].
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 98e9ffe8-d168-4eae-8c85-c9e14364dc6a

📥 Commits

Reviewing files that changed from the base of the PR and between 359e9bf and d49ef4e.

📒 Files selected for processing (11)
  • features/IBL/Shaders/IBL/IBL.hlsli
  • package/Shaders/Common/GBuffer.hlsli
  • package/Shaders/DeferredCompositeCS.hlsl
  • package/Shaders/DeferredCompositeVS.hlsl
  • package/Shaders/Tests/TestGBuffer.hlsl
  • src/Deferred.cpp
  • src/Deferred.h
  • src/Features/ScreenSpaceGI.cpp
  • src/Features/SubsurfaceScattering.cpp
  • src/Features/VRStereoOptimizations.h
  • src/Utils/D3DStateBackup.h
💤 Files with no reviewable changes (2)
  • package/Shaders/DeferredCompositeVS.hlsl
  • src/Utils/D3DStateBackup.h

Comment on lines +333 to +334
MainRW[dispatchID.xy] = float4(color, 1.0);
NormalTAAMaskSpecularMaskRW[dispatchID.xy] = float4(GBuffer::EncodeNormalVanilla(normalVS), 0.0, 0.0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use the matching normal encoder for this output.

Line 113 decodes with octahedral GBuffer::DecodeNormal, but Line 334 writes with GBuffer::EncodeNormalVanilla, which is not the inverse encoding. This will produce incorrectly decoded normals for consumers expecting the updated GBuffer format.

🐛 Proposed fix
 	MainRW[dispatchID.xy] = float4(color, 1.0);
-	NormalTAAMaskSpecularMaskRW[dispatchID.xy] = float4(GBuffer::EncodeNormalVanilla(normalVS), 0.0, 0.0);
+	NormalTAAMaskSpecularMaskRW[dispatchID.xy] = float4(GBuffer::EncodeNormal(normalVS), 0.0, 0.0);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
MainRW[dispatchID.xy] = float4(color, 1.0);
NormalTAAMaskSpecularMaskRW[dispatchID.xy] = float4(GBuffer::EncodeNormalVanilla(normalVS), 0.0, 0.0);
MainRW[dispatchID.xy] = float4(color, 1.0);
NormalTAAMaskSpecularMaskRW[dispatchID.xy] = float4(GBuffer::EncodeNormal(normalVS), 0.0, 0.0);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package/Shaders/DeferredCompositeCS.hlsl` around lines 333 - 334, The normal
is being encoded with GBuffer::EncodeNormalVanilla when downstream decoding uses
the octahedral GBuffer::DecodeNormal; update the write to use the matching
octahedral encoder (e.g. GBuffer::EncodeNormal) for
NormalTAAMaskSpecularMaskRW[dispatchID.xy] so the stored normal pairs correctly
with GBuffer::DecodeNormal and downstream consumers get correct normals.

@doodlum doodlum closed this Apr 21, 2026
@alandtse alandtse reopened this Apr 29, 2026
@github-actions
Copy link
Copy Markdown

Actionable Suggestions

  • Subsurface Scattering (Alan Tse): Needs version bump to 3-0-2

@alandtse alandtse closed this Apr 29, 2026
@github-actions
Copy link
Copy Markdown

✅ A pre-release build is available for this PR:
Download

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Certain VR textures shift with hmd before settling in correct location

2 participants