revert: "perf: optimise DeferredComposite (#2150)" #2232

Merged
alandtse merged 1 commit into community-shaders:dev from alandtse:fix/revert-2150-fresh on Apr 30, 2026
Conversation

Collaborator

@alandtse alandtse commented Apr 29, 2026

This reverts commit 7f64e55.

closes #2223

Summary by CodeRabbit

  • Performance & Optimization

    • Refactored deferred rendering pipeline to use compute-based processing for improved efficiency.
    • Optimized normal encoding with enhanced precision handling.
  • Refactor

    • Updated shader architecture to streamline rendering pass implementation and reduce state management overhead.

Contributor

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

This PR converts the deferred composite pipeline from a graphics-based approach (vertex/pixel shaders) to a compute shader architecture, refactors normal encoding from sqrt-based to octahedral schemes, updates dependent systems to reference the new normal-roughness render target, and removes obsolete state management utilities.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Normal Encoding Refactor<br/>package/Shaders/Common/GBuffer.hlsli, package/Shaders/Tests/TestGBuffer.hlsl | Replaces sqrt-based normal encoding with octahedral encoding via new OctWrap and updated EncodeNormal/DecodeNormal; adds EncodeNormalVanilla as alternate path; converts types from float to half; updates test suite with new test functions and half-precision tolerance adjustments. |
| Deferred Composite Pipeline Conversion<br/>package/Shaders/DeferredCompositeCS.hlsl, package/Shaders/DeferredCompositeVS.hlsl, src/Deferred.cpp, src/Deferred.h | Converts deferred composite from graphics pipeline (VS/PS, render targets, fixed-function states) to compute shader with [numthreads(8,8,1)] dispatch; reorganizes resource bindings to SRVs and UAVs; integrates stereo handling via StereoOptModeTexture; eliminates vertex shader entirely; updates shader caching and rendering calls. |
| Normal Texture Binding Updates<br/>src/Features/ScreenSpaceGI.cpp, src/Features/SubsurfaceScattering.cpp | Redirects normal/roughness texture SRV bindings from globals::deferred->normalRoughnessRT to rts[NORMALROUGHNESS] render-target index across SSGI and SSS compute passes. |
| Documentation & Cleanup<br/>src/Features/VRStereoOptimizations.h, src/Utils/D3DStateBackup.h | Updates Doxygen comment to reference compute shader instead of pixel shader; removes D3DStateBackup state-snapshot utility struct no longer needed by compute pipeline. |
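
As a rough illustration of the octahedral scheme named in the Normal Encoding Refactor row, the C++ sketch below ports the commonly used octahedral mapping to the CPU. It is not the code from GBuffer.hlsli: the names OctWrap, EncodeNormal and DecodeNormal are borrowed from the summary, while `Vec2`, `Vec3`, `SignNotZero` and `RoundTripCosine` are hypothetical helpers, and everything runs in full float precision rather than half.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

inline float SignNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Fold the lower hemisphere outward across the diagonals of the octahedron.
inline Vec2 OctWrap(Vec2 v) {
    return { (1.0f - std::fabs(v.y)) * SignNotZero(v.x),
             (1.0f - std::fabs(v.x)) * SignNotZero(v.y) };
}

// Map a non-zero normal to two values in [0, 1].
inline Vec2 EncodeNormal(Vec3 n) {
    const float l1 = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
    assert(l1 > 0.0f);  // a zero vector has no direction to encode
    Vec2 p{ n.x / l1, n.y / l1 };  // project onto the octahedron |x|+|y|+|z| = 1
    if (n.z < 0.0f) p = OctWrap(p);
    return { p.x * 0.5f + 0.5f, p.y * 0.5f + 0.5f };
}

// Inverse mapping: undo the fold, then renormalize.
inline Vec3 DecodeNormal(Vec2 e) {
    const Vec2 f{ e.x * 2.0f - 1.0f, e.y * 2.0f - 1.0f };
    Vec3 n{ f.x, f.y, 1.0f - std::fabs(f.x) - std::fabs(f.y) };
    const float t = std::fmax(-n.z, 0.0f);  // non-zero only for the lower hemisphere
    n.x += n.x >= 0.0f ? -t : t;
    n.y += n.y >= 0.0f ? -t : t;
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}

// Cosine between a unit-length normal and its encode/decode round trip.
inline float RoundTripCosine(Vec3 n) {
    const Vec3 d = DecodeNormal(EncodeNormal(n));
    return d.x * n.x + d.y * n.y + d.z * n.z;
}
```

Calling `RoundTripCosine` on a unit vector in the lower hemisphere, such as the normalized (1, 1, -1), exercises the fold path; a value close to 1 indicates the direction survived the round trip.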

Sequence Diagram(s)

sequenceDiagram
    participant Deferred as Deferred::Render
    participant Dispatch as GPU Dispatch
    participant ComputeShader as DeferredCompositeCS
    participant SRVs as Input SRVs<br/>(Albedo, Normal, Depth, etc.)
    participant UAVs as Output UAVs<br/>(Main, Normals, Motion)

    Deferred->>Deferred: Bind SRVs (Albedo, NormalRoughness, Masks, Depth, etc.)
    Deferred->>Deferred: Bind StereoOptModeTexture (optional)
    Deferred->>Deferred: Bind UAVs (Main, NormalMask, MotionVectors)
    Deferred->>Dispatch: Dispatch(screen width/8, screen height/8, 1)
    Dispatch->>ComputeShader: Launch threads (dispatchID)
    ComputeShader->>SRVs: Sample per-pixel data (albedo, normal, depth)
    alt Depth == 1.0
        ComputeShader->>ComputeShader: Generate sky motion vectors
    else Standard
        ComputeShader->>ComputeShader: Compute standard lighting
    end
    ComputeShader->>ComputeShader: Decode octahedral normal
    ComputeShader->>ComputeShader: Sample SSGI, reflections, sky
    ComputeShader->>UAVs: Write Main (final color)
    ComputeShader->>UAVs: Write NormalTAAMaskSpecular
    ComputeShader->>UAVs: Write MotionVectors
    Deferred->>Deferred: Unbind compute resources
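
The Dispatch step in the diagram divides the screen size by 8 to match [numthreads(8,8,1)]. The sketch below shows one way the group counts could be computed on the C++ side, assuming a round-up division so that resolutions that are not multiples of 8 are still fully covered. Whether src/Deferred.cpp rounds up is not shown in this PR page, and `GroupCount`, `DispatchSize` and `ComputeDispatchSize` are hypothetical names.

```cpp
#include <cstdint>

// Threads per group along X and Y, matching [numthreads(8,8,1)].
constexpr std::uint32_t kThreadsPerAxis = 8;

// Round-up integer division: the number of groups needed to cover `pixels`.
constexpr std::uint32_t GroupCount(std::uint32_t pixels,
                                   std::uint32_t threads = kThreadsPerAxis) {
    return (pixels + threads - 1) / threads;
}

struct DispatchSize { std::uint32_t x, y, z; };

// Group counts for a full-screen pass; Z stays 1 for a 2D image.
constexpr DispatchSize ComputeDispatchSize(std::uint32_t width, std::uint32_t height) {
    return { GroupCount(width), GroupCount(height), 1 };
}

static_assert(GroupCount(1920) == 240, "exact multiple of 8");
static_assert(GroupCount(1366) == 171, "rounds up to cover the last 6 pixels");
```

With a round-up, the last group along an axis can contain threads that fall outside the image, so the shader side would need to skip those threads before reading or writing.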

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Sssr #2156 — Modifies DeferredCompositeCS.hlsl shader interface and SSRT/specular bindings alongside this compute-conversion refactor.
  • perf: optimise DeferredComposite #2150 — Directly conflicts/modifies the same normal encoding functions (Encode/Decode) in GBuffer.hlsli with an alternative analytic mapping.
  • perf: optimise ssgi normal #2189 — Alters ScreenSpaceGI normal texture handling in tandem with normal-roughness render-target changes.

Suggested reviewers

  • davo0411

Poem

🐰 From pixel shades we hop away,
To compute threads that light the day,
Octahedral normals encode so neat,
Dispatch by dispatch, our render's complete!
No vertices now—just UAVs write,
A shader reborn in parallel light! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes a reversion of commit 7f64e55 from PR #2150, matching the PR's stated objective to revert that specific optimization. |
| Linked Issues check | ✅ Passed | The PR reverts changes from #2150 to fix broken VR reflections (#2223). The code changes undo the compute-shader-based deferred compositing and restore prior implementations, directly addressing the reflection breakage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to reverting the deferred composite optimization. Modifications to shader files, C++ implementation, and header definitions all undo the compute-shader refactoring from #2150. |


@github-actions

Actionable Suggestions

  • Subsurface Scattering (Alan Tse): Needs version bump to 3-0-2

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@package/Shaders/Tests/TestGBuffer.hlsl`:
- Around line 37-59: The tests currently only assert encoded range and decoded
unit length for the sample normals (testNormals) but miss verifying direction
fidelity; update the loop that uses GBuffer::EncodeNormal and
GBuffer::DecodeNormal to also compute the dot product between original and
decoded (e.g., dot(original, decoded)) and assert it exceeds a relaxed threshold
(choose ~0.90–0.97 depending on half precision tolerance) to catch
mirrored/reflected results; add this cosine/dot assertion alongside the existing
length check for variables original and decoded.

In `@src/Deferred.cpp`:
- Around line 356-373: The composite shader reads DepthTexture unguarded in
DeferredCompositeCS but srvs[16] leaves slot t4 null on builds without
dynamicCubemaps, causing bad depth reads; fix by binding the depth SRV into
srvs[4] unconditionally (use Util::GetCurrentSceneDepthSRV(true) for t4) instead
of only when dynamicCubemaps.loaded || REL::Module::IsVR(), and ensure the srvs
array still gracefully uses nullptr for optional textures (reflectance,
envTexture, skylighting, etc.) so resource lifetimes and DX11 binding remain
safe when features are absent.
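
The binding pattern described in the src/Deferred.cpp comment can be sketched as follows: a fixed table of 16 slots where the depth view in t4 is always filled and optional views may remain null. This is only an illustration under stated assumptions. `FakeSRV` stands in for the real D3D11 view type, `BuildSrvTable` is a hypothetical helper, and the slot chosen for the optional texture is made up; only the table size and the t4 depth slot come from the review comment.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for a shader resource view; the real code would hold D3D11 view pointers.
struct FakeSRV { const char* name; };

constexpr std::size_t kSlotCount = 16;           // srvs[16] in the review comment
constexpr std::size_t kDepthSlot = 4;            // t4, read unguarded by the shader
constexpr std::size_t kHypotheticalEnvSlot = 6;  // made-up slot for an optional texture

// Build the table: depth is mandatory, optional inputs may be nullptr.
inline std::array<const FakeSRV*, kSlotCount> BuildSrvTable(const FakeSRV* depth,
                                                            const FakeSRV* optionalEnv) {
    assert(depth != nullptr);  // the shader always reads depth, so it must be bound
    std::array<const FakeSRV*, kSlotCount> srvs{};  // value-initialized to nullptr
    srvs[kDepthSlot] = depth;                  // bound regardless of feature state
    srvs[kHypotheticalEnvSlot] = optionalEnv;  // stays nullptr when the feature is absent
    return srvs;
}
```

Keeping the mandatory slot outside any feature check means a missing optional feature cannot leave the depth read unbound.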

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5b1b00ef-be67-4da7-9759-36c316185e2a

📥 Commits

Reviewing files that changed from the base of the PR and between 651426e and dc41444.

📒 Files selected for processing (10)
  • package/Shaders/Common/GBuffer.hlsli
  • package/Shaders/DeferredCompositeCS.hlsl
  • package/Shaders/DeferredCompositeVS.hlsl
  • package/Shaders/Tests/TestGBuffer.hlsl
  • src/Deferred.cpp
  • src/Deferred.h
  • src/Features/ScreenSpaceGI.cpp
  • src/Features/SubsurfaceScattering.cpp
  • src/Features/VRStereoOptimizations.h
  • src/Utils/D3DStateBackup.h
💤 Files with no reviewable changes (2)
  • package/Shaders/DeferredCompositeVS.hlsl
  • src/Utils/D3DStateBackup.h

Comment on lines +37 to 59
 	// Test behavioral properties of octahedral encoding (not exact numerical accuracy)
 	// Half precision + quantization means we check: valid output, normalized, reasonable direction
 	half3 testNormals[4] = {
 		normalize(half3(1.0h, 1.0h, 1.0h)),
 		normalize(half3(-1.0h, 1.0h, 1.0h)),
 		normalize(half3(1.0h, -1.0h, 1.0h)),
 		normalize(half3(1.0h, 1.0h, -1.0h))
 	};
 
 	for (int i = 0; i < 4; i++) {
-		float3 original = testNormals[i];
-		float2 encoded = GBuffer::EncodeNormal(original);
-		float3 decoded = GBuffer::DecodeNormal(encoded);
+		half3 original = testNormals[i];
+		half2 encoded = GBuffer::EncodeNormal(original);
+		half3 decoded = GBuffer::DecodeNormal(encoded);
 
 		// Check behavioral properties (relaxed for half precision quantization):
 		// 1. Encoded values are in valid range [0, 1]
 		ASSERT(IsTrue, encoded.x >= 0.0h && encoded.x <= 1.0h);
 		ASSERT(IsTrue, encoded.y >= 0.0h && encoded.y <= 1.0h);
 
-		float length = sqrt(decoded.x * decoded.x + decoded.y * decoded.y + decoded.z * decoded.z);
-		ASSERT(IsTrue, abs(length - 1.0) < 0.05);
 		// 2. Decoded normal is normalized (unit length)
+		half length = sqrt(decoded.x * decoded.x + decoded.y * decoded.y + decoded.z * decoded.z);
+		ASSERT(IsTrue, abs(length - 1.0h) < 0.02h);  // Relaxed tolerance for half precision
 	}

⚠️ Potential issue | 🟡 Minor

Keep a direction check for the diagonal roundtrip.

This now only proves “valid range + unit length.” A broken octahedral fold/sign path can still return a normalized but mirrored vector, so these angled cases would pass while reflections drift. Add a cosine/dot threshold against original.

🧪 Suggested assertion
 	for (int i = 0; i < 4; i++) {
 		half3 original = testNormals[i];
 		half2 encoded = GBuffer::EncodeNormal(original);
 		half3 decoded = GBuffer::DecodeNormal(encoded);
 
 		// Check behavioral properties (relaxed for half precision quantization):
 		// 1. Encoded values are in valid range [0, 1]
 		ASSERT(IsTrue, encoded.x >= 0.0h && encoded.x <= 1.0h);
 		ASSERT(IsTrue, encoded.y >= 0.0h && encoded.y <= 1.0h);
+
+		// 1b. Decoded direction should still match the source normal closely.
+		ASSERT(IsTrue, dot(decoded, original) > 0.98h);
 
 		// 2. Decoded normal is normalized (unit length)
 		half length = sqrt(decoded.x * decoded.x + decoded.y * decoded.y + decoded.z * decoded.z);
 		ASSERT(IsTrue, abs(length - 1.0h) < 0.02h);  // Relaxed tolerance for half precision
 	}
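
To make the gap concrete, the C++ sketch below simulates a decode that flips the sign of one component. The result still has unit length, so a length-only test accepts it, while a dot-product test rejects it. All names here are hypothetical; the 0.02 and 0.98 defaults simply mirror the tolerances in the suggested assertion.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float Length(Vec3 v) { return std::sqrt(Dot(v, v)); }

// Mirrors the existing assertion: is the decoded vector approximately unit length?
inline bool PassesLengthCheck(Vec3 decoded, float tolerance = 0.02f) {
    return std::fabs(Length(decoded) - 1.0f) < tolerance;
}

// Mirrors the suggested assertion: does the decoded vector point the same way?
inline bool PassesDirectionCheck(Vec3 original, Vec3 decoded, float minCosine = 0.98f) {
    assert(minCosine > -1.0f && minCosine <= 1.0f);  // must be a meaningful cosine bound
    return Dot(original, decoded) > minCosine;
}

// Simulates a broken sign path in a decoder by mirroring the X component.
inline Vec3 MirrorX(Vec3 v) { return { -v.x, v.y, v.z }; }
```

For the normalized (1, 1, 1) test normal, the mirrored vector has a cosine of 1/3 against the original, so `PassesLengthCheck` returns true while `PassesDirectionCheck` returns false.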

Comment thread src/Deferred.cpp
@github-actions

✅ A pre-release build is available for this PR:
Download

@alandtse
Collaborator Author

the feature audit is spurious. Need to teach it about reverts.

@alandtse alandtse merged commit 1e3b6fa into community-shaders:dev Apr 30, 2026
14 of 15 checks passed
IgorAlanAlbuquerque pushed a commit to IgorAlanAlbuquerque/skyrim-community-shaders that referenced this pull request May 29, 2026
Development

Successfully merging this pull request may close these issues.

Rubber banding reflections

2 participants