perf(vr): SSS resolution-independent sample count#2219
Conversation
📝 WalkthroughWalkthroughDisable precision offset for VR builds; move DynamicRes scaling into UV construction and clamp/remap VR depth reads; floor depth-thickness; change VR sample-count path to return Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application (CPU)
participant VR as VR Runtime / Stereo
participant Shader as ScreenSpaceShadow Shader
participant Depth as Depth Buffer
App->>VR: Query VR build flag / stereo params
App->>App: GetScaledSampleCount() — if VR return SampleCount*64
App->>Shader: Dispatch raymarch with parameters (UsePrecisionOffset=false for VR)
Shader->>Depth: Sample depth using constructed UVs (DynamicRes applied)
Depth-->>Shader: Depth value (clamped/remapped to FarDepthValue when out-of-eye or zero)
Shader->>Shader: Compute depth_thickness_scale = max(diff, 1e-4)
Shader-->>App: WriteScreenSpaceShadow results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
No actionable suggestions for changed features. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Features/ScreenSpaceShadows.cpp (1)
28-30:⚠️ Potential issue | 🟡 MinorUpdate the stale slider tooltip — it no longer "adapts to render resolution".
After this PR,
GetScaledSampleCount()returns a flatSampleCount * 64independent of resolution. The tooltip on line 30 still claims "Adapts to render resolution," which contradicts the new behavior and will mislead users tuning shadow reach in VR/4K.📝 Proposed tooltip update
ImGui::SliderInt("Sample Count Multiplier", (int*)&bendSettings.SampleCount, 1, 4); if (auto _tt = Util::HoverTooltipWrapper()) - ImGui::Text("Multiplier for shadow ray sample count. Higher values increase shadow reach at the cost of performance. Adapts to render resolution."); + ImGui::Text("Multiplier for shadow ray sample count. Higher values increase shadow reach (in pixels) at the cost of performance. Reach is FOV-driven, not resolution-driven.");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Features/ScreenSpaceShadows.cpp` around lines 28 - 30, The tooltip text next to ImGui::SliderInt("Sample Count Multiplier", (int*)&bendSettings.SampleCount, 1, 4) is stale — update the string inside the Util::HoverTooltipWrapper() ImGui::Text call to remove "Adapts to render resolution" and accurately describe the current behavior: that GetScaledSampleCount() now returns a flat SampleCount * 64 independent of resolution, so higher values increase shadow reach (and cost performance) regardless of render resolution; locate the tooltip in ScreenSpaceShadows.cpp near bendSettings.SampleCount and replace the sentence accordingly.
🧹 Nitpick comments (1)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli (1)
262-271: Consider usinginParameters.FarDepthValueinstead of literal1.0for both clamps.The comments on lines 262-264 and 268-269 reference
FarDepthValue, but the code hardcodes1.0. This works only because the caller (RaymarchCS.hlsl:40) setsparameters.FarDepthValue = 1. If anyone ever flips to a reversed-Z-style configuration, these clamps would silently invert (mapping HMD mask → near plane, casting strong false shadows). A symbolic reference keeps the shader self-consistent with its documented contract.♻️ Proposed change
- depths.x = coord_out_of_eye ? 1.0 : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord, 0); - depths.y = coord_offset_out_of_eye ? 1.0 : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord_with_offset, 0); + depths.x = coord_out_of_eye ? inParameters.FarDepthValue : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord, 0); + depths.y = coord_offset_out_of_eye ? inParameters.FarDepthValue : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord_with_offset, 0); // HMD mask: depth==0 is outside the visible lens area. Remap to FarDepthValue so // mask pixels do not cast false shadows. - depths.x = lerp(depths.x, 1.0, (float)(depths.x == 0)); - depths.y = lerp(depths.y, 1.0, (float)(depths.y == 0)); + depths.x = (depths.x == 0) ? inParameters.FarDepthValue : depths.x; + depths.y = (depths.y == 0) ? inParameters.FarDepthValue : depths.y;The ternary form is also slightly clearer than
lerp(x, 1.0, (float)(x == 0))and compiles to the same select on modern HLSL backends.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@features/Screen-Space` Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli around lines 262 - 271, Replace the hardcoded 1.0 literals with the shader parameter inParameters.FarDepthValue for both the cross-eye clamping and the HMD-mask remap so the shader follows its declared contract; specifically update the lines assigning depths.x and depths.y (the ternary selects that sample or FarDepthValue) and replace the lerp-based HMD mask remap with an explicit conditional select that uses inParameters.FarDepthValue when depths.x or depths.y == 0; this keeps ScreenSpaceShadows/bend_sss_gpu.hlsli consistent with the RaymarchCS parameter FarDepthValue and safe for reversed-Z configurations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/Features/ScreenSpaceShadows.cpp`:
- Around line 28-30: The tooltip text next to ImGui::SliderInt("Sample Count
Multiplier", (int*)&bendSettings.SampleCount, 1, 4) is stale — update the string
inside the Util::HoverTooltipWrapper() ImGui::Text call to remove "Adapts to
render resolution" and accurately describe the current behavior: that
GetScaledSampleCount() now returns a flat SampleCount * 64 independent of
resolution, so higher values increase shadow reach (and cost performance)
regardless of render resolution; locate the tooltip in ScreenSpaceShadows.cpp
near bendSettings.SampleCount and replace the sentence accordingly.
---
Nitpick comments:
In `@features/Screen-Space` Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli:
- Around line 262-271: Replace the hardcoded 1.0 literals with the shader
parameter inParameters.FarDepthValue for both the cross-eye clamping and the
HMD-mask remap so the shader follows its declared contract; specifically update
the lines assigning depths.x and depths.y (the ternary selects that sample or
FarDepthValue) and replace the lerp-based HMD mask remap with an explicit
conditional select that uses inParameters.FarDepthValue when depths.x or
depths.y == 0; this keeps ScreenSpaceShadows/bend_sss_gpu.hlsli consistent with
the RaymarchCS parameter FarDepthValue and safe for reversed-Z configurations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 49534c2d-d588-41b0-a3a5-8f1a9f0ba98d
📒 Files selected for processing (3)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlslfeatures/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlslisrc/Features/ScreenSpaceShadows.cpp
80c17e6 to
b579dee
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/Features/ScreenSpaceShadows.cpp (1)
82-87: Optional: add astd::max(..., 64u)floor for symmetry with the non-VR path.The non-VR path (Line 97) caps with
std::max(scaledSampleCount, 64u)to keepSAMPLE_COUNT≥WAVE_SIZE(required byREAD_COUNT (SAMPLE_COUNT / WAVE_SIZE + 2)inbend_sss_gpu.hlsliLine 185). The VR path skips this guard — fine while the slider is clamped to[1,4], but a malformed JSON withSampleCount = 0would compile a shader withSAMPLE_COUNT = 0. The non-VR path is hardened against this, the VR path isn't.♻️ Proposed fix
if (globals::game::isVR) { // In VR, SAMPLE_COUNT is a pixel-space ray length that is FOV-driven, not resolution-driven. // Resolution-scaling produced 2-8x excess samples at VR resolutions with no quality benefit. // WAVE_SIZE (64) alignment is required for correct Bend READ_COUNT computation. - return bendSettings.SampleCount * 64; + return std::max(bendSettings.SampleCount * 64u, 64u); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Features/ScreenSpaceShadows.cpp` around lines 82 - 87, The VR branch returns SAMPLE_COUNT derived from bendSettings.SampleCount without a floor, so a malformed value (e.g. 0) can produce SAMPLE_COUNT < WAVE_SIZE and break READ_COUNT math; update the VR return to mirror the non‑VR guard by returning std::max(bendSettings.SampleCount * 64u, 64u) (or equivalent) so SAMPLE_COUNT is always at least WAVE_SIZE; refer to globals::game::isVR, bendSettings.SampleCount, SAMPLE_COUNT, WAVE_SIZE and the READ_COUNT rule in bend_sss_gpu.hlsli when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/Features/ScreenSpaceShadows.cpp`:
- Around line 82-87: The VR branch returns SAMPLE_COUNT derived from
bendSettings.SampleCount without a floor, so a malformed value (e.g. 0) can
produce SAMPLE_COUNT < WAVE_SIZE and break READ_COUNT math; update the VR return
to mirror the non‑VR guard by returning std::max(bendSettings.SampleCount * 64u,
64u) (or equivalent) so SAMPLE_COUNT is always at least WAVE_SIZE; refer to
globals::game::isVR, bendSettings.SampleCount, SAMPLE_COUNT, WAVE_SIZE and the
READ_COUNT rule in bend_sss_gpu.hlsli when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0f983263-7f1d-415a-b8bd-571627b1ab91
📒 Files selected for processing (3)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlslfeatures/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlslisrc/Features/ScreenSpaceShadows.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlsl
b579dee to
9c553fa
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Features/ScreenSpaceShadows.cpp (1)
28-30:⚠️ Potential issue | 🟡 MinorUpdate slider tooltip — “Adapts to render resolution” is no longer accurate in VR.
With the new
GetScaledSampleCount(), VR returns a fixedSampleCount * 64and intentionally does not scale with resolution (the rationale is in the VR branch comment). The current tooltip ("Adapts to render resolution.") will mislead VR users into thinking higher render resolutions automatically extend shadow reach. Consider clarifying that resolution-adaptive scaling only applies to flatscreen, or rewording so it is accurate in both paths.📝 Suggested wording
- ImGui::SliderInt("Sample Count Multiplier", (int*)&bendSettings.SampleCount, 1, 4); - if (auto _tt = Util::HoverTooltipWrapper()) - ImGui::Text("Multiplier for shadow ray sample count. Higher values increase shadow reach at the cost of performance. Adapts to render resolution."); + ImGui::SliderInt("Sample Count Multiplier", (int*)&bendSettings.SampleCount, 1, 4); + if (auto _tt = Util::HoverTooltipWrapper()) + ImGui::Text("Multiplier for shadow ray sample count. Higher values increase shadow reach at the cost of performance. In flatscreen, the base count also scales with render resolution; in VR, sample count is FOV-driven and resolution-independent.");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Features/ScreenSpaceShadows.cpp` around lines 28 - 30, The tooltip for the "Sample Count Multiplier" slider is misleading for VR because GetScaledSampleCount() uses a fixed VR path (SampleCount * 64) that does not adapt to render resolution; update the ImGui::Text string shown after Util::HoverTooltipWrapper() to clarify that resolution-adaptive scaling applies only to flatscreen rendering or explicitly state that in VR the multiplier maps to a fixed SampleCount * 64 so users aren’t led to expect resolution-driven changes; reference the GetScaledSampleCount() function and the bendSettings.SampleCount symbol when making the text change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/Features/ScreenSpaceShadows.cpp`:
- Around line 28-30: The tooltip for the "Sample Count Multiplier" slider is
misleading for VR because GetScaledSampleCount() uses a fixed VR path
(SampleCount * 64) that does not adapt to render resolution; update the
ImGui::Text string shown after Util::HoverTooltipWrapper() to clarify that
resolution-adaptive scaling applies only to flatscreen rendering or explicitly
state that in VR the multiplier maps to a fixed SampleCount * 64 so users aren’t
led to expect resolution-driven changes; reference the GetScaledSampleCount()
function and the bendSettings.SampleCount symbol when making the text change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 48c023b4-88cc-434c-af18-bfa015e7ea7a
📒 Files selected for processing (3)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlslfeatures/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlslisrc/Features/ScreenSpaceShadows.cpp
✅ Files skipped from review due to trivial changes (1)
- features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli
… fixes Tracy data confirmed −514 µs/frame in ScreenSpaceShadows::Prepass at VR resolution (captures: dev.tracy vs no_reproj.tracy). All three changes are reproj-independent and benefit every VR user. ## GetScaledSampleCount rewrite (ScreenSpaceShadows.cpp) Bend's SAMPLE_COUNT controls pixel-space ray length, not per-pixel quality. Shadow reach in pixels is FOV-driven: a shadow spanning N pixels at 1080p spans the same N pixels at 3000p for the same FOV and world geometry. The resolution-scaled formula produced 2-8x more samples at VR resolutions with no quality benefit. New formula: bendSettings.SampleCount * 64 (flat, WAVE_SIZE-aligned for correct Bend READ_COUNT). User-facing SampleCount slider still scales reach linearly. ## UsePrecisionOffset = false (RaymarchCS.hlsl) The precision offset applies a depth bias that causes subtle shadow shifting on camera motion. Disabled. ## bend_sss_gpu.hlsli correctness fixes Four independent fixes: - DynamicRes ordering: DynamicRes and the VR x-halve are now applied AFTER offset_xy is incorporated into the UV, so the bilinear-neighbour sample is exactly 1 texel away. Applying them before offset addition caused the neighbour offset to span ~3 px, breaking edge detection. This is the universally-correct ordering and is not VR-gated. - Cross-eye seam clamp (VR): depth reads that cross the SBS center are clamped to FarDepthValue (1.0) so rays near the seam see no occluder. Shadow weakens by ~1 pixel at the boundary but stays temporally stable. - HMD mask handling (VR): depth==0 (outside the visible lens area) remaps to FarDepthValue so mask pixels do not cast false shadows. - depth_thickness_scale clamp: max(..., 1e-4) prevents divide-by-zero when the sampled depth equals FarDepthValue. - VR write-pixel guard removed: both eye dispatches now write to the seam overlap, preventing a visible gap at the SBS boundary. ## Validation To validate: compare ScreenSpaceShadows::Prepass Tracy zone between upstream/dev and this branch with reproj OFF. Expected: ~500 µs drop at VR resolutions. Visually confirm shadow reach and stability at distant objects, shadow boundaries, and hands/weapons in first-person. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
9c553fa to
7941372
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli (1)
265-274: PreferinParameters.FarDepthValueover hardcoded1.0.The struct already parameterizes the far-plane depth via
inParameters.FarDepthValue, and existing code (e.g.,depth_thickness_scale[i] = max(abs(inParameters.FarDepthValue - depths.x), …)) consults it. The new cross-eye clamp and HMD-mask remap hardcode1.0, which is correct for the current Skyrim forward-Z convention but silently breaks ifFarDepthValueis ever reconfigured (and the in-file doc comment on line 108 still says "typically 0" from the upstream Bend defaults). Using the parameter keeps the convention assumption in one place.♻️ Proposed fix
- // Clamp cross-eye depth reads to FarDepthValue (1.0) so rays near the SBS center - // seam see no occluder at the boundary. Shadow weakens by ~1 pixel at the seam but - // stays temporally stable across camera movement. - depths.x = coord_out_of_eye ? 1.0 : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord, 0); - depths.y = coord_offset_out_of_eye ? 1.0 : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord_with_offset, 0); + // Clamp cross-eye depth reads to FarDepthValue so rays near the SBS center + // seam see no occluder at the boundary. Shadow weakens by ~1 pixel at the seam but + // stays temporally stable across camera movement. + depths.x = coord_out_of_eye ? inParameters.FarDepthValue : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord, 0); + depths.y = coord_offset_out_of_eye ? inParameters.FarDepthValue : inParameters.DepthTexture.SampleLevel(inParameters.PointBorderSampler, coord_with_offset, 0); - // HMD mask: depth==0 is outside the visible lens area. Remap to FarDepthValue so - // mask pixels do not cast false shadows. - depths.x = lerp(depths.x, 1.0, (float)(depths.x == 0)); // Stencil area - depths.y = lerp(depths.y, 1.0, (float)(depths.y == 0)); // Stencil area + // HMD mask: depth==0 is outside the visible lens area. Remap to FarDepthValue so + // mask pixels do not cast false shadows. + depths.x = (depths.x == 0) ? inParameters.FarDepthValue : depths.x; // Stencil area + depths.y = (depths.y == 0) ? inParameters.FarDepthValue : depths.y; // Stencil area🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@features/Screen-Space` Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli around lines 265 - 274, The cross-eye clamp and HMD-mask remap use a hardcoded 1.0 which should instead use the configured far-plane value; update the assignments that set depths.x and depths.y after sampling (the lines that currently use "1.0" for out-of-eye clamp and for remapping depth==0) to use inParameters.FarDepthValue so both the coord_out_of_eye/clamp and the lerp(depths.* == 0) branches reference the same FarDepthValue; keep the existing SampleLevel() calls, PointBorderSampler usage, and coord/coord_with_offset logic unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@features/Screen-Space` Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlsli:
- Around line 265-274: The cross-eye clamp and HMD-mask remap use a hardcoded
1.0 which should instead use the configured far-plane value; update the
assignments that set depths.x and depths.y after sampling (the lines that
currently use "1.0" for out-of-eye clamp and for remapping depth==0) to use
inParameters.FarDepthValue so both the coord_out_of_eye/clamp and the
lerp(depths.* == 0) branches reference the same FarDepthValue; keep the existing
SampleLevel() calls, PointBorderSampler usage, and coord/coord_with_offset logic
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5ec1c275-08a6-4371-abe7-bb4c9e0f3a0e
📒 Files selected for processing (3)
features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlslfeatures/Screen-Space Shadows/Shaders/ScreenSpaceShadows/bend_sss_gpu.hlslisrc/Features/ScreenSpaceShadows.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
- features/Screen-Space Shadows/Shaders/ScreenSpaceShadows/RaymarchCS.hlsl
- src/Features/ScreenSpaceShadows.cpp
|
✅ A pre-release build is available for this PR: |
…2219) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Three reproj-independent SSS changes that benefit every VR user. Tracy-confirmed savings of ~500 µs/frame on
ScreenSpaceShadows::Prepassat VR resolutions, plus correctness fixes for shadow stability under camera motion and at the SBS center seam.What
1.
GetScaledSampleCount— resolution-independentReplaces the area-scaled formula (
SampleCount * 60 * sqrt(currentArea / 1920×1080)) withSampleCount * 64.SAMPLE_COUNTin Bend's algorithm controls the maximum pixel-space ray length the shadow trace covers. Pixel-space shadow reach is FOV-driven, not resolution-driven — a tree branch's shadow occupies the same pixel count regardless of render resolution at the same FOV. The area-scaled formula produced ~2-3× the necessary samples at VR resolutions with no quality benefit, only GPU cost.The user-facing 1-4 multiplier slider remains the way to scale reach if anyone needs longer/shorter shadows.
2.
bend_sss_gpu.hlsli— correctness fixesDynamicResordering: previouslyread_xy *= DynamicReswas applied beforeoffset_xywas added, making the bilinear-neighbour offset span ~3 px instead of 1. Fixed to applyDynamicResafter offset addition so the neighbour samples exactly 1 texel away. (Universally correct, not VR-gated; only the bug's visible symptom was VR-specific.)FarDepthValue(1.0).max(..., 1e-4)clamp ondepth_thickness_scaleprevents divide-by-zero.3.
RaymarchCS.hlsl—UsePrecisionOffset = falseBend's optional precision offset adds a depth bias that produces subtle shadow shifting on camera motion in VR. Disabled.
Risk
bend_sss_gpu.hlslicorrectness fixes change shadow appearance at the SBS center seam and around HMD mask pixels — visually expected to improve, not regress.Validation
./BuildRelease.bat ALL.ScreenSpaceShadows::Prepassdropped 645 → 131 µs/frame (dev.tracybaseline vsno_reproj.tracywith this branch's logic).Prepassparent dropped 827 → 280 µs.Summary by CodeRabbit