perf(terrain shadows): half res and 4x faster updates #2163

Merged
alandtse merged 2 commits into dev from terrain-shadows-opt
Apr 21, 2026
Conversation

@doodlum
Collaborator

@doodlum doodlum commented Apr 20, 2026

Summary by CodeRabbit

  • Refactor
    • Improved terrain shadow pipeline: shadow calculations are aligned to a lower-resolution shadow texture and heightmap processing was optimized, yielding more consistent shadow sampling and reduced memory/processing for shadow updates.
  • Bug Fixes
    • Fixed shadow sampling mismatches that could cause subtle rendering artifacts, improving visual stability.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: eb381438-7631-4412-9ffa-1c6a4d2316b5

📥 Commits

Reviewing files that changed from the base of the PR and between e0e78f5 and ed7a7f6.

📒 Files selected for processing (1)
  • features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl
🚧 Files skipped from review as they are similar to previous changes (1)
  • features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl

📝 Walkthrough

Walkthrough

Decouples heightmap and shadow texture coordinate spaces: heightmap is resized to half resolution at load, and shader/CPU now use separate pixel dimensions and coordinates for height sampling versus shadow read/write.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Shader Coordinate Space Separation: `features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl` | Replaced single `dims`/`threadPxCoord` with `heightDims`/`shadowDims` and `heightPxCoord`/`shadowPxCoord`; height sampling uses `heightPxCoord`, shadow reads/writes and initial/final fetches use `shadowPxCoord` (including offset by `LightPxDir`). |
| Heightmap Loading & Shadow CB: `src/Features/TerrainShadows.cpp` | `LoadHeightmap()` resizes DDS heightmap to half resolution (min 1x1) with linear filtering before creating the texture. `UpdateShadow()` now sources width/height and `PxSize` from the shadow texture (`texShadowHeight`) rather than the full-res heightmap. |
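The two coordinate spaces described in the table can be illustrated with a small CPU-side sketch. This is not the PR's code: `Dims`, `Float2`, `HalfResolution`, and `ShadowToHeightPx` are hypothetical names, and the pixel-center UV convention used for the mapping is an assumption, since the thread does not show how the shader derives `heightPxCoord` from `shadowPxCoord`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper types, introduced only for this illustration.
struct Dims { std::uint32_t width; std::uint32_t height; };
struct Float2 { float x; float y; };

// Halve each axis, never going below 1x1 (the "min 1x1" clamp noted above).
Dims HalfResolution(Dims full) {
    return { std::max<std::uint32_t>(full.width / 2, 1u),
             std::max<std::uint32_t>(full.height / 2, 1u) };
}

// Map a pixel coordinate in shadow space to the matching coordinate in
// height space by going through normalized UV at pixel centers.
// Assumption: both textures cover the same world-space extent.
Float2 ShadowToHeightPx(Float2 shadowPx, Dims shadowDims, Dims heightDims) {
    assert(shadowDims.width > 0 && shadowDims.height > 0);
    const float u = (shadowPx.x + 0.5f) / static_cast<float>(shadowDims.width);
    const float v = (shadowPx.y + 0.5f) / static_cast<float>(shadowDims.height);
    return { u * static_cast<float>(heightDims.width) - 0.5f,
             v * static_cast<float>(heightDims.height) - 0.5f };
}
```

When both textures have the same size the mapping is the identity, which is why a single `dims`/`threadPxCoord` pair was enough before the two resolutions were allowed to differ.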

Sequence Diagram(s)

sequenceDiagram
    participant CPU as CPU (TerrainShadows.cpp)
    participant GPU as GPU Shader (ShadowUpdate.cs.hlsl)
    participant HeightTex as Height Texture (TexHeight)
    participant ShadowTex as RW Shadow Texture (RWTexShadowHeights)

    CPU->>HeightTex: Load DDS -> ScratchImage -> Resize (half-res)
    CPU->>HeightTex: CreateTexture with resized data
    CPU->>GPU: UpdateShadow CB (shadowDims, PxSize from ShadowTex)
    CPU->>GPU: Dispatch compute
    GPU->>HeightTex: Sample height using heightPxCoord (GetInterpolatedHeight)
    GPU->>ShadowTex: Read/Write using shadowPxCoord (initial fetch, ray-step, final write)
    GPU-->>CPU: Dispatch completes (shadow texture updated)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • alandtse
  • davo0411

Poem

🐇🌿 Half the map, a lighter hop,
Pixels shrink but shadows stop,
Heights sampled in one place,
Shadows stored in smaller space,
A rabbit cheers the speedy drop!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: a performance optimization that reduces terrain shadow resolution to half and improves update speed by 4x, which aligns with the actual code changes that reduce heightmap resolution and modify shadow texture operations. |

@doodlum doodlum changed the title from "perf: half res terrain shadows" to "perf(terrain shadows): half res and 4x faster updates" Apr 20, 2026
@doodlum
Collaborator Author

doodlum commented Apr 20, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Apr 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

No actionable suggestions for changed features.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl`:
- Around line 66-68: Update the comment block in ShadowUpdate.cs.hlsl to be
implementation-neutral: replace "Heightmap is sampled at full resolution; shadow
is written at half resolution per axis" with wording that describes two
independent coordinate spaces (heightmap sampling space vs shadow-pixel space)
and note that the CPU path may resize the height texture independently of shadow
resolution; keep the note that LightPxDir and PxSize are expressed in
shadow-pixel units and ensure the comment clarifies that sampling/resizing can
be done separately for each space.

In `@src/Features/TerrainShadows.cpp`:
- Around line 230-237: The downsampling uses linear filtering
(DirectX::TEX_FILTER_LINEAR) which can blur peak heights and cause shadow light
leaks for narrow ridges; in the block that reads rawImage / rawMeta and calls
DirectX::Resize, change the filter to TEX_FILTER_POINT (or make the filter
selectable/configurable) to preserve exact height samples, then run visual tests
on varied terrain and document the chosen filter and tradeoffs so we can revert
to linear only if performance/VRAM needs outweigh the visual cost.
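The filtering tradeoff raised in the comment above can be seen on a single row of height samples. The sketch below is a toy 2:1 downsample, not DirectXTex's `Resize`: `DownsampleFilter` and `DownsampleRow` are hypothetical names, `Average` stands in for linear filtering at exactly half resolution, and `Point` keeps the first sample of each pair (which texel a real point filter picks is an assumption).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical filter selector for this illustration only.
enum class DownsampleFilter { Point, Average };

// Halve a row of height samples by combining each pair of neighbours.
std::vector<float> DownsampleRow(const std::vector<float>& row, DownsampleFilter filter) {
    assert(row.size() % 2 == 0);  // keep the toy case simple: even length only
    std::vector<float> out;
    out.reserve(row.size() / 2);
    for (std::size_t i = 0; i + 1 < row.size(); i += 2) {
        const float a = row[i];
        const float b = row[i + 1];
        if (filter == DownsampleFilter::Point) {
            out.push_back(a);               // keep one exact sample, ignore the other
        } else {
            out.push_back(0.5f * (a + b));  // blend both samples, lowering isolated peaks
        }
    }
    return out;
}
```

For a one-texel ridge of height 100, averaging always halves the peak to 50. Point sampling keeps it at 100 when the ridge falls on a retained texel and drops it entirely otherwise, so neither filter is free of artifacts, which supports the suggestion to test on varied terrain before settling on one.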
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bf6f3469-1ce3-4a71-885c-2095ec8fc80e

📥 Commits

Reviewing files that changed from the base of the PR and between 24c22be and e0e78f5.

📒 Files selected for processing (2)
  • features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl
  • src/Features/TerrainShadows.cpp

Comment thread features/Terrain Shadows/Shaders/TerrainShadows/ShadowUpdate.cs.hlsl Outdated
Comment thread src/Features/TerrainShadows.cpp
@doodlum doodlum marked this pull request as ready for review April 20, 2026 22:43
@github-actions

✅ A pre-release build is available for this PR:
Download

Collaborator

@davo0411 davo0411 left a comment

does this also effectively separate shadow updates from overwriting the height texture itself?

@doodlum
Collaborator Author

doodlum commented Apr 21, 2026

> does this also effectively separate shadow updates from overwriting the height texture itself?

the textures were always separate. note that the height texture is downscaled too.

@alandtse alandtse merged commit 7661ba7 into dev Apr 21, 2026
20 of 28 checks passed
@doodlum
Collaborator Author

doodlum commented Apr 21, 2026

@coderabbitai why did this have an extreme negative performance impact

@coderabbitai
Contributor

coderabbitai Bot commented Apr 21, 2026

🧩 Analysis chain

Let me look at the actual code changes to diagnose the performance regression.


🏁 Script executed:

#!/bin/bash
# Read the shader file
fd "ShadowUpdate.cs.hlsl" --exec cat {}

echo "---CPP---"
fd "TerrainShadows.cpp" --exec cat {}

Length of output: 4529


The CPP output was cut off. Let me get the full file content.


🏁 Script executed:

#!/bin/bash
fd "TerrainShadows.cpp" --exec cat {}
echo "---HEADER---"
fd "TerrainShadows.h" --exec cat {}

Length of output: 12


Let me search for the files differently.


🏁 Script executed:

#!/bin/bash
# Find all terrain shadow related files
fd -i "terrain" --type f
echo "---TREE---"
tree features/Terrain* 2>/dev/null || fd -i "terrainshadow" --type f
echo "---RG---"
rg -l "UpdateShadow\|LoadHeightmap" --type cpp

Length of output: 377


🏁 Script executed:

#!/bin/bash
# Get the PR diff via GitHub CLI
gh pr diff 2163 --repo doodlum/skyrim-community-shaders

Length of output: 4856


@doodlum The most likely cause of the extreme performance regression is a dispatch/texture resolution mismatch introduced by this PR. Here's the breakdown:


Root Cause: Dispatch Count Not Scaled Down

The AI-generated summary explicitly notes: "The dispatch dimensions were not changed."

In UpdateShadow(), width/height now come from texShadowHeight->desc instead of texHeightMap->desc. These values are almost certainly also used to compute the dispatch group count (those lines weren't modified per the diff). If texShadowHeight was properly recreated at the new half resolution, then PxSize is now 1/half_res (a UV step twice as large), but the dispatch still launches as many thread groups as it did at full resolution.

In the shader, the wraparound logic in main handles threads that go out of UV range:

float2 threadUV = rawThreadUV - floor(rawThreadUV);  // wraparound

This means the excess thread groups (groups half_res..full_res-1) don't get discarded — their UV wraps around and they re-process the same shadow pixels a second time, with lerp(pastHeights, g_shadowHeight[gtid], 0.5f) blending incorrect values into already-correct outputs. This doubles both computation and UAV write traffic, while also corrupting shadow data.
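The hypothesized failure mode can be sketched in one dimension. This is only an illustration of the reasoning above, not the shader's actual logic: `GroupCount` and `WrappedPixel` are hypothetical helpers, the group size of 64 in the usage note is an assumed value, and the way a thread index becomes a UV here is a simplification of whatever `main` really does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Thread groups needed to cover `pixels` texels with `groupSize` threads each
// (ceiling division).
std::uint32_t GroupCount(std::uint32_t pixels, std::uint32_t groupSize) {
    assert(groupSize > 0);
    return (pixels + groupSize - 1) / groupSize;
}

// Texel a thread ends up touching when its UV is wrapped the same way as
// `rawThreadUV - floor(rawThreadUV)`. Assumes one thread per texel along one axis.
std::uint32_t WrappedPixel(std::uint32_t threadIndex, std::uint32_t texWidth) {
    assert(texWidth > 0);
    const float rawU = (static_cast<float>(threadIndex) + 0.5f) / static_cast<float>(texWidth);
    const float u = rawU - std::floor(rawU);  // wraparound into [0, 1)
    return static_cast<std::uint32_t>(u * static_cast<float>(texWidth));
}
```

With an assumed group size of 64, a 2048-texel axis needs `GroupCount(2048, 64)` = 32 groups while a 4096-texel axis needs 64. If the larger count were dispatched against the smaller texture, threads past the edge would wrap onto texels that were already processed, for example `WrappedPixel(4, 4)` lands on texel 0 again.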


Secondary Issue: Redundant GetDimensions Calls

Both GetInterpolatedHeight and GetInterpolatedHeightRW call GetDimensions internally, which duplicates the calls already made at the top of main. The HLSL compiler often hoists SRV GetDimensions calls, but UAV (RWTexture2D) calls are not always treated as compile-time constants on all drivers:

float GetInterpolatedHeight(float2 pxCoord, bool isVertical)
{
    uint2 dims;
    TexHeight.GetDimensions(dims.x, dims.y);  // ← redundant; also fetched in main
    ...
}

float2 GetInterpolatedHeightRW(float2 pxCoord, bool isVertical)
{
    uint2 dims;
    RWTexShadowHeights.GetDimensions(dims.x, dims.y);  // ← UAV GetDimensions per-thread
    ...
}

These dims parameters could instead be passed in as arguments using the already-computed heightDims/shadowDims from main.


Things to Verify

  1. Was texShadowHeight recreated at half-resolution? If it was NOT (i.e., its creation code wasn't updated), then texShadowHeight->desc.Width/Height still equals the original full-res, and the shadow work is entirely unchanged — giving you zero perf benefit plus the overhead of the extra coordinate mapping.

  2. Does the dispatch group count use the new width/height? If it does and texShadowHeight is half-res, the wraparound scenario above applies. If it uses a separate variable that still reads the original heightmap size, that's the mismatch.


🧠 Learnings used
Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 1966
File: package/Shaders/ISVolumetricLightingGenerateCS.hlsl:0-0
Timestamp: 2026-03-11T08:05:10.801Z
Learning: In doodlum/skyrim-community-shaders, the `sqrt()` applied to `CloudShadows::GetCloudShadowMult()` in `package/Shaders/ISVolumetricLightingGenerateCS.hlsl` is intentional. The maintainer (Dlizzio) confirmed it is used to increase cloud shadow strength at harsh angles for volumetric lighting, and should not be removed or replaced with a direct linear multiply.

Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 1950
File: package/Shaders/Lighting.hlsl:2008-2012
Timestamp: 2026-03-07T00:53:21.267Z
Learning: In doodlum/skyrim-community-shaders PR `#1950`, the maintainer (Dlizzio) confirmed that per-axis normal reorientation for triplanar-projected normals/detail in package/Shaders/Lighting.hlsl is not required for their use case; the current approach intentionally mixes samples without axis-specific swizzle/sign correction.

Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 2003
File: features/Upscaling/Shaders/Upscaling/DepthUpscalePS.hlsl:55-61
Timestamp: 2026-03-24T04:30:32.809Z
Learning: In `features/Upscaling/Shaders/Upscaling/DepthUpscalePS.hlsl` (doodlum/skyrim-community-shaders PR `#2003`), the shader is dispatched per-eye using per-eye dimensions (DstWidth/DstHeight). The UV (`input.TexCoord`) is derived from `dispatchID / DstDim`, which is already eye-local — not full SBS/packed stereo coordinates. The per-eye UV clamping logic (isRight, halfScale) operates entirely in eye-local space, and there is no stereo seam issue. Do not flag SBS seam concerns for this shader.

Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 2000
File: src/WeatherEditor/EditorWindow.cpp:1271-1274
Timestamp: 2026-03-22T18:40:49.633Z
Learning: Repo: doodlum/skyrim-community-shaders — Maintainer preference (PR `#2000` on 2026-03-22): In src/WeatherEditor/EditorWindow.cpp within EditorWindow::RenderUI, keep `resetLayout = false;` at the end of the frame (after PaletteWindow::Draw). The reset is intentionally one-shot for windows rendered that frame; closed windows are allowed to reopen with their previous geometry. Do not move the clear later or broadcast resets to closed windows.

Learnt from: alandtse
Repo: doodlum/skyrim-community-shaders PR: 1941
File: package/Shaders/Common/ShadowSampling.hlsli:160-163
Timestamp: 2026-04-03T01:48:34.867Z
Learning: In `package/Shaders/Common/ShadowSampling.hlsli` (doodlum/skyrim-community-shaders PR `#1941`), the directional shadow fade in `GetLightingShadow` is intentionally computed as `fadeStart = shadow.EndSplitDistances.y * (1.0 - FadeWindow)` where `FadeWindow = 0.1`. This anchors the fade to the last 10% of the absolute far-cascade distance (`EndSplitDistances.y`), not to the cascade span. The narrow window is deliberate so shadows dissolve only at the hard far-cutoff without prematurely fading across the whole cascade range. Do not suggest using `lerp(StartSplitDistances.y, EndSplitDistances.y, 1.0 - FadeWindow)` as a fix.

Learnt from: Dlizzio
Repo: doodlum/skyrim-community-shaders PR: 2000
File: src/WeatherEditor/EditorWindow.cpp:1239-1244
Timestamp: 2026-03-22T18:32:22.461Z
Learning: In doodlum/skyrim-community-shaders (PR `#2000`, 2026-03-22), maintainer preference: keep the default browser width logic in src/WeatherEditor/EditorWindow.cpp within EditorWindow::RenderUI — browserWidth = min(availableWidth * 0.5f, 960.0f * Util::GetUIScale()) is applied unconditionally (before checking settings.showViewport) to preserve a consistent left-rail width on first-use/reset. Do not change this to fill full width when the viewport is hidden.

Learnt from: jiayev
Repo: doodlum/skyrim-community-shaders PR: 1947
File: package/Shaders/DeferredCompositeCS.hlsl:234-243
Timestamp: 2026-03-14T08:35:42.651Z
Learning: In `package/Shaders/DeferredCompositeCS.hlsl` (doodlum/skyrim-community-shaders), the fallback non-IBL SKYLIGHTING specular path intentionally double-weights with `skylightingSpecular`: `dalcScaled` is computed as `IrradianceToGamma(IrradianceToLinear(directionalAmbientColorSpecular) * skylightingSpecular)`, and the final `finalIrradiance` is then `lerp(specularIrradiance, specularIrradianceReflections, skylightingSpecular)`. This is original engine logic preserved from before PR `#1947` and should not be flagged as a double-attenuation bug.

Learnt from: vrnord
Repo: doodlum/skyrim-community-shaders PR: 1982
File: package/Shaders/Lighting.hlsl:3089-3095
Timestamp: 2026-03-16T13:13:11.490Z
Learning: In `package/Shaders/Lighting.hlsl` (doodlum/skyrim-community-shaders), the `TREE_ANIM` alpha test path intentionally uses a hardcoded `0.1` floor check (`if (alpha < 0.1) discard;`) and `AlphaTestRefRS` rather than `SharedData::VRAlphaTestThreshold`. `VRAlphaTestThreshold` is wired into `DistantTree.hlsl` and `RunGrass.hlsl` only. The threshold wiring for `Lighting.hlsl`'s `TREE_ANIM` path is deferred to a separate VR foliage PR.

Learnt from: alandtse
Repo: doodlum/skyrim-community-shaders PR: 0
File: :0-0
Timestamp: 2025-07-05T05:20:45.823Z
Learning: In the skyrim-community-shaders repository, file deletion error handling improvements that replace existence checks and try-catch blocks with std::filesystem::remove error-code-based approaches are considered bug fixes rather than refactoring, as they address inadequate error handling and misleading log messages.
