fix(slf): VR accumulator OOB heap write#2432
Conversation
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: YtzyFvra <59631290+YtzyFvra@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: ParticleTroned <248299730+ParticleTroned@users.noreply.github.com>
Co-authored-by: jiayev <jiayev1@gmail.com> Co-authored-by: ParticleTroned <248299730+ParticleTroned@users.noreply.github.com>
## Summary Adds **DLSSperf** — a VR-only opt-in mode of the Upscaling feature. In Skyrim VR's standard DLSS pipeline, the engine allocates its render-target chain (kMAIN, depth, motion vectors, refraction, etc.) at the HMD's full native resolution. DLSS upscales the small render content into those large targets, and **the entire post-process chain — HDR, bloom, refraction, tonemap — then runs at the upscaled HMD resolution**. The post-process work is the most expensive part of the chain after world rendering, and most of it doesn't visibly benefit from being computed at the upscaled resolution. DLSSperf shrinks the engine's render targets to match DLSS's own internal (smaller) resolution. DLSS still reconstructs to HMD-native via a private `testTexture` used for OpenVR submit, but everything the engine itself draws and post-processes stays at the smaller working resolution. Decomposed from upstream PR-2096 (`YtzyFvra/feature/DLSSenhancer`). Co-authored-by: YtzyFvra <59631290+YtzyFvra@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary
- Drop `/XO` (eXclude Older) from the three robocopy invocations under
`AUTO_PLUGIN_DEPLOYMENT` so changed source files actually reach the
install
- Update inline comments to document the trap (any prior touch to dest
can give it a newer mtime than git-checkout source, which silently
disables deploy)
## Why
`/XO` skips files where source is older than dest. In a normal git
workflow that's usually safe — but as soon as anything *else* touches
the destination (a manual `cp` during debugging, a package install, even
a previous build that ran later in wall time than the source's git
checkout), dest gets a newer mtime and `/XO` then refuses to overwrite
even when the source file's **size and content** have changed.
## Concrete failure that prompted this
While debugging SLF I had RunGrass.hlsl fixed in source (commit
`f11bfaa32`, source size 33287 bytes with the corrected ShadowSampling
include order), but the deployed file stayed at the previous 31877 bytes
across multiple full builds. Mtime on dest (from a manual `cp` during
earlier diagnostics) was newer than the git-checkout mtime on source, so
`/XO` skipped the copy. Grass and Particle pixel shaders failed to
compile in-game:
> `LightLimitFix/LightLimitFix.hlsli(85,3-28): error X3000: unrecognized
identifier 'DirectionalShadowLightData'`
Same shape of bug can hit any contributor who: tests a package install
over their dev build, copies files manually for any reason, or just has
clock skew across filesystem boundaries.
## What this changes
| Before | After |
|---|---|
| \`/E /XD ... /COPY:DAT /XO /R:1 /W:1 ...\` | \`/E /XD ... /COPY:DAT
/R:1 /W:1 ...\` |
Without `/XO`, robocopy uses its default name + size + timestamp delta —
copies whenever **any** of those differ. Catches both newer-source and
size-mismatch cases. The git-HEAD-change block above the deploy commands
already clears AIO and deploy stamps on branch switch, so the
bulk-redeploy story is unchanged.
## Trade-off
The property `/XO` nominally provided ("don't clobber files the user
intentionally edited in-game between builds at the same HEAD") was
fragile anyway — broken by any external touch to dest. Removing it costs
~100ms of extra IO per build on unchanged files (robocopy still skips
matching name+size+timestamp). Correctness wins.
## Test plan
- [ ] \`BuildRelease.bat ALL-WITH-AUTO-DEPLOYMENT\` produces a deployed
install that matches \`build/<preset>/aio\` byte-for-byte
- [ ] Manually \`touch\` a file in the deployed install to give it a
newer mtime, rebuild, confirm the source file overwrites it
- [ ] Verify unchanged files still skip on subsequent builds (robocopy's
default behavior)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Adjusted Windows plugin deployment configuration to improve
synchronization between deployed and built files. Updated build system
documentation comments.
<!-- review_stack_entry_start -->
[](https://app.coderabbit.ai/change-stack/alandtse/open-shaders/pull/37?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)
<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com> Co-authored-by: alandtse <7086117+alandtse@users.noreply.github.com> Co-authored-by: Alan Tse <alandtse@gmail.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: doodlum <15017472+doodlum@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Skrubby Skrub In A Shrub <87662196+SkrubbySkrubInAShrub@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: YtzyFvra <59631290+YtzyFvra@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Hook_AccumulatedLightsArray bumped RDI (fill-loop count) but not RDX (BSTArray resize count) on VR. On SE and VR the patched 5 bytes are `LEA EDI,[RBP+0x8] ; MOV EDX,EDI`, executed via includeSize=+5 BEFORE the CONTEXT-capture stub, so EDX latches the un-bumped count. The resize then allocates the original 10-slot array while the fill loop writes (ShadowLightCount+ConvertedShadowSlots+1)*2 entries -- an OOB heap write proportional to the user's shadow count, corrupting a tbbmalloc freelist next-pointer. Symptom: random later crashes in the Papyrus VM string heap with RDI holding the prologue bytes of the inlined resize fn (0x83485708245C8948 = `48 89 5C 24 08 57 48 83`). Higher counts crash earlier. SE was already bumping RDX; AE inlines the resize and re-derives EDX from EDI AFTER the stub, so its RDX is dead at the hook point and must not be touched. New gate is !IsAE() instead of IsSE(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Wrong target — recreating against alandtse/open-shaders |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: ⛔ Files ignored due to path filters (9)
📒 Files selected for processing (130)
📝 WalkthroughWalkthroughThis PR updates fork maintenance workflows and branding, adds restart-aware settings metadata, introduces a loopback MCP remote-control feature, expands Light Limit Fix shadow scheduling and shader paths, and adds VR perf-mode plus foveated upscaling support with new shaders, utilities, and tests. ChangesFork platform and maintenance flow
Restart-aware settings metadata
Remote control MCP feature
Light Limit Fix shadow pipeline
VR perf mode and foveated upscaling
UI branding and diagnostics reorganization
Sequence Diagram(s)sequenceDiagram
participant Client
participant RemoteControl
participant MCP as mcp::server
participant SKSE
Client->>MCP: tool request
MCP->>RemoteControl: dispatch handler
alt main-thread action needed
RemoteControl->>SKSE: queue task
SKSE-->>RemoteControl: apply action
end
RemoteControl-->>MCP: JSON text result
MCP-->>Client: SSE/HTTP response
sequenceDiagram
participant CreateRT as BSShaderRenderTargets::Create
participant PerfMode
participant Upscaling
participant Foveated as FoveatedRenderImpl::Core
participant Streamline
CreateRT->>PerfMode: enlarge RTs and setup resources
Upscaling->>Foveated: preprocess and resolve VR params
Foveated->>Streamline: per-eye DLSS evaluation
Streamline-->>Foveated: upscaled eye outputs
Foveated-->>Upscaling: final VR output
Upscaling->>PerfMode: HandlePostProcessing when testTexture is active
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
|
Actionable Suggestions
|
|
✅ A pre-release build is available for this PR: |
Summary
Hook_AccumulatedLightsArraybumpedRDI(fill-loop count) but notRDX(BSTArray resize count) on VR, producing an OOB heap write proportional toShadowLightCount. Symptom: random later crashes in the Papyrus VM string heap withRDI = 0x83485708245C8948— byte-reversed, those are the prologue bytes (48 89 5C 24 08 57 48 83) of the inlinedBSTArray::resizecallee, indicating a tbbmalloc freelist next-pointer was stomped and a later allocation returned code memory.Mechanism
The hook installs with
includeSize=+5, so the original 5 bytes run before the CONTEXT-capture stub. On SE and VR the patched 5 bytes are:After the stub,
EDIis bumped butEDXis not. DownstreamBSTArray::resizethen allocates the original 10-slot buffer while the fill loop writes(ShadowLightCount + ConvertedShadowSlots + 1) * 2entries. Higher counts crash earlier — matches user reports.AE inlines the resize and re-derives
EDXfromEDIafter the stub, so itsRDXat the hook point is a dead caller-saved value and must not be touched. SE was already bumpingRDXcorrectly. The original gate (!IsVR && !IsAE) inverted the VR case — it should have been!IsAE.Verification
Confirmed at the patch site in all three runtimes via Ghidra static disasm:
LEA EDI / MOV EDX,EDILEA EDI / TEST EDI,EDILEA EDI / MOV EDX,EDITest plan
ALLpreset, verify SE behavior unchanged at default shadow countsShadowLightCount(16+): play 15+ minutes loading interiors/exteriors, no Papyrus VM heap crashkSHADOWMAPSslices populate correctly at high count🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Improvements
Documentation