diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index e6b4da4993..c4cb716895 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -2,6 +2,18 @@ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. +## Fork identity + +This repository is **Open Shaders** ([alandtse/open-shaders](https://github.com/alandtse/open-shaders)), a fork of [Community Shaders](https://github.com/community-shaders/skyrim-community-shaders) ([Nexus mod 180419](https://www.nexusmods.com/skyrimspecialedition/mods/180419)). The public/display name is "Open Shaders"; the runtime identity is intentionally kept as upstream Community Shaders so user installs are drop-in compatible: + +- **Keep as `CommunityShaders`** (do NOT rename): the CMake `PROJECT_NAME`, the DLL filename, the `SKSE/Plugins/CommunityShaders/` runtime directory, the `CommunityShaders.log` log file, the ImGui window ID after `###`, asset paths under `package/Interface/CommunityShaders/`, and any HLSL include paths. +- **Use "Open Shaders"**: in-game menu titles, README/AI-INSTRUCTIONS public-facing copy, the AIO Nexus mod filename, the GitHub release name, the in-game Welcome / FAQ / About text. +- **Link "Community Shaders" explicitly to upstream**: when the text or comment refers to the upstream project (its Nexus page is 180419, its repo is `community-shaders/skyrim-community-shaders`, its wiki lives on that repo). Never link `doodlum/skyrim-community-shaders` — that path is dead. + +The AIO bundle ships only features whose `Shaders/Features/*.ini` has `autoupload = true` (CORE features are always included). The `AIO_INCLUDE_NON_AUTOUPLOAD=ON` CMake option overrides this filter for local dev builds. The Nexus upload workflow ships only the AIO archive — there is no per-feature matrix distribution. See `.github/workflows/nexus-upload.yaml`. 
+ +**Logo absence is intentional.** The upstream Community Shaders logo is non-GPL, not trademark-licensed, and may not be redistributed by forks — so `cs-logo.png` is not present in this repo. The icon loader's load path is null-safe at every consumer (`IconLoader.cpp` retries the colored fallback; `Menu.cpp` derives `showLogo` from `texture != nullptr`; `MenuHeaderRenderer` and `HomePageRenderer` gate all logo draws on the null check). Missing logo → one `logger::warn`, menu renders headers without the logo image, layout adjusts via the `showLogo` flag. Do not "fix" the missing file or restore the upstream asset. + ## Build Commands ### WSL/Linux Environment Note @@ -361,6 +373,7 @@ Feature versions are automatically extracted from `.ini` files and compiled into - JSON-based settings with nlohmann_json - Hot-reload capability through ImGui interface - Versioned feature configurations for compatibility +- Restart-gated fields use `Util::Settings::BootSnapshot` + `kRestartFields` metadata to diff boot-latched vs selected values (drives `Util::Text::RestartNeeded` banners and MCP introspection; see Upscaling for a canary) ### Error Handling @@ -435,6 +448,9 @@ Feature versions are automatically extracted from `.ini` files and compiled into - **Complete Solutions**: Provide fully functional code with proper error handling and resource management - **Performance Conscious**: Always consider GPU workload and user experience impact - **Documentation**: Include Doxygen comments for public methods, especially graphics-related functions +- **Concise Comments**: Comments explain _why_, not _what_. Skip restating code in prose. A 1-line "why this hack" beats a 4-line block paraphrasing the next 4 lines. Block comments at the top of a function/section are fine when they capture non-obvious context (invariants, gotchas, history); avoid mid-function tutorial paragraphs. +- **Minimal Churn**: PRs touch only what the change requires. 
Don't reformat unrelated lines, rename adjacent variables, or "clean up" code outside the PR's scope. If you spot something worth fixing nearby, open a follow-up PR or surface it in the description rather than expanding the diff. Auto-format/lint touching unrelated lines is acceptable only when it's the linter's own commit; mixing with logic changes obscures review. +- **Comments describe present code, not absent code**: Don't add comments that describe code that used to be in the file but isn't now ("the Discord banner was removed", "this constant was renamed from X"). The reader sees only the present file; the deletion isn't visible. Past-tense framing of present behavior is fine ("if someone landed a commit during the release, abort"); the rule is specifically about describing code that no longer exists. Exception: a regression-risk warning that names the removed code so a future maintainer doesn't restore it ("do not re-add the Discord banner — the upstream invite isn't a fork channel") is load-bearing and stays. Commit messages, PR descriptions, and CHANGELOG entries are the right place for "what changed" — code comments are not. ## Development Best Practices (Learned from Codebase) @@ -445,13 +461,40 @@ Follow conventional commit format for consistency: - **Format**: `type(scope): description` - **Title Limit**: 50 characters maximum - **Body Wrap**: 72 characters per line -- **Types**: `feat`, `fix`, `refactor`, `docs`, `style`, `test`, `chore` - **Examples**: - `feat(menu): extract DrawMenuVisitor helper methods` - `fix(imgui): resolve orphaned TableNextColumn calls` - `refactor(constants): centralize UI constants in ThemeManager` + - `ci: gate cpp_tests build on changed files` + - `build: drop CORE marker from per-feature AIO copy` + - `test: fix cpp_tests build under MSVC C++23 modules` + +**Squash-merge note.** PRs are squash-merged, so the **PR title** becomes the commit message that semantic-release reads. 
Getting the title's type right matters more than per-commit messages on the PR branch — those get discarded. Use `gh pr edit --title "..."` to fix a stale title before merge. + +**Type → release impact** (the full set accepted by `amannn/action-semantic-pull-request@v5` + `@semantic-release/commit-analyzer` defaults): + +| Type | Use for | Release impact | +| ---------- | --------------------------------------------------------- | ----------------------- | +| `feat` | New user-facing feature or capability | **minor** (1.X.0) | +| `fix` | Bug fix to user-facing behavior | **patch** (1.5.X) | +| `perf` | Performance improvement to user-facing behavior | **patch** (1.5.X) | +| `revert` | Revert of a prior commit | follows reverted commit | +| `build` | Build system, packaging, dependencies (CMake, vcpkg, AIO) | none | +| `chore` | Maintenance, misc tooling, repo hygiene | none | +| `ci` | CI workflows, GitHub Actions, lint configs | none | +| `docs` | Documentation, comments, READMEs, CLAUDE.md | none | +| `refactor` | Code restructuring with no behavior change | none | +| `style` | Formatting, whitespace, missing semicolons | none | +| `test` | Tests, test fixtures, test infrastructure | none | + +Append `!` to the type (or add a `BREAKING CHANGE:` footer) for **major** (X.0.0). + +**Pick the type with version impact in mind.** Common traps: -Conventional commits drive semantic-release. `feat:` triggers a minor bump, `fix:` triggers a patch bump, `feat!:` or `BREAKING CHANGE:` triggers a major bump. `chore:`, `docs:`, `style:`, `test:`, `refactor:` produce no release on their own. Pick the type with the version impact in mind — a refactor mislabeled `feat:` will force a minor bump on the next release. +- A pure build/CI/test change mislabeled `fix:` will burn a patch release on a non-user-visible change. Use `build:`, `ci:`, or `test:` instead. +- A refactor mislabeled `feat:` will force a minor bump. 
+- A perf win on internal code (not exposed to users) is `refactor:`, not `perf:`. +- `chore:` is a catch-all; prefer the specific type when one fits. ### Release Branch Model @@ -493,7 +536,7 @@ After a hotfix release, open PRs targeting `dev` are auto-rebased by the `Auto-r - PR a feature branch directly into `main`. - Run `Release: Semantic Version` on `hotfix/X.Y.x` for the current line — it will fail with `cannot be published as it is out of range` because the maintenance contract requires the hotfix line to be strictly older than `main`. Use `ff_target` into `main` instead. -Full details: [Developers wiki — Patch Release Process](https://github.com/community-shaders/skyrim-community-shaders/wiki/Developers#patch-release-process-any-line). +Full details: [Open Shaders developer wiki — Patch Release Process](https://github.com/alandtse/open-shaders/wiki/Developers#patch-release-process-any-line). The fork now maintains its own wiki (transferred from upstream) at `alandtse/open-shaders/wiki`; the `maint-update-wiki.yaml` workflow auto-publishes buffer documentation there on every push to `dev`. Upstream Community Shaders maintains its own copy at `community-shaders/skyrim-community-shaders/wiki` — link to whichever is appropriate for the audience. ### Code Organization and Refactoring Patterns diff --git a/.gitattributes b/.gitattributes index 72467a3279..4a8802abb2 100644 --- a/.gitattributes +++ b/.gitattributes @@ -3,3 +3,57 @@ # Ensure patch files always use LF line endings *.patch text eol=lf + +# ----------------------------------------------------------------------------- +# Fork-owned paths — `merge=ours` during 3-way merges +# ----------------------------------------------------------------------------- +# These files are policy- or branding-owned by the fork. 
When the scheduled +# upstream-merge-sync workflow merges upstream/dev into dev, the `ours` driver +# preserves the fork's version of each path verbatim, ignoring whatever +# upstream did to the same file. This is what prevents the silent-revert +# class of bug (upstream cherry-picks our commit, then later deletes the +# file in a follow-up: a vanilla merge would honor the delete; `merge=ours` +# discards it). +# +# Prerequisites for the driver to fire: +# 1. Each clone (including CI runners) must define the `ours` driver: +# git config merge.ours.driver true +# The scheduled sync workflow does this in a setup step. Local +# contributors who run `git merge upstream/dev` by hand should run +# the same command once. See docs/development/upstream-sync.md. +# 2. The merge must be a true 3-way merge (i.e., `git merge`, not +# `git rebase` — rebase replays patches via `git apply` and never +# invokes merge drivers). +# +# Important: `merge=ours` fires on ANY 3-way merge that touches these +# paths, not just the scheduled upstream sync. If a PR is landed via +# GitHub's "Create a merge commit" button (or via a local `git merge` +# between fork branches), and the incoming branch changed one of these +# files, those changes are silently discarded. **Edits to fork-owned +# paths must land via squash-merge or rebase-merge** so the diff lands +# as a plain commit on dev rather than through a merge-driver-affected +# merge commit. The default PR merge methods on `dev` are configured +# to squash/rebase only to enforce this. +# +# When adding a path here, also add a one-line note in +# docs/development/upstream-sync.md explaining why the fork owns it. If a +# path stops being fork-owned (e.g., we converge with upstream's behavior), +# remove the entry so upstream's improvements flow back in. 
+# ----------------------------------------------------------------------------- + +# CI workflows the fork owns end-to-end (auto-rebase machinery, the +# release pipeline calibrated to RELEASE_PAT rather than upstream's +# GitHub App setup, hotfix flow, nexus upload, etc.). +.github/workflows/auto-rebase-prs.yaml merge=ours +.github/workflows/release-semantic.yaml merge=ours +.github/workflows/release-hotfix.yaml merge=ours +.github/workflows/nexus-upload.yaml merge=ours +.github/workflows/maint-cleanup-releases.yaml merge=ours +.github/workflows/maint-upstream-sync.yaml merge=ours + +# Semantic-release config — fork has independent versioning and may +# diverge from upstream's release rules. +.releaserc merge=ours + +# Branding-owned: rebrand commit established these as fork-specific. +README.md merge=ours diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index c0c3b0e730..d8c15c1265 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -39,8 +39,8 @@ SKSE64 plugin providing modular DirectX 11 graphics enhancements for Skyrim SE/A ### Essential Repository Setup ```bash -git clone https://github.com/doodlum/skyrim-community-shaders.git --recursive -cd skyrim-community-shaders +git clone https://github.com/alandtse/open-shaders.git --recursive +cd open-shaders git submodule update --init --recursive # If not cloned with --recursive ``` diff --git a/.github/workflows/_shared-build.yaml b/.github/workflows/_shared-build.yaml index ecf4319149..45af3d43a6 100644 --- a/.github/workflows/_shared-build.yaml +++ b/.github/workflows/_shared-build.yaml @@ -23,6 +23,10 @@ on: description: "Run shader unit tests" type: boolean default: true + run-cpp-tests: + description: "Run C++ unit tests (tests/cpp)" + type: boolean + default: true hlsl-should-build: description: "Passed to check-hlsl-changes; 'true' forces shader steps to run" type: string @@ -31,6 +35,10 @@ on: description: "Passed to check-hlsl-changes for 
unit tests" type: string default: "true" + cpp-tests-should-build: + description: "Forces cpp_tests build+run when 'true'; lets PR-checks skip it otherwise" + type: string + default: "true" cache-key-suffix: description: "Optional suffix to invalidate the build cache" type: string @@ -259,3 +267,67 @@ jobs: build/ALL/Testing/** retention-days: 7 if-no-files-found: ignore + + cpp-unit-tests: + name: Run C++ Unit Tests + if: inputs.run-cpp-tests + runs-on: windows-2025 + permissions: + contents: read + steps: + - name: Checkout code + uses: actions/checkout@v6 + with: + ref: ${{ inputs.ref }} + repository: ${{ inputs.repository }} + submodules: recursive + + # Inline gate (no composite action — the logic is one branch). + # PR events skip when no relevant files changed; push / dispatch / + # release always run. + - name: Check if cpp_tests should run + id: check-cpp + shell: bash + run: | + if [ "${{ github.event_name }}" != "pull_request_target" ]; then + echo "Non-PR event, proceeding." + echo "skip=false" >> $GITHUB_OUTPUT + elif [ "${{ inputs.cpp-tests-should-build }}" != "true" ]; then + echo "No cpp_tests-related changes detected, skipping." + echo "skip=true" >> $GITHUB_OUTPUT + else + echo "cpp_tests-related changes detected, proceeding." 
+ echo "skip=false" >> $GITHUB_OUTPUT + fi + + - name: Setup Build Environment + id: setup + if: steps.check-cpp.outputs.skip != 'true' + uses: ./.github/actions/setup-build-environment + with: + cache-key-suffix: "cpp-tests${{ inputs.cache-key-suffix }}" + cmake-preset: "ALL" + build-dir: "build/ALL" + + - name: Build cpp_tests + if: steps.check-cpp.outputs.skip != 'true' + uses: lukka/run-cmake@v10 + with: + configurePreset: ALL + buildPreset: ALL + buildPresetAdditionalArgs: "['--target cpp_tests']" + + - name: Run C++ unit tests + if: steps.check-cpp.outputs.skip != 'true' + run: | + ctest --test-dir build/ALL -C Release --output-on-failure -R CppUtilTests --timeout 60 + + - name: Upload test results on failure + if: failure() && steps.check-cpp.outputs.skip != 'true' + uses: actions/upload-artifact@v7 + with: + name: cpp-test-results + path: | + build/ALL/Testing/** + retention-days: 7 + if-no-files-found: ignore diff --git a/.github/workflows/auto-rebase-prs.yaml b/.github/workflows/auto-rebase-prs.yaml new file mode 100644 index 0000000000..dcbf96fedd --- /dev/null +++ b/.github/workflows/auto-rebase-prs.yaml @@ -0,0 +1,146 @@ +name: "Auto-rebase open PRs" + +# Rebase open PRs against `dev` whenever it moves. Thin wrapper over +# peter-evans/rebase@v3. +# +# Forks: pushing to a fork PR head requires (a) a user PAT belonging to +# a maintainer of this repo and (b) the PR author having "Allow edits +# by maintainers" checked. The action silently skips PRs missing either, +# and skips conflicts. +# +# Token: RELEASE_PAT (classic PAT, `repo` + `workflow`). `workflow` is +# required because rebased PRs may include `.github/workflows/` diffs. + +on: + push: + branches: [dev] + workflow_dispatch: + inputs: + pr_number: + description: "Optional: rebase just this PR number (the action accepts head as `owner:branch`; we resolve via gh api). Leave empty to rebase all open PRs against dev." 
+ required: false + default: "" + +permissions: + contents: write + pull-requests: write + +# Two jobs by design: +# +# 1. `debounce` — sleeps for 30 minutes on push events. Has +# `cancel-in-progress: true` so a fresh push during the sleep +# kills the still-sleeping run and the new push starts its own +# timer. Workflow-dispatch runs skip the sleep entirely. +# 2. `rebase` — depends on `debounce` and runs `peter-evans/rebase`. +# Has `cancel-in-progress: false` so once it starts force-pushing +# PR heads, it can't be interrupted mid-flight by a subsequent +# push. The downside is a small window (the rebase duration) +# where an additional push has to queue rather than supersede — +# but interrupting force-pushes mid-rebase leaves PRs in an +# inconsistent half-rebased state, which is worse. +# +# End state: one rebase pass per quiet 30-min window, no half-done +# rebases. + +jobs: + debounce: + runs-on: ubuntu-latest + # The debounce job alone is cancellable — a new push restarts + # its 30-minute timer, dropping the prior queued run. + concurrency: + group: auto-rebase-prs-debounce + cancel-in-progress: true + steps: + - name: Debounce dev-push storms + # Only debounce when fired by a push to dev. Manual dispatches + # run immediately (the next job has no dependency wait when + # this step is skipped). Coalesces a burst of PR merges into + # a single rebase pass, keeping PR-build CI load proportional + # to landings/half-hour rather than landings × open-PR-count. + if: github.event_name == 'push' + run: sleep 1800 + + rebase: + runs-on: ubuntu-latest + needs: debounce + # The rebase job is NOT cancellable — once peter-evans/rebase + # starts force-pushing PR heads, interrupting it mid-flight could + # leave some PRs rebased and others not, or interrupt a push + # mid-operation. A subsequent dev push queues behind this run + # instead. 
+ concurrency: + group: auto-rebase-prs-rebase + cancel-in-progress: false + steps: + - name: Checkout + uses: actions/checkout@v6 + with: + fetch-depth: 0 + token: ${{ secrets.RELEASE_PAT }} + + - name: Resolve single-PR head spec (if pr_number given) + id: resolve_head + if: inputs.pr_number != '' + env: + GH_TOKEN: ${{ secrets.RELEASE_PAT }} + PR_NUMBER: ${{ inputs.pr_number }} + run: | + set -euo pipefail + # Guard against rebasing a closed PR or one targeting + # a non-dev base (e.g. a hotfix staging branch). + PR_JSON=$(gh pr view "${PR_NUMBER}" --repo '${{ github.repository }}' \ + --json state,baseRefName,headRepositoryOwner,headRefName) + PR_STATE=$(echo "${PR_JSON}" | jq -r '.state') + PR_BASE=$(echo "${PR_JSON}" | jq -r '.baseRefName') + if [[ "${PR_STATE}" != "OPEN" ]]; then + echo "::warning::PR #${PR_NUMBER} is ${PR_STATE}, not OPEN — skipping." + echo "head=" >> "$GITHUB_OUTPUT" + exit 0 + fi + if [[ "${PR_BASE}" != "dev" ]]; then + echo "::warning::PR #${PR_NUMBER} targets '${PR_BASE}', not 'dev' — skipping to avoid rebasing the wrong base." + echo "head=" >> "$GITHUB_OUTPUT" + exit 0 + fi + HEAD_REF=$(echo "${PR_JSON}" | jq -r '"\(.headRepositoryOwner.login):\(.headRefName)"') + echo "head=${HEAD_REF}" >> "$GITHUB_OUTPUT" + + - name: Rebase open PRs against dev + id: rebase + # Without this guard, a single-PR dispatch with no valid head + # would fall through to the all-PRs default. 
+ if: inputs.pr_number == '' || steps.resolve_head.outputs.head != '' + uses: peter-evans/rebase@v3 + with: + token: ${{ secrets.RELEASE_PAT }} + base: dev + head: ${{ steps.resolve_head.outputs.head }} + exclude-drafts: true + exclude-labels: | + no-auto-rebase + + - name: Summarize + env: + REBASED: ${{ steps.rebase.outputs.rebased-count }} + run: | + { + echo "## Auto-rebase summary" + echo "" + echo "- PRs rebased: \`${REBASED:-0}\`" + echo "- Base: \`dev\`" + if [[ -n '${{ inputs.pr_number }}' ]]; then + echo "- Targeted single PR: #${{ inputs.pr_number }} (head \`${{ steps.resolve_head.outputs.head }}\`)" + fi + echo "" + echo "PRs not rebased fall into one of three buckets:" + echo "- Already up-to-date with \`dev\` (no-op)" + echo "- Draft or labeled \`no-auto-rebase\` (excluded)" + echo "- Conflict during rebase, OR fork without maintainer-edit access (action silently skips)" + echo "" + echo "PRs in the conflict bucket need a manual rebase by the author:" + echo '```bash' + echo "git fetch origin && git rebase origin/dev" + echo "# resolve conflicts, git rebase --continue" + echo "git push --force-with-lease" + echo '```' + } >> "$GITHUB_STEP_SUMMARY" diff --git a/.github/workflows/maint-cleanup-releases.yaml b/.github/workflows/maint-cleanup-releases.yaml index e934cf7085..76bc6a8b5e 100644 --- a/.github/workflows/maint-cleanup-releases.yaml +++ b/.github/workflows/maint-cleanup-releases.yaml @@ -150,7 +150,7 @@ jobs: tag_name=$(echo "$line" | cut -d' ' -f1) reason=$(echo "$line" | cut -d'(' -f2- | sed 's/)$//') echo " • $tag_name - $reason" - echo " 🔗 https://github.com/doodlum/skyrim-community-shaders/releases/tag/$tag_name" + echo " 🔗 ${{ github.server_url }}/${{ github.repository }}/releases/tag/$tag_name" fi done echo "" diff --git a/.github/workflows/maint-update-wiki.yaml b/.github/workflows/maint-update-wiki.yaml index 8a1f980d15..87ddc3315c 100644 --- a/.github/workflows/maint-update-wiki.yaml +++ b/.github/workflows/maint-update-wiki.yaml @@ 
-17,6 +17,14 @@ permissions: jobs: update-wiki: name: Update Buffer Documentation + # Publish buffer documentation to the wiki of any repo that maintains + # its own copy. The fork (alandtse/open-shaders) maintains a wiki + # transferred from upstream; upstream (community-shaders/skyrim- + # community-shaders) also maintains one. Forks that don't want a + # wiki of their own should drop their entry from this list. + if: > + github.repository == 'alandtse/open-shaders' || + github.repository == 'community-shaders/skyrim-community-shaders' runs-on: windows-2022 steps: - name: Checkout code diff --git a/.github/workflows/maint-upstream-sync.yaml b/.github/workflows/maint-upstream-sync.yaml new file mode 100644 index 0000000000..c970a5eb40 --- /dev/null +++ b/.github/workflows/maint-upstream-sync.yaml @@ -0,0 +1,180 @@ +name: "Maint: Sync upstream/dev" + +# Scheduled merge sync from community-shaders/skyrim-community-shaders into our +# dev. Combined with the `merge=ours` attributes in .gitattributes (see the +# `Fork-owned paths` section there), this pulls upstream's code changes while +# leaving the fork's CI/branding files untouched. +# +# Why merge and not rebase: rebase replays patches via `git apply` and never +# invokes merge drivers. Worse, when upstream cherry-picks one of our commits +# and then later modifies the same file, rebase's patch-id detection skips +# our original commit as "already applied" — and upstream's follow-up edit +# then silently lands as a regression. A 3-way merge consults both branches +# at every path and lets the `merge=ours` driver fire for fork-owned files, +# eliminating that whole class of bug. +# +# Versioning: we deliberately do NOT skip this merge commit in +# semantic-release's commit analysis. The DAG walk picks up upstream's +# `feat:`/`fix:` commits transitively, so our version reflects everything +# we actually ship to users (including upstream fixes that arrived via this +# merge). + +on: + schedule: + # Monday 08:00 UTC. 
Weekly cadence keeps drift small without + # spamming PR rebases — pair this with auto-rebase-prs.yaml's + # debounce (PR #29) so the post-sync PR rebases coalesce. + - cron: "0 8 * * 1" + workflow_dispatch: + inputs: + dry_run: + description: "Dry run — fetch and merge locally, but don't push" + type: boolean + default: false + +permissions: + # `contents: write` is the only scope we need — the merge push to + # `dev` is the only mutation. No PR API calls, no issue API calls. + contents: write + +concurrency: + # Serialize — two overlapping syncs would race on the dev push. + group: maint-upstream-sync + cancel-in-progress: false + +jobs: + sync: + runs-on: ubuntu-latest + # Guard against downstream forks running this on schedule: they + # don't have RELEASE_PAT, don't want to merge community-shaders/dev + # into their own dev, and the resulting failure would spam their + # Actions tab with red runs. Other maint workflows in this repo + # use the same pattern. + if: github.repository == 'alandtse/open-shaders' + steps: + - name: Checkout dev + uses: actions/checkout@v6 + with: + fetch-depth: 0 + ref: dev + # RELEASE_PAT so the merge push can bypass branch protection + # and trigger downstream workflows (auto-rebase of open PRs). + token: ${{ secrets.RELEASE_PAT }} + + - name: Configure git identity + merge driver + # The `ours` driver is referenced by .gitattributes and is not + # built into git; it must be defined locally. We define it as + # a no-op (driver = true) which means "keep the version on the + # current branch unchanged" — the standard recipe. 
+ run: | + git config user.name 'github-actions[bot]' + git config user.email 'github-actions[bot]@users.noreply.github.com' + git config merge.ours.driver true + + - name: Fetch upstream/dev + id: fetch + run: | + git remote add upstream https://github.com/community-shaders/skyrim-community-shaders.git || \ + git remote set-url upstream https://github.com/community-shaders/skyrim-community-shaders.git + git fetch upstream dev + UPSTREAM_SHA=$(git rev-parse upstream/dev) + echo "upstream_sha=${UPSTREAM_SHA}" >> "$GITHUB_OUTPUT" + echo "upstream_short=${UPSTREAM_SHA:0:9}" >> "$GITHUB_OUTPUT" + + # If we're already at or ahead of upstream's tip, exit early. + if git merge-base --is-ancestor "${UPSTREAM_SHA}" HEAD; then + echo "already_synced=true" >> "$GITHUB_OUTPUT" + else + echo "already_synced=false" >> "$GITHUB_OUTPUT" + fi + + - name: Already synced — nothing to do + if: steps.fetch.outputs.already_synced == 'true' + run: | + { + echo "## ✅ Already synced" + echo "" + echo "Upstream tip \`${{ steps.fetch.outputs.upstream_short }}\` is already an ancestor of \`dev\`. Nothing to merge." + } >> "$GITHUB_STEP_SUMMARY" + + - name: Merge upstream/dev + id: merge + if: steps.fetch.outputs.already_synced != 'true' + run: | + set +e + git merge --no-ff --no-edit \ + -m "chore(sync): merge upstream/dev as of ${{ steps.fetch.outputs.upstream_short }}" \ + upstream/dev + merge_status=$? + set -e + + if [[ $merge_status -ne 0 ]]; then + # The merge=ours driver should have handled everything + # in the fork-owned list. A conflict here means upstream + # touched a non-fork-owned file in a way that genuinely + # collides with our changes — needs human eyes. + echo "::error::Upstream sync conflicted on non-fork-owned paths." 
+ { + echo "## ❌ Conflict during sync" + echo "" + echo "Upstream tip: \`${{ steps.fetch.outputs.upstream_short }}\`" + echo "" + echo "Conflicted files (resolve manually, then push):" + echo "" + echo '```' + git diff --name-only --diff-filter=U + echo '```' + echo "" + echo "Resolution: clone, run the same merge locally, resolve, push." + echo "" + echo '```bash' + echo "git fetch upstream dev" + echo "git config merge.ours.driver true" + echo "git merge --no-ff --no-edit \\" + echo " -m 'chore(sync): merge upstream/dev as of ${{ steps.fetch.outputs.upstream_short }}' \\" + echo " upstream/dev" + echo "# resolve conflicts, git add, git commit" + echo "git push origin dev" + echo '```' + } >> "$GITHUB_STEP_SUMMARY" + git merge --abort + exit 1 + fi + + - name: Summarize merge + if: steps.fetch.outputs.already_synced != 'true' && success() + run: | + { + echo "## ✅ Merged upstream/dev" + echo "" + echo "Upstream tip: \`${{ steps.fetch.outputs.upstream_short }}\`" + echo "" + echo "Files changed:" + echo "" + echo '```' + git diff --stat HEAD~1..HEAD | head -40 + echo '```' + echo "" + echo "Commits brought in from upstream:" + echo "" + echo '```' + # The merge commit has two parents: HEAD^1 = previous + # fork tip, HEAD^2 = upstream tip. The range + # `HEAD^1..HEAD^2` enumerates the upstream commits + # reachable through the merge that weren't already in + # our history — exactly what we want to surface. + git log --oneline 'HEAD^1..HEAD^2' | head -30 + echo '```' + } >> "$GITHUB_STEP_SUMMARY" + + - name: Push to origin/dev + if: steps.fetch.outputs.already_synced != 'true' && inputs.dry_run != true + run: git push origin dev + + - name: Dry-run notice + if: steps.fetch.outputs.already_synced != 'true' && inputs.dry_run == true + run: | + { + echo "" + echo "**Dry run** — merge completed locally but not pushed." 
+ } >> "$GITHUB_STEP_SUMMARY" diff --git a/.github/workflows/nexus-upload.yaml b/.github/workflows/nexus-upload.yaml index 1018be0674..4da6b588e9 100644 --- a/.github/workflows/nexus-upload.yaml +++ b/.github/workflows/nexus-upload.yaml @@ -1,8 +1,21 @@ name: "Nexus: Upload Release" +# `mode=aio` (default) uploads only the AIO to the fork's mod page. +# `mode=matrix` runs the upstream per-feature fan-out (`feature_version_audit.py` builds the matrix). +# Only `prepare-nexus-matrix` branches on mode; downstream jobs are +# matrix-shape-agnostic. + on: workflow_dispatch: inputs: + mode: + description: "Upload mode: 'aio' uploads only the AIO to a single mod page (fork default); 'matrix' fans out per-feature to their respective Nexus pages (upstream behavior)." + required: false + type: choice + default: "aio" + options: + - "aio" + - "matrix" tag: description: "Release tag to upload to Nexus" required: true @@ -11,17 +24,17 @@ on: required: false default: "skyrimspecialedition" nexus_mod_id: - description: "Nexus mod ID" + description: "Target Nexus mod ID. In `aio` mode this is the fork's single mod page. In `matrix` mode this seeds the CORE row of the upstream matrix (defaults to 86492 if empty)." required: false - default: "86492" + default: "" artifact_pattern: - description: "Artifact glob pattern to select the package to upload" + description: "Artifact glob pattern. Defaults: `CommunityShaders_AIO-*.7z` in aio mode, `CommunityShaders-*.7z` (core .7z) in matrix mode." required: false - default: "CommunityShaders-*.7z" + default: "" mod_filename: - description: "Filename used for the Nexus mod upload" + description: "Nexus mod filename. Defaults: `Open Shaders` in aio mode, `Community Shaders` in matrix mode." 
required: false - default: "Community Shaders" + default: "" dry_run: description: "If true, do not upload to Nexus; only report the planned upload" required: false @@ -35,6 +48,10 @@ on: required: false workflow_call: inputs: + mode: + required: false + type: string + default: "aio" tag: description: "Release tag to upload to Nexus" required: true @@ -46,15 +63,15 @@ on: nexus_mod_id: required: false type: string - default: "86492" + default: "" artifact_pattern: required: false type: string - default: "CommunityShaders-*.7z" + default: "" mod_filename: required: false type: string - default: "Community Shaders" + default: "" dry_run: required: false type: string @@ -86,6 +103,7 @@ jobs: version: ${{ steps.resolve.outputs.version }} dry_run: ${{ steps.dryrun.outputs.dry_run }} has_uploads: ${{ steps.generate.outputs.has_uploads }} + mode: ${{ steps.resolve.outputs.mode }} steps: - name: Checkout repository uses: actions/checkout@v6 @@ -97,7 +115,7 @@ jobs: with: python-version: "3.11" - - name: Resolve release tag + - name: Resolve release tag and mode id: resolve run: | TAG="$INPUT_TAG" @@ -116,10 +134,17 @@ jobs: exit 1 fi VERSION="${TAG#v}" - echo "tag=$TAG" >> $GITHUB_OUTPUT + MODE="${INPUT_MODE:-aio}" + if [[ "$MODE" != "aio" && "$MODE" != "matrix" ]]; then + echo "ERROR: mode must be 'aio' or 'matrix' (got '$MODE')" >&2 + exit 1 + fi + echo "tag=$TAG" >> $GITHUB_OUTPUT echo "version=$VERSION" >> $GITHUB_OUTPUT + echo "mode=$MODE" >> $GITHUB_OUTPUT env: INPUT_TAG: ${{ inputs.tag || '' }} + INPUT_MODE: ${{ inputs.mode || '' }} GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} - name: Compute dry-run mode @@ -136,44 +161,86 @@ jobs: - name: Generate Nexus upload matrix id: generate run: | - ARGS=( --export-nexus-matrix --matrix-output nexus-matrix-raw.json ) - PREVIOUS_TAG=$(git tag --merged "$RELEASE_TAG" --list 'v*.*.*' --sort=-v:refname 2>/dev/null \ - | grep -v -- '-' \ - | awk -v current="$RELEASE_TAG" '$0 != current { print; exit }') - if [ -n "$PREVIOUS_TAG" ]; then 
- ARGS+=( --base "$PREVIOUS_TAG" ) + set -euo pipefail + MODE='${{ steps.resolve.outputs.mode }}' + + # Workflow inputs default to empty so matrix mode keeps + # its upstream defaults below; aio mode uses fork defaults. + if [ "$MODE" = "aio" ]; then + EFFECTIVE_MOD_ID="${INPUT_NEXUS_MOD_ID}" + EFFECTIVE_ARTIFACT="${INPUT_ARTIFACT_PATTERN:-CommunityShaders_AIO-*.7z}" + EFFECTIVE_FILENAME="${INPUT_MOD_FILENAME:-Open Shaders}" else - ARGS+=( --all-features ) + EFFECTIVE_MOD_ID="${INPUT_NEXUS_MOD_ID:-86492}" + EFFECTIVE_ARTIFACT="${INPUT_ARTIFACT_PATTERN:-CommunityShaders-*.7z}" + EFFECTIVE_FILENAME="${INPUT_MOD_FILENAME:-Community Shaders}" fi - if [ -n "$INPUT_NEXUS_MOD_ID" ]; then - ARGS+=( --core-mod-id "$INPUT_NEXUS_MOD_ID" ) - fi - if [ -n "$INPUT_MOD_FILENAME" ]; then - ARGS+=( --core-filename "$INPUT_MOD_FILENAME" ) - fi - if [ -n "$INPUT_ARTIFACT_PATTERN" ]; then - ARGS+=( --core-artifact-pattern "$INPUT_ARTIFACT_PATTERN" ) - fi - # Strip the leading 'v' so we get e.g. "1.5.2" — the value - # is rendered into Nexus file descriptions, where the - # bare semver is what users see in the UI. - RELEASE_VER="${RELEASE_TAG#v}" - ARGS+=( --release-version "$RELEASE_VER" ) - python tools/feature_version_audit.py "${ARGS[@]}" - - # Fetch the specific release by tag — avoids pagination issues with - # the releases list endpoint (default page size is 30). + export EFFECTIVE_MOD_ID EFFECTIVE_ARTIFACT EFFECTIVE_FILENAME + + # Fetch the release body once — both modes use it as the + # changelog for the CORE/AIO row. gh api "repos/$GITHUB_REPOSITORY/releases/tags/$RELEASE_TAG" \ 2>/dev/null > release.json || echo '{}' > release.json - python -c " + + if [ "$MODE" = "aio" ]; then + if [ -z "$EFFECTIVE_MOD_ID" ] && [ "$DRY_RUN" != "true" ]; then + echo "::error::aio mode requires nexus_mod_id for a real upload. Set it via the workflow input or update the default once the Open Shaders Nexus page exists." 
+ exit 1 + fi + if [ -z "$EFFECTIVE_MOD_ID" ]; then + echo "::warning::nexus_mod_id is empty — dry-run continues, but a real upload would fail." + fi + RELEASE_VER="${RELEASE_TAG#v}" + export RELEASE_VER + python3 << 'PYEOF' + import json, os, re + with open('release.json') as f: + release = json.load(f) + if not isinstance(release, dict): + release = {} + body = release.get('body') or '' + body = re.sub(r'\n---\n+# Feature Version Audit\b.*', '', body, flags=re.DOTALL).strip() + release_ver = os.environ['RELEASE_VER'] + row = { + 'name': 'core', + 'is_core': True, + 'nexus_mod_id': os.environ['EFFECTIVE_MOD_ID'], + 'artifact_pattern': os.environ['EFFECTIVE_ARTIFACT'], + 'mod_filename': os.environ['EFFECTIVE_FILENAME'], + 'auto_upload': True, + 'changelog': body, + 'file_description': f'Open Shaders {release_ver} AIO. See changelog for included features.', + } + with open('nexus-matrix-raw.json', 'w') as f: + json.dump([row], f) + PYEOF + else + # Matrix mode: upstream behavior — the audit tool emits + # the full per-feature matrix. + ARGS=( --export-nexus-matrix --matrix-output nexus-matrix-raw.json ) + PREVIOUS_TAG=$(git tag --merged "$RELEASE_TAG" --list 'v*.*.*' --sort=-v:refname 2>/dev/null \ + | grep -v -- '-' \ + | awk -v current="$RELEASE_TAG" '$0 != current { print; exit }') + if [ -n "$PREVIOUS_TAG" ]; then + ARGS+=( --base "$PREVIOUS_TAG" ) + else + ARGS+=( --all-features ) + fi + ARGS+=( --core-mod-id "$EFFECTIVE_MOD_ID" ) + ARGS+=( --core-filename "$EFFECTIVE_FILENAME" ) + ARGS+=( --core-artifact-pattern "$EFFECTIVE_ARTIFACT" ) + RELEASE_VER="${RELEASE_TAG#v}" + ARGS+=( --release-version "$RELEASE_VER" ) + python tools/feature_version_audit.py "${ARGS[@]}" + + # Splice the GitHub release body into the CORE row's changelog. 
+ python3 << 'PYEOF' import json, os, re with open('release.json') as f: release = json.load(f) if not isinstance(release, dict): release = {} body = release.get('body') or '' - # Strip feature audit appendix — Nexus changelog should only - # contain human-readable release notes, not the audit table. body = re.sub(r'\n---\n+# Feature Version Audit\b.*', '', body, flags=re.DOTALL).strip() with open('nexus-matrix-raw.json') as f: data = json.load(f) @@ -183,14 +250,16 @@ jobs: break with open('nexus-matrix-raw.json', 'w') as f: json.dump(data, f) - " + PYEOF + fi - python -c " + # Common post-processing: build the upload-eligible matrix + # (filter on auto_upload + artifact presence in the release). + # Single-row AIO matrix trivially passes this filter. + python3 << 'PYEOF' import json, os, fnmatch with open('nexus-matrix-raw.json') as f: data = json.load(f) - # Load asset names from the release to skip uploads where the - # artifact was not included — applies to all rows including core. try: with open('release.json') as f: release = json.load(f) @@ -202,16 +271,8 @@ jobs: def artifact_in_release(row): pat = row.get('artifact_pattern', '') return bool(pat) and any(fnmatch.fnmatch(n, pat) for n in asset_names) - # Full matrix for artifact prep; upload matrix excludes auto_upload=false - # and features whose standalone artifact is absent from the release. upload_data = [row for row in data if row.get('auto_upload', True) and artifact_in_release(row)] - # Drift detector: a feature opted in to auto_upload but its - # artifact pattern doesn't match anything on the release. - # This is almost always a metadata bug (folder name vs - # nexusfilename mismatch — see HDR Display in v1.5.2). - # Surface it loudly via GHA warning + step summary so it - # cannot silently drop a feature from a release again. 
missing = [row for row in data if row.get('auto_upload') is True and not row.get('is_core') @@ -226,7 +287,6 @@ jobs: pat = row.get('artifact_pattern', '?') mod_id = row.get('nexus_mod_id', '?') lines.append(f'| `{name}` | `{pat}` | {mod_id} |') - # Annotation appears inline in the run UI. print(f'::warning title=Nexus auto-upload missing artifact::' f'{name} (mod {mod_id}) marked autoupload=true but no ' f'release asset matches pattern {pat!r}. Likely cause: ' @@ -248,7 +308,8 @@ jobs: json.dump({'include': upload_data or [{'name': '_empty', 'skip': True}]}, f) with open('nexus-upload-state.json', 'w') as f: json.dump({'has_uploads': has_uploads}, f) - " + PYEOF + DELIM=$(openssl rand -hex 8) { printf 'matrix<<%s\n' "$DELIM" @@ -258,13 +319,14 @@ jobs: cat nexus-upload-matrix.json printf '\n%s\n' "$DELIM" } >> "$GITHUB_OUTPUT" - HAS_UPLOADS=$(python -c 'import json; d=json.load(open("nexus-upload-state.json")); print("true" if d.get("has_uploads") else "false")') + HAS_UPLOADS=$(python3 -c 'import json; d=json.load(open("nexus-upload-state.json")); print("true" if d.get("has_uploads") else "false")') echo "has_uploads=${HAS_UPLOADS}" >> "$GITHUB_OUTPUT" env: RELEASE_TAG: ${{ steps.resolve.outputs.tag }} INPUT_NEXUS_MOD_ID: ${{ inputs.nexus_mod_id || '' }} INPUT_MOD_FILENAME: ${{ inputs.mod_filename || '' }} INPUT_ARTIFACT_PATTERN: ${{ inputs.artifact_pattern || '' }} + DRY_RUN: ${{ steps.dryrun.outputs.dry_run }} GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} prepare-artifacts: @@ -346,6 +408,7 @@ jobs: version = os.environ.get('VERSION', '') game_id = os.environ.get('NEXUS_GAME_ID', 'skyrimspecialedition') api_key = os.environ.get('UNEX_APIKEY', '') + ua = 'open-shaders-ci/1.0' if os.environ.get('MODE') == 'aio' else 'community-shaders-ci/1.0' summary = open(os.environ['GITHUB_STEP_SUMMARY'], 'a') def w(line=''): @@ -366,8 +429,8 @@ jobs: mod_id = str(row.get('nexus_mod_id', '')) planned = row.get('mod_version') or version label = row.get('mod_filename', name) - url = 
f'https://www.nexusmods.com/{game_id}/mods/{mod_id}' - link = f'[{label}]({url})' + url = f'https://www.nexusmods.com/{game_id}/mods/{mod_id}' if mod_id else '' + link = f'[{label}]({url})' if url else label if not api_key or not mod_id: w(f'| {link} | `{planned}` | ⚠️ skipped | — |') @@ -376,7 +439,7 @@ jobs: api_url = f'https://api.nexusmods.com/v1/games/{game_id}/mods/{mod_id}/files.json' req = urllib.request.Request(api_url, headers={ 'apikey': api_key, - 'User-Agent': 'community-shaders-ci/1.0', + 'User-Agent': ua, 'Accept': 'application/json', }) try: @@ -412,6 +475,7 @@ jobs: UPLOAD_MATRIX: ${{ needs.prepare-nexus-matrix.outputs.upload_matrix }} VERSION: ${{ needs.prepare-nexus-matrix.outputs.version }} NEXUS_GAME_ID: ${{ inputs.nexus_game_id || 'skyrimspecialedition' }} + MODE: ${{ needs.prepare-nexus-matrix.outputs.mode }} UNEX_APIKEY: ${{ secrets.UNEX_APIKEY }} upload-to-nexus: @@ -435,10 +499,7 @@ jobs: mod_version: ${{ matrix.mod_version || needs.prepare-nexus-matrix.outputs.version }} mod_filename: ${{ matrix.mod_filename }} changelog: ${{ inputs.changelog || matrix.changelog || '' }} - # file_description anchors each .7z to the CS release it shipped - # with. Empty string falls back to the upstream default - # ("See mod description for details.") so non-release dispatches - # without an audit-tool-built matrix still work. + # file_description anchors each .7z to the release it shipped with. 
file_description: ${{ matrix.file_description || '' }} check_existing: true secrets: @@ -457,6 +518,7 @@ jobs: echo "" >> $GITHUB_STEP_SUMMARY echo "**Workflow:** ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}" >> $GITHUB_STEP_SUMMARY echo "**Tag:** ${{ needs.prepare-nexus-matrix.outputs.tag }}" >> $GITHUB_STEP_SUMMARY + echo "**Mode:** ${{ needs.prepare-nexus-matrix.outputs.mode }}" >> $GITHUB_STEP_SUMMARY echo "**Dry run:** ${{ needs.prepare-nexus-matrix.outputs.dry_run }}" >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY @@ -480,6 +542,7 @@ jobs: version = os.environ.get('VERSION', '') game_id = os.environ.get('NEXUS_GAME_ID', 'skyrimspecialedition') api_key = os.environ.get('UNEX_APIKEY', '') + ua = 'open-shaders-ci/1.0' if os.environ.get('MODE') == 'aio' else 'community-shaders-ci/1.0' summary = open(os.environ['GITHUB_STEP_SUMMARY'], 'a') def w(line=''): @@ -491,7 +554,7 @@ jobs: url = f'https://api.nexusmods.com/v1/games/{game_id}/mods/{mod_id}/files.json' req = urllib.request.Request(url, headers={ 'apikey': api_key, - 'User-Agent': 'community-shaders-ci/1.0', + 'User-Agent': ua, 'Accept': 'application/json', }) try: @@ -515,7 +578,7 @@ jobs: mod_id = str(row.get('nexus_mod_id', '')) planned = row.get('mod_version') or version label = row.get('mod_filename', name) - url = f'https://www.nexusmods.com/{game_id}/mods/{mod_id}' + url = f'https://www.nexusmods.com/{game_id}/mods/{mod_id}' if mod_id else '' artifact = f'nexus-upload-{name}' versions = nexus_versions(mod_id) if versions is None: @@ -543,7 +606,9 @@ jobs: w('|--------|---------|---------|-------|----------|') for icon, label, planned, url, artifact, status in rows_status: art = f'`{artifact}`' if status in ('failed', 'unknown') else '—' - w(f'| {icon} | [{label}]({url}) | `{planned}` | [View]({url}) | {art} |') + link = f'[{label}]({url})' if url else label + view = f'[View]({url})' if url else '—' + w(f'| {icon} | {link} | `{planned}` | {view} | {art} 
|') if failed or unknowns: w() w('Re-run is safe: already-uploaded versions will be automatically skipped.') @@ -554,4 +619,5 @@ jobs: UPLOAD_MATRIX: ${{ needs.prepare-nexus-matrix.outputs.upload_matrix }} VERSION: ${{ needs.prepare-nexus-matrix.outputs.version }} NEXUS_GAME_ID: ${{ inputs.nexus_game_id || 'skyrimspecialedition' }} + MODE: ${{ needs.prepare-nexus-matrix.outputs.mode }} UNEX_APIKEY: ${{ secrets.UNEX_APIKEY }} diff --git a/.github/workflows/pr-checks.yaml b/.github/workflows/pr-checks.yaml index 27f3509377..26e255be6f 100644 --- a/.github/workflows/pr-checks.yaml +++ b/.github/workflows/pr-checks.yaml @@ -54,6 +54,10 @@ jobs: should-build: ${{ steps.changed-files.outputs.build_any_changed == 'true' || steps.changed-files.outputs.cpp_any_changed == 'true' || steps.changed-files.outputs.build_ci_any_changed == 'true' || steps.changed-files.outcome == 'failure' }} hlsl-should-build: ${{ steps.changed-files.outputs.hlsl_any_changed == 'true' || steps.changed-files.outputs.cmake_any_changed == 'true' || steps.changed-files.outputs.build_ci_any_changed == 'true' || steps.changed-files.outcome == 'failure' }} shader-tests-should-build: ${{ steps.changed-files.outputs.shader_tests_any_changed == 'true' || steps.changed-files.outputs.hlsl_any_changed == 'true' || steps.changed-files.outputs.cmake_any_changed == 'true' || steps.changed-files.outputs.build_ci_any_changed == 'true' || steps.changed-files.outcome == 'failure' }} + # cpp_tests target compiles src/Utils/Subrect.cpp directly (see tests/cpp/CMakeLists.txt), + # so any cpp_any change is conservative-but-correct. cpp_tests_any catches changes to + # the test sources themselves. cmake/build_ci cover CMake + CI plumbing. 
+ cpp-tests-should-build: ${{ steps.changed-files.outputs.cpp_tests_any_changed == 'true' || steps.changed-files.outputs.cpp_any_changed == 'true' || steps.changed-files.outputs.cmake_any_changed == 'true' || steps.changed-files.outputs.build_ci_any_changed == 'true' || steps.changed-files.outcome == 'failure' }} steps: - uses: actions/checkout@v6 with: @@ -104,6 +108,8 @@ jobs: - 'features/**/Shaders/**' shader_tests: - 'tests/shaders/**' + cpp_tests: + - 'tests/cpp/**' base_sha: ${{ github.event.pull_request.base.sha }} sha: ${{ github.event.pull_request.head.sha }} @@ -125,8 +131,10 @@ jobs: run-cpp: ${{ needs.check-changes.outputs.should-build != 'false' }} run-shader-validation: ${{ needs.check-changes.outputs.hlsl-should-build != 'false' }} run-shader-tests: ${{ needs.check-changes.outputs.shader-tests-should-build != 'false' }} + run-cpp-tests: ${{ needs.check-changes.outputs.cpp-tests-should-build != 'false' }} hlsl-should-build: ${{ needs.check-changes.outputs.hlsl-should-build || 'true' }} shader-tests-should-build: ${{ needs.check-changes.outputs.shader-tests-should-build || 'true' }} + cpp-tests-should-build: ${{ needs.check-changes.outputs.cpp-tests-should-build || 'true' }} # Security: this job uses GITHUB_TOKEN with write permissions but does NOT execute # fork code — it only downloads pre-built artifacts from the build job above. 
@@ -187,7 +195,7 @@ jobs: if: steps.gen_notes.outputs.notes_generated == 'true' uses: ncipollo/release-action@339a81892b84b4eeb0f6e744e4574d79d0d9b8dd # v1 with: - name: "Community Shaders ${{ needs.build.outputs.version }} PR #${{ github.event.pull_request.number }}" + name: "Open Shaders ${{ needs.build.outputs.version }} PR #${{ github.event.pull_request.number }}" tag: v${{ needs.build.outputs.version }}-pr${{ github.event.pull_request.number }} prerelease: true artifacts: "${{ github.workspace }}/dist/CommunityShaders_AIO-*.7z" @@ -200,7 +208,7 @@ jobs: if: steps.gen_notes.outputs.notes_generated != 'true' uses: ncipollo/release-action@339a81892b84b4eeb0f6e744e4574d79d0d9b8dd # v1 with: - name: "Community Shaders ${{ needs.build.outputs.version }} PR #${{ github.event.pull_request.number }}" + name: "Open Shaders ${{ needs.build.outputs.version }} PR #${{ github.event.pull_request.number }}" tag: v${{ needs.build.outputs.version }}-pr${{ github.event.pull_request.number }} prerelease: true artifacts: "${{ github.workspace }}/dist/CommunityShaders_AIO-*.7z" diff --git a/.github/workflows/release-build.yaml b/.github/workflows/release-build.yaml index 98fd52d763..244c9f4187 100644 --- a/.github/workflows/release-build.yaml +++ b/.github/workflows/release-build.yaml @@ -175,7 +175,7 @@ jobs: id: create_release uses: ncipollo/release-action@v1 with: - name: "Community Shaders ${{ steps.combined_notes.outputs.release_tag }}" + name: "Open Shaders ${{ steps.combined_notes.outputs.release_tag }}" draft: true omitDraftDuringUpdate: true tag: ${{ steps.combined_notes.outputs.release_tag }} diff --git a/.github/workflows/release-hotfix.yaml b/.github/workflows/release-hotfix.yaml index 6339f8099b..2827a38754 100644 --- a/.github/workflows/release-hotfix.yaml +++ b/.github/workflows/release-hotfix.yaml @@ -58,19 +58,12 @@ jobs: hotfix: runs-on: ubuntu-latest steps: - - name: Generate app token - uses: actions/create-github-app-token@v1 - id: app-token - with: - app-id: 
${{ secrets.APP_ID }} - private-key: ${{ secrets.APP_PRIVATE_KEY }} - - name: Checkout uses: actions/checkout@v6 with: fetch-depth: 0 persist-credentials: false - token: ${{ steps.app-token.outputs.token }} + token: ${{ secrets.RELEASE_PAT }} - name: Configure git identity run: | @@ -233,7 +226,7 @@ jobs: - name: Close superseded staging PRs and branches if: ${{ !inputs.dry_run && steps.cherry.outputs.picked_count != '0' }} env: - GH_TOKEN: ${{ steps.app-token.outputs.token }} + GH_TOKEN: ${{ secrets.RELEASE_PAT }} run: | set -euo pipefail HOTFIX='${{ steps.plan.outputs.hotfix_branch }}' @@ -253,7 +246,7 @@ jobs: - name: Push maintenance branch (if new) and staging branch if: ${{ !inputs.dry_run && steps.cherry.outputs.picked_count != '0' }} env: - GITHUB_TOKEN: ${{ steps.app-token.outputs.token }} + GITHUB_TOKEN: ${{ secrets.RELEASE_PAT }} run: | set -euo pipefail REMOTE="https://x-access-token:${GITHUB_TOKEN}@github.com/${{ github.repository }}.git" @@ -324,7 +317,13 @@ jobs: echo "Resolve by checking out \`${STAGING}\` locally, cherry-picking the conflicted commits manually, and pushing." fi echo "" - echo "See: [Hotfix Release Process](https://github.com/${{ github.repository }}/wiki/Developers#hotfix-release-process)" + # Link to the running repo's own wiki. Both upstream + # (community-shaders/skyrim-community-shaders) and the + # Open Shaders fork (alandtse/open-shaders) maintain + # wiki copies; `maint-update-wiki.yaml` keeps them in + # sync from buffer scans, so `${{ github.repository }}` + # resolves correctly in either context. 
+ echo "See: [Hotfix Release Process](${{ github.server_url }}/${{ github.repository }}/wiki/Developers#hotfix-release-process)" } > /tmp/pr-body.md # Best-effort: ensure the hotfix label exists so --label diff --git a/.github/workflows/release-semantic.yaml b/.github/workflows/release-semantic.yaml index 3fddf9c84b..57f1551780 100644 --- a/.github/workflows/release-semantic.yaml +++ b/.github/workflows/release-semantic.yaml @@ -7,7 +7,7 @@ name: "Release: Semantic Version" # release, then reconcile dev. Dev-source reconciles FF-only; hotfix- # staging source rebases dev onto the new main so the release tag # enters dev's ancestry — the only force-push this workflow performs. -# Open PRs targeting dev will need a manual rebase by their authors. +# `auto-rebase-prs.yaml` then rebases open PRs targeting dev. on: workflow_dispatch: @@ -30,13 +30,6 @@ jobs: release: runs-on: ubuntu-latest steps: - - name: Generate app token - uses: actions/create-github-app-token@v1 - id: app-token - with: - app-id: ${{ secrets.APP_ID }} - private-key: ${{ secrets.APP_PRIVATE_KEY }} - - name: Checkout repository uses: actions/checkout@v6 with: @@ -44,9 +37,9 @@ jobs: persist-credentials: false # Token used for the working checkout. The actual pushes # (FF main, semantic-release commit/tag, FF dev) all go - # through the app token below so they can bypass branch + # through RELEASE_PAT below so they can bypass branch # protection and fire downstream workflows. - token: ${{ steps.app-token.outputs.token }} + token: ${{ secrets.RELEASE_PAT }} - name: Configure git identity run: | @@ -114,7 +107,7 @@ jobs: - name: Fast-forward main to ff_target if: ${{ inputs.ff_target != '' }} env: - TOKEN: ${{ steps.app-token.outputs.token }} + TOKEN: ${{ secrets.RELEASE_PAT }} FF_TARGET: ${{ inputs.ff_target }} run: | set -euo pipefail @@ -143,19 +136,17 @@ jobs: @semantic-release/github tag_format: "v${version}" env: - # App token required for two reasons: + # RELEASE_PAT required for two reasons: # 1. 
Allows pushing the version commit and tag to the protected # target branch (GITHUB_TOKEN cannot bypass branch protection). - # Prerequisite: the GitHub App must be in the bypass actor - # list for main and dev — the token alone is not sufficient. - # 2. An app-token tag push triggers downstream workflows + # 2. A PAT-sourced tag push triggers downstream workflows # (release-build.yaml); GITHUB_TOKEN pushes do not. - GITHUB_TOKEN: ${{ steps.app-token.outputs.token }} + GITHUB_TOKEN: ${{ secrets.RELEASE_PAT }} - name: Reconcile dev with release commit (promotion mode, dev source only) if: ${{ inputs.ff_target != '' && steps.semantic.outputs.new_release_published == 'true' && steps.validate.outputs.source == 'dev' }} env: - TOKEN: ${{ steps.app-token.outputs.token }} + TOKEN: ${{ secrets.RELEASE_PAT }} run: | set -euo pipefail git fetch origin main dev @@ -186,7 +177,7 @@ jobs: - name: Rebase-reconcile dev with release commit (promotion mode, hotfix-staging source) if: ${{ inputs.ff_target != '' && steps.semantic.outputs.new_release_published == 'true' && steps.validate.outputs.source != 'dev' && steps.validate.outputs.source != '' }} env: - TOKEN: ${{ steps.app-token.outputs.token }} + TOKEN: ${{ secrets.RELEASE_PAT }} run: | set -euo pipefail git fetch origin main dev @@ -249,6 +240,6 @@ jobs: echo "- Source: hotfix-staging \`${{ steps.validate.outputs.source }}\`" echo "- New main: \`$(git rev-parse --short=9 "${NEW_MAIN}")\`" echo "- Dev rebased linearly onto new main; identical-patch commits auto-dropped by \`git rebase\`." - echo "- Open PRs against dev will need a manual rebase by their authors." + echo "- Open PRs against dev are auto-rebased by the \`Auto-rebase open PRs\` workflow." } >> "$GITHUB_STEP_SUMMARY" echo "::notice::Reconciled dev onto main via rebase (linear history preserved)." 
diff --git a/.gitmodules b/.gitmodules index db0fe8c6f7..7bed141dcb 100644 --- a/.gitmodules +++ b/.gitmodules @@ -8,3 +8,6 @@ path = extern/FidelityFX-SDK url = https://github.com/alandtse/FidelityFX-SDK-DX11 branch = optiscaler-build +[submodule "extern/cpp-mcp"] + path = extern/cpp-mcp + url = https://github.com/hkr04/cpp-mcp.git diff --git a/AI-INSTRUCTIONS.md b/AI-INSTRUCTIONS.md index 56b9ddfe66..daa956cfd7 100644 --- a/AI-INSTRUCTIONS.md +++ b/AI-INSTRUCTIONS.md @@ -1,6 +1,6 @@ # AI Development Instructions -This file provides guidance for AI assistants working with the Skyrim Community Shaders codebase. +This file provides guidance for AI assistants working with the Open Shaders codebase — a fork of [Community Shaders](https://github.com/community-shaders/skyrim-community-shaders) ([Nexus](https://www.nexusmods.com/skyrimspecialedition/mods/180419)). The runtime layout (DLL name, settings path, log file) intentionally matches upstream Community Shaders so users can switch without losing settings; only the public display name and in-game branding are "Open Shaders". ## Primary Documentation @@ -52,4 +52,10 @@ For full details about manual packaging targets (Package-Core, Package-AIO-Manua **Key Focus**: Performance impact awareness, runtime compatibility (SE/AE/VR), complete working solutions, DirectX/HLSL best practices. +**Style directives** (see `.claude/CLAUDE.md` "Code Quality Expectations" for full text): + +- **Concise comments**: explain _why_, not _what_. Don't paraphrase the next 4 lines in 4 lines of comment. +- **Minimal churn**: PRs touch only what the change requires. Out-of-scope cleanups go in a follow-up, not the current diff. +- **No comments about absent code**: don't describe code that used to be in the file but isn't now ("X was removed", "Y was renamed from Z"). The deletion isn't visible. Past-tense framing of present behavior is fine. Exception: a regression-risk warning that names the removed code to prevent re-adding it. 
+ For detailed explanations, examples, and comprehensive guidance, refer to `.claude/CLAUDE.md`. diff --git a/CMakeLists.txt b/CMakeLists.txt index c05cbf4f1b..a070df5f27 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -44,11 +44,17 @@ option( "Build shader unit tests (runs automatically before packaging)" ON ) +option( + AIO_INCLUDE_NON_AUTOUPLOAD + "Include features whose .ini has autoupload != true in the AIO. Off by default — the AIO ships only release-ready features. Turn on for local dev builds that want every feature regardless of release status." + OFF +) message("\tAuto plugin deployment: ${AUTO_PLUGIN_DEPLOYMENT}") message("\tZip to dist: ${ZIP_TO_DIST}") message("\tAIO Zip to dist: ${AIO_ZIP_TO_DIST}") message("\tTracy profiler: ${TRACY_SUPPORT}") message("\tShader tests: ${BUILD_SHADER_TESTS}") +message("\tAIO include non-autoupload features: ${AIO_INCLUDE_NON_AUTOUPLOAD}") # ####################################################################################################################### # # Build version info from git @@ -88,6 +94,7 @@ find_package(pystring CONFIG REQUIRED) find_package(cppwinrt CONFIG REQUIRED) find_package(unordered_dense CONFIG REQUIRED) find_package(efsw CONFIG REQUIRED) +find_path(EXPRTK_INCLUDE_DIRS "exprtk.hpp" REQUIRED) find_package(Tracy CONFIG REQUIRED) find_package(directx-headers CONFIG REQUIRED) add_subdirectory(${CMAKE_SOURCE_DIR}/cmake/Streamline) @@ -95,6 +102,7 @@ add_subdirectory(${CMAKE_SOURCE_DIR}/cmake/Streamline) find_path(DETOURS_INCLUDE_DIRS "detours/detours.h") find_library(DETOURS_LIBRARY detours REQUIRED) include(FidelityFX-SDK) +include(cpp-mcp) target_compile_definitions( ${PROJECT_NAME} @@ -121,6 +129,7 @@ target_include_directories( ${CLIB_UTIL_INCLUDE_DIRS} "${CMAKE_SOURCE_DIR}/package/Shaders" ${DETOURS_INCLUDE_DIRS} + ${EXPRTK_INCLUDE_DIRS} ) target_link_libraries( @@ -138,6 +147,7 @@ target_link_libraries( unordered_dense::unordered_dense efsw::efsw Tracy::TracyClient + cpp-mcp Streamline 
d3d12.lib Microsoft::DirectX-Headers @@ -352,6 +362,117 @@ string(TIMESTAMP UTC_NOW "%Y-%m-%dT%H-%MZ" UTC) # Set AIO directory path used by multiple targets below set(AIO_DIR "${CMAKE_CURRENT_BINARY_DIR}/aio") +# TRUE iff the feature's .ini sets `autoupload = true` (case-insensitive) +# or the feature is CORE (always included). Result in ${out_var}. +function(feature_is_autoupload feature_path out_var) + if(EXISTS "${feature_path}/CORE") + set(${out_var} TRUE PARENT_SCOPE) + return() + endif() + file(GLOB _ini_candidates "${feature_path}/Shaders/Features/*.ini") + foreach(_ini IN LISTS _ini_candidates) + file(STRINGS "${_ini}" _ini_lines REGEX "^[ \t]*autoupload[ \t]*=") + foreach(_line IN LISTS _ini_lines) + # Strip "key =" to extract the value, lowercase it. + string( + REGEX REPLACE "^[ \t]*autoupload[ \t]*=[ \t]*" + "" + _val + "${_line}" + ) + string(STRIP "${_val}" _val) + string(TOLOWER "${_val}" _val) + # Treat unset / falsey values as off; only explicit truthy values count. + if( + _val STREQUAL "true" + OR _val STREQUAL "1" + OR _val STREQUAL "yes" + OR _val STREQUAL "on" + ) + set(${out_var} TRUE PARENT_SCOPE) + return() + endif() + endforeach() + endforeach() + set(${out_var} FALSE PARENT_SCOPE) +endfunction() + +# Partition features into AIO-included / -excluded sets. +# AIO_INCLUDE_NON_AUTOUPLOAD=ON includes everything (local dev override). 
+set(AIO_INCLUDED_FEATURE_PATHS "") +set(AIO_EXCLUDED_FEATURE_PATHS "") +foreach(_fpath IN LISTS FEATURE_PATHS) + if(NOT IS_DIRECTORY "${_fpath}") + continue() + endif() + if(AIO_INCLUDE_NON_AUTOUPLOAD) + list(APPEND AIO_INCLUDED_FEATURE_PATHS "${_fpath}") + else() + feature_is_autoupload("${_fpath}" _ok) + if(_ok) + list(APPEND AIO_INCLUDED_FEATURE_PATHS "${_fpath}") + else() + list(APPEND AIO_EXCLUDED_FEATURE_PATHS "${_fpath}") + endif() + endif() +endforeach() +list(LENGTH AIO_INCLUDED_FEATURE_PATHS _aio_in_count) +list(LENGTH AIO_EXCLUDED_FEATURE_PATHS _aio_out_count) +message("\tAIO features included: ${_aio_in_count}") + +# Build the paths (relative to AIO root) that the archive step deletes +# from the stage. We derive them from each feature's on-disk structure +# because the source folder, inner shader dir, and .ini basename can +# all differ (e.g. IBL ships ImageBasedLighting.ini under +# Shaders/Features, with shader code under Shaders/IBL/). +set(AIO_EXCLUDED_FEATURE_NAMES "") +set(AIO_EXCLUDED_STAGE_RELPATHS "") +foreach(_fpath IN LISTS AIO_EXCLUDED_FEATURE_PATHS) + get_filename_component(_fname "${_fpath}" NAME) + list(APPEND AIO_EXCLUDED_FEATURE_NAMES "${_fname}") + + if(EXISTS "${_fpath}/Shaders") + file( + GLOB _shader_subdirs + LIST_DIRECTORIES TRUE + RELATIVE "${_fpath}/Shaders" + "${_fpath}/Shaders/*" + ) + foreach(_sd IN LISTS _shader_subdirs) + if(_sd STREQUAL "Features") + continue() + endif() + if(IS_DIRECTORY "${_fpath}/Shaders/${_sd}") + list(APPEND AIO_EXCLUDED_STAGE_RELPATHS "Shaders/${_sd}") + endif() + endforeach() + endif() + + if(EXISTS "${_fpath}/Shaders/Features") + file( + GLOB _ini_files + RELATIVE "${_fpath}/Shaders" + "${_fpath}/Shaders/Features/*.ini" + ) + foreach(_ini IN LISTS _ini_files) + list(APPEND AIO_EXCLUDED_STAGE_RELPATHS "Shaders/${_ini}") + endforeach() + endif() +endforeach() +list(REMOVE_DUPLICATES AIO_EXCLUDED_STAGE_RELPATHS) + +if(_aio_out_count GREATER 0) + message( + "\tAIO features excluded from release 
archive (autoupload != true, set AIO_INCLUDE_NON_AUTOUPLOAD=ON to override): ${_aio_out_count}" + ) + foreach(_fname IN LISTS AIO_EXCLUDED_FEATURE_NAMES) + message("\t\t- ${_fname}") + endforeach() + message( + "\tNote: excluded features still build and validate; they are stripped only at AIO archive time." + ) +endif() + # Robocopy wrapper for Windows incremental file copy (used by deployment targets). # We invoke through `cmd /c ""` rather than the bare wrapper path because # modern Windows refuses to execute scripts from the current directory without an explicit @@ -371,7 +492,9 @@ endif() # # CMake install() infrastructure for manual packaging # ####################################################################################################################### -# Append a '/' to the end of each feature path for installation all its contents but not itself +# Append a '/' to install contents but not the directory itself. Uses +# the full FEATURE_PATHS (not the autoupload filter) so cross-feature +# shader #includes still resolve at build time. set(FEATURE_PATHS_SLASH ${FEATURE_PATHS}) list(TRANSFORM FEATURE_PATHS_SLASH APPEND /) @@ -481,9 +604,15 @@ if(AUTO_PLUGIN_DEPLOYMENT OR AIO_ZIP_TO_DIST) list(FILTER _AIO_PACKAGE_SOURCE_FILES EXCLUDE REGEX "/Shaders/") append_copy_if_different(_prepare_aio_cmds _AIO_PACKAGE_SOURCE_FILES "${CMAKE_SOURCE_DIR}/package" "${AIO_DIR}") - # Copy feature folders (only files, preserve existing files in AIO). - # Shader files are excluded - copy_shaders.stamp owns the Shaders/ subdir - # so the two custom-build rules don't race on the same destinations. + # Copy feature folders (files only; copy_shaders.stamp owns Shaders/ + # to avoid racing). All features copied — autoupload filter applies + # only at archive time. + # + # The CORE marker is excluded from the per-feature copy: every feature + # that has one would write to the same `${AIO_DIR}/CORE` path, and the + # trailing `remove` below deletes it anyway. 
When MSBuild runs the copy + # commands in parallel two features can race for the write and one + # fails with "Permission denied", taking PREPARE_AIO down with it. foreach(_fpath IN LISTS FEATURE_PATHS) if(EXISTS "${_fpath}") file( @@ -492,11 +621,13 @@ if(AUTO_PLUGIN_DEPLOYMENT OR AIO_ZIP_TO_DIST) "${_fpath}/*" ) list(FILTER _feature_files EXCLUDE REGEX "/Shaders/") + list(FILTER _feature_files EXCLUDE REGEX "/CORE$") append_copy_if_different(_prepare_aio_cmds _feature_files "${_fpath}" "${AIO_DIR}") endif() endforeach() - # Remove CORE from AIO if it exists (keep rest intact) + # Remove CORE from AIO if it's left over from a previous build that pre- + # dated the per-feature exclusion above. Cheap, idempotent. list( APPEND _prepare_aio_cmds COMMAND @@ -569,6 +700,8 @@ if(AUTO_PLUGIN_DEPLOYMENT OR AIO_ZIP_TO_DIST) list(FILTER _package_shaders EXCLUDE REGEX "/Tests/") set(_shader_copy_cmds) + # All features' shaders are copied — cross-feature #includes need the + # full tree at compile time (e.g. DistantTree.hlsl pulls IBL/IBL.hlsli). set(_feature_shader_paths) foreach(_fpath IN LISTS FEATURE_PATHS) if(EXISTS "${_fpath}/Shaders") @@ -596,7 +729,6 @@ if(AUTO_PLUGIN_DEPLOYMENT OR AIO_ZIP_TO_DIST) append_copy_if_different(_shader_copy_cmds _package_shaders "${CMAKE_SOURCE_DIR}/package/Shaders" "${AIO_DIR}/Shaders") - # feature shader folders foreach(_fpath IN LISTS FEATURE_PATHS) if(EXISTS "${_fpath}/Shaders") file( @@ -635,11 +767,12 @@ if(AUTO_PLUGIN_DEPLOYMENT) # Detect git HEAD changes (branch switch or new commit) and invalidate stale # deploy state. BuildRelease.bat always reconfigures, so this runs on every # build invocation. When HEAD changes we: - # 1. Touch all AIO files so robocopy /XO sees them as newer than any - # previously-deployed files (including manual package installs). + # 1. Clear the AIO directory so PREPARE_AIO re-copies everything with + # fresh timestamps (catches content-identical files cmake's + # copy_if_different would otherwise skip). 
# 2. Delete deploy stamps so CMake re-runs the robocopy commands. - # Files the user intentionally edited in-game remain protected by /XO on - # subsequent builds (same HEAD = no touch, same stamp file exists). + # Per-build incremental relies on robocopy's default name+size+timestamp + # detection to copy any file whose size or mtime differs from dest. execute_process( COMMAND git rev-parse --short HEAD WORKING_DIRECTORY "${CMAKE_SOURCE_DIR}" @@ -662,10 +795,10 @@ if(AUTO_PLUGIN_DEPLOYMENT) # Delete the entire AIO directory so PREPARE_AIO re-copies everything # with fresh timestamps. Without this, copy_if_different skips # content-identical files (shaders, textures, configs, etc.) and - # leaves them with old timestamps that deployed files (e.g. from a - # manual package install) may beat under /XO. The DLL is always - # rebuilt fresh by the C++ compile step so it doesn't need special - # handling. + # leaves them with old timestamps that previously-deployed files + # (e.g. from a manual package install) can beat on timestamp delta. + # The DLL is always rebuilt fresh by the C++ compile step so it + # doesn't need special handling. file(REMOVE_RECURSE "${AIO_DIR}") # Also clear the PREPARE_AIO / COPY_SHADERS stamps so cmake --build # actually re-runs those targets. @@ -703,10 +836,18 @@ if(AUTO_PLUGIN_DEPLOYMENT) add_custom_command( OUTPUT ${DEPLOY_TARGET_HASH}_deploy.stamp COMMAND ${CMAKE_COMMAND} -E make_directory "${DEPLOY_TARGET}" + # No /XO: a previous manual or external deploy can leave dest + # files with mtime > source. /XO then skips them even when the + # source file changed (different size and/or content), so build + # output silently diverges from what runs in-game. Without /XO, + # robocopy's default name+size+timestamp delta catches both + # newer-source and size-mismatch cases. The git-HEAD-change + # block above handles bulk-redeploy when checking out a new + # branch; this command handles the per-build incremental. 
COMMAND ${ROBOCOPY_WRAPPER} "${AIO_DIR}" "${DEPLOY_TARGET}" "/E" - "/XD" "${AIO_DIR}/Shaders" "/COPY:DAT" "/XO" "/R:1" "/W:1" - "/NFL" "/NDL" "/NJH" "/NJS" + "/XD" "${AIO_DIR}/Shaders" "/COPY:DAT" "/R:1" "/W:1" "/NFL" + "/NDL" "/NJH" "/NJS" COMMAND ${CMAKE_COMMAND} -E touch ${DEPLOY_TARGET_HASH}_deploy.stamp DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/prepare_aio.stamp @@ -746,10 +887,11 @@ if(AUTO_PLUGIN_DEPLOYMENT) COMMAND ${CMAKE_COMMAND} -E make_directory "${DEPLOY_TARGET}/Shaders" + # See /XO rationale above on the main deploy block. COMMAND ${ROBOCOPY_WRAPPER} "${AIO_DIR}/Shaders" - "${DEPLOY_TARGET}/Shaders" "/E" "/COPY:DAT" "/XO" "/R:1" - "/W:1" "/NFL" "/NDL" "/NJH" "/NJS" + "${DEPLOY_TARGET}/Shaders" "/E" "/COPY:DAT" "/R:1" "/W:1" + "/NFL" "/NDL" "/NJH" "/NJS" COMMAND ${CMAKE_COMMAND} -E touch ${DEPLOY_TARGET_HASH}_shaders_only.stamp @@ -769,10 +911,11 @@ if(AUTO_PLUGIN_DEPLOYMENT) COMMAND ${CMAKE_COMMAND} -E make_directory "${DEPLOY_TARGET}/Shaders" + # See /XO rationale above on the main deploy block. COMMAND ${ROBOCOPY_WRAPPER} "${AIO_DIR}/Shaders" - "${DEPLOY_TARGET}/Shaders" "/E" "/COPY:DAT" "/XO" "/R:1" - "/W:1" "/NFL" "/NDL" "/NJH" "/NJS" + "${DEPLOY_TARGET}/Shaders" "/E" "/COPY:DAT" "/R:1" "/W:1" + "/NFL" "/NDL" "/NJH" "/NJS" COMMAND ${CMAKE_COMMAND} -E touch ${DEPLOY_TARGET_HASH}_shaders_full.stamp @@ -915,15 +1058,92 @@ if(AIO_ZIP_TO_DIST) set(TARGET_AIO_ZIP "${PROJECT_NAME}_AIO-${UTC_NOW}.7z") set(AIO_ARCHIVE "${CMAKE_SOURCE_DIR}/dist/${TARGET_AIO_ZIP}") set(AIO_ZIP_STAMP "${CMAKE_CURRENT_BINARY_DIR}/aio_package.stamp") + set(AIO_STAGE_DIR "${CMAKE_CURRENT_BINARY_DIR}/aio_stage") message("Zipping ${AIO_DIR} to ${AIO_ARCHIVE}") + # Fast path: tar AIO_DIR directly. Slow path (excluded features): + # copy → stage, strip excluded relpaths, tar the stage. AIO_DIR stays + # complete so AUTO_PLUGIN_DEPLOYMENT delivers the full tree. 
+ set(_aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + make_directory + "${CMAKE_SOURCE_DIR}/dist" + ) + if(AIO_EXCLUDED_FEATURE_NAMES) + list( + APPEND _aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + rm + -rf + "${AIO_STAGE_DIR}" + COMMAND + ${CMAKE_COMMAND} + -E + copy_directory + "${AIO_DIR}" + "${AIO_STAGE_DIR}" + ) + foreach(_relpath IN LISTS AIO_EXCLUDED_STAGE_RELPATHS) + list( + APPEND _aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + rm + -rf + "${AIO_STAGE_DIR}/${_relpath}" + ) + endforeach() + list( + APPEND _aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + chdir + "${AIO_STAGE_DIR}" + ${CMAKE_COMMAND} + -E + tar + cf + "${AIO_ARCHIVE}" + --format=7zip + -- + . + ) + else() + list( + APPEND _aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + chdir + "${AIO_DIR}" + ${CMAKE_COMMAND} + -E + tar + cf + "${AIO_ARCHIVE}" + --format=7zip + -- + . + ) + endif() + list( + APPEND _aio_zip_cmds + COMMAND + ${CMAKE_COMMAND} + -E + touch + ${AIO_ZIP_STAMP} + ) + add_custom_command( - OUTPUT ${AIO_ZIP_STAMP} - COMMAND ${CMAKE_COMMAND} -E make_directory "${CMAKE_SOURCE_DIR}/dist" - COMMAND ${CMAKE_COMMAND} -E tar cf ${AIO_ARCHIVE} --format=7zip -- . - COMMAND ${CMAKE_COMMAND} -E touch ${AIO_ZIP_STAMP} - WORKING_DIRECTORY ${AIO_DIR} + OUTPUT ${AIO_ZIP_STAMP} ${_aio_zip_cmds} DEPENDS PREPARE_AIO ${CMAKE_CURRENT_BINARY_DIR}/copy_shaders.stamp COMMENT "Creating AIO archive ${AIO_ARCHIVE}" ) @@ -1021,16 +1241,89 @@ add_custom_command( ) add_custom_target("AIO" DEPENDS ${AIO_DIR}/SKSE/Plugins/${PROJECT_NAME}.dll) -# Manual AIO package target +# Manual AIO package target. Strips non-autoupload features at archive +# time (mirroring AIO_ZIP_PACKAGE) so the manual path produces the same +# release-ready bundle as the automated path. 
set(AIO_PACKAGE "${DIST_PATH}/${PROJECT_NAME}_AIO-${UTC_NOW}.7z") +set(AIO_MANUAL_STAGE_DIR "${CMAKE_CURRENT_BINARY_DIR}/aio_manual_stage") +set(_aio_manual_cmds + COMMAND + ${CMAKE_COMMAND} + -E + make_directory + ${AIO_DIR} + COMMAND + ${CMAKE_COMMAND} + --install + ${CMAKE_BINARY_DIR} + --prefix + ${AIO_DIR} +) +if(AIO_EXCLUDED_FEATURE_NAMES) + list( + APPEND _aio_manual_cmds + COMMAND + ${CMAKE_COMMAND} + -E + rm + -rf + "${AIO_MANUAL_STAGE_DIR}" + COMMAND + ${CMAKE_COMMAND} + -E + copy_directory + "${AIO_DIR}" + "${AIO_MANUAL_STAGE_DIR}" + ) + foreach(_relpath IN LISTS AIO_EXCLUDED_STAGE_RELPATHS) + list( + APPEND _aio_manual_cmds + COMMAND + ${CMAKE_COMMAND} + -E + rm + -rf + "${AIO_MANUAL_STAGE_DIR}/${_relpath}" + ) + endforeach() + list( + APPEND _aio_manual_cmds + COMMAND + ${CMAKE_COMMAND} + -E + chdir + "${AIO_MANUAL_STAGE_DIR}" + ${CMAKE_COMMAND} + -E + tar + cfv + ${AIO_PACKAGE} + --format=7zip + -- + . + ) +else() + list( + APPEND _aio_manual_cmds + COMMAND + ${CMAKE_COMMAND} + -E + chdir + ${AIO_DIR} + ${CMAKE_COMMAND} + -E + tar + cfv + ${AIO_PACKAGE} + --format=7zip + -- + . + ) +endif() + add_custom_command( OUTPUT ${AIO_PACKAGE} - DEPENDS ${CORE_SOURCES} - COMMAND ${CMAKE_COMMAND} -E make_directory ${AIO_DIR} - COMMAND ${CMAKE_COMMAND} --install ${CMAKE_BINARY_DIR} --prefix ${AIO_DIR} - COMMAND - ${CMAKE_COMMAND} -E chdir ${AIO_DIR} ${CMAKE_COMMAND} -E tar cfv - ${AIO_PACKAGE} --format=7zip -- . + DEPENDS ${CORE_SOURCES} ${_aio_manual_cmds} COMMENT "Creating AIO zip package (manual)" ) add_custom_target("Package-AIO-Manual" DEPENDS ${AIO_PACKAGE}) @@ -1042,7 +1335,26 @@ if(BUILD_SHADER_TESTS) message(STATUS "Adding shader tests subdirectory") enable_testing() # Enable CTest integration for shader tests add_subdirectory(tests/shaders) +endif() + +# C++ unit tests for plugin utility code (separate from HLSL shader tests). +# Gated on its own flag so it can be enabled/disabled independently. 
+if(EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/tests/cpp/CMakeLists.txt") + enable_testing() + add_subdirectory(tests/cpp) + if(TARGET cpp_tests) + add_custom_target( + run_cpp_tests + COMMAND $ --reporter compact + DEPENDS cpp_tests + WORKING_DIRECTORY $ + COMMENT "Running C++ unit tests..." + VERBATIM + ) + endif() +endif() +if(BUILD_SHADER_TESTS) # Add a custom target that runs the shader tests # Users can run this manually with: cmake --build --target run_shader_tests # Runs the test executable directly (not via CTest) to show discovery count @@ -1082,7 +1394,7 @@ if(BUILD_SHADER_TESTS) endif() message("*************************************************************") -message("Community Shaders configuration complete") +message("Open Shaders configuration complete (fork of Community Shaders)") message("To prepare a ZIP package of AIO, Core, or Features") message(" Build cmake targets:") message(" - Package-Core: Core package") diff --git a/Dockerfile b/Dockerfile index af2eac944d..927ca80a41 100644 --- a/Dockerfile +++ b/Dockerfile @@ -24,8 +24,8 @@ RUN git clone https://github.com/microsoft/vcpkg.git C:/vcpkg && \ cd C:/vcpkg && \ bootstrap-vcpkg.bat -RUN setx /M VCPKG_ROOT "C:/vcpkg" && mkdir C:\skyrim-community-shaders +RUN setx /M VCPKG_ROOT "C:/vcpkg" && mkdir C:\open-shaders -WORKDIR C:/skyrim-community-shaders +WORKDIR C:/open-shaders -ENTRYPOINT ["powershell", "-File", "C:/skyrim-community-shaders/containerbuild.ps1"] +ENTRYPOINT ["powershell", "-File", "C:/open-shaders/containerbuild.ps1"] diff --git a/README.md b/README.md index 9a8128f28d..e90af5f4e3 100644 --- a/README.md +++ b/README.md @@ -1,23 +1,37 @@ -[![Latest Release](https://img.shields.io/github/v/release/doodlum/skyrim-community-shaders)](https://github.com/doodlum/skyrim-community-shaders/releases) -[![License](https://img.shields.io/github/license/doodlum/skyrim-community-shaders)](./LICENSE) -[![Last 
Commit](https://img.shields.io/github/last-commit/doodlum/skyrim-community-shaders)](https://github.com/doodlum/skyrim-community-shaders/commits) -[![Build Status](https://img.shields.io/github/actions/workflow/status/doodlum/skyrim-community-shaders/release-build.yaml?branch=dev)](https://github.com/doodlum/skyrim-community-shaders/actions) -[![Discord](https://img.shields.io/discord/1080142797870485606?label=discord&logo=discord&color=5865F2)](https://discord.com/invite/nkrQybAsyy) -[![Open Issues](https://img.shields.io/github/issues/doodlum/skyrim-community-shaders)](https://github.com/doodlum/skyrim-community-shaders/issues) -[![Contributors](https://img.shields.io/github/contributors/doodlum/skyrim-community-shaders)](https://github.com/doodlum/skyrim-community-shaders/graphs/contributors) -[![Stars](https://img.shields.io/github/stars/doodlum/skyrim-community-shaders?style=social)](https://github.com/doodlum/skyrim-community-shaders/stargazers) +[![Latest Release](https://img.shields.io/github/v/release/alandtse/open-shaders)](https://github.com/alandtse/open-shaders/releases) +[![License](https://img.shields.io/github/license/alandtse/open-shaders)](./LICENSE) +[![Last Commit](https://img.shields.io/github/last-commit/alandtse/open-shaders)](https://github.com/alandtse/open-shaders/commits) +[![Build Status](https://img.shields.io/github/actions/workflow/status/alandtse/open-shaders/release-build.yaml?branch=dev)](https://github.com/alandtse/open-shaders/actions) +[![Open Issues](https://img.shields.io/github/issues/alandtse/open-shaders)](https://github.com/alandtse/open-shaders/issues) +[![Contributors](https://img.shields.io/github/contributors/alandtse/open-shaders)](https://github.com/alandtse/open-shaders/graphs/contributors) +[![Stars](https://img.shields.io/github/stars/alandtse/open-shaders?style=social)](https://github.com/alandtse/open-shaders/stargazers) -[![Pre-commit 
CI](https://results.pre-commit.ci/badge/github/doodlum/skyrim-community-shaders/dev.svg)](https://results.pre-commit.ci/latest/github/doodlum/skyrim-community-shaders/dev) -![CodeRabbit Pull Request Reviews](https://img.shields.io/coderabbit/prs/github/doodlum/skyrim-community-shaders?utm_source=oss&utm_medium=github&utm_campaign=doodlum%2Fskyrim-community-shaders&labelColor=171717&color=FF570A&link=https%3A%2F%2Fcoderabbit.ai&label=CodeRabbit+Reviews) +[![Pre-commit CI](https://results.pre-commit.ci/badge/github/alandtse/open-shaders/dev.svg)](https://results.pre-commit.ci/latest/github/alandtse/open-shaders/dev) +![CodeRabbit Pull Request Reviews](https://img.shields.io/coderabbit/prs/github/alandtse/open-shaders?utm_source=oss&utm_medium=github&utm_campaign=alandtse%2Fopen-shaders&labelColor=171717&color=FF570A&link=https%3A%2F%2Fcoderabbit.ai&label=CodeRabbit+Reviews) -[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/doodlum/skyrim-community-shaders) +[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/alandtse/open-shaders) -# Skyrim Community Shaders +# Open Shaders -SKSE core plugin for community-driven advanced graphics modifications. +SKSE core plugin for advanced graphics modifications for Skyrim, and a fork of Community Shaders.
-[Nexus](https://www.nexusmods.com/skyrimspecialedition/mods/86492) -[User Wiki](https://modding.wiki/en/skyrim/developers/community-shaders) +[Open Shaders developer wiki](https://github.com/alandtse/open-shaders/wiki) · [Upstream Community Shaders on Nexus](https://www.nexusmods.com/skyrimspecialedition/mods/180419) · [Upstream source](https://github.com/community-shaders/skyrim-community-shaders) · [Upstream developer wiki](https://github.com/community-shaders/skyrim-community-shaders/wiki) + +## About this fork + +**Open Shaders is a fork of [Community Shaders](https://github.com/community-shaders/skyrim-community-shaders).** All of the architecture, the shader pipeline, the feature framework, and the vast majority of the code in this repository originated upstream and is the work of the upstream Community Shaders authors and contributors. This fork inherits the upstream [GPL-3.0-or-later license with the Modding and Linking exceptions](./COPYING) — copyrights, authorship, and the modding exceptions are preserved unchanged. See the upstream [contributors page](https://github.com/community-shaders/skyrim-community-shaders/graphs/contributors) for the team behind the project. 
+ +**Naming convention used throughout this repo and the in-game UI:** + +| Term | Refers to | +| ---------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | +| Community Shaders | The upstream project (`community-shaders/skyrim-community-shaders`, Nexus mod 180419) | +| Open Shaders | This fork (`alandtse/open-shaders`) | +| `CommunityShaders` (as a path / filename / identifier in source) | Runtime-compat identifier; intentionally kept identical to upstream so settings, themes, and SKSE plugin discovery work without migration | + +The upstream branding (logo, Nexus icon, typography) is non-GPL and not redistributed by this fork — see the "Icons" section under [License](#license) below. + +An Open Shaders Nexus mod page does not exist yet; for now, install from [GitHub releases](https://github.com/alandtse/open-shaders/releases) or build from source. ## Requirements @@ -60,13 +74,15 @@ Install them manually only if you want them in everywhere. To clone the repository with all submodules, run the following command in your terminal: ```bash -git clone https://github.com/doodlum/skyrim-community-shaders.git --recursive -cd skyrim-community-shaders +git clone https://github.com/alandtse/open-shaders.git --recursive +cd open-shaders ``` +> The DLL filename is `CommunityShaders.dll` and the SKSE plugin directory is `SKSE/Plugins/CommunityShaders/` — identical to upstream Community Shaders, so user settings, themes, and mod-manager profiles are drop-in compatible. Only the public name and in-game branding are "Open Shaders". + ### Visual Studio build -To build the project, just open `./skyrim-community-shaders` with Visual Studio's "Open Folder" feature. (Ensure you have `CMake Tools for Windows` selected when installing VS) +To build the project, just open `./open-shaders` with Visual Studio's "Open Folder" feature. 
(Ensure you have `CMake Tools for Windows` selected when installing VS) Follow the prompts to `Configure` and `Build` the project. It should generate the AIO package in the `./build/ALL/aio` folder by default. @@ -118,6 +134,8 @@ cmake --build ./build/ALL --config Release --target Package-Core cmake --build ./build/ALL --config Release --target Package-GrassLighting ``` +The AIO bundles only features marked `autoupload = true` in their feature `.ini` — features not yet ready for release are built but excluded from the AIO. To include everything in a local build, see the `AIO_INCLUDE_NON_AUTOUPLOAD` CMake option. + For more details about packaging targets, options, and the difference between automated and manual packaging, see the "Manual packaging targets (detailed)" section in `.claude/CLAUDE.md`. #### CMAKE Options (optional) @@ -149,13 +167,13 @@ For those who prefer to not install Visual Studio or other build dependencies on ```pwsh & 'C:\Program Files\Docker\Docker\DockerCli.exe' -SwitchWindowsEngine; ` -docker build -t skyrim-community-shaders . +docker build -t open-shaders . ``` 3. Then run the build: ```pwsh -docker run -it --rm -v .:C:/skyrim-community-shaders skyrim-community-shaders:latest +docker run -it --rm -v .:C:/open-shaders open-shaders:latest ``` 4. Retrieve the generated build files from the `build/aio` folder. 
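The `autoupload` rule stated in the packaging hunk above (only features whose `.ini` sets `autoupload = true` go into the AIO, unless `AIO_INCLUDE_NON_AUTOUPLOAD` is on) can be sketched roughly as follows. This is an illustrative model, not the project's CMake logic: the function names, the caller-supplied `core` set, and the choice to treat a missing key as `false` are all assumptions. Only the `[Nexus]` section and the `autoupload` key come from the `.ini` shown in this diff.

```python
import configparser


def is_autoupload(ini_text):
    """Return True when the feature .ini has `autoupload = true` under [Nexus].

    ASSUMPTION: a missing section or key is treated as False; the real
    build may handle that case differently.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("Nexus", "autoupload", fallback=False)


def aio_features(features, core, include_non_autoupload=False):
    """Pick the feature names that would land in the AIO archive.

    `features` maps a feature name to its .ini text. `core` is a
    hypothetical set of names that are always shipped. The flag
    `include_non_autoupload` mirrors the AIO_INCLUDE_NON_AUTOUPLOAD option.
    """
    return sorted(
        name
        for name, ini_text in features.items()
        if include_non_autoupload or name in core or is_autoupload(ini_text)
    )


# Usage with made-up feature names; the first .ini mirrors the one in this diff.
features = {
    "LightLimitFix": "[Info]\nVersion = 3-2-0\n\n[Nexus]\nautoupload = false\n",
    "ExampleFeature": "[Info]\nVersion = 1-0-0\n\n[Nexus]\nautoupload = true\n",
    "ExampleCore": "[Info]\nVersion = 1-0-0\n",
}
release_set = aio_features(features, core={"ExampleCore"})
dev_set = aio_features(features, core={"ExampleCore"}, include_non_autoupload=True)
```

With these inputs the release set leaves out `LightLimitFix`, while the dev set (override on) keeps all three names.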
@@ -166,7 +184,7 @@ docker run -it --rm -v .:C:/skyrim-community-shaders skyrim-community-shaders:la If you run into `Access violation` build errors during step 3, you can try adding [`--isolation=process`](https://learn.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/hyperv-container): ```pwsh -docker run -it --rm --isolation=process -v .:C:/skyrim-community-shaders skyrim-community-shaders:latest +docker run -it --rm --isolation=process -v .:C:/open-shaders open-shaders:latest ``` ## Debugging @@ -218,4 +236,4 @@ See LICENSE within each directory; if none, it's [Default](#default) ### Icons -- [Community Shaders Logo](package/Interface/CommunityShaders/Icons/Community%20Shaders%20Logo/) is not covered by the GPL-3.0 license. It is provided solely for personal use (e.g., building from source) and may only be used in unmodified form. There is no license for any other purpose or to distribute the logo. No trademark license is granted for the logo. Any use not expressly permitted is prohibited without the express written consent of the Community Shaders team. +Open Shaders does not ship the upstream Community Shaders logo. The upstream logo is non-GPL, not trademark-licensed, and may only be used in unmodified form with the Community Shaders team's permission — none of which extends to forks. Action icons and category icons are bundled as before; the upstream Discord banner has been removed since the fork has no affiliated Discord channel. The menu renders without a logo image when none is present (the load path is null-safe). diff --git a/cmake/cpp-mcp.cmake b/cmake/cpp-mcp.cmake new file mode 100644 index 0000000000..e218957891 --- /dev/null +++ b/cmake/cpp-mcp.cmake @@ -0,0 +1,107 @@ +# Build cpp-mcp (https://github.com/hkr04/cpp-mcp) from its vendored +# submodule as a static library target. Upstream has no install rules +# (PR #12 still open), so we drive its build ourselves — same pattern +# we use for FidelityFX-SDK and Streamline. 
+# +# Only the server-side translation units are compiled; the bundled +# stdio/SSE *client* implementations are intentionally omitted because +# we are exclusively a server. +# +# nlohmann_json ABI alignment: +# cpp-mcp vendors nlohmann_json 3.11.3 in extern/cpp-mcp/common/json.hpp, +# while vcpkg ships 3.12.0. Both versions wrap their public API in an +# ABI-versioned inline namespace (`nlohmann::json_abi_v3_11_3` vs +# `nlohmann::json_abi_v3_12_0`), so even though both files share the +# same include guard (INCLUDE_NLOHMANN_JSON_HPP_), the symbol names +# differ. If cpp-mcp's own TUs picked up the vendored copy and our +# consumers picked up vcpkg's, `mcp::server::set_capabilities` and +# `register_tool` would link-fail (LNK2001) with two different +# ABI-tagged signatures. +# +# Fix: patch mcp_message.h at configure time to use +# `#include ` instead of `#include "json.hpp"`. +# The patched copy is written to a build-tree mirror; the submodule +# stays clean. Both cpp-mcp's own compilation and every consumer +# then resolve to vcpkg's 3.12.0 → single ABI namespace, symbols +# match, linker happy. + +set(CPP_MCP_DIR "${CMAKE_SOURCE_DIR}/extern/cpp-mcp") +set(CPP_MCP_PATCHED_INC "${CMAKE_BINARY_DIR}/cpp-mcp-patched/include") + +if(NOT EXISTS "${CPP_MCP_DIR}/src/mcp_server.cpp") + message(FATAL_ERROR + "cpp-mcp submodule missing. Run:\n" + " git submodule update --init --recursive extern/cpp-mcp") +endif() + +find_package(Threads REQUIRED) +find_package(nlohmann_json CONFIG REQUIRED) + +# Patch mcp_message.h to use vcpkg nlohmann_json (see header comment). +# All other cpp-mcp headers are copied verbatim into the patched mirror +# so they live next to the patched header and find each other. 
+file(MAKE_DIRECTORY "${CPP_MCP_PATCHED_INC}") +file(GLOB _cpp_mcp_headers CONFIGURE_DEPENDS "${CPP_MCP_DIR}/include/*.h") +foreach(_hdr IN LISTS _cpp_mcp_headers) + get_filename_component(_name "${_hdr}" NAME) + file(READ "${_hdr}" _content) + if(_name STREQUAL "mcp_message.h") + # Fail fast if the expected include vanishes upstream — otherwise the + # ABI mismatch would silently come back and only surface as an LNK2001 + # well into the link step. + string(FIND "${_content}" "#include \"json.hpp\"" _json_inc_pos) + if(_json_inc_pos EQUAL -1) + message(FATAL_ERROR + "cpp-mcp: expected `#include \"json.hpp\"` in mcp_message.h " + "but did not find it. Upstream may have changed the include; " + "review cmake/cpp-mcp.cmake and adjust the patch (see header " + "comment for the ABI-alignment rationale).") + endif() + string(REPLACE + "#include \"json.hpp\"" + "#include " + _content "${_content}") + endif() + file(WRITE "${CPP_MCP_PATCHED_INC}/${_name}" "${_content}") +endforeach() + +add_library(cpp-mcp STATIC + "${CPP_MCP_DIR}/src/mcp_message.cpp" + "${CPP_MCP_DIR}/src/mcp_resource.cpp" + "${CPP_MCP_DIR}/src/mcp_server.cpp" + "${CPP_MCP_DIR}/src/mcp_tool.cpp" +) + +# Order matters: patched mirror first so its mcp_message.h wins over the +# submodule's. `common/` is still needed for httplib.h (no ABI issue +# there — it's not shared with any vcpkg dep). +target_include_directories(cpp-mcp + PUBLIC "${CPP_MCP_PATCHED_INC}" + "${CPP_MCP_DIR}/common" +) + +target_compile_features(cpp-mcp PUBLIC cxx_std_17) + +target_compile_definitions(cpp-mcp PUBLIC + MCP_MAX_SESSIONS=10 + MCP_SESSION_TIMEOUT=30 + # cpp-mcp's vendored cpp-httplib pulls in . Skyrim/CLib's + # transitive defaults to the legacy , which + # conflicts (redefinition of sockaddr, WSAData, etc.). Tell Windows + # headers to skip the legacy winsock so winsock2.h is the only one + # in the build. PUBLIC so it propagates to every TU that links + # cpp-mcp (including the PCH compilation of CommunityShaders). 
+ _WINSOCKAPI_ +) + +target_link_libraries(cpp-mcp PUBLIC + Threads::Threads + nlohmann_json::nlohmann_json +) + +if(MSVC) + target_compile_options(cpp-mcp PRIVATE /utf-8 /bigobj /W0) + target_compile_definitions(cpp-mcp PRIVATE _CRT_SECURE_NO_WARNINGS) +endif() + +set_target_properties(cpp-mcp PROPERTIES FOLDER "extern") diff --git a/containerbuild.ps1 b/containerbuild.ps1 index 80390fbb43..8294082ed0 100644 --- a/containerbuild.ps1 +++ b/containerbuild.ps1 @@ -9,7 +9,7 @@ Write-Host "Starting build..." if (-Not (Test-Path -Path "CMakeUserPresets.json")) { Copy-Item -Path "CMakeUserPresets.json.template" -Destination "CMakeUserPresets.json" - (Get-Content -Path "CMakeUserPresets.json") -replace 'F:/MySkyrimModpack/mods/CommunityShaders;F:/SteamLibrary/steamapps/common/SkyrimVR/Data;F:/SteamLibrary/steamapps/common/Skyrim Special Edition/Data', 'C:/skyrim-community-shaders/build' | Set-Content -Path "CMakeUserPresets.json" + (Get-Content -Path "CMakeUserPresets.json") -replace 'F:/MySkyrimModpack/mods/CommunityShaders;F:/SteamLibrary/steamapps/common/SkyrimVR/Data;F:/SteamLibrary/steamapps/common/Skyrim Special Edition/Data', 'C:/open-shaders/build' | Set-Content -Path "CMakeUserPresets.json" Write-Host "CMakeUserPresets.json created and modified." } else { Write-Host "CMakeUserPresets.json already exists. No action taken." 
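The configure-time header patch in `cmake/cpp-mcp.cmake` above (read each header, fail fast if the expected include is gone, replace it, write the result to a build-tree mirror) can be modelled in a few lines. This is a sketch of the pattern only, not the build code: `patch_include` is a hypothetical helper, and the replacement directive used here is an assumption, since the exact replacement text is not legible in this diff.

```python
def patch_include(content, old, new):
    """Replace an include directive, raising if it is absent.

    Mirrors the fail-fast idea in the CMake file: if upstream changes the
    include, the patch step should stop loudly rather than let the ABI
    mismatch resurface later as a link error.
    """
    if old not in content:
        raise ValueError(
            "expected %r not found; upstream may have changed the include" % old
        )
    # Like CMake's string(REPLACE ...), str.replace rewrites every occurrence.
    return content.replace(old, new)


OLD_INCLUDE = '#include "json.hpp"'
# ASSUMPTION: illustrative replacement only; substitute the directive the
# build actually uses.
NEW_INCLUDE = '#include "nlohmann/json.hpp"'

header = '#pragma once\n#include "json.hpp"\nnamespace mcp {}\n'
patched = patch_include(header, OLD_INCLUDE, NEW_INCLUDE)
```

Writing `patched` to a separate mirror directory, rather than back into the submodule, is what keeps the vendored tree clean in the approach the CMake comments describe.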
diff --git a/docs/development/README.md b/docs/development/README.md index 92cc9e7fff..4dc103e5e4 100644 --- a/docs/development/README.md +++ b/docs/development/README.md @@ -4,6 +4,7 @@ - **[VSCode Setup](./vscode-setup.md)** - IDE configuration, extensions, and auto-deploy - **[Shader Workflow](./shader-workflow.md)** - Fast shader iteration and deployment +- **[Upstream Sync](./upstream-sync.md)** - How Open Shaders merges with upstream community-shaders ## Quick Links diff --git a/docs/development/upstream-sync.md b/docs/development/upstream-sync.md new file mode 100644 index 0000000000..63058dbb53 --- /dev/null +++ b/docs/development/upstream-sync.md @@ -0,0 +1,90 @@ +# Upstream Sync + +How Open Shaders stays current with upstream `community-shaders/skyrim-community-shaders` without losing the fork-specific CI policy, branding, and feature setup. + +## Mechanism + +Upstream syncs land as **merge commits** on `dev`, never as rebases. The scheduled `Maint: Sync upstream/dev` workflow runs a three-way merge of `upstream/dev` into our `dev` and pushes the result. Two pieces make this safe: + +1. **`.gitattributes` with `merge=ours` entries** for every file the fork owns end-to-end (CI workflows, `.releaserc`, README). During the merge, git's `ours` driver keeps the fork's version of those paths verbatim — upstream's changes to them are discarded without surfacing as conflicts. +2. **The `ours` merge driver itself** must be defined locally. `merge=ours` in `.gitattributes` _references_ a driver but doesn't define one. The sync workflow defines it as a no-op (`git config merge.ours.driver true`). **Local contributors who run the merge by hand must run the same command once per clone** — see [Setup](#setup-once-per-clone) below. + +## Why merge, not rebase + +We previously used `git rebase upstream/dev`. It silently regressed fork-owned files on every sync. 
The mechanism: upstream cherry-picks one of our fork commits → we rebase later → git detects the duplicate via patch-id and skips our commit as "already applied" → an upstream follow-up that deletes or edits the same file then applies cleanly. End result: the fork loses content with zero merge conflicts and zero log noise. The rebase reports success. + +A 3-way merge consults both sides at every path independently of patch-id. Our `merge=ours` driver fires for fork-owned paths; everything else gets a real 3-way merge. Either a clean result or a visible conflict — nothing silent. + +See `.gitattributes` for the current fork-owned list. When a file should join or leave that list, update both the attributes and the comment block above the list explaining why. + +## Setup (once per clone) + +```bash +git config merge.ours.driver true +``` + +That's it. The `ours` driver is intentionally not built into git (for security — driver definitions can run arbitrary commands), so each clone declares it locally. The sync CI does this in its own setup step. + +If you've already run an upstream merge without this config, git would have raised an "unknown merge driver 'ours'" warning and used the default 3-way merge for those files, potentially producing surprising conflicts in fork-owned paths. Re-run with the driver configured and the conflicts disappear. + +## Running a sync manually + +```bash +# Make sure local dev matches origin/dev before merging upstream. +# A stale local dev would either reject the push as non-fast-forward +# or have you re-resolving conflicts already resolved by a prior run. +git fetch origin dev upstream/dev +git switch dev +git reset --hard origin/dev + +git merge --no-ff --no-edit \ + -m "chore(sync): merge upstream/dev as of $(git rev-parse --short upstream/dev)" \ + upstream/dev +# resolve conflicts (only in non-fork-owned paths), then: +git push origin dev +``` + +The scheduled workflow does exactly this on Monday 08:00 UTC. 
Manual dispatch via `gh workflow run "Maint: Sync upstream/dev"` is available for urgent syncs and accepts a `dry_run` flag. + +## Versioning and changelog interaction + +The merge commit's message is `chore(sync): merge upstream/dev as of `. semantic-release sees it as a `chore` and doesn't release on the commit itself. + +**However**, semantic-release's default DAG walk follows the merge into upstream's commit history. Upstream's `feat:` and `fix:` commits that came in via the merge are visible to the commit analyzer and **do** drive version bumps in our release stream. This is deliberate: the fork's version reflects everything actually shipped to users, including upstream fixes that arrived via merge. + +When upstream cherry-picks one of our commits and that cherry-pick lands in our merge, both copies of the same logical change get walked. Version-wise this is harmless (one release can only bump once at the max severity). Changelog-wise it produces a duplicate entry. If this becomes annoying, the fix is a `writerOpts.transform` in `.releaserc` that dedupes by patch-id — not done preemptively. + +## When the workflow halts + +A real conflict (in a file _not_ on the fork-owned list) means upstream and the fork have both meaningfully changed the same code. Examples we'd expect: + +- Both forks bump the same feature INI version. +- We add a method to a class upstream also modified. +- We rename a function upstream also renamed. + +The workflow `git merge --abort`s, posts the conflicted file list to the workflow summary, and exits non-zero. Resolution is manual: clone, run the same merge locally, resolve, push. + +If you do recurring syncs, enabling `git rerere` is worth the one-time setup — it caches each conflict resolution and replays it the next time the same hunks conflict. 
Per-clone setting, not repo-wide: + +```bash +git config rerere.enabled true +git config rerere.autoupdate true +``` + +Caches live in `.git/rr-cache/` and aren't pushed, so each maintainer builds their own. CI runners start with empty caches every run and benefit nothing from rerere — only the maintainers doing the merges locally see the time savings. + +## Inspecting what a sync did + +Each sync workflow run leaves a summary on the run page with: + +- Upstream tip SHA +- `git diff --stat` of files changed +- `git log --oneline` of commits brought in + +For deeper inspection after a push: + +```bash +# all changes since the last sync merge +git log --first-parent --merges --grep='chore(sync)' -1 # find the merge commit +git diff ~1.. # changes the merge introduced +``` diff --git a/docs/development/vscode-setup.md b/docs/development/vscode-setup.md index 7122622582..8dac26867a 100644 --- a/docs/development/vscode-setup.md +++ b/docs/development/vscode-setup.md @@ -60,7 +60,7 @@ Automatically deploy shaders when you save `.hlsl` or `.hlsli` files. **Interaction with built-in filewatcher:** -Community Shaders has a built-in filewatcher (**Settings → Advanced → Shader Compilation → Enable File Watcher**) that hot-reloads shaders when files change in the game's `Data/Shaders/` directory. The workflow is: +Community Shaders has a built-in filewatcher (**Settings → Advanced → Shaders → Cache & File Watcher → Enable File Watcher**) that hot-reloads shaders when files change in the game's `Data/Shaders/` directory. The workflow is: 1. Edit shader in VSCode 2. Save → RunOnSave deploys to `Data/Shaders/` diff --git a/docs/new-feature-template/NewFeatureReadme.md b/docs/new-feature-template/NewFeatureReadme.md index 85c06059fa..2abe7e6b4d 100644 --- a/docs/new-feature-template/NewFeatureReadme.md +++ b/docs/new-feature-template/NewFeatureReadme.md @@ -1,6 +1,6 @@ # New Feature Development Reference -Quick reference for creating new graphics features in Community Shaders. 
+Quick reference for creating new graphics features in Open Shaders (and upstream Community Shaders — the feature model is shared). ## File Structure diff --git a/extern/cpp-mcp b/extern/cpp-mcp new file mode 160000 index 0000000000..a0eb22c98d --- /dev/null +++ b/extern/cpp-mcp @@ -0,0 +1 @@ +Subproject commit a0eb22c98dbd8ce8b3ef69679310c1a038905c08 diff --git a/features/Light Limit Fix/Shaders/Features/LightLimitFix.ini b/features/Light Limit Fix/Shaders/Features/LightLimitFix.ini index 7b3c6c9135..0cb32375a0 100644 --- a/features/Light Limit Fix/Shaders/Features/LightLimitFix.ini +++ b/features/Light Limit Fix/Shaders/Features/LightLimitFix.ini @@ -1,5 +1,5 @@ [Info] -Version = 3-0-3 +Version = 3-2-0 [Nexus] autoupload = false diff --git a/features/Light Limit Fix/Shaders/LightLimitFix/Common.hlsli b/features/Light Limit Fix/Shaders/LightLimitFix/Common.hlsli index f2388d9ddd..4baa0e8389 100644 --- a/features/Light Limit Fix/Shaders/LightLimitFix/Common.hlsli +++ b/features/Light Limit Fix/Shaders/LightLimitFix/Common.hlsli @@ -43,9 +43,144 @@ struct Light float4 positionWS[2]; uint4 roomFlags; uint lightFlags; - uint shadowLightIndex; - uint pad0; - uint pad1; + uint shadowMapIndex; + float2 pad0; }; +// --------------------------------------------------------------------------- +// LLFDEBUG visualization helpers — only compiled when the debug macro is set +// (pixel shaders only; compute shaders that include this file don't define it) +// --------------------------------------------------------------------------- +#if defined(LLFDEBUG) + +// Accumulated per-pixel debug counters filled during the light loop. 
+// Declare with: LLFDebugInfo di = LLFDebugInfoInit(); +// Update with: LLFDebugAccumulate(di, light, shadowComponent, shadowCoverage); +struct LLFDebugInfo +{ + uint PLShadowCount; // shadow-flagged lights seen (valid + overflow) + float MinPLShadow; // darkest shadow value (1.0 = none seen yet) + uint UnshadowedPLCount; // point/spot lights without shadow maps + uint OverflowCount; // shadow lights whose slot index exceeded ShadowMapSlots + uint FirstShadowIndex; // shadowMapIndex of first valid shadow light + bool HasFirstShadow; + uint SpotCount; // ShadowLightParam.x == 0 + uint HemiCount; // ShadowLightParam.x == 1 + uint OmniCount; // ShadowLightParam.x == 2 +}; + +LLFDebugInfo LLFDebugInfoInit() +{ + LLFDebugInfo di; + di.PLShadowCount = 0; + di.MinPLShadow = 1.0; + di.UnshadowedPLCount = 0; + di.OverflowCount = 0; + di.FirstShadowIndex = 0; + di.HasFirstShadow = false; + di.SpotCount = 0; + di.HemiCount = 0; + di.OmniCount = 0; + return di; +} + +// Call once per clustered/strict light after sampling its shadow. +// shadowCoverage should be the hasCoverage output from GetShadowLightShadow. +// shadowType should be (uint)LightLimitFix::Shadows[light.shadowMapIndex].ShadowLightParam.x +// when light.shadowMapIndex < ShadowMapSlots, or any value otherwise (it won't be read). 
+void LLFDebugAccumulate(inout LLFDebugInfo di, Light light, float shadowComponent, bool shadowCoverage, + uint shadowType) +{ + if (light.lightFlags & LightFlags::Shadow) { + di.PLShadowCount++; + if (shadowCoverage) + di.MinPLShadow = min(di.MinPLShadow, shadowComponent); + if (light.shadowMapIndex >= SharedData::lightLimitFixSettings.ShadowMapSlots) { + di.OverflowCount++; + } else { + if (!di.HasFirstShadow) { + di.FirstShadowIndex = light.shadowMapIndex; + di.HasFirstShadow = true; + } + if (shadowType == 0) + di.SpotCount++; + else if (shadowType == 1) + di.HemiCount++; + else + di.OmniCount++; + } + } else { + di.UnshadowedPLCount++; + } +} + +// Returns the debug visualization color for this pixel. +// Callers supply the small set of per-shader-variant values: +// mode0Color — output for mode 0 (e.g. TurboColormap(strictLightsOverflow)) +// mode1Color — output for mode 1 (e.g. TurboColormap(strictLightCount/15)) +// mode2Color — output for mode 2 (e.g. TurboColormap(clusteredCount/MAX)) +// mode3Color — output for mode 3 (e.g. 
float3(dirSoftShadow, dirDetailedShadow, 0)) +// lumaColor — accumulated lighting color used as luma source for mode 8 +float3 LLFDebugGetVizColor(LLFDebugInfo di, + float3 mode0Color, float3 mode1Color, float3 mode2Color, float3 mode3Color, + float3 lumaColor) +{ + uint mode = SharedData::lightLimitFixSettings.LightsVisualisationMode; + + if (mode == 0) + return mode0Color; + else if (mode == 1) + return mode1Color; + else if (mode == 2) + return mode2Color; + else if (mode == 3) + return mode3Color; + else if (mode == 4) + return Color::TurboColormap((float)di.PLShadowCount / 8.0); + else if (mode == 5) + return float3(di.MinPLShadow, di.MinPLShadow, di.MinPLShadow); + else if (mode == 6) + return Color::TurboColormap((float)di.UnshadowedPLCount / 8.0); + else if (mode == 7) { + if (di.OverflowCount > 0) + return float3(1.0, 0.0, 0.0); + uint validCount = di.PLShadowCount - di.OverflowCount; + uint slots = SharedData::lightLimitFixSettings.ShadowMapSlots; + float t; + if (validCount == 0) + t = 0.0; + else if (validCount <= 4) + t = float(validCount - 1) / 3.0 * 0.3; + else { + uint extSlots = max(slots, 6u) - 5u; + t = 0.3 + saturate(float(validCount - 5) / float(extSlots)) * 0.5; + } + return Color::TurboColormap(t); + } else if (mode == 8) { + float luma = dot(lumaColor, float3(0.2126, 0.7152, 0.0722)); + if (di.OverflowCount > 0) + return float3(1.0, 0.0, 0.0); + else if (!di.HasFirstShadow) + return luma.xxx; + float hue = frac(float(di.FirstShadowIndex) * 0.618033988); + float3 rgb = saturate(abs(frac(hue + float3(0.0, 2.0 / 3.0, 1.0 / 3.0)) * 6.0 - 3.0) - 1.0); + return rgb * luma; + } else { + // Mode 9 — light type visualization + if (di.OverflowCount > 0) + return float3(1.0, 0.0, 0.0); + float scale = 1.0 / 4.0; + float3 typeColor = float3( + saturate(float(di.SpotCount) * scale), + saturate(float(di.HemiCount) * scale), + saturate(float(di.OmniCount) * scale)); + bool hasShadowLights = (di.SpotCount + di.HemiCount + di.OmniCount) > 0; + if 
(!hasShadowLights) + typeColor = saturate(float(di.UnshadowedPLCount) * scale) * 0.35; + return typeColor; + } +} + +#endif // defined(LLFDEBUG) + #endif //__LLF_COMMON_DEPENDENCY_HLSL__ \ No newline at end of file diff --git a/features/Light Limit Fix/Shaders/LightLimitFix/LightLimitFix.hlsli b/features/Light Limit Fix/Shaders/LightLimitFix/LightLimitFix.hlsli index c30b35ed57..82d86247bc 100644 --- a/features/Light Limit Fix/Shaders/LightLimitFix/LightLimitFix.hlsli +++ b/features/Light Limit Fix/Shaders/LightLimitFix/LightLimitFix.hlsli @@ -4,6 +4,11 @@ namespace LightLimitFix #include "LightLimitFix/Common.hlsli" + static const float DirectionalBias = 0.5f * (0.00025f) / 3.0f; + + // Shadow Radius for PCF + static const float PCFRadius2D = 0.002; + cbuffer StrictLightData : register(b3) { uint NumStrictLights; @@ -37,12 +42,90 @@ namespace LightLimitFix return true; } - bool IsLightIgnored(Light light) + bool IsSaturated(float value) + { + return value == saturate(value); + } + + bool IsSaturated(float2 value) + { + return IsSaturated(value.x) && IsSaturated(value.y); + } + + // Per-eye stereo-stable IGN coord. In VR we use screenUV (per-eye via + // CameraProj[eye]) instead of SV_Position so both eyes hash the same + // value at the same world pixel — SV_Position differs between eyes in + // a packed stereo buffer, producing per-eye jitter that reads as flicker + // on contact-shadow recipients. + // + // BufferDim.x is the full packed stereo width (kMAIN spans both eyes + // side-by-side), so the 0.5 factor lands us on the per-eye integer + // pixel grid — same IGN frequency as flat mode for a buffer sized + // (BufferDim.x/2, BufferDim.y). Do not drop the 0.5: that over-samples + // IGN by ~2x in X, giving a higher-frequency noise pattern, not lower. 
+ float2 GetContactShadowNoiseCoord(float2 screenPosition, float2 screenUV) + { +#if defined(VR) + return screenUV * float2(SharedData::BufferDim.x * 0.5, SharedData::BufferDim.y); +#else + return screenPosition; +#endif + } + + // Skyrim's first-person viewmodel renders in a compressed depth range below this + // linearized value; reject occluders there since the viewmodel isn't in the world. + static const float CONTACT_SHADOW_FIRST_PERSON_MAX_DEPTH = 16.5; + + // Reference view-space depth for perspective-correct stride. At/below this depth, + // stride matches its prior view-space meaning; beyond it, stride and the depth-delta + // band scale linearly with depth so each step covers ~constant screen-space distance + // and the shadow-thickness band tracks the same screen-space extent. + static const float CONTACT_SHADOW_REFERENCE_DEPTH = 100.0; + + float ContactShadows(float3 viewPosition, float noise2D, float3 lightDirectionVS, uint contactShadowSteps, uint a_eyeIndex = 0) { - if (light.lightFlags & LightLimitFix::LightFlags::Shadow) { - return !(ShadowBitMask & (1 << light.shadowLightIndex)); + if (contactShadowSteps == 0) + return 1.0; + + // Perspective-correct stride: scale view-space step length with depth so each step + // covers ~constant screen-space distance. Inverse-scale the thickness/fade band so + // the depth-delta window tracks the same screen-space extent across depths. 
+ float perspectiveScale = max(viewPosition.z, CONTACT_SHADOW_REFERENCE_DEPTH) / CONTACT_SHADOW_REFERENCE_DEPTH; + float depthDeltaThickness = SharedData::lightLimitFixSettings.ContactShadowThickness / perspectiveScale; + float depthDeltaFade = SharedData::lightLimitFixSettings.ContactShadowDepthFade / perspectiveScale; + lightDirectionVS *= SharedData::lightLimitFixSettings.ContactShadowStride * perspectiveScale; + + // Offset starting position with interleaved gradient noise + viewPosition += lightDirectionVS * noise2D; + + // Accumulate samples + float contactShadow = 0.0; + for (uint i = 0; i < contactShadowSteps; i++) { + // Step the ray + viewPosition += lightDirectionVS; + + float2 rayUV = FrameBuffer::ViewToUV(viewPosition, true, a_eyeIndex); + + // Ensure the UV coordinates are inside the screen + if (!IsSaturated(rayUV)) + break; + + // Compute the difference between the ray's and the camera's depth + float rayDepth = SharedData::GetScreenDepth(rayUV, a_eyeIndex); + + // Difference between the current ray distance and the marched light + float depthDelta = viewPosition.z - rayDepth; + if (rayDepth > CONTACT_SHADOW_FIRST_PERSON_MAX_DEPTH) + contactShadow = max(contactShadow, saturate(depthDelta * depthDeltaThickness) - saturate(depthDelta * depthDeltaFade)); + if (contactShadow == 1.0) + break; } + return 1.0 - saturate(contactShadow); + } + + bool IsLightIgnored(Light light) + { bool lightIgnored = false; if ((light.lightFlags & LightFlags::PortalStrict) && RoomIndex >= 0) { lightIgnored = true; @@ -60,4 +143,289 @@ namespace LightLimitFix } return lightIgnored; } + + struct ShadowLightData + { + column_major float4x4 ShadowProj; + column_major float4x4 InvShadowProj; + float4 ShadowLightParam; + }; + + // t100/t101 are reserved for Grass Collision (its Collision texture binds at + // t100, and shaders like RunGrass include both features). 
LLF shadow data + // uses t102/t103 to avoid the collision; keep the C++ PSSetShaderResources + // slots in src/Features/LightLimitFix/ShadowRenderer.cpp in sync. + StructuredBuffer<ShadowLightData> Shadows : register(t102); + Texture2DArray ShadowMaps : register(t103); + Texture2DArray DirectionalShadowCascades : register(t99); + + // engineMaskShadow: the engine's pre-rendered 4-cascade shadow mask sample + // at this pixel (TexShadowMaskSampler.Load(int3(Position.xy, 0)).x). LLF's + // DirectionalShadowLightData carries only cascades 0/1 (ShadowProj[2] / + // EndSplitDistances.xy); past EndSplitDistances.y we have no LLF data and + // must fall through to the engine mask. Returning 1.0 there leaves distant + // pixels fully lit -- visible as global scene brightening with shadows + // disappearing past a depth that varies with camera position. + float GetDirectionalShadow(float3 worldPosition, float3 worldPositionWS, float2x2 rotationMatrix, uint eyeIndex, float engineMaskShadow) + { + DirectionalShadowLightData shadowLightData = DirectionalShadowLights[0]; + + float shadowMapDepth = SharedData::GetScreenDepth(FrameBuffer::GetShadowDepth(worldPosition, eyeIndex)); + + // Past cascade 1 -- defer to the engine's 4-cascade mask. + if (shadowMapDepth > shadowLightData.EndSplitDistances.y) + return engineMaskShadow; + + // Blend from LLF PCF deep in cascade 1 toward the engine mask as we + // approach cascade 1's far edge, avoiding a hard discontinuity at the + // boundary where LLF stops and engine sampling takes over. + // + // Previous formula used `dot(worldPosition, worldPosition) / + // EndSplitDistances.y` -- dimensionally wrong (length^2 / length) + // AND inverted (close pixels got engineMaskShadow, far got LLF). + // Because `worldPosition` is camera-relative in Skyrim's vertex + // output, that produced a visible ~sqrt(EndSplitDistances.y)-radius + // ring around the camera that moved with the player -- a clear + // HMD-tracked artifact in VR.
Switching to linear `shadowMapDepth` + // and reversing the blend direction makes the handoff a smooth + // world-anchored transition at the cascade boundary. + float fadeFactor = smoothstep(shadowLightData.EndSplitDistances.y * 0.8, + shadowLightData.EndSplitDistances.y, + shadowMapDepth); + + // Compute cascade blend factor + float cascadeSelect = smoothstep(shadowLightData.StartSplitDistances.y, shadowLightData.EndSplitDistances.x, shadowMapDepth); + + // Determine which cascade(s) to sample + uint primaryCascade = cascadeSelect; + bool needsBlending = (cascadeSelect > 0.0) && (cascadeSelect < 1.0); + + // Transform ray to light space for primary cascade + float3 positionLS = mul(shadowLightData.ShadowProj[primaryCascade], float4(worldPositionWS, 1)).xyz; + positionLS.z -= DirectionalBias; + + // Sample primary cascade + float shadow = 0.0; + + [unroll] for (int i = 0; i < 8; i++) + { + float2 sampleOffset = mul(Random::SpiralSampleOffsets8[i], rotationMatrix); + float2 sampleUV = positionLS.xy + sampleOffset * PCFRadius2D; + shadow += dot(float4(DirectionalShadowCascades.GatherRed(LinearSampler, float3(saturate(sampleUV), primaryCascade)) > positionLS.z), 0.25); + } + + shadow /= 8.0; + + // Blend with secondary cascade if needed + [branch] if (needsBlending) + { + uint secondaryCascade = 1 - primaryCascade; + + positionLS = mul(shadowLightData.ShadowProj[secondaryCascade], float4(worldPositionWS, 1)).xyz; + positionLS.z -= DirectionalBias; + + float shadowBlend = 0.0; + + [unroll] for (int i = 0; i < 8; i++) + { + float2 sampleOffset = mul(Random::SpiralSampleOffsets8[i], rotationMatrix); + float2 sampleUV = positionLS.xy + sampleOffset * PCFRadius2D; + shadowBlend += dot(float4(DirectionalShadowCascades.GatherRed(LinearSampler, float3(saturate(sampleUV), secondaryCascade)) > positionLS.z), 0.25); + } + + shadowBlend /= 8.0; + + shadow = lerp(shadow, shadowBlend, cascadeSelect); + } + + // Within cascade 1's far edge, blend LLF's PCF toward the engine + // 
mask instead of fading to fully-lit -- avoids a hard brightness + // discontinuity at the cascade boundary. + shadow = lerp(shadow, engineMaskShadow, fadeFactor); + + // Focus shadows: high-resolution actor shadows the engine renders to + // kSHADOWMAPS slices [kFocusShadowBaseSlotIndex .. +FocusShadowCount). + // Each focus matrix projects worldPositionWS into the actor's clip + // space; pixels outside [0,1] UV or [0,1] depth aren't covered and + // contribute no occlusion. Combine via min() so any occluding actor + // wins. Without this, the player's own shadow vanishes when LLF is + // on (the cascade has it at lower resolution; focus made it visible). + // + // Guards: + // - SharedData::lightLimitFixSettings.ShadowMapSlots bounds the slice + // index so we never sample past the texture's allocated array size + // (a real concern with ShadowLightCount < 8 in extended mode). + // - focusClip.w > EPSILON_DIVISION avoids div-by-zero NaN when the focus + // matrix hasn't been populated yet (first frame of a scene load + // before the engine's RenderShadowmaps has run the focus loop). 
+ [unroll] for (uint fi = 0; fi < 4; fi++) + { + [branch] if (fi >= shadowLightData.FocusShadowCount) break; + const uint focusSlice = 4 + fi; // kFocusShadowBaseSlotIndex + [branch] if (focusSlice >= SharedData::lightLimitFixSettings.ShadowMapSlots) break; + float4 focusClip = mul(shadowLightData.FocusShadowProj[fi], float4(worldPositionWS, 1)); + [branch] if (focusClip.w <= EPSILON_DIVISION) continue; + focusClip.xyz /= focusClip.w; + float2 focusUV = focusClip.xy * 0.5 + 0.5; + [branch] if (all(focusUV >= 0.0) && all(focusUV <= 1.0) && focusClip.z >= 0.0 && focusClip.z <= 1.0) + { + float focusDepth = focusClip.z - DirectionalBias; + float focusVis = 0.0; + [unroll] for (int fs = 0; fs < 8; fs++) + { + float2 fsOffset = mul(Random::SpiralSampleOffsets8[fs], rotationMatrix); + float2 fsUV = focusUV + fsOffset * PCFRadius2D; + focusVis += dot(float4(ShadowMaps.GatherRed(LinearSampler, float3(saturate(fsUV), focusSlice)) > focusDepth), 0.25); + } + focusVis /= 8.0; + shadow = min(shadow, focusVis); + // Fully occluded -- min() against the remaining focus actors cannot + // darken it further, so skip their 8-tap GatherRed work on this pixel. + [branch] if (shadow <= 0.0) break; + } + } + + return shadow; + } + + // Convenience overload: callers without TexShadowMaskSampler bound + // (e.g. Particle.hlsl) get the lit-fallback behaviour (1.0) past + // cascade 1, matching the pre-engine-mask behaviour for those paths.
+ float GetDirectionalShadow(float3 worldPosition, float3 worldPositionWS, float2x2 rotationMatrix, uint eyeIndex) + { + return GetDirectionalShadow(worldPosition, worldPositionWS, rotationMatrix, eyeIndex, 1.0); + } + + float GetDirectionalShadow(float3 worldPosition, float3 worldPositionWS, float2x2 rotationMatrix) + { + return GetDirectionalShadow(worldPosition, worldPositionWS, rotationMatrix, 0, 1.0); + } + + float SampleShadowGather(uint shadowIndex, float2 uv, float receiverDepth) + { + float4 samples = ShadowMaps.GatherRed(LinearSampler, float3(uv, shadowIndex)); + return dot(float4(samples > receiverDepth), 0.25); + } + + float GetSpotlightShadow(ShadowLightData shadowLightData, uint shadowIndex, float4 positionLS, float2x2 rotationMatrix) + { + positionLS.xyz /= positionLS.w; + positionLS.xy = positionLS.xy * 0.5 + 0.5; + positionLS.z -= shadowLightData.ShadowLightParam.z; + + float shadow = 0.0; + + [unroll] for (int i = 0; i < 8; i++) + { + float2 sampleOffset = mul(Random::SpiralSampleOffsets8[i], rotationMatrix); + float2 sampleUV = positionLS.xy + sampleOffset * PCFRadius2D; + shadow += SampleShadowGather(shadowIndex, sampleUV, positionLS.z); + } + + return shadow / 8.0; + } + + // PCF sample around a paraboloid UV. + // isDualParaboloid = true : the slice contains two stacked paraboloids + // (omni: upper in y∈[0,0.5], lower in y∈[0.5,1]). + // Clamp PCF samples to the originating half so we + // don't bleed across the seam. + // isDualParaboloid = false : the slice contains a single paraboloid filling + // the whole y∈[0,1] (hemi). No clamping needed — + // the entire slice is valid shadow data. 
+ float SampleParaboloidShadow(uint shadowIndex, float2 sampleUV, float depth, float2x2 rotationMatrix, bool isDualParaboloid) + { + float shadow = 0.0; + + [unroll] for (int i = 0; i < 8; i++) + { + float2 offset = mul(Random::SpiralSampleOffsets8[i], rotationMatrix) * PCFRadius2D; + float2 uv = sampleUV + offset; + + if (isDualParaboloid) { + // Clamp PCF samples to the originating paraboloid half. + uv.y = (sampleUV.y >= 0.5) ? max(uv.y, 0.5) : min(uv.y, 0.5); + } + + shadow += SampleShadowGather(shadowIndex, uv, depth); + } + + return shadow / 8.0; + } + + float GetOmnidirectionalShadow(ShadowLightData shadowLightData, uint shadowIndex, float4 positionLS, float2x2 rotationMatrix) + { + // ShadowLightParam.x: + // 0 = spot/frustum (handled in GetShadowLightShadow before reaching here) + // 1 = hemisphere — engine renders ONE paraboloid filling the slice + // 2 = omnidirectional (dual paraboloid) — TWO paraboloids stacked in slice + // + // Verified against kSHADOWMAPS slice contents in RenderDoc: hemi slices show + // a single continuous depth gradient across y=0.5 with no seam, while omni + // slices show two distinct paraboloid renderings stacked. Treating hemi + // like omni applies a Y-axis compression / mirror that visibly distorts + // (the "inverted or rotated 90°" symptom). + const bool isOmni = (shadowLightData.ShadowLightParam.x == 2); + + bool lowerHalf = positionLS.z < 0; + + // Hemi only renders the +Z paraboloid; behind the light has no shadow data. + // Returning 1.0 (fully lit) lets the light's own attenuation handle falloff + // for points the engine never wrote shadow data for. + if (!isOmni && lowerHalf) + return 1.0; + + positionLS.xyz /= positionLS.w; + + float3 posOffset = lowerHalf ? float3(0, 0, -1) : float3(0, 0, 1); + float3 lightDirection = normalize(normalize(positionLS.xyz) + posOffset); + float2 sampleUV = lightDirection.xy / lightDirection.z * 0.5 + 0.5; + + // Y compression only applies to omni's dual layout. 
Hemi fills the whole + // slice so its sampleUV.y stays in [0, 1] directly. + if (isOmni) + sampleUV.y = lowerHalf ? 1.0 - 0.5 * sampleUV.y : 0.5 * sampleUV.y; + + float depth = saturate(length(positionLS.xyz) / shadowLightData.ShadowLightParam.y); + depth -= shadowLightData.ShadowLightParam.z; + + return SampleParaboloidShadow(shadowIndex, sampleUV, depth, rotationMatrix, isOmni); + } + + // Single-assignment of hasCoverage at function entry keeps FXC's flow + // analyser quiet: prior versions used an early-return overflow guard + // that wrote hasCoverage on two paths, which tripped X4000 "potentially + // uninitialized" warnings at the post-merge point across 360 permutations + // (both `out` and `inout` signatures hit the same false positive). + // + // Overflow handling: `shadowIndex >= ShadowMapSlots` can occur transiently + // when a light was promoted to shadow on a frame where the texture-array + // allocation hadn't extended to cover it yet. StructuredBuffer reads beyond + // declared bounds return zero per the D3D11 spec, so the Shadows[shadowIndex] + // read is safe -- it falls into the `ShadowLightParam.y == 0` branch below + // and returns 1.0. The `hasCoverage` flag tells the caller whether the + // sample was real, so suppression still works correctly upstream. 
+ float GetShadowLightShadow(uint shadowIndex, float3 worldPositionWS, float2x2 rotationMatrix, out bool hasCoverage) + { + hasCoverage = shadowIndex < SharedData::lightLimitFixSettings.ShadowMapSlots; + + ShadowLightData shadowLightData = Shadows[shadowIndex]; + + [flatten] if (shadowLightData.ShadowLightParam.y == 0) return 1.0; + [flatten] if (shadowLightData.ShadowLightParam.y < 0) return 0.0; + + float4 positionLS = mul(shadowLightData.ShadowProj, float4(worldPositionWS, 1)); + + [branch] if (shadowLightData.ShadowLightParam.x == 0) + { + float shadowBaseVisibility = GetSpotlightShadow(shadowLightData, shadowIndex, positionLS, rotationMatrix); + positionLS.xyz /= positionLS.w; + + float spotFalloff = saturate(1.0 - dot(positionLS.xy, positionLS.xy)); + + return shadowBaseVisibility * spotFalloff; + } + + return GetOmnidirectionalShadow(shadowLightData, shadowIndex, positionLS, rotationMatrix); + } } diff --git a/features/Remote Control/CORE b/features/Remote Control/CORE new file mode 100644 index 0000000000..e69de29bb2 diff --git a/features/Remote Control/Shaders/Features/RemoteControl.ini b/features/Remote Control/Shaders/Features/RemoteControl.ini new file mode 100644 index 0000000000..000b60a568 --- /dev/null +++ b/features/Remote Control/Shaders/Features/RemoteControl.ini @@ -0,0 +1,2 @@ +[Info] +Version = 1-0-0 diff --git a/features/Upscaling/Shaders/Upscaling/FoveatedRender/SubrectStretchCS.hlsl b/features/Upscaling/Shaders/Upscaling/FoveatedRender/SubrectStretchCS.hlsl new file mode 100644 index 0000000000..266fe5fd50 --- /dev/null +++ b/features/Upscaling/Shaders/Upscaling/FoveatedRender/SubrectStretchCS.hlsl @@ -0,0 +1,97 @@ +// Stretches the DRS-rendered region from a temporary render-resolution SBS texture +// to fill the entire eye in the display-resolution kMAIN SBS texture. +// Dispatched once per eye. 
Supports multiple sampling modes: +// 0 = Bilinear (clean upscale) +// 1 = Point / Nearest (cheapest, VRS-like broadcast) +// 2 = Gaussian Blur 3x3 (soft periphery background) + +cbuffer StretchCB : register(b0) +{ + uint DstOffsetX; // SBS destination X offset for this eye (0 or eyeWidthOut) + uint DstWidth; // display-resolution eye width + uint DstHeight; // display-resolution eye height + uint SrcOffsetX; // SBS source X offset for this eye (0 or renderEyeW) + uint SrcWidth; // render-resolution SBS total width (for UV normalisation) + uint SrcHeight; // render-resolution SBS total height + uint SrcEyeWidth; // render-resolution per-eye width + uint SrcEyeHeight; // render-resolution per-eye height + uint StretchMode; // 0=Bilinear, 1=Point, 2=GaussianBlur + float BlurRadius; // Texel-space radius for Gaussian blur (typical 0.5-4.0) + uint DebugVisualize; // 0=off, 1=tint stretched periphery red so the DLSS region pops + uint _pad; +}; + +Texture2D SrcTex : register(t0); +SamplerState BilinearSampler : register(s0); +RWTexture2D<float4> DstTex : register(u0); + +[numthreads(8, 8, 1)] void main(uint3 tid : SV_DispatchThreadID) { + // Zero-dim guard: a misconfigured dispatch with any zero extent would + // divide-by-zero into NaN UVs and underflow point-mode coords into + // huge uint values. Bail before any math.
+ if (DstWidth == 0 || DstHeight == 0 || SrcWidth == 0 || SrcHeight == 0 || + SrcEyeWidth == 0 || SrcEyeHeight == 0) + return; + + if (tid.x >= DstWidth || tid.y >= DstHeight) + return; + + // Map output pixel to normalised position within this eye [0,1] + float u = ((float)tid.x + 0.5) / (float)DstWidth; + float v = ((float)tid.y + 0.5) / (float)DstHeight; + + // Map to source texel coordinates within this eye's render region + // then convert to full SBS texture UV (adding eye offset) + float srcU = (u * (float)SrcEyeWidth + (float)SrcOffsetX) / (float)SrcWidth; + float srcV = (v * (float)SrcEyeHeight) / (float)SrcHeight; + + // Clamp sample UVs to per-eye texel bounds so the bilinear footprint and + // blur kernel can't reach across the SBS midline into the neighboring + // eye's pixels. + float2 eyeMinUV = float2(((float)SrcOffsetX + 0.5) / (float)SrcWidth, + 0.5 / (float)SrcHeight); + float2 eyeMaxUV = float2(((float)(SrcOffsetX + SrcEyeWidth) - 0.5) / (float)SrcWidth, + ((float)SrcEyeHeight - 0.5) / (float)SrcHeight); + + float4 color; + + if (StretchMode == 1) { + // Point / Nearest: integer texel lookup, cheapest. min() keeps us + // inside [0, SrcEyeWidth-1] / [0, SrcEyeHeight-1] when u/v == 1. 
+ uint2 srcPixel = uint2( + min((uint)(u * (float)SrcEyeWidth), SrcEyeWidth - 1) + SrcOffsetX, + min((uint)(v * (float)SrcEyeHeight), SrcEyeHeight - 1)); + color = SrcTex.Load(int3(srcPixel, 0)); + } else if (StretchMode == 2) { + // Gaussian blur 3x3: 9-tap weighted average around center + float2 texelSize = float2(1.0 / (float)SrcWidth, 1.0 / (float)SrcHeight); + float2 center = float2(srcU, srcV); + float2 step = texelSize * BlurRadius; + + // Gaussian weights for 3x3 kernel (sigma ~ 0.85 * radius) + // Center=4, Edge=2, Corner=1, sum=16 + float4 sum = SrcTex.SampleLevel(BilinearSampler, clamp(center, eyeMinUV, eyeMaxUV), 0) * 4.0; + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(-step.x, 0), eyeMinUV, eyeMaxUV), 0) * 2.0; + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(step.x, 0), eyeMinUV, eyeMaxUV), 0) * 2.0; + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(0, -step.y), eyeMinUV, eyeMaxUV), 0) * 2.0; + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(0, step.y), eyeMinUV, eyeMaxUV), 0) * 2.0; + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(-step.x, -step.y), eyeMinUV, eyeMaxUV), 0); + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(step.x, -step.y), eyeMinUV, eyeMaxUV), 0); + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(-step.x, step.y), eyeMinUV, eyeMaxUV), 0); + sum += SrcTex.SampleLevel(BilinearSampler, clamp(center + float2(step.x, step.y), eyeMinUV, eyeMaxUV), 0); + color = sum * (1.0 / 16.0); + } else { + // Bilinear (default): single hardware-filtered sample + color = SrcTex.SampleLevel(BilinearSampler, clamp(float2(srcU, srcV), eyeMinUV, eyeMaxUV), 0); + } + + // Debug visualizer: tint the cheap-stretched periphery red so the DLSS + // subrect (which BlendSubrectToOutput overwrites on top of us) reads as + // the un-tinted region. 
Lets users see at a glance where DLSS is actually + // reconstructing vs where the cheap stretch is filling. + if (DebugVisualize != 0) { + color.rgb = lerp(color.rgb, color.rgb * float3(1.6, 0.35, 0.35), 0.6); + } + + DstTex[uint2(tid.x + DstOffsetX, tid.y)] = color; +} diff --git a/features/Upscaling/Shaders/Upscaling/PerfMode/BoxDownscalePS.hlsl b/features/Upscaling/Shaders/Upscaling/PerfMode/BoxDownscalePS.hlsl new file mode 100644 index 0000000000..a1370e417b --- /dev/null +++ b/features/Upscaling/Shaders/Upscaling/PerfMode/BoxDownscalePS.hlsl @@ -0,0 +1,35 @@ +// BoxDownscalePS.hlsl — PerfMode downscale pass +// Box 3×3 filter: testTexture (3k) → kMAIN (1k). +// For 3:1 downscale, each output pixel averages the 3×3 source region, +// ensuring all DLSS output pixels contribute (vs bilinear's 2×2 coverage). +// Reuses UpscaleVS.hlsl for fullscreen triangle generation (SV_VertexID). + +#include "Upscaling/UpscaleVS.hlsl" + +#if defined(PSHADER) + +typedef VS_OUTPUT PS_INPUT; + +SamplerState LinearSampler : register(s0); +Texture2D SourceTex : register(t0); + +float4 main(PS_INPUT input) : SV_Target +{ + float2 srcSize; + SourceTex.GetDimensions(srcSize.x, srcSize.y); + float2 texelSize = 1.0 / srcSize; + + // Clamp tap UVs to [0,1] so border pixels don't read across edges if the + // sampler ever gets created with wrap/mirror addressing instead of clamp. + // On clamp samplers this saturate() is a no-op the compiler can elide. 
+ float4 sum = 0; + [unroll] for (int y = -1; y <= 1; y++) + [unroll] for (int x = -1; x <= 1; x++) + { + float2 uv = saturate(input.TexCoord + float2(x, y) * texelSize); + sum += SourceTex.SampleLevel(LinearSampler, uv, 0); + } + return sum * (1.0 / 9.0); +} + +#endif diff --git a/features/Upscaling/Shaders/Upscaling/PerfMode/MenuBGBlitPS.hlsl b/features/Upscaling/Shaders/Upscaling/PerfMode/MenuBGBlitPS.hlsl new file mode 100644 index 0000000000..9b20ae742b --- /dev/null +++ b/features/Upscaling/Shaders/Upscaling/PerfMode/MenuBGBlitPS.hlsl @@ -0,0 +1,28 @@ +// MenuBGBlitPS.hlsl — PerfMode main-menu / loading-screen BG blit. +// Fullscreen 1:1 sample of the source texture into kTOTAL/kMENUBG. The +// caller (MaybeBlitMenuBG) feeds DLSS-reconstructed testTexture +// (R16G16B16A16_FLOAT, displayRes) and the destination kTOTAL is +// R8G8B8A8_UNORM at the same dims — CopyResource can't do this because +// the formats differ, so a draw-based blit handles the implicit +// float→unorm conversion via the RTV format. +// +// Reuses UpscaleVS.hlsl for the fullscreen triangle and a linear clamp +// sampler. saturate() on UV is defense-in-depth for callers binding +// non-clamp samplers.
+ +#include "Upscaling/UpscaleVS.hlsl" + +#if defined(PSHADER) + +typedef VS_OUTPUT PS_INPUT; + +SamplerState LinearSampler : register(s0); +Texture2D SourceTex : register(t0); + +float4 main(PS_INPUT input) : + SV_Target +{ + return SourceTex.SampleLevel(LinearSampler, saturate(input.TexCoord), 0); +} + +#endif diff --git a/features/VR/Shaders/Features/VR.ini b/features/VR/Shaders/Features/VR.ini index 0bc0292971..9a09bb7e17 100644 --- a/features/VR/Shaders/Features/VR.ini +++ b/features/VR/Shaders/Features/VR.ini @@ -1,5 +1,2 @@ [Info] -Version = 1-1-0 - -[Nexus] -autoupload = false +Version = 1-1-1 diff --git a/features/Volumetric Shadows/Shaders/Features/VolumetricShadows.ini b/features/Volumetric Shadows/Shaders/Features/VolumetricShadows.ini index e9d66d302c..14a4f48fa2 100644 --- a/features/Volumetric Shadows/Shaders/Features/VolumetricShadows.ini +++ b/features/Volumetric Shadows/Shaders/Features/VolumetricShadows.ini @@ -1,5 +1,5 @@ [Info] -Version = 2-0-1 +Version = 2-1-0 [Nexus] autoupload = false diff --git a/features/Volumetric Shadows/Shaders/VolumetricShadows/VolumetricShadows.hlsli b/features/Volumetric Shadows/Shaders/VolumetricShadows/VolumetricShadows.hlsli index cdfb339ba2..9f002d4723 100644 --- a/features/Volumetric Shadows/Shaders/VolumetricShadows/VolumetricShadows.hlsli +++ b/features/Volumetric Shadows/Shaders/VolumetricShadows/VolumetricShadows.hlsli @@ -130,7 +130,7 @@ namespace VolumetricShadows return ComputeVSM(moments, positionLS.z); } - float GetVSMShadow2D(float3 position, uint eyeIndex, out float detailedShadow) + float GetVSMShadow2D(float3 position, float3 positionWS, uint eyeIndex, out float detailedShadow) { DirectionalShadowLightData directionalShadowLightData = DirectionalShadowLights[0]; @@ -145,9 +145,6 @@ namespace VolumetricShadows // Reduce over distance float fade = saturate(shadowMapDepth / directionalShadowLightData.EndSplitDistances.y); - // Cascade projections are world-space; position comes in camera-relative. 
- float3 positionWS = position + FrameBuffer::CameraPosAdjust[eyeIndex].xyz; - // Compute cascade blend factor with smoothstep float cascadeSelect = saturate((shadowMapDepth - directionalShadowLightData.StartSplitDistances.y) / (directionalShadowLightData.EndSplitDistances.x - directionalShadowLightData.StartSplitDistances.y)); diff --git a/package/Interface/CommunityShaders/Icons/Action Icons/discord.png b/package/Interface/CommunityShaders/Icons/Action Icons/discord.png deleted file mode 100644 index 3654718739..0000000000 Binary files a/package/Interface/CommunityShaders/Icons/Action Icons/discord.png and /dev/null differ diff --git a/package/Interface/CommunityShaders/Icons/Community Shaders Logo/Monochrome/cs-logo.png b/package/Interface/CommunityShaders/Icons/Community Shaders Logo/Monochrome/cs-logo.png deleted file mode 100644 index 56b3d908e4..0000000000 Binary files a/package/Interface/CommunityShaders/Icons/Community Shaders Logo/Monochrome/cs-logo.png and /dev/null differ diff --git a/package/Interface/CommunityShaders/Icons/Community Shaders Logo/cs-logo.png b/package/Interface/CommunityShaders/Icons/Community Shaders Logo/cs-logo.png deleted file mode 100644 index a3b86eff27..0000000000 Binary files a/package/Interface/CommunityShaders/Icons/Community Shaders Logo/cs-logo.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Overrides/README.md b/package/SKSE/Plugins/CommunityShaders/Overrides/README.md index a39e0cc4aa..114197a73d 100644 --- a/package/SKSE/Plugins/CommunityShaders/Overrides/README.md +++ b/package/SKSE/Plugins/CommunityShaders/Overrides/README.md @@ -1,6 +1,6 @@ -# Community Shaders - Settings Override System +# Open Shaders - Settings Override System -The Settings Override System allows mods to provide custom configuration overrides for Community Shaders features without modifying the main settings file. This enables better mod compatibility and allows multiple mods to adjust different settings independently. 
+The Settings Override System allows mods to provide custom configuration overrides for Open Shaders features without modifying the main settings file. This enables better mod compatibility and allows multiple mods to adjust different settings independently. ## Directory Structure @@ -121,7 +121,7 @@ To create feature-specific overrides, you need to use the correct feature short ## How It Works -1. **Discovery**: Override files are automatically discovered when Community Shaders loads +1. **Discovery**: Override files are automatically discovered when Open Shaders loads 2. **Priority**: Overrides are applied after the main settings are loaded but before features initialize 3. **Merging**: Override values are merged into the existing settings, overwriting only the specified values 4. **Global vs Feature**: Global overrides affect the main settings structure, while feature-specific overrides only affect individual features @@ -130,7 +130,7 @@ To create feature-specific overrides, you need to use the correct feature short ### In-Game UI -- Navigate to the "Overrides" tab in the Community Shaders menu +- Navigate to the "Overrides" tab in the Open Shaders menu - View all discovered override files - Enable/disable individual overrides - Refresh to discover new override files @@ -159,7 +159,7 @@ To create feature-specific overrides, you need to use the correct feature short - Verify JSON syntax is valid - Ensure feature short name is correct - Check that override system is enabled in the UI -- Look for errors in the Community Shaders log +- Look for errors in the Open Shaders log (CommunityShaders.log) ### JSON Validation @@ -171,7 +171,7 @@ Use a JSON validator to ensure your override files have valid syntax: ### Log Messages -Community Shaders logs override discovery and application: +Open Shaders logs override discovery and application: - Check `CommunityShaders.log` for override-related messages - Look for "Discovered X override files" and "Applied X override(s)" 
messages diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/DragonBlood/discord.png b/package/SKSE/Plugins/CommunityShaders/Themes/DragonBlood/discord.png deleted file mode 100644 index 666cb18c9b..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/DragonBlood/discord.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/DwemerBronze/discord.png b/package/SKSE/Plugins/CommunityShaders/Themes/DwemerBronze/discord.png deleted file mode 100644 index c5cdae99ee..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/DwemerBronze/discord.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/HighContrast/discord.png b/package/SKSE/Plugins/CommunityShaders/Themes/HighContrast/discord.png deleted file mode 100644 index 6b585b8583..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/HighContrast/discord.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/Light/cs-logo.png b/package/SKSE/Plugins/CommunityShaders/Themes/Light/cs-logo.png deleted file mode 100644 index 6f9084da18..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/Light/cs-logo.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/Light/discord.png b/package/SKSE/Plugins/CommunityShaders/Themes/Light/discord.png deleted file mode 100644 index 6b585b8583..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/Light/discord.png and /dev/null differ diff --git a/package/SKSE/Plugins/CommunityShaders/Themes/NordicFrost/discord.png b/package/SKSE/Plugins/CommunityShaders/Themes/NordicFrost/discord.png deleted file mode 100644 index 53705ef62c..0000000000 Binary files a/package/SKSE/Plugins/CommunityShaders/Themes/NordicFrost/discord.png and /dev/null differ diff --git a/package/Shaders/Common/FrameBuffer.hlsli b/package/Shaders/Common/FrameBuffer.hlsli index 68f0f90371..775d41f2d6 100644 --- 
a/package/Shaders/Common/FrameBuffer.hlsli +++ b/package/Shaders/Common/FrameBuffer.hlsli @@ -85,9 +85,6 @@ namespace FrameBuffer return clamp(screenPositionDR, minValue, maxValue); } - // Projects a world-space (camera-relative) point into NDC using the eye's CameraViewProj - // and returns the post-perspective z (NDC depth). Combine with SharedData::GetScreenDepth - // to get a linear view-space distance suitable for cascade-split comparisons. float GetShadowDepth(float3 positionWS, uint eyeIndex) { float4 positionCS = mul(FrameBuffer::CameraViewProj[eyeIndex], float4(positionWS, 1)); diff --git a/package/Shaders/Common/ShadowSampling.hlsli b/package/Shaders/Common/ShadowSampling.hlsli index 70e7143f54..2dcba97856 100644 --- a/package/Shaders/Common/ShadowSampling.hlsli +++ b/package/Shaders/Common/ShadowSampling.hlsli @@ -30,6 +30,12 @@ struct DirectionalShadowLightData column_major float4x4 InvShadowProj[2]; float2 EndSplitDistances; float2 StartSplitDistances; + // Focus shadow projections (per FocusShadowActor, max 4). Sample at + // kSHADOWMAPS slice (4 + i) using FocusShadowProj[i]; only entries with + // index < FocusShadowCount are valid. 
+ column_major float4x4 FocusShadowProj[4]; + uint FocusShadowCount; + uint3 _pad0; }; StructuredBuffer<DirectionalShadowLightData> DirectionalShadowLights : register(t98); @@ -112,10 +118,7 @@ namespace ShadowSampling surfaceShadow *= vsmSurfaceShadow; return worldShadow * shadow; } -#else - return worldShadow; #endif - return worldShadow; } @@ -127,7 +130,7 @@ namespace ShadowSampling } #if defined(VOLUMETRIC_SHADOWS) - float shadow = VolumetricShadows::GetVSMShadow2D(worldPosition, eyeIndex, detailedShadow); + float shadow = VolumetricShadows::GetVSMShadow2D(worldPosition, worldPosition + FrameBuffer::CameraPosAdjust[eyeIndex].xyz, eyeIndex, detailedShadow); return shadow; #else detailedShadow = 1.0; diff --git a/package/Shaders/Common/SharedData.hlsli b/package/Shaders/Common/SharedData.hlsli index 8168507571..8530f295d6 100644 --- a/package/Shaders/Common/SharedData.hlsli +++ b/package/Shaders/Common/SharedData.hlsli @@ -75,10 +75,20 @@ namespace SharedData struct LightLimitFixSettings { + uint EnableContactShadows; + uint ContactShadowMaxSteps; + float ContactShadowMaxDistance; + float ContactShadowStride; + float ContactShadowThickness; + float ContactShadowDepthFade; + float ContactShadowMinIntensity; + uint ShadowMapSlots; // total shadow map texture-array capacity + // Cluster config (computed) + uint4 ClusterSize; + // Debug (last) uint EnableLightsVisualisation; uint LightsVisualisationMode; - float2 pad0; - uint4 ClusterSize; + uint2 pad0; }; struct WetnessEffectsSettings diff --git a/package/Shaders/Effect.hlsl b/package/Shaders/Effect.hlsl index fcd9dfb2e8..1ea8713bd3 100644 --- a/package/Shaders/Effect.hlsl +++ b/package/Shaders/Effect.hlsl @@ -503,14 +503,6 @@ cbuffer PerGeometry : register(b2) # endif }; -# if defined(LIGHT_LIMIT_FIX) -# include "LightLimitFix/LightLimitFix.hlsli" -# endif - -# if defined(ISL) && defined(LIGHT_LIMIT_FIX) -# include "InverseSquareLighting/InverseSquareLighting.hlsli" -# endif - # define LinearSampler SampBaseSampler # if defined(SKYLIGHTING) 
@@ -528,6 +520,14 @@ cbuffer PerGeometry : register(b2) # include "Common/ShadowSampling.hlsli" +# if defined(LIGHT_LIMIT_FIX) +# include "LightLimitFix/LightLimitFix.hlsli" +# endif + +# if defined(ISL) && defined(LIGHT_LIMIT_FIX) +# include "InverseSquareLighting/InverseSquareLighting.hlsli" +# endif + # if defined(LIGHTING) float3 GetLightingColor(float3 msPosition, float3 worldPosition, float2 screenPosition, uint eyeIndex, inout float shadowVariance) { @@ -598,7 +598,7 @@ float3 GetLightingColor(float3 msPosition, float3 worldPosition, float2 screenPo return color; } # else -float3 GetLightingShadow(float3 color, float3 worldPosition, float2 screenPosition, float depth, uint eyeIndex, inout float shadowVariance) +float3 GetLightingShadow(float3 color, float3 worldPosition, float2 screenPosition, float depth, uint eyeIndex, inout float shadowVariance, float noise) { float3 dirColor; float3 ambientColor; @@ -612,12 +612,6 @@ float3 GetLightingShadow(float3 color, float3 worldPosition, float2 screenPositi static const uint sampleCount = 8; static const float rcpSampleCount = 1.0 / float(sampleCount); - float noise = Random::InterleavedGradientNoise(screenPosition, SharedData::FrameCount); - float noiseTransform = noise * 2.0 - 1.0; - float2 rotation; - sincos(Math::TAU * noise, rotation.y, rotation.x); - float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); - // Enough for sky statics float maxDistance = max(0, SharedData::GetScreenDepth(depth)); float viewRayLength = 2048.0; @@ -711,41 +705,76 @@ PS_OUTPUT main(PS_INPUT input) float3 propertyColor = Color::Effect(PropertyColor.xyz); float shadowVariance = 1.0; + float screenNoise = Random::InterleavedGradientNoise(input.Position.xy, SharedData::FrameCount); + + float2 rotation; + sincos(Math::TAU * screenNoise, rotation.y, rotation.x); + float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); + # if defined(LIGHTING) propertyColor = 
GetLightingColor(input.MSPosition.xyz, input.WorldPosition.xyz, input.Position.xy, eyeIndex, shadowVariance); # if defined(LIGHT_LIMIT_FIX) - uint lightCount = 0; - float3 viewPosition = mul(FrameBuffer::CameraView[eyeIndex], float4(input.WorldPosition.xyz, 1)).xyz; float2 screenUV = FrameBuffer::ViewToUV(viewPosition, true, eyeIndex); bool inWorld = Permutation::ExtraShaderDescriptor & Permutation::ExtraFlags::InWorld; + uint numClusteredLights = 0; + uint lightOffset = 0; uint clusterIndex = 0; - if (inWorld && LightLimitFix::GetClusterIndex(screenUV, viewPosition.z, clusterIndex)) { - lightCount = LightLimitFix::lightGrid[clusterIndex].lightCount; - uint lightOffset = LightLimitFix::lightGrid[clusterIndex].offset; - [loop] for (uint i = 0; i < lightCount; i++) - { - uint clusteredLightIndex = LightLimitFix::lightList[lightOffset + i]; - LightLimitFix::Light light = LightLimitFix::lights[clusteredLightIndex]; - if (LightLimitFix::IsLightIgnored(light) || light.lightFlags & LightLimitFix::LightFlags::Shadow) { + uint numStrictLights = 0; + if (inWorld) { + // Gate strict lights behind inWorld too -- they live in + // LightLimitFix::StrictLights which is populated from world-space + // CB data. Including them on non-world passes (UI overlays, blood + // splatter on screen-space surfaces, etc.) leaks world lighting + // into effects that shouldn't be lit by point/spot lights at all. + // Clustered lights are already inWorld-gated below; strict needs + // the same treatment for symmetry. 
+ numStrictLights = LightLimitFix::NumStrictLights; + if (LightLimitFix::GetClusterIndex(screenUV, viewPosition.z, clusterIndex)) { + numClusteredLights = LightLimitFix::lightGrid[clusterIndex].lightCount; + lightOffset = LightLimitFix::lightGrid[clusterIndex].offset; + } + } + uint totalLightCount = numStrictLights + numClusteredLights; + + [loop] for (uint i = 0; i < totalLightCount; i++) + { + LightLimitFix::Light light; + if (i < numStrictLights) { + light = LightLimitFix::StrictLights[i]; + } else { + uint clusteredLightIndex = LightLimitFix::lightList[lightOffset + (i - numStrictLights)]; + light = LightLimitFix::lights[clusteredLightIndex]; + if (LightLimitFix::IsLightIgnored(light)) continue; - } - float3 lightDirection = light.positionWS[eyeIndex].xyz - input.WorldPosition.xyz; - float lightDist = length(lightDirection); + } + + float3 lightDirection = light.positionWS[eyeIndex].xyz - input.WorldPosition.xyz; + float lightDist = length(lightDirection); # if defined(ISL) - float intensityMultiplier = InverseSquareLighting::GetAttenuation(lightDist, light); + float intensityMultiplier = InverseSquareLighting::GetAttenuation(lightDist, light); + if (intensityMultiplier < 1e-5) + continue; # else - float intensityFactor = saturate(lightDist / light.radius); - float intensityMultiplier = 1 - intensityFactor * intensityFactor; + float intensityFactor = saturate(lightDist / light.radius); + if (intensityFactor == 1) + continue; + float intensityMultiplier = 1 - intensityFactor * intensityFactor; # endif - const bool isPointLightLinear = light.lightFlags & LightLimitFix::LightFlags::Linear; - float3 lightColor = Color::PointLight(light.color.xyz, isPointLightLinear) * intensityMultiplier * 0.5 * light.fade * Color::EffectLightingMult(); - propertyColor += lightColor; + float shadowMul = 1.0; + if (inWorld && (light.lightFlags & LightLimitFix::LightFlags::Shadow)) { + bool shadowCoverage = false; + float3 worldPositionWS = input.WorldPosition.xyz + 
FrameBuffer::CameraPosAdjust[eyeIndex].xyz; + shadowMul = LightLimitFix::GetShadowLightShadow(light.shadowMapIndex, worldPositionWS, rotationMatrix, shadowCoverage); } + + const bool isPointLightLinear = light.lightFlags & LightLimitFix::LightFlags::Linear; + float3 lightColor = Color::PointLight(light.color.xyz, isPointLightLinear) * intensityMultiplier * 0.5 * light.fade * Color::EffectLightingMult() * shadowMul; + propertyColor += lightColor; } # endif @@ -840,7 +869,7 @@ PS_OUTPUT main(PS_INPUT input) # if !defined(LIGHTING) && defined(VC) && defined(TEXCOORD) && defined(NORMALS) && defined(TEXTURE) && defined(FALLOFF) && defined(SOFT) if (Permutation::PixelShaderDescriptor & Permutation::EffectFlags::GrayscaleToAlpha && lightingInfluence == 1.0) - lightColor = GetLightingShadow(lightColor, input.WorldPosition.xyz, input.Position.xy, depth, eyeIndex, shadowVariance); + lightColor = GetLightingShadow(lightColor, input.WorldPosition.xyz, input.Position.xy, depth, eyeIndex, shadowVariance, screenNoise); # endif lightColor = Color::EffectMult(lightColor); diff --git a/package/Shaders/Lighting.hlsl b/package/Shaders/Lighting.hlsl index 953b09e13d..4270c648f2 100644 --- a/package/Shaders/Lighting.hlsl +++ b/package/Shaders/Lighting.hlsl @@ -449,8 +449,6 @@ SamplerState SampLandLodBlend2Sampler : register(s15); SamplerState SampLandLodNoiseSampler : register(s15); # endif -SamplerState SampShadowMaskSampler : register(s14); - # if defined(LANDSCAPE) Texture2D TexColorSampler : register(t0); @@ -896,14 +894,6 @@ float GetSnowParameterY(float texProjTmp, float alpha) # include "ScreenSpaceShadows/ScreenSpaceShadows.hlsli" # endif -# if defined(LIGHT_LIMIT_FIX) -# include "LightLimitFix/LightLimitFix.hlsli" -# endif - -# if defined(ISL) && defined(LIGHT_LIMIT_FIX) -# include "InverseSquareLighting/InverseSquareLighting.hlsli" -# endif - # if defined(TREE_ANIM) # undef WETNESS_EFFECTS # endif @@ -941,6 +931,14 @@ float GetSnowParameterY(float texProjTmp, float alpha) # 
include "Common/ShadowSampling.hlsli" +# if defined(LIGHT_LIMIT_FIX) +# include "LightLimitFix/LightLimitFix.hlsli" +# endif + +# if defined(ISL) && defined(LIGHT_LIMIT_FIX) +# include "InverseSquareLighting/InverseSquareLighting.hlsli" +# endif + # if defined(IBL) # include "IBL/IBL.hlsli" # endif @@ -2023,15 +2021,6 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) # endif // SPARKLE # endif // defined (MODELSPACENORMALS) && !defined (SKINNED) - float2 baseShadowUV = 1.0.xx; - float4 shadowColor = 1.0; - if ((Permutation::PixelShaderDescriptor & Permutation::LightingFlags::DefShadow) && ((Permutation::PixelShaderDescriptor & Permutation::LightingFlags::ShadowDir) || inWorld) || numShadowLights > 0) { - baseShadowUV = input.Position.xy * FrameBuffer::DynamicResolutionParams2.xy; - float2 adjustedShadowUV = baseShadowUV * VPOSOffset.xy + VPOSOffset.zw; - float2 shadowUV = FrameBuffer::GetDynamicResolutionAdjustedScreenPosition(adjustedShadowUV); - shadowColor = TexShadowMaskSampler.Sample(SampShadowMaskSampler, shadowUV); - } - float projectedMaterialWeight = 0; float projWeight = 0; @@ -2508,20 +2497,53 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) float dirDetailedShadow = 1.0; - if ((Permutation::PixelShaderDescriptor & Permutation::LightingFlags::DefShadow) && (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::ShadowDir)) { - dirDetailedShadow *= shadowColor.x; + float2 rotation; + sincos(Math::TAU * screenNoise, rotation.y, rotation.x); + float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); + float3 worldPositionWS = input.WorldPosition.xyz + FrameBuffer::CameraPosAdjust[eyeIndex].xyz; + + // Engine pre-renders the 4-cascade directional shadow into a screen-space + // mask at t14. LLF samples only cascades 0/1; we pass the engine mask + // through so LLF::GetDirectionalShadow can fall back to it past + // EndSplitDistances.y instead of returning fully-lit. 
+ float4 shadowColor = (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::DefShadow) ? TexShadowMaskSampler.Load(int3(input.Position.xy, 0)) : 1.0; + + // Mirrors #2319 for VOLUMETRIC_SHADOWS: use HasDirectionalShadows() (= !IsInterior() || + // InteriorSun::IsActive) instead of the bare !InInterior gate, so Interior Sun cells + // reach the LLF cascade + engine-mask sampling path. Without this, interior scenes + // with active Interior Sun render with zero directional contribution and no sun shadow. + if (inWorld && !inReflection && ShadowSampling::HasDirectionalShadows()) { +# if !defined(LOD) + // On non-deferred passes, use the cheaper VSM shadows if available +# if defined(LIGHT_LIMIT_FIX) && (defined(DEFERRED) || !defined(VOLUMETRIC_SHADOWS)) + dirDetailedShadow = LightLimitFix::GetDirectionalShadow(input.WorldPosition.xyz, worldPositionWS, rotationMatrix, eyeIndex, + (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::ShadowDir) ? shadowColor.x : 1.0); +# elif !defined(LIGHT_LIMIT_FIX) + dirDetailedShadow = (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::ShadowDir) ? 
shadowColor.x : 1.0; +# endif // LIGHT_LIMIT_FIX + +# if defined(VOLUMETRIC_SHADOWS) + float vsmDetailedShadow = 1.0; + dirSoftShadow = VolumetricShadows::GetVSMShadow2D(input.WorldPosition.xyz, worldPositionWS, eyeIndex, vsmDetailedShadow); + dirSoftShadow = max(dirSoftShadow, dirDetailedShadow); + +# if !defined(LIGHT_LIMIT_FIX) + if (!(Permutation::PixelShaderDescriptor & Permutation::LightingFlags::ShadowDir)) + dirDetailedShadow = vsmDetailedShadow; +# elif !(defined(DEFERRED)) + dirDetailedShadow = vsmDetailedShadow; +# endif -# if !defined(VOLUMETRIC_SHADOWS) +# else dirSoftShadow = dirDetailedShadow; +# endif // VOLUMETRIC_SHADOWS # endif - } else { - dirDetailedShadow = dirVSMDetailedShadow; - } # if defined(SCREEN_SPACE_SHADOWS) && defined(DEFERRED) - if (!SharedData::InInterior && dirLightAngle >= 0.0) - dirDetailedShadow *= ScreenSpaceShadows::GetScreenSpaceShadow(input.Position.xyz, screenUV, screenNoise, eyeIndex); -# endif + if (!SharedData::InInterior && dirLightAngle >= 0.0) + dirDetailedShadow *= ScreenSpaceShadows::GetScreenSpaceShadow(input.Position.xyz, screenUV, screenNoise, eyeIndex); +# endif // SCREEN_SPACE_SHADOWS + } # if defined(EMAT) && (defined(SKINNED) || !defined(MODELSPACENORMALS)) [branch] if (inWorld && SharedData::extendedMaterialSettings.EnableShadows) @@ -2678,6 +2700,29 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) lightOffset = LightLimitFix::lightGrid[clusterIndex].offset; } +# if defined(LLFDEBUG) + LightLimitFix::LLFDebugInfo llfDebug = LightLimitFix::LLFDebugInfoInit(); +# endif + +# if defined(DEFERRED) + // Contact-shadow setup, gated on the runtime toggle so we don't pay the + // noise hash + step-count math for every pixel when the feature is off + // (it defaults off). The step count and noise are reused across every + // clustered light in this pixel so we hoist them out of the per-light loop. 
+ uint contactShadowSteps = 0; + float contactShadowNoise = 0.0; + [branch] if (SharedData::lightLimitFixSettings.EnableContactShadows) + { + contactShadowSteps = round(SharedData::lightLimitFixSettings.ContactShadowMaxSteps * + (1.0 - saturate(viewPosition.z / SharedData::lightLimitFixSettings.ContactShadowMaxDistance))); + // The helper stays stereo-stable in VR — see + // LightLimitFix::GetContactShadowNoiseCoord for the eye-buffer math. + contactShadowNoise = Random::InterleavedGradientNoise( + LightLimitFix::GetContactShadowNoiseCoord(input.Position.xy, screenUV), + SharedData::FrameCount); + } +# endif + [loop] for (uint lightIndex = 0; lightIndex < totalLightCount; lightIndex++) { LightLimitFix::Light light; @@ -2710,16 +2755,68 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) float lightShadow = 1.0; float shadowComponent = 1.0; - if (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::DefShadow) { + bool shadowCoverage = false; + if (inWorld && !inReflection) { if (light.lightFlags & LightLimitFix::LightFlags::Shadow) { - shadowComponent = shadowColor[light.shadowLightIndex]; + shadowComponent = LightLimitFix::GetShadowLightShadow(light.shadowMapIndex, worldPositionWS, rotationMatrix, shadowCoverage); lightShadow *= shadowComponent; } } +# if defined(LLFDEBUG) + uint llfShadowType = (light.lightFlags & LightLimitFix::LightFlags::Shadow && + light.shadowMapIndex < SharedData::lightLimitFixSettings.ShadowMapSlots) ? 
+ (uint)LightLimitFix::Shadows[light.shadowMapIndex].ShadowLightParam.x : + 0; + LightLimitFix::LLFDebugAccumulate(llfDebug, light, shadowComponent, shadowCoverage, llfShadowType); +# endif + float3 normalizedLightDirection = normalize(lightDirection); float lightAngle = dot(worldNormal.xyz, normalizedLightDirection.xyz); + float contactShadow = 1.0; + +# if defined(DEFERRED) + // Outer guard: contactShadowSteps > 0 covers both "feature off" and "pixel past + // MaxDistance", so all per-light intensity-gate math is paid only when a raymarch + // is actually possible. Without this, the falloff math fires for every clustered + // light even in the default-off case. + [branch] if (contactShadowSteps > 0) + { + // Strict lights always raymarch -- skip the falloff math for them entirely. + // Clustered lights need a normalized falloff to compare against MinIntensity; + // derive it from intensityMultiplier on the non-ISL path (where it IS already + // 1 - (d/r)^2) and re-compute on the ISL path (where GetAttenuation isn't + // [0,1]-normalized, so the threshold would mean different things otherwise). + const bool isClusteredLight = lightIndex >= LightLimitFix::NumStrictLights; + bool passesIntensityGate = !isClusteredLight; + if (isClusteredLight) { +# if defined(ISL) + float falloffFactor = saturate(lightDist * light.invRadius); + passesIntensityGate = (1.0 - falloffFactor * falloffFactor) > + SharedData::lightLimitFixSettings.ContactShadowMinIntensity; +# else + passesIntensityGate = intensityMultiplier > + SharedData::lightLimitFixSettings.ContactShadowMinIntensity; +# endif + } + + [branch] if ( + !(light.lightFlags & LightLimitFix::LightFlags::Simple) && + shadowComponent != 0.0 && + lightAngle > 0.0 && + passesIntensityGate) + { + // Derive view-space position via CameraView; the Light struct only carries positionWS + // (camera-relative) so the matrix multiply here is the cheapest path until positionVS + // is added to the struct + populated CPU-side. 
+ float3 lightPositionVS = mul(FrameBuffer::CameraView[eyeIndex], float4(light.positionWS[eyeIndex].xyz, 1)).xyz; + float3 normalizedLightDirectionVS = normalize(lightPositionVS - viewPosition.xyz); + contactShadow = LightLimitFix::ContactShadows(viewPosition, contactShadowNoise, normalizedLightDirectionVS, contactShadowSteps, eyeIndex); + } + } +# endif + float3 refractedLightDirection = normalizedLightDirection; # if defined(TRUE_PBR) && !defined(LANDSCAPE) && !defined(LODLANDSCAPE) [branch] if ((PBRFlags & PBR::Flags::InterlayerParallax) != 0) @@ -2736,7 +2833,8 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) SharedData::extendedMaterialSettings.EnableShadows && !(light.lightFlags & LightLimitFix::LightFlags::Simple) && lightAngle > 0.0 && - shadowComponent != 0.0) + shadowComponent != 0.0 && + contactShadow != 0.0) { float3 lightDirectionTS = normalize(mul(refractedLightDirection, tbn).xyz); # if defined(PARALLAX) @@ -2765,7 +2863,7 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) DirectContext pointLightContext; DirectLightingOutput pointLightOutput; - float pointLightShadow = lightShadow * parallaxShadow; + float pointLightShadow = lightShadow * parallaxShadow * contactShadow; # if defined(TRUE_PBR) pointLightContext = CreateDirectLightingContext(worldNormal.xyz, coatWorldNormal, vertexNormal.xyz, refractedViewDirection, viewDirection, refractedLightDirection, normalizedLightDirection, lightColor, pointLightShadow, pointLightShadow); # else @@ -2799,7 +2897,7 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) # if !defined(LANDSCAPE) if (Permutation::PixelShaderDescriptor & Permutation::LightingFlags::CharacterLight) { float charLightMul = saturate(dot(viewDirection, worldNormal.xyz)) * CharacterLightParams.x + CharacterLightParams.y * saturate(dot(float2(0.164398998, -0.986393988), worldNormal.yz)); - float charLightColor = min(CharacterLightParams.w, max(0, CharacterLightParams.z * 
TexCharacterLightProjNoiseSampler.Sample(SampCharacterLightProjNoiseSampler, baseShadowUV).x)); + float charLightColor = min(CharacterLightParams.w, max(0, CharacterLightParams.z * TexCharacterLightProjNoiseSampler.Sample(SampCharacterLightProjNoiseSampler, screenUV).x)); diffuseColor += (charLightMul * charLightColor).xxx; } # endif @@ -3250,15 +3348,13 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) # if defined(LIGHT_LIMIT_FIX) && defined(LLFDEBUG) if (SharedData::lightLimitFixSettings.EnableLightsVisualisation) { - if (SharedData::lightLimitFixSettings.LightsVisualisationMode == 0) { - psout.Diffuse.xyz = Color::TurboColormap(LightLimitFix::NumStrictLights >= 7.0); - } else if (SharedData::lightLimitFixSettings.LightsVisualisationMode == 1) { - psout.Diffuse.xyz = Color::TurboColormap((float)LightLimitFix::NumStrictLights / 15.0); - } else if (SharedData::lightLimitFixSettings.LightsVisualisationMode == 2) { - psout.Diffuse.xyz = Color::TurboColormap((float)numClusteredLights / MAX_CLUSTER_LIGHTS); - } else { - psout.Diffuse.xyz = shadowColor.xyz; - } + psout.Diffuse.xyz = LightLimitFix::LLFDebugGetVizColor( + llfDebug, + Color::TurboColormap(LightLimitFix::NumStrictLights >= 7.0), + Color::TurboColormap((float)LightLimitFix::NumStrictLights / 15.0), + Color::TurboColormap((float)numClusteredLights / MAX_CLUSTER_LIGHTS), + float3(dirSoftShadow, dirDetailedShadow, 0.0), + color.xyz); baseColor.xyz = 0.0; } else { psout.Diffuse.xyz = color.xyz; @@ -3325,7 +3421,7 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) # endif # if !defined(HDR_OUTPUT) // Do not apply gamma correction before we pass to ISHDR. 
- if ((!inWorld && !inReflection) && SharedData::linearLightingSettings.enableLinearLighting && !(Permutation::PixelShaderDescriptor & Permutation::LightingFlags::DefShadow)) { + if ((!inWorld && !inReflection) && SharedData::linearLightingSettings.enableLinearLighting) { psout.Diffuse.xyz = Color::LinearToSrgb(psout.Diffuse.xyz); } # endif diff --git a/package/Shaders/Particle.hlsl b/package/Shaders/Particle.hlsl index c35f68861c..9a15e7ebc0 100644 --- a/package/Shaders/Particle.hlsl +++ b/package/Shaders/Particle.hlsl @@ -215,14 +215,6 @@ struct PS_OUTPUT #ifdef PSHADER -# if defined(LIGHT_LIMIT_FIX) -# include "LightLimitFix/LightLimitFix.hlsli" -# endif - -# if defined(ISL) && defined(LIGHT_LIMIT_FIX) -# include "InverseSquareLighting/InverseSquareLighting.hlsli" -# endif - SamplerState SampSourceTexture : register(s0); # if defined(GRAYSCALE_TO_COLOR) || defined(GRAYSCALE_TO_ALPHA) SamplerState SampGrayscaleTexture : register(s1); @@ -248,8 +240,17 @@ cbuffer PerGeometry : register(b2) }; # define LinearSampler SampSourceTexture + # include "Common/ShadowSampling.hlsli" +# if defined(LIGHT_LIMIT_FIX) +# include "LightLimitFix/LightLimitFix.hlsli" +# endif + +# if defined(ISL) && defined(LIGHT_LIMIT_FIX) +# include "InverseSquareLighting/InverseSquareLighting.hlsli" +# endif + PS_OUTPUT main(PS_INPUT input) { PS_OUTPUT psout; @@ -295,11 +296,34 @@ PS_OUTPUT main(PS_INPUT input) positionWS = mul(FrameBuffer::CameraViewProjInverse[eyeIndex], positionWS); positionWS.xyz = positionWS.xyz / positionWS.w; - float unusedDetailedShadow; - float3 dirLightColor = SharedData::DirLightColor.xyz * ShadowSampling::GetLightingShadow(positionWS.xyz, eyeIndex, unusedDetailedShadow); + float screenNoise = Random::InterleavedGradientNoise(input.Position.xy, SharedData::FrameCount); + float2 rotation; + sincos(Math::TAU * screenNoise, rotation.y, rotation.x); + float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); + + float3 worldPositionWS = 
positionWS.xyz + FrameBuffer::CameraPosAdjust[eyeIndex].xyz; + + float dirSoftShadow = 1.0; + float dirDetailedShadow = 1.0; + + float3 dirLightColor = SharedData::DirLightColor.xyz; + + // Mirrors #2319 / Lighting.hlsl: HasDirectionalShadows() admits Interior Sun cells + // to the directional shadow sampling path. + if (ShadowSampling::HasDirectionalShadows()) { + // Use the cheaper VSM shadows if available +# if defined(VOLUMETRIC_SHADOWS) + dirSoftShadow = VolumetricShadows::GetVSMShadow2D(positionWS.xyz, worldPositionWS, eyeIndex, dirDetailedShadow); +# elif defined(LIGHT_LIMIT_FIX) + dirDetailedShadow = LightLimitFix::GetDirectionalShadow(positionWS.xyz, worldPositionWS, rotationMatrix, eyeIndex); +# endif + } + float3 ambientColor = max(0, SharedData::GetAmbient(float3(0, 0, 1))); - propertyColor += dirLightColor; + // Exactly one of dirSoftShadow / dirDetailedShadow is < 1.0 (the two paths + // above are mutually exclusive); the other stays at its default 1.0. + propertyColor += dirLightColor * dirSoftShadow * dirDetailedShadow; propertyColor += ambientColor; # if defined(LIGHT_LIMIT_FIX) diff --git a/package/Shaders/RunGrass.hlsl b/package/Shaders/RunGrass.hlsl index e7ef06f9f8..c7ada6ef78 100644 --- a/package/Shaders/RunGrass.hlsl +++ b/package/Shaders/RunGrass.hlsl @@ -420,6 +420,12 @@ cbuffer AlphaTestRefCB : register(b11) # include "ScreenSpaceShadows/ScreenSpaceShadows.hlsli" # endif +// ShadowSampling.hlsli must be included before LightLimitFix.hlsli because +// LightLimitFix.hlsli references DirectionalShadowLightData / DirectionalShadowLights +// which are declared in ShadowSampling.hlsli. 
+# define LinearSampler SampBaseSampler +# include "Common/ShadowSampling.hlsli" + # if defined(LIGHT_LIMIT_FIX) # include "LightLimitFix/LightLimitFix.hlsli" # endif @@ -446,10 +452,6 @@ cbuffer AlphaTestRefCB : register(b11) # include "ExponentialHeightFog/ExponentialHeightFog.hlsli" # endif -# define LinearSampler SampBaseSampler - -# include "Common/ShadowSampling.hlsli" - # ifdef GRASS_LIGHTING # if defined(TRUE_PBR) @@ -602,11 +604,13 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) float dirDetailedShadow = 1.0; - if (!SharedData::InInterior) + // HasDirectionalShadows() admits Interior Sun cells; mirrors the + // same swap in Lighting.hlsl / Particle.hlsl. + if (ShadowSampling::HasDirectionalShadows()) dirDetailedShadow *= shadowColor.x; # if defined(SCREEN_SPACE_SHADOWS) - if (!SharedData::InInterior && dirLightAngle >= 0.0) + if (ShadowSampling::HasDirectionalShadows() && dirLightAngle >= 0.0) dirDetailedShadow *= ScreenSpaceShadows::GetScreenSpaceShadow(input.HPosition.xyz, screenUV, screenNoise, eyeIndex); # endif // SCREEN_SPACE_SHADOWS @@ -686,8 +690,17 @@ PS_OUTPUT main(PS_INPUT input, bool frontFace : SV_IsFrontFace) float lightShadow = 1.0; float shadowComponent = 1.0; + bool shadowCoverage = false; if (light.lightFlags & LightLimitFix::LightFlags::Shadow) { - shadowComponent = shadowColor[light.shadowLightIndex]; + // Per-pixel PCF rotation + world-space position for new SLF shadow API. + // Replaces the old shadowColor[light.shadowLightIndex] vanilla path which + // referenced a now-renamed Light field (shadowLightIndex -> shadowMapIndex) + // and bypassed the SLF shadow infrastructure. 
+ float2 rotation; + sincos(Math::TAU * screenNoise, rotation.y, rotation.x); + float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); + float3 worldPositionWS = input.WorldPosition.xyz + FrameBuffer::CameraPosAdjust[eyeIndex].xyz; + shadowComponent = LightLimitFix::GetShadowLightShadow(light.shadowMapIndex, worldPositionWS, rotationMatrix, shadowCoverage); lightShadow *= shadowComponent; } @@ -837,11 +850,13 @@ PS_OUTPUT main(PS_INPUT input) float dirDetailedShadow = 1.0; - if (!SharedData::InInterior) + // HasDirectionalShadows() admits Interior Sun cells; mirrors the + // same swap in Lighting.hlsl / Particle.hlsl. + if (ShadowSampling::HasDirectionalShadows()) dirDetailedShadow = shadowColor.x; # if defined(SCREEN_SPACE_SHADOWS) - if (!SharedData::InInterior) + if (ShadowSampling::HasDirectionalShadows()) dirDetailedShadow *= ScreenSpaceShadows::GetScreenSpaceShadow(input.HPosition.xyz, screenUV, screenNoise, eyeIndex); # endif // SCREEN_SPACE_SHADOWS @@ -882,8 +897,15 @@ PS_OUTPUT main(PS_INPUT input) float lightShadow = 1.0; float shadowComponent = 1.0; + bool shadowCoverage = false; if (light.lightFlags & LightLimitFix::LightFlags::Shadow) { - shadowComponent = shadowColor[light.shadowLightIndex]; + // Per-pixel PCF rotation + world-space position for new SLF shadow API. + // Replaces the old shadowColor[light.shadowLightIndex] vanilla path. 
+ float2 rotation; + sincos(Math::TAU * screenNoise, rotation.y, rotation.x); + float2x2 rotationMatrix = float2x2(rotation.x, rotation.y, -rotation.y, rotation.x); + float3 worldPositionWS = input.WorldPosition.xyz + FrameBuffer::CameraPosAdjust[eyeIndex].xyz; + shadowComponent = LightLimitFix::GetShadowLightShadow(light.shadowMapIndex, worldPositionWS, rotationMatrix, shadowCoverage); lightShadow *= shadowComponent; } diff --git a/package/Shaders/Utility.hlsl b/package/Shaders/Utility.hlsl index 95522d81c9..356ab59b07 100644 --- a/package/Shaders/Utility.hlsl +++ b/package/Shaders/Utility.hlsl @@ -538,7 +538,7 @@ PS_OUTPUT main(PS_INPUT input) # elif SHADOWFILTER == 1 shadowVisibility = TexShadowMapSamplerComp.SampleCmpLevelZero(SampShadowMapSamplerComp, float3(positionLS.xy, cascadeIndex), positionLS.z).x; # elif SHADOWFILTER == 3 - shadowVisibility = SampleShadowPCF(TexShadowMapSamplerComp, SampShadowMapSamplerComp, positionLS.xy, cascadeIndex, positionLS.z, rotationMatrix, ShadowSampleParam.z); + shadowVisibility = SampleShadowPCF(TexShadowMapSamplerComp, SampShadowMapSamplerComp, positionLS.xy, cascadeIndex, positionLS.z, rotationMatrix, ShadowSampleParam.z * 0.5); # endif if (cascadeIndex < 1 && StartSplitDistances.y < shadowMapDepth) { @@ -554,7 +554,7 @@ PS_OUTPUT main(PS_INPUT input) # elif SHADOWFILTER == 1 cascade1ShadowVisibility = TexShadowMapSamplerComp.SampleCmpLevelZero(SampShadowMapSamplerComp, float3(cascade1PositionLS.xy, 1), cascade1PositionLS.z).x; # elif SHADOWFILTER == 3 - cascade1ShadowVisibility = SampleShadowPCF(TexShadowMapSamplerComp, SampShadowMapSamplerComp, cascade1PositionLS.xy, 1, cascade1PositionLS.z, rotationMatrix, ShadowSampleParam.z); + cascade1ShadowVisibility = SampleShadowPCF(TexShadowMapSamplerComp, SampShadowMapSamplerComp, cascade1PositionLS.xy, 1, cascade1PositionLS.z, rotationMatrix, ShadowSampleParam.z * 0.5); # endif float cascade1BlendFactor = smoothstep(0, 1, (shadowMapDepth - StartSplitDistances.y) / 
(EndSplitDistances.x - StartSplitDistances.y)); diff --git a/src/Deferred.cpp b/src/Deferred.cpp index 870d1354e3..ac6d831dc0 100644 --- a/src/Deferred.cpp +++ b/src/Deferred.cpp @@ -8,6 +8,7 @@ #include "Features/DynamicCubemaps.h" #include "Features/IBL.h" +#include "Features/LightLimitFix/ShadowCasterManager.h" #include "Features/ScreenSpaceGI.h" #include "Features/Skylighting.h" #include "Features/SubsurfaceScattering.h" @@ -564,6 +565,33 @@ void Deferred::SetShadowCascadeParameters(T& lightData, DirectionalShadowLightDa DirectX::XMMATRIX invProj = DirectX::XMMatrixInverse(nullptr, proj); DirectX::XMStoreFloat4x4(&dd.InvShadowProj[i], invProj); } + + // Focus shadow matrices (one per active focus actor; engine writes them + // to focusShadowmapDescriptors[i].lightTransform during its per-cascade + // render). The shader samples kSHADOWMAPS slice (4 + i) for each entry + // to recover the player/NPC high-resolution shadow. + const auto focusCount = std::min( + static_cast(std::size(lightData.focusShadowmapDescriptors)), + static_cast(std::size(dd.FocusShadowProj))); + // Preserve descriptor->slice correspondence by writing FocusShadowProj[i] + // for descriptor[i] -- the LLF shader samples kSHADOWMAPS slice (4 + fi) + // using fi as the matrix index, so packing densely (e.g. via a separate + // counter) would pair matrix N with the wrong shadow slice when there are + // disabled holes between descriptors. Disabled descriptors leave their + // FocusShadowProj slot at the default-zero matrix; the shader's existing + // `focusClip.w <= EPSILON_DIVISION` guard treats that as "no actor in + // this slice" and skips sampling. FocusShadowCount is the upper iteration + // bound (last enabled index + 1) so the shader still exits early when + // trailing slots are empty. 
+ dd.FocusShadowCount = 0; + for (uint32_t i = 0; i < focusCount; i++) { + const auto& desc = lightData.focusShadowmapDescriptors[i]; + if (!desc.isEnabled) + continue; // descriptor unused this frame -- leave FocusShadowProj[i] at zero + auto proj = DirectX::XMLoadFloat4x4(reinterpret_cast(&desc.lightTransform)); + DirectX::XMStoreFloat4x4(&dd.FocusShadowProj[i], proj); + dd.FocusShadowCount = i + 1; + } } void Deferred::CopyShadowLightData() @@ -598,6 +626,10 @@ void Deferred::CopyShadowLightData() ID3D11ShaderResourceView* srv = directionalShadowLights->srv.get(); context->PSSetShaderResources(98, 1, &srv); + + // t99: cascade depth array used by LightLimitFix::GetDirectionalShadow for PCF sampling. + ID3D11ShaderResourceView* cascadeSRV = globals::game::renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGET_DEPTHSTENCIL::kSHADOWMAPS_ESRAM].depthSRV; + context->PSSetShaderResources(99, 1, &cascadeSRV); } void Deferred::ClearShaderCache() diff --git a/src/Deferred.h b/src/Deferred.h index c95f28ed2a..841ce082e0 100644 --- a/src/Deferred.h +++ b/src/Deferred.h @@ -4,6 +4,7 @@ #include "Buffer.h" #include "RE/B/BSShadowDirectionalLight.h" +#include "RE/B/BSShadowLight.h" #define ALBEDO RE::RENDER_TARGETS::kINDIRECT #define SPECULAR RE::RENDER_TARGETS::kINDIRECT_DOWNSCALED @@ -27,8 +28,34 @@ class Deferred float4x4 InvShadowProj[2]; float2 EndSplitDistances; float2 StartSplitDistances; + // Focus shadow projection matrices, written by SCM each frame for the + // active FocusShadowActors (player + tracked NPCs, max 4). Each matrix + // projects world-space to the focus shadow's clip space; HLSL samples + // kSHADOWMAPS slice (4 + i) for matrix i to get the per-actor high-res + // shadow. FocusShadowCount in [0..4]; entries beyond it are ignored. 
+ float4x4 FocusShadowProj[4]; + uint FocusShadowCount; + uint pad0[3]; }; STATIC_ASSERT_ALIGNAS_16(DirectionalShadowLightData); + // Size guard catches silent layout drift between this and the HLSL mirror + // in ShadowSampling.hlsli; any size change here corrupts every uploaded + // directional shadow record so we want it to fail at compile time. + // 8 float4x4 (Shadow + Inv + Focus) + 2 float4 (splits + FocusCount/pad). + static_assert(sizeof(DirectionalShadowLightData) == 8 * sizeof(float4x4) + 2 * sizeof(float4), + "DirectionalShadowLightData layout drifted from ShadowSampling.hlsli mirror"); + + struct alignas(16) ShadowLightData + { + float4x4 ShadowProj; + float4x4 InvShadowProj; + float4 ShadowParam; + }; + + STATIC_ASSERT_ALIGNAS_16(ShadowLightData); + // Same guard for the per-slot point/spot shadow record (LightLimitFix.hlsli). + static_assert(sizeof(ShadowLightData) == 2 * sizeof(float4x4) + sizeof(float4), + "ShadowLightData layout drifted from LightLimitFix.hlsli mirror"); void SetupResources(); void ReflectionsPrepasses(); @@ -43,9 +70,6 @@ class Deferred void ClearShaderCache(); - ID3D11ComputeShader* GetComputeMainComposite(); - ID3D11ComputeShader* GetComputeMainCompositeInterior(); - // Reads directional shadow parameters from BSShadowDirectionalLight and uploads // to the structured buffer at t98 (DirectionalShadowLightData — cascade splits + // world-to-shadow projections). Called during EarlyPrepasses once shadow maps @@ -53,6 +77,9 @@ class Deferred // constant-buffer fields into a UAV. 
void CopyShadowLightData(); + ID3D11ComputeShader* GetComputeMainComposite(); + ID3D11ComputeShader* GetComputeMainCompositeInterior(); + ID3D11BlendState* deferredBlendStates[7][2][13][2]; ID3D11BlendState* forwardBlendStates[7][2][13][2]; diff --git a/src/Feature.cpp b/src/Feature.cpp index aa0e23992f..5264802f5a 100644 --- a/src/Feature.cpp +++ b/src/Feature.cpp @@ -18,6 +18,7 @@ #include "Features/LightLimitFix.h" #include "Features/LinearLighting.h" #include "Features/PerformanceOverlay.h" +#include "Features/RemoteControl.h" #include "Features/RenderDoc.h" #include "Features/ScreenSpaceGI.h" #include "Features/ScreenSpaceShadows.h" @@ -101,7 +102,7 @@ void Feature::Load(json& o_json) std::string minimalVersionString = Util::GetFormattedVersion(minimalFeatureVersion); if (IsCore()) { - failedLoadedMessage = std::format("This feature is already included as part of the core Community Shaders installation. Uninstall this feature with your mod manager."); + failedLoadedMessage = std::format("This feature is already included as part of the core Open Shaders / Community Shaders installation. Uninstall this feature with your mod manager."); } else if (majorVersionMismatch) { failedLoadedMessage = std::format("{} {} is too old, major version incompatibility detected. 
Required: {}", GetShortName(), value, minimalVersionString); } else { @@ -240,6 +241,7 @@ const std::vector& Feature::GetFeatureList() &globals::features::extendedTranslucency, &globals::features::upscaling, &globals::features::renderDoc, + &globals::features::remoteControl, &globals::features::weatherEditor, &globals::features::screenshotFeature, &globals::features::linearLighting, diff --git a/src/Feature.h b/src/Feature.h index 95ca1741a7..e8e75a0afe 100644 --- a/src/Feature.h +++ b/src/Feature.h @@ -3,6 +3,11 @@ #include "FeatureCategories.h" #include "FeatureConstraints.h" #include "FeatureVersions.h" +#include "Utils/RestartSettings.h" + +#include +#include +#include #ifdef TRACY_ENABLE # include # include @@ -21,6 +26,38 @@ struct Feature // Override in features to expose settings for search virtual std::vector GetSettingsSearchEntries() { return {}; } + // Restart-required settings introspection. Default: none. + // Features with restart-gated fields override these to expose them to UI + // helpers and MCP/RemoteControl without per-feature glue. + virtual std::span GetRestartRequiredFields() const { return {}; } + virtual const void* GetBootValue(std::string_view /*jsonKey*/) const { return nullptr; } + virtual const void* GetSettingsBlob() const { return nullptr; } + virtual size_t GetSettingsBlobSize() const { return 0; } + + // True if any restart-gated setting's live value differs from the + // boot-latched value. Drives the green "RestartNeeded" tint in the + // feature list and the `pending` flag in MCP's `list` response. 
+ bool HasAnyPendingRestart() const + { + const auto fields = GetRestartRequiredFields(); + if (fields.empty()) + return false; + const auto* live = reinterpret_cast(GetSettingsBlob()); + const size_t liveSize = GetSettingsBlobSize(); + if (!live || liveSize == 0) + return false; + for (const auto& field : fields) { + if (!field.jsonKey || field.size == 0) + continue; + if (field.offset + field.size > liveSize) + continue; + const void* boot = GetBootValue(field.jsonKey); + if (boot && std::memcmp(boot, live + field.offset, field.size) != 0) + return true; + } + return false; + } + // Nexus Mods base URL for Skyrim Special Edition static constexpr std::string_view NEXUS_BASE_URL = "https://www.nexusmods.com/skyrimspecialedition/mods/"; bool loaded = false; diff --git a/src/FeatureIssues.cpp b/src/FeatureIssues.cpp index f954c88736..64362979d5 100644 --- a/src/FeatureIssues.cpp +++ b/src/FeatureIssues.cpp @@ -48,7 +48,7 @@ namespace FeatureIssues .displayName = "Complex Parallax Materials", .rejectionReason = "Integrated into ExtendedMaterials feature", .replacementFeature = "ExtendedMaterials", - .userMessage = "This functionality is now built into Community Shaders. Remove the old feature as it's no longer needed.", + .userMessage = "This functionality is now built into Open Shaders. Remove the old feature as it's no longer needed.", .removedInVersion = { 1, 0, 0 }, .modifiedShaderDirectory = false, .issueType = FeatureIssueInfo::IssueType::OBSOLETE } }, @@ -56,7 +56,7 @@ namespace FeatureIssues .displayName = "Tree LOD Lighting", .rejectionReason = "Functionality integrated into base CS lighting system", .replacementFeature = "", - .userMessage = "This functionality is now built into Community Shaders. Remove the old feature as it's no longer needed.", + .userMessage = "This functionality is now built into Open Shaders. 
Remove the old feature as it's no longer needed.", .removedInVersion = { 1, 0, 0 }, .modifiedShaderDirectory = true, .issueType = FeatureIssueInfo::IssueType::OBSOLETE } }, @@ -88,7 +88,7 @@ namespace FeatureIssues .displayName = "Distant Tree Lighting", .rejectionReason = "Replaced by TreeLODLighting, which was later integrated into CS core", .replacementFeature = "", - .userMessage = "This functionality is now built into Community Shaders. Remove the old feature as it's no longer needed.", + .userMessage = "This functionality is now built into Open Shaders. Remove the old feature as it's no longer needed.", .removedInVersion = { 0, 8, 0 }, .modifiedShaderDirectory = true, .issueType = FeatureIssueInfo::IssueType::OBSOLETE } } @@ -591,7 +591,7 @@ namespace FeatureIssues ImGui::SameLine(); ImGui::Text("Core feature already installed"); if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::TextWrapped("This feature is already included as part of the core Community Shaders installation. Uninstall this feature with your mod manager."); + ImGui::TextWrapped("This feature is already included as part of the core Open Shaders installation. 
Uninstall this feature with your mod manager."); } } else if (issue.IsVersionMismatch()) { ImGui::SameLine(); @@ -676,7 +676,7 @@ namespace FeatureIssues ImGui::TextColored(theme.StatusPalette.Warning, "If compilation issues persist after deletion:"); ImGui::BulletText("Completely uninstall the feature via your mod manager"); ImGui::BulletText("Check for modified files in Data/Shaders/ (not in feature subfolders)"); - ImGui::BulletText("Consider reinstalling Community Shaders if issues persist"); + ImGui::BulletText("Consider reinstalling Open Shaders if issues persist"); ImGui::Spacing(); ImGui::Separator(); ImGui::Spacing(); @@ -1462,74 +1462,70 @@ namespace FeatureIssues auto* menu = Menu::GetSingleton(); const auto& themeSettings = menu->GetTheme(); - if (ImGui::CollapsingHeader("Testing", ImGuiTreeNodeFlags_OpenOnArrow | ImGuiTreeNodeFlags_OpenOnDoubleClick)) { - { - auto sectionWrapper = Util::SectionWrapper("Feature Issue Testing", - "These tools create test INI files to trigger all known feature issue types for testing purposes.", - themeSettings.Palette.Text); - - if (sectionWrapper) { - const bool hasActiveTests = HasActiveTestInis(); - if (hasActiveTests) { // Warning section using theme colors - ImGui::PushStyleColor(ImGuiCol_Text, themeSettings.StatusPalette.RestartNeeded); - ImGui::TextWrapped("Test INI files are currently active. 
Restart CS to see feature issues."); - ImGui::PopStyleColor(); // Show detailed test state information - ImGui::Spacing(); - ImGui::PushStyleColor(ImGuiCol_Text, themeSettings.StatusPalette.RestartNeeded); - ImGui::TextWrapped(GetTestStateDescription().c_str()); - ImGui::PopStyleColor(); - ImGui::Spacing(); - } + auto sectionWrapper = Util::SectionWrapper("Feature Issue Testing", + "These tools create test INI files to trigger all known feature issue types for testing purposes.", + themeSettings.Palette.Text); - // Create Test INIs button - { - auto disableGuard = Util::DisableGuard(hasActiveTests); - auto buttonStyle = Util::StyledButtonWrapper( - themeSettings.Palette.FrameBorder, - themeSettings.StatusPalette.RestartNeeded, - themeSettings.StatusPalette.CurrentHotkey); - - if (ImGui::Button("Create Test Inis", { -1, 0 })) { - auto testInis = CreateTestInis(); - logger::info("Created {} test INI files for feature issue testing", testInis.size()); - } - } + if (sectionWrapper) { + const bool hasActiveTests = HasActiveTestInis(); + if (hasActiveTests) { // Warning section using theme colors + ImGui::PushStyleColor(ImGuiCol_Text, themeSettings.StatusPalette.RestartNeeded); + ImGui::TextWrapped("Test INI files are currently active. 
Restart CS to see feature issues."); + ImGui::PopStyleColor(); // Show detailed test state information + ImGui::Spacing(); + ImGui::PushStyleColor(ImGuiCol_Text, themeSettings.StatusPalette.RestartNeeded); + ImGui::TextWrapped(GetTestStateDescription().c_str()); + ImGui::PopStyleColor(); + ImGui::Spacing(); + } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text( - "Creates test INI files that trigger all known feature issue cases:\n" - "- Obsolete features (ComplexParallaxMaterials, TerrainBlending, etc.)\n" - "- Unknown features (fake non-existent features)\n" - "- Version mismatch (modifies existing feature version)\n" - "Restart CS after creating to see the issues in action."); - } + // Create Test INIs button + { + auto disableGuard = Util::DisableGuard(hasActiveTests); + auto buttonStyle = Util::StyledButtonWrapper( + themeSettings.Palette.FrameBorder, + themeSettings.StatusPalette.RestartNeeded, + themeSettings.StatusPalette.CurrentHotkey); + + if (ImGui::Button("Create Test INIs", { -1, 0 })) { + auto testInis = CreateTestInis(); + logger::info("Created {} test INI files for feature issue testing", testInis.size()); + } + } - // Restore button - { - auto disableGuard = Util::DisableGuard(!hasActiveTests); - auto buttonStyle = Util::StyledButtonWrapper( - themeSettings.Palette.FrameBorder, - themeSettings.StatusPalette.Error, - themeSettings.StatusPalette.CurrentHotkey); - - if (ImGui::Button("Restore", { -1, 0 })) { - auto& testInis = GetCurrentTestInis(); - if (RestoreOriginalState(testInis)) { - logger::info("Successfully restored original state"); - } else { - logger::warn("Some restoration operations failed"); - } - } - } + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Creates test INI files that trigger all known feature issue cases:\n" + "- Obsolete features (ComplexParallaxMaterials, TerrainBlending, etc.)\n" + "- Unknown features (fake non-existent features)\n" + "- Version mismatch (modifies existing feature version)\n" 
+ "Restart CS after creating to see the issues in action."); + } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text( - "Removes all test INI files and restores any modified INI files to their original state.\n" - "This undoes all changes made by 'Create Test Inis'.\n" - "Restart CS after restoring to see normal operation."); + // Restore button + { + auto disableGuard = Util::DisableGuard(!hasActiveTests); + auto buttonStyle = Util::StyledButtonWrapper( + themeSettings.Palette.FrameBorder, + themeSettings.StatusPalette.Error, + themeSettings.StatusPalette.CurrentHotkey); + + if (ImGui::Button("Restore", { -1, 0 })) { + auto& testInis = GetCurrentTestInis(); + if (RestoreOriginalState(testInis)) { + logger::info("Successfully restored original state"); + } else { + logger::warn("Some restoration operations failed"); } } } + + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Removes all test INI files and restores any modified INI files to their original state.\n" + "This undoes all changes made by 'Create Test INIs'.\n" + "Restart CS after restoring to see normal operation."); + } } } bool RefreshTestState() diff --git a/src/Features/DynamicCubemaps.cpp b/src/Features/DynamicCubemaps.cpp index 560c902452..1d09edfae7 100644 --- a/src/Features/DynamicCubemaps.cpp +++ b/src/Features/DynamicCubemaps.cpp @@ -6,6 +6,7 @@ #include "ShaderCache.h" #include "State.h" #include "Utils/D3D.h" +#include "Utils/UI.h" constexpr auto MIPLEVELS = 8; @@ -30,14 +31,9 @@ void DynamicCubemaps::DrawSettings() recompileFlag |= ImGui::Checkbox("Enable Screen Space Reflections", reinterpret_cast(&settings.EnabledSSR)); if (auto _tt = Util::HoverTooltipWrapper()) { ImGui::Text("Enable Screen Space Reflections on Water"); - if (REL::Module::IsVR() && !enabledAtBoot) { - ImGui::PushStyleColor(ImGuiCol_Text, ImVec4(1.0f, 0.0f, 0.0f, 1.0f)); - ImGui::Text( - "A restart is required to enable in VR. 
" - "Save Settings after enabling and restart the game."); - ImGui::PopStyleColor(); - } } + if (globals::game::isVR) + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::EnabledSSR); ImGui::TreePop(); } @@ -119,7 +115,7 @@ void DynamicCubemaps::DrawSettings() } ImGui::TreePop(); } - if (REL::Module::IsVR()) { + if (globals::game::isVR) { if (ImGui::TreeNodeEx("Advanced VR Settings", ImGuiTreeNodeFlags_DefaultOpen)) { Util::RenderImGuiSettingsTree(iniVRCubeMapSettings, "VR"); Util::RenderImGuiSettingsTree(hiddenVRCubeMapSettings, "hiddenVR"); @@ -131,7 +127,7 @@ void DynamicCubemaps::DrawSettings() void DynamicCubemaps::LoadSettings(json& o_json) { settings = o_json; - if (REL::Module::IsVR()) { + if (globals::game::isVR) { Util::LoadGameSettings(iniVRCubeMapSettings); } recompileFlag = true; @@ -140,7 +136,7 @@ void DynamicCubemaps::LoadSettings(json& o_json) void DynamicCubemaps::SaveSettings(json& o_json) { o_json = settings; - if (REL::Module::IsVR()) { + if (globals::game::isVR) { Util::SaveGameSettings(iniVRCubeMapSettings); } } @@ -148,7 +144,7 @@ void DynamicCubemaps::SaveSettings(json& o_json) void DynamicCubemaps::RestoreDefaultSettings() { settings = {}; - if (REL::Module::IsVR()) { + if (globals::game::isVR) { Util::ResetGameSettingsToDefaults(iniVRCubeMapSettings); Util::ResetGameSettingsToDefaults(hiddenVRCubeMapSettings); } @@ -157,7 +153,7 @@ void DynamicCubemaps::RestoreDefaultSettings() void DynamicCubemaps::DataLoaded() { - if (REL::Module::IsVR()) { + if (globals::game::isVR) { // enable cubemap settings in VR Util::EnableBooleanSettings(iniVRCubeMapSettings, GetName()); Util::EnableBooleanSettings(hiddenVRCubeMapSettings, GetName()); @@ -167,7 +163,8 @@ void DynamicCubemaps::DataLoaded() void DynamicCubemaps::PostPostLoad() { - if (REL::Module::IsVR() && settings.EnabledSSR) { + bootSnapshot.LatchIfNeeded(settings); + if (globals::game::isVR && settings.EnabledSSR) { std::map earlyhiddenVRCubeMapSettings{ { 
"bScreenSpaceReflectionEnabled:Display", 0x1ED5BC0 }, }; @@ -180,7 +177,6 @@ void DynamicCubemaps::PostPostLoad() *setting = true; } } - enabledAtBoot = true; } } diff --git a/src/Features/DynamicCubemaps.h b/src/Features/DynamicCubemaps.h index 7b23744d41..ac46d30629 100644 --- a/src/Features/DynamicCubemaps.h +++ b/src/Features/DynamicCubemaps.h @@ -1,6 +1,7 @@ #pragma once #include "Buffer.h" +#include "Utils/BootSnapshot.h" class MenuOpenCloseEventHandler : public RE::BSTEventSink { @@ -119,7 +120,21 @@ struct DynamicCubemaps : Feature }; Settings settings; - bool enabledAtBoot = false; + + inline static constexpr Util::Settings::RestartTable kRestartFields{ { + UTIL_RESTART_FIELD(Settings, EnabledSSR, "Screen Space Reflections"), + } }; + Util::Settings::BootSnapshot bootSnapshot{ kRestartFields }; + + std::span GetRestartRequiredFields() const override + { + // VR-only: enabling SSR needs game-setting initialization at startup. + return globals::game::isVR ? std::span{ kRestartFields.data(), kRestartFields.size() } : std::span{}; + } + const void* GetBootValue(std::string_view jsonKey) const override { return bootSnapshot.RawBoot(jsonKey); } + const void* GetSettingsBlob() const override { return &settings; } + size_t GetSettingsBlobSize() const override { return sizeof(settings); } + void UpdateCubemap(); void PostDeferred(); diff --git a/src/Features/GrassLighting.cpp b/src/Features/GrassLighting.cpp index 59054832c9..9d2c696912 100644 --- a/src/Features/GrassLighting.cpp +++ b/src/Features/GrassLighting.cpp @@ -57,7 +57,7 @@ void GrassLighting::DrawSettings() ImGui::Text( "Override the settings set by the grass mesh author. " "Complex grass authors can define the brightness for their grass meshes. " - "However, some authors may not account for the extra lights available from Community Shaders. " + "However, some authors may not account for the extra lights available from Open Shaders. " "This option will treat their grass settings like non-complex grass. 
" "This was the default in Community Shaders < 0.7.0"); } diff --git a/src/Features/InverseSquareLighting.cpp b/src/Features/InverseSquareLighting.cpp index a602079470..ab16e2ebcd 100644 --- a/src/Features/InverseSquareLighting.cpp +++ b/src/Features/InverseSquareLighting.cpp @@ -66,7 +66,13 @@ void InverseSquareLighting::ProcessLight(LightLimitFix::LightData& light, RE::BS if (bsLight->pointLight && isInvSq) { const float intensity = runtimeData->fade * 4; - light.radius = CalculateRadius(intensity, bsLight->IsShadowLight(), runtimeData->cutoffOverride, runtimeData->size); + // Use the type-based helper rather than the virtual IsShadowLight(): + // SCM's Hook_IsShadowLight reports false for shadow lights converted + // to normal-light overflow handling (issue #2121 #3). If we followed + // that hook here the cutoff would flip from DefaultShadowCasterCutoff + // (0.022) to DefaultCutoff (0.05) when a light is converted, shrinking + // its effective radius by ~33% and visibly reducing its lit area. 
+ light.radius = CalculateRadius(intensity, ShadowCasterManager::IsShadowLightType(bsLight), runtimeData->cutoffOverride, runtimeData->size); runtimeData->radius = light.radius; light.invRadius = 1.f / light.radius; light.fadeZone = 1.f / (light.radius * std::clamp(FadeZoneBase * light.invRadius, 0.f, 1.f)); diff --git a/src/Features/LightLimitFix.cpp b/src/Features/LightLimitFix.cpp index c7528fe44c..98a6e136bb 100644 --- a/src/Features/LightLimitFix.cpp +++ b/src/Features/LightLimitFix.cpp @@ -1,22 +1,111 @@ #include "LightLimitFix.h" #include "InverseSquareLighting.h" #include "LinearLighting.h" +#include "Utils/UI.h" +#include "Deferred.h" #include "Menu/ThemeManager.h" -#include "Utils/ExternalEmittance.h" #include "Shadercache.h" #include "State.h" #include "Util.h" +#include "Utils/ExternalEmittance.h" -static constexpr uint CLUSTER_MAX_LIGHTS = 128; -static constexpr uint MAX_LIGHTS = 1024; +// Debug visualisation state (EnableLightsVisualisation / LightsVisualisationMode) +// is intentionally NOT in Settings -- it lives as instance members on the +// LightLimitFix class so it resets per session and can't accidentally end +// up in a shipped JSON config that would force every load to compile the +// heavier LLFDEBUG shader permutation. 
+NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE_WITH_DEFAULT( + LightLimitFix::Settings, + EnableContactShadows, + ContactShadowMaxSteps, + ContactShadowMaxDistance, + ContactShadowStride, + ContactShadowThickness, + ContactShadowDepthFade, + ContactShadowMinIntensity, + ShowShadowOverlay, + ShadowSettings) void LightLimitFix::DrawSettings() { auto shaderCache = globals::shaderCache; - if (ImGui::TreeNodeEx("Statistics", ImGuiTreeNodeFlags_DefaultOpen)) { - ImGui::Text(std::format("Clustered Light Count : {}", lightCount).c_str()); + ShadowCasterManager::DrawSettings(settings.ShadowSettings); + + // ---- Active Shadow Casters -------------------------------------- + // One cohesive section: overlay toggle, then ALL the stats grouped + // together (summary + scheduler stats + budget verdict), then the + // table below. Same layout as the overlay so testers see the same + // thing in both views with the stats above the (potentially long) + // table -- no scrolling required to find the headline numbers. + ImGui::SeparatorText("Shadow Limit Fix -- Active Casters"); + + ImGui::Checkbox("Show Shadow Overlay", &settings.ShowShadowOverlay); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Pop out an always-visible overlay window with the shadow caster table.\n" + "Without this, the overlay only appears when a light is suppressed\n" + "or a visualisation mode is active. 
Enable to access the table's\n" + "debug controls (cycle button, solo, Shift+hover pulse) any time."); + } + + ShadowCasterManager::DrawShadowSummary(lightCount, MAX_LIGHTS, shadowUnshadowedLightCount); + ShadowCasterManager::DrawShadowSchedulerStats(); + ImGui::Separator(); + ShadowCasterManager::DrawShadowLightTable(true, false); + + /////////////////////////////// + ImGui::SeparatorText("Shadows"); + + ImGui::Checkbox("Enable Contact Shadows", &settings.EnableContactShadows); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("All point lights (strict and clustered, except simple lights) cast short screen-space shadows. Performance impact."); + } + + if (settings.EnableContactShadows && ImGui::TreeNode("Contact Shadow Tuning")) { + // SliderScalar with ImGuiDataType_U32 instead of `SliderInt + (int*)cast`: + // the cast violates strict aliasing (UB) and would also misinterpret any + // transient negative value inside ImGui before clamp. SliderScalar + // reads/writes the uint storage directly with explicit min/max bounds. + constexpr uint32_t kMinSteps = 1, kMaxSteps = 16; + ImGui::SliderScalar("Max Steps", ImGuiDataType_U32, &settings.ContactShadowMaxSteps, + &kMinSteps, &kMaxSteps, "%u", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Raymarch steps at zero depth. Higher = longer / more accurate contact shadows, linearly more cost.\nVR users should consider 2 to halve per-eye cost."); + } + + // AlwaysClamp on every float slider too: without it, Ctrl+Click text entry can + // land arbitrary out-of-range values in settings before GetCommonBufferData's + // boundary clamp catches them at the GPU side. + ImGui::SliderFloat("Max Distance", &settings.ContactShadowMaxDistance, 64.0f, 4096.0f, "%.0f", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("View-space depth at which contact shadows fade to zero steps. 
Avoids paying for shadows on distant surfaces where they don't read."); + } + + ImGui::SliderFloat("Stride", &settings.ContactShadowStride, 0.5f, 8.0f, "%.2f", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Per-step march length in view-space units at near depth (auto-scales linearly past ~100 units so far surfaces don't undersample). Larger = longer screen-space reach with coarser detail."); + } + + ImGui::SliderFloat("Thickness", &settings.ContactShadowThickness, 0.0f, 1.0f, "%.3f", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Depth-delta multiplier for shadow onset. Larger = darker contact at occluder edges."); + } + + ImGui::SliderFloat("Depth Fade", &settings.ContactShadowDepthFade, 0.0f, 1.0f, "%.3f", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Depth-delta multiplier for shadow falloff. Larger = shadows truncate sooner behind thick occluders."); + } + + ImGui::SliderFloat("Min Light Intensity", &settings.ContactShadowMinIntensity, 0.0f, 1.0f, "%.2f", ImGuiSliderFlags_AlwaysClamp); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Skip contact shadows for CLUSTERED lights whose normalized distance falloff " + "`1 - (lightDist/radius)^2` at the pixel is below this threshold. " + "Strict lights are always raymarched regardless of this threshold. 
" + "Higher = larger perf win, may drop subtle shadows from weak lights at their reach edge."); + } ImGui::TreePop(); } @@ -25,25 +114,45 @@ void LightLimitFix::DrawSettings() ImGui::SeparatorText("Debug"); if (ImGui::TreeNode("Light Limit Visualization")) { - ImGui::Checkbox("Enable Lights Visualisation", &settings.EnableLightsVisualisation); + ImGui::Checkbox("Enable Lights Visualisation", &EnableLightsVisualisation); if (auto _tt = Util::HoverTooltipWrapper()) { ImGui::Text("Enables visualization of the light limit\n"); } { - static const char* comboOptions[] = { "Light Limit", "Strict Lights Count", "Clustered Lights Count", "Shadow Mask" }; - ImGui::Combo("Lights Visualisation Mode", (int*)&settings.LightsVisualisationMode, comboOptions, 4); + static const char* comboOptions[] = { + "Light Limit", + "Strict Lights Count", + "Clustered Lights Count", + "Shadow Mask", + "Shadow Light Count", + "Point Light Shadow Factor", + "Unshadowed Point Lights", + "Shadow Caster Density", + "Shadow Slot Index Color", + "Light Type Visualization", + }; + // Round-trip through int instead of `(int*)&uint` to avoid strict-aliasing UB + // (ImGui has no ComboScalar). Clamp on the way in defends against any stale + // persisted value that might still exist from older builds. + int visMode = std::clamp(static_cast(LightsVisualisationMode), + 0, IM_ARRAYSIZE(comboOptions) - 1); + ImGui::Combo("Lights Visualisation Mode", &visMode, comboOptions, IM_ARRAYSIZE(comboOptions)); + LightsVisualisationMode = static_cast(visMode); if (auto _tt = Util::HoverTooltipWrapper()) { ImGui::Text( - " - Visualise the light limit. 
Red when the \"strict\" light limit is reached (portal-strict lights).\n" - " - Visualise the number of strict lights.\n" - " - Visualise the number of clustered lights.\n" - " - Visualize the Shadow Mask.\n"); + "Light Limit: Red when the strict light limit is reached (>=7 portal-strict lights).\n" + "\n" + "Strict Lights Count: Heatmap of portal-strict lights per pixel (blue=0, red=15).\n" + "\n" + "Clustered Lights Count: Heatmap of dynamic lights in each screen tile (blue=0, red=128)."); + ShadowCasterManager::DrawVisualisationTooltipShadowModes(); } } - currentEnableLightsVisualisation = settings.EnableLightsVisualisation; + + currentEnableLightsVisualisation = EnableLightsVisualisation; if (previousEnableLightsVisualisation != currentEnableLightsVisualisation) { - globals::state->SetDefines(settings.EnableLightsVisualisation ? "LLFDEBUG" : ""); + globals::state->SetDefines(EnableLightsVisualisation ? "LLFDEBUG" : ""); shaderCache->Clear(RE::BSShader::Type::Lighting); previousEnableLightsVisualisation = currentEnableLightsVisualisation; } @@ -52,23 +161,35 @@ void LightLimitFix::DrawSettings() } } -void LightLimitFix::DrawOverlay() -{ - if (!settings.EnableLightsVisualisation) - return; - const float pos = ThemeManager::Constants::OVERLAY_WINDOW_POSITION * Util::GetUIScale(); - ImGui::SetNextWindowPos(ImVec2(pos, pos), ImGuiCond_Always); - ImGui::Begin("##LLFDebug", nullptr, ImGuiWindowFlags_NoTitleBar | ImGuiWindowFlags_AlwaysAutoResize | ImGuiWindowFlags_NoMove | ImGuiWindowFlags_NoSavedSettings); - ImGui::TextColored(ImVec4(1.0f, 0.3f, 0.3f, 1.0f), "DEBUG FEATURE - LIGHT LIMIT VISUALISATION ENABLED"); - ImGui::End(); -} - LightLimitFix::PerFrame LightLimitFix::GetCommonBufferData() { + // Defensive sanitization before the values hit the constant buffer. 
The + // sliders enforce ImGuiSliderFlags_AlwaysClamp at the UI, but Settings + // can be mutated through other paths (JSON persistence, mod overrides, + // remote-control / MCP server, or just an internal logic bug) -- a few + // of these fields will produce divisions, infinite loops, or visual + // corruption if they arrive non-finite or out-of-range, so we re-validate + // at the shader boundary rather than trusting upstream callers. + // + // std::clamp passes NaN through unchanged (every NaN comparison is false), + // so reject non-finite values explicitly first; fall back to the lower + // bound on NaN/inf to produce degraded but stable behavior. + auto sanitizeFloat = [](float v, float lo, float hi) { + return std::isfinite(v) ? std::clamp(v, lo, hi) : lo; + }; + PerFrame perFrame{}; - perFrame.EnableLightsVisualisation = settings.EnableLightsVisualisation; - perFrame.LightsVisualisationMode = settings.LightsVisualisationMode; + perFrame.EnableContactShadows = settings.EnableContactShadows; + perFrame.ContactShadowMaxSteps = std::clamp(settings.ContactShadowMaxSteps, 1u, 16u); + perFrame.ContactShadowMaxDistance = sanitizeFloat(settings.ContactShadowMaxDistance, 64.0f, 4096.0f); + perFrame.ContactShadowStride = sanitizeFloat(settings.ContactShadowStride, 0.5f, 8.0f); + perFrame.ContactShadowThickness = sanitizeFloat(settings.ContactShadowThickness, 0.0f, 1.0f); + perFrame.ContactShadowDepthFade = sanitizeFloat(settings.ContactShadowDepthFade, 0.0f, 1.0f); + perFrame.ContactShadowMinIntensity = sanitizeFloat(settings.ContactShadowMinIntensity, 0.0f, 1.0f); + perFrame.ShadowMapSlots = ShadowCasterManager::GetInstalledSlotCount(); std::copy(clusterSize, clusterSize + 3, perFrame.ClusterSize); + perFrame.EnableLightsVisualisation = EnableLightsVisualisation; + perFrame.LightsVisualisationMode = LightsVisualisationMode; return perFrame; } @@ -177,6 +298,23 @@ void LightLimitFix::RestoreDefaultSettings() settings = {}; } +void LightLimitFix::LoadSettings(json& 
o_json) +{ + settings = o_json; + // iShadowMapResolution:Display is owned by Skyrim's INI, not our JSON. + ShadowCasterManager::LoadINISettings(); + + // Raise saved values below the current floor so older configs migrate. + if (settings.ShadowSettings.MaxRedrawPerFrame < ShadowCasterManager::Settings::kMinMaxRedrawPerFrame) + settings.ShadowSettings.MaxRedrawPerFrame = ShadowCasterManager::Settings::kMinMaxRedrawPerFrame; +} + +void LightLimitFix::SaveSettings(json& o_json) +{ + o_json = settings; + ShadowCasterManager::SaveINISettings(); +} + RE::NiNode* GetParentRoomNode(RE::NiAVObject* object) { if (object == nullptr) { @@ -252,23 +390,33 @@ void LightLimitFix::BSLightingShader_SetupGeometry_GeometrySetupConstantPointLig if (i < a_pass->numShadowLights) { auto* shadowLight = static_cast<RE::BSShadowLight*>(bsLight); - GET_INSTANCE_MEMBER(maskIndex, shadowLight); - light.shadowMaskIndex = maskIndex; - light.lightFlags.set(LightFlags::Shadow); + // Use SCM's stable container-slot index instead of reading the + // live `shadowmapDescriptors[0].shadowmapIndex`. The descriptor + // field can be corrupted mid-frame by ReturnShadowmaps() (called + // via Hook_DisableColorMask) after ScheduleShadowCasters fixed + // it but before this strict-light setup runs -- a stale-but-in-range + // index would still pass an upper-bound check yet point + // strict-light shader sampling at the wrong kSHADOWMAPS slice. + // GetShadowSlot reads from the SCM's own pool (s_lights, set in + // ScheduleShadowCasters and never touched by ReturnShadowmaps), + // so it stays consistent with CopyShadowLightData and + // UpdateLights, which also key off it. Returns -1 for the sun + // or inactive lights; both cases skip setting the Shadow flag. 
+ const int32_t slot = ShadowCasterManager::GetShadowSlot(shadowLight); + if (slot >= 0 && static_cast<uint32_t>(slot) < ShadowCasterManager::GetInstalledSlotCount()) { + light.shadowMapIndex = static_cast<uint32_t>(slot); + light.lightFlags.set(LightFlags::Shadow); + } } strictLightDataTemp.StrictLights[writeIdx++] = light; } strictLightDataTemp.NumStrictLights = writeIdx; - for (uint32_t i = 0; i < a_pass->numShadowLights; i++) { - auto bsLight = a_pass->sceneLights[i + 1]; - if (!bsLight) - continue; - auto* shadowLight = static_cast<RE::BSShadowLight*>(bsLight); - GET_INSTANCE_MEMBER(maskIndex, shadowLight); - strictLightDataTemp.ShadowBitMask |= (1u << maskIndex); - } + // Don't reinstate a build loop for strictLightDataTemp.ShadowBitMask: + // no shader reads it (the IsLightIgnored bit-mask branch was replaced by + // per-light shadowMapIndex sampling). The field stays for cbuffer ABI + // stability and is zero-initialised above. } void LightLimitFix::BSLightingShader_SetupGeometry_After(RE::BSRenderPass*) @@ -287,14 +435,12 @@ void LightLimitFix::BSLightingShader_SetupGeometry_After(RE::BSRenderPass*) const auto isEmpty = strictLightDataTemp.NumStrictLights == 0; const bool isWorld = accumulator->GetRuntimeData().activeShadowSceneNode == shadowSceneNode; const auto roomIndex = strictLightDataTemp.RoomIndex; - const auto shadowBitMask = strictLightDataTemp.ShadowBitMask; - if (!isEmpty || (isEmpty && !wasEmpty) || isWorld != wasWorld || previousRoomIndex != roomIndex || shadowBitMask != previousShadowBitMask) { + if (!isEmpty || (isEmpty && !wasEmpty) || isWorld != wasWorld || previousRoomIndex != roomIndex) { strictLightDataCB->Update(strictLightDataTemp); wasEmpty = isEmpty; wasWorld = isWorld; previousRoomIndex = roomIndex; - previousShadowBitMask = shadowBitMask; } if (frameChecker.IsNewFrame()) { @@ -330,6 +476,7 @@ void LightLimitFix::Prepass() ZoneScoped; TracyD3D11Zone(globals::state->tracyCtx, "LightLimitFix Prepass"); state->BeginPerfEvent("LightLimitFix Prepass"); + 
ShadowCasterManager::Update(settings.ShadowSettings, globals::game::smState->shadowSceneNode[0], nullptr); UpdateLights(); ID3D11ShaderResourceView* views[3]{}; @@ -343,7 +490,7 @@ void LightLimitFix::Prepass() bool LightLimitFix::IsValidLight(RE::BSLight* a_light) { - return a_light && !a_light->light->GetFlags().any(RE::NiAVObject::Flag::kHidden); + return a_light && a_light->light && a_light->light.get() && !a_light->light->GetFlags().any(RE::NiAVObject::Flag::kHidden); } bool LightLimitFix::IsGlobalLight(RE::BSLight* a_light) @@ -354,6 +501,8 @@ bool LightLimitFix::IsGlobalLight(RE::BSLight* a_light) void LightLimitFix::PostPostLoad() { Hooks::Install(); + ShadowCasterManager::Init(settings.ShadowSettings); + ShadowCasterManager::Install(settings.ShadowSettings); } void LightLimitFix::DataLoaded() @@ -382,6 +531,7 @@ void LightLimitFix::ClearShaderCache() void LightLimitFix::UpdateLights() { + ZoneScopedN("LLF::UpdateLights"); auto smState = globals::game::smState; auto& isl = globals::features::inverseSquareLighting; @@ -412,9 +562,27 @@ void LightLimitFix::UpdateLights() light.roomFlags.SetBit(roomIndex, 1); }; + // Hover-pulse helper: if the table has a hovered row matching this light's + // pointer, replace the cluster colour with a magenta pulse so the user can + // see which light a row corresponds to in 3D. Pulse cycles ~once per second + // using ImGui::GetTime() for a stable visual signal. 
+ auto applyDebugOverrides = [](LightData& light, const void* lightPtr) { + auto hoverKey = ShadowCasterManager::GetHoveredLight(); + if (hoverKey != 0 && reinterpret_cast(lightPtr) == hoverKey) { + float t = 0.5f + 0.5f * std::sin(static_cast<float>(ImGui::GetTime()) * 6.2831853f); + light.color = { 1.0f, 0.0f, 1.0f }; // magenta + light.fade = 4.0f + t * 4.0f; // pulsed intensity + } + }; + auto addLight = [&](const RE::NiPointer<RE::BSLight>& e) { if (auto bsLight = e.get()) { if (auto niLight = bsLight->light.get()) { + // IsSuppressed includes solo (every key except the soloed one is + // implicitly suppressed). This filters every non-shadow cluster + // light through the user's debug overrides. + if (ShadowCasterManager::IsSuppressed(reinterpret_cast(bsLight))) + return; if (IsValidLight(bsLight)) { auto& runtimeData = niLight->GetLightRuntimeData(); @@ -444,33 +612,153 @@ void LightLimitFix::UpdateLights() light.lightFlags.set(LightFlags::PortalStrict); } - if (bsLight->IsShadowLight()) { - auto* shadowLight = static_cast<RE::BSShadowLight*>(bsLight); - GET_INSTANCE_MEMBER(maskIndex, shadowLight); - light.shadowMaskIndex = maskIndex; - light.lightFlags.set(LightFlags::Shadow); + SetLightPosition(light, niLight->world.translate); + + applyDebugOverrides(light, bsLight); + + if ((light.color.x + light.color.y + light.color.z) * light.fade > 1e-4 && light.radius > 1e-4) { + lightsData.push_back(light); } + } + } + } + }; + + auto addShadowLight = [&](RE::BSShadowLight* shadowLight, bool castsShadow, uint32_t shadowSlot = 0) { + if (IsValidLight(shadowLight)) { + if (auto niLight = shadowLight->light.get()) { + auto& runtimeData = niLight->GetLightRuntimeData(); + + LightData light{}; + light.color = { runtimeData.diffuse.red, runtimeData.diffuse.green, runtimeData.diffuse.blue }; + light.lightFlags = std::bit_cast(runtimeData.ambient.red); + + if (isl.loaded) { + isl.ProcessLight(light, shadowLight, niLight); + } else { + light.radius = runtimeData.radius.x; + // light.color *= runtimeData.fade; + 
light.fade = runtimeData.fade; + } - // Check for inactive shadow light - if (light.shadowMaskIndex != 255) { - SetLightPosition(light, niLight->world.translate); + light.fade *= shadowLight->lodDimmer; - if ((light.color.x + light.color.y + light.color.z) * light.fade > 1e-4 && light.radius > 1e-4) { - lightsData.push_back(light); - } + if (!IsGlobalLight(shadowLight)) { + // List of BSMultiBoundRooms affected by a light + for (const auto& roomPtr : shadowLight->rooms) { + addRoom(roomPtr, light); } + // List of BSPortals affected by a light + for (const auto& portalPtr : shadowLight->portals) { + addRoom(portalPtr->portalSharedNode.get(), light); + } + light.lightFlags.set(LightFlags::PortalStrict); + } + + if (castsShadow) { + // Use the caller-provided stable slot index from s_lights + // rather than shadowmapDescriptors[0].shadowmapIndex, which + // can drift relative to our scheduler-assigned slot when + // ReturnShadowmaps fires between scheduling and lighting. + light.shadowMapIndex = shadowSlot; + light.lightFlags.set(LightFlags::Shadow); + } + + SetLightPosition(light, niLight->world.translate); + + applyDebugOverrides(light, shadowLight); + + if ((light.color.x + light.color.y + light.color.z) * light.fade > 1e-4 && light.radius > 1e-4) { + lightsData.push_back(light); } } } }; + // Single pass over shadowLightsAccum: + // - Builds shadowLightPtrs so activeLights below skips lights already added here. + // - Calls addShadowLight for each logical light. + // EnableLight calls both GameEnableLight (→ activeLights) and + // GameSetShadowCasterSlot (→ shadowLightsAccum) for redrawn lights, so without + // the skip below each redrawn shadow light would be added twice. + // + // Static reuses the bucket array across frames -- a local set would + // destroy + recreate its buckets every frame, defeating the reserve(). + // Dense layout avoids the per-insert node allocation a std::unordered_set + // would incur. 
Upper bound is the configured kSHADOWMAPS slot count; + // shadowLightsAccum is sized to hold at most that many distinct point/spot + // lights (sun occupies one logical entry but no kSHADOWMAPS slice, hence + // the belt-and-braces +1). + static ankerl::unordered_dense::set shadowLightPtrs; + shadowLightPtrs.clear(); + shadowLightPtrs.reserve(ShadowCasterManager::GetInstalledSlotCount() + 1); + ShadowCasterManager::ForEachShadowLight(shadowSceneNode->GetRuntimeData().shadowLightsAccum, + [&](RE::BSShadowLight* light) { + shadowLightPtrs.insert(light); + // GetShadowSlot returns the kSHADOWMAPS texture slot: + // -1 : sun (no kSHADOWMAPS slice — sun shadows live in kSHADOWMAPS_ESRAM + // and are sampled via the directional cascade path, not the cluster + // loop). Skip cluster injection entirely. The sun stays in + // shadowLightPtrs so the activeLights loop below doesn't re-add it. + // >=0: kSHADOWMAPS slice index (0..ShadowMapSlots-1) post-reclaim. + int32_t stableSlot = ShadowCasterManager::GetShadowSlot(light); + if (stableSlot < 0) + return; + bool castsShadow = static_cast(stableSlot) < ShadowCasterManager::GetInstalledSlotCount(); + addShadowLight(light, castsShadow, castsShadow ? static_cast(stableSlot) : 0u); + }); + for (auto& e : shadowSceneNode->GetRuntimeData().activeLights) { - addLight(e); - } - for (auto& e : shadowSceneNode->GetRuntimeData().activeShadowLights) { + if (auto bsLight = e.get(); bsLight && shadowLightPtrs.count(bsLight)) + continue; // shadow light: already added above with correct Shadow flag addLight(e); } + // Converted shadow lights (shadow lights demoted to normal-light overflow handling + // via SCM's ConvertExcessToNormal) live in the engine's activeShadowLights list + // (offset 0x148) — verified via Ghidra against ShadowSceneNode AE 1.6.1170. 
They + // are NOT migrated to activeLights (0x130) when our Hook_IsShadowLight reports + // false, because the engine's AddLight just searches the existing wrappers and + // activates the matching one in-place rather than moving entries between lists. + // + // Iterate SCM's s_normalConvert directly rather than scanning activeShadowLights: + // only lights actually in s_normalConvert are intended to render as non-shadow. + // activeShadowLights also contains BSShadowLights that are merely active shadow + // casters this frame (already handled via shadowLightsAccum above), and could in + // principle contain disabled-but-not-yet-removed entries. Iterating the convert + // list is both tighter (no false positives) and cheaper. + // + // Without this, ConvertExcessToNormal lights have no entry in the cluster + // lightsData[] and never render — the user-visible "converted lights are + // invisible" symptom of issue #2121 #3. + ShadowCasterManager::ForEachConvertedLight([&](RE::BSShadowLight* light) { + auto* asBs = static_cast(light); + if (shadowLightPtrs.count(asBs)) + return; // simultaneously a shadow caster this frame; already added + // Honour the user's suppression toggle in the shadow caster table: + // converted lights share the same lightKey suppression set as shadow + // lights, so suppressing one in the table hides it whether it's + // rendering as a shadow caster or demoted to non-shadow. + if (ShadowCasterManager::IsSuppressed(reinterpret_cast(light))) + return; + // Engine zeroes lodDimmer when its shadow-distance LOD cull fires + // (BSShadowParabolicLight_UpdateCamera test 2, gated on the lodFade + // flag -- not a visibility test, see ShadowCasterManager.cpp's + // Ghidra-verified comment). Without restoration, addLight()'s + // `light.fade *= lodDimmer` would zero the contribution and the + // (color*fade > 1e-4) filter would drop the light entirely. + // + // Restore only when fully zeroed. 
Any smooth fade value the engine + // set (between 0 and 1) is preserved -- those represent the engine's + // own gradual distance attenuation, which is correct to honour for + // cluster lighting. Overriding unconditionally was producing + // distant always-full-bright converted lights that ignored the + // engine's intended fade-with-distance. + if (light->lodDimmer == 0.0f) + light->lodDimmer = 1.0f; + addLight(RE::NiPointer(asBs)); + }); + auto context = globals::d3d::context; lightCount = std::min((uint)lightsData.size(), MAX_LIGHTS); @@ -482,6 +770,13 @@ void LightLimitFix::UpdateLights() context->Unmap(lights->resource.get(), 0); UpdateStructure(); + + // Single-shot consumption: clear the hover key after the cluster has read it. + // The table re-sets it every frame the cursor is hovering a row with Shift + // held, so the pulse continues smoothly while hovering. As soon as the menu + // closes (or the cursor leaves the table, or Shift is released), the table + // stops re-setting the key and the pulse vanishes on the next frame. + ShadowCasterManager::SetHoveredLight(0); } void LightLimitFix::UpdateStructure() @@ -564,6 +859,64 @@ void LightLimitFix::Hooks::BSLightingShader_SetupGeometry::thunk(RE::BSShader* T void LightLimitFix::Hooks::BSEffectShader_SetupGeometry::thunk(RE::BSShader* This, RE::BSRenderPass* Pass, uint32_t RenderFlags) { + // Defensive pre-call guard: BSEffectShader::SetupGeometry iterates + // Pass->sceneLights[i] and dereferences bsLight->light->fade + // (BSLight+0x48 -> NiLight+0x134) with NO null check. Stale entries are + // possible because Pass->sceneLights[] is a raw BSLight** (not + // NiPointer<>): the engine's pass cache can outlive individual lights + // or capture them after their NiLight has been cleared. Crashes seen in + // the wild include garbage data (BSLight memory recycled as a string + // buffer) and outright NULL NiLight (engine half-destroyed the BSLight + // but it's still ref-counted alive in some list). 
+ // + // Walk the array and clamp numLights to the count of entries that the + // engine can safely dereference. Validation: + // - BSLight* is canonical, 8-byte aligned, non-null + // - bsLight->light pointer is canonical, 8-byte aligned, non-null + // Entries failing either check stop the loop; the engine's own loop + // bails on the first bad entry too, so clamping matches its contract. + if (Pass && Pass->sceneLights && Pass->numLights > 0) { + const auto isPlausible = [](const void* p) { + const auto v = reinterpret_cast(p); + return v >= 0x10000 && v < 0x800000000000ull && (v & 0x7) == 0; + }; + std::uint8_t validCount = 0; + for (std::uint8_t i = 0; i < Pass->numLights; ++i) { + RE::BSLight* bsLight = Pass->sceneLights[i]; + if (!isPlausible(bsLight)) { + static int loggedBsLight = 0; + if (loggedBsLight++ < 10) { + logger::warn( + "[LLF] BSEffectShader_SetupGeometry: bad BSLight* at " + "sceneLights[{}]=0x{:x} numLights={}; clamping to {}", + i, reinterpret_cast(bsLight), Pass->numLights, validCount); + } + break; + } + RE::NiLight* niLight = bsLight->light.get(); + if (!isPlausible(niLight)) { + // Catches both NULL (engine cleared the NiPointer) and + // garbage (BSLight memory recycled). NULL is the more common + // observed failure -- the engine's loop has no null check + // before reading [+0x134]. 
+ static int loggedNiLight = 0; + if (loggedNiLight++ < 10) { + logger::warn( + "[LLF] BSEffectShader_SetupGeometry: bad NiLight at " + "sceneLights[{}] (BSLight=0x{:x} NiLight=0x{:x}); clamping to {}", + i, + reinterpret_cast(bsLight), + reinterpret_cast(niLight), + validCount); + } + break; + } + ++validCount; + } + if (validCount < Pass->numLights) + Pass->numLights = validCount; + } + func(This, Pass, RenderFlags); ExternalEmittance::UpdatePermutation(Pass); auto& singleton = globals::features::lightLimitFix; diff --git a/src/Features/LightLimitFix.h b/src/Features/LightLimitFix.h index b4a2dcd3dc..c71b0dc736 100644 --- a/src/Features/LightLimitFix.h +++ b/src/Features/LightLimitFix.h @@ -1,10 +1,15 @@ #pragma once #include "Buffer.h" +#include "LightLimitFix/ShadowCasterManager.h" #include "OverlayFeature.h" struct LightLimitFix : OverlayFeature { +private: + static constexpr uint32_t MAX_LIGHTS = 1024; + static constexpr uint32_t CLUSTER_MAX_LIGHTS = 128; + public: virtual inline std::string GetName() override { return "Light Limit Fix"; } virtual inline std::string GetShortName() override { return "LightLimitFix"; } @@ -14,13 +19,12 @@ struct LightLimitFix : OverlayFeature virtual std::pair> GetFeatureSummary() override { return { - "Light Limit Fix removes the vanilla game's 4-light limit, allowing unlimited dynamic lights in scenes.\n" - "This dramatically improves lighting quality and enables more realistic illumination scenarios.", + "Light Limit Fix removes the vanilla game's 4-light limit, allowing unlimited dynamic lights in scenes. 
" + "It also extends shadow support to all point and spot lights.", { "Removes 4-light limit", "Unlimited dynamic lights", - "Improved lighting quality", - "Enhanced visual realism", - "Enhanced visual realism" } + "Shadow support for point and spot lights", + "Improved lighting quality" } }; } @@ -55,9 +59,8 @@ struct LightLimitFix : OverlayFeature PositionOpt positionWS[2]; uint128_t roomFlags = uint32_t(0); stl::enumeration lightFlags; - uint32_t shadowMaskIndex = 0; - uint pad0; - uint pad1; + uint32_t shadowMapIndex = 0; + float2 pad0; }; STATIC_ASSERT_ALIGNAS_16(LightData); @@ -94,12 +97,31 @@ struct LightLimitFix : OverlayFeature struct alignas(16) PerFrame { + uint EnableContactShadows; + uint ContactShadowMaxSteps; + float ContactShadowMaxDistance; + float ContactShadowStride; + float ContactShadowThickness; + float ContactShadowDepthFade; + float ContactShadowMinIntensity; + uint32_t ShadowMapSlots; // total shadow map texture-array capacity + // Cluster config (computed) + uint ClusterSize[4]; + // Debug (last) uint EnableLightsVisualisation; uint LightsVisualisationMode; - float pad0[2]; - uint ClusterSize[4]; + uint pad0[2]; }; STATIC_ASSERT_ALIGNAS_16(PerFrame); + // Compile-time size lock catches CPU/GPU cbuffer layout drift. STATIC_ASSERT_ALIGNAS_16 + // only enforces the 16-byte alignment / multiple-of-16 contract that HLSL constant + // buffers require; it doesn't notice if a field is added, removed, or resized in a + // way that still happens to land on a 16-byte boundary. The shader-side mirror is + // SharedData::LightLimitFixSettings in package/Shaders/Common/SharedData.hlsli + // (embedded in the shared FeatureData cbuffer at b6), and must match this layout + // field-for-field. Update both sides when the layout changes, then bump this constant. 
+ static_assert(sizeof(PerFrame) == 64, + "LightLimitFix::PerFrame layout drifted -- update SharedData::LightLimitFixSettings in package/Shaders/Common/SharedData.hlsli to match, then update this assert."); PerFrame GetCommonBufferData(); @@ -118,8 +140,16 @@ struct LightLimitFix : OverlayFeature ConstantBuffer* strictLightDataCB = nullptr; int eyeCount = !REL::Module::IsVR() ? 1 : 2; - bool previousEnableLightsVisualisation = settings.EnableLightsVisualisation; - bool currentEnableLightsVisualisation = settings.EnableLightsVisualisation; + + // Debug-only visualization state. Lives on the instance rather than in + // Settings so it can't accidentally persist into a user's config: a + // shipped JSON with `EnableLightsVisualisation = true` would force every + // load to compile the heavier LLFDEBUG shader permutation. These reset to + // off on each session. + bool EnableLightsVisualisation = false; + uint LightsVisualisationMode = 0; + bool previousEnableLightsVisualisation = false; + bool currentEnableLightsVisualisation = false; ID3D11ComputeShader* clusterBuildingCS = nullptr; ID3D11ComputeShader* clusterCullingCS = nullptr; @@ -141,17 +171,36 @@ struct LightLimitFix : OverlayFeature bool wasEmpty = false; bool wasWorld = false; int previousRoomIndex = -1; - uint previousShadowBitMask = 0; Util::FrameChecker frameChecker; + // Point/spot shadow resources (t102, t103 -- t100/t101 reserved for Grass Collision) + // shadowLights is lazily allocated in CopyShadowLightData() since shadowMapSlots + // is not known until Deferred::SetupResources() runs (after Feature::SetupResources()). + Buffer* shadowLights = nullptr; + uint32_t shadowLightsCapacity = 0; + + // Per-frame shadow accounting (displayed in DrawSettings Statistics tree). 
+ uint32_t shadowLightCount = 0; // distinct lights processed (including dropped) + uint32_t shadowUnshadowedLightCount = 0; // lights that exceeded slot capacity + + /// Generate a text legend mapping each shadow-map slot index to its golden-ratio hue + /// and light type. Used for RenderDoc capture comments when mode 8 is active. + std::string BuildShadowSlotColorLegend() const; + virtual void SetupResources() override; virtual void RestoreDefaultSettings() override; + virtual void LoadSettings(json& o_json) override; + virtual void SaveSettings(json& o_json) override; virtual void DrawSettings() override; virtual void DrawOverlay() override; - virtual bool IsOverlayVisible() const override { return settings.EnableLightsVisualisation; } + virtual bool IsOverlayVisible() const override + { + return EnableLightsVisualisation || settings.ShowShadowOverlay || + ShadowCasterManager::HasSuppressedLights() || ShadowCasterManager::HasAnyOverrides(); + } virtual void PostPostLoad() override; virtual void DataLoaded() override; @@ -161,7 +210,11 @@ struct LightLimitFix : OverlayFeature void SetLightPosition(LightLimitFix::LightData& a_light, RE::NiPoint3 a_initialPosition, bool a_cached = true); void UpdateLights(); void UpdateStructure(); + virtual void EarlyPrepass() override; virtual void Prepass() override; + void CopyShadowLightData(); + + // Shadow rendering helpers (implemented in LightLimitFix/ShadowRenderer.cpp) static inline float3 Saturation(float3 color, float saturation); static inline bool IsValidLight(RE::BSLight* a_light); @@ -169,8 +222,31 @@ struct LightLimitFix : OverlayFeature struct Settings { - bool EnableLightsVisualisation = false; - uint LightsVisualisationMode = 0; + bool EnableContactShadows = false; + // Max raymarch steps at zero depth; linearly ramps to 0 at MaxDistance. + uint ContactShadowMaxSteps = 4; + // View-space depth at which contact shadows fade fully off. 
+ float ContactShadowMaxDistance = 1024.0f; + // Per-step march length in view-space units. Larger -> longer shadows, coarser detail. + float ContactShadowStride = 2.0f; + // Depth-delta multiplier for shadow onset (higher -> darker contact). + float ContactShadowThickness = 0.20f; + // Depth-delta multiplier for shadow falloff (higher -> shorter shadow). + float ContactShadowDepthFade = 0.05f; + // Skip contact shadows for CLUSTERED lights whose normalized distance falloff + // (1 - (lightDist/radius)^2) at the pixel is below this threshold. Strict + // lights always raymarch. 0 = never skip; 1 = always skip. + float ContactShadowMinIntensity = 0.25f; + + /// Show the shadow caster overlay (suppression / debug-override table) + /// independently of the visualization mode and suppression state. + /// Without this, the overlay only appeared when a light was suppressed + /// or visualisation was active — making it hard to access the overlay's + /// debug controls (cycle button, solo, hover-pulse) in the default state. + bool ShowShadowOverlay = false; + + // Shadow caster scheduling (ShadowCasterManager) + ShadowCasterManager::Settings ShadowSettings; }; uint clusterSize[3] = { 16 }; diff --git a/src/Features/LightLimitFix/ShadowCasterManager.cpp b/src/Features/LightLimitFix/ShadowCasterManager.cpp new file mode 100644 index 0000000000..d1bb5d8dc0 --- /dev/null +++ b/src/Features/LightLimitFix/ShadowCasterManager.cpp @@ -0,0 +1,5503 @@ +// ShadowCasterManager.cpp +// Shadow caster scheduling for LightLimitFix. +// +// Based on Intellightent by meh321 +// https://www.nexusmods.com/skyrimspecialedition/mods/172423 +// +// Ported and adapted for Community Shaders by the Community Shaders team with permission. 
+ +#include "ShadowCasterManager.h" +#include "../../Deferred.h" +#include "../../Globals.h" +#include "../../State.h" +#include "../../Utils/Game.h" +#include "../../Utils/UI.h" +#include "../Upscaling.h" +#include "../VR.h" + +#include + +namespace ShadowCasterManager +{ + // ========================================================================= + // Formula evaluator (exprtk) + // ========================================================================= + + struct FormulaWrapper + { + exprtk::expression<double> expression; + exprtk::parser<double> parser; + }; + + static double s_formulaParams[kFormulaParam_Max]; + static exprtk::symbol_table<double> s_symbolTable; + static bool s_formulaInited = false; + + struct FormulaVarInfo + { + const char* name; + const char* description; + int32_t index; + }; + + // Single authoritative list of formula variables. + // Drives both symbol table registration and the formula editor help text. + static constexpr FormulaVarInfo kFormulaVars[] = { + { "lightindex", "sequential index of this candidate light", kFormulaParam_LightIndex }, + { "lightintensity", "NiLight fade/intensity", kFormulaParam_LightIntensity }, + { "lightdistance", "camera-to-light distance (game units; 1 unit ~= 1.428 cm)", kFormulaParam_LightDistance }, + { "lightradius", "light radius/range (game units; 1 unit ~= 1.428 cm)", kFormulaParam_LightRadius }, + { "lightx", "light world X", kFormulaParam_LightX }, + { "lighty", "light world Y", kFormulaParam_LightY }, + { "lightz", "light world Z", kFormulaParam_LightZ }, + { "lightr", "diffuse red", kFormulaParam_LightR }, + { "lightg", "diffuse green", kFormulaParam_LightG }, + { "lightb", "diffuse blue", kFormulaParam_LightB }, + { "lightambientr", "ambient red", kFormulaParam_LightAmbientR }, + { "lightambientg", "ambient green", kFormulaParam_LightAmbientG }, + { "lightambientb", "ambient blue", kFormulaParam_LightAmbientB }, + { "lightchosenlastframe", "1 if this light held a slot last frame", 
kFormulaParam_LightChosenLastFrame }, + { "lightframessincerender", "frames since this light's slot was last actually rendered into the shadow atlas; 1e6 sentinel when never rendered or unassigned", kFormulaParam_LightFramesSinceRender }, + { "lightneverfades", "1 if lodFade disabled (permanent light)", kFormulaParam_LightNeverFades }, + { "lightportalstrict", "1 if portal-strict (always 1 for shadow casters)", kFormulaParam_LightPortalStrict }, + { "lightns", "1 if promoted from normal light (PromoteNormalToShadow)", kFormulaParam_LightNS }, + { "lightconverted", "1 if light is in the converted (non-shadow) slot range", kFormulaParam_LightConverted }, + { "lightdisplacement", "distance this light moved since its last shadow map render (game units; 0 when not yet tracked or in score formula)", kFormulaParam_LightDisplacement }, + { "playerlightdistance", "distance from the player character to the light (game units; falls back to lightdistance when player unavailable)", kFormulaParam_PlayerLightDistance }, + { "lightimportance", "contribution score: lum(diffuse*fade) * max(att_cam,att_plr) where att=(1-(dist/radius)^2)^2; 0 in score formula", kFormulaParam_LightImportance }, + { "lightisspot", "1 if this is a spot/frustum shadow light (BSShadowFrustumLight); 0 for omni / hemi / sun", kFormulaParam_LightIsSpot }, + { "lightspotvisible", "1 if the spot's cone plausibly reaches the camera frustum, 0 otherwise. 
Always 1 for non-spot lights so existing omni-only formulas are unaffected", kFormulaParam_LightSpotVisible }, + { "camerax", "camera world X", kFormulaParam_CameraX }, + { "cameray", "camera world Y", kFormulaParam_CameraY }, + { "cameraz", "camera world Z", kFormulaParam_CameraZ }, + { "isinterior", "1 in interior cells, 0 outdoors", kFormulaParam_IsInterior }, + { "timeofday", "in-game hour (0.0-24.0)", kFormulaParam_TimeOfDay }, + { "frametime", "EMA-smoothed frame time (ms)", kFormulaParam_FrameTime }, + { "frametarget", "90th-percentile recent frame time (ms) -- headroom ceiling", kFormulaParam_FrameTarget }, + { "stableframes", "consecutive frames EMA has been below frametarget", kFormulaParam_StableFrames }, + }; + + static void InitFormulaSystem() + { + if (s_formulaInited) + return; + s_formulaInited = true; + + memset(s_formulaParams, 0, sizeof(double) * kFormulaParam_Max); + + for (const auto& v : kFormulaVars) + s_symbolTable.add_variable(v.name, s_formulaParams[v.index]); + } + + FormulaHelper::FormulaHelper() : + _ptr(nullptr) { InitFormulaSystem(); } + + FormulaHelper::~FormulaHelper() + { + if (_ptr) + delete static_cast<FormulaWrapper*>(_ptr); + } + + bool FormulaHelper::Parse(const std::string& input) + { + if (_ptr) + return false; + auto* w = new FormulaWrapper(); + w->expression.register_symbol_table(s_symbolTable); + // Defer the _ptr assignment until compile succeeds. Otherwise a + // failed compile leaves the helper in a "parsed" state (Calculate + // would evaluate an uncompiled expression and the early-return + // guard above would block subsequent Parse retries). + if (!w->parser.compile(input, w->expression)) { + delete w; + return false; + } + _ptr = w; + return true; + } + + double FormulaHelper::Calculate() + { + auto* w = static_cast<FormulaWrapper*>(_ptr); + return w ? 
w->expression.value() : 0.0; + } + + bool FormulaHelper::Reparse(const std::string& input) + { + std::string err; + if (!Validate(input, err)) + return false; + if (_ptr) + delete static_cast<FormulaWrapper*>(_ptr); + _ptr = nullptr; + return Parse(input); + } + + bool FormulaHelper::Validate(const std::string& input, std::string& errorOut) + { + InitFormulaSystem(); + FormulaWrapper tmp; + tmp.expression.register_symbol_table(s_symbolTable); + if (tmp.parser.compile(input, tmp.expression)) + return true; + if (tmp.parser.error_count() > 0) + errorOut = tmp.parser.get_error(0).diagnostic; + else + errorOut = "Unknown parse error"; + return false; + } + + void FormulaHelper::SetParam(int32_t index, double value) { s_formulaParams[index] = value; } + double FormulaHelper::GetParam(int32_t index) { return s_formulaParams[index]; } + + // ========================================================================= + // Module-level state + // ========================================================================= + + /// Total LightEntry slots: sun (1) + shadow casters (≥4) + converted pool. + static int32_t LightContainerSize(const Settings& s) + { + return std::max(4, s.ShadowLightCount) + 1 + s.ConvertedShadowSlots; + } + + static Settings s_settings; + static LightContainer s_lights; + static BudgetTracker s_budget; + + // External conflict detection -- set during Install(), checked by Update() and DrawSettings(). + static bool s_externalConflict = false; + static std::string s_conflictMessage; + + // Per-frame count of kSHADOWMAPS slots claimed by the engine's focus + // shadow renderer (player + tracked NPCs, max 4). Read from + // FocusShadowActors.size each frame; values clamp to [0, 4]. Reserves + // the slot range [g_focusShadowBaseSlotIndex .. +s_focusShadowSlots) = + // [4 .. 4+count) from the point-light pool dynamically: zero focus + // actors means the full pool is available, four means slots 4-7 are + // off-limits. 
Point lights occupying a freshly-claimed slot are + // ejected at scheduling time and re-allocated to a free slot or + // converted as excess. + static int s_focusShadowSlots = 0; + + // Rolling redraw history (128-frame window) for DrawSettings statistics. + static constexpr int kRedrawHistorySize = 128; + static int32_t s_redrawHistory[kRedrawHistorySize] = {}; + static int32_t s_redrawHistoryPos = 0; + static int32_t s_redrawSum = 0; + + // Rolling budget-consumed history (same window) for DrawSettings statistics. + static int32_t s_budgetHistory[kRedrawHistorySize] = {}; + static int32_t s_budgetHistoryPos = 0; + static int64_t s_budgetSum = 0; + + // Frame-time tracking — used by Formula's frametime/frametarget/stableframes + // formula params, the shared frame-state diagnostic block, and stats UI. + // Persists in both Manual and Formula modes; the cost is one float per frame. + static constexpr int kFrameWindow = 120; // ~2 s at 60 fps + static float s_ftRing[kFrameWindow]{}; + static int s_ftHead = 0; + static int s_ftCount = 0; + static float s_ftEMA = 0.0f; + static int s_stableFrames = 0; + static float s_autoBudgetMs = 0.0f; // last computed budget; used by UI, scheduling, and stats + + // "Steady" state thresholds for the shared frame-state diagnostic. + // Mirror the old Auto-mode hysteresis values so the indicator behaves the + // same way users grew used to, just informational rather than driving control. + static constexpr float kFrameHeadroomDeadZoneMs = 0.3f; // |headroom| below this = "steady" + static constexpr float kFrameHeadroomSafetyMs = 0.5f; // headroom must clear this before "growing" + + // Budget tracking for UI display + static int32_t s_redrawnLightsThisFrame = 0; + static int32_t s_totalShadowLightsThisFrame = 0; + static uint32_t s_highImportanceLightCount = 0; + static float s_redrawnLightsSmoothed = 0.0f; // EMA-smoothed for stable UI display + + // Tracy diagnostic counters reset at the start of each scheduler frame. 
+ // Each candidate-handling path increments its bucket; values are emitted + // as TracyPlot at frame end so a capture can be queried to identify + // which paths fire under which budget/setting combinations. Cross- + // reference per-action ZoneText emissions (light pointer, reason) to + // identify *which* lights are hitting each path. + struct SchedDiagCounters + { + int candidates_total = 0; + int candidates_chosen = 0; + int candidates_excess = 0; + int candidates_invalid_camera = 0; + int candidates_invalid_portal = 0; + int candidates_invalid_frustum = 0; // sub-reason: outside camera frustum + int candidates_invalid_lod = 0; // sub-reason: lodDimmer zeroed (engine LOD fade) + int candidates_invalid_other = 0; // invalidCamera but neither frustum nor LOD flag + int converted_invalid = 0; // ConvertLight from c.invalidCamera path + int converted_excess = 0; // ConvertLight from c.excess path + int disabled_invalid = 0; // DisableLight from c.invalid path (portal/spot/no-convert) + int disabled_excess = 0; // DisableLight from c.excess path (spot/no-convert) + int reconciliation_clears = 0; // slot freed because light gone from activeShadowLights + int slots_in_use = 0; // sampled at frame end + int first_render_skips = 0; // chosen lights deferred from shadow set: no valid slice yet + }; + static SchedDiagCounters s_schedDiag; + + static float ComputeFrameTimePercentile90() + { + if (s_ftCount == 0) + return 16.67f; // fallback: 60 fps target + const int n = std::min(s_ftCount, kFrameWindow); + float tmp[kFrameWindow]; + std::copy(s_ftRing, s_ftRing + n, tmp); + const int idx = static_cast<int>(n * 0.9f); + std::nth_element(tmp, tmp + idx, tmp + n); + return tmp[idx]; + } + + // Maximum ShadowLightCount the installed infrastructure supports. + // Set once by Install() to the *requested* count; later refined by + // RefreshInstalledSlotCount() to reflect what the GPU actually allocated. + // Update() clamps the user-facing setting to this. 
+ static int32_t s_installedShadowLightCount; + + // What SCM asked the engine for. Equals settings.ShadowLightCount -- + // the sun lives in a separate texture (kSHADOWMAPS_ESRAM), so there's + // no +1 sun cascade slice in kSHADOWMAPS. Captured at Install so the + // post-allocation verification can detect VRAM-exhaustion fallbacks + // where the actual texture ends up smaller than requested. + static uint32_t s_requestedSlotCount = 0; + + // Total kSHADOWMAPS texture-array capacity *as actually allocated*. + // 0 until kSHADOWMAPS exists and we've read its real ArraySize back. + // Owned here (not in Deferred) because SCM is the only thing that + // modifies the engine's allocation request, and verification of that + // request is the same code path. Consumers (LLF cluster pipeline, + // SCM scheduler clamp, SCM UI) read via GetInstalledSlotCount(). + static uint32_t s_installedSlotCount = 0; + + // True once we've logged a verification result. Prevents spam if the + // SRV stays null forever (vanilla-disabled session) or oscillates. + static bool s_slotCountLogged = false; + + // Formula instances (allocated at Init if formula strings are non-empty) + static std::unique_ptr s_formulaScore; + static std::unique_ptr s_formulaRedrawInterval; + static std::unique_ptr s_formulaRedrawBudget; + + // Lights converted to normal (non-shadow) lights for diffuse-only rendering + struct ConvertedLight + { + RE::BSShadowLight* light; + bool isNS; + }; + static std::vector s_normalConvert; + static std::set s_shadowConvert; + + // User suppression set (lightKey = BSShadowLight pointer cast to uintptr_t). + // Persisted across light lifetimes so suppressing a torch survives the player + // leaving and returning to a cell. + static std::unordered_set s_suppressedLights; + + // Debugging overrides — see header docs for ClearAllOverrides / SetPinnedShadow / etc. 
+ // Declared up here (rather than next to s_shadowSlotInfos) because the scheduler + // reads s_pinShadow / s_pinConvert to bias candidate scoring, and that's compiled + // long before the table-rendering code. + static std::unordered_set s_pinShadow; ///< force chosen (top of score sort) + static std::unordered_set s_pinConvert; ///< force excess + ConvertLight + static uintptr_t s_soloLight = 0; ///< 0 = no solo + static uintptr_t s_hoverLightKey = 0; ///< transient (per table draw) + + // ========================================================================= + // Helpers for depth-target index globals + // SE: 14304EEE8 / AE: n/a (adjacent) / VR: 143180df0 + // ========================================================================= + static int32_t GetDepthTargetType() + { + static REL::RelocationID uid(524780, 388826); + return *reinterpret_cast(uid.address()); + } + + static int32_t GetDepthTargetSubIndex() + { + static REL::RelocationID uid(524780, 388826); + return *reinterpret_cast(uid.address() + 4); + } + + // ========================================================================= + // Hook implementations + // ========================================================================= + + // ------------------------------------------------------------------------- + // Expanded accumulated-lights array + // The game allocates a local array sized for 8 lights (with +1 sentinel). + // When using more than 8 shadow casters we extend RDI (SE) / RBX (AE/VR) + // which is the loop-end counter, and RDX (SE) which is the copy-end counter. 
+ // ------------------------------------------------------------------------- + static void Hook_AccumulatedLightsArray(CONTEXT& ctx) + { + int needed = (s_settings.ShadowLightCount + s_settings.ConvertedShadowSlots + 1) * 2; + int have = 10; // game default: (4+1)*2 + int extra = needed - have; + if (extra > 0) { + ctx.Rdi += extra; + // SE/VR latch EDX from EDI inside the patched 5 bytes (before the + // stub captures CONTEXT), so BSTArray::resize runs with the un- + // bumped count while the fill loop runs with the bumped one -- + // OOB heap write scaling with ShadowLightCount. AE inlines the + // resize and re-reads EDI after the stub, so RDX is dead here. + if (!REL::Module::IsAE()) + ctx.Rdx += extra; + } + } + + // ------------------------------------------------------------------------- + // Redirect depth-stencil-view creation to our extended arrays + // The game loops 0..7 creating depth stencil views and stores each pointer + // in a game-managed struct at R9. We redirect R9 to our own arrays so + // views >= 8 land in globals::features::llf::normalDepthBuffer / globals::features::llf::readOnlyDepthBuffer. + // ------------------------------------------------------------------------- + static void Hook_CreateNormalDepthBuffer(CONTEXT& ctx) + { + // R12 (SE/AE) or R13 (VR) holds a_target * 0x13; value 4*19=76 identifies + // the shadow-map depth target. RDI (SE) / RBX (AE/VR) is the loop index. 
+ if (REL::Relocate(ctx.R12, ctx.R12, ctx.R13) != 4 * 19) + return; + int idx = (int)REL::Relocate(ctx.Rdi, ctx.Rbx, ctx.Rbx); + ctx.R9 = reinterpret_cast(&globals::features::llf::normalDepthBuffer[idx]); + } + + static void Hook_CreateReadOnlyDepthBuffer(CONTEXT& ctx) + { + if (REL::Relocate(ctx.R12, ctx.R12, ctx.R13) != 4 * 19) + return; + int idx = (int)REL::Relocate(ctx.Rdi, ctx.Rbx, ctx.Rbx); + ctx.R9 = reinterpret_cast(&globals::features::llf::readOnlyDepthBuffer[idx]); + } + + // ------------------------------------------------------------------------- + // Copy first 8 views into the game's own DepthStencilData array + // Called after the creation loop finishes; syncs the game struct so existing + // code reading depthStencils[4].views[0..7] still works correctly. + // ------------------------------------------------------------------------- + static void Hook_SetupGameArray(CONTEXT& ctx) + { + if (REL::Relocate(ctx.R12, ctx.R12, ctx.R13) != 4 * 19) + return; + auto* renderer = reinterpret_cast(ctx.R15); + for (int i = 0; i < 8; i++) { + renderer->GetDepthStencilData().depthStencils[4].views[i] = reinterpret_cast(globals::features::llf::normalDepthBuffer[i]); + renderer->GetDepthStencilData().depthStencils[4].readOnlyViews[i] = reinterpret_cast(globals::features::llf::readOnlyDepthBuffer[i]); + } + } + + // ------------------------------------------------------------------------- + // Redirect depth-buffer selection at draw time + // When the active depth target is type 4 (shadow maps), route sub-index + // lookups through our extended arrays instead of the game struct. + // Hook #1: renderer in R8, result -> RBX. + // ------------------------------------------------------------------------- + static void Hook_SelectDepthBuffer1(CONTEXT& ctx) + { + auto* data = reinterpret_cast(ctx.R8); + int type = GetDepthTargetType(); + int sub = GetDepthTargetSubIndex(); + + if (type == 4) { + ctx.Rbx = data->readOnlyDepth ? 
reinterpret_cast(globals::features::llf::readOnlyDepthBuffer[sub]) : reinterpret_cast(globals::features::llf::normalDepthBuffer[sub]); + } else { + ctx.Rbx = data->readOnlyDepth ? reinterpret_cast(RE::BSGraphics::Renderer::GetSingleton()->GetDepthStencilData().depthStencils[type].readOnlyViews[sub]) : reinterpret_cast(RE::BSGraphics::Renderer::GetSingleton()->GetDepthStencilData().depthStencils[type].views[sub]); + } + } + + // Hook #2: VR: renderer in R14, result -> RBP; SE/AE: renderer in RBP, result -> R14. + static void Hook_SelectDepthBuffer2(CONTEXT& ctx) + { + bool isVR = REL::Module::IsVR(); + bool readOnly = isVR ? reinterpret_cast(ctx.R14)->GetRuntimeData().readOnlyDepth : reinterpret_cast(ctx.Rbp)->GetRuntimeData().readOnlyDepth; + + int type = GetDepthTargetType(); + int sub = GetDepthTargetSubIndex(); + + DWORD64 result; + if (type == 4) { + result = readOnly ? reinterpret_cast(globals::features::llf::readOnlyDepthBuffer[sub]) : reinterpret_cast(globals::features::llf::normalDepthBuffer[sub]); + } else { + result = readOnly ? 
reinterpret_cast(RE::BSGraphics::Renderer::GetSingleton()->GetDepthStencilData().depthStencils[type].readOnlyViews[sub]) : reinterpret_cast(RE::BSGraphics::Renderer::GetSingleton()->GetDepthStencilData().depthStencils[type].views[sub]); + } + + if (isVR) + ctx.Rbp = result; + else + ctx.R14 = result; + } + + // ------------------------------------------------------------------------- + // Release extended depth buffers at renderer shutdown + // ------------------------------------------------------------------------- + static void ReleaseExtendedDepthBuffers(int shadowCount) + { + for (int i = 8; i < shadowCount; i++) { + if (globals::features::llf::normalDepthBuffer[i]) { + reinterpret_cast(globals::features::llf::normalDepthBuffer[i])->Release(); + globals::features::llf::normalDepthBuffer[i] = nullptr; + } + if (globals::features::llf::readOnlyDepthBuffer[i]) { + reinterpret_cast(globals::features::llf::readOnlyDepthBuffer[i])->Release(); + globals::features::llf::readOnlyDepthBuffer[i] = nullptr; + } + } + } + + static void Hook_DeleteDepthBuffers_SE(CONTEXT& ctx) + { + // Only fire when RBX points at depthStencils[4], not at other delete calls. + auto* data = reinterpret_cast(ctx.Rbx); + if (data == &RE::BSGraphics::Renderer::GetSingleton()->GetDepthStencilData().depthStencils[4]) + ReleaseExtendedDepthBuffers(s_settings.ShadowLightCount); + } + + static void Hook_DeleteDepthBuffers_AE(CONTEXT& /*ctx*/) + { + ReleaseExtendedDepthBuffers(s_settings.ShadowLightCount); + } + + // ------------------------------------------------------------------------- + // Force each light to use its assigned shadow map slot + // RenderCascade would otherwise recalculate a slot index from a global + // counter, causing lights that weren't re-rendered this frame to corrupt + // each other's shadow maps. + // SE: light pointer in R15, slot index out in RSI. + // VR: light pointer in R14, slot index out in RDX. 
+ // ------------------------------------------------------------------------- + static void Hook_OverwriteShadowMapIndex(CONTEXT& ctx) + { + // Enabled is a boot-time gate (see Init early-return) -- this + // hook is only installed when SCM is enabled at boot, so it + // runs unconditionally per-frame from there. Toggling Enabled + // off at runtime no longer affects the hook; restart is the + // only safe way to revert. See Hook_CalculateActiveShadowCasters + // comment for the crash rationale. + + auto* light = reinterpret_cast(REL::Relocate(ctx.R15, ctx.R15, ctx.R14)); + int32_t idx = s_lights.FindLight(light, s_settings.ShadowLightCount); + if (idx < 0) + idx = 0; // should not happen; fail-safe to slot 0 + // This hook runs inside BSShadowParabolicLight::RenderCascade's + // `renderTarget == kNONE` block, so it only fires for point/spot + // lights (the sun's RenderShadowmaps presets renderTarget to 2/3/4 + // before each call, skipping the block). FindLight must therefore + // cover the same range as FindFreeIndex; a mismatch means a light + // silently gets idx=0 and corrupts the slot at index 0. + + if (REL::Module::IsVR()) + ctx.Rdx = static_cast(idx); + else + ctx.Rsi = static_cast(idx); + } + + // ------------------------------------------------------------------------- + // Screen-space shadow-mask pass wrapper + // ------------------------------------------------------------------------- + // + // Vanilla wires Main::RenderShadowmasks (100422/107140) to call + // RenderShadowLightsWithUtilityShader (100423/107141) which: + // - binds kSHADOW_MASK as RT, clears it, + // - walks ssn->shadowLightsAccum[] and for each entry emits a full-screen + // BSUtilityShader pass that samples the cascade / parabolic depth maps + // and writes the mask. + // + // The inner loop indexes a hard-coded 4-entry table (DAT_141861380, + // per-slot m_AlphaBlendWriteMode) by BSShadowLight::maskIndex (offset 0x520 + // SE/AE, 0x580 VR -- see CommonLib BSShadowLight.h). 
Vanilla only ever + // populates 4 kSHADOWMAPS slices so maskIndex stays in [0..3] and the index + // is safe. + // + // SLF's extended scheduler assigns maskIndex up to ShadowLightCount-1 + // (LightContainer / EnableLight; see ShadowField(e.Light, maskIndex) = + // static_cast(slot) below). For any slot >= 4, the engine's + // MOV [R15 + RDX*0x4] OOB-reads garbage out of DAT_141861380 (next dword is + // 0x3F7FFFDE, a float bit pattern) which lands in + // g_RendererShadowState.m_AlphaBlendWriteMode -> undefined D3D state. + // + // Previous fix nopped out the CALL site entirely ("Hook_DisableColorMask", + // misnamed: the patched call IS RenderShadowLightsWithUtilityShader, NOT a + // color-mask call -- verified via Ghidra on SE 1.5.97 (+0x90 -> 0x1412e3b80) + // and SkyrimVR (+0x9E -> 0x141323740), matching RelocationID 100423/107141). + // That killed the screen-space mask globally, removing sun shadows and + // brightening the scene because deferred lighting sampled an undisturbed + // (effectively fully-lit) mask RT. RenderDoc evidence: empty "Shadowmasks" + // engine marker; cascade depth maps still rendered upstream but never + // consumed. + // + // This wrapper restores vanilla behaviour for the first 4 cascade slices + // and silently elides any extended-slot entries by writing a null sentinel + // into shadowLightsAccum at the cutoff. The engine's + // GetShadowCasterLightArrayEntry terminates when the slot pointer is null, + // so the loop stops cleanly without ever indexing DAT_141861380 for slot + // >= 4. The saved pointer is restored after the call. + // + // Under LIGHT_LIMIT_FIX only the mask's R channel (sun cascades) is read by + // the lighting shader (Lighting.hlsl:2516 shadowColor.x); G/B/A and any + // slot >= 4 are handled by LLF's cluster pipeline sampling kSHADOWMAPS + // directly. Restoring the mask therefore fixes the sun-shadow regression + // without interfering with extended shadow casters. 
+ struct Hook_RenderShadowLightsWithUtilityShader + { + // Skip vanilla entirely. + // + // Vanilla's RenderShadowLightsWithUtilityShader iterates + // shadowLightsAccum and emits a full-screen pass per entry, indexing + // a 4-entry per-slot blend-mode table (DAT_141861380) by each light's + // maskIndex (BSShadowLight+0x520). Three failure modes were observed + // with SLF's scheduling: + // 1. Extended slots (maskIndex >= 4) OOB-read the table. + // 2. Vanilla advances `uVar7 += light->shadowMapCount` and reads + // `shadowLightsAccum[uVar7]`; with a 3-cascade sun and + // accum.size() < 4, the next read is past the array buffer. + // Heap garbage that looks like a BSShadowLight* gets + // dereferenced on [+0x520]. Verified crashes: + // crash-2026-05-25-15-16-25.log RDX=0x3B1F3023 + // crash-2026-05-25-15-26-59.log RDX=0x3AA96F53 + // crash-2026-05-25-15-28-04.log RDX=0x3A4A3190 + // crash-2026-05-25-15-36-15.log RDX=0x3A4B11F5 + // all at 107141+0x319. + // 3. shadowLightsAccum entries created by GameAccumulate() (engine + // focus path) bypass SLF's maskIndex assignment in EnableLight, + // so maskIndex stays at uninitialized memory. + // + // Trying to bound vanilla's iteration safely required defending all + // three modes (BSTArray padding, maskIndex clamp, slice-count cap) + // and one of them kept slipping through. The simplest robust answer + // is to skip vanilla entirely. + // + // Under LIGHT_LIMIT_FIX (this fork's shipping configuration) the + // screen-space mask is not on the sun-shadow consumer path: + // - Lighting.hlsl:2515 uses LightLimitFix::GetDirectionalShadow, + // which samples DirectionalShadowCascades (t99) directly. + // - The cluster loop uses LightLimitFix::GetShadowLightShadow, + // which samples kSHADOWMAPS slices directly. + // shadowColor.x is consulted only as a fallback past the cascade + // range and during the !LIGHT_LIMIT_FIX vanilla path. Dropping the + // mask therefore loses no functionality LLF provides. 
+ // + // Critically, unlike the previous Hook_DisableColorMask, we do NOT + // call ReturnShadowmaps. That side-effect cleared shadowmap- + // Descriptors and broke Deferred::CopyShadowLightData's cascade + // matrix upload, which is what produced the original "no sun + // shadow + scene brighter" symptom. + static void thunk() + { + (void)func; // suppress "unused" warning while keeping the relocation + } + static inline REL::Relocation func; + }; + + // ========================================================================= + // LightContainer methods + // ========================================================================= + + // Engine writes focus shadows to kSHADOWMAPS slots + // [kFocusShadowBaseSlotIndex .. +s_focusShadowSlots) (DAT_141867188 = 4 + // in vanilla, max 4 actors). Two predicates separate "could be claimed" + // from "currently claimed": + // IsFocusShadowReservableSlot(i) -- in the full [4..8) range that + // focus might use. FindFreeIndex treats these as last-resort so an + // actor appearance rarely needs to evict anything. + // IsFocusShadowSlot(i) -- currently held by an active focus actor; + // never allocated, and any point light here gets ejected at + // scheduling time. 
+ static constexpr int32_t kFocusShadowBaseSlotIndex = 4; + static constexpr int32_t kFocusShadowMaxSlots = 4; + + static inline bool IsFocusShadowReservableSlot(int32_t i) + { + return i >= kFocusShadowBaseSlotIndex && i < kFocusShadowBaseSlotIndex + kFocusShadowMaxSlots; + } + + static inline bool IsFocusShadowSlot(int32_t i) + { + return i >= kFocusShadowBaseSlotIndex && i < kFocusShadowBaseSlotIndex + s_focusShadowSlots; + } + + int32_t LightContainer::FindFreeIndex(bool shadowSlot, int32_t shadowCount, int32_t convertCount) const + { + // Pool layout when Sun=true: [0]=sun, [1..shadowCount]=point lights, [shadowCount+1..]=converted + // Sun=false: [0..shadowCount-1]=point lights, [shadowCount..]=converted + // + // Slot 0 is reserved for the sun pointer when present (sunOff=1) so + // FindLight can locate the sun. The sun renders to kSHADOWMAPS_ESRAM + // (target 2), not kSHADOWMAPS (target 4), so slot 0 in our pool maps + // to a kSHADOWMAPS slice the sun never writes -- safe to leave as a + // bookkeeping placeholder. + // + // Slots 4..7 are the engine's focus shadow range. Two-pass allocation: + // preferred slots first (avoiding 4..7 entirely so an actor appearance + // rarely needs to evict), then 4..7 as fallback for slots not + // currently claimed by an active focus actor. + const int32_t sunOff = Sun ? 1 : 0; + const int32_t shadowEnd = sunOff + shadowCount; + auto scanShadow = [&](auto reservablePolicy) -> int32_t { + for (int i = sunOff; i < shadowEnd; i++) { + if (reservablePolicy(i)) + continue; + if (IsFocusShadowSlot(i)) + continue; // actively held by focus right now + if (!Lights[i].Light) + return i; + } + return -1; + }; + if (shadowSlot) { + // First pass: skip the entire focus-reservable range so a focus + // actor appearance ideally finds those slots already empty. 
+ if (int32_t i = scanShadow([](int32_t k) { return IsFocusShadowReservableSlot(k); }); i >= 0) + return i; + // Fallback: fill the unclaimed focus-reservable slots from the + // top down. Engine packs FocusShadowActors densely from slot + // kFocusShadowBaseSlotIndex upward (player first, then tracked + // NPCs in priority order), so slot 7 is the LAST to be claimed + // as focus count grows. Placing a point light there has the + // lowest probability of being evicted later. + for (int32_t i = kFocusShadowBaseSlotIndex + kFocusShadowMaxSlots - 1; i >= kFocusShadowBaseSlotIndex; --i) { + if (i >= shadowEnd) + continue; + if (IsFocusShadowSlot(i)) + continue; + if (!Lights[i].Light) + return i; + } + return -1; + } + // Converted lights live past the shadow range and never collide with + // focus slots (focus base is 4, converted base is >= sunOff + shadowCount > 4). + const int32_t convBase = shadowEnd; + for (int i = convBase; i < convBase + convertCount; i++) { + if (!Lights[i].Light) + return i; + } + return -1; + } + + int32_t LightContainer::FindLight(RE::BSShadowLight* light, int32_t shadowCount) const + { + // Search the full allocation range. Hook_OverwriteShadowMapIndex calls + // in for the sun too, so the sun pointer (in slot 0 when Sun=true) must + // be findable. The fallback to idx=0 on -1 silently corrupts slot 0, + // so the search range must match FindFreeIndex's allocation range. + const int32_t sunOff = Sun ? 1 : 0; + const int32_t maxIdx = sunOff + shadowCount; + for (int i = 0; i < maxIdx; i++) + if (Lights[i].Light == light) + return i; + return -1; + } + + std::uint32_t MaxShadowAccumIterationBound() + { + // Each entry advances idx by its shadowMapCount. Worst-case per + // light is the directional sun's cascade count (iNumSplits:Display, + // INI-capped at 3 by ShadowmapRasterizerFix). 4 is a defensive + // upper bound. 
With ShadowLightCount user-capped at 127 plus one + // sun bookkeeping slot, the walked index never exceeds + // (1 + 127) * 4 = 512; add a margin so a transient mismatch + // between live settings and an already-populated engine array + // doesn't tripwire iteration. + constexpr std::uint32_t kCascadesPerLight = 4; + constexpr std::uint32_t kMargin = 16; + const std::uint32_t lights = static_cast(std::max(1, s_settings.ShadowLightCount)); + return (lights + 1) * kCascadesPerLight + kMargin; + } + + // Verdict for a candidate shadow-array footprint vs the DXGI budget. + // "tight" = free VRAM below 512 MB or shadow array > 25% of budget. + // "over" = free VRAM below 128 MB or shadow array > 50% of budget. + // Driven by free headroom rather than shadow share because a small + // array next to a tight budget is just as risky as a huge one in a + // roomy budget. + struct VRAMVerdict + { + bool tight = false; + bool over = false; + ImVec4 colour{ 0.55f, 0.85f, 0.55f, 1 }; // green by default + }; + static VRAMVerdict EvaluateVRAMVerdict(std::uint64_t shadowBytes, std::uint64_t freeBytes, std::uint64_t budgetBytes) + { + constexpr std::uint64_t kTightFree = 512ull * 1024 * 1024; + constexpr std::uint64_t kOverFree = 128ull * 1024 * 1024; + VRAMVerdict v; + v.tight = freeBytes < kTightFree || shadowBytes * 4 > budgetBytes; + v.over = freeBytes < kOverFree || shadowBytes * 2 > budgetBytes; + v.colour = v.over ? ImVec4(0.95f, 0.35f, 0.35f, 1) : + v.tight ? ImVec4(0.95f, 0.85f, 0.25f, 1) : + ImVec4(0.55f, 0.85f, 0.55f, 1); + return v; + } + + // Reads kSHADOWMAPS's underlying Texture2D desc, bypassing the SRV's + // ViewDimension. Skyrim creates the SRV with a non-array view dimension + // even though the resource itself is a Texture2DArray, so reading + // `desc.Texture2DArray.ArraySize` from the SRV desc returns 0; only the + // texture's own ArraySize is reliable. Returns false on any failure + // stage; out param is left untouched. 
+ static bool TryReadShadowTextureDesc(D3D11_TEXTURE2D_DESC& out) + { + auto* renderer = globals::game::renderer; + if (!renderer) + return false; + auto* srv = renderer->GetDepthStencilData() + .depthStencils[RE::RENDER_TARGET_DEPTHSTENCIL::kSHADOWMAPS] + .depthSRV; + if (!srv) + return false; + winrt::com_ptr<ID3D11Resource> resource; + srv->GetResource(resource.put()); + if (!resource) + return false; + winrt::com_ptr<ID3D11Texture2D> tex; + if (FAILED(resource->QueryInterface(IID_PPV_ARGS(tex.put())))) + return false; + D3D11_TEXTURE2D_DESC desc{}; + tex->GetDesc(&desc); + if (desc.ArraySize == 0) + return false; + out = desc; + return true; + } + + // Lazily verifies that the engine's actual kSHADOWMAPS slice count + // matches what SCM patched in. Self-healing: bails until the texture + // is readable, then early-returns. Cross-checks against the requested + // count and clamps the scheduler on mismatch so out-of-bounds slice + // indexing can't occur after a VRAM-exhaustion fallback. + void RefreshInstalledSlotCount() + { + if (s_installedSlotCount > 0) + return; + + D3D11_TEXTURE2D_DESC desc{}; + if (!TryReadShadowTextureDesc(desc)) + return; + + uint32_t actual = desc.ArraySize; + s_installedSlotCount = actual; + if (s_slotCountLogged) + return; + s_slotCountLogged = true; + if (s_requestedSlotCount && actual != s_requestedSlotCount) { + logger::warn( + "[SCM] Requested {} kSHADOWMAPS slots, GPU allocated {} -- " + "clamping scheduler to the actual count.", + s_requestedSlotCount, actual); + s_installedShadowLightCount = std::min(s_installedShadowLightCount, static_cast<int32_t>(actual)); + } else { + logger::info("[SCM] kSHADOWMAPS array verified: {} slots allocated", actual); + } + } + + uint32_t GetInstalledSlotCount() + { + // Lazy-refresh; cheap once verified. Fall back to the requested + // count when verification can't complete -- a non-zero slot count + // is needed for the cluster pipeline to engage shadow handling. 
+ // Out-of-bounds slice indexes are hardware-clamped in D3D11, so a + // transient over-estimate yields stale shadow data rather than a + // crash. + RefreshInstalledSlotCount(); + return s_installedSlotCount > 0 ? s_installedSlotCount : s_requestedSlotCount; + } + + // Resolution actually used to allocate kSHADOWMAPS this session. Captured + // lazily from the real D3D11 texture geometry the first time it becomes + // readable -- NOT from the RE::Setting at Install() time. The engine's + // SkyrimPrefs.ini load happens after our PostPostLoad hook, so a snapshot + // at Install() catches the hardcoded default (e.g. 2048) before the + // user's INI value (e.g. 4096) is applied. Reading from the texture is + // the source of truth either way -- it reflects what was actually + // allocated, regardless of where the setting ended up. + static std::int32_t s_initialShadowMapResolution = 0; + + // kSHADOWMAPS footprint = w*h*bytesPerPixel*ArraySize. Per-slice cost + // is 64 MB at 4K D32_FLOAT; arrays grow linearly with ShadowLightCount. + // Returned info.valid is false only when both the DXGI budget query + // and the texture/INI fallback fail (rare). + VRAMInfo GetVRAMInfo() + { + VRAMInfo info{}; + + // DXGI budget. Prefer Menu's cached adapter; fall back to the + // device-derived path before Menu::Init() has run. 
+ winrt::com_ptr<IDXGIAdapter3> adapter3; + if (auto* menu = Menu::GetSingleton()) + adapter3 = menu->GetDXGIAdapter3(); + if (!adapter3 && globals::d3d::device) { + winrt::com_ptr<IDXGIDevice> dxgiDevice; + if (SUCCEEDED(globals::d3d::device->QueryInterface(dxgiDevice.put()))) { + winrt::com_ptr<IDXGIAdapter> dxgiAdapter; + if (SUCCEEDED(dxgiDevice->GetAdapter(dxgiAdapter.put()))) + dxgiAdapter->QueryInterface(adapter3.put()); + } + } + if (adapter3) { + DXGI_QUERY_VIDEO_MEMORY_INFO vmem{}; + HRESULT hr = adapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &vmem); + if (SUCCEEDED(hr) && vmem.Budget > 0) { + info.currentUsageBytes = vmem.CurrentUsage; + info.budgetBytes = vmem.Budget; + } + } + + // kSHADOWMAPS geometry from the underlying texture (when readable). + D3D11_TEXTURE2D_DESC desc{}; + if (TryReadShadowTextureDesc(desc)) { + info.shadowWidth = desc.Width; + info.shadowHeight = desc.Height; + info.shadowSlices = desc.ArraySize; + // Latch the texture's allocated resolution as the canonical + // "what this session is using" value -- this is what the UI's + // restart-required indicator compares against. Once latched it + // doesn't move for the session (kSHADOWMAPS is allocated once). + if (s_initialShadowMapResolution == 0) + s_initialShadowMapResolution = static_cast<std::int32_t>(desc.Width); + // Default to 4 B/pixel (R32_TYPELESS / D32_FLOAT -- the format + // Skyrim ships with) and override for stencil-packed variants. 
+ std::uint32_t bytesPerPixel = 4; + switch (desc.Format) { + case DXGI_FORMAT_R32G8X24_TYPELESS: + case DXGI_FORMAT_D32_FLOAT_S8X24_UINT: + case DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS: + case DXGI_FORMAT_X32_TYPELESS_G8X24_UINT: + bytesPerPixel = 8; + break; + case DXGI_FORMAT_R16_TYPELESS: + case DXGI_FORMAT_D16_UNORM: + case DXGI_FORMAT_R16_UNORM: + bytesPerPixel = 2; + break; + default: + break; // 4 B fallback covers R24G8 and R32 families + } + info.bytesPerSlice = info.shadowWidth * info.shadowHeight * bytesPerPixel; + info.shadowArrayBytes = static_cast(info.bytesPerSlice) * info.shadowSlices; + } + + // INI-based fallback when the texture isn't readable yet (e.g. + // main menu, before BSShaderRenderTargets_Create). Resolution + // from SkyrimPrefs.ini, slot count from settings; assume the + // stock D32_FLOAT format (4 B/pixel). + if (info.bytesPerSlice == 0) { + std::uint32_t res = 4096; // SkyrimPrefs.ini default + if (auto* prefColl = RE::INIPrefSettingCollection::GetSingleton()) { + if (auto* setting = prefColl->GetSetting("iShadowMapResolution:Display")) { + int v = setting->GetInteger(); + if (v > 0) + res = static_cast(v); + } + } + info.shadowWidth = res; + info.shadowHeight = res; + info.bytesPerSlice = info.shadowWidth * info.shadowHeight * 4; + info.shadowSlices = static_cast(s_settings.ShadowLightCount); + info.shadowArrayBytes = static_cast(info.bytesPerSlice) * info.shadowSlices; + } + + // Budget and per-slice are independent so a partial answer still + // renders (budget alone shows VRAM headroom, per-slice alone shows + // projection from the INI fallback). + info.valid = info.budgetBytes > 0 || info.bytesPerSlice > 0; + + // One-shot log on first valid observation. Any caller trips it. + static bool s_loggedFirstValid = false; + if (info.valid && !s_loggedFirstValid) { + s_loggedFirstValid = true; + const std::uint64_t freeBytes = info.budgetBytes > info.currentUsageBytes ? 
info.budgetBytes - info.currentUsageBytes : 0; + const float arrayMB = static_cast(info.shadowArrayBytes) / (1024.f * 1024.f); + const float perSliceMB = static_cast(info.bytesPerSlice) / (1024.f * 1024.f); + const float budgetMB = static_cast(info.budgetBytes) / (1024.f * 1024.f); + const float usageMB = static_cast(info.currentUsageBytes) / (1024.f * 1024.f); + logger::info( + "[SCM] kSHADOWMAPS {}x{} x {} slices, {:.2f} MB/slice -> {:.1f} MB " + "(VRAM {:.1f}/{:.1f} MB used, ShadowLightCount={})", + info.shadowWidth, info.shadowHeight, info.shadowSlices, + perSliceMB, arrayMB, usageMB, budgetMB, s_settings.ShadowLightCount); + if (info.shadowArrayBytes > freeBytes) { + logger::warn( + "[SCM] Shadow texture array ({:.1f} MB) exceeds remaining VRAM budget " + "({:.1f} MB). Lower Shadow Light Count or iShadowMapResolution if you " + "see stutter or driver hitches.", + arrayMB, static_cast(freeBytes) / (1024.f * 1024.f)); + } + } + + return info; + } + + std::uint64_t ProjectShadowArrayBytes(std::uint32_t sliceCount) + { + auto info = GetVRAMInfo(); + if (!info.valid) + return 0; + return static_cast(info.bytesPerSlice) * sliceCount; + } + + // ========================================================================= + // BudgetEntry / BudgetTracker methods + // ========================================================================= + + static int64_t GetPerfCounter() + { + LARGE_INTEGER counter; + QueryPerformanceCounter(&counter); + + int64_t t = (int64_t)counter.QuadPart; + + static int64_t freq = 0; + if (freq == 0) { + LARGE_INTEGER f; + QueryPerformanceFrequency(&f); + freq = f.QuadPart / 1000000; + } + + return t / freq; + } + + void BudgetEntry::BeginStep(int32_t /*step*/) + { + _startTime = GetPerfCounter(); + } + + void BudgetEntry::EndStep(int32_t step, int32_t helperCounter) + { + int64_t diff = GetPerfCounter() - _startTime; + + if (step == 0) { + Progress = static_cast(std::min(diff, (int64_t)0xFFFFFFFF)); + } else if (step == 1) { + diff += 
Progress; + int32_t ix = TrackedCount % kBudgetWindowSize; + Current -= Tracked[ix]; + Tracked[ix] = static_cast(std::min(diff, (int64_t)0xFFFFFFFF)); + Current += Tracked[ix]; + TrackedCount++; + LastTrackedHelper = helperCounter; + } + } + + bool BudgetEntry::IsExpired(int32_t helperCounter) const + { + return LastTrackedHelper < 0 || (helperCounter - LastTrackedHelper) >= 600; + } + + void BudgetTracker::Begin(int32_t step) + { + if (step == 0) { + _counter++; + // Amortise the GC: a periodic full-map walk that freed every + // expired BudgetEntry in one frame caused ~10s-cadence stutters + // (300 frames at 30 fps) because the heap freed dozens of + // unique_ptr back to back, taking a heap lock for + // each. Run incrementally every 30 frames (~0.5s at 60fps) and + // cap erasures per call so the cost spreads across many frames + // instead of spiking once. + if ((_counter % 30) == 0) + CleanupExpired(); + } + } + + void BudgetTracker::BeginLight(RE::BSShadowLight* light, int32_t step) + { + uint64_t key = reinterpret_cast(light); + auto& e = _map[key]; + if (!e) { + e = std::make_unique(); + e->Key = key; + } + e->BeginStep(step); + } + + void BudgetTracker::EndLight(RE::BSShadowLight* light, int32_t step) + { + uint64_t key = reinterpret_cast(light); + auto it = _map.find(key); + if (it == _map.end()) + return; + it->second->EndStep(step, _counter); + } + + int32_t BudgetTracker::GetCost(RE::BSShadowLight* light) const + { + uint64_t key = reinterpret_cast(light); + auto it = _map.find(key); + if (it == _map.end() || it->second->TrackedCount == 0) + return GetAverageCostUs(); // unknown light: fall back to fleet average + int32_t n = std::min(kBudgetWindowSize, it->second->TrackedCount); + return it->second->Current / std::max(1, n); + } + + void BudgetTracker::CleanupExpired() + { + ZoneScopedN("SCM::BudgetTracker::CleanupExpired"); + // Hard cap on erasures per call so a wave of expirations (e.g. 
the + // player crossed a cell boundary 600 frames ago and dozens of + // shadow lights all expire on the same tick) spreads its heap-free + // cost across many frames instead of stalling one frame. With + // kMaxErasePerCall=4 and Begin() calling this every 30 frames, the + // tracker can drain ~8 expired entries per second steady-state and + // up to 4 per call worst-case -- enough to keep the map bounded + // in practice without the periodic stutter. + constexpr size_t kMaxErasePerCall = 4; + size_t erased = 0; + for (auto it = _map.begin(); it != _map.end() && erased < kMaxErasePerCall;) { + if (it->second->IsExpired(_counter)) { + it = _map.erase(it); + ++erased; + } else { + ++it; + } + } + } + + int32_t BudgetTracker::GetAverageCostUs() const + { + int64_t sum = 0; + int32_t count = 0; + for (auto& [k, entry] : _map) { + int32_t n = std::min(kBudgetWindowSize, entry->TrackedCount); + if (n == 0) + continue; + sum += entry->Current / std::max(1, n); + count++; + } + return count > 0 ? static_cast(sum / count) : 0; + } + + // ========================================================================= + // Game accessor helpers + // + // Thin wrappers around game globals and engine functions. + // All REL::RelocationID pairs are (SE_id, AE_id). + // VR addresses verified against the VR address library CSV. + // ========================================================================= + + // ---------- globals ---------- + + static RE::ShadowSceneNode* GetShadowSceneNode() + { + static REL::RelocationID uid(513211, 390951); + return *reinterpret_cast(uid.address()); + } + + static RE::NiCamera* GetWorldCamera() + { + // world scene graph -> camera + static REL::RelocationID uid(528087, 415032); + auto* sg = *reinterpret_cast(uid.address()); + return sg ? 
sg->GetRuntimeData().camera.get() : nullptr; + } + + static bool GetSunBool1() + { + static REL::RelocationID uid(513201, 390932); + return *reinterpret_cast<bool*>(uid.address()); + } + // Engine's per-frame count of focus shadow actors (player + tracked NPCs); + // max is iNumFocusShadow:Display (default 4). The engine renders one + // high-resolution shadow per entry into kSHADOWMAPS slots + // [g_focusShadowBaseSlotIndex .. +count). Used by the scheduler to + // dynamically reserve that range out of the point-light pool. + static int GetFocusShadowActorCount() + { + static REL::RelocationID uid(527703, 414625); + return *reinterpret_cast<int*>(uid.address()); + } + static bool GetSunBool2() + { + static REL::RelocationID uid(528095, 415040); + return *reinterpret_cast<bool*>(uid.address()); + } + + static bool* GetFocusShadowSelected() + { + static REL::RelocationID uid(528096, 415041); + return reinterpret_cast<bool*>(uid.address()); + } + static uint64_t* GetSunPtr() + { + static REL::RelocationID uid(528315, 415267); + return reinterpret_cast<uint64_t*>(uid.address()); + } + + // Current accumulated shadow slot (used as Accumulate() first arg). + static uint32_t* GetAccumLightSlot() + { + static REL::RelocationID uid(528091, 415036); + return reinterpret_cast<uint32_t*>(uid.address()); + } + // Running mask index counter (incremented each time a light is slotted). + static uint32_t* GetMaskIndex() + { + static REL::RelocationID uid(528091, 415036); + return reinterpret_cast<uint32_t*>(uid.address() + 4); + } + // Active shadow caster bitmask (ORed per slot). + static uint32_t* GetShadowMask() + { + static REL::RelocationID uid(528093, 415038); + return reinterpret_cast<uint32_t*>(uid.address()); + } + // Written back to the game at the end of scheduling. 
+ static uint32_t* GetFrameLightCount() + { + static REL::RelocationID uid(528090, 415035); + return reinterpret_cast<uint32_t*>(uid.address()); + } + + // VR-only globals + static bool GetVRDrawShadows() + { + static REL::Offset uid{ 0x1ed3cb0 }; + return *reinterpret_cast<bool*>(uid.address()); + } + static bool GetVRAccumFirst() + { + static REL::Offset uid{ 0x1ed4118 }; + return *reinterpret_cast<bool*>(uid.address()); + } + static float GetVRDRSWidthRatio() + { + static REL::Offset bDis{ 0x3186d28 }, r{ 0x3186d14 }; + return *reinterpret_cast<bool*>(bDis.address()) ? 1.0f : *reinterpret_cast<float*>(r.address()); + } + static float GetVRDRSHeightRatio() + { + static REL::Offset bDis{ 0x3186d28 }, r{ 0x3186d18 }; + return *reinterpret_cast<bool*>(bDis.address()) ? 1.0f : *reinterpret_cast<float*>(r.address()); + } + + // ---------- engine function wrappers ---------- + + static void GameAccumulate(RE::BSShadowLight* light) + { + // BSShadowDirectionalLight::AccumulateFullFrustumCascades / unk_Accumulate + using F = void (*)(RE::BSShadowLight*); + static REL::Relocation<F> func{ REL::RelocationID(100819, 107603) }; + func(light); + } + + static void GameSetupDirectionalLight(RE::BSShadowLight* light, RE::NiCamera* cam) + { + using F = void (*)(RE::BSShadowLight*, RE::NiCamera*); + static REL::Relocation<F> func{ REL::RelocationID(100817, 107601) }; + func(light, cam); + } + + static void GameEnableLight(RE::ShadowSceneNode* ssn, RE::BSLight* light) + { + using F = void (*)(RE::ShadowSceneNode*, RE::BSLight*); + static REL::Relocation<F> func{ REL::RelocationID(99708, 106342) }; + func(ssn, light); + } + + static void GameSetShadowCasterSlot(RE::ShadowSceneNode* ssn, RE::BSLight* light, uint32_t index, uint32_t unk) + { + using F = void (*)(RE::ShadowSceneNode*, RE::BSLight*, uint32_t, uint32_t); + static REL::Relocation<F> func{ REL::RelocationID(99728, 106365) }; + func(ssn, light, index, unk); + } + + static void GameClearPortalVisibility(RE::BSPortalGraphEntry* entry) + { + using F = void (*)(RE::BSPortalGraphEntry*); + 
static REL::Relocation func{ REL::RelocationID(74395, 76119) }; + func(entry); + } + + static bool GamePortalHasSharedVisibility(RE::BSPortalGraphEntry* a, RE::BSPortalGraphEntry* b) + { + using F = bool (*)(RE::BSPortalGraphEntry*, RE::BSPortalGraphEntry*); + static REL::Relocation func{ REL::RelocationID(74397, 76121) }; + return func(a, b); + } + + static void GameClearGeometryList(RE::BSLight* light) + { + using F = void (*)(RE::BSLight*); + static REL::Relocation func{ REL::RelocationID(101298, 108285) }; + func(light); + } + + static bool GameIsLightAffectingSurface(RE::BSLightingShaderProperty* p, RE::BSLight* light) + { + using F = bool (*)(RE::BSLightingShaderProperty*, RE::BSLight*); + static REL::Relocation func{ REL::RelocationID(98902, 105550) }; + return func(p, light); + } + + static void GameApplyLensFlare(RE::BSLight* light) + { + // SE/AE only -- no VR equivalent (ID 100440) + if (REL::Module::IsVR()) + return; + using F = void (*)(RE::BSLight*); + static REL::Relocation func{ REL::RelocationID(100440, 107157) }; + func(light); + } + + // VR-only + static void GameVRPrepareShadowMaps(RE::BSLight* light) + { + using F = void (*)(RE::BSLight*); + static REL::Relocation func{ REL::Offset(0x1356e50) }; + func(light); + } + + static void GameVRAccumulateShadowMaps(RE::BSLight* light) + { + using F = void (*)(RE::BSLight*); + static REL::Relocation func{ REL::Offset(0x1357450) }; + func(light); + } + + static void GameFrustumOverlap(RE::NiCamera* cam, float* coord, float* r1, float* r2, float eps) + { + // Non-VR: (cam, coord, r1, r2, eps) + // VR: (cam, coord, r1, r2, eyeIndex, eps) -- pass 0xffffffff for combined frustum + static REL::Relocation addr{ REL::RelocationID(69265, 70632) }; + auto ptr = addr.address(); + if (REL::Module::IsVR()) { + using VR = void (*)(RE::NiCamera*, float*, float*, float*, uint32_t, float); + reinterpret_cast(ptr)(cam, coord, r1, r2, 0xffffffffu, eps); + } else { + using SE = void (*)(RE::NiCamera*, float*, float*, 
float*, float); + reinterpret_cast(ptr)(cam, coord, r1, r2, eps); + } + } + + // Convenience: runtime-aware shadow-light field accessor (SE vs VR RuntimeData differ). + // Usage: ShadowField(light, maskIndex) = 3; +#define ShadowField(light, member) \ + (REL::Module::IsVR() ? (light)->GetVRRuntimeData().member : (light)->GetRuntimeData().member) + + // Returns the culling process for the first shadow descriptor of a light. + static RE::BSCullingProcess* GetLightCullingProcess(RE::BSShadowLight* light) + { + return REL::Module::IsVR() ? light->GetVRRuntimeData().shadowmapDescriptors.front().cullingProcess : light->GetRuntimeData().shadowmapDescriptors.front().cullingProcess; + } + + // ========================================================================= + // Formula helpers + // + // SetupSceneFormula: called once per frame, sets camera/scene params. + // SetupLightFormula: called per candidate light, sets all light params. + // CalculateLightScore: evaluates s_formulaScore if available. 
+ // ========================================================================= + + static void SetupSceneFormula(const RE::NiCamera* camera) + { + if (camera) { + FormulaHelper::SetParam(kFormulaParam_CameraX, camera->world.translate.x); + FormulaHelper::SetParam(kFormulaParam_CameraY, camera->world.translate.y); + FormulaHelper::SetParam(kFormulaParam_CameraZ, camera->world.translate.z); + } else { + FormulaHelper::SetParam(kFormulaParam_CameraX, 0.0); + FormulaHelper::SetParam(kFormulaParam_CameraY, 0.0); + FormulaHelper::SetParam(kFormulaParam_CameraZ, 0.0); + } + + FormulaHelper::SetParam(kFormulaParam_IsInterior, 0); + auto* plr = RE::PlayerCharacter::GetSingleton(); + if (plr) { + auto* cell = plr->parentCell; + if (cell && cell->IsInteriorCell()) + FormulaHelper::SetParam(kFormulaParam_IsInterior, 1); + } + + // Time of day from GameHour global + auto* cal = RE::Calendar::GetSingleton(); + if (cal) + FormulaHelper::SetParam(kFormulaParam_TimeOfDay, cal->GetHour()); + } + + static void SetupLightFormula(const RE::BSShadowLight* light, const RE::NiCamera* camera, int32_t index) + { + FormulaHelper::SetParam(kFormulaParam_LightConverted, 0.0); + FormulaHelper::SetParam(kFormulaParam_LightIndex, index); + FormulaHelper::SetParam(kFormulaParam_LightDisplacement, 0.0); // overridden per-entry in redraw interval loop + FormulaHelper::SetParam(kFormulaParam_PlayerLightDistance, 0.0); // overridden below after light position is known + FormulaHelper::SetParam(kFormulaParam_LightImportance, 0.0); // overridden per-entry in redraw interval loop; 0 in score formula + + // Temporal stickiness signals. Both derived from the slot pool in one + // pass: chosenLastFrame is the boolean kept for backward-compat with + // user formulas; framesSinceRender is a continuous age that decays to + // zero stickiness once the slot has been stale long enough to no + // longer represent a true rank-drift case. 
Sentinel 1e6 covers the + // "no slot" and "never rendered" branches so the default formula's + // max(0, 1 - age/window) decay term cleanly collapses to 0. + double chosenLastFrame = 0.0; + double framesSinceRender = 1e6; + { + const int32_t now = *globals::game::frameCounter; + for (int i = s_lights.PointLightFirst(); i < s_lights.PointLightEnd(s_settings.ShadowLightCount); i++) { + const auto& e = s_lights.Lights[i]; + if (e.Light != light) + continue; + chosenLastFrame = 1.0; + if (e.LastDrawnFrame >= 0) + framesSinceRender = static_cast(now - e.LastDrawnFrame); + break; + } + } + FormulaHelper::SetParam(kFormulaParam_LightChosenLastFrame, chosenLastFrame); + FormulaHelper::SetParam(kFormulaParam_LightFramesSinceRender, framesSinceRender); + + FormulaHelper::SetParam(kFormulaParam_LightNeverFades, light->lodFade ? 0.0 : 1.0); + FormulaHelper::SetParam(kFormulaParam_LightPortalStrict, light->portalStrict ? 1.0 : 0.0); + FormulaHelper::SetParam(kFormulaParam_LightNS, 0.0); + + // Spot detection + cone-aware visibility prior (option 1 from spot + // preservation analysis). Non-spots get spotvisible=1 so existing + // omni-tuned formulas are unaffected. For spots, we read last + // frame's UpdateCamera verdict (frustumCull / lodDimmer) -- the + // score runs BEFORE this frame's validation pass updates those, + // but cameras move continuously so last-frame's cone-vs-frustum + // is a strong predictor of this-frame's. Trading a one-frame lag + // for not double-calling UpdateCamera is a worthwhile cost since + // the score is a preference, not a gate. + const bool isSpot = (skyrim_cast(light) != nullptr); + double spotVisible = 1.0; // default for non-spots: always "visible" + if (isSpot) { + // frustumCull == 0 means "in frustum"; engine sets 0xff when + // cone-vs-frustum rejects. lodDimmer > 0 means the LOD fader + // hasn't zeroed the light. Both must hold for a spot to count + // as plausibly visible. 
+ // Note: the engine field is misspelled "frustrumCull" in the SDK + // (matches Bethesda's original symbol). 0 = visible, 0xff = culled. + const bool inFrustum = (light->frustrumCull == 0); + const bool lodLit = (light->lodDimmer > 0.0f); + spotVisible = (inFrustum && lodLit) ? 1.0 : 0.0; + } + FormulaHelper::SetParam(kFormulaParam_LightIsSpot, isSpot ? 1.0 : 0.0); + FormulaHelper::SetParam(kFormulaParam_LightSpotVisible, spotVisible); + + float x, y, z; + + auto* nilight = light->light.get(); + if (nilight) { + FormulaHelper::SetParam(kFormulaParam_LightIntensity, nilight->GetLightRuntimeData().fade); + FormulaHelper::SetParam(kFormulaParam_LightRadius, nilight->GetLightRuntimeData().radius.x); + FormulaHelper::SetParam(kFormulaParam_LightR, nilight->GetLightRuntimeData().diffuse.red); + FormulaHelper::SetParam(kFormulaParam_LightG, nilight->GetLightRuntimeData().diffuse.green); + FormulaHelper::SetParam(kFormulaParam_LightB, nilight->GetLightRuntimeData().diffuse.blue); + FormulaHelper::SetParam(kFormulaParam_LightAmbientR, nilight->GetLightRuntimeData().ambient.red); + FormulaHelper::SetParam(kFormulaParam_LightAmbientG, nilight->GetLightRuntimeData().ambient.green); + FormulaHelper::SetParam(kFormulaParam_LightAmbientB, nilight->GetLightRuntimeData().ambient.blue); + x = nilight->world.translate.x; + y = nilight->world.translate.y; + z = nilight->world.translate.z; + + if (s_settings.PromoteNormalToShadow) + FormulaHelper::SetParam(kFormulaParam_LightNS, s_shadowConvert.find(nilight) != s_shadowConvert.end() ? 
1.0 : 0.0); + } else { + FormulaHelper::SetParam(kFormulaParam_LightIntensity, 0.0); + FormulaHelper::SetParam(kFormulaParam_LightRadius, 0.0); + FormulaHelper::SetParam(kFormulaParam_LightR, 1.0); + FormulaHelper::SetParam(kFormulaParam_LightG, 1.0); + FormulaHelper::SetParam(kFormulaParam_LightB, 1.0); + FormulaHelper::SetParam(kFormulaParam_LightAmbientR, 1.0); + FormulaHelper::SetParam(kFormulaParam_LightAmbientG, 1.0); + FormulaHelper::SetParam(kFormulaParam_LightAmbientB, 1.0); + x = light->worldTranslate.x; + y = light->worldTranslate.y; + z = light->worldTranslate.z; + } + + FormulaHelper::SetParam(kFormulaParam_LightX, x); + FormulaHelper::SetParam(kFormulaParam_LightY, y); + FormulaHelper::SetParam(kFormulaParam_LightZ, z); + + float camx = camera ? camera->world.translate.x : (float)FormulaHelper::GetParam(kFormulaParam_CameraX); + float camy = camera ? camera->world.translate.y : (float)FormulaHelper::GetParam(kFormulaParam_CameraY); + float camz = camera ? camera->world.translate.z : (float)FormulaHelper::GetParam(kFormulaParam_CameraZ); + + float dx = x - camx, dy = y - camy, dz = z - camz; + FormulaHelper::SetParam(kFormulaParam_LightDistance, sqrtf(dx * dx + dy * dy + dz * dz)); + + // Player-to-light distance: ensures third-person shadow maps redraw when the + // player character is inside a light's radius even if the camera is outside. 
+ double playerLightDist = FormulaHelper::GetParam(kFormulaParam_LightDistance); + auto* plr = RE::PlayerCharacter::GetSingleton(); + if (plr) { + auto pp = plr->GetPosition(); + float pdx = x - pp.x, pdy = y - pp.y, pdz = z - pp.z; + playerLightDist = static_cast<double>(sqrtf(pdx * pdx + pdy * pdy + pdz * pdz)); + } + FormulaHelper::SetParam(kFormulaParam_PlayerLightDistance, playerLightDist); + } + + static double CalculateLightScore(const RE::BSShadowLight* light, const RE::NiCamera* camera, int32_t index) + { + SetupLightFormula(light, camera, index); + + if (s_formulaScore) + return s_formulaScore->Calculate(); + + return 0.0; + } + + // ========================================================================= + // Shadow map content hash for cached-shadow-map detection + // ========================================================================= + + /// Mixes a 32-bit value into a running 64-bit hash. boost::hash_combine + /// constants -- the magic number 0x9e3779b9 is the golden-ratio reciprocal + /// scaled to 32 bits (2^32 / phi), chosen for good bit distribution. + /// Fast (a few ALU ops) and we don't + /// need cryptographic strength -- only that distinct inputs map to + /// distinct outputs with very high probability. + static inline std::uint64_t HashCombine(std::uint64_t h, std::uint32_t v) noexcept + { + return h ^ (static_cast<std::uint64_t>(v) + 0x9e3779b9ull + (h << 6) + (h >> 2)); + } + static inline std::uint64_t HashCombineFloat(std::uint64_t h, float f) noexcept + { + return HashCombine(h, std::bit_cast<std::uint32_t>(f)); + } + + /// Quantize a float to a step size before hashing. Skyrim's kFlicker / + /// kPulse light flags oscillate animated torches by sub-unit position / + /// radius amounts every frame. Bit-exact hashing on those oscillations + /// produces a fresh hash every frame, defeating cache validity. 
Quantizing + /// at sub-pixel-precision thresholds folds imperceptible animations into + /// a stable hash bucket so the cached-shadow priority demotion fires + /// correctly for visually-unchanging lights. + static inline float QuantizeFloat(float f, float step) noexcept + { + return std::round(f / step) * step; + } + + /// Hash of inputs that determine a shadow map's content: the light's + /// pose + radius, and each caster's worldBound + identity. worldBound + /// tracks rigid motion and BSDynamicTriShape vertex updates, so mesh + /// data isn't inspected directly. Identical hashes across frames mean + /// the cached slot is current to within the quantization steps -- + /// caller can skip the + /// redraw. Returns 0 only on null light/NiLight (sentinel for "never + /// rendered"); HashCombine constants make a real-data 0 essentially + /// impossible. + + static std::uint64_t ComputeShadowGeomHash(RE::BSShadowLight* light) + { + if (!light) + return 0; + auto* ni = light->light.get(); + if (!ni) + return 0; + std::uint64_t h = 0x9e3779b97f4a7c15ull; // arbitrary nonzero seed + + // Quantization thresholds: tuned to be one to two orders of + // magnitude below perceptible difference in the rendered shadow. 
+ // kPosStep = 1.0 game unit (~1.4 cm world space; sub-texel + // at typical 2048 shadow res * 500 unit light radius) + // kRotStep = 0.01 in matrix entries (~0.5 degrees) + // kRadiusStep = 1.0 unit (well under any visible frustum + // resize from torch pulse animations) + constexpr float kPosStep = 1.0f; + constexpr float kRotStep = 0.01f; + constexpr float kRadiusStep = 1.0f; + + // Light pose + const auto& t = ni->world.translate; + h = HashCombineFloat(h, QuantizeFloat(t.x, kPosStep)); + h = HashCombineFloat(h, QuantizeFloat(t.y, kPosStep)); + h = HashCombineFloat(h, QuantizeFloat(t.z, kPosStep)); + const auto& r = ni->world.rotate; + for (int i = 0; i < 3; ++i) + for (int j = 0; j < 3; ++j) + h = HashCombineFloat(h, QuantizeFloat(r.entry[i][j], kRotStep)); + // Light radius (NiPointLight uses .x; spotlights use direction in + // rotation matrix already hashed above). + const auto& rtd = ni->GetLightRuntimeData(); + h = HashCombineFloat(h, QuantizeFloat(rtd.radius.x, kRadiusStep)); + // Caster set + each caster's worldBound (engine-updated). + for (auto& nip : light->geomList) { + auto* ts = nip.get(); + if (!ts) + continue; + const auto raw = reinterpret_cast(ts); + h = HashCombine(h, static_cast(raw)); + h = HashCombine(h, static_cast(raw >> 32)); + const auto& wb = ts->worldBound; + h = HashCombineFloat(h, QuantizeFloat(wb.center.x, kPosStep)); + h = HashCombineFloat(h, QuantizeFloat(wb.center.y, kPosStep)); + h = HashCombineFloat(h, QuantizeFloat(wb.center.z, kPosStep)); + h = HashCombineFloat(h, QuantizeFloat(wb.radius, kRadiusStep)); + } + return h; + } + + // ========================================================================= + // Light enable / disable helpers + // ========================================================================= + + /// Removes `light` from s_normalConvert and clears its geometry list. + /// No-op if the light is not in the list. 
+ static void EraseFromConvertList(RE::BSShadowLight* light) + { + for (auto it = s_normalConvert.begin(); it != s_normalConvert.end(); ++it) { + if (it->light == light) { + GameClearGeometryList(light); + s_normalConvert.erase(it); + return; + } + } + } + + static void DisableLight(RE::BSShadowLight* light) + { + EraseFromConvertList(light); + auto* cull = light->cullingProcess; + if (cull && cull->portalGraphEntry) + GameClearPortalVisibility(reinterpret_cast<RE::BSPortalGraphEntry*>(cull->portalGraphEntry)); + light->ReturnShadowmaps(); + } + + // Activates a light as a normal (non-shadow) light by inserting it into + // the scene's active-light list without allocating a shadow slot. + // + // Two paths: "already-converted re-enable" (just GameEnableLight) and + // "first conversion this session" (ReturnShadowmaps + portal-clear + + // track in s_normalConvert + GameEnableLight). Tracy sub-zones split + // the cost so the next capture distinguishes the steady-state cost + // (re-enable only) from the cost of a fresh conversion. + static void ConvertLight(RE::BSShadowLight* light, RE::ShadowSceneNode* ssn, bool isNS) + { + // Already converted: just re-enable so geometry picks it up this frame. + for (auto& c : s_normalConvert) { + if (c.light == light) { + ZoneNamedN(zReEnable, "SCM::Engine::ConvertLight::ReEnable", true); + GameEnableLight(ssn, light); + return; + } + } + + // First conversion this session: release shadow resources, register. + ZoneNamedN(zFirstConv, "SCM::Engine::ConvertLight::FirstConvert", true); + auto* cull = GetLightCullingProcess(light); + if (cull && cull->portalGraphEntry) + GameClearPortalVisibility(reinterpret_cast<RE::BSPortalGraphEntry*>(cull->portalGraphEntry)); + light->ReturnShadowmaps(); + + s_normalConvert.push_back({ light, isNS }); + GameEnableLight(ssn, light); + } + + // Activates a non-sun shadow light into slot `slotIndex`. 
+ static void EnableLight(RE::BSShadowLight* light, RE::NiCamera* camera, + RE::ShadowSceneNode* ssn, int slotIndex) + { + // Remove from conversion list if it was previously converted to normal. + EraseFromConvertList(light); + + // Focus shadow handling. Gated on s_focusShadowSlots so we only run + // the engine's focus accumulate when ScheduleShadowCasters has + // reserved [kFocusShadowBaseSlotIndex .. +s_focusShadowSlots) this + // frame -- without that reservation the engine would write focus + // depth into texture slices currently held by point lights. With + // it, extended mode (ShadowLightCount > 4) is safe; the previous + // blanket `<= 4` gate is replaced by the reservation contract. + if (s_focusShadowSlots > 0) { + bool drawFocus = ShadowField(light, drawFocusShadows); + if (drawFocus || (!*GetFocusShadowSelected() && light->GetIsFrustumOrDirectionalLight())) { + GameSetupDirectionalLight(light, camera); + GameAccumulate(light); + if (REL::Module::IsVR()) { + for (auto& desc : light->GetVRRuntimeData().focusShadowmapDescriptors) { + desc.vrRenderTarget[0] = RE::RENDER_TARGET_DEPTHSTENCIL::kNONE; + desc.vrRenderTarget[1] = RE::RENDER_TARGET_DEPTHSTENCIL::kNONE; + } + } + ShadowField(light, drawFocusShadows) = true; + *GetFocusShadowSelected() = true; + *GetSunPtr() = reinterpret_cast<uint64_t>(light); + } + } + + GameEnableLight(ssn, light); + GameSetShadowCasterSlot(ssn, light, *GetAccumLightSlot(), 1); + + { + uint32_t mi = *GetMaskIndex(); + ShadowField(light, maskIndex) = mi; + *GetMaskIndex() = mi + 1; + } + + // Projected bounding box for shadow map region. 
+ auto* nilight = light->light.get(); + if (nilight) { + auto lpos = nilight->world.translate; + auto cpos = camera->world.translate; + auto delta = lpos - cpos; + float dx = delta.x, dy = delta.y, dz = delta.z; + float dist = lpos.GetDistance(cpos); + float radius = nilight->GetLightRuntimeData().radius.x; + + float left, right, top, bottom; + + if (dist >= radius + camera->GetNearPlane()) { + float inv = 1.0f / dist; + float coord[4] = { + lpos.x - dx * radius * inv, + lpos.y - dy * radius * inv, + lpos.z - dz * radius * inv, + radius + }; + float r1[2], r2[2]; + GameFrustumOverlap(camera, coord, r1, r2, 0.00001f); + + float vw = (float)*globals::game::viewWidth; + float vh = (float)*globals::game::viewHeight; + if (REL::Module::IsVR()) { + vw *= GetVRDRSWidthRatio(); + vh *= GetVRDRSHeightRatio(); + } + + left = (r1[0] + 1.0f) * 0.5f * vw; + right = (r2[0] + 1.0f) * 0.5f * vw; + top = (1.0f - (r1[1] + 1.0f) * 0.5f) * vh; + bottom = (1.0f - (r2[1] + 1.0f) * 0.5f) * vh; + } else { + // Light contains the camera: use full screen. + *GetShadowMask() |= 1u << *GetAccumLightSlot(); + left = right = top = bottom = -1.0f; + } + + ShadowField(light, projectedBoundingBox) = + RE::NiRect((uint32_t)left, (uint32_t)right, (uint32_t)top, (uint32_t)bottom); + } + + // Accumulate into shadow slot. + { + uint32_t idx = static_cast(slotIndex); + light->Accumulate(idx, idx, nullptr); + *GetAccumLightSlot() += light->shadowMapCount; + } + + // Extended mode: pre-set kNONE renderTarget so RenderCascade re-runs + // its slot-allocation block (where Hook_OverwriteShadowMapIndex + // overrides the global counter with our slot index). Without this, + // RenderCascade keeps the slot from a prior frame and lights not + // redrawn this frame would corrupt another light's shadow map. + // Pool index maps 1:1 to texture slot; slice 0 stays unused. 
+ if (s_settings.ShadowLightCount > 4) { + int32_t idx = s_lights.FindLight(light, s_settings.ShadowLightCount); + if (idx < 0) + idx = 0; + if (REL::Module::IsVR()) { + for (auto& desc : light->GetVRRuntimeData().shadowmapDescriptors) { + desc.renderTarget = RE::RENDER_TARGET_DEPTHSTENCIL::kNONE; + desc.shadowmapIndex = static_cast(idx); + } + } else { + for (auto& desc : light->GetRuntimeData().shadowmapDescriptors) { + desc.renderTarget = RE::RENDER_TARGET_DEPTHSTENCIL::kNONE; + desc.shadowmapIndex = static_cast(idx); + } + } + } + + // Only apply lens flare when lensFlareData is non-null; calling it on parabolic lights + // (null lensFlareData) registers them into the lens flare system, causing a crash + // in the lens flare pass when it tries to dereference the null sprite data. + if (light->lensFlareData) + GameApplyLensFlare(light); + } + + // ========================================================================= + // Main shadow caster manager + // + // Replaces the game's CalculateActiveShadowCasterLights entirely. + // Runs via stl::detour_thunk; obtains all inputs from game globals. + // ========================================================================= + + // Lightweight per-frame candidate entry used during scheduling. + // + // After the validation pass, exactly one of {chosen, excess, invalid} + // is true (or none if it's the sun, which is processed separately). + struct CandidateLight + { + RE::BSShadowLight* light{ nullptr }; + double score{ 0.0 }; + bool sun{ false }; + bool chosen{ false }; // valid + within ShadowLightCount budget + bool excess{ false }; // valid but over budget (convert or disable) + bool invalid{ false }; // shorthand: invalidCamera || invalidPortal + bool invalidCamera{ false }; // UpdateCamera returned false -- shorthand for + // branches that don't care which sub-reason + bool invalidPortal{ false }; // portal cull: light's cell not visible from + // camera's cell. 
Must DisableLight; converting + // routes through cluster lighting which has no + // portal awareness and would bleed through walls. + + // Sub-reasons for invalidCamera, recovered from engine side-band flags: + // frustrumCull == 0xff -> off-screen, ConvertLight wasted -> drop + // lodDimmer == 0.0f -> past LOD fade end, still visible -> ConvertLight + // (resets lodDimmer so cluster lighting picks it up) + // Both can fire together; frustum-out wins (contribution is zero either way). + bool invalidFrustum{ false }; // BSMultiBoundSphere::WithinFrustum / cone-frustum cull + bool invalidLod{ false }; // engine's LOD-fade zeroed lodDimmer + }; + + static void ScheduleShadowCasters() + { + ZoneScopedN("SCM::ScheduleShadowCasters"); + // Per-frame diagnostic counters; emitted via TracyPlot at function exit. + s_schedDiag = SchedDiagCounters{}; + // VR calls CalculateAndDrawShadowCasterLights twice per frame (once per + // eye). Block the second call: s_lights isn't reentrancy-safe. + static std::atomic s_inSchedule{ false }; + if (s_inSchedule.exchange(true, std::memory_order_acquire)) + return; + struct Guard + { + ~Guard() { s_inSchedule.store(false, std::memory_order_release); } + } guard; + + // VR display guard: skip scheduling when the HMD display is not active. + if (REL::Module::IsVR() && !GetVRDrawShadows()) + return; + + auto* ssn = GetShadowSceneNode(); + auto* camera = GetWorldCamera(); + if (!ssn || !camera) + return; + + // Read the engine's per-frame focus-shadow actor count and reserve + // matching pool slots. Eject any point lights that occupy a slot the + // engine now claims for focus rendering -- the displaced lights are + // reassigned to a free slot or fall through to the existing excess + // path. When the count drops, the slots naturally rejoin the pool's + // FindFreeIndex range on the next allocation. 
+ s_focusShadowSlots = std::clamp(GetFocusShadowActorCount(), 0, kFocusShadowMaxSlots); + for (int i = kFocusShadowBaseSlotIndex; i < kFocusShadowBaseSlotIndex + s_focusShadowSlots && i < s_lights.Size; ++i) { + if (s_lights.Lights[i].Light) + s_lights.Lights[i].Clear(); + } + + // Do NOT clear shadowLightsAccum or reset the slot counter here. The + // outer CalculateAndDrawShadowCasterLights calls ResetCalculatedShadow- + // CasterLights before our hook fires, and that function clears the + // array, resets the counter, AND installs the sun at slot 0. Re- + // clearing here wipes the sun (sun->Accumulate is the focus vfunc, + // not a slot allocator) and the engine then skips the directional + // cascade pass entirely. + + s_budget.Begin(0); + + int doneLightCount = 0; + RE::BSShadowLight* sunLight = nullptr; + + // ---- Sun / directional light ---- + if (!GetSunBool2()) { + auto* sun = ssn->GetRuntimeData().sunShadowDirLight; + if (sun) { + static REL::Relocation vrUpdateFlag{ REL::Offset(0x1ed62f8) }; + uint8_t vrFlag = REL::Module::IsVR() ? static_cast(*vrUpdateFlag) + 1 : 0; + sun->Accumulate(*GetAccumLightSlot(), 0, nullptr, vrFlag); + + if (sun->lensFlareData && !REL::Module::IsVR()) + GameApplyLensFlare(sun); + + if (REL::Module::IsVR() && !GetVRAccumFirst()) { + GameVRPrepareShadowMaps(sun); + GameVRAccumulateShadowMaps(sun); + } + + sunLight = sun; + } + } + + // Extended mode: scrub drawFocusShadows on every active light and the + // sun. A stale flag on a parabolic (point/spot) light occupying a + // kSHADOWMAPS slot in [4..7] sends BSShadowParabolicLight::Render + // into its focus-shadow loop on a non-directional light and CTDs. + // Mirrors Intellightent's mitigation (see main.cpp:1411-1420); the + // byte patches at SetupResources are belt-and-braces for the engine's + // global gate, this is belt-and-braces for the per-light flag. 
+ if (s_settings.ShadowLightCount > 4) { + for (auto& sp : ssn->GetRuntimeData().activeShadowLights) { + if (auto* l = sp.get()) + ShadowField(l, drawFocusShadows) = false; + } + if (auto* sun2 = ssn->GetRuntimeData().sunShadowDirLight) + ShadowField(sun2, drawFocusShadows) = false; + } + + *GetSunPtr() = 0; + + // ---- Score all candidate lights ---- + // Reuse a static vector so we don't allocate per frame -- the + // scheduler runs every frame and the candidate list is roughly the + // same size each call (a few hundred lights at most). + static std::vector candidates; + + { + ZoneScopedN("SCM::ScoreCandidates"); + SetupSceneFormula(camera); + + candidates.clear(); + candidates.reserve(ssn->GetRuntimeData().activeShadowLights.size()); + + int32_t tmpIndex = 0; + for (auto& sp : ssn->GetRuntimeData().activeShadowLights) { + auto* l = sp.get(); + if (!l || l == sunLight) + continue; + auto& c = candidates.emplace_back(); + c.light = l; + c.sun = false; + c.score = CalculateLightScore(l, camera, tmpIndex++); + } +#ifdef TRACY_ENABLE + char buf[32]; + const int n = snprintf(buf, sizeof(buf), "candidates=%zu", candidates.size()); + if (n > 0) + ZoneText(buf, static_cast(n)); +#endif + } + + // Validation, redraw-interval scoring, and RedrawFrame marking all + // happen before the atomic loop. Tracy capture analysis showed this + // block dominates SCM::ScheduleShadowCasters (98%+ of the function's + // runtime), so a dedicated zone scopes that cost separately from + // ScoreCandidates and ScheduleLoop. Named variant because the + // enclosing function already declares a ZoneScopedN. + ZoneNamedN(zoneValBudget, "SCM::ValidateAndScheduleBudget", true); + + // Apply debug pins: bias scoring so pinned-shadow lights sort to the + // top (forced into the chosen pool up to ShadowLightCount) and + // pinned-convert lights sort to the bottom (forced into the excess pool + // where ConvertLight runs unconditionally -- see c.excess branch below).
+ // Pin sets are mutually exclusive (SetPinned* enforces that), but if a + // stale entry slips through, pin-shadow wins because the bias is checked + // first. + for (auto& c : candidates) { + auto key = reinterpret_cast(c.light); + if (s_pinShadow.count(key)) + c.score += 1e15; + else if (s_pinConvert.count(key)) + c.score -= 1e15; + } + + // Sort descending by score (highest priority first); sun always first. + std::sort(candidates.begin(), candidates.end(), + [](const CandidateLight& a, const CandidateLight& b) { + if (a.sun != b.sun) + return a.sun; + return a.score > b.score; + }); + + // ---- Validation pass (no game mutations) ---- + // + // Mirrors Intellightent's per-iteration validation gates. Splitting + // validation from mutation lets us defer all game-state changes + // (DisableLight / ConvertLight / EnableLight) to a single atomic loop + // later, eliminating the dangling-pointer crash window where mutations + // in an earlier phase invalidated raw pointers held in s_lights[]. + // + // Slot 0 is reserved for the sun; point lights fill slots 1..ShadowLightCount. + // Do not count the sun against ShadowLightCount -- it uses focus cascade DSV slots, + // not parabolic point-light slots. + auto* globalCull = *reinterpret_cast( + *reinterpret_cast( + REL::RelocationID(528077, 415022).address())); + + int wantCount = 0; + + // Per-candidate UpdateCamera vfunc + portal-graph visibility walk + // + chosen/excess tagging. Captured separately so memoization or + // caching of UpdateCamera/portal verdicts can be measured. + { + ZoneNamedN(zoneCandVal, "SCM::CandidateValidation", true); + for (auto& c : candidates) { + auto* l = c.light; + // UpdateCamera (vfunc 16, +0x80) is the engine's type-aware visibility + // test. 
Verified via Ghidra (BSShadowParabolicLight_UpdateCamera at + // 0x14151b620 in 1.6.1170, 0x14132ddf0 in 1.6.640, 0x141370c80 in VR): + // + // - BSShadowParabolicLight: TWO cull conditions, both setting + // frustrumCull=0xff: + // (1) BSMultiBoundSphere::WithinFrustum (BSMultiBoundShape + // vfunc 0x29) -- sphere(niLight.pos, niLight.Radius.x) + // vs camera frustum. Geometrically correct; + // failure means no visible pixel can be lit because the + // light's bounding sphere doesn't touch the camera frustum. + // The radius source matches what the cluster builder reads + // (LightLimitFix.cpp's `runtimeData.radius.x`). + // (2) Shadow-distance LOD -- if (lodFade flag set on + // BSShadowLight) AND + // ((camDist^2 - radius^2) * camera.LodAdjust) > + // ShadowDistanceSquared_Current => cull. + // ShadowDistanceSquared_Current = fShadowDistance^2 + // (8000^2 outdoors, 3000^2 indoors by default). + // This is NOT a visibility test -- it's "skip per-light + // shadow rendering at this distance". A light past + // shadow distance can still be IN the camera frustum and + // illuminating visible pixels via cluster lighting. + // + // - BSShadowFrustumLight: cone-vs-frustum test (cone-aware so an + // off-screen spot pointing INTO the frustum is correctly kept). + // + // - BSShadowDirectionalLight: cascades, separate code path. + // + // Implication for SCM: a `frustrumCull != 0` verdict does NOT mean + // "geometrically off-screen". The convertOrDisable path below treats + // all c.invalid cases uniformly (omnis convert, spots disable, portal + // disable) so distant lights past shadow distance still reach the + // cluster pipeline. The cluster builder's own + // `(color * fade) > 1e-4 && radius > 1e-4` filter discards lights + // that genuinely don't contribute. + if (!l->UpdateCamera(camera)) { + c.invalidCamera = true; + c.invalid = true; + // Recover the sub-reason from the engine's side-band flags. 
+ // Both can be true (a light off-screen AND LOD-faded); + // recorded as independent bits for analysis. Action loop + // below treats frustum-out as terminal (drop) and + // LOD-faded-in-frustum as convert. + c.invalidFrustum = (l->frustrumCull != 0); + c.invalidLod = (l->lodDimmer == 0.0f); + continue; + } + // Portal culling only applies in interior cells where a portal graph exists. + // Lights with no culling process (e.g. WSU spotlights outside cell bounds) + // or no portal are unconditionally visible; skip the check for them. + auto* cull = GetLightCullingProcess(l); + if (cull) { + auto* portal = reinterpret_cast(cull->portalGraphEntry); + if (portal) { + auto* gPortal = globalCull ? reinterpret_cast(globalCull->portalGraphEntry) : nullptr; + if (gPortal && !GamePortalHasSharedVisibility(gPortal, portal)) { + c.invalidPortal = true; + c.invalid = true; + continue; + } + } + } + + // Effective point-light capacity excludes the engine-claimed + // focus shadow slots; excess candidates fall through to the + // existing convert/disable path. + if (wantCount < s_settings.ShadowLightCount - s_focusShadowSlots) { + c.chosen = true; + wantCount++; + } else { + c.excess = true; + } + } + + // Tracy candidate breakdown: emits per-frame so a capture can be + // queried alongside the per-action counters to verify the math + // (chosen + excess + invalid_camera + invalid_portal == total). + for (auto& c : candidates) { + s_schedDiag.candidates_total++; + if (c.chosen) + s_schedDiag.candidates_chosen++; + if (c.excess) + s_schedDiag.candidates_excess++; + if (c.invalidCamera) + s_schedDiag.candidates_invalid_camera++; + if (c.invalidPortal) + s_schedDiag.candidates_invalid_portal++; + // Sub-reason breakdown of invalidCamera. A single light may + // be both frustum-out AND LOD-faded -- both bits are counted + // so the sum can exceed candidates_invalid_camera. 
The + // "other" bucket catches UpdateCamera failures where the + // engine cleared frustrumCull and left lodDimmer > 0 (rare + // edge cases like internal state changes). + if (c.invalidCamera) { + if (c.invalidFrustum) + s_schedDiag.candidates_invalid_frustum++; + if (c.invalidLod) + s_schedDiag.candidates_invalid_lod++; + if (!c.invalidFrustum && !c.invalidLod) + s_schedDiag.candidates_invalid_other++; + } + } + } // end SCM::CandidateValidation + + // Pool membership update: drop expired pointers, drop unchosen, + // add newly chosen, sync sun slot. + { + ZoneNamedN(zonePoolMem, "SCM::UpdatePoolMembership", true); + // ---- Sync s_lights (our active pool) ---- + // + // First drop entries whose pointers are no longer in the scene's + // activeShadowLights (game-side may have freed them since last frame). + // This protects subsequent slot-stability lookups from dereferencing + // dangling pointers. + std::unordered_set aliveSet; + { + auto& alive = ssn->GetRuntimeData().activeShadowLights; + aliveSet.reserve(alive.size() + 1); + if (sunLight) + aliveSet.insert(sunLight); + for (auto& sp : alive) + if (auto* l = sp.get()) + aliveSet.insert(l); + } + for (int i = 0; i < s_lights.Size; i++) { + if (!s_lights.Lights[i].Light) + continue; + if (aliveSet.find(s_lights.Lights[i].Light) == aliveSet.end()) { + s_schedDiag.reconciliation_clears++; + s_lights.Lights[i].Clear(); + } + } + + // ---- Sync s_normalConvert (converted-to-non-shadow set) ---- + // + // Two-tier filter: + // + // Tier 1: drop entries the engine has removed from BOTH active + // lists. Hook_ConvertLights_Remove fires on individual RemoveLight + // calls but the engine's bulk cell-teardown path bypasses it, so + // this is our safety net for dangling pointers. 
+ // + // Tier 2: drop entries that are functionally dead -- still in + // activeShadowLights / activeLights (because GameEnableLight from + // ConvertLight activates an entry that the engine never + // auto-deactivates), but with fade=0 / lodDimmer=0 / null NiLight + // so addLight in LightLimitFix would skip them anyway. + // + // Without tier 2 the set grows unbounded across a session: every + // converted light stays pinned in s_normalConvert until the engine + // triggers a removal we can hook. Heavy modlists hit 400+ entries, + // keeping freed-then-recycled BSLight memory referenced by + // downstream pass captures longer than necessary. The criteria + // mirror addLight's discard filter -- entries failing it + // contribute nothing to the cluster or engine lighting paths and + // have no business staying in our set. + if (!s_normalConvert.empty()) { + std::unordered_set normalAlive; + normalAlive.reserve(aliveSet.size() + ssn->GetRuntimeData().activeLights.size()); + for (auto* p : aliveSet) + normalAlive.insert(static_cast(p)); + for (auto& sp : ssn->GetRuntimeData().activeLights) + if (auto* l = sp.get()) + normalAlive.insert(l); + + const std::size_t before = s_normalConvert.size(); + std::erase_if(s_normalConvert, [&](const ConvertedLight& c) { + // Tier 1: dangling / engine-removed. + if (!c.light || normalAlive.find(static_cast(c.light)) == normalAlive.end()) + return true; + // Tier 2: functionally dead. Cheap derefs only -- no + // virtual calls or extra hash lookups. 
+ auto* niLight = c.light->light.get(); + if (!niLight) + return true; + const auto& rt = niLight->GetLightRuntimeData(); + const float colorSum = rt.diffuse.red + rt.diffuse.green + rt.diffuse.blue; + if (colorSum * rt.fade <= 1e-4f) + return true; + if (rt.radius.x <= 1e-4f) + return true; + return false; + }); + const std::size_t after = s_normalConvert.size(); + if (before != after) { + static int loggedShrink = 0; + if (loggedShrink++ < 20 || (before - after) > 32) { + logger::debug("[SCM] s_normalConvert reconcile: {} -> {} ({} dropped)", + before, after, before - after); + } + } + } + + // Drop entries no longer chosen. Rank-drift suppression now lives + // in CalculateLightScore via the lightframessincerender decay term + // in the default ScoreFormula; the slot pool itself is a dumb + // container that follows the chosen set without policy of its own. + // The atomic loop's c.excess / c.invalid branches handle the + // engine-side ConvertLight / DisableLight call for the dropped + // occupants on the same frame. + for (int i = 0; i < s_lights.Size; i++) { + if (!s_lights.Lights[i].Light) + continue; + bool stillChosen = (i == 0 && s_lights.Sun); // sun slot + if (!stillChosen) { + for (auto& c : candidates) { + if (c.light == s_lights.Lights[i].Light && c.chosen) { + stillChosen = true; + break; + } + } + } + if (!stillChosen) + s_lights.Lights[i].Clear(); + } + + // Add newly chosen lights (assigned to first free slot; keeps existing chosen lights in place). + for (auto& c : candidates) { + if (!c.chosen) + continue; + bool alreadyIn = false; + for (int i = 0; i < s_lights.Size && !alreadyIn; i++) + if (s_lights.Lights[i].Light == c.light) + alreadyIn = true; + if (alreadyIn) + continue; + + int idx = s_lights.FindFreeIndex(true, s_settings.ShadowLightCount, s_settings.ConvertedShadowSlots); + if (idx < 0) + continue; + // Eviction nulls Light* but leaves the rest of LightEntry intact + // so it can serve as a cache key. 
Clear at acquire so the new + // occupant doesn't inherit LastDrawnFrame / lastGeomHash from the + // previous owner (which would skip its first render and let the + // cluster pipeline sample stale kSHADOWMAPS[idx] content). + s_lights.Lights[idx].Clear(); + s_lights.Lights[idx].Light = c.light; + } + + // Update sun slot (slot 0). + if (sunLight) { + if (s_lights.Lights[0].Light != sunLight) { + s_lights.Lights[0].Clear(); + s_lights.Lights[0].Light = sunLight; + } + s_lights.Sun = true; + } else { + // Sun is gone. If slot 0 was tracking the sun, clear the stale + // pointer. If Sun was already false coming in, slot 0 holds a + // regular point light (sun-aware FindFreeIndex allocates point + // lights to slot 0 when Sun=false) -- do NOT wipe it. This + // matches Intellightent's reference behaviour (no unconditional + // slot-0 clear in the no-sun branch). + if (s_lights.Sun) + s_lights.Lights[0].Clear(); + s_lights.Sun = false; + } + } // end SCM::UpdatePoolMembership + + // ---- Temporal budget: decide which lights redraw this frame ---- + double budget = s_settings.RedrawBudgetMs; + { + // Frame-time EMA + budget formula evaluation. Scoped separately + // from ScheduleLoop so the once-per-frame budget cost is visible + // distinct from the per-light scheduling cost. + { + ZoneNamedN(zoneCompBud, "SCM::ComputeBudget", true); + // Update frame-time EMA and ring buffer (always, for formula params and UI). + const float dtMs = *globals::game::deltaTime * 1000.0f; + s_ftRing[s_ftHead] = dtMs; + s_ftHead = (s_ftHead + 1) % kFrameWindow; + if (s_ftCount < kFrameWindow) + ++s_ftCount; + s_ftEMA = (s_ftCount == 1) ? 
dtMs : 0.1f * dtMs + 0.9f * s_ftEMA; + + const float target_ms = ComputeFrameTimePercentile90(); + if (s_ftEMA < target_ms) + s_stableFrames = std::min(s_stableFrames + 1, 45); + else + s_stableFrames = 0; + + FormulaHelper::SetParam(kFormulaParam_FrameTime, static_cast(s_ftEMA)); + FormulaHelper::SetParam(kFormulaParam_FrameTarget, static_cast(target_ms)); + FormulaHelper::SetParam(kFormulaParam_StableFrames, static_cast(s_stableFrames)); + + // Evaluate the budget for the whole frame. + // Manual: fixed slider value (RedrawBudgetMs). + // Formula: user-editable exprtk expression. + if (s_settings.BudgetMode == BudgetModeEnum::Formula && s_formulaRedrawBudget) { + budget = s_formulaRedrawBudget->Calculate(); + } + s_autoBudgetMs = static_cast(budget); + } // end SCM::ComputeBudget + + s_redrawnLightsThisFrame = 0; + s_totalShadowLightsThisFrame = s_settings.ShadowLightCount; + + ZoneScopedN("SCM::ScheduleLoop"); + int maxRedraw = std::min(s_settings.MaxRedrawPerFrame, s_lights.Size); + int32_t budgetRemain = static_cast(budget * 1000.0); + bool isFirst = true; + int32_t now = *globals::game::frameCounter; + + // Clear RedrawFrame on slots OUTSIDE the point-light range (converted / + // otherwise-allocated). Note PointLightEnd accounts for the sun + // bookkeeping slot when Sun=true, so a converted-slot light at + // pool[ShadowLightCount + 1] correctly gets cleared. + for (int i = s_lights.PointLightEnd(s_settings.ShadowLightCount); i < s_lights.Size; i++) + s_lights.Lights[i].RedrawFrame = false; + + // First pass: sun only. Point-light slots fall through to the + // importance-scored pending loop below so new lights compete + // fairly with existing redraws (sorted by importance, not pool + // order). AllowDrawNewLight is honoured by the pending loop's + // filter. 
+ for (int i = 0; i < s_lights.Size; i++) { + auto& e = s_lights.Lights[i]; + if (!e.Light) { + e.RedrawFrame = false; + continue; + } + e.RedrawFrame = (i == 0 && s_lights.Sun); + if (e.RedrawFrame) { + e.LastDrawnFrame = now; + isFirst = false; + maxRedraw--; + // Sun's budget cost is bookkept at 0 (different texture + // pipeline -- it has its own cascade buffer), so no + // budgetRemain decrement. + } + } + + if (maxRedraw > 0 && budgetRemain > 0) { + std::vector pending; + for (int i = 0; i < s_lights.Size; i++) { + auto& e = s_lights.Lights[i]; + if (!e.Light || e.RedrawFrame) + continue; + // Honour AllowDrawNewLight: when disabled, brand-new + // entries (LastDrawnFrame < 0) wait until the next frame + // rather than competing for this frame's budget. Existing + // lights re-entering view still schedule normally. + if (!s_settings.AllowDrawNewLight && e.LastDrawnFrame < 0) + continue; + pending.push_back(&e); + } + + for (auto* e : pending) { + double interval = 0.0; + if (s_formulaRedrawInterval) { + SetupLightFormula(e->Light, camera, 0); + // e->Index is the pool index. Beyond PointLightEnd are converted slots. + if (e->Index >= s_lights.PointLightEnd(s_settings.ShadowLightCount)) + FormulaHelper::SetParam(kFormulaParam_LightConverted, 1.0); + + // Compute how far the light has moved since its last shadow map render. + // Exposed as `lightdisplacement` so the formula can prioritise fast-moving + // lights (e.g. player torches) without relying on distance-to-camera alone. 
+ if (auto* nilight = e->Light->light.get()) { + auto& curr = nilight->world.translate; + float dx = curr.x - e->lastRenderedPos.x; + float dy = curr.y - e->lastRenderedPos.y; + float dz = curr.z - e->lastRenderedPos.z; + FormulaHelper::SetParam(kFormulaParam_LightDisplacement, + static_cast(sqrtf(dx * dx + dy * dy + dz * dz))); + } + + interval = s_formulaRedrawInterval->Calculate(); + } + interval += 1.0; + + // Contribution-weighted redraw interval: + // importance = luminance(diffuse) × fade × max(coverage, 0.3 × max(att_cam, att_plr)) + // att(pos) = max(1 - (dist/radius)^2, 0)^2 (Skyrim falloff) + // interval *= maxScale * (minScale/maxScale)^min(importance, 1) + // With maxScale=2.0, minScale=0.025: + // importance=0 -> x2.0 (deprioritise), 0.5 -> ~x0.22, 1.0 -> x0.025. + // Refs: Wimmer & Scherzer 2006 "Instant Shadow Maps" sec. 3; + // Valient 2014 "Practical Shadow Maps". + + float importance = 0.0f; + + if (auto* ni = e->Light->light.get()) { + auto& rtd = ni->GetLightRuntimeData(); + float lightRadius = rtd.radius.x; + auto lp = ni->world.translate; + + // Perceptual luminance (Rec.709) × engine fade factor. + float lum = 0.2126f * rtd.diffuse.red + + 0.7152f * rtd.diffuse.green + + 0.0722f * rtd.diffuse.blue; + float effectiveLum = lum * rtd.fade; + + // Primary: screen-space projected solid angle. + // "How much of the view does this light's influence + // sphere occupy?" Industry standard for many-light + // shadow prioritisation -- see Olsson & Assarsson 2012, + // "Clustered Deferred and Forward Shading"; Wronski + // 2014, "Sample Distribution Shadow Maps"; CryEngine + // shadow LOD docs. Approximates angular radius from + // camera as radius/viewZ; solid angle ~ angularRadius^2. + // The constant factor (screenH / (2*tan(fovY/2)))^2 drops out -- + // it is the same across all lights and doesn't affect ranking. + // + // Edge cases: + // viewZ < -radius : light fully behind camera, coverage=0 + // |viewZ| < radius : light intersects camera plane; + // clamp effectiveZ to avoid blow-up.
+ float coverage = 0.0f; + if (camera) { + auto cp = camera->world.translate; + RE::NiPoint3 fwd = camera->world.rotate.GetVectorY(); + float rx = lp.x - cp.x, ry = lp.y - cp.y, rz = lp.z - cp.z; + float viewZ = fwd.x * rx + fwd.y * ry + fwd.z * rz; + if (viewZ > -lightRadius) { + float effectiveZ = std::max(viewZ, lightRadius * 0.5f); + float angularRadius = lightRadius / effectiveZ; + coverage = angularRadius * angularRadius; + } + } + + // Fallback: Skyrim-style quadratic distance falloff + // from camera/player. Covers two cases where coverage + // alone returns 0 but the user still sees shadows: + // 1. Light just outside the frustum (around a corner) + // illuminating a visible wall. + // 2. Player-held torch behind the camera lighting + // geometry ahead. + // Weighted at 0.3 -- coverage dominates when the light + // is in view, but out-of-view lights still get a floor + // proportional to their illumination at the viewer. + auto computeAtt = [&](const RE::NiPoint3& pos) -> float { + float dx = pos.x - lp.x, dy = pos.y - lp.y, dz = pos.z - lp.z; + float dist2 = dx * dx + dy * dy + dz * dz; + float r2 = lightRadius * lightRadius; + if (dist2 >= r2) + return 0.0f; + float t = dist2 / r2; + float a = 1.0f - t; + return a * a; // matches Skyrim (1-(d/r)^2)^2 falloff + }; + auto* plr = RE::PlayerCharacter::GetSingleton(); + float attCam = camera ? computeAtt(camera->world.translate) : 0.0f; + float attPlr = plr ? 
computeAtt(plr->GetPosition()) : attCam; + float distanceFallback = std::max(attCam, attPlr) * 0.3f; + + importance = effectiveLum * std::max(coverage, distanceFallback); + } + + // Exponential interval scaling: maxScale*(minScale/maxScale)^clamp(importance,0,1) + float kMaxMult = s_settings.ImportanceMaxScale; + float kMinMult = std::min(s_settings.ImportanceMinScale, kMaxMult); + float clampedImp = std::min(importance, 1.0f); + interval *= static_cast(kMaxMult * powf(kMinMult / kMaxMult, clampedImp)); + + FormulaHelper::SetParam(kFormulaParam_LightImportance, static_cast(importance)); + e->RedrawScore = e->LastDrawnFrame + interval; + e->lastImportance = importance; + + // Cached shadow maps: if the geometry hash matches what we + // rendered last time, the shadow map currently in the slot + // is byte-identical to what a fresh re-render would produce. + // No need to redraw -- push the score sky-high so this entry + // loses every budget contest unless literally nothing else + // needs redrawing (defensive: still allow eventual refresh + // against any hashing bugs). + // + // Industry-standard pattern: UE5 "Cached Shadow Maps", + // Frostbite movable-light caching. The hash captures + // (1) light's own pose + radius and (2) each caster's + // worldBound + identity -- both rigid motion and engine- + // updated bounds (BSDynamicTriShape vertex changes update + // worldBound). + e->pendingGeomHash = ComputeShadowGeomHash(e->Light); + if (e->LastDrawnFrame >= 0 && e->lastGeomHash != 0 && + e->pendingGeomHash == e->lastGeomHash) { + e->RedrawScore += 1e15; + } + } + + // Count lights meaningfully illuminating the viewer area. 
+ s_highImportanceLightCount = static_cast( + std::count_if(pending.begin(), pending.end(), + [](const LightEntry* e) { return e->lastImportance > 0.1f; })); + + std::sort(pending.begin(), pending.end(), + [](const LightEntry* a, const LightEntry* b) { return a->RedrawScore < b->RedrawScore; }); + + for (auto* e : pending) { + if (maxRedraw <= 0) + break; + if (budgetRemain <= 0) + break; + int32_t budgetEstimate = s_budget.GetCost(e->Light); + if (isFirst) { + if (!s_lights.Sun || e->Index > 0) + budgetRemain -= budgetEstimate; + maxRedraw--; + e->RedrawFrame = true; + e->LastDrawnFrame = now; + e->lastGeomHash = e->pendingGeomHash; + isFirst = false; + continue; + } + if (budgetEstimate <= budgetRemain) { + budgetRemain -= budgetEstimate; + maxRedraw--; + e->RedrawFrame = true; + e->LastDrawnFrame = now; + e->lastGeomHash = e->pendingGeomHash; + continue; + } + } + } + } + + // Count how many shadow lights are scheduled to redraw this frame. + // Iterate the point-light range (sun-aware: skips pool[0] when Sun=true). + s_redrawnLightsThisFrame = 0; + for (int j = s_lights.PointLightFirst(); j < s_lights.PointLightEnd(s_settings.ShadowLightCount); j++) { + if (s_lights.Lights[j].RedrawFrame) + ++s_redrawnLightsThisFrame; + } + + // EWMA so the UI counter doesn't flicker frame-to-frame. + s_redrawnLightsSmoothed = 0.8f * s_redrawnLightsSmoothed + 0.2f * s_redrawnLightsThisFrame; + + // Atomic per-candidate loop: process each score-sorted candidate to + // completion before moving on. Branch dispatch: + // chosen + RedrawFrame + slot in budget: EnableLight + render + // chosen otherwise: DisableLight (re-added below + // via GameSetShadowCasterSlot) + // excess + ConvertExcessToNormal: ConvertLight + // excess otherwise / invalid: DisableLight + // + // Ordering matters: chosen (rank < ShadowLightCount) runs before any + // excess. 
ConvertLight's ReturnShadowmaps can mutate activeShadowLights + // and free other BSShadowLights, but by then chosen entries have + // already completed EnableLight + budget pairing in-iteration -- no + // later phase walks those pointers. + // + // isUsableLight() per-iteration guard catches dangling pointers if an + // earlier EnableLight invalidated a later candidate via scene mutation. + + auto* shadowSceneNodeRT = &ssn->GetRuntimeData(); + + // Two-stage validity check used before any virtual dispatch on a + // BSShadowLight from s_lights[] or candidates[]: + // (1) Is the pointer still in the scene's activeShadowLights? + // (catches "removed since last frame") + // (2) Is the vtable non-zero? + // (catches "freed and zeroed by tbbmalloc / EngineFixes via a path + // that bypassed BSSmartPointer ref-counting" — the pointer is + // still in activeShadowLights but the object is dead) + // Either failure → caller must skip the light. + auto isAliveNow = [shadowSceneNodeRT, sunLight](RE::BSShadowLight* l) -> bool { + if (!l) + return false; + if (l == sunLight) + return true; + for (auto& sp : shadowSceneNodeRT->activeShadowLights) + if (sp.get() == l) + return true; + return false; + }; + auto isVtableValid = [](RE::BSShadowLight* l) -> bool { + return l && *reinterpret_cast(l) != 0; + }; + auto isUsableLight = [&](RE::BSShadowLight* l) -> bool { + return isAliveNow(l) && isVtableValid(l); + }; + + auto findSlotForLight = [](RE::BSShadowLight* l) -> int { + for (int i = 0; i < s_lights.Size; i++) + if (s_lights.Lights[i].Light == l) + return i; + return -1; + }; + + // Single decision point for "this light won't shadow this frame -- + // Convert (keeps diffuse via cluster pipeline) or Disable (light + // vanishes)?". Used by both the c.invalid and c.excess branches. 
+ // + // Spots always Disable: the engine has no NiSpotLight equivalent, so + // ConvertLight on a BSShadowFrustumLight would make the cone-shaped + // illumination spherical and bleed through walls behind the cone. + // Omnis/hemis Convert when ConvertExcessToNormal is on or a debug + // pin-convert is set on this light. The pin override applies even + // when the user disabled ConvertExcessToNormal globally. + // + // allowConvert is a callsite veto -- the c.invalid path passes it + // false for invalidPortal (cluster has no portal-graph awareness, + // converting would leak light across cells) so portal-occluded + // lights always Disable. + // + // Returns true on Convert, false on Disable, so callers can apply + // path-specific follow-ups (e.g. lodDimmer=1 reset on the invalidLod + // path so the converted light still contributes to clusters). + auto convertOrDisable = [&](RE::BSShadowLight* light, bool allowConvert) -> bool { + const bool isSpot = light->GetIsFrustumLight(); + const bool forceConvert = s_pinConvert.count(reinterpret_cast(light)) > 0; + if (allowConvert && (s_settings.ConvertExcessToNormal || forceConvert) && !isSpot) { + ConvertLight(light, ssn, false); + return true; + } + DisableLight(light); + return false; + }; + + // Sun slot (slot 0) is processed inline below — sun setup happened at the + // top of the function; we only need to mark its mask index here. + if (s_lights.Sun && s_lights.Lights[0].Light && s_lights.Lights[0].RedrawFrame) { + ShadowField(s_lights.Lights[0].Light, maskIndex) = 0; + doneLightCount++; + } + + // Per-candidate Begin/EnableLight/End mutation loop. EnableLight may + // trigger synchronous shadow render dispatches in the engine, so this + // zone captures both our scheduler work and any engine-side rendering + // it pulls in for chosen lights. 
+ { + ZoneNamedN(zoneAtomic, "SCM::AtomicMutationLoop", true); + for (auto& c : candidates) { + if (c.invalid) { + // isUsableLight (membership + vtable) is the same gate the + // excess branch uses. Both ConvertLight and DisableLight + // fan into virtually-dispatched callees (ReturnShadowmaps), + // so a freed-but-canonical pointer must be skipped for + // either path. + if (!isUsableLight(c.light)) + continue; + + // All c.invalid cases route through convertOrDisable. Per the + // Ghidra-verified UpdateCamera analysis above, frustrumCull + // is set both by the genuine sphere-vs-frustum cull AND by + // the shadow-distance LOD cull; treating them uniformly lets + // distant lights past shadow distance still reach the + // cluster pipeline. allowConvert=c.invalidCamera so portal- + // occluded omnis fall to Disable (cluster lighting has no + // portal-graph awareness and would leak across cells). + ZoneNamedN(zCvt, "SCM::Engine::convertOrDisable(invalid)", true); + if (convertOrDisable(c.light, /*allowConvert=*/c.invalidCamera)) { + s_schedDiag.converted_invalid++; + // UpdateCamera zeros lodDimmer alongside frustrumCull + // when its shadow-distance LOD cull fires. The + // cluster lighting builder multiplies light.fade by + // lodDimmer and drops the light if the product falls + // below 1e-4. Restore only when fully zeroed -- any + // smooth fade value the engine set is preserved so + // the cluster contribution fades gradually rather + // than snapping to full intensity. Matches the + // per-frame restore in LightLimitFix::UpdateLights + // for already-converted lights. 
+ if (c.light->lodDimmer == 0.0f) + c.light->lodDimmer = 1.0f; + } else { + s_schedDiag.disabled_invalid++; + } + continue; + } + + if (c.chosen) { + int slot = findSlotForLight(c.light); + if (slot < 0) + continue; // matches old behaviour: chosen-but-no-slot is a no-op + if (slot == 0 && s_lights.Sun) + continue; // sun handled above + + auto& e = s_lights.Lights[slot]; + + // Render-this-frame path is reserved for chosen point-light slots + // (excludes converted slots which start at PointLightEnd). Use + // the sun-aware bound so pool[ShadowLightCount] (the highest + // point-light slot when Sun=true) is included. + if (e.RedrawFrame && slot < s_lights.PointLightEnd(s_settings.ShadowLightCount)) { + // Render-this-frame path. A previous iteration's EnableLight + // may have transitively freed this light via game-side scene + // mutations (membership change OR tbbmalloc-zeroed memory), + // so re-validate before any virtual dispatch. + if (!isUsableLight(e.Light)) { + e.Light = nullptr; + continue; + } + + auto* lightSnapshot = e.Light; // value snapshot for budget pairing + + e.Light->UpdateCamera(camera); + s_budget.BeginLight(lightSnapshot, 0); + { + ZoneNamedN(zEnable, "SCM::Engine::EnableLight", true); + EnableLight(e.Light, camera, ssn, slot); + } + + // EnableLight callbacks can null e.Light (re-entrant scheduling + // / scene mutation), AND the engine can free the BSShadowLight + // during the call without nulling our pointer -- a third-party + // VR crash report (CommunityShaders.dll v1.5.1, file path + // D:\a\skyrim-community-shaders\... at EnableLight's + // `*GetAccumLightSlot() += light->shadowMapCount`) showed the + // engine reading shadowMapCount from a freed BSShadowLight, + // corrupting the global accumLightSlot counter, then a + // downstream `[base + corrupted*8]` AV. Bare null check passes + // for the freed-but-non-null case; isUsableLight rejects it + // via the activeShadowLights-membership and vtable checks. 
+ if (!e.Light || !isUsableLight(e.Light)) + continue; + s_budget.EndLight(lightSnapshot, 0); + + if (auto* nilight = e.Light->light.get()) + e.lastRenderedPos = nilight->world.translate; + + ShadowField(e.Light, maskIndex) = static_cast(slot); + doneLightCount++; + } + // Cached-shadow path (chosen + !RedrawFrame, or i >= ShadowLightCount): + // do nothing here. The non-redrawn light keeps its stale shadow map and + // is re-inserted by the GameSetShadowCasterSlot loop below at endIdx. + // Calling DisableLight here would invoke ReturnShadowmaps, releasing the + // cached shadow data for one frame and producing visible flicker that + // worsens as the budget gets more constrained. + continue; + } + + if (c.excess) { + if (!isUsableLight(c.light)) + continue; + + // Atomic ordering: by the time we reach excess (rank + // >= ShadowLightCount), all chosen lights have completed + // their Begin/EnableLight/End sequence. ConvertLight's + // ReturnShadowmaps side effect can only invalidate + // pointers we are no longer walking. LightLimitFix:: + // UpdateLights then iterates activeShadowLights to pick + // up converted lights for the cluster pipeline. + // + // Rank-drift suppression (a torch's importance score + // bobbing across the chosen/excess boundary frame-to- + // frame) lives in the score formula via the + // lightframessincerender decay term, not here. + ZoneNamedN(zCvt, "SCM::Engine::convertOrDisable(excess)", true); + if (convertOrDisable(c.light, /*allowConvert=*/true)) + s_schedDiag.converted_excess++; + else + s_schedDiag.disabled_excess++; + continue; + } + } + } // end SCM::AtomicMutationLoop + + // Non-redrawn chosen lights: insert at end of shadow caster array without rendering. + // GetAccumLightSlot() already advanced past all EnableLight()-rendered slots. + // + // Re-rebuild the alive set: the atomic loop above may have invalidated + // pointers (e.g. ConvertLight on excess removes from activeShadowLights). 
+ // Skip s_lights entries whose pointer is no longer in the scene to avoid + // dereferencing freed BSShadowLight memory below. + { + ZoneNamedN(zonePostAtomic, "SCM::PostAtomicRevalidate", true); + std::unordered_set aliveAfterAtomic; + { + auto& alive = ssn->GetRuntimeData().activeShadowLights; + aliveAfterAtomic.reserve(alive.size() + 1); + if (sunLight) + aliveAfterAtomic.insert(sunLight); + for (auto& sp : alive) + if (auto* l = sp.get()) + aliveAfterAtomic.insert(l); + } + + int endIdx = (int)*GetAccumLightSlot(); + + for (int i = 0; i < s_lights.Size; i++) { + auto& e = s_lights.Lights[i]; + // Re-insert (without rendering) every chosen+!RedrawFrame light + // AND every converted-slot light (i >= PointLightEnd). The + // PointLightEnd bound is sun-aware so converted slots correctly + // start one slot later when Sun=true. + if (e.Light && (!e.RedrawFrame || i >= s_lights.PointLightEnd(s_settings.ShadowLightCount))) { + // Membership check uses the snapshot built above (a + // game-mutation in the atomic loop may have invalidated + // pointers; aliveAfterAtomic captures the current scene + // state in O(N) for O(1) membership queries here). + if (aliveAfterAtomic.find(e.Light) == aliveAfterAtomic.end()) { + s_schedDiag.reconciliation_clears++; + e.Clear(); + continue; + } + // First-render gate: a chosen light whose slot has never + // been rendered for IT (LastDrawnFrame < 0) has no valid + // shadow content in its kSHADOWMAPS slice -- the depth + // content is either cleared or carries the evicted + // previous occupant's shadow. Inserting the light as a + // shadow caster would make the cluster shader sample stale + // depth and project a wrong shadow shape through the new + // light. Skip insertion this frame; the light still + // illuminates via the cluster pipeline as a non-shadow + // light, with no false shadow. Once it wins a redraw turn + // LastDrawnFrame goes >= 0 and it joins the shadow set + // normally. 
+ // + // Converted-slot range (i >= PointLightEnd) is unaffected: + // converted lights don't sample kSHADOWMAPS via this slot + // path; they participate via the s_normalConvert non-shadow + // pipeline. + if (i < s_lights.PointLightEnd(s_settings.ShadowLightCount) && + e.LastDrawnFrame < 0 && + !(s_lights.Sun && i == 0)) { + s_schedDiag.first_render_skips++; + continue; + } + + // Cached-shadow reuse (the UE5 / CryEngine / Frostbite + // pattern). We unconditionally sample the cached + // kSHADOWMAPS slice even when the geometry hash mismatches + // (light or caster moved since the cached render). For + // small motion the staleness is sub-pixel and invisible; + // for large motion the shadow visibly lags the light by + // 1-2 frames, which is much less objectionable than the + // full-frame on/off flicker that hash-gated suppression + // produces on every animated torch. The hash-mismatch + // priority hint above keeps stale entries at the front of + // the redraw queue, so the lag self-corrects within budget + // cycles. + // + // The first_render_skips gate above is the only safety + // gate that DOES suppress insertion: a slot with no + // rendered content for its current owner (LastDrawnFrame + // < 0) has no valid cached shadow to fall back on; the + // GPU slice is either cleared or contains an evicted + // previous occupant. Hash mismatch on an existing slice + // is at worst a small visual lag. + // GameSetShadowCasterSlot calls Accumulate virtually; reuse + // isUsableLight's vtable guard to catch tbbmalloc-zeroed + // objects that are still in activeShadowLights but freed. + if (!isVtableValid(e.Light)) { + e.Light = nullptr; + continue; + } + GameSetShadowCasterSlot(ssn, e.Light, endIdx, 1); + // Same hazard as the post-EnableLight site: the engine can + // free the light during this call. Use isUsableLight, not + // just null check. 
+ if (!e.Light || !isUsableLight(e.Light)) + continue; + endIdx += e.Light->shadowMapCount; + ShadowField(e.Light, maskIndex) = static_cast(i); + + // GameSetShadowCasterSlot (via Accumulate) overwrites shadowmapIndex + // with the sequential endIdx counter, diverging from the stable + // container-slot index that CopyShadowLightData and Prepass expect. + // All shadow-slot light types are affected: + // Spot (!IsParabolicLight): 1 descriptor, 1 atlas slice. + // Hemi (IsParabolicLight && !IsOmniLight): 1 descriptor, 1 atlas slice. + // Omni (IsParabolicLight && IsOmniLight): both paraboloids packed into + // a single atlas slice via UV splitting in GetOmnidirectionalShadow, + // so all descriptors should also point to i. + // Restore shadowmapIndex = i for every non-redrawn shadow-slot light. + // Only restore shadowmapIndex for point-light slots (skip converted). + // PointLightEnd accounts for sun bookkeeping so the highest point-light + // slot (Sun=true: pool[ShadowLightCount]) is included. + if (s_settings.ShadowLightCount > 4 && i < s_lights.PointLightEnd(s_settings.ShadowLightCount)) { + // Restore descriptor.shadowmapIndex for cached (non-redrawn) + // chosen lights so RenderCascade samples their preserved + // depth slice. Sun (pool[0] when Sun=true) is skipped — + // it renders via the directional cascade path, not + // kSHADOWMAPS, so its descriptor.shadowmapIndex is unused. + if (s_lights.Sun && i == 0) + continue; + if (REL::Module::IsVR()) { + for (auto& desc : e.Light->GetVRRuntimeData().shadowmapDescriptors) + desc.shadowmapIndex = static_cast(i); + } else { + for (auto& desc : e.Light->GetRuntimeData().shadowmapDescriptors) + desc.shadowmapIndex = static_cast(i); + } + } + } + } + } + // Update rolling redraw and budget statistics. 
+ { + int redrawing = 0; + int32_t consumed = 0; + for (int i = 0; i < s_lights.Size; i++) { + auto& e = s_lights.Lights[i]; + if (e.Light && e.RedrawFrame) { + if (i != 0 || !s_lights.Sun) + consumed += s_budget.GetCost(e.Light); + redrawing++; + } + } + s_redrawSum -= s_redrawHistory[s_redrawHistoryPos]; + s_redrawHistory[s_redrawHistoryPos] = redrawing; + s_redrawSum += redrawing; + s_redrawHistoryPos = (s_redrawHistoryPos + 1) % kRedrawHistorySize; + + s_budgetSum -= s_budgetHistory[s_budgetHistoryPos]; + s_budgetHistory[s_budgetHistoryPos] = consumed; + s_budgetSum += consumed; + s_budgetHistoryPos = (s_budgetHistoryPos + 1) % kRedrawHistorySize; + } + + ssn->GetRuntimeData().firstPersonShadowMask = *GetShadowMask(); + *GetFrameLightCount() = static_cast(doneLightCount); + + // ===================================================================== + // Tracy per-frame plots: scheduler diagnostic counters + live config. + // Emitting both in the same frame lets a capture be queried for A/B + // behaviour without re-running the game: the cfg_* plots are the + // independent variables, the scm.* plots are the dependent outcomes. + // ===================================================================== + { + // Sample slot occupancy at frame end (post-reconciliation). 
+ for (int i = 0; i < s_lights.Size; i++) + if (s_lights.Lights[i].Light) + s_schedDiag.slots_in_use++; + + TracyPlot("scm.candidates.total", (int64_t)s_schedDiag.candidates_total); + TracyPlot("scm.candidates.chosen", (int64_t)s_schedDiag.candidates_chosen); + TracyPlot("scm.candidates.excess", (int64_t)s_schedDiag.candidates_excess); + TracyPlot("scm.candidates.invalid_camera", (int64_t)s_schedDiag.candidates_invalid_camera); + TracyPlot("scm.candidates.invalid_portal", (int64_t)s_schedDiag.candidates_invalid_portal); + TracyPlot("scm.candidates.invalid_frustum", (int64_t)s_schedDiag.candidates_invalid_frustum); + TracyPlot("scm.candidates.invalid_lod", (int64_t)s_schedDiag.candidates_invalid_lod); + TracyPlot("scm.candidates.invalid_other", (int64_t)s_schedDiag.candidates_invalid_other); + TracyPlot("scm.converted.invalid", (int64_t)s_schedDiag.converted_invalid); + TracyPlot("scm.converted.excess", (int64_t)s_schedDiag.converted_excess); + TracyPlot("scm.disabled.invalid", (int64_t)s_schedDiag.disabled_invalid); + TracyPlot("scm.disabled.excess", (int64_t)s_schedDiag.disabled_excess); + TracyPlot("scm.reconciliation.clears", (int64_t)s_schedDiag.reconciliation_clears); + TracyPlot("scm.slots.in_use", (int64_t)s_schedDiag.slots_in_use); + TracyPlot("scm.first_render_skips", (int64_t)s_schedDiag.first_render_skips); + + // Live config plots — record the *current* settings on each frame so + // a single capture spanning a settings change captures both sides. + TracyPlot("cfg.ShadowLightCount", (int64_t)s_settings.ShadowLightCount); + TracyPlot("cfg.MaxRedrawPerFrame", (int64_t)s_settings.MaxRedrawPerFrame); + TracyPlot("cfg.ConvertExcessToNormal", (int64_t)(s_settings.ConvertExcessToNormal ? 1 : 0)); + TracyPlot("cfg.Enabled", (int64_t)(s_settings.Enabled ? 
1 : 0)); + TracyPlot("cfg.RedrawBudgetMs", (double)s_settings.RedrawBudgetMs); + } + } + + // ========================================================================= + // Render hook: replaces RenderActiveShadowCasterLights + // Iterates s_lights and calls Render() on lights flagged RedrawFrame. + // Uses install_context_hook at a specific call site in the render loop (see Install()). + // ========================================================================= + + static void RenderScheduledShadowLights() + { + // VR: RenderActiveShadowCasterLights normally saves+clears g_drawStereo before + // iterating shadow casters, then restores it. Without this, each hemisphere + // render is doubled for both eyes -> 4-quadrant shadow map texture. + bool savedStereo = false; + if (REL::Module::IsVR()) { + savedStereo = *globals::game::drawStereo; + *globals::game::drawStereo = false; + } + + ZoneScopedN("SCM::RenderScheduledShadowLights"); + auto* state = globals::state; + state->BeginPerfEvent("SCM::RenderScheduledShadowLights"); +#ifdef TRACY_ENABLE + TracyD3D11Zone(state->tracyCtx, "SCM::RenderScheduledShadowLights"); +#endif + + s_budget.Begin(1); + + uint32_t tmp = 0; + // Sun first: BSShadowDirectionalLight::Render emits the "Directional + // Light Shadowmaps" marker and writes the cascade depth maps to + // kSHADOWMAPS_ESRAM. The engine's vanilla RenderActiveShadowCasterLights + // dispatches this via the same vtable walk it uses for point lights; + // we replaced that walk with this loop, so we need to call sun.Render + // explicitly. Without this, the directional cascade pass is skipped + // and exterior scenes render with no sun shadow. 
+ if (s_lights.Sun && s_lights.Lights[0].Light) { + ZoneNamedN(zSun, "SCM::Render::Sun", true); +#ifdef TRACY_ENABLE + TracyD3D11Zone(state->tracyCtx, "SCM::Render::Sun"); +#endif + s_budget.BeginLight(s_lights.Lights[0].Light, 1); + s_lights.Lights[0].Light->Render(tmp); + s_budget.EndLight(s_lights.Lights[0].Light, 1); + } + + // Point lights from PointLightFirst onwards. PointLightFirst skips + // slot 0 (handled above when Sun=true). PointLightEnd includes the + // highest point-light slot when Sun=true. + { + ZoneNamedN(zPoint, "SCM::Render::PointLights", true); +#ifdef TRACY_ENABLE + TracyD3D11Zone(state->tracyCtx, "SCM::Render::PointLights"); +#endif + for (int i = s_lights.PointLightFirst(); i < s_lights.PointLightEnd(s_settings.ShadowLightCount); i++) { + auto& e = s_lights.Lights[i]; + if (!e.Light || !e.RedrawFrame) + continue; + s_budget.BeginLight(e.Light, 1); + e.Light->Render(tmp); + s_budget.EndLight(e.Light, 1); + } + } + + state->EndPerfEvent(); + + if (REL::Module::IsVR()) + *globals::game::drawStereo = savedStereo; + } + + // Replaces the call to RenderActiveShadowCasterLights. + // install_context_hook (RtlRestoreContext) is required so all volatile registers (r8, etc.) + // are restored before the game continues past the patched call site. + // + // Non-VR (SE/AE): set ctx.Rax = 0 so the conditional between 107133+0x192 and + // +0x1AE skips "call [r8+0x50]" -- r8 is loaded from rax there; if rax != 0, + // r8 gets a stale pointer whose [+0x50] slot is null -> crash at execute 0x0. + static void Hook_RenderShadowLights(CONTEXT& ctx) + { + if (!REL::Module::IsVR()) + ctx.Rax = 0; + RenderScheduledShadowLights(); + }; + + // Hook struct for stl::detour_thunk. + // + // `s_settings.Enabled` is now a BOOT-TIME flag only -- toggling at + // runtime has no effect on this thunk, the same way ShadowLightCount + // and atlas texture sizes are restart-gated. See Init() at the + // settings.Enabled early-return for the boot-time gate. 
+ // + // Rationale (Ghidra-verified by crash 2026-05-17 20:31:12): the AV + // at BSBatchRenderer::sub_SE100843_AE107633 +0x54 + // (`mov rax, [r14+0x48]`, r14=1 = vfunc bool returned as pointer) + // is reached via: + // NiCamera::CalculateAndDrawShadowCasterLights + // -> CalculateActiveShadowCasterLights (the engine's vanilla + // scheduler -- what we'd + // route to on disable) + // -> BSShadowDirectionalLight::sub_SE100818_AE107602 (sun + // shadow) + // -> FUN_1414bf320 (BSCullingProcess inner) + // -> BSCullingProcess::sub + // -> FUN_1414f50d0 + // -> BSBatchRenderer::sub_SE100843_AE107633 (AV) + // + // The crash is in the vanilla scheduler itself. SCM's boot-time + // modifications (kSHADOWMAPS texture sized to ShadowLightCount, + // depth-buffer creation loop redirected via Hook_CreateNormalDepthBuffer + // and Hook_CreateReadOnlyDepthBuffer, screen-space mask pass wrapped + // by Hook_RenderShadowLightsWithUtilityShader) make the engine state + // incompatible with + // the vanilla traversal even when our runtime tracking is left + // untouched (soft-disable still crashed). The deep engine hooking + // is not safely reversible at runtime; restart is the only safe + // way to revert to vanilla. + struct Hook_CalculateActiveShadowCasters + { + static void thunk() + { + ScheduleShadowCasters(); + } + static inline REL::Relocation func; + }; + + // ========================================================================= + // Surface lights hook + // Replaces CalculateActiveNonShadowCasterLights (ID 100997/107784). + // Uses install_context_hook because the function has 10 args (11 in VR) + // with VR-specific stack layout -- CONTEXT is the simplest cross-runtime approach. 
+ // ========================================================================= + + static void Hook_CalculateActiveLightsForSurface(CONTEXT& ctx) + { + // Args from registers/stack (x64 fastcall, shadow space at RSP+0x00..0x20): + auto* lightData = reinterpret_cast(ctx.Rcx); // a1 + auto** lights = reinterpret_cast(ctx.Rdx); // a2 + int maxCount = static_cast(ctx.R8); // a3 + int* shadowCount = reinterpret_cast(ctx.R9); // a4 + auto* ssn = *reinterpret_cast(ctx.Rsp + 0x28); // a5 + auto* shaderProp = *reinterpret_cast(ctx.Rsp + 0x30); // a6 + bool addShadow = *reinterpret_cast(ctx.Rsp + 0x38); // a7 + bool* useShadowSun = *reinterpret_cast(ctx.Rsp + 0x40); // a8 + bool firstPerson = *reinterpret_cast(ctx.Rsp + 0x48); // a9 + uint32_t fpMask = *reinterpret_cast(ctx.Rsp + 0x50); // a10 + + // VR passes an 11th arg: if non-zero, skip accumulation (vanilla early-out). + if (REL::Module::IsVR() && *reinterpret_cast(ctx.Rsp + 0x58) != 0) { + ctx.Rax = 1; // addedLightCount = sun only + return; + } + + // Determine the sun light for this surface. + RE::BSLight* sunLight; + if (*useShadowSun) + sunLight = ssn->GetRuntimeData().sunShadowDirLight; + else + sunLight = ssn->GetRuntimeData().sunLight; + if (shaderProp->flags.any(RE::BSShaderProperty::EShaderPropertyFlag::kCloudLOD)) + sunLight = ssn->GetRuntimeData().cloudLight; + + lights[0] = sunLight; + *shadowCount = 0; + int added = 1; + + if (addShadow) { + auto& casters = ssn->GetRuntimeData().shadowLightsAccum; + + // Step 1: vanilla shadow lights gated by activeLightMask / first-person mask. 
+ for (uint32_t slot = 0; slot < casters.size() && added < maxCount; slot++) { + uint32_t bit = 1u << slot; + if (!((firstPerson && (fpMask & bit)) || (lightData->activeLightMask & bit))) + continue; + auto* sl = reinterpret_cast(casters[slot]); + if (!sl || sl == sunLight) + continue; + if (GameIsLightAffectingSurface(shaderProp, sl)) { + lights[added++] = sl; + (*shadowCount)++; + } + } + + // Step 2: extended pool lights not covered by the vanilla mask. + // Only inject lights that are present in this scene's caster array + // (prevents world lights leaking into menu / special scenes). + // Iterate the point-light range (sun-aware via PointLightFirst / + // PointLightEnd; pre-helper loops missed pool[ShadowLightCount] + // when Sun=true, dropping one shadow caster from per-surface lists). + for (int i = s_lights.PointLightFirst(); i < s_lights.PointLightEnd(s_settings.ShadowLightCount) && added < maxCount; i++) { + auto& e = s_lights.Lights[i]; + if (!e.Light || reinterpret_cast(e.Light) == sunLight) + continue; + + bool inScene = false; + for (uint32_t s = 0; s < casters.size() && !inScene; s++) + if (reinterpret_cast(casters[s]) == reinterpret_cast(e.Light)) + inScene = true; + if (!inScene) + continue; + + bool alreadyAdded = false; + for (int j = 1; j < added && !alreadyAdded; j++) + if (lights[j] == reinterpret_cast(e.Light)) + alreadyAdded = true; + if (alreadyAdded) + continue; + + if (GameIsLightAffectingSurface(shaderProp, reinterpret_cast(e.Light))) { + lights[added++] = reinterpret_cast(e.Light); + (*shadowCount)++; + } + } + } + + // Step 3: non-shadow lights from the per-surface accumulation list. + // Skip parabolic shadow-casters (frustrumCull == 0xFF) and hidden NiLights. 
+ for (uint32_t i = 0; i < lightData->lights.size() && added < maxCount; i++) { + auto* l = lightData->lights[i]; + if (!l || l == sunLight) + continue; + auto* ni = l->light.get(); + if (ni && (l->frustrumCull == 0xFFu || ni->GetFlags().any(RE::NiAVObject::Flag::kHidden))) + continue; + lights[added++] = l; + } + + // Step 4: Inject converted shadow lights (s_normalConvert, issue #2121 #3) + // into the per-surface lights array. These lights have frustrumCull == 0xFF + // (parabolic shadow-caster marker) and are skipped by Step 3, while Steps + // 1/2 don't include them either (ReturnShadowmaps cleared shadowLightsAccum). + // + // The cluster pipeline picks them up separately via LightLimitFix::UpdateLights' + // activeShadowLights iteration; this Step 4 ensures the engine's vanilla + // strict-light loop (which consumes lights[] passed to this function) also + // sees them so non-LLF code paths and shaders without LIGHT_LIMIT_FIX still + // receive the diffuse contribution. + for (auto& c : s_normalConvert) { + if (added >= maxCount) + break; + auto* l = reinterpret_cast(c.light); + if (!l || l == sunLight) + continue; + auto* ni = l->light.get(); + if (!ni || ni->GetFlags().any(RE::NiAVObject::Flag::kHidden)) + continue; + + // Skip if already added in any prior step. + bool alreadyAdded = false; + for (int j = 1; j < added && !alreadyAdded; j++) + if (lights[j] == l) + alreadyAdded = true; + if (alreadyAdded) + continue; + + if (GameIsLightAffectingSurface(shaderProp, l)) + lights[added++] = l; + // Note: do NOT increment *shadowCount; this is a non-shadow contribution. + } + + ctx.Rax = static_cast(added); + } + + // ========================================================================= + // Light conversion hooks + // + // BSShadowLight::IsShadowLight (VFT slot 3): returns false for lights in + // s_normalConvert so the engine treats them as normal (non-shadow) lights + // during the geometry-shader/stencil shadow-masking pass. 
+ // + // RemoveLight / AddLight / SetLight hooks maintain s_normalConvert and + // s_shadowConvert so the lists stay consistent with scene changes. + // ========================================================================= + + static bool Hook_IsShadowLight(RE::BSShadowLight* light) + { + for (auto& c : s_normalConvert) + if (c.light == light) + return false; + return true; + } + + // Fires at start of ShadowSceneNode::RemoveLight (ID 99697/106331). + static void Hook_ConvertLights_Remove(CONTEXT& ctx) + { + auto* ssn = reinterpret_cast(ctx.Rcx); + auto* light = reinterpret_cast(ctx.Rdx); + if (ssn != GetShadowSceneNode()) + return; + for (auto it = s_normalConvert.begin(); it != s_normalConvert.end(); ++it) { + auto* nl = it->light->light.get(); + if (nl && nl == light) { + GameClearGeometryList(it->light); + s_normalConvert.erase(it); + break; + } + } + if (light) + s_shadowConvert.erase(light); + } + + // Fires at start of ShadowSceneNode::AddLight (ID 99692/106326). + // Optionally promotes normal light to shadow light; always forces portal-strict. + static void Hook_ConvertLights_Add(CONTEXT& ctx) + { + auto* ssn = reinterpret_cast(ctx.Rcx); + auto* light = reinterpret_cast(ctx.Rdx); + auto* p = reinterpret_cast(ctx.R8); + if (ssn != GetShadowSceneNode() || !light || !p) + return; + + if (s_settings.PromoteNormalToShadow && !p->shadowLight) { + p->shadowLight = true; + p->fov = 6.2831855f; + p->dynamic = true; + p->restrictedNode = nullptr; + p->falloff = 1.0f; + p->depthBias = 1.0f; + p->nearDistance = (light->GetLightRuntimeData().radius.x / 512.0f) * 219.6356f; + s_shadowConvert.insert(light); + } + // Portal-strict policy by shadow type. 
The engine picks the concrete + // shadow class (BSShadowParabolicLight / BSShadowHemisphereLight / + // BSShadowFrustumLight) based on the FOV in LIGHT_CREATE_PARAMS: + // fov >= ~2pi -> dual-paraboloid omni + // fov >= ~pi -> hemisphere + // fov < ~pi -> perspective spot/frustum + // Tightening portal-strict on omnis/hemis usefully exercises the + // portal-graph visibility test; doing it on spots drops culled-but- + // visible spots entirely (the cone test rejects spots whose origin + // sits behind a portal even when the cone sweeps into a visible + // room). Honour the per-type toggle so users can A/B easily. + constexpr float kFovHemiThreshold = 3.0f; // ~pi + constexpr float kFovOmniThreshold = 6.0f; // ~2pi + bool enforce = false; + if (p->fov >= kFovOmniThreshold) + enforce = s_settings.ForceEnablePortalStrictOmni; + else if (p->fov >= kFovHemiThreshold) + enforce = s_settings.ForceEnablePortalStrictHemi; + else + enforce = s_settings.ForceEnablePortalStrictSpot; + if (enforce) + p->portalStrict = true; + } + + // Fires at start of BSLight::SetLight (ID 101302/108289). + // Tracks NiLight pointer reassignments in s_shadowConvert. + static void Hook_ConvertLights_SetLight(CONTEXT& ctx) + { + auto* bslight = reinterpret_cast(ctx.Rcx); + auto* nilight = reinterpret_cast(ctx.Rdx); + if (!bslight) + return; + auto* oldlight = bslight->light.get(); + if (oldlight && oldlight != nilight) { + bool did = s_shadowConvert.erase(oldlight) != 0; + if (nilight && did) + s_shadowConvert.insert(nilight); + } + } + + // ========================================================================= + // Stealth detection fix + // + // GetLightLevel (AIProcess::CalculateLightValue, ID 38900/39946) uses the + // engine shadow-light iteration internally. When we replace shadow caster + // selection, the vanilla per-light affect-player loop no longer sees our + // chosen lights correctly. 
We replace it with our own pass that iterates + // activeShadowLights and calls IsLightAffectingActor() directly. + // ========================================================================= + + // Temporary set of lights that affect the player -- populated each frame + // in Hook_UpdateLightLevelPlayer, consumed in Hook_CheckLightLevelPlayer. + static std::set<uintptr_t> s_stealthDetectionTmp; + + static void* GetUnkDetectionGlobal() + { + // SE: 142F6DB98 -- a ~80-byte detection struct; GetSingleton equivalent + static REL::RelocationID uid(518074, 404596); + return *reinterpret_cast<void**>(uid.address()); + } + + static bool IsLightAffectingActor(RE::BSShadowLight* light, RE::Actor* actor, RE::NiPoint3* pos) + { + // SE: 14071A380 (ID 41661) + using F = bool (*)(void*, RE::BSShadowLight*, RE::Actor*, RE::NiPoint3*); + static REL::Relocation<F> func{ REL::RelocationID(41661, 42744) }; + return func(GetUnkDetectionGlobal(), light, actor, pos); + } + + // Replaces the vanilla shadow-light-affect-player loop. + // RBP-33 holds the player's position (NiPoint3*). + static void Hook_UpdateLightLevelPlayer(CONTEXT& ctx) + { + auto* pos = reinterpret_cast<RE::NiPoint3*>(ctx.Rbp - 33); + auto* player = RE::PlayerCharacter::GetSingleton(); + + s_stealthDetectionTmp.clear(); + auto* ssn = GetShadowSceneNode(); + if (!ssn) + return; + + for (auto& sp : ssn->GetRuntimeData().activeShadowLights) { + auto* l = sp.get(); + if (!l) + continue; + auto* ni = l->light.get(); + if (!ni || ni->GetFlags().any(RE::NiAVObject::Flag::kHidden)) + continue; + if (IsLightAffectingActor(l, player, pos)) + s_stealthDetectionTmp.insert(reinterpret_cast<uintptr_t>(l)); + } + } + + // Per-light check inside the vanilla affect-player path. + // If the light is not in our set, skip the branch (ctx.Rip += 0x16). + // Note: Execute() sets ctx.Rip = resumeAddr BEFORE calling this, so + // ctx.Rip += 0x16 skips 0x16 bytes past the hook site -- correct.
+ static void Hook_CheckLightLevelPlayer(CONTEXT& ctx) + { + auto* light = reinterpret_cast(ctx.Rcx); + if (s_stealthDetectionTmp.find(reinterpret_cast(light)) == s_stealthDetectionTmp.end()) + ctx.Rip += 0x16; + } + + // ========================================================================= + // Public API + // ========================================================================= + + void Init(const Settings& settings) + { + s_settings = settings; + + // Check for external shadow management plugins that conflict with our hooks. + if (GetModuleHandleW(L"intellightent-ng.dll")) { + s_externalConflict = true; + s_conflictMessage = + "Disabled: intellightent-ng.dll detected. Both mods manage shadow caster " + "selection and cannot run simultaneously. Remove one to use the other."; + logger::warn("[SCM] {}", s_conflictMessage); + return; + } + + int total = LightContainerSize(settings); + s_lights.Size = total; + s_lights.Sun = false; + s_lights.Lights = new LightEntry[total](); + for (int i = 0; i < total; i++) + s_lights.Lights[i].Index = i; + + // Seed auto-budget ring buffer to 60 fps so the first few frames have sane values. + std::fill(std::begin(s_ftRing), std::end(s_ftRing), 16.67f); + s_ftEMA = 16.67f; + + // Parse formula strings + if (!settings.ScoreFormula.empty()) { + s_formulaScore = std::make_unique(); + if (!s_formulaScore->Parse(settings.ScoreFormula)) + logger::error("[SCM] Failed to parse ScoreFormula"); + } + if (!settings.RedrawIntervalFormula.empty()) { + s_formulaRedrawInterval = std::make_unique(); + if (!s_formulaRedrawInterval->Parse(settings.RedrawIntervalFormula)) + logger::error("[SCM] Failed to parse RedrawIntervalFormula"); + } + if (!settings.RedrawBudgetFormula.empty()) { + s_formulaRedrawBudget = std::make_unique(); + if (!s_formulaRedrawBudget->Parse(settings.RedrawBudgetFormula)) + logger::error("[SCM] Failed to parse RedrawBudgetFormula"); + } + } + + // Set by the resolution combo when the user picks a new tier. 
Gates the + // SaveINISettings write so we only touch SkyrimPrefs.ini when there's an + // actual change to persist -- without this, every Save Settings click + // would rewrite the user's prefs file even if shadow res wasn't edited. + static bool s_shadowResolutionDirty = false; + + void LoadINISettings() + { + // No-op: the engine already loaded SkyrimPrefs.ini at startup, so the + // live RE::Setting reflects the user's saved value. Future overrides + // that need to land before SCM::Install should hook in here. + } + + void SaveINISettings() + { + if (!s_shadowResolutionDirty) + return; + auto* prefColl = RE::INIPrefSettingCollection::GetSingleton(); + if (!prefColl) + return; + auto* setting = prefColl->GetSetting("iShadowMapResolution:Display"); + if (!setting) + return; + + // The engine's INIPrefSettingCollection::WriteSetting requires + // OpenHandle to have been called first (it writes via the cached + // `handle` member, which is null between RefreshINI calls). Calling + // it directly returns true but silently no-ops -- verified by the + // fact that the live RE::Setting updates but SkyrimPrefs.ini's + // timestamp doesn't change after Save Settings. + // + // Sidestep the engine path entirely with WritePrivateProfileStringA. + // CommonLib stores the full path of SkyrimPrefs.ini in subKey at + // startup (see InitializeSkyrimINIPrefSettingCollection caller at + // SE 1406489e6 / AE 140648990 / VR equivalent -- it concatenates the + // Documents path with "SkyrimPrefs.ini"). The setting name encodes + // "<key>:<section>" -- "iShadowMapResolution:Display" means + // [Display]\niShadowMapResolution=N. + const char* fullName = setting->GetName(); + const char* colon = std::strchr(fullName, ':'); + if (!colon) { + logger::warn("[SCM] Setting name '{}' has no section -- cannot write to INI", fullName); + s_shadowResolutionDirty = false; + return; + } + const std::string key(fullName, colon - fullName); + const std::string section(colon + 1); + const std::string value = std::to_string(setting->GetInteger()); + + // subKey holds the full path to SkyrimPrefs.ini. + const char* iniPath = prefColl->subKey; + if (!iniPath || !iniPath[0]) { + logger::warn("[SCM] INIPrefSettingCollection subKey is empty -- cannot write to INI"); + s_shadowResolutionDirty = false; + return; + } + + if (::WritePrivateProfileStringA(section.c_str(), key.c_str(), value.c_str(), iniPath)) { + // Windows caches INI writes in-process; the file on disk doesn't + // update until the cache is flushed. Calling WritePrivateProfile + // with three NULL parameters forces the flush. Without this the + // write succeeds (returns non-zero, no error) but the file's + // timestamp and contents stay stale until the process exits. + // See KB Q104112 / MSDN remarks for WritePrivateProfileString. + ::WritePrivateProfileStringA(nullptr, nullptr, nullptr, iniPath); + logger::info("[SCM] Persisted [{}]{}={} to {}", section, key, value, iniPath); + } else { + const DWORD err = ::GetLastError(); + logger::warn("[SCM] WritePrivateProfileStringA failed (err={}) writing [{}]{}={} to {}", + err, section, key, value, iniPath); + } + s_shadowResolutionDirty = false; + } + + // Boot-time value of settings.Enabled, captured once in Install() and + // never modified afterwards. The ImGui "Restart required" label + // compares the user's current setting against this rather than the + // (mutable) s_settings.Enabled, so the label persists across Save + // Settings clicks until the user actually restarts.
Without this the + // label vanished the moment they saved -- s_settings would catch up + // to the new value and the !=-against-staged condition cleared. + static bool s_bootEnabled = false; + static bool s_bootEnabledCaptured = false; + + void Install(const Settings& settings) + { + s_settings = settings; + s_installedShadowLightCount = settings.ShadowLightCount; + // kSHADOWMAPS is point/spot only -- the sun renders to a separate + // kSHADOWMAPS_ESRAM texture (cascade descriptors live there, not + // here). So the engine allocates exactly ShadowLightCount slices + // in kSHADOWMAPS; no +1 for the sun. + s_requestedSlotCount = static_cast(settings.ShadowLightCount); + + // One-shot capture of the boot Enabled value. Install() is called + // once at startup, but guard anyway in case it's ever re-invoked. + if (!s_bootEnabledCaptured) { + s_bootEnabled = settings.Enabled; + s_bootEnabledCaptured = true; + } + + if (s_externalConflict) + return; + + if (!settings.Enabled) { + logger::info("[SCM] Shadow caster manager disabled -- skipping hook installation."); + return; + } + + bool extended = settings.ShadowLightCount > 4; + bool needExtraBuffers = settings.ShadowLightCount > 8; + + // ---- Extended depth buffer infrastructure ------------------------- + + if (needExtraBuffers) { + globals::features::llf::normalDepthBuffer = new void*[settings.ShadowLightCount + 1](); + globals::features::llf::readOnlyDepthBuffer = new void*[settings.ShadowLightCount + 1](); + + // Patch the creation-loop count from 8 to ShadowLightCount. + // SE/VR: pattern "C7 44 24 68 08 00 00 00" (+4 = the imm32 0x00000008) + // AE: same pattern at different offset + // + // The instruction encodes a 32-bit immediate; we overwrite all four + // bytes so values >255 don't silently truncate (a single-byte write + // to the low byte would leave higher bytes stale, capping us at 255 + // while making the cap silent). 
+ { + static REL::RelocationID uid(100458, 107175); + uintptr_t addr = uid.address() + REL::Relocate(0xD326 - 0xC940, 0xBF6 - 0x210, 0xc91); + int immOff = REL::Relocate(4, 4, 3); + uint32_t newCount = static_cast(settings.ShadowLightCount); + REL::safe_write(addr + immOff, &newCount, sizeof(newCount)); + } + + // Redirect depth-buffer pointer storage in the creation loop. + { + // Normal DSV creation: SE 140D6AB52 / VR 140DBCA00 + static REL::RelocationID uid(75469, 77255); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0xB52 - 0x9E0, 0x2EB - 0x180, 0x1a0); + int sz = REL::Relocate(7, 7, 8); + if (!SKSE::stl::install_context_hook(base + off, sz, Hook_CreateNormalDepthBuffer, sz)) + logger::error("[SCM] Failed to install Hook_CreateNormalDepthBuffer"); + } + { + // ReadOnly DSV creation: SE 140D6AB71 / VR 140DBCA24 + static REL::RelocationID uid(75469, 77255); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0xB71 - 0x9E0, 0x2FC - 0x180, 0x1c4); + int sz = REL::Relocate(8, 7, 7); + if (!SKSE::stl::install_context_hook(base + off, sz, Hook_CreateReadOnlyDepthBuffer, sz)) + logger::error("[SCM] Failed to install Hook_CreateReadOnlyDepthBuffer"); + } + + // Sync the first 8 slots into the game's own DepthStencilData array. + { + // SE 140D6AC00 / VR 140DBCAB0 + static REL::RelocationID uid(75469, 77255); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0xC00 - 0x9E0, 0x384 - 0x180, 0x250); + if (!SKSE::stl::install_context_hook(base + off, 8, Hook_SetupGameArray, 8)) + logger::error("[SCM] Failed to install Hook_SetupGameArray"); + } + + // Depth-buffer selection at draw time. 
+ { + // SE 140D70444 + static REL::RelocationID uid(75580, 77386); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0x444 - 0x2F0, 0x704 - 0x5B0, 0x1c3); + if (!SKSE::stl::install_context_hook(base + off, 21, Hook_SelectDepthBuffer1)) + logger::error("[SCM] Failed to install Hook_SelectDepthBuffer1"); + } + { + // SE 140D6A1A5 / VR 140DBBFFC + static REL::RelocationID uid(75462, 77247); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0x1A5 - 0x070, 0x985 - 0x850, 0x19c); + int sz = REL::Relocate(10, 10, 0x2e); + if (!SKSE::stl::install_context_hook(base + off, sz, Hook_SelectDepthBuffer2)) + logger::error("[SCM] Failed to install Hook_SelectDepthBuffer2"); + } + + // Release extended buffers at renderer shutdown. + // SE: ZeroDepthStencilData; AE/VR: Renderer::Shutdown and related dtor paths. + if (REL::Module::GetRuntime() != REL::Module::Runtime::AE) { + // SE + VR share the same pattern. + static REL::RelocationID uid(75628, 0 /*AE unused*/); + uintptr_t addr = uid.address() + (0xE27 - 0xDD0); + if (!SKSE::stl::install_context_hook(addr, 9, Hook_DeleteDepthBuffers_SE, -9)) + logger::error("[SCM] Failed to install Hook_DeleteDepthBuffers_SE"); + } else { + // AE has three separate shutdown paths. 
+ static REL::RelocationID uid1(0, 77228); + if (!SKSE::stl::install_context_hook(uid1.address() + (0x3195 - 0x2E10), 7, Hook_DeleteDepthBuffers_AE, 7)) + logger::error("[SCM] Failed to install Hook_DeleteDepthBuffers_AE (path 1)"); + + static REL::RelocationID uid2(0, 77237); + if (!SKSE::stl::install_context_hook(uid2.address() + (0x3B8C - 0x34A0), 7, Hook_DeleteDepthBuffers_AE, 7)) + logger::error("[SCM] Failed to install Hook_DeleteDepthBuffers_AE (path 2)"); + + static REL::RelocationID uid3(0, 77238); + if (!SKSE::stl::install_context_hook(uid3.address() + (0x3E79 - 0x3BC0), 6, Hook_DeleteDepthBuffers_AE, -6)) + logger::error("[SCM] Failed to install Hook_DeleteDepthBuffers_AE (path 3)"); + } + } + + // Expanded accumulated-lights array (needed when ShadowLightCount > 4). + if (extended) { + // SE: BSShadowFrustumLight accumulation setup + static REL::RelocationID uid(99686, 106320); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0xFCA4 - 0xF950, 0xF05 - 0xBB0, 0x387); + if (!SKSE::stl::install_context_hook(base + off, 5, Hook_AccumulatedLightsArray, 5)) + logger::error("[SCM] Failed to install Hook_AccumulatedLightsArray"); + } + + // Force per-light shadow map slot assignment. + // Required whenever our temporal scheduler is active (ShadowLightCount >= 4): + // RenderCascade recalculates the slot from a global counter each call; without + // this hook, a light not redrawn this frame gets a different slot than last + // frame and corrupts another light's shadow map. + { + // SE: RenderCascade+0xBE; VR: RenderCascade+0xE0 + static REL::RelocationID uid(100820, 107604); + uintptr_t base = uid.address(); + uintptr_t off = REL::Relocate(0xA9E - 0x9E0, 0xDB0 - 0xCF0, 0xe0); + if (!SKSE::stl::install_context_hook(base + off, 0x25, Hook_OverwriteShadowMapIndex)) + logger::error("[SCM] Failed to install Hook_OverwriteShadowMapIndex"); + } + + // Suppress the engine's focus shadow path in extended mode (matches + // Intellightent's mitigation). 
In extended mode parabolic lights + // occupy kSHADOWMAPS slots [4..7] -- the same range g_focusShadow- + // BaseSlotIndex (=4) reserves for focus rendering. If the engine + // enters BSShadowParabolicLight::Render's focus loop on a parabolic + // light in those slots it CTDs without a crashlog. Two layers of + // defense: these byte patches zero the engine's global gate, and + // ScheduleShadowCasters scrubs drawFocusShadows on every light + // per-frame to clear stale flags. The per-frame scrub alone would + // suffice; the patches make the suppression robust against any + // engine path that bypasses the per-light flag. + if (extended) { + const uint8_t xorRax[6] = { 0x48, 0x31, 0xC0, 0x90, 0x90, 0x90 }; + + static REL::RelocationID uid1(10209, 10247); + REL::safe_write(uid1.address(), xorRax, 6); + + static REL::RelocationID uid2(10207, 10245); + REL::safe_write(uid2.address(), xorRax, 6); + + static REL::RelocationID uid3(513201, 390932); + const uint8_t zero = 0; + REL::safe_write(uid3.address(), &zero, 1); + } + + // ---- Screen-space shadow-mask pass: clamp to vanilla 4 slices --------- + // Suppress the engine's screen-space shadow-mask inner loop. With + // extended slot counts SLF can produce maskIndex >= 4, which makes + // the loop OOB-read the 4-entry per-slot blend-mode table + // (DAT_141861380); the mask's R channel (sun cascades) is the only + // channel LIGHT_LIMIT_FIX consumes anyway -- extended shadow casters + // are served by LLF's cluster pipeline reading kSHADOWMAPS directly. + // See Hook_RenderShadowLightsWithUtilityShader above for the full + // rationale, including the previous Hook_DisableColorMask's misread + // (it patched out the inner call, not a color-mask call -- verified + // via Ghidra). + if (globals::game::isVR) { + // VR's Main::RenderShadowmasks (100422) inlines the inner loop + // instead of calling the standalone 100423, so the detour below + // would never fire. NOP the near-CALL at +0x9E directly. 
Only + // needed when extended slot counts can produce maskIndex >= 4; + // vanilla 4-slice VR doesn't trip the OOB. + if (settings.ShadowLightCount > 4) { + // Site verification: the call at +0x9E must be `E8 rel32` + // targeting the inlined helper at +0xC0 (rel32 = 0xC0 - + // next-instruction-addr = 0xC0 - 0xA3 = 0x1D). If either the + // opcode or the target drifts, fail closed -- clamp the + // scheduler back to vanilla 4 slots so a drifted binary + // degrades to "no extended shadows" rather than the original + // CTD path this patch protects. + static REL::RelocationID renderShadowmasks(100422, 107140); + constexpr std::uint8_t kCallOffset = 0x9E; + constexpr std::uint8_t kCallOpcode = 0xE8; // near CALL rel32 + constexpr std::int32_t kExpectedRel32 = 0x1D; + const auto site = renderShadowmasks.address() + kCallOffset; + const auto opcode = *reinterpret_cast(site); + const auto rel32 = *reinterpret_cast(site + 1); + if (opcode != kCallOpcode || rel32 != kExpectedRel32) { + logger::warn( + "[SCM] VR shadow-mask site drift: expected E8 rel32=0x{:X} at RenderShadowmasks+0x{:X}, " + "found 0x{:02X} rel32=0x{:X}. Clamping ShadowLightCount to 4 (vanilla) to avoid OOB CTD.", + kExpectedRel32, kCallOffset, opcode, rel32); + s_installedShadowLightCount = 4; + } else { + REL::safe_fill(site, REL::NOP, 5); + logger::info("[SCM] VR: NOPed inner shadow-mask call at RenderShadowmasks+0x{:X}", kCallOffset); + } + } + } else { + // Flat (SE/AE): RenderShadowmasks calls the standalone 100423, + // so detour that function to a no-op. + stl::detour_thunk( + REL::RelocationID(100423, 107141)); + } + + // ---- Shadow caster selection ----------------------------------------- + + // Replace CalculateActiveShadowCasterLights entirely (ID 100419/107137). + // VR confirmed: 0x1413226e0 + stl::detour_thunk(REL::RelocationID(100419, 107137)); + + // Replace the CALL to RenderActiveShadowCasterLights inside the render loop. 
+ // ID 100415/107133; VR confirmed: 0x141322130 + // Offsets: SE = 0xF76-0xE30 (0x146), AE = 0xC17D-0xBFF0 (0x18D), VR = 0x1CA + // Must use install_context_hook (not write_thunk_call) so RtlRestoreContext restores + // volatile registers (r8, etc.) before the game continues past the call site. + { + static REL::RelocationID uid(100415, 107133); + uintptr_t addr = uid.address() + REL::Relocate(0xF76 - 0xE30, 0xC17D - 0xBFF0, 0x1CA); + if (!SKSE::stl::install_context_hook(addr, 5, Hook_RenderShadowLights)) + logger::error("[SCM] Failed to install Hook_RenderShadowLights"); + } + + // Replace CalculateActiveNonShadowCasterLights (surface light injection). + // ID 100997/107784; VR confirmed: 0x141354d20 + // Uses install_context_hook because the function has 10 args (11 in VR) with + // platform-specific stack layout. We write a RET at func+5 so + // RtlRestoreContext lands on ret and the function returns cleanly. + { + static REL::RelocationID uid(100997, 107784); + if (!SKSE::stl::install_context_hook(uid.address(), 5, Hook_CalculateActiveLightsForSurface)) + logger::error("[SCM] Failed to install Hook_CalculateActiveLightsForSurface"); + const uint8_t ret = 0xC3; + REL::safe_write(uid.address() + 5, &ret, 1); + } + + // ---- Stealth detection fix ------------------------------------------- + // GetLightLevel (ID 38900/39946) iterates shadow lights to check which + // affect the player. We replace that iteration with our own. + // VR: 38900 confirmed (0x1406892e0); offsets assumed same as SE for VR. + { + static REL::RelocationID uid(38900, 39946); + + // Hook at the start of the affect-player loop. + // Original bytes: "41 83 CE FF 33 C0" (6 bytes) -- keep them running first. 
+ uintptr_t off1 = REL::Relocate(0x185 - 0x050, 0x847 - 0x710, 0x185 - 0x050); + if (!SKSE::stl::install_context_hook(uid.address() + off1, 6, Hook_UpdateLightLevelPlayer, 6)) + logger::error("[SCM] Failed to install Hook_UpdateLightLevelPlayer"); + + // Byte patch: change JA (0x73) to JMP (0xEB) to skip the vanilla iteration. + uintptr_t off2 = REL::Relocate(0x194 - 0x050, 0x856 - 0x710, 0x194 - 0x050); + const uint8_t jmp = 0xEB; + REL::safe_write(uid.address() + off2, &jmp, 1); + } + // Per-light check (ID 99725/106362): not yet confirmed in VR address library, + // so guard VR until addresses are found. + if (!REL::Module::IsVR()) { + static REL::RelocationID uid(99725, 106362); + uintptr_t off = REL::Relocate(0x648 - 0x560, 0xB49 - 0xA60, 0x648 - 0x560); + if (!SKSE::stl::install_context_hook(uid.address() + off, 5, Hook_CheckLightLevelPlayer)) + logger::error("[SCM] Failed to install Hook_CheckLightLevelPlayer"); + } + + // ---- Light conversion ------------------------------------------------ + // All conversion hooks install unconditionally; runtime behaviour is + // gated by s_settings.ConvertExcessToNormal / PromoteNormalToShadow + // and container membership. When both flags are false the hooks fire + // but are no-ops -- required so toggling either flag on at runtime + // takes effect without a restart. + + { + // BSShadowLight vtable slot 3 = IsShadowLight; replace on all 4 shadow light types. + // Reads s_normalConvert membership -- empty when ConvertExcessToNormal + // off, so the hook returns vanilla truth for every light. 
+ REL::Relocation vtbl1{ RE::BSShadowLight::VTABLE[0] }; + vtbl1.write_vfunc(3, Hook_IsShadowLight); + REL::Relocation vtbl2{ RE::BSShadowDirectionalLight::VTABLE[0] }; + vtbl2.write_vfunc(3, Hook_IsShadowLight); + REL::Relocation vtbl3{ RE::BSShadowFrustumLight::VTABLE[0] }; + vtbl3.write_vfunc(3, Hook_IsShadowLight); + REL::Relocation vtbl4{ RE::BSShadowParabolicLight::VTABLE[0] }; + vtbl4.write_vfunc(3, Hook_IsShadowLight); + } + + { + // ShadowSceneNode::RemoveLight -- fires at +0x9 (SE: 6 bytes, AE: 5 bytes). + // Drains s_normalConvert / s_shadowConvert entries for the removed light. + // No-op when both containers are empty. + static REL::RelocationID uid(99697, 106331); + int sz = REL::Relocate(6, 5, 6); + if (!SKSE::stl::install_context_hook(uid.address() + REL::Relocate(0x9, 0x9, 0x9), sz, Hook_ConvertLights_Remove, sz)) + logger::error("[SCM] Failed to install Hook_ConvertLights_Remove"); + } + + { + // ShadowSceneNode::AddLight -- at function start (5 bytes). + // Applies portal-strict per type (always) and PromoteNormalToShadow + // flag mutation (when enabled). + static REL::RelocationID uid(99692, 106326); + if (!SKSE::stl::install_context_hook(uid.address(), 5, Hook_ConvertLights_Add, 5)) + logger::error("[SCM] Failed to install Hook_ConvertLights_Add"); + } + + { + // BSLight::SetLight -- at function start (5 bytes). + // Tracks NiLight* reassignments for s_shadowConvert. No-op when + // PromoteNormalToShadow is off (s_shadowConvert is empty). + static REL::RelocationID uid(101302, 108289); + if (!SKSE::stl::install_context_hook(uid.address(), 5, Hook_ConvertLights_SetLight, 5)) + logger::error("[SCM] Failed to install Hook_ConvertLights_SetLight"); + } + + logger::info("[SCM] Hooks installed (ShadowLightCount={})", settings.ShadowLightCount); + + // Wholesale reset on LoadingMenu open so transient session state + // (s_normalConvert, s_lights pool, debug pins) drops the previous + // cell's pointers before the engine tears them down. 
Mirrors the + // pattern in DynamicCubemaps for similar reset-on-scene-transition + // behaviour. + RegisterSceneTransitionEvents(); + + // DXGI budget snapshot at install. Per-slice geometry follows once + // Update() sees a non-null kSHADOWMAPS SRV. + if (auto* menu = Menu::GetSingleton()) { + if (auto adapter3 = menu->GetDXGIAdapter3()) { + DXGI_QUERY_VIDEO_MEMORY_INFO vmem{}; + if (SUCCEEDED(adapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &vmem)) && vmem.Budget > 0) { + const float budgetMB = static_cast(vmem.Budget) / (1024.f * 1024.f); + const float usageMB = static_cast(vmem.CurrentUsage) / (1024.f * 1024.f); + logger::info("[SCM] VRAM at install: {:.1f}/{:.1f} MB used", usageMB, budgetMB); + } + } + } + } + + void Update(const Settings& settings, RE::ShadowSceneNode* /*shadowSceneNode*/, + RE::NiCamera* /*worldCamera*/) + { + ZoneScopedN("SCM::Update"); + if (s_externalConflict) + return; + + // Lazy verification of the kSHADOWMAPS allocation. Self-healing: + // retries until kSHADOWMAPS exists, then early-returns. Cheap. + // This must run BEFORE the clamp below so a VRAM-exhaustion + // fallback gets reflected in the same frame the verification + // succeeds. + RefreshInstalledSlotCount(); + + Settings capped = settings; + if (s_installedShadowLightCount > 0) + capped.ShadowLightCount = std::min(settings.ShadowLightCount, s_installedShadowLightCount); + + int newTotal = LightContainerSize(capped); + if (newTotal != s_lights.Size) { + auto* newLights = new LightEntry[newTotal](); + int copyCount = std::min(s_lights.Size, newTotal); + for (int i = 0; i < copyCount; i++) + newLights[i] = s_lights.Lights[i]; + for (int i = copyCount; i < newTotal; i++) + newLights[i].Index = i; + delete[] s_lights.Lights; + s_lights.Lights = newLights; + s_lights.Size = newTotal; + } + + // Apply settings as a pure flag flip. 
Conversion-related state + // (s_normalConvert, s_shadowConvert, s_lights pool) is NOT + // drained on toggles -- it ages out at the next LoadingMenu + // via the natural ResetSession() in SceneTransitionEventHandler. + // See Hook_CalculateActiveShadowCasters::thunk for the rationale: + // wholesale clearing mid-session caused engine accumulate-shadow + // crashes (2026-05-17 crash logs) because the engine still had + // our converted/promoted lights in activeShadowLights with + // half-populated shadowmapDescriptors, and tearing our tracking left + // the engine walking that half-state. + // + // Each setting's gating still takes effect immediately via the + // runtime checks in the relevant hook / scheduler branches: + // - Enabled: per-frame thunk routes to vanilla; OverwriteShadowMapIndex no-ops. + // - ConvertExcessToNormal: convertOrDisable routes excess omnis to DisableLight. + // - PromoteNormalToShadow: Hook_ConvertLights_Add stops promoting. + // "Off = stop converting" is the documented semantic; existing + // converted/promoted lights persist in their current form until + // the engine itself drops them at cell change. + s_settings = capped; + } + + void ResetSession() + { + // Wholesale drop of pointers the engine is about to free during + // a scene transition. Called by RegisterSceneTransitionEvents + // when the LoadingMenu opens. The per-frame reconciliation in + // ScheduleShadowCasters keeps these caches honest during normal + // play; this is the explicit "scene is gone" signal so the UI + // counter, debug pins, and tracking sets read empty during the + // loading screen rather than displaying stale entries from the + // previous cell. + // + // Stale BSRenderPass.sceneLights[] captures that would otherwise AV + // in BSEffectShader::SetupGeometry are handled by the defensive + // guard there (clamps numLights past the first stale entry), not by + // trying to drive engine-side cleanup from here. 
An earlier version + // tried calling ShadowSceneNode::RemoveLight to undo our + // ConvertLight -> GameEnableLight pinning, but the engine function + // takes NiLight* (not BSLight* as the wrapper assumed); the call + // was a silent no-op on every runtime and accomplished nothing. + s_normalConvert.clear(); + s_shadowConvert.clear(); + s_pinShadow.clear(); + s_pinConvert.clear(); + s_soloLight = 0; + s_suppressedLights.clear(); + // Clear pool entries but keep the array allocation; size is set by + // Install/Update based on the configured ShadowLightCount. + if (s_lights.Lights) { + for (int i = 0; i < s_lights.Size; ++i) + s_lights.Lights[i].Clear(); + s_lights.Sun = false; + } + } + + class SceneTransitionEventHandler : public RE::BSTEventSink + { + public: + RE::BSEventNotifyControl ProcessEvent(const RE::MenuOpenCloseEvent* a_event, + RE::BSTEventSource*) override + { + if (a_event && a_event->menuName == RE::LoadingMenu::MENU_NAME && a_event->opening) + ResetSession(); + return RE::BSEventNotifyControl::kContinue; + } + static SceneTransitionEventHandler* GetSingleton() + { + static SceneTransitionEventHandler singleton; + return &singleton; + } + }; + + void RegisterSceneTransitionEvents() + { + auto* ui = globals::game::ui; + if (!ui) { + logger::error("[SCM] No UI singleton; cannot register LoadingMenu handler"); + return; + } + ui->AddEventSink(SceneTransitionEventHandler::GetSingleton()); + logger::info("[SCM] LoadingMenu event handler registered"); + } + + const LightContainer& GetLights() + { + return s_lights; + } + + int32_t GetShadowSlot(RE::BSShadowLight* light) + { + // Returns the kSHADOWMAPS texture-array slot for `light`, or -1 if the + // light has no kSHADOWMAPS slice. Pool index == texture slot for point + // lights (1:1). Sun's pool slot returns -1 since the sun renders to + // kSHADOWMAPS_ESRAM (a separate texture) — callers in ShadowRenderer + // upload and LightLimitFix cluster builder must skip it. 
+ const int32_t poolIdx = s_lights.FindLight(light, s_settings.ShadowLightCount); + if (poolIdx < 0) + return -1; + if (s_lights.Sun && poolIdx == 0) + return -1; // sun + return poolIdx; + } + + void ForEachConvertedLight(const std::function& visitor) + { + for (auto& c : s_normalConvert) { + if (!c.light) + continue; + // Defensive vtable check: catches lights freed and zeroed by + // tbbmalloc / EngineFixes between our per-frame reconciliation + // in ScheduleShadowCasters and the cluster builder running. The + // reconciliation prunes stale pointers up-front, but a bulk + // engine teardown could still happen mid-frame. + if (*reinterpret_cast(c.light) == 0) + continue; + visitor(c.light); + } + } + + // ========================================================================= + // Per-slot visualization state (owned by ShadowCasterManager) + // ========================================================================= + + static constexpr const char* kShadowTypeNames[] = { "Spot", "Hemisphere", "Omni" }; + + static std::vector s_shadowSlotInfos; + static uint32_t s_shadowSlotUsage = 0; + // Persists last-seen ShadowSlotInfo for every light ever recorded this session, + // so suppressed lights that leave the active slots still have metadata for the settings table. + static std::unordered_map s_knownLights; + + /// Computes the golden-ratio hue color for a shadow-map slot (matches mode-8 shader). 
+ ImVec4 ShadowSlotHueColor(uint32_t slotIdx) + { + auto chan = [](float h, float shift) { + float v = fmodf(h + shift, 1.0f); + if (v < 0.0f) + v += 1.0f; + return std::clamp(fabsf(v * 6.0f - 3.0f) - 1.0f, 0.0f, 1.0f); + }; + float hue = fmodf(float(slotIdx) * 0.618033988f, 1.0f); + return ImVec4(chan(hue, 0.0f), chan(hue, 2.0f / 3.0f), chan(hue, 1.0f / 3.0f), 1.0f); + } + + // ========================================================================= + // Slot frame API implementations + // ========================================================================= + + void BeginSlotFrame(uint32_t slotCount) + { + s_shadowSlotInfos.assign(slotCount, ShadowSlotInfo{}); + s_shadowSlotUsage = 0; + } + + void RecordSlot(uint32_t depthSlot, const ShadowSlotInfo& info) + { + if (depthSlot < static_cast(s_shadowSlotInfos.size())) + s_shadowSlotInfos[depthSlot] = info; + // Omni lights (type 2) occupy 2 depth-texture slices; all others use 1. + s_shadowSlotUsage += (info.type == 2) ? 2 : 1; + s_knownLights[info.lightKey] = info; + } + + bool IsSuppressed(uintptr_t lightKey) + { + if (s_suppressedLights.count(lightKey)) + return true; + // Solo: every key except the soloed one is implicitly suppressed. 
+ if (s_soloLight != 0 && s_soloLight != lightKey) + return true; + return false; + } + + bool IsPinnedShadow(uintptr_t lightKey) { return s_pinShadow.count(lightKey) > 0; } + bool IsPinnedConvert(uintptr_t lightKey) { return s_pinConvert.count(lightKey) > 0; } + + void SetPinnedShadow(uintptr_t lightKey, bool pinned) + { + if (pinned) { + s_pinShadow.insert(lightKey); + s_pinConvert.erase(lightKey); // mutually exclusive + s_suppressedLights.erase(lightKey); + } else { + s_pinShadow.erase(lightKey); + } + } + + void SetPinnedConvert(uintptr_t lightKey, bool pinned) + { + if (pinned) { + s_pinConvert.insert(lightKey); + s_pinShadow.erase(lightKey); + s_suppressedLights.erase(lightKey); + } else { + s_pinConvert.erase(lightKey); + } + } + + uintptr_t GetSoloLight() { return s_soloLight; } + void SetSoloLight(uintptr_t lightKey) { s_soloLight = lightKey; } + + uintptr_t GetHoveredLight() { return s_hoverLightKey; } + void SetHoveredLight(uintptr_t lightKey) { s_hoverLightKey = lightKey; } + + void ClearAllOverrides() + { + s_suppressedLights.clear(); + s_pinShadow.clear(); + s_pinConvert.clear(); + s_soloLight = 0; + // Hover key is transient (per-draw); not part of "overrides". 
+ } + + bool HasAnyOverrides() + { + return !s_suppressedLights.empty() || !s_pinShadow.empty() || + !s_pinConvert.empty() || s_soloLight != 0; + } + + bool HasSuppressedLights() + { + return !s_suppressedLights.empty(); + } + + uint32_t GetSlotUsage() + { + return s_shadowSlotUsage; + } + + uint32_t GetHighImportanceCount() + { + return s_highImportanceLightCount; + } + + const std::vector<ShadowSlotInfo>& GetSlotInfos() + { + return s_shadowSlotInfos; + } + + const char* GetShadowTypeName(uint32_t type) + { + return kShadowTypeNames[std::min(type, 2u)]; + } + + // ========================================================================= + // DrawShadowLightTable + // ========================================================================= + // Interactive shadow caster table: suppress/re-enable per light or by type, + // filter by type name/range/address, sort by any column. + // Rows are keyed by lightKey (light object pointer) so suppression persists + // across slot reassignments as the player moves around. + // + // compact=true -> auto-sizes height (up to 15 rows visible) + // compact=false -> fills available window height (resizable overlay window) + // showColor -> adds a golden-ratio hue swatch column (visualization mode 8) + // sceneOnly -> lists only lights in the scene this frame (slotted or + // converted); suppressed out-of-scene lights are omitted + // readOnly -> hides the Mode and Solo button columns (overlay with the + // menu closed) + // ========================================================================= + + void DrawShadowLightTable(bool compact, bool showColor, bool sceneOnly, bool readOnly) + { + // Hover key is set per-row here and consumed (cleared) once per frame + // by UpdateLights. Do NOT clear it at function entry -- if both the + // settings-menu table and the overlay table render in the same frame, + // a top-of-function clear would let the second table clobber the + // hover set by the first. Only one row hovers at a time, so the two + // callsites can't fight. 
+ + struct SlotRow + { + uint32_t idx; // shadow slot index; only meaningful when inScene=true + bool inScene; // currently occupies a shadow slot this frame + bool converted; // demoted to non-shadow rendering via ConvertExcessToNormal + bool isFocus{ false }; // engine-owned focus shadow slot (read-only row) + ShadowSlotInfo info; + float importance{ 0.0f }; // contribution-weighted importance (luminance × fade × attenuation²) + bool highImp{ false }; // importance > 0.1 -- light meaningfully illuminates the viewer area + }; + + // Build index of lights currently in scene (lightKey -> slot index). + // Static containers avoid per-frame heap allocation. + static std::unordered_map<uintptr_t, uint32_t> sceneSlot; + sceneSlot.clear(); + for (uint32_t i = 0; i < static_cast<uint32_t>(s_shadowSlotInfos.size()); ++i) + if (s_shadowSlotInfos[i].valid) + sceneSlot[s_shadowSlotInfos[i].lightKey] = i; + + // Build lightKey -> LightEntry* lookup for debug columns. + static std::unordered_map<uintptr_t, const LightEntry*> lightEntryByKey; + lightEntryByKey.clear(); + for (int li = 0; li < s_lights.Size; ++li) { + const auto& e = s_lights.Lights[li]; + if (e.Light) + lightEntryByKey[reinterpret_cast<uintptr_t>(e.Light)] = &e; + } + + auto applyEntryDebug = [&](SlotRow& row) { + auto it = lightEntryByKey.find(row.info.lightKey); + if (it != lightEntryByKey.end()) { + row.importance = it->second->lastImportance; + row.highImp = row.importance > 0.1f; + } + }; + + // Build set of converted-light keys (shadow lights demoted to non-shadow + // rendering via ConvertExcessToNormal). These don't occupy a shadow slot + // this frame but are still active in the scene as normal lights -- we want + // them visible in the table with a "Conv" indicator and the same suppress + // toggle so users can hide them like any other shadow caster. + static std::unordered_set<uintptr_t> convertedKeys; + convertedKeys.clear(); + ForEachConvertedLight([&](RE::BSShadowLight* light) { + convertedKeys.insert(reinterpret_cast<uintptr_t>(light)); + }); + + // Build row list. 
+ static std::vector<SlotRow> rows; + rows.clear(); + auto addConvertedRows = [&]() { + for (uintptr_t key : convertedKeys) { + if (sceneSlot.count(key)) + continue; // simultaneously a shadow caster this frame + SlotRow r{ 0, false, true, false, {} }; + auto it = s_knownLights.find(key); + if (it != s_knownLights.end()) { + r.info = it->second; + r.info.valid = false; // no shadow slot this frame + } else { + // First-frame convert: no cached metadata yet. Surface a minimal + // row so the user can still toggle suppression by address. + r.info.lightKey = key; + } + applyEntryDebug(r); + rows.push_back(r); + } + }; + if (sceneOnly) { + rows.reserve(sceneSlot.size() + convertedKeys.size()); + for (auto& [key, idx] : sceneSlot) { + SlotRow r{ idx, true, false, false, s_shadowSlotInfos[idx] }; + applyEntryDebug(r); + rows.push_back(r); + } + addConvertedRows(); + } else { + // All scene lights first, then converted lights, then suppressed lights + // not currently in scene at all. + rows.reserve(sceneSlot.size() + convertedKeys.size() + s_suppressedLights.size()); + for (auto& [key, idx] : sceneSlot) { + SlotRow r{ idx, true, false, false, s_shadowSlotInfos[idx] }; + applyEntryDebug(r); + rows.push_back(r); + } + addConvertedRows(); + for (uintptr_t key : s_suppressedLights) { + if (sceneSlot.count(key) || convertedKeys.count(key)) + continue; + auto it = s_knownLights.find(key); + if (it != s_knownLights.end()) { + SlotRow r{ 0, false, false, false, it->second }; + applyEntryDebug(r); + rows.push_back(r); + } + } + } + + // Engine-owned focus shadow rows. One per active focus actor at the + // matching kSHADOWMAPS slot. Synthetic lightKey encodes the slot index + // so each row is unique. The key is 0xFEFE0000 | slot, which fits in the + // low 32 bits (the high half is NOT set), so it only avoids real + // BSShadowLight pointers as long as none is allocated at that address. 
+ for (int32_t i = 0; i < s_focusShadowSlots; ++i) { + SlotRow r{}; + r.idx = static_cast<uint32_t>(kFocusShadowBaseSlotIndex + i); + r.inScene = true; + r.isFocus = true; + r.info.valid = true; + r.info.lightKey = 0xFEFE'0000ULL | static_cast<uintptr_t>(r.idx); + r.info.type = 0; // surfaced as "Focus" in the type column override below + rows.push_back(r); + } + + if (rows.empty()) { + ImGui::TextDisabled("No shadow slots this frame."); + return; + } + + // -- Header: active count + suppression badge ---------------------- + ImGui::Text("Shadow slots: %u active", s_shadowSlotUsage); + if (!s_suppressedLights.empty()) { + ImGui::SameLine(); + ImGui::TextColored(ImVec4(1, 0.6f, 0.2f, 1), " %zu suppressed", s_suppressedLights.size()); + } + + // -- Group toggle buttons ------------------------------------------ + // green = at least one unsuppressed; grey = all suppressed; click flips. + // Predicate-based so we can mix type filters (Spot/Hemi/Omni) with state + // filters (Conv = converted-to-normal lights). + { + using RowPred = std::function<bool(const SlotRow&)>; + auto allSuppressedMatching = [&](const RowPred& pred) { + for (auto& r : rows) { + if (!pred(r)) + continue; + if (!s_suppressedLights.count(r.info.lightKey)) + return false; + } + // No unsuppressed match. This also covers "nothing matches": the + // button then shows grey, and clicking it is a no-op. + return true; + }; + auto toggleMatching = [&](const RowPred& pred) { + if (allSuppressedMatching(pred)) { + for (auto& r : rows) + if (pred(r)) + s_suppressedLights.erase(r.info.lightKey); + } else { + for (auto& r : rows) + if (pred(r)) + s_suppressedLights.insert(r.info.lightKey); + } + }; + auto groupButton = [&](const char* label, const RowPred& pred, const char* tooltip) { + bool allOff = allSuppressedMatching(pred); + ImGui::PushStyleColor(ImGuiCol_Button, + allOff ? ImVec4(0.35f, 0.35f, 0.35f, 1) : ImVec4(0.15f, 0.5f, 0.15f, 1)); + ImGui::PushStyleColor(ImGuiCol_ButtonHovered, + allOff ? 
ImVec4(0.5f, 0.5f, 0.5f, 1) : ImVec4(0.2f, 0.7f, 0.2f, 1)); + if (ImGui::SmallButton(label)) + toggleMatching(pred); + ImGui::PopStyleColor(2); + if (tooltip && ImGui::IsItemHovered()) + ImGui::SetTooltip("%s", tooltip); + }; + auto typePred = [](uint32_t type) { + return [type](const SlotRow& r) { return r.info.type == type; }; + }; + groupButton( + "All", [](const SlotRow&) { return true; }, nullptr); + ImGui::SameLine(); + groupButton("Spot", typePred(0), "Toggle all spot/frustum shadow lights"); + ImGui::SameLine(); + groupButton("Hemi", typePred(1), "Toggle all hemisphere shadow lights"); + ImGui::SameLine(); + groupButton("Omni", typePred(2), "Toggle all omni (paraboloid) shadow lights"); + ImGui::SameLine(); + groupButton( + "Conv", [](const SlotRow& r) { return r.converted; }, + "Toggle all lights currently demoted from shadow to normal\n" + "(ConvertExcessToNormal). Hides their cluster-light contribution."); + + // "Clear All": resets every debug override (suppress / pin shadow / + // pin convert / solo) so the table returns to scheduler-auto. Only + // shown when overrides are active so it doesn't take up space when + // there's nothing to reset. + if (HasAnyOverrides()) { + ImGui::SameLine(); + ImGui::PushStyleColor(ImGuiCol_Button, ImVec4(0.55f, 0.25f, 0.25f, 1)); + ImGui::PushStyleColor(ImGuiCol_ButtonHovered, ImVec4(0.75f, 0.35f, 0.35f, 1)); + if (ImGui::SmallButton("Clear All")) + ClearAllOverrides(); + ImGui::PopStyleColor(2); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Reset every debug override:\n" + " - clear suppression\n" + " - clear shadow / convert pins\n" + " - clear solo\n" + "Returns the table to scheduler-auto behaviour."); + } + + // Help marker: explains the per-row debug controls so users aren't + // surprised by states / pulses they didn't know they could trigger. 
+ ImGui::SameLine(); + Util::HelpMarker( + "Per-row controls:\n" + " * Cycle button (col 1): click to rotate this light through\n" + " Auto -> Shadow pin (S) -> Convert pin (C) -> Suppress (X) -> Auto.\n" + " * Solo button (col 2): isolate this light against a black scene.\n" + " Click again to clear; only one light may be soloed at a time.\n" + " * Hold Shift while hovering a row to highlight that light in the\n" + " world with a pulsing magenta tint. Release Shift or move the\n" + " cursor away to stop. Useful when you can't tell which entry\n" + " corresponds to which physical light. Does not affect rendering\n" + " when Shift is not held.\n\n" + "Group buttons toggle suppression for every matching row at once.\n" + "Clear All appears when any override is active and resets everything."); + } + + // -- Filter input -------------------------------------------------- + static std::string s_filterText; + { + char buf[128] = {}; + strncpy_s(buf, s_filterText.c_str(), sizeof(buf) - 1); + ImGui::SetNextItemWidth(120.0f); + if (ImGui::InputText("##slotfilter", buf, sizeof(buf))) + s_filterText = buf; + ImGui::SameLine(); + ImGui::TextDisabled(sceneOnly ? "filter (yes/conv/type/range/addr)" : "filter (yes/conv/no/type/range/addr)"); + } + + // Apply filter. + static std::vector<SlotRow> filteredRows; + filteredRows.clear(); + if (s_filterText.empty()) { + filteredRows = rows; + } else { + std::string lower = s_filterText; + std::transform(lower.begin(), lower.end(), lower.begin(), ::tolower); + char addrBuf[16]; + for (auto& r : rows) { + std::string typeName = kShadowTypeNames[std::min(r.info.type, 2u)]; + std::transform(typeName.begin(), typeName.end(), typeName.begin(), ::tolower); + // Range filter matches both raw units and rounded meters. 
+ char rangeBuf[32]; + snprintf(rangeBuf, sizeof(rangeBuf), "%.0f %.0f", + r.info.range, Util::Units::GameUnitsToMeters(r.info.range)); + snprintf(addrBuf, sizeof(addrBuf), "%08x", static_cast(r.info.lightKey & 0xFFFFFFFF)); + const char* statusStr = r.inScene ? "yes" : (r.converted ? "conv" : "no"); + if (typeName.find(lower) != std::string::npos || + std::string(rangeBuf).find(lower) != std::string::npos || + std::string(addrBuf).find(lower) != std::string::npos || + lower == statusStr) + filteredRows.push_back(r); + } + } + + // -- Column layout ------------------------------------------------- + // Interactive (settings menu, or overlay with menu open): + // [Mode] [Solo] [Status] [Address] [Color?] [Type] [Range] [Imp] + // Read-only (overlay with menu closed -- buttons would be dead pixels): + // [Status] [Address] [Color?] [Type] [Range] [Imp] + // + // Status merges the old "In Scene" + "Slot" columns into one cell + // showing one of: "Slot N" / "Conv" / "Out" / "Suppr". The old "Hi" + // boolean column is gone -- highImp now tints the row instead, which + // is what the column was being used for visually. + const bool showButtons = !readOnly; + const int modeColIdx = showButtons ? 0 : -1; + const int soloColIdx = showButtons ? 1 : -1; + const int statusColIdx = showButtons ? 2 : 0; + const int addrColIdx = statusColIdx + 1; + const int typeColIdx = addrColIdx + (showColor ? 2 : 1); + const int radColIdx = typeColIdx + 1; + const int centrColIdx = radColIdx + 1; + + std::vector headers; + if (showButtons) { + headers.push_back("Mode"); // cycle: Auto / Pin-S / Pin-C / Suppress + headers.push_back("Solo"); + } + headers.push_back("Status"); + headers.push_back("Address"); + if (showColor) + headers.push_back("Color"); + headers.push_back("Type"); + headers.push_back("Range"); + headers.push_back("Imp"); + + using SortFn = std::function; + std::vector sorts(headers.size(), nullptr); + // Status sort: in-scene shadow casters → converted → out-of-scene. 
+ // Suppressed lights sort to the end (treated as worst rank). + sorts[statusColIdx] = [](const SlotRow& a, const SlotRow& b, bool asc) { + auto rank = [](const SlotRow& r) -> int { + bool sup = s_suppressedLights.count(r.info.lightKey) > 0; + if (sup) + return 3; + return r.inScene ? 0 : (r.converted ? 1 : 2); + }; + int ra = rank(a), rb = rank(b); + if (ra != rb) + return asc ? ra < rb : ra > rb; + return asc ? a.idx < b.idx : a.idx > b.idx; + }; + sorts[addrColIdx] = [](const SlotRow& a, const SlotRow& b, bool asc) { + return asc ? a.info.lightKey < b.info.lightKey : a.info.lightKey > b.info.lightKey; + }; + sorts[typeColIdx] = [](const SlotRow& a, const SlotRow& b, bool asc) { + return asc ? a.info.type < b.info.type : a.info.type > b.info.type; + }; + sorts[radColIdx] = [](const SlotRow& a, const SlotRow& b, bool asc) { + return asc ? a.info.range < b.info.range : a.info.range > b.info.range; + }; + sorts[centrColIdx] = [](const SlotRow& a, const SlotRow& b, bool asc) { + return asc ? a.importance < b.importance : a.importance > b.importance; + }; + + // outerSize logic: + // * compact auto-size up to 15 rows (handled by + // ShowSortedStringTableCustom when y==0). Used in + // the menu's Active Casters block where the table + // is one of several elements in a long settings + // list and shouldn't grab unbounded vertical space. + // * non-compact fill remaining vertical space. The table itself + // scrolls internally (ScrollY flag in the shared + // helper) so summary stats above stay visible + // regardless of how many lights exist or how the + // user has sized the host window. + ImVec2 outerSize = compact ? 
ImVec2(0, 0) : ImVec2(0, ImGui::GetContentRegionAvail().y); + + Util::ShowSortedStringTableCustom( + "##ShadowLightTbl", + headers, + filteredRows, + static_cast(statusColIdx), // default sort: Status + true, // ascending + sorts, + [&](int /*rowIdx*/, int col, const SlotRow& row) { + const uintptr_t key = row.info.lightKey; + const bool suppressed = s_suppressedLights.count(key) > 0; + const bool pinShadow = s_pinShadow.count(key) > 0; + const bool pinConvert = s_pinConvert.count(key) > 0; + const bool isSolo = (s_soloLight == key && key != 0); + + // Helper: shift-gated debug pulse. Setting s_hoverLightKey makes + // the cluster light builder replace this light's colour with a + // 1Hz magenta pulse — useful for finding which light a row + // corresponds to in 3D, but visually startling if it triggered + // every time the cursor crossed a cell. Requiring Shift+hover + // means a user clicking through the cycle/solo buttons doesn't + // see lights randomly turn purple, while debugging is one + // modifier away. + auto noteHover = [&]() { + if (ImGui::IsItemHovered() && ImGui::GetIO().KeyShift) + s_hoverLightKey = key; + }; + + // Row tint: highImp lights get a subtle yellow background so + // the eye can pick out the lights actually contributing to the + // frame at a glance. Replaces the dropped "Hi" column. Set on + // col 0 so it applies to the whole row. + if (col == 0 && row.highImp) { + ImGui::TableSetBgColor(ImGuiTableBgTarget_RowBg0, + ImGui::GetColorU32(ImVec4(0.30f, 0.30f, 0.10f, 0.35f))); + } + + // === Mode column: state cycle button ======================= + // Cycle: Auto (·) -> PinShadow (S) -> PinConvert (C) -> Suppress (X) -> Auto + // Mutually exclusive (SetPinned* / suppressed.erase enforce that). + // Hidden in readOnly mode (overlay with menu closed). + // Focus rows skip Mode/Solo entirely -- engine owns the slot. 
+ if (row.isFocus && col == modeColIdx) { + ImGui::TextDisabled("eng"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Engine-controlled focus shadow; not pinnable/suppressible."); + return; + } + if (row.isFocus && col == soloColIdx) { + ImGui::TextDisabled("--"); + return; + } + if (showButtons && col == modeColIdx) { + ImGui::PushID(static_cast(key & 0xFFFFFFFF)); + const char* label = "·"; + ImVec4 col4 = ImVec4(0.15f, 0.6f, 0.15f, 1); // green = auto/active + ImVec4 colH = ImVec4(0.2f, 0.75f, 0.2f, 1); + const char* tip = "Auto (scheduler decides)\nClick: pin as shadow caster"; + if (pinShadow) { + label = "S"; + col4 = ImVec4(0.20f, 0.40f, 0.85f, 1); // blue + colH = ImVec4(0.30f, 0.55f, 1.0f, 1); + tip = "Pinned: forced shadow caster\nClick: pin as converted (non-shadow)"; + } else if (pinConvert) { + label = "C"; + col4 = ImVec4(0.85f, 0.55f, 0.15f, 1); // amber + colH = ImVec4(1.0f, 0.7f, 0.25f, 1); + tip = "Pinned: forced converted (non-shadow)\nClick: suppress entirely"; + } else if (suppressed) { + label = "X"; + col4 = ImVec4(0.45f, 0.25f, 0.25f, 1); // dim red + colH = ImVec4(0.6f, 0.35f, 0.35f, 1); + tip = "Suppressed (hidden)\nClick: return to auto"; + } + ImGui::PushStyleColor(ImGuiCol_Button, col4); + ImGui::PushStyleColor(ImGuiCol_ButtonHovered, colH); + if (ImGui::SmallButton(label)) { + // Cycle to next state. + if (pinShadow) { + SetPinnedShadow(key, false); + SetPinnedConvert(key, true); + } else if (pinConvert) { + SetPinnedConvert(key, false); + s_suppressedLights.insert(key); + } else if (suppressed) { + s_suppressedLights.erase(key); + } else { + SetPinnedShadow(key, true); + } + } + ImGui::PopStyleColor(2); + noteHover(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("%s", tip); + ImGui::PopID(); + return; + } + + // === Solo column ========================================== + // Hidden in readOnly mode. + if (showButtons && col == soloColIdx) { + ImGui::PushID(static_cast((key & 0xFFFFFFFF) ^ 0xA1)); + ImVec4 col4 = isSolo ? 
+ ImVec4(0.85f, 0.7f, 0.15f, 1) : // bright yellow when active + ImVec4(0.30f, 0.30f, 0.30f, 1); + ImVec4 colH = isSolo ? ImVec4(1.0f, 0.85f, 0.25f, 1) : ImVec4(0.45f, 0.45f, 0.45f, 1); + ImGui::PushStyleColor(ImGuiCol_Button, col4); + ImGui::PushStyleColor(ImGuiCol_ButtonHovered, colH); + if (ImGui::SmallButton(isSolo ? "!" : "·")) + SetSoloLight(isSolo ? 0 : key); + ImGui::PopStyleColor(2); + noteHover(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("%s", + isSolo ? + "Solo: this light is shown alone\nClick: clear solo" : + "Solo this light\n(suppresses every other light\nuntil cleared)"); + ImGui::PopID(); + return; + } + + if (suppressed || (s_soloLight != 0 && !isSolo)) + ImGui::BeginDisabled(); + bool dimmed = suppressed || (s_soloLight != 0 && !isSolo); + if (col == statusColIdx) { + // Merged "In Scene" + "Slot" column. Four mutually-exclusive + // states; suppressed wins because the user explicitly hid it. + if (suppressed) { + ImGui::TextColored(ImVec4(0.85f, 0.35f, 0.35f, 1), "Suppr"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Suppressed by debug override.\nClick the Mode button to clear."); + } else if (row.inScene) { + ImGui::Text("Slot %u", row.idx); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Casting shadows this frame in slot %u.", row.idx); + } else if (row.converted) { + ImGui::TextColored(ImVec4(0.95f, 0.75f, 0.25f, 1), "Conv"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Demoted to a normal (non-shadow) light this frame.\n" + "Cluster lighting still illuminates it; no shadow-map cost."); + } else { + ImGui::TextDisabled("Out"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Out of range / not active in the current frame."); + } + } else if (col == addrColIdx) { + if (row.isFocus) { + ImGui::TextDisabled("focus[%u]", row.idx - static_cast(kFocusShadowBaseSlotIndex)); + } else { + char addrFull[20]; + snprintf(addrFull, sizeof(addrFull), "0x%016llX", static_cast(row.info.lightKey)); + ImGui::Selectable(addrFull + 
10, false, ImGuiSelectableFlags_None); + if (ImGui::IsItemClicked()) + ImGui::SetClipboardText(addrFull); + noteHover(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Click to copy: %s", addrFull); + } + } else if (showColor && col == addrColIdx + 1) { + ImVec4 c = ShadowSlotHueColor(row.idx); + auto ri = static_cast(c.x * 255.0f); + auto gi = static_cast(c.y * 255.0f); + auto bi = static_cast(c.z * 255.0f); + ImGui::ColorButton("##col", c, + ImGuiColorEditFlags_NoTooltip | ImGuiColorEditFlags_NoBorder, ImVec2(22, 16)); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("#%02X%02X%02X", ri, gi, bi); + } else if (col == typeColIdx) { + if (row.isFocus) { + ImGui::TextColored(ImVec4(0.55f, 0.75f, 1.0f, 1.0f), "Focus"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Engine-owned focus shadow slot.\n" + "FocusShadowActors[%u] = high-res shadow for a tracked\n" + "actor (player + dialog/combat NPCs). SCM reserves\n" + "this slot so the engine's focus render isn't trampled\n" + "by point/spot lights.", + row.idx - static_cast(kFocusShadowBaseSlotIndex)); + } else { + ImGui::TextUnformatted(kShadowTypeNames[std::min(row.info.type, 2u)]); + noteHover(); + } + } else if (col == radColIdx) { + if (row.isFocus) { + ImGui::TextDisabled("--"); + } else { + ImGui::Text("%.0f u", row.info.range); + noteHover(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("%s", Util::Units::FormatDistance(row.info.range).c_str()); + } + } else if (col == centrColIdx) { + // Importance score: luminance × fade × attenuation² at viewer. + // White (0) → bright green (1+) as contribution increases. 
+ float imp = row.importance; + float t = std::min(imp, 1.0f); + ImVec4 colour = ImVec4(1.0f - t * 0.7f, 1.0f, 1.0f - t * 0.7f, 1.0f); // white → green + ImGui::TextColored(colour, "%.2f", imp); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Contribution importance score:\n" + " luminance(diffuse * fade)\n" + " * max(att_camera, att_player)\n" + " where att = (1 - (dist/radius)^2)^2\n\n" + "Higher = light strongly illuminates the viewer area.\n" + "Drives interval multiplier (configurable in Advanced settings).\n" + "Default: 0 => x2.0, 0.5 => x0.32, 1 => x0.05\n\n" + "Rows tinted yellow are high-importance (>0.1)\n" + "-- they deliver meaningful illumination near the camera\n" + "or player and receive accelerated shadow redraw scheduling."); + } + // Hi column dropped -- highImp now tints the row background + // (see TableSetBgColor at the top of this lambda) so the visual + // signal is preserved without consuming a column. + if (dimmed) + ImGui::EndDisabled(); + }, + {}, + outerSize); + } + + void DrawShadowSummary(uint32_t clusterCount, uint32_t clusterMax, uint32_t shadowUnshadowedLightCount) + { + // Canonical "where are we vs the limits" panel. Used by both the menu's + // Active Casters block and the overlay header so testers see the same + // numbers in the same format regardless of which view they're in. + const uint32_t slotUsage = s_shadowSlotUsage; + const uint32_t slots = GetInstalledSlotCount(); + // "Wanted" = total shadow-eligible demand this frame (active + dropped). + // We don't track demand separately, but slotUsage + dropped is the + // observable proxy that matches the user-visible "X dropped" signal. 
+ const uint32_t requested = slotUsage + shadowUnshadowedLightCount; + + if (clusterCount >= clusterMax) + ImGui::TextColored(ImVec4(1, 0.3f, 0.3f, 1), "Cluster lights : %u / %u (overflow)", clusterCount, clusterMax); + else + ImGui::Text("Cluster lights : %u / %u", clusterCount, clusterMax); + + // "lights" rather than "slots" matches the Shadow Light Count + // setting name -- users think in lights, the engine thinks in + // texture slots, so we use the user's word. + if (shadowUnshadowedLightCount > 0) + ImGui::TextColored(ImVec4(1, 0.4f, 0.4f, 1), + "Shadow lights : %u / %u (%u wanted, %u dropped, %zu converted)", + slotUsage, slots, requested, shadowUnshadowedLightCount, s_normalConvert.size()); + else + ImGui::Text("Shadow lights : %u / %u (%u wanted, 0 dropped, %zu converted)", + slotUsage, slots, requested, s_normalConvert.size()); + + if (s_highImportanceLightCount > 0 && ImGui::IsItemHovered()) + ImGui::SetTooltip("%u high-importance (near camera/player).", + s_highImportanceLightCount); + } + + void DrawShadowSchedulerStats() + { + // Avg redraws/frame: rolling average of how many shadow casters per frame + // the scheduler decided to (re)render. Bounded by MaxRedrawPerFrame. + float avgRedraws = static_cast<float>(s_redrawSum) / static_cast<float>(kRedrawHistorySize); + ImGui::Text("Avg redraws/frame : %.1f (cap: %d)", avgRedraws, s_settings.MaxRedrawPerFrame); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Rolling average over the last %d frames.", kRedrawHistorySize); + + // Avg per-light cost: budget tracker's measured GPU cost per shadow caster. + // Used by the formula budget mode to decide how many casters fit in the + // per-frame time budget. 
+ int32_t avgCost = s_budget.GetAverageCostUs(); + if (avgCost > 0) + ImGui::Text("Avg light cost : %.2f ms", avgCost / 1000.0f); + + // ---- Budget verdict --------------------------------------------- + // Cross-checks measured shadow cost against the user-chosen budget + // to surface "is your setup actually working?" without making the + // user work it out themselves. We compare measured shadow time to + // the user's chosen shadow budget -- not to total frame time -- so + // this is "are we honouring your settings?" not "are your settings + // right for your hardware?". The latter genuinely needs data we + // don't own (frame target, GPU headroom, async overlap). + const float budgetMs = s_autoBudgetMs; // active budget (Manual = slider, Formula = computed) + const float costMs = avgCost / 1000.0f; + const float usedMs = avgRedraws * costMs; + const int32_t cap = s_settings.MaxRedrawPerFrame; + const bool capLimited = avgCost > 0 && avgRedraws >= static_cast<float>(cap) * 0.95f; + const bool slotLimited = (s_shadowSlotUsage + 0u) >= GetInstalledSlotCount(); + const bool overBudget = avgCost > 0 && budgetMs > 0.0f && usedMs > budgetMs * 1.0f; + const bool headroom = avgCost > 0 && budgetMs > 0.0f && usedMs < budgetMs * 0.5f && !capLimited; + + if (avgCost <= 0 || budgetMs <= 0.0f) { + ImGui::TextDisabled("Budget usage : (warming up)"); + return; + } + + // Verdicts named after the user-visible settings, not internal + // engineering terms. Tooltips kept to one short line each so the + // hover doesn't grow into a wall of text. + ImVec4 col; + const char* verdict; + const char* tip; + if (overBudget) { + col = ImVec4(0.95f, 0.35f, 0.35f, 1); + verdict = "OVER BUDGET"; + tip = "Shadow time exceeds Redraw Budget. Lower Max Redraws or raise Redraw Budget."; + } else if (capLimited && slotLimited) { + col = ImVec4(0.95f, 0.65f, 0.25f, 1); + verdict = "AT LIMITS"; + tip = "Both Max Redraws and Shadow Light Count are full. 
Enable Convert to Normal or raise Shadow Light Count."; + } else if (slotLimited) { + col = ImVec4(0.95f, 0.65f, 0.25f, 1); + verdict = "LIGHT LIMITED"; + tip = "Shadow Light Count is full. Enable Convert to Normal or raise Shadow Light Count."; + } else if (capLimited) { + col = ImVec4(0.95f, 0.85f, 0.25f, 1); + verdict = "REDRAW LIMITED"; + tip = "Hitting Max Redraws Per Frame. Raise it to spend the unused Redraw Budget."; + } else if (headroom) { + col = ImVec4(0.55f, 0.85f, 0.55f, 1); + verdict = "HEADROOM"; + tip = "Under half the Redraw Budget is being used. Raise Max Redraws or accept the slack."; + } else { + col = ImVec4(0.55f, 0.85f, 0.55f, 1); + verdict = "OK"; + tip = "Within Redraw Budget; no limits hit."; + } + // Budget gauge: progress bar tinted by the verdict colour so the + // state is readable at a glance, with the numeric reading and + // verdict label inside the bar. One widget replaces the old + // separate progress bar (in SCM settings) + verdict text line. + const float fraction = std::min(usedMs / budgetMs, 1.0f); + char overlay[80]; + snprintf(overlay, sizeof(overlay), "%.2f / %.2f ms - %s", usedMs, budgetMs, verdict); + ImGui::PushStyleColor(ImGuiCol_PlotHistogram, col); + ImGui::Text("Budget usage :"); + ImGui::SameLine(); + ImGui::ProgressBar(fraction, ImVec2(-1.0f, 0.0f), overlay); + ImGui::PopStyleColor(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("%s", tip); + + // ---- Shadow VRAM progress bar ---- + // Bar fills `currentUsage / budget` (process headroom); overlay text + // shows the kSHADOWMAPS array's share of that. Same DXGI data source + // as PerformanceOverlay. + auto vinfo = GetVRAMInfo(); + if (vinfo.valid && vinfo.budgetBytes > 0) { + const std::uint64_t freeBytes = vinfo.budgetBytes > vinfo.currentUsageBytes ? 
vinfo.budgetBytes - vinfo.currentUsageBytes : 0; + const float arrayMB = static_cast<float>(vinfo.shadowArrayBytes) / (1024.f * 1024.f); + const float freeMB = static_cast<float>(freeBytes) / (1024.f * 1024.f); + const float usageMB = static_cast<float>(vinfo.currentUsageBytes) / (1024.f * 1024.f); + const float budgetMBf = static_cast<float>(vinfo.budgetBytes) / (1024.f * 1024.f); + const float perSliceMB = static_cast<float>(vinfo.bytesPerSlice) / (1024.f * 1024.f); + // Disambiguated from the budget-verdict string above. + const VRAMVerdict vramVerdict = EvaluateVRAMVerdict(vinfo.shadowArrayBytes, freeBytes, vinfo.budgetBytes); + const float fillFraction = std::min(1.0f, + static_cast<float>(vinfo.currentUsageBytes) / static_cast<float>(vinfo.budgetBytes)); + char overlayText[96]; + snprintf(overlayText, sizeof(overlayText), + "%.0f / %.0f MB - shadows %.0f MB (%u slices)", + usageMB, budgetMBf, arrayMB, vinfo.shadowSlices); + ImGui::PushStyleColor(ImGuiCol_PlotHistogram, vramVerdict.colour); + ImGui::Text("Shadow VRAM :"); + ImGui::SameLine(); + ImGui::ProgressBar(fillFraction, ImVec2(-1.0f, 0.0f), overlayText); + ImGui::PopStyleColor(); + if (ImGui::IsItemHovered()) { + ImGui::SetTooltip( + "Bar fill = process VRAM usage / DXGI budget (same data the\n" + "performance overlay reports). Overlay text shows the shadow\n" + "array's contribution to that usage.\n" + "\n" + "Slices : %u (sun lives in its own kSHADOWMAPS_ESRAM texture)\n" + "Per slice : %.2f MB (%u x %u @ %u B/pixel)\n" + "Shadow array : %.1f MB\n" + "Free in budget : %.1f MB\n" + "\n" + "Green when free VRAM and shadow share are comfortable.\n" + "Yellow when free < 512 MB or shadow array > 25%% of budget.\n" + "Red when free < 128 MB or shadow array > 50%% of budget --\n" + "lower Shadow Light Count or iShadowMapResolution.", + vinfo.shadowSlices, perSliceMB, + vinfo.shadowWidth, vinfo.shadowHeight, + vinfo.shadowWidth && vinfo.shadowHeight ? 
vinfo.bytesPerSlice / (vinfo.shadowWidth * vinfo.shadowHeight) : 0u, + arrayMB, freeMB); + } + } + } + + void DrawOverlayShadowModeInfo(uint32_t mode, uint32_t /*shadowUnshadowedLightCount*/, uint32_t /*totalLightCount*/) + { + // Cluster light count, slot usage, requested/dropped/converted are all + // covered by DrawShadowSummary above this in the overlay header. This + // function now carries only mode-specific information that wouldn't be + // meaningful elsewhere -- channel meanings, heatmap legends, etc. + if (mode == 3) { + ImGui::Text("R channel = directional soft shadow"); + ImGui::Text("G channel = directional detailed shadow"); + ImGui::TextDisabled("(B = unused)"); + } else if (mode == 4) { + ImGui::TextDisabled("Pixel heatmap: 0=blue 8+=red"); + } else if (mode == 5) { + ImGui::TextDisabled("White = fully lit, black = fully in shadow"); + } else if (mode == 6) { + ImGui::TextDisabled("Pixel heatmap: 0=blue 8+=red (lights without shadow maps)"); + } else if (mode == 7) { + ImGui::TextDisabled("Cool Turbo[0.0-0.3] = 1-4 shadows"); + ImGui::TextDisabled("Warm Turbo[0.3-0.8] = 5-%u shadows", GetInstalledSlotCount()); + ImGui::TextDisabled("Red = overflow"); + } else if (mode == 9) { + uint32_t spotC = 0, hemiC = 0, omniC = 0; + for (const auto& info : GetSlotInfos()) { + if (!info.valid) + continue; + if (info.type == 0) + spotC++; + else if (info.type == 1) + hemiC++; + else + omniC++; + } + ImGui::Text("R Spot (frustum) : %u", spotC); + ImGui::Text("G Hemisphere : %u", hemiC); + ImGui::Text("B Omni (paraboloid): %u", omniC); + } + } + + void DrawVisualisationTooltipShadowModes() + { + ImGui::Text( + "\n" + "Shadow Mask: R=directional soft shadow, G=directional detailed shadow.\n" + "\n" + "Shadow Light Count: Heatmap of shadow-casting point/spot lights per pixel (blue=0, red=8+).\n" + "Use to gauge shadow density; high counts indicate expensive shadow sampling.\n" + "\n" + "Point Light Shadow Factor: Brightness shows the darkest shadow value from any 
point/spot\n" + "light. White=fully lit, black=fully shadowed. Shows where PCF/PCSS filtering is active.\n" + "\n" + "Unshadowed Point Lights: Heatmap of point/spot lights without shadow maps (blue=0, red=8+).\n" + "High values where lights are bright indicate where the shadow slot limit is costing quality.\n" + "\n" + "Shadow Caster Density: Custom Turbo ranges show how heavily shadow slots are used.\n" + " Cool (Turbo 0.0-0.3): 1-4 shadow lights per pixel.\n" + " Warm (Turbo 0.3-0.8): 5 to ShadowMapSlots lights (dynamic range).\n" + " Bright red: overflow - a light wanted a shadow slot but none was available.\n" + "\n" + "Shadow Slot Index Color: Assigns each shadow-map slot a unique high-contrast hue\n" + "(golden-ratio sequence) so you can identify which slot is casting the primary shadow.\n" + "First valid shadow light index per pixel is shown. Bright red = slot overflow.\n" + "\n" + "Light Type Visualization: RGB channels encode shadow light types per pixel.\n" + " R = spot/frustum lights (ShadowParam.x == 0).\n" + " G = hemisphere/paraboloid lights (ShadowParam.x == 1).\n" + " B = omnidirectional/full-paraboloid lights (ShadowParam.x == 2).\n" + " Dark grey = unshadowed lights only (no shadow maps assigned).\n" + " Bright red = overflow (slot capacity exceeded).\n" + "Intensity scales with count (up to 4); channels blend for mixed-type pixels."); + } + + void DrawSettings(Settings& settings) + { + ImGui::SeparatorText("Shadow Limit Fix"); + + // ---- External conflict banner -------------------------------------- + if (s_externalConflict) { + const auto& theme = Menu::GetSingleton()->GetTheme(); + ImGui::TextColored(theme.StatusPalette.Error, "%s", s_conflictMessage.c_str()); + ImGui::BeginDisabled(); + } + + // ---- Enable toggle (requires restart) ------------------------------ + ImGui::Checkbox("Enable Shadow Limit Fix", &settings.Enabled); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Extends Skyrim's hard limit of 4 simultaneous shadow-casting 
lights.\n" + "Intelligently selects which lights cast shadows each frame based on\n" + "distance, intensity, and a configurable priority formula.\n\n" + "Based on Intellightent by meh321.\n" + "https://www.nexusmods.com/skyrimspecialedition/mods/172423\n\n" + "Restart required to take effect in either direction. The boot-time\n" + "patches (extended atlas slices, depth buffer creation loop, color-mask\n" + "pass replacement) cannot be safely reversed at runtime -- vanilla\n" + "shadow scheduling crashes when run on top of them. Toggle and restart."); + // Either direction requires restart -- the boot-time patches modify + // the engine's shadow texture array, depth buffer creation, and + // color-mask pass. Vanilla scheduling cannot run on top of those + // (verified by AV in BSShadowDirectionalLight processing during a + // runtime-disable test, 2026-05-17 crash logs). + // + // Compare the user's current value against the BOOT value, not + // against s_settings -- s_settings updates when the user saves, + // so a stale comparison against s_settings would hide the label + // the instant the user clicked Save Settings, leaving them with + // no indication that their change won't apply until restart. + if (s_bootEnabledCaptured && settings.Enabled != s_bootEnabled) { + const auto& theme = Menu::GetSingleton()->GetTheme(); + ImGui::TextColored(theme.StatusPalette.RestartNeeded, + "Restart required -- this session is %s.", s_bootEnabled ? "enabled" : "disabled"); + } + + if (!settings.Enabled) + ImGui::BeginDisabled(); + + // ---- Shadow Light Count (requires restart) ------------------------- + // Upper bound of 127: the engine refuses to render any shadow caster + // when ShadowLightCount >= 128 even though kSHADOWMAPS allocates + // successfully -- some internal limit (likely an 8-bit shadow index + // somewhere we haven't patched) silently disables shadow rendering. + // 127 is the highest value that actually works. 
+ ImGui::SliderInt("Shadow Light Count", &settings.ShadowLightCount, 0, 127); + // Compute projected VRAM for the slider's current value so the user + // can see the cost of a higher count *before* committing the restart. + // kSHADOWMAPS holds exactly ShadowLightCount slices -- the sun lives + // in its own kSHADOWMAPS_ESRAM texture, so there's no +1. + auto sliderVram = GetVRAMInfo(); + std::uint64_t projectedBytes = 0; + std::uint64_t projectedFreeBytes = 0; + bool projectionValid = sliderVram.valid; + if (projectionValid) { + projectedBytes = ProjectShadowArrayBytes(static_cast<std::uint32_t>(settings.ShadowLightCount)); + std::int64_t projectedUsage = static_cast<std::int64_t>(sliderVram.currentUsageBytes) - + static_cast<std::int64_t>(sliderVram.shadowArrayBytes) + + static_cast<std::int64_t>(projectedBytes); + if (projectedUsage < 0) + projectedUsage = 0; + projectedFreeBytes = (static_cast<std::int64_t>(sliderVram.budgetBytes) > projectedUsage) ? static_cast<std::uint64_t>(sliderVram.budgetBytes - projectedUsage) : 0; + } + if (ImGui::IsItemHovered()) { + constexpr const char* kSliderBase = + "Maximum simultaneous shadow-casting point/spot lights (directional sun not counted).\n" + " 0 = scheduler runs but selects no point lights (sun/directional unaffected).\n" + " 4 = vanilla point light count with intelligent selection.\n" + " >4 = extended mode; depth buffer expanded when >8. Max 127\n" + " (VRAM is the practical limit -- watch the projected-VRAM bar).\n" + "Requires a game restart to take effect."; + if (projectionValid) { + ImGui::SetTooltip( + "%s\n" + "\n" + "Projected kSHADOWMAPS array at %d slots: %.1f MB\n" + "Per-slice cost: %.2f MB (%u x %u, %u B/pixel)\n" + "Projected free VRAM after restart: %.1f MB", + kSliderBase, + settings.ShadowLightCount, + static_cast<float>(projectedBytes) / (1024.f * 1024.f), + static_cast<float>(sliderVram.bytesPerSlice) / (1024.f * 1024.f), + sliderVram.shadowWidth, sliderVram.shadowHeight, + sliderVram.shadowWidth && sliderVram.shadowHeight ? 
+ sliderVram.bytesPerSlice / (sliderVram.shadowWidth * sliderVram.shadowHeight) : + 0u, + static_cast<float>(projectedFreeBytes) / (1024.f * 1024.f)); + } else { + ImGui::SetTooltip("%s", kSliderBase); + } + } + // Custom-drawn stacked bar against DXGI budget showing non-shadow / + // current-shadow / projected-shadow segments. ImGui::ProgressBar + // can't multi-segment. + if (projectionValid && sliderVram.budgetBytes > 0) { + const VRAMVerdict verdict = EvaluateVRAMVerdict(projectedBytes, projectedFreeBytes, sliderVram.budgetBytes); + const float budgetMBf = static_cast<float>(sliderVram.budgetBytes) / (1024.f * 1024.f); + const float nonShadowMB = std::max(0.0f, + (static_cast<float>(sliderVram.currentUsageBytes) - static_cast<float>(sliderVram.shadowArrayBytes)) / (1024.f * 1024.f)); + const float currentShadowMB = static_cast<float>(sliderVram.shadowArrayBytes) / (1024.f * 1024.f); + const float projectedShadowMB = static_cast<float>(projectedBytes) / (1024.f * 1024.f); + + ImGui::Text("Projected shadow VRAM :"); + ImGui::SameLine(); + const ImVec2 cursor = ImGui::GetCursorScreenPos(); + const float fullWidth = ImGui::GetContentRegionAvail().x; + const float barHeight = ImGui::GetFrameHeight(); + const float scale = fullWidth / budgetMBf; + auto* draw = ImGui::GetWindowDrawList(); + // Background frame, then non-shadow / current / projected segments. + draw->AddRectFilled(cursor, ImVec2(cursor.x + fullWidth, cursor.y + barHeight), + ImGui::GetColorU32(ImGuiCol_FrameBg)); + const float nonShadowEndX = cursor.x + nonShadowMB * scale; + draw->AddRectFilled(cursor, ImVec2(nonShadowEndX, cursor.y + barHeight), + IM_COL32(120, 120, 120, 200)); + const float currentEndX = std::min(cursor.x + fullWidth, nonShadowEndX + currentShadowMB * scale); + draw->AddRectFilled(ImVec2(nonShadowEndX, cursor.y), + ImVec2(currentEndX, cursor.y + barHeight), + IM_COL32(80, 130, 200, 220)); + // Projection outline anchored at the same start as current, so + // the visual delta IS the difference. 
Solid fill for grow, dark + // stripe for shrink. + const float projectedEndX = std::min(cursor.x + fullWidth, nonShadowEndX + projectedShadowMB * scale); + const ImU32 verdictColU32 = ImGui::GetColorU32(verdict.colour); + draw->AddRect(ImVec2(nonShadowEndX, cursor.y), ImVec2(projectedEndX, cursor.y + barHeight), + verdictColU32, 0.0f, 0, 2.0f); + if (projectedShadowMB > currentShadowMB) { + draw->AddRectFilled(ImVec2(currentEndX, cursor.y), ImVec2(projectedEndX, cursor.y + barHeight), + (verdictColU32 & 0x00FFFFFFu) | 0xA0000000u); + } else if (projectedShadowMB < currentShadowMB) { + draw->AddRectFilled(ImVec2(projectedEndX, cursor.y), ImVec2(currentEndX, cursor.y + barHeight), + IM_COL32(80, 80, 80, 120)); + } + + char overlay[128]; + snprintf(overlay, sizeof(overlay), + "shadows %.0f -> %.0f MB (%d slots, %.0f MB free after restart)", + currentShadowMB, projectedShadowMB, + settings.ShadowLightCount, + static_cast(projectedFreeBytes) / (1024.f * 1024.f)); + const ImVec2 textSize = ImGui::CalcTextSize(overlay); + const ImVec2 textPos(cursor.x + (fullWidth - textSize.x) * 0.5f, + cursor.y + (barHeight - textSize.y) * 0.5f); + draw->AddText(textPos, IM_COL32(240, 240, 240, 255), overlay); + ImGui::Dummy(ImVec2(fullWidth, barHeight)); // reserve layout space + if (ImGui::IsItemHovered()) { + ImGui::SetTooltip( + "Stacked VRAM bar against DXGI budget.\n" + " Grey block : process VRAM not counted as shadow array\n" + " Blue block : current kSHADOWMAPS allocation this session\n" + " Outlined block: what the slider's value would allocate\n" + " after restart (colour reflects verdict)\n" + "\n" + "Solid colour past the blue: shadow array would GROW by that\n" + "amount. 
Dark stripe inside the blue: shadow array would\n" + "SHRINK by that amount.\n" + "\n" + "Slots requested : %d (sun lives in kSHADOWMAPS_ESRAM)\n" + "Per-slice cost : %.2f MB (%u x %u @ %u B/pixel)\n" + "Current array : %.1f MB\n" + "Projected array : %.1f MB\n" + "Free after restart : %.1f MB / %.0f MB budget\n" + "%s", + settings.ShadowLightCount, + static_cast(sliderVram.bytesPerSlice) / (1024.f * 1024.f), + sliderVram.shadowWidth, sliderVram.shadowHeight, + sliderVram.shadowWidth && sliderVram.shadowHeight ? + sliderVram.bytesPerSlice / (sliderVram.shadowWidth * sliderVram.shadowHeight) : + 0u, + currentShadowMB, + projectedShadowMB, + static_cast(projectedFreeBytes) / (1024.f * 1024.f), + budgetMBf, + verdict.over ? + "\nRED: this projection won't fit in the current VRAM budget.\n" + "The driver will page or refuse the allocation, leaving the\n" + "shadow array smaller than requested -- shadows will silently\n" + "break. Lower the slot count or reduce iShadowMapResolution." : + verdict.tight ? + "\nYELLOW: tight headroom. A driver or OS spike could push\n" + "shadow allocation into paging. Safe for testing, risky for\n" + "long sessions or heavily-modded scenes." : + ""); + } + } + + // ---- Allocation mismatch banner ---- + // Surface kSHADOWMAPS truncation visibly so users hit by a silent + // "shadows don't work at high slot counts" failure can see why. + // Reads the verified count directly (not the GetInstalledSlotCount + // accessor, which falls back to the requested value). 
+ { + uint32_t installed = s_installedSlotCount; + uint32_t requested = s_requestedSlotCount; + if (installed > 0 && requested > 0 && installed < requested) { + ImGui::TextColored(ImVec4(0.95f, 0.35f, 0.35f, 1), + "VRAM exhausted: requested %u slots, GPU allocated %u.", + requested, installed); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "The engine tried to create kSHADOWMAPS with %u slices but\n" + "the GPU / driver returned a smaller array (likely out of\n" + "VRAM at the configured iShadowMapResolution). The scheduler\n" + "has clamped itself to the actual count so the existing %u\n" + "slices work correctly, but to reach the requested %u you'll\n" + "need to free VRAM (lower resolution, other features, etc).", + requested, installed, requested); + } else if (installed == 0 && s_settings.Enabled && !s_externalConflict) { + ImGui::TextColored(ImVec4(0.95f, 0.85f, 0.25f, 1), + "Shadow array not yet verified -- load a save to confirm allocation."); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "kSHADOWMAPS isn't readable yet (main menu / loading screen).\n" + "Once you reach gameplay the scheduler verifies the actual\n" + "slice count against your requested value. If they disagree\n" + "this banner turns red."); + } + } + + if (settings.ShadowLightCount != s_installedShadowLightCount) { + const auto& theme = Menu::GetSingleton()->GetTheme(); + ImGui::TextColored(theme.StatusPalette.RestartNeeded, + "Restart required -- current session uses %d lights.", s_installedShadowLightCount); + } + + // ---- Shadow Map Resolution (requires restart) --------------------- + // Mirrors the launcher's resolution tiers (the four power-of-two values + // Skyrim itself offers). Mutates the live iShadowMapResolution:Display + // RE::Setting immediately; persistence to SkyrimPrefs.ini happens in + // SCM::SaveINISettings (called from LightLimitFix::SaveSettings). 
+ if (auto* prefColl = RE::INIPrefSettingCollection::GetSingleton()) { + if (auto* setting = prefColl->GetSetting("iShadowMapResolution:Display")) { + static constexpr struct + { + const char* label; + std::int32_t value; + } kResTiers[] = { + { "Low (1024)", 1024 }, + { "Medium (2048)", 2048 }, + { "High (4096)", 4096 }, + { "Ultra (8192)", 8192 }, + }; + constexpr int kTierCount = static_cast<int>(sizeof(kResTiers) / sizeof(kResTiers[0])); + + const std::int32_t currentRes = setting->GetInteger(); + int tierIdx = -1; + for (int i = 0; i < kTierCount; ++i) { + if (kResTiers[i].value == currentRes) { + tierIdx = i; + break; + } + } + // Non-tier values (manual INI edits / third-party tools) + // surface as "Custom (N)" so the user sees what the engine is + // actually using, but we don't offer it as a selectable tier. + char previewBuf[32]; + const char* preview; + if (tierIdx >= 0) { + preview = kResTiers[tierIdx].label; + } else { + snprintf(previewBuf, sizeof(previewBuf), "Custom (%d)", currentRes); + preview = previewBuf; + } + + if (ImGui::BeginCombo("Shadow Map Resolution", preview)) { + for (int i = 0; i < kTierCount; ++i) { + const bool selected = (i == tierIdx); + if (ImGui::Selectable(kResTiers[i].label, selected) && + kResTiers[i].value != currentRes) { + setting->SetInteger(kResTiers[i].value); + s_shadowResolutionDirty = true; + } + if (selected) + ImGui::SetItemDefaultFocus(); + } + ImGui::EndCombo(); + } + if (ImGui::IsItemHovered()) { + ImGui::SetTooltip( + "Drives iShadowMapResolution:Display in SkyrimPrefs.ini.\n" + "Affects both omni/spot shadow slices and the sun cascade\n" + "texture; per-slice VRAM scales as resolution^2 * 4 bytes\n" + "(4 / 16 / 64 / 256 MB at 1024 / 2048 / 4096 / 8192).\n" + "Requires a game restart to take effect."); + } + + if (s_initialShadowMapResolution > 0 && currentRes != s_initialShadowMapResolution) { + const auto& theme = Menu::GetSingleton()->GetTheme(); + ImGui::TextColored(theme.StatusPalette.RestartNeeded, + 
"Restart required -- current session uses %d px shadow maps.", + s_initialShadowMapResolution); + } + } + } + + // ---- Temporal budget (dynamic) ------------------------------------ + + // Migrate legacy Auto saves silently. Manual is now the default and the + // closest match in spirit to what most users actually wanted from Auto: + // a predictable budget that doesn't ping-pong. Power users can switch + // back to Formula manually if they want the adaptive default expression. + if (settings.BudgetMode == BudgetModeEnum::Auto) + settings.BudgetMode = BudgetModeEnum::Manual; + + // Budget mode selector — Manual or Formula. Auto was removed: it was an + // opaque DRS controller that confused users when the budget moved without + // a visible cause. The default Formula expresses the same behaviour + // transparently and stays editable. + static const char* budgetModeNames[] = { "Manual", "Formula" }; + int budgetModeIdx = (settings.BudgetMode == BudgetModeEnum::Manual) ? 0 : 1; + if (ImGui::Combo("Budget Mode", &budgetModeIdx, budgetModeNames, 2)) + settings.BudgetMode = (budgetModeIdx == 0) ? BudgetModeEnum::Manual : BudgetModeEnum::Formula; + if (ImGui::IsItemHovered()) { + if (budgetModeIdx == 0) + ImGui::SetTooltip( + "Manual (default): fixed per-frame GPU time budget for shadow re-renders.\n" + "Predictable; doesn't oscillate. Adjust the slider to trade FPS for shadow quality."); + else + ImGui::SetTooltip( + "Formula: user-editable exprtk expression for per-frame budget.\n" + "Default expression matches Intellightent's original behaviour\n" + "(1 ms outdoors, 2 ms indoors). Edit the expression in the\n" + "Advanced section below.\n" + "\n" + "Caveat: adaptive expressions referencing `frametime` tend to\n" + "ping-pong because rendering shadows raises frametime, removing\n" + "the headroom that allowed the budget. Stick to static or\n" + "slowly-varying inputs (`isinterior`, `frametarget`)."); + } + + // Per-mode controls. 
+ if (budgetModeIdx == 0) { + ImGui::SliderFloat("Redraw Budget (ms)", &settings.RedrawBudgetMs, 0.1f, 32.0f, "%.2f ms"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Per-frame GPU time budget for shadow re-renders (milliseconds).\n" + "Lights whose estimated render cost exceeds the remaining budget are deferred.\n" + "The first eligible light always renders regardless of budget (starvation prevention).\n" + "\n" + "Reference points:\n" + " 1-2 ms: Intellightent's original (1 outdoors, 2 indoors)\n" + " 5 ms : default — comfortable for typical scenes (~5-8 lights at ~1 ms each)\n" + " 16 ms: full 60 fps frame; shadows can saturate the frame here\n" + " 32 ms: extreme — only useful for very high light counts on fast GPUs\n" + "\n" + "Higher = more shadow lights redraw per frame, fewer stale shadow maps,\n" + "at the cost of frametime. The Budget verdict in the Active Casters\n" + "section shows whether the current setting has headroom to spare."); + } else { + ImGui::Text("Budget from formula: %.2f ms", s_autoBudgetMs); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Edit the Redraw Budget formula in the Advanced section below."); + } + + // Budget consumption visualisation lives in the Active Casters block + // (DrawShadowSchedulerStats) alongside the verdict, so the bar, the + // numeric reading and the actionable state appear in one place + // instead of being split between two sections. + + // ---- Frame-target diagnostic (Formula mode only) ------------------ + // `frametarget` is an exprtk variable available to the Redraw Budget + // formula -- in Formula mode the user needs to see what it evaluates + // to in order to write/debug expressions that reference it. 
In Manual + // mode the user's chosen RedrawBudgetMs has nothing to do with frame + // timing, so this block would just be noise -- the new Budget verdict + // (in the Active Casters block) covers the "headroom / saturated" + // signal more actionably for both modes, and DrawShadowSummary covers + // the rendered/dropped lights count without duplication. + if (settings.BudgetMode == BudgetModeEnum::Formula) { + const float currentFrameMs = *globals::game::deltaTime * 1000.0f; + const float currentFPS = 1000.0f / std::max(currentFrameMs, 1.0f); + const float targetMs = ComputeFrameTimePercentile90(); + const float targetFPS = targetMs > 0.0f ? 1000.0f / targetMs : 0.0f; + const float rawHeadroom = targetMs - s_ftEMA; + const float headroomMs = rawHeadroom - kFrameHeadroomSafetyMs; + + const char* state = "steady"; + if (rawHeadroom > kFrameHeadroomSafetyMs + kFrameHeadroomDeadZoneMs) + state = "growing"; + else if (rawHeadroom < -kFrameHeadroomDeadZoneMs) + state = "throttling"; + + ImGui::Text("Frame: %.1f FPS (%.1f ms) | frametarget: %.0f FPS (%.1f ms) | headroom: %+.1f ms | %s", + currentFPS, currentFrameMs, targetFPS, targetMs, headroomMs, state); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Live values of the exprtk variables exposed to the Redraw\n" + "Budget formula. `frametarget` is the rolling 90th-percentile\n" + "frame time, used as a self-measured ceiling -- not a vsync\n" + "target. State indicator:\n" + " steady -- within +/-%.1f ms of target\n" + " growing -- frametime well below target; headroom available\n" + " throttling -- frametime over target; expressions returning\n" + " nonzero values here will keep frametime high", + kFrameHeadroomDeadZoneMs); + } + { + // Use ShadowLightCount as the slider upper bound when the scheduler hasn't + // run yet (s_totalShadowLightsThisFrame == 0 on the first menu open). + // Never clamp the stored setting here — the scheduling code already applies + // the live cap. 
Clamping here caused MaxRedrawPerFrame to be permanently + // written to 1 on the first DrawSettings call before the hook fired. + // Track active shadow lights this frame, falling back to the + // configured ShadowLightCount when the scheduler hasn't run yet. + // No artificial 64 cap -- if the user dialled in 127 lights, the + // redraw cap should be allowed to follow. + int maxRedraws = s_totalShadowLightsThisFrame > 0 ? s_totalShadowLightsThisFrame : settings.ShadowLightCount; + maxRedraws = std::max(maxRedraws, Settings::kMinMaxRedrawPerFrame); + ImGui::SliderInt("Max Redraws Per Frame", &settings.MaxRedrawPerFrame, + Settings::kMinMaxRedrawPerFrame, maxRedraws); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Hard cap on how many shadow lights may re-render their shadow maps in one frame.\n" + "Acts as a safety valve regardless of budget -- the budget controls time spent,\n" + "this controls count. The sun directional light always counts as one redraw.\n" + "Minimum is %d (lower values cause shadow flicker as redraw rotation outpaces TAA).\n" + "Upper bound tracks the number of active shadow lights this frame (%d).", + Settings::kMinMaxRedrawPerFrame, maxRedraws); + } + + // ---- Light conversion (requires restart for hooks) ----------------- + if (ImGui::TreeNode("Light Conversion##LightConv")) { + ImGui::Checkbox("Convert Excess Lights to Normal", &settings.ConvertExcessToNormal); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Shadow lights that exceed the active shadow caster limit are demoted to\n" + "normal (unshadowed) lights so they still contribute diffuse and specular\n" + "lighting at no shadow-map cost. Lights that fail culling are dropped entirely.\n" + "Requires a game restart to change."); + + // No texture-array cost -- converted lights flow through the cluster + // pipeline as ordinary non-shadow lights. 
Match the ShadowLightCount + // max so users can pair a large shadow pool with a matching converted + // pool without the slider lying about the upper bound. + ImGui::SliderInt("Converted Shadow Slots", &settings.ConvertedShadowSlots, 0, 127); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Extra pool slots for lights converted to normal (unshadowed) mode.\n" + "Increase if Convert Excess Lights drops lights you expect to see."); + + ImGui::Checkbox("Promote Normal Lights to Shadow Casters", &settings.PromoteNormalToShadow); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Experimental: elevate high-scoring unshadowed lights to shadow casters\n" + "when shadow slots are available.\n" + "Requires a game restart to change."); + + ImGui::SeparatorText("Portal-Strict Enforcement"); + // Three-way toggle plus master row. SCM forces the engine's + // portal-strict flag on shadow casters at creation time, gated + // per shadow type (FOV-derived). Defaults enforce on omni and + // hemisphere, leave spotlights alone -- portal-strict on spots + // drops culled-but-visible spots entirely (cone test rejects + // spots whose origin is behind a portal even when the beam + // sweeps into a visible room). + { + const bool allOn = settings.ForceEnablePortalStrictOmni && + settings.ForceEnablePortalStrictHemi && + settings.ForceEnablePortalStrictSpot; + const bool allOff = !settings.ForceEnablePortalStrictOmni && + !settings.ForceEnablePortalStrictHemi && + !settings.ForceEnablePortalStrictSpot; + bool master = allOn; + bool indeterminate = !allOn && !allOff; + if (indeterminate) { + // Render the master checkbox as visually mixed via a + // muted alpha so the row still functions as a "set all" + // control without misrepresenting state. 
+ ImGui::PushStyleVar(ImGuiStyleVar_Alpha, ImGui::GetStyle().Alpha * 0.6f); + } + if (ImGui::Checkbox("Force Enable Portal Strict (All)", &master)) { + settings.ForceEnablePortalStrictOmni = master; + settings.ForceEnablePortalStrictHemi = master; + settings.ForceEnablePortalStrictSpot = master; + } + if (indeterminate) + ImGui::PopStyleVar(); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Master toggle for the three per-type rows below.\n" + "Checked when all three are enforced, unchecked when none are,\n" + "and rendered translucent when mixed.\n" + "Requires a game restart to change."); + } + + ImGui::Indent(); + ImGui::Checkbox("Force Portal Strict on Omni Lights", &settings.ForceEnablePortalStrictOmni); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Force-enable portal-strict on dual-paraboloid (omnidirectional)\n" + "shadow casters. Recommended on -- tightens portal-graph visibility\n" + "culling for full-sphere shadow lights without side effects.\n" + "Requires a game restart to change."); + ImGui::Checkbox("Force Portal Strict on Hemisphere Lights", &settings.ForceEnablePortalStrictHemi); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Force-enable portal-strict on single-paraboloid (hemisphere)\n" + "shadow casters. Recommended on -- behaves like the omni case\n" + "under portal culling.\n" + "Requires a game restart to change."); + ImGui::Checkbox("Force Portal Strict on Spot Lights", &settings.ForceEnablePortalStrictSpot); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Force-enable portal-strict on perspective (frustum/spot) shadow\n" + "casters. 
Off by default: the cone test rejects spots whose\n" + "origin sits behind a portal even when their beam sweeps into a\n" + "visible room, which drops culled-but-visible spots entirely.\n" + "Enable only for debugging.\n" + "Requires a game restart to change."); + ImGui::Unindent(); + + ImGui::TreePop(); + } + + // ---- Advanced (dynamic) ------------------------------------------- + if (ImGui::TreeNode("Advanced##ShadowScheduling")) { + ImGui::Checkbox("Allow Immediate Draw for New Lights", &settings.AllowDrawNewLight); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Allow a light just added to the active pool to render its shadow map this frame.\n" + "Prevents a one-frame shadow-map gap when new lights enter view."); + + // ---- Importance scheduling curve ------------------------------ + ImGui::SeparatorText("Importance Scheduling"); + ImGui::SliderFloat("Max Interval Scale", &settings.ImportanceMaxScale, 0.5f, 5.0f, "%.2f"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Interval multiplier applied to unimportant lights (importance = 0).\n" + "Higher values defer dim or distant lights more aggressively.\n" + "Default: 2.0"); + settings.ImportanceMaxScale = std::max(settings.ImportanceMaxScale, settings.ImportanceMinScale); + + ImGui::SliderFloat("Min Interval Scale", &settings.ImportanceMinScale, 0.01f, 1.0f, "%.3f"); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip( + "Interval multiplier applied to high-importance lights (importance >= 1).\n" + "Lower values make bright/close lights update shadows more frequently.\n" + "The ratio Max/Min defines the scheduling dynamic range.\n" + "Default: 0.05 (40x range at default Max=2.0)"); + settings.ImportanceMinScale = std::min(settings.ImportanceMinScale, settings.ImportanceMaxScale); + + { + float ratio = settings.ImportanceMaxScale / std::max(settings.ImportanceMinScale, 0.001f); + ImGui::Text("Dynamic range: %.0fx (unimportant lights wait %.0fx longer)", ratio, ratio); + } + + if (ImGui::Button("Reset 
Importance Defaults")) { + settings.ImportanceMinScale = 0.05f; + settings.ImportanceMaxScale = 2.0f; + } + + // ---- Formula editor ------------------------------------------ + if (ImGui::TreeNode("Formula Editor##Formulas")) { + // Build variable reference from the DRY table. + if (ImGui::TreeNode("Available Variables##FormulaVars")) { + if (ImGui::BeginTable("##FormulaVarTable", 2, + ImGuiTableFlags_Borders | ImGuiTableFlags_RowBg | + ImGuiTableFlags_SizingFixedFit | ImGuiTableFlags_ScrollY, + ImVec2(0, std::min(static_cast<float>(IM_ARRAYSIZE(kFormulaVars)) * 20.0f + 28.0f, 320.0f)))) { + ImGui::TableSetupColumn("Variable"); + ImGui::TableSetupColumn("Description"); + ImGui::TableHeadersRow(); + for (const auto& v : kFormulaVars) { + ImGui::TableNextRow(); + ImGui::TableSetColumnIndex(0); + ImGui::TextUnformatted(v.name); + ImGui::TableSetColumnIndex(1); + ImGui::TextUnformatted(v.description); + } + ImGui::EndTable(); + } + ImGui::TreePop(); + } + + static char scoreBuf[512]; + static char scoreErr[256] = {}; + static char redrawIntervalBuf[512]; + static char redrawIntervalErr[256] = {}; + static char redrawBudgetBuf[512]; + static char redrawBudgetErr[256] = {}; + static bool formulaBufsInited = false; + if (!formulaBufsInited) { + snprintf(scoreBuf, sizeof(scoreBuf), "%s", settings.ScoreFormula.c_str()); + snprintf(redrawIntervalBuf, sizeof(redrawIntervalBuf), "%s", settings.RedrawIntervalFormula.c_str()); + snprintf(redrawBudgetBuf, sizeof(redrawBudgetBuf), "%s", settings.RedrawBudgetFormula.c_str()); + formulaBufsInited = true; + } + + // Helper lambda: validate, apply live, revert buffer on error. 
+ auto applyFormula = [](const char* label, char* buf, size_t bufSize, + std::string& settingStr, char* errBuf, size_t errBufSize, + std::unique_ptr<FormulaHelper>& helper) { + ImGui::InputText(label, buf, bufSize); + if (ImGui::IsItemDeactivatedAfterEdit()) { + std::string err; + if (FormulaHelper::Validate(buf, err)) { + settingStr = buf; + errBuf[0] = '\0'; + if (helper) + helper->Reparse(settingStr); + else { + helper = std::make_unique<FormulaHelper>(); + helper->Parse(settingStr); + } + } else { + snprintf(errBuf, errBufSize, "Parse error: %s", err.c_str()); + snprintf(buf, bufSize, "%s", settingStr.c_str()); + } + } + if (errBuf[0]) + ImGui::TextColored(ImVec4(1.0f, 0.3f, 0.3f, 1.0f), "%s", errBuf); + }; + + applyFormula("Score", scoreBuf, sizeof(scoreBuf), + settings.ScoreFormula, scoreErr, sizeof(scoreErr), s_formulaScore); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Light priority scoring formula. Higher score = more likely to get a shadow slot."); + + applyFormula("Redraw Interval", redrawIntervalBuf, sizeof(redrawIntervalBuf), + settings.RedrawIntervalFormula, redrawIntervalErr, sizeof(redrawIntervalErr), s_formulaRedrawInterval); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Per-light redraw interval formula. Higher = less frequent shadow map updates."); + applyFormula("Redraw Budget", redrawBudgetBuf, sizeof(redrawBudgetBuf), + settings.RedrawBudgetFormula, redrawBudgetErr, sizeof(redrawBudgetErr), s_formulaRedrawBudget); + if (ImGui::IsItemHovered()) + ImGui::SetTooltip("Per-frame redraw budget formula (ms). Empty = use the Redraw Budget (ms) slider value."); + + ImGui::TreePop(); + } + + ImGui::TreePop(); + } + + // Active casters table + scheduler stats are rendered by LightLimitFix + // alongside its own quick-stats line, so the table area has full + // testing context (cluster light count, shadow slot usage, etc.) in + // one place. See LightLimitFix::DrawSettings. 
+ + if (!settings.Enabled) + ImGui::EndDisabled(); + + if (s_externalConflict) + ImGui::EndDisabled(); + } +} diff --git a/src/Features/LightLimitFix/ShadowCasterManager.h b/src/Features/LightLimitFix/ShadowCasterManager.h new file mode 100644 index 0000000000..6230b1c870 --- /dev/null +++ b/src/Features/LightLimitFix/ShadowCasterManager.h @@ -0,0 +1,666 @@ +// ShadowCasterManager.h +// Shadow caster scheduling for LightLimitFix. +// +// Based on Intellightent by meh321 +// https://www.nexusmods.com/skyrimspecialedition/mods/172423 +// +// Ported and adapted for Community Shaders by the Community Shaders team with permission. +// +// The original plugin managed shadow caster selection, temporal shadow +// update scheduling, and depth buffer extension entirely outside Community +// Shaders. This file houses the CPU-side shadow scheduling subsystem so it +// can live alongside (and share settings with) LightLimitFix's GPU-side +// clustered light culling without coupling the two concerns inside a single +// translation unit. + +#pragma once + +#include +#include + +#include "RE/B/BSShadowLight.h" +#include "RE/S/ShadowSceneNode.h" + +struct ImVec4; + +namespace ShadowCasterManager +{ + // Type-based shadow-caster check that bypasses the IsShadowLight vtable + // hook (which flips to false for ConvertExcessToNormal-demoted lights). + // Use this when callers need the intrinsic type, not the current shadow + // state -- e.g. InverseSquareLighting's cutoff selection, where the + // radius shouldn't oscillate as a light flips in and out of conversion. + // Cost: one vtable-pointer compare. + inline bool IsShadowLightType(RE::BSLight* bsLight) + { + return skyrim_cast(bsLight) != nullptr; + } + + // shadowLightsAccum iterator. shadowLightsAccum is a flat slot array + // where a dual-paraboloid light occupies shadowMapCount==2 consecutive + // physical slots (second is null). ForEachShadowLight advances by + // shadowMapCount so each logical light is visited once. 
+ // + // WARNING: _size is never updated -- do not push_back or use range-for / + // BSTArray iterators on this directly. + /// Conservative upper bound on shadowLightsAccum iteration index, derived + /// from the active scheduler settings (ShadowLightCount + sun cascades). + /// Used by ForEachShadowLight as a setting-aware safety cap so corrupt + /// / non-null-terminated arrays can't loop forever, while still allowing + /// iteration past BSTArray's static _capacity. + std::uint32_t MaxShadowAccumIterationBound(); + + /// kSHADOWMAPS texture-array slot count the engine actually allocated. + /// 0 until the SRV becomes readable; the read is lazy and self-healing + /// across frames so callers can use it without timing constraints. + /// Consumers in the cluster pipeline, scheduler, and UI should call + /// this rather than reaching into Deferred / the renderer directly. + std::uint32_t GetInstalledSlotCount(); + + /// Live VRAM telemetry used for shadow-array sizing decisions and stats. + /// All values in bytes; populated from IDXGIAdapter3::QueryVideoMemoryInfo + /// + the kSHADOWMAPS texture's actual geometry. valid=false when the + /// adapter/texture aren't ready yet (e.g. before SetupResources). + struct VRAMInfo + { + std::uint64_t currentUsageBytes = 0; ///< VRAM currently allocated to this process (local heap) + std::uint64_t budgetBytes = 0; ///< Driver-suggested budget for this process + std::uint64_t shadowArrayBytes = 0; ///< Bytes currently used by the kSHADOWMAPS texture array + std::uint32_t shadowWidth = 0; ///< Per-slice width + std::uint32_t shadowHeight = 0; ///< Per-slice height + std::uint32_t shadowSlices = 0; ///< Current kSHADOWMAPS ArraySize + std::uint32_t bytesPerSlice = 0; ///< Per-slice byte cost (width*height*format size) + bool valid = false; + }; + VRAMInfo GetVRAMInfo(); + + /// Predict the kSHADOWMAPS texture-array byte size for a given slice count + /// using the current per-slice geometry. 
Returns 0 if VRAMInfo isn't valid yet. + std::uint64_t ProjectShadowArrayBytes(std::uint32_t sliceCount); + + template <typename Fn> + inline void ForEachShadowLight(const RE::BSTArray<RE::BSShadowLight*>& accum, Fn&& fn) + { + // Engine writes via SetShadowCasterLightArrayEntry which bypasses + // BSTArray::push_back, so _capacity stays at the initial preallocation + // -- using capacity as the bound silently caps SLF at vanilla shadow + // counts. Use the null sentinel instead, with a setting-derived + // safety cap (ShadowLightCount + sun cascades, with a small margin). + // Per-pointer plausibility (alignment + user-mode range) handles + // non-null garbage between our prepass and this read. + const std::uint32_t maxIdx = MaxShadowAccumIterationBound(); + std::uint32_t idx = 0; + while (idx < maxIdx) { + RE::BSShadowLight* light = accum[idx]; + if (!light) + break; + const auto raw = reinterpret_cast<std::uintptr_t>(light); + if (raw >= 0x0000800000000000ull || (raw & 0x7) != 0) + break; + fn(light); + const std::uint32_t step = light->shadowMapCount; + if (step == 0) + break; + const std::uint64_t next = static_cast<std::uint64_t>(idx) + step; + if (next >= maxIdx) + break; + idx = static_cast<std::uint32_t>(next); + } + } + + // ------------------------------------------------------------------------- + // Formula parameter indices + // ------------------------------------------------------------------------- + enum FormulaParams + { + kFormulaParam_LightIndex, + kFormulaParam_LightIntensity, + kFormulaParam_LightDistance, + kFormulaParam_LightRadius, + kFormulaParam_LightX, + kFormulaParam_LightY, + kFormulaParam_LightZ, + kFormulaParam_LightR, + kFormulaParam_LightG, + kFormulaParam_LightB, + kFormulaParam_LightAmbientR, + kFormulaParam_LightAmbientG, + kFormulaParam_LightAmbientB, + kFormulaParam_LightChosenLastFrame, + kFormulaParam_LightFramesSinceRender, ///< frames since this light's slot was last rendered; large sentinel if never rendered or no slot + kFormulaParam_LightNeverFades, + kFormulaParam_LightPortalStrict, + 
kFormulaParam_LightNS, + kFormulaParam_LightConverted, + kFormulaParam_LightDisplacement, ///< distance moved since last shadow map render (game units) + kFormulaParam_PlayerLightDistance, ///< distance from the player character to the light (game units) + kFormulaParam_LightImportance, ///< contribution importance: lum(diffuse*fade) * max(att_cam, att_plr); set in interval loop only + kFormulaParam_LightIsSpot, ///< 1 if light is a spot (BSShadowFrustumLight), 0 otherwise + kFormulaParam_LightSpotVisible, ///< 1 if a spot's cone is plausibly visible to the camera (cone-aimed-at-frustum). Always 1 for non-spots so omni-only formulas aren't affected. + + kFormulaParam_CameraX, + kFormulaParam_CameraY, + kFormulaParam_CameraZ, + kFormulaParam_IsInterior, + kFormulaParam_TimeOfDay, + + kFormulaParam_FrameTime, ///< EMA-smoothed frame time (ms) + kFormulaParam_FrameTarget, ///< 90th-percentile frame time (ms) — target budget ceiling + kFormulaParam_StableFrames, ///< consecutive frames the EMA has been below FrameTarget + + kFormulaParam_Max + }; + + // ------------------------------------------------------------------------- + // Expression-based formula evaluator (wraps exprtk) + // ------------------------------------------------------------------------- + struct FormulaHelper + { + FormulaHelper(); + ~FormulaHelper(); + + FormulaHelper(const FormulaHelper&) = delete; + FormulaHelper& operator=(const FormulaHelper&) = delete; + FormulaHelper(FormulaHelper&&) = delete; + FormulaHelper& operator=(FormulaHelper&&) = delete; + + bool Parse(const std::string& input); + double Calculate(); + + /// Re-parse with a new expression, replacing any previously compiled formula. + /// Returns true on success. On failure the old formula remains active. + bool Reparse(const std::string& input); + + /// Compile `input` into a temporary expression and return true if it succeeds. + /// On failure, `errorOut` receives the first parser error message. 
+ /// Does NOT affect the active formula. + static bool Validate(const std::string& input, std::string& errorOut); + + static void SetParam(int32_t index, double value); + static double GetParam(int32_t index); + + private: + void* _ptr; + }; + + // ------------------------------------------------------------------------- + // Budget mode enum + // ------------------------------------------------------------------------- + enum class BudgetModeEnum : int32_t + { + Auto = 0, ///< DEPRECATED: kept only for save-file backward compat. Migrated to Formula at load. + Manual = 1, ///< Fixed slider value (default) + Formula = 2, ///< User-editable exprtk expression + }; + + // ------------------------------------------------------------------------- + // Settings + // All shadow-scheduling knobs. Held inside LightLimitFix::Settings and + // serialised as part of that JSON blob. Pass a const-ref to Init(). + // ------------------------------------------------------------------------- + struct Settings + { + /// Enable the shadow caster scheduler entirely. Requires a game restart to take effect. + bool Enabled = true; + + /// Number of simultaneous shadow-casting point/spot lights (NOT counting the directional sun). + /// 0 = scheduler active but selects no point lights (sun/directional unaffected). + /// 4 = vanilla point light count with intelligent selection replacing the game's default. + /// 5-127 = extended mode; depth buffer array is expanded beyond the game's 8-slot limit + /// when this exceeds 8. The practical ceiling is VRAM (per-slice cost + /// of the kSHADOWMAPS texture array) -- the in-game settings panel + /// shows a live projection so users can see when a value won't fit. + /// Higher values allow more lights to hold stale shadow maps between redraws at + /// the cost of startup memory. The redraw budget and interval formula control + /// per-frame GPU cost independently. 
+ int32_t ShadowLightCount = 16; + + /// Number of additional converted-light slots (lights treated as normal lights + /// for geometry but tracked alongside shadow casters when ConvertExcessToNormal is enabled). + int32_t ConvertedShadowSlots = 32; + + /// Allow a newly-chosen light to draw even if it was not chosen last frame. + bool AllowDrawNewLight = true; + + /// Hard cap on how many lights may re-render their shadow maps in one + /// frame. Floored at kMinMaxRedrawPerFrame. + int32_t MaxRedrawPerFrame = 16; + + /// Lower bound for MaxRedrawPerFrame. Below this the per-slot redraw + /// rotation is slow enough that camera-relative jitter on the cluster + /// shadow lookup crosses occluder silhouettes between TAA frames, + /// producing visible shadow flicker on the nearest light's + /// contribution. Enforced in the ImGui slider and on JSON load. + static constexpr int32_t kMinMaxRedrawPerFrame = 4; + + /// How the per-frame shadow redraw budget is determined. + /// Manual is the default — predictable, doesn't ping-pong, and matches + /// the spirit of Intellightent's original behaviour. Formula is available + /// for power users who want adaptive logic, with the caveat that any + /// formula referencing `frametime` will tend to oscillate (rendering + /// shadows raises frametime, which removes the headroom that allowed + /// the budget — classic feedback loop without hysteresis). + BudgetModeEnum BudgetMode = BudgetModeEnum::Manual; + + /// Per-frame time budget for shadow re-renders (milliseconds). + /// Used in Manual mode. Lights whose estimated GPU cost would exceed this + /// are deferred to a later frame. + float RedrawBudgetMs = 5.0f; + + /// Demote shadow lights that exceed the active caster limit to normal (non-shadow) lights + /// so they still contribute diffuse lighting without a shadow-map cost. + bool ConvertExcessToNormal = true; + + /// Promote normal (non-shadow) lights to shadow casters when there is budget. 
+ /// Disabled by default; experimental. + bool PromoteNormalToShadow = false; + + /// Force-enable portal-strict on shadow casters as they're added by + /// the engine. Per-type because portal-strict on spotlights drops + /// culled-but-visible spots entirely, while on omnis/hemispheres it + /// usefully tightens the engine's portal-graph visibility test. + /// + /// Defaults: omni + hemi enforced, spotlights left to their + /// engine-authored portal-strict flag. + bool ForceEnablePortalStrictOmni = true; + bool ForceEnablePortalStrictHemi = true; + bool ForceEnablePortalStrictSpot = false; + + // --- Formula strings (exprtk expressions) --- + + /// Light priority scoring formula (exprtk). Variables: + /// lightindex, lightintensity, lightdistance, playerlightdistance, + /// lightradius, lightx/y/z, lightr/g/b, lightambientr/g/b, + /// lightchosenlastframe, lightframessincerender, lightneverfades, + /// lightportalstrict, lightns, lightconverted, camerax/y/z, + /// isinterior, timeofday, lightisspot, lightspotvisible + /// Default-formula notes: + /// (1 + lightisspot * lightspotvisible) gives visible spots 2x, + /// omnis 1x. lightspotvisible=0 for spots pointing away from camera. + /// max(0, 1 - lightframessincerender / 8) * 0.4 is smooth temporal + /// stickiness; recently-rendered lights resist demotion across small + /// score perturbations, decaying to 0 over 8 frames since last redraw. + std::string ScoreFormula = "lightradius * lightintensity / (1 + ((1 - lightneverfades) * lightdistance) / 1000) * (1 + max(0, 1 - lightframessincerender / 8) * 0.4) * (1 + lightisspot * lightspotvisible)"; + + /// Redraw interval formula (per light). Higher = less frequent redraws. + /// Uses min(lightdistance, playerlightdistance) so that a light near the player + /// character is always treated as close even in third-person (camera is further away). + /// `lightdisplacement` further reduces the interval for lights that have moved. 
+ std::string RedrawIntervalFormula = "min(10, (max(0, min(lightdistance, playerlightdistance) - lightradius * 0.5) / 500) / max(0.5, lightintensity)) * (lightconverted * 5 + 1) - min(lightdisplacement / 5, 10)"; + + /// Redraw budget formula (per frame, in ms). Used in Formula mode. + /// Default mirrors Intellightent's original behaviour: a flat 1 ms outdoors + /// (`isinterior` = 0) and 2 ms indoors (`isinterior` = 1). Predictable and + /// doesn't oscillate. + /// + /// Available variables: frametime (smoothed ms), frametarget (90th-pct ms), + /// stableframes, isinterior, plus the per-light variables (used by ScoreFormula + /// and RedrawIntervalFormula but evaluated to last-light values here). + /// + /// Avoid adaptive formulas that subtract shadow GPU cost from frametime + /// headroom -- they oscillate (rendering shadows raises frametime, + /// which zeroes the budget, which drops frametime, restoring the + /// budget). exprtk has no hysteresis state. Use static expressions. + std::string RedrawBudgetFormula = "1 + isinterior"; + + // --- Importance scheduling curve --- + + /// Interval multiplier applied to high-importance lights (importance >= 1). + /// Lower values make frequently-contributing lights update shadows more aggressively. + /// Default: 0.05 (updates 40x more frequently than unimportant lights). + float ImportanceMinScale = 0.05f; + + /// Interval multiplier applied to unimportant lights (importance == 0). + /// Higher values defer dim or distant lights more aggressively. + /// Default: 2.0. 
+ float ImportanceMaxScale = 2.0f; + }; + + NLOHMANN_JSON_SERIALIZE_ENUM(BudgetModeEnum, + { { BudgetModeEnum::Auto, 0 }, { BudgetModeEnum::Manual, 1 }, { BudgetModeEnum::Formula, 2 } }) + + NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE_WITH_DEFAULT( + Settings, + Enabled, + ShadowLightCount, + ConvertedShadowSlots, + AllowDrawNewLight, + MaxRedrawPerFrame, + BudgetMode, + RedrawBudgetMs, + ConvertExcessToNormal, + PromoteNormalToShadow, + ForceEnablePortalStrictOmni, + ForceEnablePortalStrictHemi, + ForceEnablePortalStrictSpot, + ScoreFormula, + RedrawIntervalFormula, + RedrawBudgetFormula, + ImportanceMinScale, + ImportanceMaxScale) + + // ------------------------------------------------------------------------- + // Per-light schedule entry + // ------------------------------------------------------------------------- + struct LightEntry + { + RE::BSShadowLight* Light{ nullptr }; + + /// Sort key: LastDrawnFrame + computed interval. Lower = higher priority. + double RedrawScore{ 0.0 }; + + /// Frame number this light last rendered its shadow map. + int32_t LastDrawnFrame{ -1 }; + + /// Set each frame by the scheduler; consumed by the render hook. + bool RedrawFrame{ false }; + + /// Slot index in the LightContainer array. + int32_t Index{ -1 }; + + /// World position of the light at its last rendered shadow map frame. + /// Used to prioritise redraws for lights that have moved significantly. + RE::NiPoint3 lastRenderedPos{ 0.0f, 0.0f, 0.0f }; + + /// Contribution-weighted importance score from the last scheduling frame. + /// importance = luminance(diffuse × fade) × attenuation²(viewer, radius) + /// where attenuation = max(1 − (dist/radius)², 0) (Skyrim's quadratic falloff). + /// Typically in [0, 1]; can exceed 1 for very bright lights at close range. + /// Higher = light strongly illuminates the area around the viewer. 
+ float lastImportance{ 0.0f }; + + /// Hash of the shadow scene at the most recent successful redraw: + /// light pose + radius + each caster's worldBound + identity. Compared + /// against the current frame's hash to detect when the cached shadow + /// map is still pixel-correct (no geometric change since last render). + /// Industry term: "cached shadow maps" (UE5, CryEngine, Frostbite). + /// 0 sentinel = never-rendered; treat as "needs redraw" on first frame. + std::uint64_t lastGeomHash{ 0 }; + + /// Hash computed for the current frame's scoring pass. Promoted into + /// lastGeomHash only when this entry actually redraws (RedrawFrame=true), + /// so the cache key reflects what's *in the slot*, not what we observed. + std::uint64_t pendingGeomHash{ 0 }; + + void Clear() + { + Light = nullptr; + LastDrawnFrame = -1; + RedrawFrame = false; + lastRenderedPos = { 0.0f, 0.0f, 0.0f }; + lastImportance = 0.0f; + lastGeomHash = 0; + pendingGeomHash = 0; + } + }; + + // ------------------------------------------------------------------------- + // Container for the active light pool + // ------------------------------------------------------------------------- + struct LightContainer + { + LightEntry* Lights{ nullptr }; + + /// true when index 0 is the directional sun (always active, never rescheduled). + bool Sun{ false }; + + /// Total allocated slots (ShadowLightCount + ConvertedShadowSlots). + int32_t Size{ 0 }; + + /// Returns the first free shadow-caster slot index, or -1 if full. + int32_t FindFreeIndex(bool shadowSlot, int32_t shadowCount, int32_t convertCount) const; + + /// Returns the index of a light pointer in the shadow-caster range, or -1. + int32_t FindLight(RE::BSShadowLight* light, int32_t shadowCount) const; + + /// First pool index of the point-light range. Equals 1 when Sun=true + /// (slot 0 reserved for sun bookkeeping), 0 when Sun=false. + int32_t PointLightFirst() const { return Sun ? 
1 : 0; } + + /// One-past-last pool index of the point-light range, given the + /// configured ShadowLightCount. Use as the exclusive upper bound for + /// `for (i = PointLightFirst(); i < PointLightEnd(N); ++i)` iteration + /// over chosen+candidate point lights (excludes converted slots which + /// follow at [PointLightEnd..PointLightEnd + ConvertedShadowSlots)). + /// + /// Off-by-one history: pre-this-helper, code iterated [0, shadowCount), + /// which missed pool[shadowCount] when Sun=true. The highest point-light + /// slot was then unfindable / unrendered / un-redrawn — silent loss of + /// one shadow caster slot when a sun is present. + int32_t PointLightEnd(int32_t shadowCount) const { return PointLightFirst() + shadowCount; } + }; + + // ------------------------------------------------------------------------- + // Per-light GPU timing tracker (sliding-window average over 8 frames) + // ------------------------------------------------------------------------- + static constexpr int kBudgetWindowSize = 8; + + struct BudgetEntry + { + uint64_t Key{ 0 }; + uint32_t Tracked[kBudgetWindowSize]{}; ///< Ring buffer of per-frame µs costs. + int32_t TrackedCount{ 0 }; + int32_t LastTrackedHelper{ -1 }; + uint32_t Progress{ 0 }; ///< Accumulated step-0 cost awaiting step-1. + int32_t Current{ 0 }; ///< Rolling sum of Tracked[]. + + void BeginStep(int32_t step); + void EndStep(int32_t step, int32_t helperCounter); + + /// Returns true when the entry hasn't been updated in ~600 scheduler ticks. + bool IsExpired(int32_t helperCounter) const; + + private: + int64_t _startTime{ 0 }; + }; + + struct BudgetTracker + { + void Begin(int32_t step); + void BeginLight(RE::BSShadowLight* light, int32_t step); + void EndLight(RE::BSShadowLight* light, int32_t step); + + /// Returns estimated render cost (µs) for a light. + /// Falls back to the mean of all tracked lights for unseen lights. 
+ int32_t GetCost(RE::BSShadowLight* light) const; + + /// Returns the mean GPU cost (µs) averaged over all currently tracked lights. + int32_t GetAverageCostUs() const; + + private: + int32_t _counter{ 0 }; + std::unordered_map> _map; + + void CleanupExpired(); + }; + + // ------------------------------------------------------------------------- + // Per-slot visualization metadata (filled by LLF::CopyShadowLightData) + // ------------------------------------------------------------------------- + struct ShadowSlotInfo + { + uint32_t type = 0; ///< Shadow type: 0=spot/frustum, 1=hemisphere, 2=omnidirectional + float range = 0.0f; ///< Light range (world units) -- radius for point lights, cone distance for spots + bool valid = false; ///< true when this slot was written this frame + uintptr_t lightKey = 0; ///< Light object pointer (stable key for suppression) + }; + + /// Resets slot metadata for a new frame. Call at the start of CopyShadowLightData. + void BeginSlotFrame(uint32_t slotCount); + + /// Records metadata for one filled shadow slot. + void RecordSlot(uint32_t depthSlot, const ShadowSlotInfo& info); + + /// Returns true if the light with this pointer key has been suppressed by the user. + /// Includes implicit suppression from solo mode (every key except the soloed one). + bool IsSuppressed(uintptr_t lightKey); + + /// Returns true if any lights are currently suppressed (explicit or via solo). + bool HasSuppressedLights(); + + /// Returns true if any debug override is active (suppress / pin shadow / + /// pin convert / solo). Used by the LLF overlay's visibility gate so the + /// overlay stays available while users have any override in effect, even + /// without the visualisation modes or the explicit ShowShadowOverlay toggle. 
+ bool HasAnyOverrides(); + + // ------------------------------------------------------------------------- + // Debugging override API + // + // Per-light state pins (Shadow / Convert) override the scheduler's automatic + // chosen/excess decision. Useful for isolating a single light's behaviour + // when chasing scheduler / cluster pipeline regressions: + // - Pin Shadow: bias scoring so the light is forced into the chosen pool + // (gets a real shadow slot up to ShadowLightCount). + // - Pin Convert: bias scoring to the bottom and force ConvertLight in the + // excess branch regardless of the ConvertExcessToNormal user setting + // (still honours the spot-gate -- spots that can't safely convert are + // disabled instead). + // - Suppress: existing behaviour (ShadowParam.y = -1 for casters; cluster + // filter for converted / non-shadow lights via solo). + // - Solo: when set, every key OTHER than the soloed one is reported as + // suppressed via IsSuppressed(). Lets you isolate one light's + // contribution against a black scene. + // ------------------------------------------------------------------------- + bool IsPinnedShadow(uintptr_t lightKey); + bool IsPinnedConvert(uintptr_t lightKey); + + void SetPinnedShadow(uintptr_t lightKey, bool pinned); + void SetPinnedConvert(uintptr_t lightKey, bool pinned); + + uintptr_t GetSoloLight(); + void SetSoloLight(uintptr_t lightKey); // 0 clears solo + + /// Mouse-hover key for the per-frame debug pulse. Set per row by the table + /// when the row is hovered; reset to 0 when the table redraws or the cursor + /// leaves the table. The cluster light builder (LightLimitFix::UpdateLights) + /// reads this to apply a magenta pulse to the matching light, making it + /// visible in 3D against the rest of the scene. + uintptr_t GetHoveredLight(); + void SetHoveredLight(uintptr_t lightKey); + + /// Drops every override (suppress / pin shadow / pin convert / solo). 
+ /// Useful when a debugging session has accumulated state and lights are + /// mysteriously hidden -- one click resets to the scheduler's auto behaviour. + void ClearAllOverrides(); + + /// Returns the number of shadow slots consumed this frame. + uint32_t GetSlotUsage(); + + /// Returns the number of active shadow-casting lights whose importance score + /// exceeds 0.1 (lights meaningfully illuminating the camera or player area). + uint32_t GetHighImportanceCount(); + + /// Read-only view of the per-slot metadata for the current frame. + const std::vector<ShadowSlotInfo>& GetSlotInfos(); + + /// Returns the display name for a shadow type index (0=Spot, 1=Hemi, 2=Omni). + const char* GetShadowTypeName(uint32_t type); + + /// Returns the golden-ratio hue colour for shadow-map slot slotIdx as an ImVec4. + /// Matches the mode-8 shader visualisation colour. + ImVec4 ShadowSlotHueColor(uint32_t slotIdx); + + /// Draw the interactive shadow caster table (suppress/filter/sort). + /// compact=true caps height; showColor adds a hue swatch column (viz mode 8). + /// sceneOnly=true shows only lights currently in the scene (overlay); false shows all known lights including disabled ones (settings). + /// readOnly hides the per-row Mode/Solo buttons (overlay when the menu is + /// closed isn't interactive anyway, so the buttons just take up space). + void DrawShadowLightTable(bool compact, bool showColor, bool sceneOnly = false, bool readOnly = false); + + /// Canonical one-place "where are we vs the limits" summary. Used by both + /// the menu's Active Casters block and the overlay header so the same + /// numbers appear identically in both views. clusterCount/clusterMax come + /// from LightLimitFix; the rest is read from SCM internal state. 
+ void DrawShadowSummary(uint32_t clusterCount, uint32_t clusterMax, uint32_t shadowUnshadowedLightCount); + + // ------------------------------------------------------------------------- + // Public API + // ------------------------------------------------------------------------- + + /// Call once from LightLimitFix::PostPostLoad() before Install(). + /// Allocates the light container and initialises state from settings. + void Init(const Settings& settings); + + /// Install all game hooks. Call from LightLimitFix::PostPostLoad(). + void Install(const Settings& settings); + + /// Per-frame update: refreshes installed slot count from the texture-array + /// capacity and applies settings changes (pool resize, etc.). The actual + /// scheduling -- choosing which lights cast shadows this frame -- happens + /// in the hooked `CalculateActiveShadowCasters` path via ScheduleShadowCasters. + /// Call from LightLimitFix::Prepass(). + void Update(const Settings& settings, RE::ShadowSceneNode* shadowSceneNode, + RE::NiCamera* worldCamera); + + /// Clear all transient session state (pool entries, converted-light + /// tracking, debug overrides). Used by the LoadingMenu hook to drop + /// pointers to lights the engine is about to free during fast-travel + /// / cell change. The per-frame reconciliation in ScheduleShadowCasters + /// covers the same ground for incremental changes; this is the wholesale + /// reset for known scene boundaries so the UI counter and table read 0 + /// during the loading screen instead of carrying stale entries forward. + void ResetSession(); + + /// Register the LoadingMenu open/close handler so ResetSession fires + /// when the user starts a fast-travel or cell transition. Call once + /// from SCM::Install after the rest of the hooks are in place. + void RegisterSceneTransitionEvents(); + + /// Returns a read-only view of the active light pool for UI/visualization. 
+ const LightContainer& GetLights(); + + /// Returns the kSHADOWMAPS texture-array slot for an active point/spot + /// shadow light as a raw slice index 0..GetInstalledSlotCount()-1, or -1 + /// when the light is either not active in the SCM pool OR is the sun. + /// Consumers (ShadowRenderer upload, LightLimitFix cluster builder, + /// strict-light shadow-flag setup) treat the -1 sentinel as "skip" -- + /// the sun renders to kSHADOWMAPS_ESRAM (a separate texture) and has no + /// kSHADOWMAPS slice; inactive lights have no slot at all. + /// Uses the internal s_lights pool -- does not read the descriptor's + /// shadowmapIndex field, which may be corrupted by ReturnShadowmaps(). + int32_t GetShadowSlot(RE::BSShadowLight* light); + + /// Visit every shadow light currently demoted to non-shadow rendering via + /// ConvertExcessToNormal. These lights live in the engine's activeShadowLights + /// list (0x148) but are reported as non-shadow by Hook_IsShadowLight. The + /// cluster pipeline (LightLimitFix::UpdateLights) needs to inject them into + /// lightsData[] without the Shadow flag so they still contribute diffuse light. + /// + /// Visitor signature: void(RE::BSShadowLight* light). Pointers are stable for + /// the duration of the call (no concurrent scheduler mutation). + void ForEachConvertedLight(const std::function<void(RE::BSShadowLight*)>& visitor); + + /// Draw scheduler stats (avg redraws/frame and avg per-light cost). + /// Reads internal SCM state so the caller doesn't need accessors. Intended + /// to render directly under DrawShadowLightTable for testing context. + void DrawShadowSchedulerStats(); + + /// Draw per-mode overlay info for shadow-related visualisation modes (3-9). + /// Call from LightLimitFix::DrawOverlay() inside the vizOn block for modes >= 3. + /// totalLightCount is the current clustered light count owned by LightLimitFix. 
+ void DrawOverlayShadowModeInfo(uint32_t mode, uint32_t shadowUnshadowedLightCount, uint32_t totalLightCount); + + /// Appends tooltip text for visualisation modes 3-9 (all shadow-specific). + /// Call from LightLimitFix::DrawSettings() inside the LightsVisualisationMode hover tooltip, + /// immediately after the LLF-owned entries for modes 0-2. + void DrawVisualisationTooltipShadowModes(); + + /// Draw the ImGui settings panel for the shadow caster scheduler. + /// Call from LightLimitFix::DrawSettings(). + void DrawSettings(Settings& settings); + + /// Apply any Skyrim-side INI overrides SCM owns (currently just + /// iShadowMapResolution:Display) at LoadSettings time. The engine has + /// already read SkyrimPrefs.ini at startup so this is a no-op for now, + /// but the seam exists for future overrides that need a pre-Install + /// hook. Call from LightLimitFix::LoadSettings. + void LoadINISettings(); + + /// Persist any Skyrim-side INI settings SCM owns directly to the user's + /// SkyrimPrefs.ini in their Documents folder. Only writes when the user + /// actually edited a value this session. Call from + /// LightLimitFix::SaveSettings. + void SaveINISettings(); + +} diff --git a/src/Features/LightLimitFix/ShadowRenderer.cpp b/src/Features/LightLimitFix/ShadowRenderer.cpp new file mode 100644 index 0000000000..9b44116d6b --- /dev/null +++ b/src/Features/LightLimitFix/ShadowRenderer.cpp @@ -0,0 +1,353 @@ +// Shadow rendering operations for LightLimitFix. +// Contains: resource setup, per-frame data copy, and shadow-specific UI. + +#include "../LightLimitFix.h" +#include "Deferred.h" +#include "Menu/ThemeManager.h" +#include "State.h" +#include "Util.h" + +// Fills a ShadowLightData entry from a light's shadowmap descriptor transform. 
+// Returns true on success, false when the light has no usable descriptors -- +// the caller must treat false as "do not advertise a valid shadow for this +// slot", because ShadowProj remains at its default zero matrix and the +// shader's depth-comparison sampling against that matrix collapses to +// "fully shadowed" (the worst possible visual outcome -- e.g. grass goes +// pitch black under any shadow-flagged point light). Pair this with a +// ShadowParam.y = 0 fallback in the caller so the shader's safe sentinel +// (`if (ShadowLightParam.y == 0) return 1.0;`) keeps the slot fully lit +// instead of fully dark. +template <typename T> +static bool SetShadowParameters(T& lightData, Deferred::ShadowLightData& sd) +{ + if (lightData.shadowmapDescriptors.empty()) + return false; + + auto& desc = lightData.shadowmapDescriptors[0]; + DirectX::XMMATRIX proj = DirectX::XMLoadFloat4x4(reinterpret_cast<const DirectX::XMFLOAT4X4*>(&desc.lightTransform)); + DirectX::XMStoreFloat4x4(&sd.ShadowProj, proj); + + DirectX::XMMATRIX invProj = DirectX::XMMatrixInverse(nullptr, proj); + DirectX::XMStoreFloat4x4(&sd.InvShadowProj, invProj); + + sd.ShadowParam.z = lightData.shadowBiasScale * 0.00025f; + return true; +} + +// ─── Per-frame shadow data copy ─────────────────────────────────────────────── + +void LightLimitFix::EarlyPrepass() +{ + auto state = globals::state; + state->BeginPerfEvent("LLF CopyShadowLightData"); + CopyShadowLightData(); + state->EndPerfEvent(); +} + +void LightLimitFix::CopyShadowLightData() +{ + ZoneScoped; +#ifdef TRACY_ENABLE + TracyD3D11Zone(globals::state->tracyCtx, "LLF CopyShadowLightData"); +#endif + + uint32_t slots = ShadowCasterManager::GetInstalledSlotCount(); + if (slots == 0) { + // Clean degradation when SCM hasn't published a usable slot count yet + // (e.g. before SetupResources finishes, or after a transient + // reallocation failure). 
Without this clear, the previous frame's + // slot metadata, counters, and PS bindings at t102/t103 stay live -- + // the overlay shows stale shadow rows and shaders keep sampling + // stale shadow records instead of degrading cleanly to unshadowed. + ShadowCasterManager::BeginSlotFrame(0); + shadowLightCount = 0; + shadowUnshadowedLightCount = 0; + ID3D11ShaderResourceView* nullSRVs[2]{ nullptr, nullptr }; + globals::d3d::context->PSSetShaderResources(102, ARRAYSIZE(nullSRVs), nullSRVs); + return; + } + + auto* shadowSceneNode = globals::game::smState->shadowSceneNode[0]; + if (!shadowSceneNode) { + // Same cleanup contract as the slots==0 path above: clear slot + // metadata + counters and unbind t102/t103 so the overlay doesn't + // show stale rows and shaders degrade to unshadowed instead of + // sampling a previous frame's records. + ShadowCasterManager::BeginSlotFrame(0); + shadowLightCount = 0; + shadowUnshadowedLightCount = 0; + ID3D11ShaderResourceView* nullSRVs[2]{ nullptr, nullptr }; + globals::d3d::context->PSSetShaderResources(102, ARRAYSIZE(nullSRVs), nullSRVs); + return; + } + + // Lazy (re)allocation when slot count changes (e.g. on resolution change). 
+ if (!shadowLights || shadowLightsCapacity != slots) { + delete shadowLights; + shadowLights = nullptr; + + D3D11_BUFFER_DESC sbDesc{}; + sbDesc.Usage = D3D11_USAGE_DYNAMIC; + sbDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE; + sbDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE; + sbDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; + sbDesc.StructureByteStride = sizeof(Deferred::ShadowLightData); + sbDesc.ByteWidth = slots * sizeof(Deferred::ShadowLightData); + + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc{}; + srvDesc.Format = DXGI_FORMAT_UNKNOWN; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = slots; + + shadowLights = new Buffer(sbDesc, nullptr, "LLF::ShadowLights"); + shadowLights->CreateSRV(srvDesc); + shadowLightsCapacity = slots; + } + + // Static reusable buffer for per-frame shadow light data. The previous + // `std::vector(slots)` ctor heap-allocated every frame in this render + // hot path -- avoidable churn given the slot count only changes on + // resolution / settings reconfigures (matched by the shadowLights + // buffer reallocation block above). assign(slots, {}) reuses the + // backing storage when slot count is stable and zero-fills entries. 
+ static std::vector<Deferred::ShadowLightData> sd; + sd.assign(slots, {}); + uint32_t prevSlotUsage = ShadowCasterManager::GetSlotUsage(); + ShadowCasterManager::BeginSlotFrame(slots); + auto context = globals::d3d::context; + + ID3D11ShaderResourceView* shadowMapsSRV = + globals::game::renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGET_DEPTHSTENCIL::kSHADOWMAPS].depthSRV; + + uint32_t plCount = 0; + uint32_t unshadowedLights = 0; + ShadowCasterManager::ForEachShadowLight(shadowSceneNode->GetRuntimeData().shadowLightsAccum, + [&](RE::BSShadowLight* light) { + // Use the stable container-slot index from s_lights rather than + // reading shadowmapDescriptors[0].shadowmapIndex, which can drift + // relative to our scheduler-assigned slot when ReturnShadowmaps + // fires between ScheduleShadowCasters and this function. + int32_t stableSlot = ShadowCasterManager::GetShadowSlot(light); + if (stableSlot < 0) { + // Sun (BSShadowDirectionalLight) — no kSHADOWMAPS slice. Its + // shadow lives in kSHADOWMAPS_ESRAM and is sampled through a + // separate path (DirectionalShadowCascades at t99). Skip + // silently so we don't count it as an "unshadowed point + // light" or scribble garbage into sd[0]. + return; + } + if (static_cast<uint32_t>(stableSlot) >= slots) { + unshadowedLights++; + plCount++; + return; + } + uint32_t depthSlot = static_cast<uint32_t>(stableSlot); + + { + float shadowTypeF = light->GetIsParabolicLight() ? float(light->shadowMapCount == 2 ? 2 : 1) : 0.f; + sd[depthSlot].ShadowParam.x = shadowTypeF; + + const bool projValid = globals::game::isVR ? + SetShadowParameters(light->GetVRRuntimeData(), sd[depthSlot]) : + SetShadowParameters(light->GetRuntimeData(), sd[depthSlot]); + + float range = light->light->GetLightRuntimeData().radius.x; + // ShadowParam.y semantics in the shader: + // > 0 → valid radius; sample kSHADOWMAPS via ShadowProj at the slot. + // == 0 → safe sentinel; shader returns 1.0 (fully lit, no shadow). + // < 0 → suppression sentinel; shader returns 0.0 (fully dark). 
+ // If SetShadowParameters skipped (empty descriptors -> ShadowProj + // stays default zero matrix), we MUST leave ShadowParam.y at 0 so + // the safe sentinel fires. Otherwise the shader samples a zero + // projection -> depth comparison says fully shadowed -> any + // shadow-flagged light with stale descriptors makes grass go + // pitch black under that light. + uintptr_t lightKey = reinterpret_cast<uintptr_t>(light); + const bool suppressed = ShadowCasterManager::IsSuppressed(lightKey); + sd[depthSlot].ShadowParam.y = suppressed ? -1.0f : (projValid ? range : 0.0f); + ShadowCasterManager::RecordSlot(depthSlot, + { static_cast(shadowTypeF), range, true, lightKey }); + } + + plCount++; + }); + + if (plCount != shadowLightCount || ShadowCasterManager::GetSlotUsage() != prevSlotUsage || unshadowedLights != shadowUnshadowedLightCount) { + shadowLightCount = plCount; + shadowUnshadowedLightCount = unshadowedLights; + + // Throttle the count-change log: this fires every time plCount or + // slot usage moves by even 1, which in busy outdoor scenes is + // effectively every frame. Earlier logs averaged ~10 entries/sec + // (25k lines over a 39-minute session, dwarfing every other + // signal). Two filters: + // - Significance: only log when the count moves by >= 4 from + // the last logged value, OR when the unshadowed-lights count + // changes at all (rarer, more interesting). + // - Rate: floor at 1 line/sec via QueryPerformanceCounter + // (project convention; State.h uses QPC, std::chrono is disfavored). 
+ static int s_lastLoggedShadowCount = -1; + static uint32_t s_lastLoggedUnshadowed = 0; + static LARGE_INTEGER s_lastLogQpc = { .QuadPart = 0 }; + static LARGE_INTEGER s_qpcFrequency = []() { + LARGE_INTEGER f{}; + QueryPerformanceFrequency(&f); + return f; + }(); + LARGE_INTEGER now; + QueryPerformanceCounter(&now); + const bool unshadowedChanged = unshadowedLights != s_lastLoggedUnshadowed; + const bool countSignificant = s_lastLoggedShadowCount < 0 || + std::abs(static_cast<int>(plCount) - s_lastLoggedShadowCount) >= 4; + const bool rateOk = (now.QuadPart - s_lastLogQpc.QuadPart) >= s_qpcFrequency.QuadPart; + if ((countSignificant || unshadowedChanged) && rateOk) { + s_lastLoggedShadowCount = static_cast<int>(plCount); + s_lastLoggedUnshadowed = unshadowedLights; + s_lastLogQpc = now; + if (unshadowedLights > 0) + logger::debug("[LLF] {} shadow lights, {} / {} slots used; {} lights dropped (no shadow)", + plCount, ShadowCasterManager::GetSlotUsage(), slots, unshadowedLights); + else + logger::debug("[LLF] {} shadow lights, {} / {} slots used", plCount, ShadowCasterManager::GetSlotUsage(), slots); + } + } + + { + D3D11_MAPPED_SUBRESOURCE mapped{}; + DX::ThrowIfFailed(context->Map(shadowLights->resource.get(), 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)); + memcpy(mapped.pData, sd.data(), slots * sizeof(Deferred::ShadowLightData)); + context->Unmap(shadowLights->resource.get(), 0); + ID3D11ShaderResourceView* srv = shadowLights->srv.get(); + context->PSSetShaderResources(102, 1, &srv); + } + + context->PSSetShaderResources(103, 1, &shadowMapsSRV); +} + +// ─── Debug helpers ──────────────────────────────────────────────────────────── + +std::string LightLimitFix::BuildShadowSlotColorLegend() const +{ + const auto& shadowSlotInfos = ShadowCasterManager::GetSlotInfos(); + if (shadowSlotInfos.empty()) + return {}; + + std::string out = "Shadow Slot Color Map (Mode 8):\n"; + for (uint32_t i = 0; i < static_cast<uint32_t>(shadowSlotInfos.size()); ++i) { + const auto& info = 
shadowSlotInfos[i]; + if (!info.valid) + continue; + + float hue = fmodf(float(i) * 0.618033988f, 1.0f); + ImVec4 c = ShadowCasterManager::ShadowSlotHueColor(i); + auto ri = static_cast(c.x * 255.0f); + auto gi = static_cast(c.y * 255.0f); + auto bi = static_cast(c.z * 255.0f); + + out += std::format(" Slot {:2d} | hue {:5.3f} | #{:02X}{:02X}{:02X} | {:11s} | r={:.0f}\n", + i, hue, ri, gi, bi, ShadowCasterManager::GetShadowTypeName(info.type), info.range); + } + return out; +} + +// ─── Overlay ───────────────────────────────────────────────────────────────── + +void LightLimitFix::DrawOverlay() +{ + // Overlay shows when: + // - visualisation modes are active (debug heatmaps), OR + // - any light is suppressed (so the suppression list stays accessible), OR + // - any debug override is in effect (pin shadow / pin convert / solo) so + // users can find what they pinned without remembering to toggle anything, OR + // - the user explicitly opted in via Show Shadow Overlay (lets the table's + // debug controls — cycle button, solo, hover-pulse — be reachable in + // the default state without first triggering a side-effect). + bool vizOn = EnableLightsVisualisation; + bool hasSuppressed = ShadowCasterManager::HasSuppressedLights(); + bool hasOverrides = ShadowCasterManager::HasAnyOverrides(); + bool showOverlay = settings.ShowShadowOverlay; + if (!vizOn && !hasSuppressed && !hasOverrides && !showOverlay) + return; + + // When the CS menu is open, show a draggable/resizable window so the user can + // move it out of the way and expand the table. When the menu is closed, keep + // it as a compact pinned overlay (no title bar, no chrome). + bool menuOpen = globals::menu->IsEnabled; + const float pos = ThemeManager::Constants::OVERLAY_WINDOW_POSITION * Util::GetUIScale(); + + // Single unified window: same ImGui ID across menu open/closed so the + // user's resize persists. 
Title bar and Move are toggled via flags -- + // hidden when the menu is closed (pinned debug overlay) and shown when + // the menu is open (so the user can drag/resize via the title bar). + // We deliberately don't pass NoSavedSettings so ImGui retains the size + // the user picked across sessions. + ImGuiWindowFlags flags = ImGuiWindowFlags_None; + if (!menuOpen) + flags |= ImGuiWindowFlags_NoTitleBar | ImGuiWindowFlags_NoMove; + + ImGui::SetNextWindowPos(ImVec2(pos, pos), menuOpen ? ImGuiCond_Appearing : ImGuiCond_Always); + ImGui::SetNextWindowSize(ImVec2(340, 480), ImGuiCond_FirstUseEver); + ImGui::SetNextWindowSizeConstraints(ImVec2(280, 200), ImVec2(800, 1200)); + ImGui::Begin("LLF Shadow Slots", nullptr, flags); + + if (vizOn) { + static const char* kVizNames[] = { + "Light Limit", "Strict Lights Count", "Clustered Lights Count", + "Shadow Mask", "Shadow Light Count", "Point Light Shadow Factor", + "Unshadowed Point Lights", "Shadow Caster Density", + "Shadow Slot Index Color", "Light Type Visualization" + }; + uint32_t m = LightsVisualisationMode; + const char* vizName = (m < IM_ARRAYSIZE(kVizNames)) ? kVizNames[m] : "Unknown"; + ImGui::TextColored(ImVec4(1.0f, 0.3f, 0.3f, 1.0f), "LLF DEBUG - %s", vizName); + } else + ImGui::TextColored(ImVec4(1.0f, 0.6f, 0.2f, 1.0f), "LLF - Shadow Suppression"); + ImGui::Separator(); + + uint32_t mode = vizOn ? LightsVisualisationMode : UINT32_MAX; + + // ── All stats grouped above the table (same order as menu) ───────── + // Summary always visible. Scheduler stats only when not in a viz mode + // that has its own legend competing for the same space. 
+ ShadowCasterManager::DrawShadowSummary(lightCount, MAX_LIGHTS, shadowUnshadowedLightCount); + if (!vizOn) + ShadowCasterManager::DrawShadowSchedulerStats(); + + // ── Per-mode informational panels (visualization-mode-specific only) ── + if (vizOn) { + if (mode == 2) { + uint32_t cx = clusterSize[0], cy = clusterSize[1], cz = clusterSize[2]; + ImGui::Text("Cluster grid : %ux%ux%u (%u total)", cx, cy, cz, cx * cy * cz); + ImGui::Text("Max lights/cluster : %u", CLUSTER_MAX_LIGHTS); + } else if (mode >= 3) { + ShadowCasterManager::DrawOverlayShadowModeInfo(mode, shadowUnshadowedLightCount, lightCount); + } + } + + // ── Shadow slot toggle table ───────────────────────────────────── + // Show when in a shadow-related viz mode, or when lights are suppressed. + // readOnly=true when the menu is closed -- the overlay isn't interactive + // then, so the per-row Mode/Solo buttons would be dead pixels. readOnly + // also bounds the table height so the stats above stay visible even + // when many lights are present (the table scrolls internally instead + // of pushing the window past its max-height constraint). + bool shadowRelatedMode = !vizOn || (mode >= 4); + // Also show the table when the user explicitly opened the overlay + // (Show Shadow Overlay toggle) or has any per-light overrides -- the + // tooltip promises the table's debug controls are reachable any time + // once the overlay is open, but viz modes 0-3 leave shadowRelatedMode + // false so without these extra terms the user gets an empty window. + if (showOverlay || hasOverrides || hasSuppressed || shadowRelatedMode) { + ImGui::Separator(); + // compact=false in the overlay: the table fills the remaining + // content region of the user-sized window and scrolls internally + // (ScrollY in Util::ShowSortedStringTableCustom). Stats above stay + // visible regardless of how many lights exist or how the user has + // resized the window. 
readOnly is still true when the menu is + // closed -- buttons would be dead pixels. + ShadowCasterManager::DrawShadowLightTable(false, vizOn && (mode == 8), true, !menuOpen); + } + + ImGui::End(); +} diff --git a/src/Features/PerformanceOverlay.cpp b/src/Features/PerformanceOverlay.cpp index 957e591d02..0c863f93bf 100644 --- a/src/Features/PerformanceOverlay.cpp +++ b/src/Features/PerformanceOverlay.cpp @@ -1,6 +1,6 @@ /** * @file PerformanceOverlay.cpp - * @brief Real-time performance monitoring system for Skyrim Community Shaders + * @brief Real-time performance monitoring system for Skyrim Open Shaders * * This module provides comprehensive performance monitoring capabilities including: * - Real-time FPS and frame time tracking with configurable update intervals diff --git a/src/Features/RemoteControl.cpp b/src/Features/RemoteControl.cpp new file mode 100644 index 0000000000..186c926345 --- /dev/null +++ b/src/Features/RemoteControl.cpp @@ -0,0 +1,1060 @@ +// Remote Control feature: hosts an in-process Model Context Protocol (MCP) +// server inside CommunityShaders.dll, letting AI assistants query and mutate +// runtime state for A/B testing. Off by default and loopback-only. +// +// Transport: HTTP+SSE (Streamable HTTP, MCP 2025-03-26). +// Endpoint: http://:/mcp (modern, single endpoint) +// http://:/sse (legacy SSE, also exposed by cpp-mcp) + +#include "Features/RemoteControl.h" + +#include "Features/PerformanceOverlay/ABTesting/ABTesting.h" +#include "Features/RenderDoc.h" +#include "Features/ScreenshotFeature.h" +#include "Globals.h" +#include "State.h" + +#include +#include + +#include +#include +#include +#include +#include +#include + +// cpp-mcp headers. Kept inside the .cpp only so the vendored httplib/json +// in extern/cpp-mcp/common don't leak into other translation units. 
+#include "mcp_server.h" +#include "mcp_tool.h" + +namespace +{ + // The control endpoint is intentionally loopback-only — exposing it off-host + // would let any networked client toggle features and dispatch captures. + // Only accept literal loopback IPs: on Windows the hosts file (or a + // hijacked resolver) can map "localhost" to a routable address, which would + // silently break the loopback-only contract. + bool IsLoopbackAddress(const std::string& host) + { + return host == "127.0.0.1" || host == "::1"; + } + + void NormalizeBindAddress(std::string& host) + { + if (!IsLoopbackAddress(host)) { + logger::warn("Remote Control: non-loopback bindAddress '{}' rejected; forcing 127.0.0.1", host); + host = "127.0.0.1"; + } + } + + int ClampPort(int port) + { + return std::clamp(port, 1024, 65535); + } +} + +RemoteControl* RemoteControl::GetSingleton() +{ + return &globals::features::remoteControl; +} + +RemoteControl::RemoteControl() = default; + +RemoteControl::~RemoteControl() +{ + StopServer(); +} + +void RemoteControl::Load() +{ + // Settings have already been read in by the time Load() fires. + if (settings.enabled) { + StartServer(); + } +} + +void RemoteControl::Reset() +{ + // No per-frame state to reset. 
+} + +void RemoteControl::LoadSettings(json& o_json) +{ + settings.enabled = o_json.value("enabled", false); + settings.port = ClampPort(o_json.value("port", 8910)); + settings.bindAddress = o_json.value("bindAddress", std::string("127.0.0.1")); + NormalizeBindAddress(settings.bindAddress); +} + +void RemoteControl::SaveSettings(json& o_json) +{ + o_json["enabled"] = settings.enabled; + o_json["port"] = settings.port; + o_json["bindAddress"] = settings.bindAddress; +} + +void RemoteControl::RestoreDefaultSettings() +{ + settings = Settings{}; +} + +void RemoteControl::DrawSettings() +{ + ImGui::TextWrapped( + "Exposes Community Shaders over Model Context Protocol (MCP) so AI " + "assistants such as Claude Code can drive A/B testing, toggle " + "features, and trigger captures. Off by default. The endpoint is " + "loopback-only — any non-loopback bind address is rejected at load " + "and bind time."); + ImGui::Spacing(); + + const bool wasEnabled = settings.enabled; + if (ImGui::Checkbox("Enable MCP server", &settings.enabled)) { + if (settings.enabled && !wasEnabled) { + StartServer(); + } else if (!settings.enabled && wasEnabled) { + StopServer(); + } + } + + // Port + bind address can only be edited while the server is stopped. 
+ ImGui::BeginDisabled(IsRunning()); + ImGui::InputInt("Port", &settings.port); + settings.port = std::clamp(settings.port, 1024, 65535); + ImGui::InputText("Bind address", &settings.bindAddress); + ImGui::EndDisabled(); + if (IsRunning()) { + ImGui::SameLine(); + ImGui::TextDisabled("(stop the server to edit)"); + } + + if (!lastError.empty()) { + ImGui::TextColored(ImVec4(1.0f, 0.5f, 0.4f, 1.0f), + "Server error: %s", lastError.c_str()); + } + + if (IsRunning()) { + ImGui::TextColored(ImVec4(0.4f, 0.9f, 0.5f, 1.0f), + "Listening on %s:%d", settings.bindAddress.c_str(), activePort); + } + + ImGui::Separator(); + ImGui::Text("Connect from an MCP client (Claude Code, Cursor, etc.):"); + + if (ImGui::Button("Copy MCP client config to clipboard")) { + ImGui::SetClipboardText(BuildClientConfig().c_str()); + } + ImGui::SameLine(); + ImGui::TextDisabled("(?)"); + if (ImGui::IsItemHovered()) { + ImGui::SetTooltip( + "Paste the JSON into your Claude Code settings under " + "\"mcpServers\". Other MCP hosts (Cursor, Continue) accept the " + "same shape."); + } + + if (ImGui::CollapsingHeader("Config preview")) { + const auto preview = BuildClientConfig(); + ImGui::PushTextWrapPos(); + ImGui::TextUnformatted(preview.c_str()); + ImGui::PopTextWrapPos(); + } + + ImGui::Separator(); + DrawClientsTable(); +} + +void RemoteControl::DrawClientsTable() +{ + // Snapshot under the lock to keep the listener-thread updates from + // racing the draw. The snapshot is small (a handful of sessions at most). 
+ std::vector<SessionInfo> rows; + { + std::lock_guard lock(sessionMutex); + rows.reserve(sessions.size()); + for (const auto& [_, info] : sessions) { + rows.push_back(info); + } + } + + const std::string headerLabel = std::format("Connected clients ({})##rc-clients", rows.size()); + if (!ImGui::CollapsingHeader(headerLabel.c_str(), ImGuiTreeNodeFlags_DefaultOpen)) { + return; + } + + if (!IsRunning()) { + ImGui::TextDisabled("Server not running."); + return; + } + if (rows.empty()) { + ImGui::TextDisabled( + "No clients connected. Paste the config above into " + "your MCP host and run a tool to populate this table."); + return; + } + + constexpr ImGuiTableFlags flags = ImGuiTableFlags_Borders | ImGuiTableFlags_Resizable | + ImGuiTableFlags_RowBg | ImGuiTableFlags_Sortable | + ImGuiTableFlags_SortMulti | ImGuiTableFlags_ScrollY; + + enum ColumnId : ImGuiID + { + ColSession = 0, + ColConnected, + ColIdle, + ColRequests, + ColLastTool, + }; + + if (ImGui::BeginTable("##rc-clients-table", 5, flags, ImVec2(0.0f, 120.0f))) { + ImGui::TableSetupColumn("Session", ImGuiTableColumnFlags_DefaultSort, 0.0f, ColSession); + ImGui::TableSetupColumn("Connected", 0, 0.0f, ColConnected); + ImGui::TableSetupColumn("Idle for", 0, 0.0f, ColIdle); + ImGui::TableSetupColumn("Requests", 0, 0.0f, ColRequests); + ImGui::TableSetupColumn("Last tool", 0, 0.0f, ColLastTool); + ImGui::TableSetupScrollFreeze(0, 1); + ImGui::TableHeadersRow(); + + if (auto* sortSpecs = ImGui::TableGetSortSpecs(); sortSpecs && sortSpecs->SpecsCount > 0) { + std::sort(rows.begin(), rows.end(), + [&](const SessionInfo& a, const SessionInfo& b) { + for (int i = 0; i < sortSpecs->SpecsCount; ++i) { + const auto& spec = sortSpecs->Specs[i]; + const bool desc = spec.SortDirection == ImGuiSortDirection_Descending; + int cmp = 0; + switch (static_cast<ColumnId>(spec.ColumnUserID)) { + case ColSession: + cmp = a.id.compare(b.id); + break; + case ColConnected: + cmp = a.connected < b.connected ? -1 : (a.connected > b.connected ? 
1 : 0); + break; + case ColIdle: + cmp = a.lastSeen < b.lastSeen ? -1 : (a.lastSeen > b.lastSeen ? 1 : 0); + break; + case ColRequests: + cmp = a.requestCount < b.requestCount ? -1 : (a.requestCount > b.requestCount ? 1 : 0); + break; + case ColLastTool: + cmp = a.lastTool.compare(b.lastTool); + break; + } + if (cmp != 0) { + return desc ? cmp > 0 : cmp < 0; + } + } + return false; + }); + } + + const auto now = std::chrono::system_clock::now(); + const auto formatRelative = [](std::chrono::seconds sec) -> std::string { + const auto s = sec.count(); + if (s < 60) { + return std::format("{}s ago", s); + } + if (s < 3600) { + return std::format("{}m {}s ago", s / 60, s % 60); + } + return std::format("{}h {}m ago", s / 3600, (s % 3600) / 60); + }; + + for (const auto& info : rows) { + ImGui::TableNextRow(); + const auto connectedSec = std::chrono::duration_cast<std::chrono::seconds>(now - info.connected); + const auto idleSec = std::chrono::duration_cast<std::chrono::seconds>(now - info.lastSeen); + + ImGui::TableSetColumnIndex(0); + ImGui::TextUnformatted(info.id.c_str()); + ImGui::TableSetColumnIndex(1); + ImGui::TextUnformatted(formatRelative(connectedSec).c_str()); + ImGui::TableSetColumnIndex(2); + ImGui::TextUnformatted(formatRelative(idleSec).c_str()); + ImGui::TableSetColumnIndex(3); + ImGui::Text("%llu", static_cast<unsigned long long>(info.requestCount)); + ImGui::TableSetColumnIndex(4); + ImGui::TextUnformatted(info.lastTool.empty() ? "(none)" : info.lastTool.c_str()); + } + + ImGui::EndTable(); + } + + ImGui::TextDisabled( + "To force-disconnect all clients, toggle 'Enable MCP server' off and back on. " + "Per-session kick is not exposed by cpp-mcp's public API."); +} + +std::string RemoteControl::BuildClientConfig() const +{ + // Streamable HTTP transport per the MCP 2025-03-26 spec. Same shape works + // for Claude Code, Cursor, Continue, and other MCP hosts. + // IPv6 literals must be bracketed in a URL authority (RFC 3986 §3.2.2), + // so the IPv6 loopback "::1" becomes "[::1]". 
IPv4 / hostnames pass + // through verbatim. + const std::string hostInUrl = (settings.bindAddress.find(':') != std::string::npos) ? "[" + settings.bindAddress + "]" : settings.bindAddress; + const json cfg = { + { "mcpServers", + { { "community-shaders", + { + { "type", "http" }, + { "url", std::format("http://{}:{}/mcp", + hostInUrl, settings.port) }, + } } } } + }; + return cfg.dump(4); +} + +void RemoteControl::StartServer() +{ + if (server) { + return; + } + lastError.clear(); + + try { + // Re-validate at bind time — settings may have been touched via the UI + // or hot-reload since LoadSettings ran. + NormalizeBindAddress(settings.bindAddress); + settings.port = ClampPort(settings.port); + + mcp::server::configuration cfg; + cfg.host = settings.bindAddress; + cfg.port = settings.port; + cfg.name = "Community Shaders"; + cfg.version = "0.1.0"; + + server = std::make_unique<mcp::server>(cfg); + server->set_server_info(cfg.name, cfg.version); + server->set_capabilities({ { "tools", mcp::json::object() } }); + server->set_instructions( + "This server exposes the Skyrim Community Shaders plugin. " + "Use the tools to inspect engine state for performance " + "investigation and A/B testing of graphics features."); + + RegisterTools(); + + // Drop a session from the bookkeeping map on disconnect. cpp-mcp + // dispatches this from its listener thread when the SSE/HTTP + // connection tears down. 
+ server->register_session_cleanup("remote-control-session-tracker", + [this](const std::string& sessionId) { + DropSession(sessionId); + }); + + if (!server->start(false)) { // false = non-blocking + throw std::runtime_error("server.start() returned false"); + } + activePort = settings.port; + logger::info("Remote Control: MCP server listening on {}:{}", + settings.bindAddress, activePort); + } catch (const std::exception& e) { + lastError = e.what(); + logger::error("Remote Control: failed to start MCP server: {}", + e.what()); + server.reset(); + activePort = 0; + } +} + +void RemoteControl::StopServer() +{ + if (!server) { + return; + } + try { + server->stop(); + } catch (...) { + // best-effort on shutdown + } + server.reset(); + activePort = 0; + { + std::lock_guard lock(sessionMutex); + sessions.clear(); + } + logger::info("Remote Control: MCP server stopped"); +} + +void RemoteControl::RecordToolCall(const std::string& sessionId, const std::string& toolName) +{ + const auto now = std::chrono::system_clock::now(); + std::lock_guard lock(sessionMutex); + auto& info = sessions[sessionId]; + if (info.requestCount == 0) { + info.id = sessionId; + info.connected = now; + } + info.lastSeen = now; + info.requestCount += 1; + info.lastTool = toolName; +} + +void RemoteControl::DropSession(const std::string& sessionId) +{ + std::lock_guard lock(sessionMutex); + sessions.erase(sessionId); +} + +// Helper: wrap a payload string in the MCP tool-result content envelope +// (an array of typed content items). Tools return application data as the +// "text" field of a single content item; consumers typically parse it as +// JSON. +static mcp::json TextResult(std::string text) +{ + return mcp::json::array({ mcp::json{ + { "type", "text" }, + { "text", std::move(text) } } }); +} + +// Helper: emit an error result. 
Convention: a single text content item +// containing a JSON object with "error" + optional context fields, so +// callers always get parseable JSON whether the call succeeded or not. +static mcp::json ErrorResult(std::string_view message, mcp::json context = {}) +{ + mcp::json obj = { { "error", message } }; + if (!context.is_null()) { + obj.update(context); + } + return mcp::json::array({ mcp::json{ + { "type", "text" }, + { "text", obj.dump() } } }); +} + +void RemoteControl::RegisterTools() +{ + // Five tools, each semantically rich. Reads vs writes vs lifecycles are + // separated by tool; within each tool, kind/action discriminates the + // specific operation. See agentic-renderdoc's "Why this design" notes — + // fewer rich tools outperform expansive suites because the agent reads + // fewer descriptions and each description carries the operational + // expertise (timing, gotchas, verification routes). + RegisterInspectTool(); // reads (non-feature engine state) + RegisterFeatureTool(); // all feature ops (list/get/set/reset/toggle) + RegisterConsoleTool(); // Skyrim console passthrough + RegisterCaptureTool(); // frame capture (renderdoc/screenshot) + RegisterAbtestTool(); // A/B test lifecycle +} + +// Helper used by both inspect(kind="state") and (potentially) future tools. +static mcp::json EngineStateBlob() +{ + const uint frames = globals::state ? globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + const bool vr = REL::Module::IsVR(); + return mcp::json({ + { "plugin", "CommunityShaders" }, + { "frame_count", frames }, + { "vr", vr }, + }); +} + +// Helper used by feature(action="list") to build one entry per feature. 
+static mcp::json FeatureEntry(Feature* f) +{ + mcp::json entry({ + { "name", f->GetName() }, + { "shortName", f->GetShortName() }, + { "loaded", f->loaded }, + { "version", f->version }, + { "category", std::string(f->GetCategory()) }, + { "isCore", f->IsCore() }, + { "supportsVR", f->SupportsVR() }, + { "inMenu", f->IsInMenu() }, + }); + + // Inline restart-gated metadata so `list` is the single tool that answers + // "what features exist", "which fields need a restart to apply", and + // "is anything currently pending". Each entry's `pending` is true when + // the live setting differs from the boot-latched value. + const auto fields = f->GetRestartRequiredFields(); + if (!fields.empty()) { + mcp::json restartFields = mcp::json::array(); + const auto* liveBase = reinterpret_cast(f->GetSettingsBlob()); + const size_t liveSize = f->GetSettingsBlobSize(); + for (const auto& field : fields) { + bool pending = false; + if (liveBase && field.jsonKey && field.size != 0 && + field.offset + field.size <= liveSize) { + const void* boot = f->GetBootValue(field.jsonKey); + if (boot && + std::memcmp(boot, liveBase + field.offset, field.size) != 0) { + pending = true; + } + } + restartFields.push_back(mcp::json({ + { "key", field.jsonKey ? field.jsonKey : "" }, + { "label", field.label ? field.label : "" }, + { "pending", pending }, + })); + } + entry["restartFields"] = restartFields; + } + + return entry; +} + +void RemoteControl::RegisterInspectTool() +{ + // Single read endpoint for non-feature engine state. Kind-discriminated + // so future engine reads (weather, cell, player, render targets) extend + // the same tool rather than spawning new top-level reads. For feature + // reads (list, get settings), use the `feature` tool with the + // corresponding action. + const auto tool = mcp::tool_builder("inspect") + .with_description( + "Read non-feature engine state. 
Kind-dispatched; the " + "response is always a JSON object delivered as the " + "text of a single content item.\n\n" + "Kinds:\n" + " state — { plugin, frame_count, vr }. Frame counter " + "monotonically increases each render tick; use as a " + "ground truth for verifying that deferred operations " + "(see `console`) have had time to run.\n\n" + "For feature reads (enumerate / settings), use the " + "`feature` tool with action='list' or 'get'.") + .with_string_param("kind", + "Currently 'state'. New kinds will be added here " + "rather than as new tools.") + .build(); + server->register_tool(tool, + [this](const mcp::json& params, const std::string& session_id) -> mcp::json { + RecordToolCall(session_id, "inspect"); + const std::string kind = params.value("kind", std::string{}); + if (kind.empty()) { + return ErrorResult("missing required parameter 'kind'"); + } + if (kind == "state") { + return TextResult(EngineStateBlob().dump()); + } + return ErrorResult("unknown kind", + { { "kind", kind }, + { "supported", mcp::json::array({ "state" }) } }); + }); +} + +void RemoteControl::RegisterFeatureTool() +{ + // One tool for all graphics-feature operations. Action-dispatched so the + // agent has a single description that documents the full feature + // vocabulary plus the gotchas across all five operations (silent no-op + // for missing overrides, listener-thread caveats, etc). + const auto tool = mcp::tool_builder("feature") + .with_description( + "All graphics-feature operations — enumerate, " + "inspect settings, mutate settings, restore defaults, " + "toggle on/off. Action-dispatched; each action takes " + "the parameters listed below.\n\n" + "Actions:\n" + " list — no other params. Returns a JSON array; " + "each entry has { name, shortName, loaded, version, " + "category, isCore, supportsVR, inMenu }. 
Features " + "with restart-gated settings also include " + "`restartFields: [{ key, label, pending }]` — " + "`pending=true` means the user has staged a change " + "that won't take effect until the next launch.\n" + " get — params: shortName. Returns the " + "Feature::SaveSettings(json) blob. May return null " + "if the feature has no SaveSettings/LoadSettings " + "override (e.g. LightLimitFix); set/reset will " + "silently no-op for these.\n" + " set — params: shortName, settings (object). " + "Calls Feature::LoadSettings on the listener thread. " + "Safe for value-assigning LoadSettings (the common " + "case) and for features that flip a recompileFlag " + "(ScreenSpaceGI, DynamicCubemaps) — the render loop " + "picks them up on the next frame. Settings that " + "synchronously rebuild GPU resources would race; " + "none in-tree currently do.\n" + " reset — params: shortName. Calls " + "Feature::RestoreDefaultSettings(). Distinct from " + "set({}) because RestoreDefaultSettings is " + "feature-specific reset logic (may release/recreate " + "state).\n" + " toggle — params: shortName, enabled (boolean). " + "Flips Feature::loaded. Disabled features are " + "skipped by ForEachLoadedFeature so their per-frame " + "rendering work doesn't run. GPU resources allocated " + "in SetupResources are NOT freed — A/B perf/quality, " + "not memory reclaim.\n\n" + "A/B testing pattern:\n" + " 1. feature(action='get', shortName='Skylighting') → snapshot\n" + " 2. feature(action='reset', shortName='Skylighting') → defaults\n" + " 3. capture + tracy capture → measure\n" + " 4. feature(action='set', shortName='Skylighting', settings=) → restore\n\n" + "Gotchas:\n" + " • Some features have no SaveSettings/LoadSettings " + "override. `get` returns null; `set` and `reset` " + "claim success but don't change anything. Confirmed " + "case: LightLimitFix.\n" + " • toggle keeps GPU resources alive. 
If a feature " + "still affects rendering after `enabled=false`, it " + "has a hook that isn't gated on `loaded` — file an " + "issue with the shortName.") + .with_string_param("action", + "One of: 'list', 'get', 'set', 'reset', 'toggle'.") + .with_string_param("shortName", + "Required for all actions except 'list'. From the " + "list response.", + /*required=*/false) + .with_object_param("settings", + "Required for action='set'. Shape that matches what " + "action='get' returned for the same feature.", + mcp::json::object(), + /*required=*/false) + .with_boolean_param("enabled", + "Required for action='toggle'.", + /*required=*/false) + .build(); + server->register_tool(tool, + [this](const mcp::json& params, const std::string& session_id) -> mcp::json { + RecordToolCall(session_id, "feature"); + const std::string action = params.value("action", std::string{}); + if (action.empty()) { + return ErrorResult("missing required parameter 'action'"); + } + + if (action == "list") { + mcp::json features = mcp::json::array(); + for (auto* f : Feature::GetFeatureList()) { + features.push_back(FeatureEntry(f)); + } + return TextResult(features.dump()); + } + + const std::string shortName = params.value("shortName", std::string{}); + + if (shortName.empty()) { + return ErrorResult("missing required parameter 'shortName'", + { { "action", action } }); + } + + if (action == "toggle") { + if (!params.contains("enabled") || !params["enabled"].is_boolean()) { + return ErrorResult("missing required boolean parameter 'enabled'"); + } + const bool desired = params["enabled"].get<bool>(); + // FindFeatureByShortName filters on loaded==true so it can't + // help re-enable; walk the full list ourselves. 
+ Feature* target = nullptr; + for (auto* f : Feature::GetFeatureList()) { + if (f->GetShortName() == shortName) { + target = f; + break; + } + } + if (!target) { + return ErrorResult("feature not found", + { { "shortName", shortName } }); + } + // Marshal the write onto the main/render thread. Feature::loaded + // is read every frame by Feature::ForEachLoadedFeature without + // synchronization, so writing it directly from the MCP listener + // thread is a data race. AddTask runs the closure on the next + // tick. + auto* task = SKSE::GetTaskInterface(); + if (!task) { + return ErrorResult("SKSE TaskInterface unavailable"); + } + const bool previous = target->loaded; + const uint enqueuedFrame = globals::state ? globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + task->AddTask([target, desired, shortName]() { + target->loaded = desired; + logger::info("Remote Control: feature(toggle, {}, {}) applied", + shortName, desired); + }); + return TextResult(mcp::json({ + { "action", "toggle" }, + { "shortName", shortName }, + { "previous", previous }, + { "requested", desired }, + { "queued", true }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + } + + auto* feature = Feature::FindFeatureByShortName(shortName); + if (!feature) { + return ErrorResult("feature not found or not loaded", + { { "shortName", shortName } }); + } + + if (action == "get") { + // SaveSettings uses nlohmann::json (unordered). Keep the + // intermediate value as plain json and dump as a string so + // we don't have to round-trip through mcp::json's ordered map. 
+ ::json blob; + feature->SaveSettings(blob); + return TextResult(blob.dump()); + } + if (action == "set") { + if (!params.contains("settings") || !params["settings"].is_object()) { + return ErrorResult("missing required object parameter 'settings'"); + } + ::json blob; + try { + blob = ::json::parse(params["settings"].dump()); + } catch (const std::exception& e) { + return ErrorResult("settings is not valid JSON", + { { "detail", e.what() } }); + } + // Marshal LoadSettings onto the main thread. Many features + // mutate UI/render-thread-visible state inside LoadSettings + // (palettes, cached textures, settings JSON read elsewhere), + // so calling it from the MCP listener thread is racy. + auto* task = SKSE::GetTaskInterface(); + if (!task) { + return ErrorResult("SKSE TaskInterface unavailable"); + } + const uint enqueuedFrame = globals::state ? globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + task->AddTask([feature, blob, shortName]() mutable { + try { + feature->LoadSettings(blob); + logger::info("Remote Control: feature(set, {}) applied", shortName); + } catch (const std::exception& e) { + logger::error("Remote Control: feature(set, {}) LoadSettings threw: {}", + shortName, e.what()); + } + }); + return TextResult(mcp::json({ + { "action", "set" }, + { "shortName", shortName }, + { "queued", true }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + } + if (action == "reset") { + // Same marshaling rationale as feature(set): RestoreDefaultSettings + // touches state that the render/UI threads read concurrently. + auto* task = SKSE::GetTaskInterface(); + if (!task) { + return ErrorResult("SKSE TaskInterface unavailable"); + } + const uint enqueuedFrame = globals::state ? 
globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + task->AddTask([feature, shortName]() { + try { + feature->RestoreDefaultSettings(); + logger::info("Remote Control: feature(reset, {}) applied", shortName); + } catch (const std::exception& e) { + logger::error("Remote Control: feature(reset, {}) RestoreDefaultSettings threw: {}", + shortName, e.what()); + } + }); + return TextResult(mcp::json({ + { "action", "reset" }, + { "shortName", shortName }, + { "queued", true }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + } + + return ErrorResult("unknown action", + { { "action", action }, + { "supported", mcp::json::array({ "list", "get", "set", "reset", "toggle" }) } }); + }); +} + +void RemoteControl::RegisterAbtestTool() +{ + // Single tool for the entire A/B testing lifecycle. Action-dispatched + // rather than spawning start_abtest / stop_abtest / get_abtest_results + // / clear_abtest_snapshots / set_abtest_interval — fewer richer tools. + const auto tool = mcp::tool_builder("abtest") + .with_description( + "Drive the built-in A/B testing harness " + "(features/Performance Overlay/ABTesting). The " + "harness rotates between a USER configuration " + "(your current settings) and a TEST configuration " + "(typically a preset under test) on a fixed " + "interval, snapshots both in memory to avoid disk " + "I/O during swaps, and aggregates per-variant " + "frame timing so you can compare quality and perf.\n\n" + "Actions:\n" + " status — return enabled, usingTestConfig, " + "interval, hasCachedSnapshots.\n" + " start — Enable() the manager (begin rotating). " + "Optional `interval` parameter (seconds) is applied " + "first if provided.\n" + " stop — Disable() the manager. Snapshots are " + "retained.\n" + " clear — ClearCachedSnapshots(). 
Use to reset " + "before a fresh comparison.\n" + " diff — return the per-key diff list " + "(GetConfigDiffEntries) so callers know which " + "settings the rotation is actually toggling.\n\n" + "Setup of the TEST config itself lives in the " + "Performance Overlay UI — this tool only drives " + "the lifecycle, not the test-config authoring.") + .with_string_param("action", + "'status', 'start', 'stop', 'clear', or 'diff'.") + .with_number_param("interval", + "Seconds per variant when action='start'. " + "Default 0 (no change).", + /*required=*/false) + .build(); + server->register_tool(tool, + [this](const mcp::json& params, const std::string& session_id) -> mcp::json { + RecordToolCall(session_id, "abtest"); + const std::string action = params.value("action", std::string{}); + if (action.empty()) { + return ErrorResult("missing required parameter 'action'"); + } + auto* mgr = ABTestingManager::GetSingleton(); + if (!mgr) { + return ErrorResult("ABTestingManager singleton unavailable"); + } + + const auto statusBlob = [&]() { + return mcp::json({ + { "enabled", mgr->IsEnabled() }, + { "usingTestConfig", mgr->IsUsingTestConfig() }, + { "interval", mgr->GetTestInterval() }, + { "hasCachedSnapshots", mgr->HasCachedSnapshots() }, + }); + }; + + if (action == "status") { + // Read-only — safe from the listener thread; the only state we + // touch is the manager's atomic-ish status getters. + return TextResult(statusBlob().dump()); + } + + // Lifecycle actions (start/stop/clear) marshal onto the main thread: + // Enable/Disable swap configs via State::Load → JSON, and Menu::Load + // touches settings the menu/render thread also reads. Doing that + // from the listener thread is a race against the next frame's UI. + auto* task = SKSE::GetTaskInterface(); + if (!task) { + return ErrorResult("SKSE TaskInterface unavailable"); + } + const uint enqueuedFrame = globals::state ? 
globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + const auto queuedResult = [&](const std::string& act) { + auto blob = statusBlob(); + blob["action"] = act; + blob["queued"] = true; + blob["enqueued_at_frame"] = enqueuedFrame; + return TextResult(blob.dump()); + }; + + if (action == "start") { + std::optional interval; + if (params.contains("interval") && params["interval"].is_number()) { + const auto secs = params["interval"].get(); + if (secs > 0) { + interval = static_cast(secs); + } + } + task->AddTask([mgr, interval]() { + if (interval) { + mgr->SetTestInterval(*interval); + } + mgr->Enable(); + logger::info("Remote Control: abtest(start) applied"); + }); + return queuedResult("start"); + } + if (action == "stop") { + task->AddTask([mgr]() { + mgr->Disable(); + logger::info("Remote Control: abtest(stop) applied"); + }); + return queuedResult("stop"); + } + if (action == "clear") { + task->AddTask([mgr]() { + mgr->ClearCachedSnapshots(); + logger::info("Remote Control: abtest(clear) applied"); + }); + return queuedResult("clear"); + } + if (action == "diff") { + mcp::json entries = mcp::json::array(); + for (const auto& entry : mgr->GetConfigDiffEntries()) { + // SettingsDiffEntry uses generic a/b labels (see + // Utils/FileSystem.h). For A/B testing semantics here, + // `a` is USER and `b` is TEST. + entries.push_back({ + { "path", entry.path }, + { "userValue", entry.aValue }, + { "testValue", entry.bValue }, + }); + } + return TextResult(mcp::json({ + { "hasCachedSnapshots", mgr->HasCachedSnapshots() }, + { "entries", std::move(entries) }, + }) + .dump()); + } + return ErrorResult("unknown action", + { { "action", action }, + { "supported", mcp::json::array({ "status", "start", "stop", "clear", "diff" }) } }); + }); +} + +void RemoteControl::RegisterCaptureTool() +{ + // One tool for all frame-capture kinds, kind-dispatched. Adding new + // capture types later (e.g. 
tracy snapshot, video clip) extends this + // tool's `kind` enum rather than spawning new top-level tools. + const auto tool = mcp::tool_builder("capture") + .with_description( + "Trigger a frame capture on the next render. Kind-" + "dispatched so all capture flavors live behind one " + "tool — see the agentic-renderdoc design notes.\n\n" + "Supported kinds:\n" + " renderdoc — RenderDoc multi-frame capture via " + "the in-application API. Honors the `frames` " + "parameter (default 1, max 120). RenderDoc must " + "be attached or the in-app DLL loaded; check " + "feature(action='list') for RenderDoc loaded=true. Output " + "lands in RenderDoc's configured captures dir.\n" + " screenshot — Lossless screenshot via the " + "Screenshot feature's non-blocking capture path. " + "The `frames` parameter is ignored. Output lands " + "in the game's Screenshots/ folder.\n\n" + "Fire-and-forget: the trigger flag is set " + "immediately and the render loop consumes it on " + "the next frame. No artifact path is returned " + "synchronously — for renderdoc, inspect the " + "captures directory; for screenshots, watch the " + "Screenshots folder.") + .with_string_param("kind", + "'renderdoc' or 'screenshot'.") + .with_number_param("frames", + "RenderDoc only: number of consecutive frames to " + "capture (1-120). Default 1. Ignored for " + "screenshot.", + /*required=*/false) + .build(); + server->register_tool(tool, + [this](const mcp::json& params, const std::string& session_id) -> mcp::json { + RecordToolCall(session_id, "capture"); + const std::string kind = params.value("kind", std::string{}); + if (kind.empty()) { + return ErrorResult("missing required parameter 'kind'"); + } + const uint enqueuedFrame = globals::state ? 
globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + + if (kind == "renderdoc") { + auto* renderDoc = &globals::features::renderDoc; + if (!renderDoc->loaded) { + return ErrorResult("RenderDoc feature is not loaded", + { { "hint", "feature(action='list') shows RenderDoc.loaded" } }); + } + if (!renderDoc->IsAvailable()) { + return ErrorResult( + "RenderDoc API not available — attach RenderDoc or " + "load the in-app DLL"); + } + uint32_t frameCount = 1; + if (params.contains("frames") && params["frames"].is_number()) { + const auto raw = params["frames"].get(); + frameCount = static_cast(std::clamp(raw, 1, 120)); + } + if (frameCount == 1) { + renderDoc->TriggerCapture(); + } else { + renderDoc->TriggerMultiFrameCapture(frameCount); + } + logger::info("Remote Control: capture(renderdoc, {}) at frame {}", + frameCount, enqueuedFrame); + return TextResult(mcp::json({ + { "queued", true }, + { "kind", "renderdoc" }, + { "frames", frameCount }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + } + + if (kind == "screenshot") { + auto* shot = &globals::features::screenshotFeature; + if (!shot->loaded) { + return ErrorResult("Screenshot feature is not loaded"); + } + shot->captureRequested.store(true, std::memory_order_release); + logger::info("Remote Control: capture(screenshot) at frame {}", + enqueuedFrame); + return TextResult(mcp::json({ + { "queued", true }, + { "kind", "screenshot" }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + } + + return ErrorResult("unknown kind", + { { "kind", kind }, + { "supported", mcp::json::array({ "renderdoc", "screenshot" }) } }); + }); +} + +void RemoteControl::RegisterConsoleTool() +{ + // Singular tool for the entire console concern. Future console-related + // capabilities (history readout, command lookup, etc.) get added as + // optional parameters / additional response fields here rather than as + // separate tools — per the "fewer, semantically rich tools" philosophy. 
+ const auto tool = mcp::tool_builder("console") + .with_description( + "Execute a Skyrim console command. Fire-and-forget: " + "the command is queued onto the main game thread via " + "SKSE's TaskInterface and runs on the next tick. " + "Returns immediately with the frame counter at the " + "moment of enqueue.\n\n" + "RE::Console::ExecuteCommand is `void` — there is " + "no per-command return value. RE::ConsoleLog is a " + "shared sink (engine + every SKSE plugin) with no " + "command-to-output correlation, and many useful " + "commands are silent (tcl, tfc, tg, tm, tlb…), so " + "scraping console output is unreliable and " + "intentionally NOT exposed.\n\n" + "To verify a state change, poll inspect(kind='state') " + "until frame_count > enqueued_at_frame (at least one tick " + "elapsed), then observe via side channels: tracy " + "captures for perf-affecting changes, " + "capture(kind='renderdoc'|'screenshot') for visual " + "confirmation, or future feature-specific get_* " + "tools that read RE:: state directly.\n\n" + "Common A/B-relevant commands:\n" + " tcl — toggle player collision\n" + " tfc [1] — free camera (1 = pause game)\n" + " tg — toggle grass\n" + " tm — toggle menus / HUD\n" + " tll <0..15> — toggle land LOD level\n" + " setweather — force weather (persistent)\n" + " fw — force weather (temporary)\n" + " coc — teleport to cell\n" + " set timescale to N — game-time multiplier\n") + .with_string_param("command", + "The console command, exactly as typed after the ~ key.") + .build(); + server->register_tool(tool, + [this](const mcp::json& params, const std::string& session_id) -> mcp::json { + RecordToolCall(session_id, "console"); + std::string command = params.value("command", std::string{}); + if (command.empty()) { + return ErrorResult("missing required parameter 'command'"); + } + auto* task = SKSE::GetTaskInterface(); + if (!task) { + return ErrorResult("SKSE TaskInterface unavailable"); + } + const uint enqueuedFrame = globals::state ? 
globals::state->frameCountAtomic.load(std::memory_order_relaxed) : 0u; + // Capture by value so the string outlives this lambda's scope. + task->AddTask([command]() { + RE::Console::ExecuteCommand(command.c_str()); + }); + logger::info("Remote Control: console({}) queued at frame {}", + command, enqueuedFrame); + return TextResult(mcp::json({ + { "queued", true }, + { "command", std::move(command) }, + { "enqueued_at_frame", enqueuedFrame }, + }) + .dump()); + }); +} diff --git a/src/Features/RemoteControl.h b/src/Features/RemoteControl.h new file mode 100644 index 0000000000..0c88ad8107 --- /dev/null +++ b/src/Features/RemoteControl.h @@ -0,0 +1,118 @@ +#pragma once + +#include "Feature.h" + +#include +#include +#include +#include +#include +#include +#include + +using json = nlohmann::json; + +// Forward declare cpp-mcp types so we don't leak its vendored +// httplib / json headers into consumers of this header. +namespace mcp +{ + class server; + struct tool; + // cpp-mcp's tool_handler is std::function + // where `json` is an alias for ordered_json — that can't be forward-declared + // cleanly without dragging the full vendored nlohmann/json header into this + // public header. Tool registration therefore stays in the .cpp where the + // real signature is in scope; only opaque pointers are exposed here. +} + +class RemoteControl : public Feature +{ +public: + static RemoteControl* GetSingleton(); + + // Feature overrides — see Feature.h for contracts. 
+ std::string GetName() override { return "Remote Control"; } + std::string GetShortName() override { return "RemoteControl"; } + std::string_view GetCategory() const override { return FeatureCategories::kUtility; } + bool IsCore() const override { return true; } + bool IsInMenu() const override { return true; } + bool SupportsVR() override { return true; } + std::string_view GetShaderDefineName() override { return ""; } + bool HasShaderDefine(RE::BSShader::Type) override { return false; } + + std::pair> GetFeatureSummary() override + { + return { + "Expose Community Shaders to AI assistants over Model Context Protocol (MCP).", + { + "Loopback-only JSON-RPC server, off by default", + "Pair with Claude Code / Cursor / Continue for A/B testing", + "One-click clipboard copy of MCP client config", + } + }; + } + + // Lifecycle + void Load() override; + void Reset() override; + + // Settings persistence + void DrawSettings() override; + void RestoreDefaultSettings() override; + void LoadSettings(json& o_json) override; + void SaveSettings(json& o_json) override; + + struct Settings + { + bool enabled = false; // opt-in + int port = 8910; // arbitrary high port + std::string bindAddress = "127.0.0.1"; // loopback by default + } settings; + + RemoteControl(); + ~RemoteControl(); + + RemoteControl(const RemoteControl&) = delete; + RemoteControl& operator=(const RemoteControl&) = delete; + RemoteControl(RemoteControl&&) = delete; + RemoteControl& operator=(RemoteControl&&) = delete; + + // Session bookkeeping for the ImGui "Connected clients" table. + // Updated on every tool invocation (listener thread) and on session + // cleanup (cpp-mcp callback). Read from the main thread when drawing. 
+ struct SessionInfo + { + std::string id; + std::chrono::system_clock::time_point connected; + std::chrono::system_clock::time_point lastSeen; + uint64_t requestCount = 0; + std::string lastTool; + }; + +private: + void StartServer(); + void StopServer(); + bool IsRunning() const noexcept { return server != nullptr; } + std::string BuildClientConfig() const; + void RegisterTools(); + void RegisterInspectTool(); + void RegisterFeatureTool(); + void RegisterConsoleTool(); + void RegisterCaptureTool(); + void RegisterAbtestTool(); + + // Records a tool invocation against the per-session table. + // Safe to call from the cpp-mcp listener thread. + void RecordToolCall(const std::string& sessionId, const std::string& toolName); + // Drops a session from the table on disconnect. + void DropSession(const std::string& sessionId); + // Draws the connected-clients ImGui table. + void DrawClientsTable(); + + std::unique_ptr<mcp::server> server; + int activePort = 0; + std::string lastError; + + mutable std::mutex sessionMutex; + std::unordered_map<std::string, SessionInfo> sessions; +}; diff --git a/src/Features/RenderDoc.cpp b/src/Features/RenderDoc.cpp index 08661b33b2..9baa3c1568 100644 --- a/src/Features/RenderDoc.cpp +++ b/src/Features/RenderDoc.cpp @@ -31,8 +31,13 @@ RenderDoc* RenderDoc::GetSingleton() void RenderDoc::Load() { + // Latch the boot-time value of restart-gated fields so the menu can + // surface pending diffs even though the renderdoc.dll injection itself + // only runs once per launch. 
+ bootSnapshot.LatchIfNeeded(settings); + // Only load RenderDoc if the user has enabled capture - if (!enableRenderDocCapture) { + if (!settings.enableCapture) { logger::debug("[RenderDoc] RenderDoc capture disabled, skipping initialization"); return; } @@ -132,35 +137,35 @@ void RenderDoc::DrawSettings() bool isSectionVisible = false; // Include enable toggle and annotation forcing logic here - bool prevRenderDocCapture = enableRenderDocCapture; - if (ImGui::Checkbox("Enable RenderDoc Capture", &enableRenderDocCapture)) { - if (enableRenderDocCapture && !prevRenderDocCapture) { + bool prevRenderDocCapture = settings.enableCapture; + if (ImGui::Checkbox("Enable RenderDoc Capture", &settings.enableCapture)) { + if (settings.enableCapture && !prevRenderDocCapture) { globals::state->useFrameAnnotations = globals::state->frameAnnotations; globals::state->frameAnnotations = true; } - if (!enableRenderDocCapture && prevRenderDocCapture) { + if (!settings.enableCapture && prevRenderDocCapture) { globals::state->frameAnnotations = globals::state->useFrameAnnotations; } } + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::enableCapture); if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Enable RenderDoc frame capture for providing debug captures to the Community Shaders team."); + ImGui::Text("Enable RenderDoc frame capture for providing debug captures to the Open Shaders team (or upstream Community Shaders for upstream-relevant issues)."); ImGui::Text("Enabling capture will force-enable frame annotations for easier debugging and will restore the previous setting when disabled."); } // The rest of the UI renders only when capture is active - bool renderDocCaptureEnabled = enableRenderDocCapture; + bool renderDocCaptureEnabled = settings.enableCapture; bool renderDocActive = IsAvailable(); const auto& themeSettings = Menu::GetSingleton()->GetTheme(); if (renderDocCaptureEnabled && !renderDocActive) { - 
ImGui::TextColored(themeSettings.StatusPalette.RestartNeeded, "Requires restart to enable RenderDoc capture."); return; } if (!renderDocCaptureEnabled && renderDocActive) { - ImGui::TextColored(themeSettings.StatusPalette.Warning, "Requires restart to disable RenderDoc capture, performance will be severely impacted."); + ImGui::TextColored(themeSettings.StatusPalette.Warning, "Performance will be severely impacted until the game is restarted."); return; } @@ -539,36 +544,34 @@ void RenderDoc::SetupResources() void RenderDoc::SaveSettings(json& o_json) { - o_json["Enable RenderDoc Capture"] = enableRenderDocCapture; + o_json["Enable RenderDoc Capture"] = settings.enableCapture; o_json["Capture Frame Count"] = GetCaptureFrameCount(); } void RenderDoc::LoadSettings(json& o_json) { if (o_json.contains("Enable RenderDoc Capture") && o_json["Enable RenderDoc Capture"].is_boolean()) { - enableRenderDocCapture = o_json["Enable RenderDoc Capture"]; - } - if (!o_json.contains("Capture Frame Count")) { - return; - } - - const auto& frameCountJson = o_json["Capture Frame Count"]; - if (frameCountJson.is_number_unsigned()) { - const auto frameCount = std::min(frameCountJson.get(), static_cast(kMaxCaptureFrameCount)); - SetCaptureFrameCount(static_cast(frameCount)); - } else if (frameCountJson.is_number_integer()) { - const auto frameCount = std::clamp( - frameCountJson.get(), - static_cast(kMinCaptureFrameCount), - static_cast(kMaxCaptureFrameCount)); - SetCaptureFrameCount(static_cast(frameCount)); + settings.enableCapture = o_json["Enable RenderDoc Capture"]; + } + if (o_json.contains("Capture Frame Count")) { + const auto& frameCountJson = o_json["Capture Frame Count"]; + if (frameCountJson.is_number_unsigned()) { + const auto frameCount = std::min(frameCountJson.get(), static_cast(kMaxCaptureFrameCount)); + SetCaptureFrameCount(static_cast(frameCount)); + } else if (frameCountJson.is_number_integer()) { + const auto frameCount = std::clamp( + frameCountJson.get(), + 
static_cast(kMinCaptureFrameCount), + static_cast(kMaxCaptureFrameCount)); + SetCaptureFrameCount(static_cast(frameCount)); + } } + bootSnapshot.LatchIfNeeded(settings); } void RenderDoc::RestoreDefaultSettings() { - enableRenderDocCapture = false; - SetCaptureFrameCount(1); + settings = {}; } void RenderDoc::ClearShaderCache() @@ -726,12 +729,12 @@ bool RenderDoc::HandleCaptureHotkey(uint32_t a_vkKey) uint32_t RenderDoc::GetCaptureFrameCount() const { - return std::clamp(captureFrameCount, kMinCaptureFrameCount, kMaxCaptureFrameCount); + return std::clamp(settings.captureFrameCount, kMinCaptureFrameCount, kMaxCaptureFrameCount); } void RenderDoc::SetCaptureFrameCount(uint32_t a_frameCount) { - captureFrameCount = std::clamp(a_frameCount, kMinCaptureFrameCount, kMaxCaptureFrameCount); + settings.captureFrameCount = std::clamp(a_frameCount, kMinCaptureFrameCount, kMaxCaptureFrameCount); } uint64_t RenderDoc::GetRequiredCaptureSpaceBytes() const @@ -780,7 +783,7 @@ bool RenderDoc::IsCapturing() const return false; // RenderDoc API doesn't have a direct IsCapturing method, but we can check if captures are enabled - return enableRenderDocCapture && renderDocApi != nullptr; + return settings.enableCapture && renderDocApi != nullptr; } std::string RenderDoc::GetCapturePath(uint32_t a_index) @@ -873,7 +876,7 @@ std::string RenderDoc::BuildAutomaticCaptureComments(const std::string& userComm // Plugin version auto pluginVersion = Util::GetFormattedVersion(Plugin::VERSION); - comments += std::format("Community Shaders {}\n", pluginVersion); + comments += std::format("Open Shaders {}\n", pluginVersion); // Enabled features const auto& features = Feature::GetFeatureList(); diff --git a/src/Features/RenderDoc.h b/src/Features/RenderDoc.h index ed3c133392..e5f8da05fb 100644 --- a/src/Features/RenderDoc.h +++ b/src/Features/RenderDoc.h @@ -1,6 +1,7 @@ #pragma once #include "Feature.h" +#include "Utils/BootSnapshot.h" #include #include #include @@ -118,9 +119,28 @@ class 
RenderDoc : public Feature std::string pendingCaptureComments; mutable std::mutex pendingCommentsMutex; - // RenderDoc capture enable setting - bool enableRenderDocCapture = false; - uint32_t captureFrameCount = 1; + struct Settings + { + bool enableCapture = false; + uint32_t captureFrameCount = 1; + }; + Settings settings; + + // `enableCapture` is restart-gated: the renderdoc.dll only gets injected + // at Load(); toggling the checkbox mid-session stages the change for next + // launch but doesn't install/uninstall the API. + inline static constexpr Util::Settings::RestartTable kRestartFields{ { + UTIL_RESTART_FIELD(Settings, enableCapture, "RenderDoc Capture"), + } }; + Util::Settings::BootSnapshot bootSnapshot{ kRestartFields }; + + std::span GetRestartRequiredFields() const override + { + return { kRestartFields.data(), kRestartFields.size() }; + } + const void* GetBootValue(std::string_view jsonKey) const override { return bootSnapshot.RawBoot(jsonKey); } + const void* GetSettingsBlob() const override { return &settings; } + size_t GetSettingsBlobSize() const override { return sizeof(settings); } // Track the last capture count we've processed for automatic comments uint32_t lastCaptureCount = 0; diff --git a/src/Features/ScreenshotFeature.cpp b/src/Features/ScreenshotFeature.cpp index 8f6d445293..68ea6502c3 100644 --- a/src/Features/ScreenshotFeature.cpp +++ b/src/Features/ScreenshotFeature.cpp @@ -8,6 +8,7 @@ #include "Globals.h" #include "Menu.h" #include "Utils/FileSystem.h" +#include "Utils/Subrect.h" #include #include #include @@ -273,35 +274,6 @@ namespace combo[0].GetKey() == VK_SNAPSHOT; } - // Blend state used around the preview's ImGui::Image draw. Two regression - // risks if this is changed: - // 1. BlendEnable must stay FALSE - the source texture carries non-1 alpha - // where Skyrim composited UI plates; default SRC_ALPHA blend lets the - // host window background show through (visible on the desktop mirror). - // 2. 
WriteMask must exclude alpha (RGB only). In VR, Skyrim's menu UI - // shader recomposites our menu plate over the SBS framebuffer with - // alpha blending; writing texture alpha into the menu plate RT - // produces a cutout visible only through the HMD. RGB-only writes - // leave the plate's pre-cleared alpha=1 in place. - // Paired with ImDrawCallback_ResetRenderState queued by Subrect::DrawEditor - // immediately after the image draw. - void OpaquePreviewBlendCallback(const ImDrawList*, const ImDrawCmd*) - { - static winrt::com_ptr opaqueBlend; - if (!opaqueBlend) { - D3D11_BLEND_DESC desc{}; - desc.RenderTarget[0].BlendEnable = FALSE; - desc.RenderTarget[0].RenderTargetWriteMask = - D3D11_COLOR_WRITE_ENABLE_RED | - D3D11_COLOR_WRITE_ENABLE_GREEN | - D3D11_COLOR_WRITE_ENABLE_BLUE; - globals::d3d::device->CreateBlendState(&desc, opaqueBlend.put()); - } - if (opaqueBlend) { - globals::d3d::context->OMSetBlendState(opaqueBlend.get(), nullptr, 0xFFFFFFFF); - } - } - std::filesystem::path BuildScreenshotPath(const std::string& screenshotPath) { SYSTEMTIME st; @@ -429,7 +401,7 @@ void ScreenshotFeature::DrawSettings() } } - subrect.DrawEditor(previewView, src.texture, 1.0f, 0.0f, OpaquePreviewBlendCallback); + subrect.DrawEditor(previewView, src.texture, 1.0f, 0.0f, Util::Subrect::OpaquePreviewBlendCallback); } void ScreenshotFeature::EnsurePreviewCache(ID3D11Texture2D* sourceTexture) diff --git a/src/Features/Upscaling.cpp b/src/Features/Upscaling.cpp index b3a0190b6c..c793eabc1b 100644 --- a/src/Features/Upscaling.cpp +++ b/src/Features/Upscaling.cpp @@ -6,6 +6,12 @@ #include "State.h" #include "Upscaling/DX12SwapChain.h" #include "Upscaling/FidelityFX.h" +#include "Upscaling/FoveatedRender.h" +#include "Upscaling/FoveatedRender/Bridge.h" +#include "Upscaling/FoveatedRender/Core.h" +#include "Upscaling/FoveatedRender/Postprocess.h" +#include "Upscaling/FoveatedRender/Preprocess.h" +#include "Upscaling/PerfMode.h" #include "Upscaling/Streamline.h" #include 
"Utils/UI.h" #include @@ -32,7 +38,8 @@ NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE_WITH_DEFAULT( reflexLowLatencyBoost, reflexUseMarkersToOptimize, reflexUseFPSLimit, - reflexFPSLimit); + reflexFPSLimit, + renderAtUpscaleRes); decltype(&D3D11CreateDeviceAndSwapChain) ptrD3D11CreateDeviceAndSwapChainUpscaling; @@ -212,6 +219,29 @@ void Upscaling::DrawSettings() // Check the current upscale method auto upscaleMethod = GetUpscaleMethod(); + // PerfMode: BSOpenVR size hook + RT::Create run once at world load, so + // runtime reads of method/qualityMode route through the boot snapshot. + // The always-present explanation is plain text — only the staged-change + // diff uses the RestartNeeded color so users learn the cue means "you + // changed something that won't apply yet." + if (perfMode.IsHookActive()) { + ImGui::TextWrapped( + "Render-at-upscaled-resolution is active: Method and Upscale Preset changes only take effect after a game restart. " + "Sharpness / model preset / Reflex remain live."); + + // Method pending-diff. Only fires when the user is editing the DLSS- + // path mode slot (upscaleMethod, not upscaleMethodNoDLSS), since + // that's the one the boot snapshot locked. 
+ if (currentUpscaleMode == &settings.upscaleMethod && + bootSnapshot.HasPendingChange(settings, &Settings::upscaleMethod)) { + const uint live = std::clamp(settings.upscaleMethod, 0u, availableModes); + const uint boot = std::clamp(bootSnapshot.Boot(&Settings::upscaleMethod), 0u, availableModes); + Util::Text::RestartNeeded( + "Pending restart: currently active method = %s (selected = %s).", + upscaleModes[boot].c_str(), upscaleModes[live].c_str()); + } + } + // Display warning for DLSS resolution limits (non-VR only; VR handles this automatically) if (!globals::game::isVR && upscaleMethod == UpscaleMethod::kDLSS) { auto screenSize = globals::state->screenSize; @@ -244,10 +274,25 @@ void Upscaling::DrawSettings() } if (baseLabel) { - // Format the label with preset name and resolution scale - std::string labelWithScale = std::format("{} ( {:.2f}x )", baseLabel, (resolutionScale.x + resolutionScale.y) * 0.5f); + // Derive scale from live `settings.qualityMode` — `resolution- + // Scale` is locked to the PerfMode boot snapshot, so reusing it + // here would mismatch the slider position the user sees. + const float displayScale = 1.0f / GetQualityModeRatio(settings.qualityMode); + std::string labelWithScale = std::format("{} ( {:.2f}x )", baseLabel, displayScale); ImGui::SliderInt("Upscale Preset", (int*)&settings.qualityMode, 0, 4, labelWithScale.c_str()); + + // Pending-diff vs the boot snapshot the runtime upscaler is + // actually using. Without this the slider change looks like a + // no-op. + if (perfMode.IsHookActive() && + bootSnapshot.HasPendingChange(settings, &Settings::qualityMode)) { + const uint bm = std::clamp(bootSnapshot.Boot(&Settings::qualityMode), 0u, 4u); + const char* bootLabel = (upscaleMethod == UpscaleMethod::kDLSS) ? upscalePresetsDLSS[std::clamp(4 - (int)bm, 0, 4)] : upscalePresets[std::clamp(4 - (int)bm, 0, 4)]; + Util::Text::RestartNeeded( + "Pending restart: currently active = %s ( %.2fx ). 
Change applies after game restart.", + bootLabel, 1.0f / GetQualityModeRatio(bm)); + } } if (upscaleMethod == UpscaleMethod::kFSR) { @@ -261,9 +306,45 @@ void Upscaling::DrawSettings() ImGui::Text("Choose which DLSS AI model preset to use."); ImGui::Text("Each model offers different visual quality, performance, and motion stability."); ImGui::Text("Set to 'Default' for automatic selection based on your Upscale Preset and hardware."); - ImGui::Text("Changing this setting requires a restart to take effect."); } } + + // VR PerfMode: opt-in performance feature. Lives in the main + // upscaler section (not Backend Diagnostics) so users discover it + // alongside the rest of the upscaler controls. Restart-gated — + // the BSOpenVR size hook reads this at world load and sizes every + // engine RT off the boot value. + // + // The setting persists across method switches (we don't auto-flip + // it when the user picks TAA/NONE), but the checkbox itself is + // disabled outside upscalers that can target a separate displayRes + // output (DLSS, FSR). Keep visible-but-greyed so users see the + // option exists and understand why it isn't live. + if (globals::game::isVR) { + const bool methodSupportsPerf = + upscaleMethod == UpscaleMethod::kDLSS || + upscaleMethod == UpscaleMethod::kFSR; + if (!methodSupportsPerf) + ImGui::BeginDisabled(); + ImGui::Checkbox("Render engine at upscaled resolution", &settings.renderAtUpscaleRes); + if (!methodSupportsPerf) + ImGui::EndDisabled(); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "When enabled, the engine pipeline allocates render targets at the upscaled-render\n" + "resolution instead of the HMD display resolution. The upscaler (DLSS or FSR) writes\n" + "its output to a private DisplayRes texture. Substantial VRAM and bandwidth savings,\n" + "especially at high HMD resolutions.\n" + "\n" + "Requires DLSS or FSR. Restart required to enable/disable. 
Method and Upscale\n" + "Preset changes also require a restart while this is active; sharpness / model preset\n" + "/ Reflex remain live."); + } + if (!methodSupportsPerf && settings.renderAtUpscaleRes) + Util::Text::Disabled("Render-at-upscaled-resolution requires DLSS or FSR — switch upscaler Method to activate."); + if (methodSupportsPerf) + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::renderAtUpscaleRes); + } } const bool frameGenerationDx12PathActive = IsFrameGenerationDx12PathActive(); @@ -275,37 +356,23 @@ void Upscaling::DrawSettings() if (HasFrameGenModule()) ImGui::Text("AMD FSR Frame Generation is available."); ImGui::Text("Requires a D3D11 to D3D12 proxy which can create compatibility issues"); - ImGui::Text("Toggling this setting requires a restart to work correctly"); - - bool onlyRequiresRestart = true; if (!isWindowed) { Util::Text::Warning("Warning: Requires windowed mode"); - - onlyRequiresRestart = false; } if (lowRefreshRate && !settings.frameGenerationForceEnable) { Util::Text::Warning("Warning: Requires a high refresh rate monitor or Force Enable Frame Generation"); - - onlyRequiresRestart = false; } if (fidelityFXMissing) { Util::Text::Warning("Warning: FidelityFX DLLs are not loaded"); - - onlyRequiresRestart = false; } - if (onlyRequiresRestart && settings.frameGenerationMode && !frameGenerationDx12PathActive) - Util::Text::Warning("Warning: Requires restart"); - - if (!settings.frameGenerationMode && frameGenerationDx12PathActive) - Util::Text::Warning("Warning: Requires restart"); - bool fgEnabled = settings.frameGenerationMode != 0; if (ImGui::Checkbox("Frame Generation", &fgEnabled)) settings.frameGenerationMode = fgEnabled ? 
1 : 0; + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::frameGenerationMode); if (!frameGenerationDx12PathActive) ImGui::BeginDisabled(); @@ -321,6 +388,7 @@ void Upscaling::DrawSettings() bool fgForce = settings.frameGenerationForceEnable != 0; if (ImGui::Checkbox("Force Enable Frame Generation", &fgForce)) settings.frameGenerationForceEnable = fgForce ? 1 : 0; + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::frameGenerationForceEnable); ImGui::Checkbox("Frame Generation in Menus", &settings.frameGenerationAllowInMenus); if (auto _tt = Util::HoverTooltipWrapper()) { @@ -409,6 +477,25 @@ void Upscaling::DrawSettings() ImGui::TreePop(); } + // FoveatedRender: foveated subrect DLSS — VR-only, opt-in mode of this + // feature. Like DLSSperf, lives here rather than as a peer Feature so + // all DLSS surfaces share one settings panel. Enable lives at the top + // level for discoverability; the body knobs are collapsed by default and + // greyed out until the user opts in. + if (globals::game::isVR) { + ImGui::Separator(); + foveatedRender.DrawEnable(); + const bool enabled = foveatedRender.settings.enabled != 0; + if (!enabled) + ImGui::BeginDisabled(); + if (ImGui::TreeNodeEx("Foveated DLSS — Tuning")) { + foveatedRender.DrawSettings(); + ImGui::TreePop(); + } + if (!enabled) + ImGui::EndDisabled(); + } + if (ImGui::TreeNodeEx("Backend Diagnostics")) { // Streamline log level selection const char* logLevels[] = { "Off", "Default", "Verbose" }; @@ -416,7 +503,7 @@ void Upscaling::DrawSettings() if (ImGui::Combo("Streamline Logging", &logLevelIdx, logLevels, IM_ARRAYSIZE(logLevels))) { settings.streamlineLogLevel = static_cast(logLevelIdx); } - ImGui::TextUnformatted("Changing this requires a restart to take effect."); + Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::streamlineLogLevel); if (auto _tt = Util::HoverTooltipWrapper()) { ImGui::Text("Streamline logging controls the verbosity of NVIDIA Streamline backend logs. 
Useful for debugging issues with DLSS/DLSS-G."); } @@ -493,6 +580,11 @@ void Upscaling::DrawSettings() void Upscaling::SaveSettings(json& o_json) { o_json = settings; + // Nest FoveatedRender's settings under a sub-key so they round-trip alongside + // Upscaling's own. Subrect controller persistence is owned by FoveatedRender. + json foveatedRenderJson; + foveatedRender.SaveSettings(foveatedRenderJson); + o_json["foveatedRender"] = foveatedRenderJson; auto iniSettingCollection = globals::game::iniPrefSettingCollection; if (iniSettingCollection) { auto setting = iniSettingCollection->GetSetting("bUseTAA:Display"); @@ -504,6 +596,15 @@ void Upscaling::SaveSettings(json& o_json) void Upscaling::LoadSettings(json& o_json) { + // Pull FoveatedRender's nested block first so its absence doesn't fail the + // outer settings deserialize. FoveatedRender::ClampSettings touches sibling + // presetDLSS (cross-feature compat), so re-run it after `settings = o_json` + // below — otherwise the JSON re-assign overwrites the clamp and an + // incompatible preset slips through. (Copilot on PR #44.) + if (o_json.contains("foveatedRender")) { + foveatedRender.LoadSettings(o_json["foveatedRender"]); + o_json.erase("foveatedRender"); + } settings = o_json; // Sanitize loaded settings to ensure enum indices are valid @@ -516,10 +617,20 @@ void Upscaling::LoadSettings(json& o_json) logger::warn("[Upscaling] Loaded upscaleMethodNoDLSS {} out of range, clamping to {}", settings.upscaleMethodNoDLSS, enumCount ? enumCount - 1 : 0); settings.upscaleMethodNoDLSS = enumCount ? 
enumCount - 1 : 0; } + if (settings.qualityMode > 4) { + logger::warn("[Upscaling] Loaded qualityMode {} out of range, clamping to 4", settings.qualityMode); + settings.qualityMode = 4; + } if (settings.presetDLSS > 4) { logger::warn("[Upscaling] Loaded presetDLSS {} out of range, resetting to 0 (Default)", settings.presetDLSS); settings.presetDLSS = 0; } + // Re-apply FoveatedRender's cross-feature clamp now that the JSON + // re-assign above has overwritten anything it set during its own + // LoadSettings (which fired before this block ran). Idempotent — no-op + // if FoveatedRender is inactive or the preset is already compatible. + // (Copilot on PR #44.) + foveatedRender.ClampSettings(); const float originalReflexFPSLimit = settings.reflexFPSLimit; if (!std::isfinite(settings.reflexFPSLimit)) { settings.reflexFPSLimit = 60.0f; @@ -548,6 +659,7 @@ void Upscaling::LoadSettings(json& o_json) void Upscaling::RestoreDefaultSettings() { settings = {}; + foveatedRender.RestoreDefaultSettings(); } void Upscaling::DataLoaded() @@ -605,6 +717,10 @@ struct BSImageSpace_Init_FXAA }; void Upscaling::PostPostLoad() { + // Subrect controller defaults + stereo flag (FoveatedRender is no longer a + // Feature subclass so we drive its lifecycle from here). + foveatedRender.PostPostLoad(); + bool isGOG = !GetModuleHandle(L"steam_api64.dll"); stl::detour_thunk(REL::RelocationID(79947, 82084)); @@ -634,8 +750,26 @@ void Upscaling::PostPostLoad() logger::info("[Upscaling] Installed hooks"); } +float Upscaling::GetQualityModeRatio(uint qualityMode) +{ + // Lower bound is 0, not 1: qualityMode=0 is DLAA / NATIVEAA (1.0x — + // render at display resolution). The FfxFsr3QualityMode enum header + // doesn't *declare* a 0 value, but the implementation delegates to + // FfxFsr3UpscalerQualityMode which has NATIVEAA=0 → 1.0f. Clamping to + // 1 would force DLAA into Quality (1.5x) and shrink the rendered + // region of kMAIN to 67%. 
+ const float ratio = ffxFsr3GetUpscaleRatioFromQualityMode( + static_cast(std::clamp(qualityMode, 0u, 4u))); + return std::isfinite(ratio) && ratio > 0.0f ? ratio : 3.0f; +} + Upscaling::UpscaleMethod Upscaling::GetUpscaleMethod() const { + // Lock runtime to the boot upscaler under PerfMode — engine RTs are + // sized for it, and routing a different method through testTexture/ + // renderRes paths breaks the HMD. + if (globals::features::upscaling.perfMode.IsHookActive()) + return static_cast(bootSnapshot.Boot(&Settings::upscaleMethod)); if (streamline.featureDLSS) return (UpscaleMethod)settings.upscaleMethod; return (UpscaleMethod)settings.upscaleMethodNoDLSS; @@ -1015,8 +1149,15 @@ void Upscaling::EnsureVRIntermediateTextures() auto screenSize = globals::state->screenSize; auto renderSize = Util::ConvertToDynamic(screenSize); - uint32_t eyeWidthOut = (uint32_t)(screenSize.x / 2); - uint32_t eyeHeightOut = (uint32_t)screenSize.y; + // PerfMode: state->screenSize is polluted to RenderRes (the BSOpenVR size + // hook spoofs HMD-recommended size). DLSS output needs to land at real + // DisplayRes, so size the OUTPUT intermediates from perfMode's snapshot + // of the true HMD resolution. Input intermediates stay at renderSize. + const bool dlssperfActive = perfMode.IsHookActive() && perfMode.GetTestTexture(); + const float2 outputSize = dlssperfActive ? 
perfMode.GetDisplayScreenSize() : screenSize; + + uint32_t eyeWidthOut = (uint32_t)(outputSize.x / 2); + uint32_t eyeHeightOut = (uint32_t)outputSize.y; uint32_t eyeWidthIn = (uint32_t)(renderSize.x / 2); uint32_t eyeHeightIn = (uint32_t)renderSize.y; @@ -1087,10 +1228,18 @@ void Upscaling::FinalizePerEyeOutputs(ID3D11Resource* colorDst) state->BeginPerfEvent("VR Upscaling Finalize"); auto context = globals::d3d::context; - auto screenSize = state->screenSize; - uint32_t eyeWidthOut = (uint32_t)(screenSize.x / 2); - uint32_t eyeHeightOut = (uint32_t)screenSize.y; + // Drive output dims from the per-eye intermediate desc, not state->screenSize. + // Under PerfMode the state value is polluted to renderRes while the intermediates + // were allocated at displayRes via EnsureVRIntermediateTextures' size bridge. + if (!vrIntermediateColorOut[0]) { + if (state->frameAnnotations) + state->EndPerfEvent(); + return; + } + + uint32_t eyeWidthOut = vrIntermediateColorOut[0]->desc.Width; + uint32_t eyeHeightOut = vrIntermediateColorOut[0]->desc.Height; // Write upscaled outputs back for (uint32_t i = 0; i < 2; ++i) { @@ -1218,24 +1367,62 @@ void Upscaling::ConfigureUpscaling(RE::BSGraphics::State* a_viewport) auto screenHeight = static_cast(screenSize.y); if (upscaleMethod != UpscaleMethod::kNONE && upscaleMethod != UpscaleMethod::kTAA) { - float resolutionScaleBase = 1.0f / ffxFsr3GetUpscaleRatioFromQualityMode((FfxFsr3QualityMode)settings.qualityMode); + // PerfMode: when the BSOpenVR size hook is live, every engine RT was + // already allocated at RenderRes — so the DRS-style scale is identity. + // Jitter is still computed at the real DisplayRes phase ratio so the + // upscaler has enough sub-pixel diversity for reconstruction. + // + // The upscaleMethod here comes from GetUpscaleMethod(), which under + // PerfMode+hookActive is locked to the boot snapshot — so this gate + // reads the value the user had selected at game start, not what they + // later moved the slider to. 
Engine RTs were sized off that boot + // choice (irreversible — the size hook can't un-allocate them); the + // boot-snapshot lock keeps the runtime evaluate consistent with those + // allocations. UI staged-change banners explain the restart + // requirement for method/quality edits. Branch fires for both DLSS + // and FSR since both consume the renderRes engine RTs and write to + // perfMode.testTexture. + const bool dlssperfRenderResPath = + perfMode.IsHookActive() && + (upscaleMethod == UpscaleMethod::kDLSS || upscaleMethod == UpscaleMethod::kFSR); + if (dlssperfRenderResPath) { + resolutionScale = { 1.0f, 1.0f }; + + auto renderWidth = static_cast(perfMode.GetRenderEyeWidth()); + auto displayWidth = static_cast(perfMode.GetDisplayEyeWidth()); + + auto phaseCount = GetJitterPhaseCount(renderWidth, displayWidth); + GetJitterOffset(&jitter.x, &jitter.y, state->frameCount, phaseCount); + + if (globals::game::isVR) + a_viewport->projectionPosScaleX = -jitter.x / renderWidth; + else + a_viewport->projectionPosScaleX = -2.0f * jitter.x / renderWidth; - auto renderWidth = static_cast(screenWidth * resolutionScaleBase); - auto renderHeight = static_cast(screenHeight * resolutionScaleBase); + a_viewport->projectionPosScaleY = 2.0f * jitter.y / static_cast(perfMode.GetRenderEyeHeight()); + } else { + // Boot qualityMode under PerfMode so projection stays coherent + // with the engine RTs sized at install. + const uint32_t qm = globals::features::upscaling.perfMode.IsHookActive() ? 
bootSnapshot.Boot(&Settings::qualityMode) : settings.qualityMode; + float resolutionScaleBase = 1.0f / GetQualityModeRatio(qm); - resolutionScale.x = static_cast(renderWidth) / static_cast(screenWidth); - resolutionScale.y = static_cast(renderHeight) / static_cast(screenHeight); + auto renderWidth = static_cast(screenWidth * resolutionScaleBase); + auto renderHeight = static_cast(screenHeight * resolutionScaleBase); - auto phaseCount = GetJitterPhaseCount(renderWidth, screenWidth); + resolutionScale.x = static_cast(renderWidth) / static_cast(screenWidth); + resolutionScale.y = static_cast(renderHeight) / static_cast(screenHeight); - GetJitterOffset(&jitter.x, &jitter.y, state->frameCount, phaseCount); + auto phaseCount = GetJitterPhaseCount(renderWidth, screenWidth); - if (globals::game::isVR) - a_viewport->projectionPosScaleX = -jitter.x / renderWidth; - else - a_viewport->projectionPosScaleX = -2.0f * jitter.x / renderWidth; + GetJitterOffset(&jitter.x, &jitter.y, state->frameCount, phaseCount); + + if (globals::game::isVR) + a_viewport->projectionPosScaleX = -jitter.x / renderWidth; + else + a_viewport->projectionPosScaleX = -2.0f * jitter.x / renderWidth; - a_viewport->projectionPosScaleY = 2.0f * jitter.y / renderHeight; + a_viewport->projectionPosScaleY = 2.0f * jitter.y / renderHeight; + } } else { resolutionScale = { 1.0f, 1.0f }; @@ -1355,6 +1542,7 @@ void Upscaling::SetupResources() void Upscaling::ClearShaderCache() { + foveatedRender.ClearShaderCache(); for (int i = 0; i < 5; ++i) { encodeTexturesCS[i] = nullptr; // com_ptr automatically releases } @@ -1784,9 +1972,51 @@ void Upscaling::Upscale() logger::debug("[Upscaling] LoadingMenu close detected — rebuilding DLSS feature"); streamline.DestroyDLSSResources(); } - streamline.Upscale(main.texture, reactiveMaskTexture->resource.get(), transparencyCompositionMaskTexture->resource.get(), motionVectorCopyTexture->resource.get()); + + // PR-3 MVP-B: opt-in FoveatedRender route. 
When active, runs the + // per-eye DLSS dispatch with optional foveal subrect through + // FoveatedRenderImpl::Core; falls through to dev's standard path on + // any failure so users always see DLSS output (graceful + // degradation — no black frames if the enhancer preflights bad). + // + // Menu-skip: in menus the world stops producing fresh motion + // vectors and depth, but kMAIN keeps changing (UI plate composites). + // The route's subrect DLSS evaluate then accumulates temporal + // history against stale neighbourhood data and the subrect region + // renders as visible reconstruction garbage. Standard full-eye DLSS + // (the fall-through below) is robust to this because it reconstructs + // across the whole image — the foveated crop is what makes the + // stale-history bleed visible. Same menu-open predicate dev uses + // at Upscaling.cpp:1748 for ShouldUseFrameGenerationThisFrame. + auto* ui = globals::game::ui; + auto* st = globals::state; + const bool menuOpen = (ui && ui->GameIsPaused()) || (st && st->IsMainOrLoadingMenuOpen(ui)); + bool routeHandled = false; + if (FoveatedRenderImpl::Bridge::IsRouteActive() && globals::game::isVR && !menuOpen) { + if (FoveatedRenderImpl::Preprocess::EncodeUpscalingTextures(*this)) { + routeHandled = FoveatedRenderImpl::Core::ExecuteVRDlssCore(streamline, + main.texture, + globals::game::renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN].texture, + reactiveMaskTexture->resource.get(), + transparencyCompositionMaskTexture->resource.get(), + motionVectorCopyTexture->resource.get()); + if (!routeHandled) { + logger::warn("[FOVEATED] route preflight failed — falling through to standard DLSS path"); + } + } + } + if (!routeHandled) { + streamline.Upscale(main.texture, reactiveMaskTexture->resource.get(), transparencyCompositionMaskTexture->resource.get(), motionVectorCopyTexture->resource.get()); + } } else if (upscaleMethod == UpscaleMethod::kFSR) { - fidelityFX.Upscale(main.texture, 
reactiveMaskTexture->resource.get(), transparencyCompositionMaskTexture->resource.get(), motionVector.texture, settings.sharpnessFSR); + // PerfMode bridge: when the engine RTs are shrunk to renderRes, FSR's displayRes + // output must land in perfMode.testTexture (the private displayRes target used for + // OpenVR submit), not back in the now-small kMAIN. Mirrors Streamline's colorOut + // routing for DLSS. + ID3D11Resource* fsrColorOut = (perfMode.IsHookActive() && perfMode.GetTestTexture()) ? + static_cast(perfMode.GetTestTexture()) : + nullptr; + fidelityFX.Upscale(main.texture, reactiveMaskTexture->resource.get(), transparencyCompositionMaskTexture->resource.get(), motionVector.texture, settings.sharpnessFSR, fsrColorOut); } state->EndPerfEvent(); @@ -1819,7 +2049,20 @@ void Upscaling::UpscaleDepth() // 3) Resource copies are skipped for aliased src/dst to reduce copy churn. // (1) Early validation exits - if (!IsUpscalingActive()) { + const bool depthUpscaleActive = IsUpscalingActive(); + const auto upscaleMethod = GetUpscaleMethod(); + const bool isVR = globals::game::isVR; + const bool vendorUpscaler = upscaleMethod == UpscaleMethod::kDLSS || upscaleMethod == UpscaleMethod::kFSR; + const bool fullResolutionMaskPath = + upscaleMethod == UpscaleMethod::kNONE || + upscaleMethod == UpscaleMethod::kTAA || + (vendorUpscaler && settings.qualityMode == 0); + const bool repairVRFullResolutionMask = + isVR && + fullResolutionMaskPath && + !depthUpscaleActive; + + if (!depthUpscaleActive && !repairVRFullResolutionMask) { return; } @@ -1827,7 +2070,8 @@ void Upscaling::UpscaleDepth() auto renderer = globals::game::renderer; auto context = globals::d3d::context; auto deferred = globals::deferred; - if (!state || !renderer || !context || !deferred || !deferred->linearSampler || !jitterCB || !upscaleRasterizerState || !upscaleBlendState || !upscaleDepthStencilState) { + if (!state || !renderer || !context || !deferred || !deferred->linearSampler || !jitterCB || 
!upscaleRasterizerState || !upscaleBlendState || + (depthUpscaleActive && !upscaleDepthStencilState)) { return; } @@ -1842,24 +2086,39 @@ void Upscaling::UpscaleDepth() auto& saoCameraZ = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGET::kSAO_CAMERAZ]; auto& underwaterMask = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGET::kUNDERWATER_MASK]; - if (!depth.texture || !depth.views[0] || !depthCopy.texture || !depthCopy.depthSRV || - !refractionNormals.texture || !refractionNormals.textureCopy || !refractionNormals.SRVCopy || !refractionNormals.RTV || !saoCameraZ.RTV || + if (!depth.texture || !depthCopy.texture || !depthCopy.depthSRV || !underwaterMask.texture || !underwaterMask.textureCopy || !underwaterMask.SRVCopy || !underwaterMask.RTV) { return; } - if (globals::game::isVR && (!depthCopy.views[0] || !depthCopy.stencilSRV)) { + if (depthUpscaleActive && + (!depth.views[0] || !refractionNormals.texture || !refractionNormals.textureCopy || !refractionNormals.SRVCopy || !refractionNormals.RTV || !saoCameraZ.RTV)) { + return; + } + // stencilSRV + views[0] are both upscale-path-only: the depth-upscale + // draw binds depthCopy as a stencil SRV input and depth.views[0] as DSV. + // The full-resolution mask repair never touches either, so don't disable + // the VR fix on setups where stencil SRV creation is unavailable. + if (depthUpscaleActive && isVR && (!depthCopy.stencilSRV || !depthCopy.views[0])) { return; } auto* fullscreenVS = GetUpscaleVS(); - auto* depthUpscalePS = GetDepthRefractionUpscalePS(); + auto* depthUpscalePS = depthUpscaleActive ? GetDepthRefractionUpscalePS() : nullptr; auto* underwaterMaskPS = GetUnderwaterMaskUpscalePS(); - if (!fullscreenVS || !depthUpscalePS || !underwaterMaskPS) { + if (!fullscreenVS || !underwaterMaskPS || (depthUpscaleActive && !depthUpscalePS)) { return; } state->BeginPerfEvent("Render Target Upscaling"); + // Unbind any prior render targets before issuing CopyResource on depth/ + // depthCopy. 
Upscale() does this for the standard upscale path, but + // UpscaleDepth() can now be invoked standalone from Main_PostProcessing + // (kNONE/kTAA VR path) without going through Upscale() first — match the + // same precondition here to avoid a debug-layer hazard when depth happens + // to still be bound as a DSV from a prior pass. + context->OMSetRenderTargets(0, nullptr, nullptr); + // Set up Input Assembler for fullscreen triangle (no vertex/index buffers needed) context->IASetInputLayout(nullptr); context->IASetVertexBuffers(0, 0, nullptr, nullptr, nullptr); @@ -1887,10 +2146,10 @@ void Upscaling::UpscaleDepth() context->PSSetSamplers(0, ARRAYSIZE(samplers), samplers); // Set up jitter/depth-kernel constant buffer for upscaling - JitterCB jitterData; + JitterCB jitterData{}; jitterData.jitter = jitter; // (2) Wide-kernel hysteresis - { + if (depthUpscaleActive) { constexpr float kEnterWideKernelRatio = 1.55f; constexpr float kExitWideKernelRatio = 1.45f; const float minScale = std::max(std::min(resolutionScale.x, resolutionScale.y), FLT_EPSILON); @@ -1907,7 +2166,6 @@ void Upscaling::UpscaleDepth() } jitterData.useWideKernel = depthUpscaleUseWideKernel ? 1.0f : 0.0f; - jitterData.pad0 = 0.0f; } jitterCB->Update(jitterData); @@ -1921,7 +2179,7 @@ void Upscaling::UpscaleDepth() } }; - { + if (depthUpscaleActive) { TracyD3D11Zone(globals::state->tracyCtx, "Upscaling - Depth Upscale"); // Sometimes this is not already copied e.g. map menu. @@ -1929,7 +2187,7 @@ void Upscaling::UpscaleDepth() copyIfNonAliased(depthCopy.texture, depth.texture); // Clear stencil to be 0xFF - if (globals::game::isVR) { + if (isVR) { context->ClearDepthStencilView(depthCopy.views[0], D3D11_CLEAR_STENCIL, 1.0f, 0xFF); } @@ -1944,11 +2202,16 @@ void Upscaling::UpscaleDepth() // kSAO_CAMERAZ is at quarter-stereo resolution in VR; the full-stereo viewport would // corrupt only the top-left quarter. The engine's ISSAOCameraZ pass populates it correctly. 
ID3D11RenderTargetView* rtvs[] = { refractionNormals.RTV, - globals::game::isVR ? nullptr : saoCameraZ.RTV }; + isVR ? nullptr : saoCameraZ.RTV }; context->OMSetRenderTargets(2, rtvs, depth.views[0]); context->PSSetShader(depthUpscalePS, nullptr, 0); context->Draw(3, 0); + } else { + TracyD3D11Zone(globals::state->tracyCtx, "Upscaling - Full Resolution Underwater Mask Depth Copy"); + + // Full-resolution paths only need to refresh the underwater mask depth source. + copyIfNonAliased(depthCopy.texture, depth.texture); } { @@ -1974,8 +2237,10 @@ void Upscaling::UpscaleDepth() context->Draw(3, 0); } - // Now propagate the upscaled depth to kMAIN_COPY so downstream VR passes see it. - if (globals::game::isVR) { + // Propagate the upscaled depth to kMAIN_COPY so downstream VR passes see + // it. Skipped on the full-resolution path because the else branch above + // already refreshed depthCopy from depth and nothing has touched it since. + if (isVR && depthUpscaleActive) { TracyD3D11Zone(globals::state->tracyCtx, "Upscaling - Depth VR Propagate"); copyIfNonAliased(depthCopy.texture, depth.texture); } @@ -1986,6 +2251,101 @@ void Upscaling::UpscaleDepth() state->EndPerfEvent(); } +void Upscaling::RunUnderwaterMaskRepair() +{ + ZoneScoped; + TracyD3D11Zone(globals::state->tracyCtx, "Upscaling - Underwater Mask (Standalone)"); + + if (!globals::game::isVR) + return; + + auto state = globals::state; + auto renderer = globals::game::renderer; + auto context = globals::d3d::context; + auto deferred = globals::deferred; + if (!state || !renderer || !context || !deferred || !deferred->linearSampler || !jitterCB) { + return; + } + + auto screenSize = state->screenSize; + if (screenSize.x <= 0.0f || screenSize.y <= 0.0f) { + return; + } + + auto& depth = renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN]; + auto& depthCopy = renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN_COPY]; + auto& underwaterMask = 
renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGET::kUNDERWATER_MASK]; + if (!depth.texture || !depthCopy.texture || !depthCopy.depthSRV || + !underwaterMask.texture || !underwaterMask.textureCopy || !underwaterMask.SRVCopy || !underwaterMask.RTV) { + return; + } + + auto* fullscreenVS = GetUpscaleVS(); + auto* underwaterMaskPS = GetUnderwaterMaskUpscalePS(); + if (!fullscreenVS || !underwaterMaskPS) { + return; + } + + state->BeginPerfEvent("Underwater Mask Repair (Standalone)"); + + // Unbind RTs/DSV before the CopyResource calls below — if the caller + // still has depth bound as a DSV the copy is a debug-layer hazard. + // Mirrors UpscaleDepth's entry-time precondition. The caller's save/ + // restore (FullscreenPassScope) restores the original binding on exit. + context->OMSetRenderTargets(0, nullptr, nullptr); + + // Fullscreen triangle setup — pipeline state is the caller's + // responsibility to save/restore; we do not touch the existing OM + // bindings beyond the explicit binds below. + context->IASetInputLayout(nullptr); + context->IASetVertexBuffers(0, 0, nullptr, nullptr, nullptr); + context->IASetIndexBuffer(nullptr, DXGI_FORMAT_UNKNOWN, 0); + context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST); + context->VSSetShader(fullscreenVS, nullptr, 0); + context->GSSetShader(nullptr, nullptr, 0); + context->HSSetShader(nullptr, nullptr, 0); + context->DSSetShader(nullptr, nullptr, 0); + + context->RSSetState(nullptr); + context->OMSetBlendState(nullptr, nullptr, 0xffffffff); + context->OMSetDepthStencilState(nullptr, 0x00); + + ID3D11SamplerState* samplers[] = { deferred->linearSampler }; + context->PSSetSamplers(0, ARRAYSIZE(samplers), samplers); + + // jitterCB is shared with the depth-upscale path; the mask shader only + // reads .jitter (de-jitter sampling). useWideKernel is depth-only. 
+ JitterCB jitterData{}; + jitterData.jitter = jitter; + jitterCB->Update(jitterData); + auto bufferArray = jitterCB->CB(); + context->PSSetConstantBuffers(0, 1, &bufferArray); + + // Refresh depthCopy + underwater mask copy before sampling. + if (depthCopy.texture != depth.texture) + context->CopyResource(depthCopy.texture, depth.texture); + if (underwaterMask.textureCopy != underwaterMask.texture) + context->CopyResource(underwaterMask.textureCopy, underwaterMask.texture); + + D3D11_VIEWPORT viewport = {}; + viewport.Width = screenSize.x * 0.5f; + viewport.Height = screenSize.y * 0.5f; + viewport.MaxDepth = 1.0f; + context->RSSetViewports(1, &viewport); + + ID3D11ShaderResourceView* srvs[] = { underwaterMask.SRVCopy, depthCopy.depthSRV }; + context->PSSetShaderResources(0, ARRAYSIZE(srvs), srvs); + ID3D11RenderTargetView* rtvs[] = { underwaterMask.RTV }; + context->OMSetRenderTargets(ARRAYSIZE(rtvs), rtvs, nullptr); + context->PSSetShader(underwaterMaskPS, nullptr, 0); + context->Draw(3, 0); + + ID3D11ShaderResourceView* nullPSResources[2] = { nullptr, nullptr }; + context->PSSetShaderResources(0, ARRAYSIZE(nullPSResources), nullPSResources); + + state->EndPerfEvent(); +} + void Upscaling::ApplySharpening() { ZoneScoped; @@ -2045,11 +2405,25 @@ void Upscaling::Main_PostProcessing::thunk(RE::ImageSpaceManager* a_this, uint32 if (upscaling.ShouldUseFrameGenerationThisFrame()) upscaling.CopySharedD3D12Resources(); - if (upscaleMethod != UpscaleMethod::kNONE && upscaleMethod != UpscaleMethod::kTAA) + if (upscaleMethod != UpscaleMethod::kNONE && upscaleMethod != UpscaleMethod::kTAA) { upscaling.PerformUpscaling(); - - if (upscaleMethod == UpscaleMethod::kDLSS) - upscaling.ApplySharpening(); + } else if (globals::game::isVR) { + upscaling.UpscaleDepth(); + } + + if (upscaleMethod == UpscaleMethod::kDLSS) { + // FoveatedRender's DLSS output doesn't land in sharpenerTexture the + // way dev's path does (the route writes to its own per-eye intermediates + // and copies 
back to kMAIN/testTexture), so dev's zero-copy + // ApplySharpening can't read sharpenerTexture. Route through + // Postprocess::ApplyDlssSharpening which does the kMAIN → sharpener → + // kMAIN round-trip. Both paths honor sharpnessDLSS=0 to disable RCAS. + if (FoveatedRenderImpl::Bridge::IsRouteActive()) { + FoveatedRenderImpl::Postprocess::ApplyDlssSharpening(upscaling); + } else { + upscaling.ApplySharpening(); + } + } auto imageSpaceManager = RE::ImageSpaceManager::GetSingleton(); GET_INSTANCE_MEMBER(BSImagespaceShaderISTemporalAA, imageSpaceManager); @@ -2062,7 +2436,28 @@ void Upscaling::Main_PostProcessing::thunk(RE::ImageSpaceManager* a_this, uint32 if (hdrLoaded) globals::features::hdrDisplay.RedirectFramebuffer(); - func(a_this, a3, a_target, a_4, a_5); + // PerfMode: hybrid Post — HandlePostProcessing performs a two-layer + // struct swap around the engine's func() so tonemap + refraction read + // the DisplayRes testTexture instead of the small kMAIN. The supplied + // lambda is the engine call we'd normally make directly. + // + // Upscaler gate: testTexture is populated by whichever upscaler ran + // (Streamline routes DLSS colorOut there, FidelityFX routes FSR + // colorOut there). Under PerfMode+hookActive, GetUpscaleMethod() returns + // the boot snapshot so this check evaluates against the install-time + // choice — staged UI method changes don't reach here until restart. + // ShouldHandlePost() covers the partial-init case (post resources + // missing). 
+ const bool upscalerWritesTestTexture = + upscaleMethod == UpscaleMethod::kDLSS || + upscaleMethod == UpscaleMethod::kFSR; + if (upscalerWritesTestTexture && globals::features::upscaling.perfMode.ShouldHandlePost()) { + globals::features::upscaling.perfMode.HandlePostProcessing([&]() { + func(a_this, a3, a_target, a_4, a_5); + }); + } else { + func(a_this, a3, a_target, a_4, a_5); + } // Restore kFRAMEBUFFER after ISHDR — hdrTexture now has the HDR scene if (hdrLoaded) diff --git a/src/Features/Upscaling.h b/src/Features/Upscaling.h index 2cc4b1297f..d84b422854 100644 --- a/src/Features/Upscaling.h +++ b/src/Features/Upscaling.h @@ -3,8 +3,11 @@ #include "Feature.h" #include "Upscaling/DX12SwapChain.h" #include "Upscaling/FidelityFX.h" +#include "Upscaling/FoveatedRender.h" +#include "Upscaling/PerfMode.h" #include "Upscaling/RCAS/RCAS.h" #include "Upscaling/Streamline.h" +#include "Utils/BootSnapshot.h" #include #include #include @@ -68,10 +71,35 @@ struct Upscaling : Feature bool reflexUseMarkersToOptimize = false; bool reflexUseFPSLimit = false; float reflexFPSLimit = 60.0f; + + // VR PerfMode: opt-in. When set, BSShaderRenderTargets::Create installs + // the BSOpenVR render-target-size hook at engine init so the entire + // engine pipeline allocates render targets at upscaled-render resolution + // instead of display resolution. Saves VRAM/bandwidth proportional to + // the quality-mode scale ratio. Requires a game restart to take effect. + bool renderAtUpscaleRes = false; }; Settings settings; + // Single source of truth for restart-gated fields. Order is not load-bearing + // — the call-site `DrawSettingDiff` invocations in DrawSettings() handle any + // per-field conditional gating (e.g., qualityMode/upscaleMethod banners only + // render while PerfMode's render-target hook is active). MCP discovery + // reports the full set; clients can check feature state themselves. 
+ // presetDLSS is deliberately NOT here: Streamline::SetDLSSOptions reads + // settings.presetDLSS per-frame and applies it via slDLSSSetOptions, so + // it's already runtime-effective. + inline static constexpr Util::Settings::RestartTable kRestartFields{ { + UTIL_RESTART_FIELD(Settings, frameGenerationMode, "Frame Generation"), + UTIL_RESTART_FIELD(Settings, frameGenerationForceEnable, "Force Enable Frame Generation"), + UTIL_RESTART_FIELD(Settings, renderAtUpscaleRes, "Render at Upscaled Resolution"), + UTIL_RESTART_FIELD(Settings, streamlineLogLevel, "Streamline Logging"), + UTIL_RESTART_FIELD(Settings, upscaleMethod, "Upscaling Method"), + UTIL_RESTART_FIELD(Settings, qualityMode, "Upscale Preset"), + } }; + Util::Settings::BootSnapshot bootSnapshot{ kRestartFields }; + struct JitterCB { float2 jitter; @@ -108,6 +136,14 @@ struct Upscaling : Feature bool IsUpscalingActive() const; // Feature interface overrides + std::span GetRestartRequiredFields() const override + { + return { kRestartFields.data(), kRestartFields.size() }; + } + const void* GetBootValue(std::string_view jsonKey) const override { return bootSnapshot.RawBoot(jsonKey); } + const void* GetSettingsBlob() const override { return &settings; } + size_t GetSettingsBlobSize() const override { return sizeof(settings); } + virtual void DrawSettings() override; virtual void SaveSettings(json& o_json) override; virtual void LoadSettings(json& o_json) override; @@ -125,6 +161,16 @@ struct Upscaling : Feature UpscaleMethod GetUpscaleMethod() const; + /// Render-to-display scale ratio for a quality mode index + /// (1=Quality, 2=Balanced, 3=Performance, 4=UltraPerformance). 
+ /// Single source of truth across DLSS, FSR, and FoveatedRender paths: + /// the four "quality preset" ratios (1.5/1.7/2.0/3.0) are aligned across + /// DLSS and FSR3 by NVIDIA's DLSS Programming Guide and FFX's + /// FfxFsr3QualityMode enum, so all upscalers in this plugin route their + /// quality lookups through here rather than duplicating the table. Returns + /// 3.0 (UltraPerformance) on out-of-range input. + static float GetQualityModeRatio(uint qualityMode); + void CheckResources(UpscaleMethod a_upscalemethod); void CreateUpscalingTextureResources(UpscaleMethod a_upscalemethod); void DestroyUpscalingTextureResources(UpscaleMethod a_upscalemethod); @@ -199,7 +245,9 @@ struct Upscaling : Feature static inline Streamline streamline; static inline FidelityFX fidelityFX; ///< Only for frame generation static inline DX12SwapChain dx12SwapChain; - static inline RCAS rcas; ///< Standalone RCAS sharpening for DLSS + static inline RCAS rcas; ///< Standalone RCAS sharpening for DLSS + static inline PerfMode perfMode; ///< VR-only: render engine at upscaled-render res + static inline FoveatedRender foveatedRender; ///< VR-only: foveated subrect DLSS winrt::com_ptr copyDepthToSharedBufferPS; @@ -224,6 +272,19 @@ struct Upscaling : Feature void PerformUpscaling(); void UpscaleDepth(); + /** + * @brief Standalone full-resolution underwater mask repair (VR). + * + * Same draw as UpscaleDepth's mask branch on the full-resolution path, + * extracted so callers that bypass the standard upscale flow (notably + * PerfMode::HandlePostProcessing, where engine RTs are pre-shrunk to + * renderRes and DLSS targets a private displayRes texture) can drive + * the repair without going through UpscaleDepth's wider envelope. + * Sets and leaves D3D11 pipeline state dirty on exit — wrap in your + * own save/restore (PerfMode uses its FullscreenPassScope). + */ + void RunUnderwaterMaskRepair(); + /** * @brief Applies RCAS sharpening to the main render target after DLSS upscaling. 
* diff --git a/src/Features/Upscaling/FidelityFX.cpp b/src/Features/Upscaling/FidelityFX.cpp index b9976a99c7..e22ae987b9 100644 --- a/src/Features/Upscaling/FidelityFX.cpp +++ b/src/Features/Upscaling/FidelityFX.cpp @@ -273,8 +273,17 @@ void FidelityFX::CreateFSRResources() auto screenSize = state->screenSize; auto renderSize = Util::ConvertToDynamic(screenSize); - uint32_t displayWidth = (uint32_t)(globals::game::isVR ? screenSize.x / 2 : screenSize.x); - uint32_t displayHeight = (uint32_t)screenSize.y; + // PerfMode bridge: when the BSOpenVR size hook is live, state->screenSize is polluted + // to renderRes (engine RTs were allocated small). FSR3 still needs to upscale to the + // real HMD display resolution, so use perfMode's snapshot for displaySize/maxUpscaleSize. + // maxRenderSize stays at screenSize (which IS renderRes under the hook — that's FSR's + // expected input extent). + auto& perfMode = globals::features::upscaling.perfMode; + const bool dlssperfActive = perfMode.IsHookActive(); + const auto displaySize = dlssperfActive ? perfMode.GetDisplayScreenSize() : screenSize; + + uint32_t displayWidth = (uint32_t)(globals::game::isVR ? displaySize.x / 2 : displaySize.x); + uint32_t displayHeight = (uint32_t)displaySize.y; uint32_t renderWidth = (uint32_t)(globals::game::isVR ? 
renderSize.x / 2 : renderSize.x); uint32_t renderHeight = (uint32_t)renderSize.y; @@ -344,7 +353,7 @@ FfxResource ffxGetResource(ID3D11Resource* dx11Resource, return resource; } -void FidelityFX::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_reactiveMask, ID3D11Resource* a_transparencyCompositionMask, ID3D11Resource* a_motionVectors, float a_sharpness) +void FidelityFX::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_reactiveMask, ID3D11Resource* a_transparencyCompositionMask, ID3D11Resource* a_motionVectors, float a_sharpness, ID3D11Resource* a_colorOut) { auto renderer = globals::game::renderer; auto context = globals::d3d::context; @@ -357,6 +366,10 @@ void FidelityFX::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r auto& upscaling = globals::features::upscaling; auto jitter = upscaling.jitter; + // Default to in-place output when caller didn't supply a separate destination. + if (!a_colorOut) + a_colorOut = a_upscalingTexture; + auto DispatchFSR = [&](uint32_t contextIndex, ID3D11Resource* r_color, ID3D11Resource* r_depth, ID3D11Resource* r_mvec, ID3D11Resource* r_reactive, ID3D11Resource* r_trans, ID3D11Resource* r_output, uint32_t r_width, float mv_scale_x) { @@ -431,8 +444,9 @@ void FidelityFX::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r renderSize.x / 2.0f); } - // Merge outputs back to kMAIN - upscaling.FinalizePerEyeOutputs(a_upscalingTexture); + // Merge outputs into the supplied displayRes destination (kMAIN by default; + // perfMode.testTexture when PerfMode has shrunk the engine RTs). 
+ upscaling.FinalizePerEyeOutputs(a_colorOut); } else { DispatchFSR(0, a_upscalingTexture, @@ -440,7 +454,7 @@ void FidelityFX::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r a_motionVectors, a_reactiveMask, a_transparencyCompositionMask, - a_upscalingTexture, // Output to same texture + a_colorOut, (uint)renderSize.x, renderSize.x); } diff --git a/src/Features/Upscaling/FidelityFX.h b/src/Features/Upscaling/FidelityFX.h index 5db922ebec..f324bb580c 100644 --- a/src/Features/Upscaling/FidelityFX.h +++ b/src/Features/Upscaling/FidelityFX.h @@ -55,7 +55,11 @@ class FidelityFX void DestroyFSRResources(); - void Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_reactiveMask, ID3D11Resource* a_transparencyCompositionMask, ID3D11Resource* a_motionVectors, float a_sharpness); + // a_colorOut is the destination for the upscaled result. When nullptr, output is written + // back into a_upscalingTexture (legacy in-place behavior). Callers route a separate output + // when the engine's kMAIN is renderRes (e.g., PerfMode VR mode) and the upscaled result + // must land in a displayRes target. 
+ void Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_reactiveMask, ID3D11Resource* a_transparencyCompositionMask, ID3D11Resource* a_motionVectors, float a_sharpness, ID3D11Resource* a_colorOut = nullptr); private: // FSR scratch buffer - needs to be freed in DestroyFSRResources diff --git a/src/Features/Upscaling/FoveatedRender.cpp b/src/Features/Upscaling/FoveatedRender.cpp new file mode 100644 index 0000000000..52cc6cfbfc --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender.cpp @@ -0,0 +1,323 @@ +#include "FoveatedRender.h" + +#include "../../Globals.h" +#include "../../Utils/Subrect.h" +#include "../../Utils/UI.h" +#include "../Upscaling.h" +#include "FoveatedRender/Core.h" + +#include + +NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE_WITH_DEFAULT( + FoveatedRender::Settings, + enabled, + dlssMode, + stretchMode, + debugVisualize); + +// ============================================================================ +// Lifecycle +// ============================================================================ + +void FoveatedRender::PostPostLoad() +{ + // Opt into PR-1's stereo extension so the controller tracks a separate + // right-eye UV (HMD nose-side overlap symmetry). + subrectController.SetStereoEnabled(true); + + // Seed sensible foveal presets. Empty-case only — user edits persist. + // "Center N%" presets are symmetric per eye (no rightUV → auto-mirror, which + // for centered UVs produces an identical right-eye UV). "Nasal Convergence" + // is asymmetric: left eye biased toward its right edge, right eye biased + // toward its left edge — both targeting the nose-side region where HMD + // binocular fusion is strongest, so DLSS reconstruction lands in the actual + // stereo overlap zone rather than diverging left/right fields. 
+ subrectController.SeedDefaultPresets({ + { .name = "Full Eye", .uv = { 0.0f, 0.0f, 1.0f, 1.0f } }, + { .name = "Center 75%", .uv = { 0.125f, 0.125f, 0.75f, 0.75f } }, + { .name = "Center 50%", .uv = { 0.25f, 0.25f, 0.5f, 0.5f } }, + { .name = "Nasal Convergence 50%", + .uv = { 0.5f, 0.25f, 0.5f, 0.5f }, + .rightUV = Util::Subrect::UVRegion{ 0.0f, 0.25f, 0.5f, 0.5f } }, + }); +} + +void FoveatedRender::ClearShaderCache() +{ + FoveatedRenderImpl::Core::ClearShaderCache(); +} + +// ============================================================================ +// Settings I/O — driven from Upscaling::Save/LoadSettings under a nested key +// ============================================================================ + +void FoveatedRender::SaveSettings(json& o_json) +{ + o_json = settings; + subrectController.SaveSettings(o_json); +} + +void FoveatedRender::LoadSettings(const json& o_json) +{ + settings = o_json; + // Util::Subrect::Controller::LoadSettings takes `const json&` (Subrect.h:68) + // so no const_cast is needed — keeping it would imply mutation that never + // happens. (Copilot + CodeRabbit on PR #44.) + subrectController.LoadSettings(o_json); + ClampSettings(); +} + +void FoveatedRender::RestoreDefaultSettings() +{ + settings = {}; + ClampSettings(); +} + +void FoveatedRender::ClampSettings() +{ + settings.enabled = std::min(settings.enabled, 1u); + settings.dlssMode = std::min(settings.dlssMode, 1u); + settings.stretchMode = std::min(settings.stretchMode, 2u); + settings.debugVisualize = std::min(settings.debugVisualize, 1u); + // Preset clamping reads from Upscaling::Settings now. 
+ auto& sharedPreset = globals::features::upscaling.settings.presetDLSS; + sharedPreset = std::min(sharedPreset, 5u); + if (!IsPresetCompatibleWithMode(sharedPreset)) { + sharedPreset = 3; // Fall back to L + } +} + +// ============================================================================ +// Activation + accessors +// ============================================================================ + +bool FoveatedRender::IsActive() const +{ + return enabledAtBoot && IsRuntimeSupported(); +} + +bool FoveatedRender::IsRuntimeSupported() const +{ + return globals::game::isVR && globals::features::upscaling.streamline.featureDLSS; +} + +void FoveatedRender::LatchQualityMode() +{ + qualityModeAtBoot = std::clamp(globals::features::upscaling.settings.qualityMode, 1u, 4u); +} + +uint FoveatedRender::GetActiveQualityMode() const +{ + return std::clamp(globals::features::upscaling.settings.qualityMode, 1u, 4u); +} + +uint FoveatedRender::GetActivePresetDLSS() const +{ + return std::min(globals::features::upscaling.settings.presetDLSS, 5u); +} + +float FoveatedRender::GetActiveSharpnessDLSS() const +{ + return std::clamp(globals::features::upscaling.settings.sharpnessDLSS, 0.0f, 1.0f); +} + +float FoveatedRender::GetRenderScaleForQuality(uint qualityMode) +{ + return Upscaling::GetQualityModeRatio(qualityMode); +} + +bool FoveatedRender::IsPresetCompatibleWithMode(uint presetIndex) const +{ + // Preset indices: 0=Default, 1=J, 2=K, 3=L, 4=M, 5=F + // Faster mode: J(1) and K(2) are incompatible. + if (GetDlssMode() == DlssMode::kFaster) { + return presetIndex != 1 && presetIndex != 2; + } + return true; +} + +void FoveatedRender::ClampPresetToMode() +{ + auto& sharedPreset = globals::features::upscaling.settings.presetDLSS; + if (!IsPresetCompatibleWithMode(sharedPreset)) { + sharedPreset = 3; // Fall back to L + } +} + +// ============================================================================ +// UI — FoveatedRender-specific knobs only. 
Quality / sharpness / preset / +// Streamline log level live on Upscaling's panel and apply to both DLSS paths. +// Called from Upscaling::DrawSettings inside a TreeNode. +// ============================================================================ + +void FoveatedRender::DrawEnable() +{ + ClampSettings(); + + ImGui::TextWrapped( + "Foveated subrect DLSS: only the user-selected region gets full DLSS upscaling, " + "the periphery is cheaply stretched. Significant DLSS cost reduction at the cost " + "of peripheral sharpness. VR + DLSS only."); + + const bool runtimeSupported = IsRuntimeSupported(); + if (!runtimeSupported) { + settings.enabled = 0; + } + + if (!runtimeSupported) + ImGui::BeginDisabled(); + bool enabledBool = settings.enabled != 0; + if (ImGui::Checkbox("Enable Foveated DLSS", &enabledBool)) + settings.enabled = enabledBool ? 1u : 0u; + if (!runtimeSupported) + ImGui::EndDisabled(); + + if ((settings.enabled != 0) != enabledAtBoot) { + Util::Text::RestartNeeded("Pending restart: FoveatedRender will %s on next launch.", + settings.enabled ? "enable" : "disable"); + } + + if (enabledAtBoot) { + Util::Text::WrappedInfo("Active: upscaling is forced to DLSS while enabled."); + } + + if (!globals::game::isVR) { + Util::Text::Warning("VR only. Non-VR / FSR support pending future contributors."); + } + if (globals::game::isVR && !globals::features::upscaling.streamline.featureDLSS) { + Util::Text::Warning("DLSS runtime not available. 
Enable is blocked."); + } +} + +void FoveatedRender::DrawSettings() +{ + static const char* dlssModes[] = { "Default", "Faster" }; + static const char* stretchModes[] = { "Bilinear", "Point", "Gaussian Blur" }; + + ClampSettings(); + + Util::Text::WrappedInfo("Quality / Sharpness / DLSS Preset / Streamline log level are shared with the standard DLSS path above."); + + // ── VR-only knobs ── + if (globals::game::isVR) { + ImGui::Separator(); + ImGui::Text("VR DLSS Mode"); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Default vs Faster: trade per-eye image quality for setup cost. Switch only when you\n" + "can see a difference in your scene — otherwise prefer Faster.\n" + "\n" + "Default — use when: image quality matters more than the small overhead — cinematic\n" + "scenes, screenshot/recording, or if you notice ghosting/edge artifacts in Faster.\n" + "Each eye gets isolated per-eye intermediates for color/depth/MV/reactive/transparency\n" + "so DLSS can't sample across the SBS midline. Costs five per-eye copies per frame.\n" + "All DLSS presets (Default, J, K, L, M, F) supported.\n" + "\n" + "Faster — use when: you want the cheapest foveated path and aren't seeing artifacts —\n" + "fast-motion gameplay, exploration, anywhere small quality losses go unnoticed.\n" + "DLSS reads kMAIN directly via extent offsets, so bilinear sampling can touch 1-2\n" + "texels of the neighboring eye near the SBS midline. 
We snapshot kMAIN once and\n" + "clear the HMD hidden-area ring to prevent sky-blue bleed on fast head motion.\n" + "Presets J and K are unavailable — switching here auto-clamps preset to L."); + } + + uint prevMode = settings.dlssMode; + ImGui::SliderInt("DLSS Mode", reinterpret_cast<int*>(&settings.dlssMode), 0, 1, dlssModes[settings.dlssMode]); + if (settings.dlssMode != prevMode) { + const uint prevPreset = globals::features::upscaling.settings.presetDLSS; + ClampPresetToMode(); + if (globals::features::upscaling.settings.presetDLSS != prevPreset) { + logger::info("[FOVEATED] DLSS preset clamped from {} to {} after Faster switch (J/K incompatible)", + prevPreset, globals::features::upscaling.settings.presetDLSS); + } + } + switch (GetDlssMode()) { + case DlssMode::kDefault: + ImGui::TextWrapped("Per-eye isolation: 2 resource sets, 2 DLSS evaluates."); + break; + case DlssMode::kFaster: + ImGui::TextWrapped("SBS viewport: 1 snapshot + 2 mask clears, 2 evaluates. Presets J/K unavailable."); + break; + } + + ImGui::Separator(); + ImGui::Text("Background Stretch"); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "How the cheap periphery is reconstructed to fill the area outside the DLSS subrect.\n" + "This is the cost-saving step — DLSS only runs on the subrect, the rest is filled by\n" + "this cheaper pass. Only affects pixels outside your selected region."); + } + ImGui::SliderInt("Stretch Mode", reinterpret_cast<int*>(&settings.stretchMode), 0, 2, stretchModes[settings.stretchMode]); + switch (GetStretchMode()) { + case StretchMode::kBilinear: + ImGui::TextWrapped("Bilinear: clean linear upscale. Looks like a soft DLSS-Performance result."); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Use when: you want the periphery to look like a sensible low-quality reconstruction,\n" + "close to how DLSS-Performance would look. 
Default-ish choice.\n" + "\n" + "Visual artifact: typical bilinear softness — fine geometry in the periphery looks\n" + "slightly out of focus but not visibly stretched."); + } + break; + case StretchMode::kPoint: + ImGui::TextWrapped("Point (nearest-neighbor): cheapest. Visibly pixelated periphery."); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Use when: you want the smallest possible cost in the periphery and don't mind\n" + "obvious pixelation outside your gaze region. Useful for benchmarking the upper\n" + "bound of foveated savings.\n" + "\n" + "Visual artifact: chunky pixel blocks in the periphery, very visible if you look\n" + "away from the subrect center."); + } + break; + case StretchMode::kGaussianBlur: + ImGui::TextWrapped("Gaussian blur: softens periphery further. Hides upscale artifacts behind blur."); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Use when: you want the periphery to fall away into soft focus — closer to how\n" + "natural human peripheral vision feels. Good default for actual foveated use.\n" + "\n" + "Visual artifact: noticeable blur in the periphery. If your subrect is large this\n" + "is barely visible; if small, the blur is the dominant visual signal."); + } + break; + } + + ImGui::Separator(); + ImGui::Text("Subrect Region"); + ImGui::TextWrapped( + "Drag in the preview below to select the region that gets full DLSS upscaling. " + "The rest is cheaply stretched — saves significant DLSS cost."); + Util::Text::WrappedInfo("Screenshot has its own subrect; align them only if you want pixel-matched captures."); + + bool debugBool = settings.debugVisualize != 0; + if (ImGui::Checkbox("Visualize regions", &debugBool)) + settings.debugVisualize = debugBool ? 1u : 0u; + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text( + "Diagnostic: tint the cheap-stretched periphery red so the DLSS-reconstructed\n" + "subrect (un-tinted) pops visually in-game. 
Lets you confirm at a glance where\n" + "DLSS is actually running vs where the cheap stretch is filling. No perf impact;\n" + "runtime toggle, no restart needed."); + } + + // Preview off kVR_FRAMEBUFFER (the final composed SBS image the headset + // sees) rather than kMAIN. kMAIN is mid-pipeline and carries non-1 + // alpha where Skyrim composited UI plates, so even with the opaque + // blend callback you see the menu mask outline instead of the rendered + // world. ScreenshotFeature picks the same RT for the same reason + // (ScreenshotFeature.cpp:243). Foveated is VR-only so kVR_FRAMEBUFFER + // is always populated when we get here. + auto renderer = globals::game::renderer; + if (renderer) { + auto& fb = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kVR_FRAMEBUFFER]; + auto* tex = static_cast(fb.texture); + subrectController.DrawEditor(fb.SRV, tex, 0.5f, 0.0f, Util::Subrect::OpaquePreviewBlendCallback); + } else { + subrectController.DrawEditor(nullptr, nullptr, 0.5f); + } + } +} diff --git a/src/Features/Upscaling/FoveatedRender.h b/src/Features/Upscaling/FoveatedRender.h new file mode 100644 index 0000000000..0d7ee3b62c --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender.h @@ -0,0 +1,111 @@ +#pragma once + +// ============================================================================ +// FoveatedRender — VR DLSS enhancement mode of Upscaling +// ============================================================================ +// +// Foveated subrect-DLSS path: only the user-selected region gets full DLSS +// upscaling; the periphery is cheaply stretched via SubrectStretchCS. Halves +// (or more) the DLSS workload. Composes with VRS, Screenshot, and the lossless +// recording feature through the shared Util::Subrect module — use the same +// preset for consistent results across them. +// +// Architecturally a mode inside Upscaling (mirroring DLSSperf): a static- +// inline member, not a peer Feature. 
Settings that overlap with Upscaling's +// (quality mode, sharpness, DLSS preset, Streamline log level) read directly +// from `globals::features::upscaling.settings` rather than being duplicated. +// VR + DLSS only at present; non-VR / FSR extension is left to future work. +// +// ============================================================================ + +#include "../../Utils/Subrect.h" + +struct FoveatedRender +{ + // DLSS execution mode for VR + enum class DlssMode : uint + { + kDefault = 0, // Per-eye isolation: 2 extra resource sets, 2 evaluates. Supports F/J/K/L/M. + kFaster = 1, // SBS viewport: tell SL to read subrect from SBS directly, no extra resources, 2 evaluates. J/K incompatible, only L/M/F. + }; + + // Stretch algorithm for DRS → full-eye background (used by SubrectStretchCS shader) + enum class StretchMode : uint + { + kBilinear = 0, // Default bilinear sampling (clean upscale) + kPoint = 1, // Nearest-neighbor / point (cheapest, VRS-like broadcast) + kGaussianBlur = 2, // 3x3 Gaussian blur (soft periphery) + }; + + // FoveatedRender-specific settings. Quality mode / sharpness / DLSS preset / + // Streamline log level live on Upscaling::Settings and are read through + // the accessors below — do not duplicate them here. Sharpening on/off is + // controlled by the shared sharpnessDLSS slider (0 disables RCAS). + // + // Deferred to PR-3b: per-input DLSS hint toggles (MV dilation, reactive mask, + // transparency mask). The original PR #2096 declared the Settings fields and + // UI sliders but never plumbed them to EncodeTexturesCS or to the EvaluateDLSS + // arg list, so they were no-ops there too. Bringing them back in PR-3b means + // shader permutations (per-toggle defines), conditional encode-pass skip when + // all are off, and per-toggle DLSS arg gating — ship the implementation and + // the UI together so the knobs don't lie. 
+ struct Settings + { + uint enabled = 0; // opt-in: requires restart to take effect via LatchEnabled() + uint dlssMode = (uint)DlssMode::kDefault; + uint stretchMode = (uint)StretchMode::kGaussianBlur; + uint debugVisualize = 0; // tint cheap-stretched periphery red; runtime toggle + }; + + Settings settings; + Util::Subrect::Controller subrectController; + + // Called from Upscaling::DrawSettings. DrawEnable renders the always-visible + // header + Enable checkbox at the parent's top level; DrawSettings renders + // the body knobs inside a collapsible TreeNode (Upscaling wraps it in + // BeginDisabled when settings.enabled == 0). + void DrawEnable(); + void DrawSettings(); + // Called from Upscaling::SaveSettings / LoadSettings to round-trip JSON. + void SaveSettings(json& o_json); + void LoadSettings(const json& o_json); + void RestoreDefaultSettings(); + void ClearShaderCache(); + // Called from Upscaling::PostPostLoad to seed subrect presets. + void PostPostLoad(); + + bool IsRuntimeSupported() const; + bool IsActive() const; + bool IsLoaded() const { return enabledAtBoot; } + + // Main enable: latched at boot, change requires restart + void LatchEnabled() { enabledAtBoot = (settings.enabled != 0); } + + // Quality mode reads through Upscaling::Settings — latch the boot value so + // downstream RT allocations stay coherent if the user moves the slider. + void LatchQualityMode(); + uint GetQualityModeAtBoot() const { return qualityModeAtBoot; } + + /// Render-to-display scale denominator for a quality mode index + /// (1=Quality .. 4=UltraPerformance). Delegates to the FFX SDK ratio table. + static float GetRenderScaleForQuality(uint qualityMode); + + DlssMode GetDlssMode() const { return (DlssMode)std::min(settings.dlssMode, 1u); } + StretchMode GetStretchMode() const { return (StretchMode)std::min(settings.stretchMode, 2u); } + + // Active getters: clamp + route shared fields through Upscaling::Settings. 
+ uint GetActiveQualityMode() const; + uint GetActivePresetDLSS() const; + float GetActiveSharpnessDLSS() const; + + // Re-clamp cross-feature settings (preset vs DLSS mode). Idempotent; safe to call + // from Upscaling::LoadSettings after JSON has overwritten shared fields. + void ClampSettings(); + +private: + bool enabledAtBoot = false; // latched from settings.enabled at boot + uint qualityModeAtBoot = 4; // latched from Upscaling::Settings::qualityMode at boot + + bool IsPresetCompatibleWithMode(uint presetIndex) const; + void ClampPresetToMode(); +}; diff --git a/src/Features/Upscaling/FoveatedRender/Bridge.cpp b/src/Features/Upscaling/FoveatedRender/Bridge.cpp new file mode 100644 index 0000000000..bc4309c9df --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Bridge.cpp @@ -0,0 +1,79 @@ +#include "Bridge.h" + +#include "../../../Globals.h" +#include "../../Upscaling.h" +#include "../FoveatedRender.h" + +bool FoveatedRenderImpl::Bridge::IsRouteActive() +{ + // IsActive() already checks: globals::game::isVR + // && globals::features::upscaling.streamline.featureDLSS + // && enabledAtBoot + return globals::features::upscaling.foveatedRender.IsActive(); +} + +// Bridge.h contract: when the route is inactive, getters return a neutral / +// identity value so callers that forget to check IsRouteActive() don't +// silently pick up FoveatedRender values. 
+ +uint32_t FoveatedRenderImpl::Bridge::GetQualityMode() +{ + if (!IsRouteActive()) + return 0u; + return globals::features::upscaling.foveatedRender.GetActiveQualityMode(); +} + +uint32_t FoveatedRenderImpl::Bridge::GetPresetDLSS() +{ + if (!IsRouteActive()) + return 0u; + return globals::features::upscaling.foveatedRender.GetActivePresetDLSS(); +} + +float FoveatedRenderImpl::Bridge::GetSharpnessDLSS() +{ + if (!IsRouteActive()) + return 0.0f; + return globals::features::upscaling.foveatedRender.GetActiveSharpnessDLSS(); +} + +void FoveatedRenderImpl::Bridge::BootSequence() +{ + auto& enhancer = globals::features::upscaling.foveatedRender; + enhancer.LatchEnabled(); + enhancer.LatchQualityMode(); +} + +void FoveatedRenderImpl::Bridge::ComputeMvecScale(float& outX, float& outY) +{ + // Default: identity (caller's normal Streamline path). + outX = 1.0f; + outY = 1.0f; + + if (!IsRouteActive()) + return; + + auto& enhancer = globals::features::upscaling.foveatedRender; + const auto& uv = enhancer.subrectController.GetUV(); // PR-1 stereo Subrect: GetUV() == left-eye in stereo mode + const bool isFullEye = (uv.w >= 0.999f && uv.h >= 0.999f); + + if (isFullEye) + return; + + // Default + Faster both use per-eye DLSS calls (not strip-merged), so + // motion vectors scale by 1/UV.w on x. + outX = (uv.w > 0.0f) ? (1.0f / uv.w) : 1.0f; + outY = (uv.h > 0.0f) ? 
(1.0f / uv.h) : 1.0f; +} + +float FoveatedRenderImpl::Bridge::GetRenderScaleForQuality(uint32_t qualityMode) +{ + return FoveatedRender::GetRenderScaleForQuality(qualityMode); +} + +uint32_t FoveatedRenderImpl::Bridge::GetQualityModeAtBoot() +{ + if (!IsRouteActive()) + return 0u; + return globals::features::upscaling.foveatedRender.GetQualityModeAtBoot(); +} diff --git a/src/Features/Upscaling/FoveatedRender/Bridge.h b/src/Features/Upscaling/FoveatedRender/Bridge.h new file mode 100644 index 0000000000..dd408dfc2c --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Bridge.h @@ -0,0 +1,42 @@ +#pragma once + +// FoveatedRenderImpl::Bridge — single point of contact between the FoveatedRender +// subsystem and the rest of Community Shaders (Upscaling, Streamline). +// +// All "is FoveatedRender active?", "what settings should DLSS use?", and +// "what happened at boot?" questions are answered here, so consumers never +// need to #include FoveatedRender.h or poke globals::features::upscaling.foveatedRender +// directly. +// +// IMPORTANT: when the FoveatedRender route is inactive every query returns a +// neutral / identity value — callers must still check IsRouteActive() and +// fall back to their own settings when it returns false. + +#include + +namespace FoveatedRenderImpl::Bridge +{ + // True when VR + DLSS available + FoveatedRender enabled-at-boot. + bool IsRouteActive(); + + // Settings forwarding (live values from FoveatedRender GUI). + uint32_t GetQualityMode(); + uint32_t GetPresetDLSS(); + float GetSharpnessDLSS(); + + // Boot-time latches. Run once during BSShaderRenderTargets::Create. + // Latches enable + qualityMode so settings cannot drift mid-frame. + void BootSequence(); + + // Compute motion-vector scale for Streamline constants. + // Returns {1,1} when route is inactive or subrect is full-eye. + void ComputeMvecScale(float& outX, float& outY); + + // Render-to-display scale for a quality mode index (1=Quality .. 4=UltraPerf). 
+ // Delegates to the FFX SDK ratio table. + float GetRenderScaleForQuality(uint32_t qualityMode); + + // Quality mode latched at boot (resource sizing decisions consult this so + // they don't shift mid-game when the user changes the live setting). + uint32_t GetQualityModeAtBoot(); +} diff --git a/src/Features/Upscaling/FoveatedRender/Core.cpp b/src/Features/Upscaling/FoveatedRender/Core.cpp new file mode 100644 index 0000000000..78bd1ab1d3 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Core.cpp @@ -0,0 +1,581 @@ +#include "Core.h" +#include "Ops.h" + +#include "../../../State.h" +#include "../../../Util.h" +#include "../../Upscaling.h" +#include "../FoveatedRender.h" + +#include + +namespace FoveatedRenderImpl::Ops +{ + // Mirrors the StretchCB layout in SubrectStretchCS.hlsl — 8 dims + mode + + // blur radius + debug flag + pad. Kept at namespace scope so the create-CB + // path can size against sizeof(StretchCB) instead of a magic number. + struct StretchCB + { + uint32_t data[8]; + uint32_t stretchMode; + float blurRadius; + uint32_t debugVisualize; + uint32_t pad; + }; + + eastl::unique_ptr CreateTextureFromSource(ID3D11Resource* src, uint32_t width, uint32_t height, + bool copyBindFlags, bool createSRV, bool createUAV, const char* name) + { + if (!src) { + logger::error("[FOVEATED] CreateTextureFromSource called with null src ({})", name ? name : ""); + return nullptr; + } + + // QueryInterface for ID3D11Texture2D rather than blind static_cast — a + // non-texture resource passed here would crash GetDesc otherwise. + // (CodeRabbit on PR #44.) + winrt::com_ptr srcTex; + if (FAILED(src->QueryInterface(IID_PPV_ARGS(srcTex.put())))) { + logger::error("[FOVEATED] CreateTextureFromSource src is not an ID3D11Texture2D ({})", name ? 
name : ""); + return nullptr; + } + + D3D11_TEXTURE2D_DESC srcDesc; + srcTex->GetDesc(&srcDesc); + + D3D11_TEXTURE2D_DESC desc = {}; + desc.Width = width; + desc.Height = height; + desc.MipLevels = 1; + desc.ArraySize = 1; + desc.Format = srcDesc.Format; + desc.SampleDesc.Count = 1; + desc.Usage = D3D11_USAGE_DEFAULT; + desc.BindFlags = copyBindFlags ? srcDesc.BindFlags : (D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS); + + auto tex = eastl::make_unique(desc); + + if (name) { + Util::SetResourceName(tex->resource.get(), name); + } + + if (createSRV) { + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.Format = srcDesc.Format; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + srvDesc.Texture2D.MostDetailedMip = 0; + srvDesc.Texture2D.MipLevels = 1; + tex->CreateSRV(srvDesc); + } + + if (createUAV) { + D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + uavDesc.Format = srcDesc.Format; + uavDesc.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D; + uavDesc.Texture2D.MipSlice = 0; + tex->CreateUAV(uavDesc); + } + + return tex; + } + + void EnsureVRIntermediateTextures( + uint32_t inWidth, + uint32_t inHeight, + uint32_t outWidth, + uint32_t outHeight, + ID3D11Resource* colorSrc, + ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, + ID3D11Resource* transparencySrc) + { + bool needsRecreate = !Core::vrIntermediateColorIn[0] || !Core::vrIntermediateColorOut[0]; + if (!needsRecreate) { + needsRecreate = (Core::vrIntermediateColorIn[0]->desc.Width != inWidth || + Core::vrIntermediateColorIn[0]->desc.Height != inHeight || + Core::vrIntermediateColorOut[0]->desc.Width != outWidth || + Core::vrIntermediateColorOut[0]->desc.Height != outHeight); + } + // Recreate if reactive/transparency source appeared but intermediate is missing + if (!needsRecreate) { + needsRecreate = (reactiveSrc && !Core::vrIntermediateReactiveMask[0]) || + (transparencySrc && !Core::vrIntermediateTransparencyMask[0]); + } + + // Also reset stale intermediates when a source 
DISAPPEARED. Otherwise + // the per-eye reactive/transparency intermediates keep their last-known + // data and PreparePerEyeInputs's null-source branch skips the copy — + // DLSS then samples stale masks. Drop the intermediate so subsequent + // frames don't read it. Independent of the recreate path: shrinking is + // cheap and the next non-null source will trigger recreate above. + // (Copilot on PR #44.) + if (!reactiveSrc && Core::vrIntermediateReactiveMask[0]) { + Core::vrIntermediateReactiveMask[0].reset(); + Core::vrIntermediateReactiveMask[1].reset(); + } + if (!transparencySrc && Core::vrIntermediateTransparencyMask[0]) { + Core::vrIntermediateTransparencyMask[0].reset(); + Core::vrIntermediateTransparencyMask[1].reset(); + } + + if (!needsRecreate) { + return; + } + + for (int i = 0; i < 2; i++) { + std::string suffix = (i == 0) ? "Left" : "Right"; + + Core::vrIntermediateColorIn[i] = CreateTextureFromSource(colorSrc, inWidth, inHeight, false, true, true, ("FoveatedRender_ColorIn_" + suffix).c_str()); + Core::vrIntermediateColorOut[i] = CreateTextureFromSource(colorSrc, outWidth, outHeight, false, true, false, ("FoveatedRender_ColorOut_" + suffix).c_str()); + + D3D11_TEXTURE2D_DESC depthDesc = {}; + depthDesc.Width = inWidth; + depthDesc.Height = inHeight; + depthDesc.MipLevels = 1; + depthDesc.ArraySize = 1; + depthDesc.Format = DXGI_FORMAT_R32_TYPELESS; + depthDesc.SampleDesc.Count = 1; + depthDesc.Usage = D3D11_USAGE_DEFAULT; + depthDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE; + Core::vrIntermediateDepth[i] = eastl::make_unique(depthDesc); + Util::SetResourceName(Core::vrIntermediateDepth[i]->resource.get(), ("FoveatedRender_Depth_" + suffix).c_str()); + + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.Format = DXGI_FORMAT_R32_FLOAT; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + srvDesc.Texture2D.MipLevels = 1; + Core::vrIntermediateDepth[i]->CreateSRV(srvDesc); + + Core::vrIntermediateMotionVectors[i] = 
CreateTextureFromSource(mvecSrc, inWidth, inHeight, false, true, false, ("FoveatedRender_MVec_" + suffix).c_str()); + if (reactiveSrc) + Core::vrIntermediateReactiveMask[i] = CreateTextureFromSource(reactiveSrc, inWidth, inHeight, false, true, false, ("FoveatedRender_Reactive_" + suffix).c_str()); + else + Core::vrIntermediateReactiveMask[i].reset(); + if (transparencySrc) + Core::vrIntermediateTransparencyMask[i] = CreateTextureFromSource(transparencySrc, inWidth, inHeight, false, true, false, ("FoveatedRender_Transparency_" + suffix).c_str()); + else + Core::vrIntermediateTransparencyMask[i].reset(); + } + } + + void EnsureVRSubrectTextures( + uint32_t subInW, + uint32_t subInH, + uint32_t subOutW, + uint32_t subOutH, + ID3D11Resource* colorSrc, + ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, + ID3D11Resource* transparencySrc) + { + bool needsRecreate = !Core::vrSubrectColorIn[0] || + Core::vrSubrectInW != subInW || Core::vrSubrectInH != subInH || + Core::vrSubrectOutW != subOutW || Core::vrSubrectOutH != subOutH; + // Recreate if reactive/transparency source appeared but intermediate is missing + if (!needsRecreate) { + needsRecreate = (reactiveSrc && !Core::vrSubrectReactiveMask[0]) || + (transparencySrc && !Core::vrSubrectTransparencyMask[0]); + } + + if (needsRecreate) { + for (int i = 0; i < 2; i++) { + std::string suffix = (i == 0) ? 
"Left" : "Right"; + Core::vrSubrectColorIn[i] = CreateTextureFromSource(colorSrc, subInW, subInH, false, true, true, ("FoveatedRender_Subrect_ColorIn_" + suffix).c_str()); + Core::vrSubrectColorOut[i] = CreateTextureFromSource(colorSrc, subOutW, subOutH, false, true, false, ("FoveatedRender_Subrect_ColorOut_" + suffix).c_str()); + + D3D11_TEXTURE2D_DESC depthDesc = {}; + depthDesc.Width = subInW; + depthDesc.Height = subInH; + depthDesc.MipLevels = 1; + depthDesc.ArraySize = 1; + depthDesc.Format = DXGI_FORMAT_R32_TYPELESS; + depthDesc.SampleDesc.Count = 1; + depthDesc.Usage = D3D11_USAGE_DEFAULT; + depthDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE; + Core::vrSubrectDepth[i] = eastl::make_unique(depthDesc); + Util::SetResourceName(Core::vrSubrectDepth[i]->resource.get(), ("FoveatedRender_Subrect_Depth_" + suffix).c_str()); + + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.Format = DXGI_FORMAT_R32_FLOAT; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + srvDesc.Texture2D.MipLevels = 1; + Core::vrSubrectDepth[i]->CreateSRV(srvDesc); + + Core::vrSubrectMotionVectors[i] = CreateTextureFromSource(mvecSrc, subInW, subInH, false, true, false, ("FoveatedRender_Subrect_MVec_" + suffix).c_str()); + if (reactiveSrc) + Core::vrSubrectReactiveMask[i] = CreateTextureFromSource(reactiveSrc, subInW, subInH, false, true, false, ("FoveatedRender_Subrect_Reactive_" + suffix).c_str()); + else + Core::vrSubrectReactiveMask[i].reset(); + if (transparencySrc) + Core::vrSubrectTransparencyMask[i] = CreateTextureFromSource(transparencySrc, subInW, subInH, false, true, false, ("FoveatedRender_Subrect_Transparency_" + suffix).c_str()); + else + Core::vrSubrectTransparencyMask[i].reset(); + } + + Core::vrSubrectInW = subInW; + Core::vrSubrectInH = subInH; + Core::vrSubrectOutW = subOutW; + Core::vrSubrectOutH = subOutH; + } + } + + bool PreparePerEyeInputs( + ID3D11Resource* colorSrc, + ID3D11Resource* depthSrc, + ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, + 
ID3D11Resource* transparencySrc, + uint32_t eyeWidthIn, + uint32_t eyeHeightIn, + uint32_t eyeWidthOut, + uint32_t eyeHeightOut) + { + // Required sources are dereferenced unconditionally below; bail + // rather than null-deref CopySubresourceRegion. Reactive/transparency + // are optional and already conditionally copied. + if (!colorSrc || !depthSrc || !mvecSrc) { + logger::error("[FOVEATED] PreparePerEyeInputs missing required source textures"); + return false; + } + + EnsureVRIntermediateTextures( + eyeWidthIn, + eyeHeightIn, + eyeWidthOut, + eyeHeightOut, + colorSrc, + mvecSrc, + reactiveSrc, + transparencySrc); + + for (uint32_t i = 0; i < 2; ++i) { + if (!Core::vrIntermediateColorIn[i] || !Core::vrIntermediateColorOut[i] || + !Core::vrIntermediateDepth[i] || !Core::vrIntermediateMotionVectors[i] || + (reactiveSrc && !Core::vrIntermediateReactiveMask[i]) || + (transparencySrc && !Core::vrIntermediateTransparencyMask[i])) { + logger::error("[FOVEATED] Missing per-eye intermediate resources for eye {}", i); + return false; + } + } + + auto context = globals::d3d::context; + auto* depthSRV = globals::game::renderer->GetDepthStencilData() + .depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN] + .depthSRV; + for (uint32_t i = 0; i < 2; ++i) { + uint32_t offsetXIn = (i == 1) ? 
eyeWidthIn : 0; + D3D11_BOX srcBox = { offsetXIn, 0, 0, offsetXIn + eyeWidthIn, eyeHeightIn, 1 }; + + context->CopySubresourceRegion(Core::vrIntermediateColorIn[i]->resource.get(), 0, 0, 0, 0, colorSrc, 0, &srcBox); + context->CopySubresourceRegion(Core::vrIntermediateDepth[i]->resource.get(), 0, 0, 0, 0, depthSrc, 0, &srcBox); + context->CopySubresourceRegion(Core::vrIntermediateMotionVectors[i]->resource.get(), 0, 0, 0, 0, mvecSrc, 0, &srcBox); + if (transparencySrc) + context->CopySubresourceRegion(Core::vrIntermediateTransparencyMask[i]->resource.get(), 0, 0, 0, 0, transparencySrc, 0, &srcBox); + if (reactiveSrc) + context->CopySubresourceRegion(Core::vrIntermediateReactiveMask[i]->resource.get(), 0, 0, 0, 0, reactiveSrc, 0, &srcBox); + + // Reapply HMD hidden-area mask clear into the per-eye intermediate so DLSS + // history doesn't accumulate garbage from the masked-out region. + // Depth source is full SBS (read at per-eye offset); color destination is per-eye + // sized (write at offset 0). + globals::features::upscaling.ClearHMDMask( + Core::vrIntermediateColorIn[i]->uav.get(), + depthSRV, + eyeWidthIn, + eyeHeightIn, + i * eyeWidthIn, + 0); + } + + return true; + } + + bool FinalizePerEyeOutputs(ID3D11Resource* colorDst, uint32_t eyeWidthOut, uint32_t eyeHeightOut) + { + if (!colorDst) { + logger::error("[FOVEATED] FinalizePerEyeOutputs received null destination color resource"); + return false; + } + + for (uint32_t i = 0; i < 2; ++i) { + if (!Core::vrIntermediateColorOut[i]) { + logger::error("[FOVEATED] Missing per-eye output resource for eye {}", i); + return false; + } + } + + auto context = globals::d3d::context; + for (uint32_t i = 0; i < 2; ++i) { + uint32_t offsetXOut = (i == 1) ? 
eyeWidthOut : 0; + D3D11_BOX outBox = { 0, 0, 0, eyeWidthOut, eyeHeightOut, 1 }; + context->CopySubresourceRegion(colorDst, 0, offsetXOut, 0, 0, Core::vrIntermediateColorOut[i]->resource.get(), 0, &outBox); + } + + return true; + } + + void StretchDRSToFullEye( + ID3D11ShaderResourceView* renderSBSSRV, + ID3D11UnorderedAccessView* kMainUAV, + uint32_t dstOffsetX, + uint32_t dstWidth, + uint32_t dstHeight, + uint32_t srcOffsetX, + uint32_t srcWidth, + uint32_t srcHeight, + uint32_t srcEyeWidth, + uint32_t srcEyeHeight) + { + auto context = globals::d3d::context; + + if (!Core::vrSubrectStretchCS) { + Core::vrSubrectStretchCS.attach((ID3D11ComputeShader*)Util::CompileShader(L"Data/Shaders/Upscaling/FoveatedRender/SubrectStretchCS.hlsl", {}, "cs_5_0")); + Util::SetResourceName(Core::vrSubrectStretchCS.get(), "FoveatedRender::SubrectStretchCS"); + + D3D11_BUFFER_DESC cbDesc = {}; + cbDesc.ByteWidth = sizeof(StretchCB); + cbDesc.Usage = D3D11_USAGE_DYNAMIC; + cbDesc.BindFlags = D3D11_BIND_CONSTANT_BUFFER; + cbDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE; + if (FAILED(globals::d3d::device->CreateBuffer(&cbDesc, nullptr, Core::vrSubrectStretchCB.put()))) { + logger::error("[FOVEATED] Failed to create SubrectStretch constant buffer"); + // Drop the partially-attached CS so the next frame retries the + // whole init block — otherwise the outer !vrSubrectStretchCS + // guard above stays false forever and Faster mode is dead for + // the rest of the session. 
+ Core::vrSubrectStretchCS = nullptr; + return; + } + Util::SetResourceName(Core::vrSubrectStretchCB.get(), "FoveatedRender::SubrectStretchCB"); + + D3D11_SAMPLER_DESC sampDesc = {}; + sampDesc.Filter = D3D11_FILTER_MIN_MAG_LINEAR_MIP_POINT; + sampDesc.AddressU = D3D11_TEXTURE_ADDRESS_CLAMP; + sampDesc.AddressV = D3D11_TEXTURE_ADDRESS_CLAMP; + sampDesc.AddressW = D3D11_TEXTURE_ADDRESS_CLAMP; + if (FAILED(globals::d3d::device->CreateSamplerState(&sampDesc, Core::vrSubrectStretchSampler.put()))) { + logger::error("[FOVEATED] Failed to create SubrectStretch sampler"); + Core::vrSubrectStretchCS = nullptr; + Core::vrSubrectStretchCB = nullptr; + return; + } + Util::SetResourceName(Core::vrSubrectStretchSampler.get(), "FoveatedRender::SubrectStretchSampler"); + } + + if (!Core::vrSubrectStretchCS || !Core::vrSubrectStretchCB || !Core::vrSubrectStretchSampler) { + return; + } + + // Guard against a null destination UAV — CSSetUnorderedAccessViews + + // Dispatch with nullptr would either no-op silently or assert in + // debug builds. Returning lets the route's `routeHandled=false` path + // fall back to standard DLSS so users still see output. (CodeRabbit + // on PR #44.) + if (!kMainUAV) { + logger::error("[FOVEATED] StretchDRSToFullEye called with null kMainUAV"); + return; + } + + D3D11_MAPPED_SUBRESOURCE mapped{}; + // If Map fails the constant buffer keeps stale data from a prior + // dispatch. CSSetConstantBuffers + Dispatch would then run the + // shader against stale geometry/scale parameters, producing wrong + // pixels rather than no pixels. Early-return preserves the prior + // frame's output. (CodeRabbit on PR #44.) 
+ if (FAILED(context->Map(Core::vrSubrectStretchCB.get(), 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped))) { + logger::error("[FOVEATED] StretchDRSToFullEye Map(vrSubrectStretchCB) failed; skipping dispatch"); + return; + } + { + auto& enhSettings = globals::features::upscaling.foveatedRender.settings; + StretchCB cb = {}; + cb.data[0] = dstOffsetX; + cb.data[1] = dstWidth; + cb.data[2] = dstHeight; + cb.data[3] = srcOffsetX; + cb.data[4] = srcWidth; + cb.data[5] = srcHeight; + cb.data[6] = srcEyeWidth; + cb.data[7] = srcEyeHeight; + cb.stretchMode = enhSettings.stretchMode; + // Fixed 1.0 blur radius for the GaussianBlur stretch path. + cb.blurRadius = 1.0f; + cb.debugVisualize = enhSettings.debugVisualize; + std::memcpy(mapped.pData, &cb, sizeof(cb)); + context->Unmap(Core::vrSubrectStretchCB.get(), 0); + } + + context->CSSetShader(Core::vrSubrectStretchCS.get(), nullptr, 0); + ID3D11Buffer* cbs[1] = { Core::vrSubrectStretchCB.get() }; + context->CSSetConstantBuffers(0, 1, cbs); + ID3D11ShaderResourceView* srvs[1] = { renderSBSSRV }; + context->CSSetShaderResources(0, 1, srvs); + ID3D11SamplerState* samplers[1] = { Core::vrSubrectStretchSampler.get() }; + context->CSSetSamplers(0, 1, samplers); + ID3D11UnorderedAccessView* uavs[1] = { kMainUAV }; + context->CSSetUnorderedAccessViews(0, 1, uavs, nullptr); + + context->Dispatch((dstWidth + 7) / 8, (dstHeight + 7) / 8, 1); + + ID3D11ShaderResourceView* nullSRV[1] = { nullptr }; + ID3D11UnorderedAccessView* nullUAV[1] = { nullptr }; + ID3D11Buffer* nullCB[1] = { nullptr }; + ID3D11SamplerState* nullSampler[1] = { nullptr }; + context->CSSetShaderResources(0, 1, nullSRV); + context->CSSetUnorderedAccessViews(0, 1, nullUAV, nullptr); + context->CSSetConstantBuffers(0, 1, nullCB); + context->CSSetSamplers(0, 1, nullSampler); + context->CSSetShader(nullptr, nullptr, 0); + } + + void EnsureVRRenderSBS(uint32_t renderW, uint32_t renderH, ID3D11Resource* colorSrc) + { + if (!Core::vrRenderSBS || Core::vrRenderSBSW != renderW || 
Core::vrRenderSBSH != renderH) { + // UAV is required for the Faster-mode HMD mask clear pass (ClearHMDMask + // writes through the UAV before DLSS reads the SBS via extent offsets). + Core::vrRenderSBS = CreateTextureFromSource(colorSrc, renderW, renderH, false, true, true, "FoveatedRender_RenderSBS"); + Core::vrRenderSBSW = renderW; + Core::vrRenderSBSH = renderH; + } + } + + void EnsureFasterOutputTextures(uint32_t subOutW, uint32_t subOutH, ID3D11Resource* colorSrc) + { + bool needsRecreate = !Core::vrFasterColorOut[0] || + Core::vrFasterOutW != subOutW || Core::vrFasterOutH != subOutH; + if (!needsRecreate) + return; + for (int i = 0; i < 2; i++) { + std::string suffix = (i == 0) ? "Left" : "Right"; + Core::vrFasterColorOut[i] = CreateTextureFromSource(colorSrc, subOutW, subOutH, false, true, false, ("FoveatedRender_Faster_ColorOut_" + suffix).c_str()); + } + Core::vrFasterOutW = subOutW; + Core::vrFasterOutH = subOutH; + } + + uint64_t ComputeSubrectUVHash(const Util::Subrect::UVRegion& leftUV, + const Util::Subrect::UVRegion& rightUV, uint32_t mode) + { + uint64_t h = 0; + auto mix = [&](uint64_t v) { h ^= v + 0x9e3779b97f4a7c15ULL + (h << 12) + (h >> 4); }; + auto mixUV = [&](const Util::Subrect::UVRegion& uv) { + mix(std::hash{}(uv.x)); + mix(std::hash{}(uv.y)); + mix(std::hash{}(uv.w)); + mix(std::hash{}(uv.h)); + }; + mixUV(leftUV); + mixUV(rightUV); + mix(std::hash{}(mode)); + return h; + } + + void SnapshotSBS(ID3D11Resource* src, uint32_t renderW, uint32_t renderH) + { + EnsureVRRenderSBS(renderW, renderH, src); + auto context = globals::d3d::context; + D3D11_BOX drsBox = { 0, 0, 0, renderW, renderH, 1 }; + context->CopySubresourceRegion(Core::vrRenderSBS->resource.get(), 0, 0, 0, 0, src, 0, &drsBox); + } + + void StretchDRSBothEyes(ID3D11UnorderedAccessView* dstUAV, uint32_t eyeWidthOut, uint32_t eyeHeightOut, + uint32_t eyeWidthIn, uint32_t eyeHeightIn, uint32_t renderW, uint32_t renderH, + ID3D11ShaderResourceView* srcOverride) + { + // Snapshot 
creation can fail or be skipped on a fresh frame; degrade + // rather than dereference vrRenderSBS->srv on the null path. + auto* src = srcOverride ? srcOverride : + (Core::vrRenderSBS ? Core::vrRenderSBS->srv.get() : nullptr); + if (!src) { + logger::error("[FOVEATED] StretchDRSBothEyes missing source SRV"); + return; + } + for (uint32_t i = 0; i < 2; ++i) { + uint32_t dstX = (i == 1) ? eyeWidthOut : 0; + uint32_t srcX = (i == 1) ? eyeWidthIn : 0; + StretchDRSToFullEye( + src, dstUAV, + dstX, eyeWidthOut, eyeHeightOut, + srcX, renderW, renderH, + eyeWidthIn, eyeHeightIn); + } + } + + void BlendSubrectToOutput(ID3D11Resource* dlssSrc, ID3D11Resource* dst, + uint32_t dstOffsetX, uint32_t dstOffsetY, uint32_t subWidth, uint32_t subHeight, uint32_t srcOffsetX) + { + auto context = globals::d3d::context; + D3D11_BOX srcBox = { srcOffsetX, 0, 0, srcOffsetX + subWidth, subHeight, 1 }; + context->CopySubresourceRegion(dst, 0, dstOffsetX, dstOffsetY, 0, dlssSrc, 0, &srcBox); + } + +} // namespace FoveatedRenderImpl::Ops + +namespace FoveatedRenderImpl +{ + bool Core::PrepareVRPerEyeInputs( + ID3D11Resource* colorSrc, + ID3D11Resource* depthSrc, + ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, + ID3D11Resource* transparencySrc, + uint32_t eyeWidthIn, + uint32_t eyeHeightIn, + uint32_t eyeWidthOut, + uint32_t eyeHeightOut) + { + return Ops::PreparePerEyeInputs( + colorSrc, + depthSrc, + mvecSrc, + reactiveSrc, + transparencySrc, + eyeWidthIn, + eyeHeightIn, + eyeWidthOut, + eyeHeightOut); + } + + bool Core::FinalizeVRPerEyeOutputs( + ID3D11Resource* colorDst, + uint32_t eyeWidthOut, + uint32_t eyeHeightOut) + { + return Ops::FinalizePerEyeOutputs(colorDst, eyeWidthOut, eyeHeightOut); + } + + void Core::ClearResources() + { + for (int i = 0; i < 2; ++i) { + vrIntermediateColorIn[i].reset(); + vrIntermediateColorOut[i].reset(); + vrIntermediateDepth[i].reset(); + vrIntermediateMotionVectors[i].reset(); + vrIntermediateReactiveMask[i].reset(); + 
vrIntermediateTransparencyMask[i].reset(); + + vrSubrectColorIn[i].reset(); + vrSubrectColorOut[i].reset(); + vrSubrectDepth[i].reset(); + vrSubrectMotionVectors[i].reset(); + vrSubrectReactiveMask[i].reset(); + vrSubrectTransparencyMask[i].reset(); + } + vrSubrectInW = vrSubrectInH = vrSubrectOutW = vrSubrectOutH = 0; + + vrRenderSBS.reset(); + vrRenderSBSW = vrRenderSBSH = 0; + + vrFasterColorOut[0].reset(); + vrFasterColorOut[1].reset(); + vrFasterOutW = vrFasterOutH = 0; + + activeSubrectUVHash = 0; + } + + void Core::ClearShaderCache() + { + vrSubrectStretchCS = nullptr; + vrSubrectStretchCB = nullptr; + vrSubrectStretchSampler = nullptr; + } +} diff --git a/src/Features/Upscaling/FoveatedRender/Core.h b/src/Features/Upscaling/FoveatedRender/Core.h new file mode 100644 index 0000000000..7b4a04e869 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Core.h @@ -0,0 +1,93 @@ +#pragma once + +// ============================================================================ +// FoveatedRenderImpl::Core — GPU resource pool & mode-dispatch entry point +// ============================================================================ +// +// Owns all per-mode intermediate textures (Default / Faster), compute-shader +// objects (subrect stretch), and the public entry points consumed by +// Upscaling.cpp. +// +// ============================================================================ + +#include "Buffer.h" +#include "Params.h" +#include +#include + +class Streamline; + +namespace FoveatedRenderImpl +{ + class Core + { + public: + // Stage1: dispatches across Default / Faster modes. + static bool ExecuteVRDlssCore(Streamline& streamline, + ID3D11Resource* upscalingTexture, + ID3D11Resource* depthTexture, + ID3D11Resource* reactiveMask, + ID3D11Resource* transparencyMask, + ID3D11Resource* motionVectors); + + // Shared VR per-eye preprocessing/finalization for non-DLSS callers (e.g. FSR). 
+ static bool PrepareVRPerEyeInputs( + ID3D11Resource* colorSrc, + ID3D11Resource* depthSrc, + ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, + ID3D11Resource* transparencySrc, + uint32_t eyeWidthIn, + uint32_t eyeHeightIn, + uint32_t eyeWidthOut, + uint32_t eyeHeightOut); + + static bool FinalizeVRPerEyeOutputs( + ID3D11Resource* colorDst, + uint32_t eyeWidthOut, + uint32_t eyeHeightOut); + + // Release all GPU resources owned by Core. + static void ClearResources(); + static void ClearShaderCache(); + + // ── Own VR resources (independent from Upscaling) ── + + // Per-eye intermediate buffers (Default full-eye mode) + static inline eastl::unique_ptr vrIntermediateColorIn[2]; + static inline eastl::unique_ptr vrIntermediateColorOut[2]; + static inline eastl::unique_ptr vrIntermediateDepth[2]; + static inline eastl::unique_ptr vrIntermediateMotionVectors[2]; + static inline eastl::unique_ptr vrIntermediateReactiveMask[2]; + static inline eastl::unique_ptr vrIntermediateTransparencyMask[2]; + + // Subrect-sized textures (Default/Faster subrect mode) + static inline eastl::unique_ptr vrSubrectColorIn[2]; + static inline eastl::unique_ptr vrSubrectColorOut[2]; + static inline eastl::unique_ptr vrSubrectDepth[2]; + static inline eastl::unique_ptr vrSubrectMotionVectors[2]; + static inline eastl::unique_ptr vrSubrectReactiveMask[2]; + static inline eastl::unique_ptr vrSubrectTransparencyMask[2]; + static inline uint32_t vrSubrectInW = 0, vrSubrectInH = 0, vrSubrectOutW = 0, vrSubrectOutH = 0; + + // Faster mode per-eye output textures (subOutW × subOutH) + static inline eastl::unique_ptr vrFasterColorOut[2]; + static inline uint32_t vrFasterOutW = 0, vrFasterOutH = 0; + + // DRS region copy (render-resolution SBS) + static inline eastl::unique_ptr vrRenderSBS; + static inline uint32_t vrRenderSBSW = 0, vrRenderSBSH = 0; + + // DRS stretch compute shader resources + static inline winrt::com_ptr vrSubrectStretchCS; + static inline winrt::com_ptr 
vrSubrectStretchCB; + static inline winrt::com_ptr vrSubrectStretchSampler; + + // Subrect UV hash for resource recreation detection + static inline uint64_t activeSubrectUVHash = 0; + + private: + static bool ExecuteDefaultMode(Streamline& streamline, const VRDlssParams& p); + static bool ExecuteFasterMode(Streamline& streamline, const VRDlssParams& p); + }; +} diff --git a/src/Features/Upscaling/FoveatedRender/Modes.cpp b/src/Features/Upscaling/FoveatedRender/Modes.cpp new file mode 100644 index 0000000000..56628cc365 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Modes.cpp @@ -0,0 +1,254 @@ +// ============================================================================ +// Modes.cpp — Default / Faster DLSS execution strategies +// ============================================================================ +// +// Each mode composes Ops primitives (snapshot, stretch, crop, blend…) in a +// different order. Router resolves VRDlssParams and dispatches. +// +// ============================================================================ + +#include "Core.h" +#include "Ops.h" +#include "Params.h" + +#include "../../../Globals.h" +#include "../../../Utils/Subrect.h" +#include "../../Upscaling.h" +#include "../Streamline.h" + +namespace FoveatedRenderImpl +{ + using namespace Ops; + + // ── Router: resolves params via Params module, dispatches to the selected mode ── + + bool Core::ExecuteVRDlssCore(Streamline& streamline, + ID3D11Resource* upscalingTexture, ID3D11Resource* depthTexture, + ID3D11Resource* reactiveMask, ID3D11Resource* transparencyMask, ID3D11Resource* motionVectors) + { + auto p = VRDlssParams::Resolve(upscalingTexture, depthTexture, reactiveMask, transparencyMask, motionVectors); + + // Detect UV/mode change → destroy DLSS resources so SL recreates them at + // the new size. Both eye UVs feed the hash; asymmetric presets (e.g. + // Nasal Convergence) can change rightUV while leftUV stays put. 
+ uint64_t uvHash = ComputeSubrectUVHash(p.leftUV, p.rightUV, (uint32_t)p.mode); + if (uvHash != Core::activeSubrectUVHash) { + logger::info("[FOVEATED] Subrect UV or mode changed, recreating DLSS resources"); + streamline.DestroyDLSSResources(); + Core::activeSubrectUVHash = uvHash; + } + + switch (p.mode) { + case FoveatedRender::DlssMode::kFaster: + return ExecuteFasterMode(streamline, p); + default: + return ExecuteDefaultMode(streamline, p); + } + } + + // ── Default mode: per-eye isolation, 2 resource sets, 2 evaluates ── + + bool Core::ExecuteDefaultMode(Streamline& streamline, const VRDlssParams& p) + { + // Subrect path needs colorDstUAV (StretchDRSBothEyes writes through it). + // Full-eye path doesn't touch it. Return false on the subrect path so + // the router falls back to standard DLSS rather than hitting the null + // guard inside StretchDRSToFullEye every frame. (CodeRabbit on PR #44.) + if (!p.isFullEye && !p.colorDstUAV) { + logger::error("[FOVEATED] ExecuteDefaultMode subrect path missing colorDstUAV — falling back"); + return false; + } + if (p.isFullEye) { + // Full-eye path: same as standard VR DLSS + if (!PreparePerEyeInputs( + p.colorSrc, p.depthTexture, p.motionVectors, p.reactiveMask, p.transparencyMask, + p.eyeWidthIn, p.eyeHeightIn, p.eyeWidthOut, p.eyeHeightOut)) + return false; + + for (uint32_t i = 0; i < 2; ++i) { + sl::ViewportHandle vp = (i == 1) ? streamline.viewportRight : streamline.viewport; + sl::Extent extentIn{ 0, 0, p.eyeWidthIn, p.eyeHeightIn }; + sl::Extent extentOut{ 0, 0, p.eyeWidthOut, p.eyeHeightOut }; + streamline.EvaluateDLSS(vp, i, + Core::vrIntermediateColorIn[i]->resource.get(), Core::vrIntermediateColorOut[i]->resource.get(), + Core::vrIntermediateDepth[i]->resource.get(), Core::vrIntermediateMotionVectors[i]->resource.get(), + p.reactiveMask ? Core::vrIntermediateReactiveMask[i]->resource.get() : nullptr, + p.transparencyMask ? 
Core::vrIntermediateTransparencyMask[i]->resource.get() : nullptr, + extentIn, extentOut, p.eyeWidthOut); + } + + return FinalizePerEyeOutputs(p.colorDst, p.eyeWidthOut, p.eyeHeightOut); + } + + // ── Subrect path: crop per-eye, DLSS at subrect size, stretch back ── + + const Util::Subrect::UVRegion* eyeUVs[2] = { &p.leftUV, &p.rightUV }; + + // NOTE: EnsureVRSubrectTextures allocates a single shared per-eye texture + // set sized to LEFT-eye subrect dimensions. Correct only while + // Util::Subrect's auto-mirror keeps leftUV.w/h == rightUV.w/h — the + // per-eye loop below uses the eye's own uv for the real extents. + uint32_t allocSubInW = std::max(1, (uint32_t)(p.eyeWidthIn * p.leftUV.w)); + uint32_t allocSubInH = std::max(1, (uint32_t)(p.eyeHeightIn * p.leftUV.h)); + uint32_t allocSubOutW = std::max(1, (uint32_t)(p.eyeWidthOut * p.leftUV.w)); + uint32_t allocSubOutH = std::max(1, (uint32_t)(p.eyeHeightOut * p.leftUV.h)); + + EnsureVRSubrectTextures(allocSubInW, allocSubInH, allocSubOutW, allocSubOutH, + p.colorSrc, p.motionVectors, p.reactiveMask, p.transparencyMask); + + // Snapshot + Stretch DRS → kMAIN (fill full-eye background) + SnapshotSBS(p.colorSrc, p.renderW, p.renderH); + + StretchDRSBothEyes(p.colorDstUAV, p.eyeWidthOut, p.eyeHeightOut, p.eyeWidthIn, p.eyeHeightIn, p.renderW, p.renderH); + + // Crop subrect per-eye from snapshot (not kMAIN which was overwritten by stretch) + auto context = globals::d3d::context; + for (uint32_t i = 0; i < 2; ++i) { + const auto& uv = *eyeUVs[i]; + // Per-eye sizing — right eye uses rightUV.w/h, not leftUV. + uint32_t subInW = std::max(1, (uint32_t)(p.eyeWidthIn * uv.w)); + uint32_t subInH = std::max(1, (uint32_t)(p.eyeHeightIn * uv.h)); + uint32_t subOutW = std::max(1, (uint32_t)(p.eyeWidthOut * uv.w)); + uint32_t subOutH = std::max(1, (uint32_t)(p.eyeHeightOut * uv.h)); + + uint32_t cropX = (uint32_t)(uv.x * p.eyeWidthIn); + uint32_t cropY = (uint32_t)(uv.y * p.eyeHeightIn); + uint32_t sbsX = (i == 1 ? 
p.eyeWidthIn : 0) + cropX; + D3D11_BOX sbsCrop = { sbsX, cropY, 0, sbsX + subInW, cropY + subInH, 1 }; + + context->CopySubresourceRegion(Core::vrSubrectColorIn[i]->resource.get(), 0, 0, 0, 0, Core::vrRenderSBS->resource.get(), 0, &sbsCrop); + context->CopySubresourceRegion(Core::vrSubrectDepth[i]->resource.get(), 0, 0, 0, 0, p.depthTexture, 0, &sbsCrop); + context->CopySubresourceRegion(Core::vrSubrectMotionVectors[i]->resource.get(), 0, 0, 0, 0, p.motionVectors, 0, &sbsCrop); + if (p.reactiveMask) + context->CopySubresourceRegion(Core::vrSubrectReactiveMask[i]->resource.get(), 0, 0, 0, 0, p.reactiveMask, 0, &sbsCrop); + if (p.transparencyMask) + context->CopySubresourceRegion(Core::vrSubrectTransparencyMask[i]->resource.get(), 0, 0, 0, 0, p.transparencyMask, 0, &sbsCrop); + + sl::ViewportHandle vp = (i == 1) ? streamline.viewportRight : streamline.viewport; + sl::Extent extentIn{ 0, 0, subInW, subInH }; + sl::Extent extentOut{ 0, 0, subOutW, subOutH }; + streamline.EvaluateDLSS(vp, i, + Core::vrSubrectColorIn[i]->resource.get(), Core::vrSubrectColorOut[i]->resource.get(), + Core::vrSubrectDepth[i]->resource.get(), Core::vrSubrectMotionVectors[i]->resource.get(), + p.reactiveMask ? Core::vrSubrectReactiveMask[i]->resource.get() : nullptr, + p.transparencyMask ? Core::vrSubrectTransparencyMask[i]->resource.get() : nullptr, + extentIn, extentOut, subOutW, subOutH); + } + + // Write DLSS output back at subrect position (with optional blend) + for (uint32_t i = 0; i < 2; ++i) { + const auto& uv = *eyeUVs[i]; + // Per-eye sizing. + uint32_t subOutW = std::max(1, (uint32_t)(p.eyeWidthOut * uv.w)); + uint32_t subOutH = std::max(1, (uint32_t)(p.eyeHeightOut * uv.h)); + + uint32_t dstCropX = (uint32_t)(uv.x * p.eyeWidthOut); + uint32_t dstCropY = (uint32_t)(uv.y * p.eyeHeightOut); + uint32_t dstX = (i == 1 ? 
p.eyeWidthOut : 0) + dstCropX; + BlendSubrectToOutput(Core::vrSubrectColorOut[i]->resource.get(), p.colorDst, + dstX, dstCropY, subOutW, subOutH); + } + + return true; + } + + // ── Faster mode: DLSS reads directly from SBS via extents, per-eye output, 2 evaluates ── + // Input: kMAIN/depth/mvec SBS textures using extent offsets (zero input copies). + // Output: per-eye independent textures with extent {0,0}. + // Flow: DLSS read → snapshot+stretch background → copy outputs back to kMAIN. + + bool Core::ExecuteFasterMode(Streamline& streamline, const VRDlssParams& p) + { + // Subrect path needs colorDstUAV (StretchDRSBothEyes writes through it + // in Step 3). Full-eye Faster skips Step 3 — don't reject it here just + // because the UAV isn't bound. + if (!p.isFullEye && !p.colorDstUAV) { + logger::error("[FOVEATED] ExecuteFasterMode subrect path missing colorDstUAV — falling back"); + return false; + } + const Util::Subrect::UVRegion* eyeUVs[2] = { &p.leftUV, &p.rightUV }; + + // NOTE: EnsureFasterOutputTextures allocates one per-eye texture set + // sized to LEFT-eye subrect dimensions. Correct only while Util::Subrect + // auto-mirror keeps leftUV.w/h == rightUV.w/h. Per-eye DLSS extents + // below use the eye's own uv. + uint32_t allocSubOutW = p.isFullEye ? p.eyeWidthOut : std::max(1, (uint32_t)(p.eyeWidthOut * p.leftUV.w)); + uint32_t allocSubOutH = p.isFullEye ? p.eyeHeightOut : std::max(1, (uint32_t)(p.eyeHeightOut * p.leftUV.h)); + + // Step 1: Ensure per-eye output textures + EnsureFasterOutputTextures(allocSubOutW, allocSubOutH, p.colorSrc); + + // Step 2a: Snapshot kMAIN into vrRenderSBS so we can clear the HMD + // hidden-area ring without writing to kMAIN itself. Without this clear + // DLSS's temporal accumulation drags Skyrim's default sky clear from + // the masked-out edge into the visible region on fast head motion — + // the standard Streamline path (Streamline.cpp) and Default mode both + // pre-clear via per-eye intermediates. 
+ SnapshotSBS(p.colorSrc, p.renderW, p.renderH); + auto& upscaling = globals::features::upscaling; + auto* depthSRV = globals::game::renderer->GetDepthStencilData() + .depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN] + .depthSRV; + if (Core::vrRenderSBS && Core::vrRenderSBS->uav && depthSRV) { + // Color target IS the SBS snapshot (not a per-eye buffer), so + // colorOffsetX must select the eye's half — same as depthOffsetX. + // ClearHMDMaskCS's default contract assumes the color target is + // per-eye (colorOffsetX = 0) and was written for Streamline's + // per-eye intermediates; here we're routing both eyes through one + // SBS texture so we override both offsets together. + for (uint32_t i = 0; i < 2; ++i) { + const uint32_t eyeOffsetX = i * p.eyeWidthIn; + upscaling.ClearHMDMask(Core::vrRenderSBS->uav.get(), depthSRV, + p.eyeWidthIn, p.eyeHeightIn, eyeOffsetX, eyeOffsetX); + } + } + ID3D11Resource* dlssColorSrc = (Core::vrRenderSBS ? Core::vrRenderSBS->resource.get() : p.colorSrc); + + // Step 2b: DLSS reads from the mask-cleared SBS snapshot via extent offsets + // → per-eye output. sl::Extent field order is {top, left, width, height}. + for (uint32_t i = 0; i < 2; ++i) { + const auto& uv = *eyeUVs[i]; + // Per-eye sizing. + uint32_t subInW = p.isFullEye ? p.eyeWidthIn : std::max(1, (uint32_t)(p.eyeWidthIn * uv.w)); + uint32_t subInH = p.isFullEye ? p.eyeHeightIn : std::max(1, (uint32_t)(p.eyeHeightIn * uv.h)); + uint32_t subOutW = p.isFullEye ? p.eyeWidthOut : std::max(1, (uint32_t)(p.eyeWidthOut * uv.w)); + uint32_t subOutH = p.isFullEye ? p.eyeHeightOut : std::max(1, (uint32_t)(p.eyeHeightOut * uv.h)); + + uint32_t cropX = p.isFullEye ? 0 : (uint32_t)(uv.x * p.eyeWidthIn); + uint32_t cropY = p.isFullEye ? 0 : (uint32_t)(uv.y * p.eyeHeightIn); + uint32_t inOffsetX = (i == 1 ? p.eyeWidthIn : 0) + cropX; + uint32_t inOffsetY = cropY; + + sl::ViewportHandle vp = (i == 1) ? 
streamline.viewportRight : streamline.viewport; + sl::Extent extentIn{ inOffsetY, inOffsetX, subInW, subInH }; + sl::Extent extentOut{ 0, 0, subOutW, subOutH }; + + streamline.EvaluateDLSS(vp, i, + dlssColorSrc, Core::vrFasterColorOut[i]->resource.get(), + p.depthTexture, p.motionVectors, + p.reactiveMask, p.transparencyMask, + extentIn, extentOut, subOutW, subOutH); + } + + // Step 3: Stretch DRS → kMAIN (subrect only) — snapshot reused from Step 2a. + if (!p.isFullEye) { + StretchDRSBothEyes(p.colorDstUAV, p.eyeWidthOut, p.eyeHeightOut, p.eyeWidthIn, p.eyeHeightIn, p.renderW, p.renderH); + } + + // Step 4: Copy DLSS output back (with optional blend) + for (uint32_t i = 0; i < 2; ++i) { + const auto& uv = *eyeUVs[i]; + // Per-eye sizing. + uint32_t subOutW = p.isFullEye ? p.eyeWidthOut : std::max(1, (uint32_t)(p.eyeWidthOut * uv.w)); + uint32_t subOutH = p.isFullEye ? p.eyeHeightOut : std::max(1, (uint32_t)(p.eyeHeightOut * uv.h)); + + uint32_t dstCropX = p.isFullEye ? 0 : (uint32_t)(uv.x * p.eyeWidthOut); + uint32_t dstCropY = p.isFullEye ? 0 : (uint32_t)(uv.y * p.eyeHeightOut); + uint32_t dstX = (i == 1 ? p.eyeWidthOut : 0) + dstCropX; + BlendSubrectToOutput(Core::vrFasterColorOut[i]->resource.get(), p.colorDst, + dstX, dstCropY, subOutW, subOutH); + } + + return true; + } +} diff --git a/src/Features/Upscaling/FoveatedRender/Ops.h b/src/Features/Upscaling/FoveatedRender/Ops.h new file mode 100644 index 0000000000..44bcc8cd61 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Ops.h @@ -0,0 +1,66 @@ +#pragma once + +#include "Core.h" +#include "Utils/Subrect.h" + +#include +#include + +class Texture2D; + +// Primitive operations for the FoveatedRender VR DLSS pipeline. +// +// Each function is a self-contained building block. Mode pipelines in +// Modes.cpp compose these in different orders to form the Default and +// Faster strategies. +namespace FoveatedRenderImpl::Ops +{ + // Texture creation helper. 
+ eastl::unique_ptr CreateTextureFromSource(ID3D11Resource* src, uint32_t width, uint32_t height, + bool copyBindFlags = false, bool createSRV = false, bool createUAV = false, const char* name = nullptr); + + // Lazy/idempotent resource ensure helpers. + void EnsureVRIntermediateTextures(uint32_t inW, uint32_t inH, uint32_t outW, uint32_t outH, + ID3D11Resource* colorSrc, ID3D11Resource* mvecSrc, ID3D11Resource* reactiveSrc, ID3D11Resource* transparencySrc); + + void EnsureVRSubrectTextures(uint32_t subInW, uint32_t subInH, uint32_t subOutW, uint32_t subOutH, + ID3D11Resource* colorSrc, ID3D11Resource* mvecSrc, ID3D11Resource* reactiveSrc, ID3D11Resource* transparencySrc); + + void EnsureFasterOutputTextures(uint32_t subOutW, uint32_t subOutH, ID3D11Resource* colorSrc); + + void EnsureVRRenderSBS(uint32_t renderW, uint32_t renderH, ID3D11Resource* colorSrc); + + // Copy full-eye slices from SBS textures into per-eye intermediates. + bool PreparePerEyeInputs(ID3D11Resource* colorSrc, ID3D11Resource* depthSrc, ID3D11Resource* mvecSrc, + ID3D11Resource* reactiveSrc, ID3D11Resource* transparencySrc, + uint32_t eyeWidthIn, uint32_t eyeHeightIn, uint32_t eyeWidthOut, uint32_t eyeHeightOut); + + // Copy per-eye output intermediates back into the SBS output texture. + bool FinalizePerEyeOutputs(ID3D11Resource* colorDst, uint32_t eyeWidthOut, uint32_t eyeHeightOut); + + // Snapshot kMAIN DRS data into vrRenderSBS. + void SnapshotSBS(ID3D11Resource* src, uint32_t renderW, uint32_t renderH); + + // Compute-shader stretch of a single eye region from renderSBS → kMAIN. + void StretchDRSToFullEye(ID3D11ShaderResourceView* renderSBSSRV, ID3D11UnorderedAccessView* kMainUAV, + uint32_t dstOffsetX, uint32_t dstWidth, uint32_t dstHeight, + uint32_t srcOffsetX, uint32_t srcWidth, uint32_t srcHeight, + uint32_t srcEyeWidth, uint32_t srcEyeHeight); + + // StretchDRS for both eyes (snapshot must already exist in vrRenderSBS). 
+ void StretchDRSBothEyes(ID3D11UnorderedAccessView* dstUAV, uint32_t eyeWidthOut, uint32_t eyeHeightOut, + uint32_t eyeWidthIn, uint32_t eyeHeightIn, uint32_t renderW, uint32_t renderH, + ID3D11ShaderResourceView* srcOverride = nullptr); + + // Hard-copy a DLSS subrect output onto the destination at (offsetX, offsetY). + // No feather/dither — straight CopySubresourceRegion. + void BlendSubrectToOutput(ID3D11Resource* dlssSrc, ID3D11Resource* dst, + uint32_t dstOffsetX, uint32_t dstOffsetY, uint32_t subWidth, uint32_t subHeight, uint32_t srcOffsetX = 0); + + // Hash of per-eye UVs + mode for change detection (forces SL DLSS resource + // recreation). Both eyes are mixed in so asymmetric presets — e.g. Nasal + // Convergence, where rightUV differs from leftUV — don't collide on a + // left-eye-only hash and skip SL recreation. + uint64_t ComputeSubrectUVHash(const Util::Subrect::UVRegion& leftUV, + const Util::Subrect::UVRegion& rightUV, uint32_t mode); +} diff --git a/src/Features/Upscaling/FoveatedRender/Params.cpp b/src/Features/Upscaling/FoveatedRender/Params.cpp new file mode 100644 index 0000000000..ad6eb610c8 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Params.cpp @@ -0,0 +1,70 @@ +#include "Params.h" + +#include "../../../State.h" +#include "../../../Utils/Game.h" +#include "../../Upscaling.h" +#include "../FoveatedRender.h" +#include "../PerfMode.h" + +namespace FoveatedRenderImpl +{ + VRDlssParams VRDlssParams::Resolve( + ID3D11Resource* upscalingTexture, + ID3D11Resource* depth, + ID3D11Resource* reactive, + ID3D11Resource* transparency, + ID3D11Resource* mvec) + { + VRDlssParams p{}; + + // Dimensions. With DLSSperf (PerfMode) active, the engine RTs (kMAIN, + // depth, mvec) are allocated at RenderRes and state->screenSize is + // spoofed to RenderRes too. PerfMode owns a private DisplayRes + // testTexture that DLSS must target. 
Mirror Streamline::Upscale's + // plumbing (Streamline.cpp:617-626) so the foveated route works in + // both stacks: input extents read from kMAIN at RenderRes, output + // extents and colorDst point at DisplayRes / testTexture. + auto& perfMode = globals::features::upscaling.perfMode; + const bool dlssperfActive = perfMode.IsHookActive() && perfMode.GetTestTexture(); + + const auto screenSize = globals::state->screenSize; + const auto renderSize = Util::ConvertToDynamic(screenSize); + const auto displaySize = dlssperfActive ? perfMode.GetDisplayScreenSize() : screenSize; + + p.renderW = (uint32_t)renderSize.x; + p.renderH = (uint32_t)renderSize.y; + p.eyeWidthIn = (uint32_t)(renderSize.x / 2); + p.eyeHeightIn = (uint32_t)renderSize.y; + p.eyeWidthOut = (uint32_t)(displaySize.x / 2); + p.eyeHeightOut = (uint32_t)displaySize.y; + + // Textures. With DLSSperf, DLSS output lands in PerfMode's testTexture + // (DisplayRes); the stretched periphery also targets the testTexture's + // UAV. Without DLSSperf, both alias kMAIN at full size. + p.colorSrc = upscalingTexture; + p.colorDst = dlssperfActive ? static_cast(perfMode.GetTestTexture()) : upscalingTexture; + p.colorDstUAV = dlssperfActive ? perfMode.GetTestTextureUAV() : + globals::game::renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN].UAV; + + p.depthTexture = depth; + p.reactiveMask = reactive; + p.transparencyMask = transparency; + p.motionVectors = mvec; + + // Mode & subrect. PR-1's stereo Subrect API: GetUV() returns the + // primary UV (= left-eye in stereo mode); GetRightEyeUV() returns + // the mirrored right-eye UV. + auto& enhancer = globals::features::upscaling.foveatedRender; + p.mode = enhancer.GetDlssMode(); + p.leftUV = enhancer.subrectController.GetUV(); + p.rightUV = enhancer.subrectController.GetRightEyeUV(); + p.isFullEye = (p.leftUV.w >= 0.999f && p.leftUV.h >= 0.999f); + + // Jitter — ConfigureUpscaling already computed correct DLSS jitter. 
+ auto& upscaling = globals::features::upscaling; + p.jitterX = upscaling.jitter.x; + p.jitterY = upscaling.jitter.y; + + return p; + } +} diff --git a/src/Features/Upscaling/FoveatedRender/Params.h b/src/Features/Upscaling/FoveatedRender/Params.h new file mode 100644 index 0000000000..eb85cd70d2 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Params.h @@ -0,0 +1,50 @@ +#pragma once + +#include "../FoveatedRender.h" +#include "Utils/Subrect.h" +#include + +namespace FoveatedRenderImpl +{ + // Unified parameter block consumed by Mode functions. Resolved from + // current global state — when DLSSperf is active, Params::Resolve routes + // `colorDst`/`colorDstUAV` and the output extents through PerfMode's + // testTexture (see Params.cpp). + struct VRDlssParams + { + // Dimensions + uint32_t renderW; // SBS render width (after DRS) + uint32_t renderH; // SBS render height (after DRS) + uint32_t eyeWidthIn; // per-eye input (render) width + uint32_t eyeHeightIn; // per-eye input (render) height + uint32_t eyeWidthOut; // per-eye output (display) width + uint32_t eyeHeightOut; // per-eye output (display) height + + // Textures + ID3D11Resource* colorSrc; // input color (kMAIN) + ID3D11Resource* colorDst; // output color (kMAIN, or PerfMode's testTexture when DLSSperf is active) + ID3D11UnorderedAccessView* colorDstUAV; // UAV for stretch output target + ID3D11Resource* depthTexture; + ID3D11Resource* reactiveMask; + ID3D11Resource* transparencyMask; + ID3D11Resource* motionVectors; + + // Mode & subrect (mode set: kDefault, kFaster). + FoveatedRender::DlssMode mode; + Util::Subrect::UVRegion leftUV; + Util::Subrect::UVRegion rightUV; + bool isFullEye; + + // Jitter (pixel-space, render resolution) + float jitterX; + float jitterY; + + // Build a complete parameter block from current global state. 
+ static VRDlssParams Resolve( + ID3D11Resource* upscalingTexture, + ID3D11Resource* depthTexture, + ID3D11Resource* reactiveMask, + ID3D11Resource* transparencyMask, + ID3D11Resource* motionVectors); + }; +} diff --git a/src/Features/Upscaling/FoveatedRender/Postprocess.cpp b/src/Features/Upscaling/FoveatedRender/Postprocess.cpp new file mode 100644 index 0000000000..535eea89fa --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Postprocess.cpp @@ -0,0 +1,62 @@ +#include "Postprocess.h" + +#include "../../../Globals.h" +#include "../../../State.h" +#include "../../Upscaling.h" +#include "../FoveatedRender.h" + +#include + +namespace FoveatedRenderImpl +{ + bool Postprocess::ApplyDlssSharpening(Upscaling& upscaling) + { + // sharpnessDLSS <= 0 is the single disable signal — sharpness lives on + // Upscaling::Settings so the route shares the global slider. + const float sharpnessSetting = upscaling.settings.sharpnessDLSS; + if (sharpnessSetting <= 0.0f) { + return true; + } + + if (!upscaling.sharpenerTexture || !upscaling.sharpenerTexture->uav || !upscaling.sharpenerTexture->resource) { + logger::error("[FOVEATED] Missing sharpener resources"); + return false; + } + + auto context = globals::d3d::context; + auto renderer = globals::game::renderer; + if (!context || !renderer) { + logger::error("[FOVEATED] Missing D3D context or renderer for sharpening"); + return false; + } + auto& main = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN]; + + if (!main.SRV) { + logger::error("[FOVEATED] Missing main SRV for sharpening"); + return false; + } + + // Same exponential mapping Upscaling::ApplySharpening uses: lower + // setting = stronger sharpen. + float currentSharpness = (-2.0f * sharpnessSetting) + 2.0f; + currentSharpness = exp2(-currentSharpness); + + // In-place RCAS on kMAIN through sharpenerTexture. 
+ ID3D11Resource* mainResource = nullptr; + main.SRV->GetResource(&mainResource); + if (!mainResource) { + logger::error("[FOVEATED] Failed to acquire main resource for sharpening"); + return false; + } + + context->OMSetRenderTargets(0, nullptr, nullptr); + upscaling.rcas.ApplySharpen(main.SRV, upscaling.sharpenerTexture->uav.get(), currentSharpness); + context->CopyResource(mainResource, upscaling.sharpenerTexture->resource.get()); + mainResource->Release(); + + if (globals::game::stateUpdateFlags) { + globals::game::stateUpdateFlags->set(RE::BSGraphics::ShaderFlags::DIRTY_RENDERTARGET); + } + return true; + } +} diff --git a/src/Features/Upscaling/FoveatedRender/Postprocess.h b/src/Features/Upscaling/FoveatedRender/Postprocess.h new file mode 100644 index 0000000000..7ccc8dc06b --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Postprocess.h @@ -0,0 +1,16 @@ +#pragma once + +struct Upscaling; + +namespace FoveatedRenderImpl +{ + class Postprocess + { + public: + // Sharpening pass for the FoveatedRender route. Mirrors what + // Upscaling::ApplySharpening does but is invoked from + // Main_PostProcessing only when the FoveatedRender route is active. + // Only the kRCAS path is wired. 
+ static bool ApplyDlssSharpening(Upscaling& upscaling); + }; +} diff --git a/src/Features/Upscaling/FoveatedRender/Preprocess.cpp b/src/Features/Upscaling/FoveatedRender/Preprocess.cpp new file mode 100644 index 0000000000..74cfb4a7a4 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Preprocess.cpp @@ -0,0 +1,109 @@ +#include "Preprocess.h" + +#include "../../../Deferred.h" +#include "../../../State.h" +#include "../../../Util.h" +#include "../../Upscaling.h" + +namespace +{ + ID3D11ComputeShader* GetEnhancerEncodeTexturesCS(Upscaling& upscaling, Upscaling::UpscaleMethod upscaleMethod) + { + uint methodIndex = (uint)upscaleMethod; + if (!upscaling.encodeTexturesCS[methodIndex]) { + std::vector> defines; + defines.push_back({ "DLSS", "" }); + + upscaling.encodeTexturesCS[methodIndex].attach((ID3D11ComputeShader*)Util::CompileShader( + L"Data/Shaders/Upscaling/EncodeTexturesCS.hlsl", defines, "cs_5_0")); + } + + return upscaling.encodeTexturesCS[methodIndex].get(); + } +} + +namespace FoveatedRenderImpl +{ + bool Preprocess::EncodeUpscalingTextures(Upscaling& upscaling) + { + auto upscaleMethod = upscaling.GetUpscaleMethod(); + if (upscaleMethod != Upscaling::UpscaleMethod::kDLSS) { + logger::error("[FOVEATED] Non-DLSS preprocess path is disabled; method={}", (int)upscaleMethod); + return false; + } + + auto state = globals::state; + auto context = globals::d3d::context; + auto renderer = globals::game::renderer; + + if (!upscaling.upscalingDataCB || !upscaling.reactiveMaskTexture || !upscaling.transparencyCompositionMaskTexture) { + logger::error("[FOVEATED] Missing preprocess resources"); + return false; + } + + // motionVectorCopyTexture is dereferenced unconditionally in the UAV + // array below when method == kDLSS. The above resource check did not + // cover it. Fail closed rather than null-deref. (CodeRabbit on PR #44.) 
+ if (upscaleMethod == Upscaling::UpscaleMethod::kDLSS && !upscaling.motionVectorCopyTexture) { + logger::error("[FOVEATED] Missing motionVectorCopyTexture for DLSS preprocess"); + return false; + } + + auto& motionVector = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMOTION_VECTOR]; + auto& temporalAAMask = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kTEMPORAL_AA_MASK]; + auto& normals = renderer->GetRuntimeData().renderTargets[globals::deferred->forwardRenderTargets[2]]; + auto& depth = renderer->GetDepthStencilData().depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN]; + + // Bail before BeginPerfEvent so the perf-event lifecycle stays balanced. + // CSSetShaderResources with a null view in the array doesn't crash, but + // the encode shader reads all four — a null among them silently corrupts + // the reactive/transparency masks DLSS will sample next. + if (!temporalAAMask.SRV || !normals.SRV || !motionVector.SRV || !depth.depthSRV) { + logger::error("[FOVEATED] Missing preprocess SRV inputs"); + return false; + } + + auto dispatchCount = Util::GetScreenDispatchCount(true); + + state->BeginPerfEvent("FOVEATED Encode Upscaling Textures"); + + auto renderSize = Util::ConvertToDynamic(globals::state->screenSize); + Upscaling::UpscalingDataCB upscalingData{}; + upscalingData.trueSamplingDim = renderSize; + upscaling.upscalingDataCB->Update(upscalingData); + + auto upscalingBuffer = upscaling.upscalingDataCB->CB(); + context->CSSetConstantBuffers(0, 1, &upscalingBuffer); + + ID3D11ShaderResourceView* views[4] = { temporalAAMask.SRV, normals.SRV, motionVector.SRV, depth.depthSRV }; + context->CSSetShaderResources(0, ARRAYSIZE(views), views); + + ID3D11UnorderedAccessView* uavs[3] = { + upscaling.reactiveMaskTexture->uav.get(), + upscaling.transparencyCompositionMaskTexture->uav.get(), + upscaleMethod == Upscaling::UpscaleMethod::kDLSS ? 
upscaling.motionVectorCopyTexture->uav.get() : nullptr + }; + context->CSSetUnorderedAccessViews(0, ARRAYSIZE(uavs), uavs, nullptr); + + ID3D11ComputeShader* cs = GetEnhancerEncodeTexturesCS(upscaling, upscaleMethod); + if (!cs) { + state->EndPerfEvent(); + logger::error("[FOVEATED] Failed to get encode compute shader"); + return false; + } + + context->CSSetShader(cs, nullptr, 0); + context->Dispatch(dispatchCount.x, dispatchCount.y, 1); + + ID3D11ShaderResourceView* nullViews[4] = { nullptr, nullptr, nullptr, nullptr }; + context->CSSetShaderResources(0, ARRAYSIZE(nullViews), nullViews); + ID3D11UnorderedAccessView* nullUavs[3] = { nullptr, nullptr, nullptr }; + context->CSSetUnorderedAccessViews(0, ARRAYSIZE(nullUavs), nullUavs, nullptr); + ID3D11Buffer* nullBuffer = nullptr; + context->CSSetConstantBuffers(0, 1, &nullBuffer); + context->CSSetShader(nullptr, nullptr, 0); + + state->EndPerfEvent(); + return true; + } +} diff --git a/src/Features/Upscaling/FoveatedRender/Preprocess.h b/src/Features/Upscaling/FoveatedRender/Preprocess.h new file mode 100644 index 0000000000..e6bba9b321 --- /dev/null +++ b/src/Features/Upscaling/FoveatedRender/Preprocess.h @@ -0,0 +1,15 @@ +#pragma once + +struct Upscaling; + +namespace FoveatedRenderImpl +{ + class Preprocess + { + public: + // Mirrors Upscaling::EncodeUpscalingTextures (with a DLSS-specific + // shader define) so the FoveatedRender route can prepare reactive + + // transparency masks without touching dev's path. 
+ static bool EncodeUpscalingTextures(Upscaling& upscaling); + }; +} diff --git a/src/Features/Upscaling/PerfMode.cpp b/src/Features/Upscaling/PerfMode.cpp new file mode 100644 index 0000000000..522773bba0 --- /dev/null +++ b/src/Features/Upscaling/PerfMode.cpp @@ -0,0 +1,1267 @@ +#include "PerfMode.h" + +#include +#include + +#include "../../State.h" +#include "../Upscaling.h" + +// Quality mode → render-scale resolution is supplied by the FFX SDK helper +// (same one Upscaling.cpp uses at ConfigureUpscaling), avoiding a duplicate +// scale table here. +#include + +PerfMode::FullscreenPassScope::FullscreenPassScope(ID3D11DeviceContext* a_context) : + ctx(a_context) +{ + ctx->OMGetRenderTargets(1, &savedRTV, &savedDSV); + ctx->RSGetViewports(&numVP, savedVP); + ctx->OMGetBlendState(&savedBlend, savedBlendFactor, &savedSampleMask); + ctx->OMGetDepthStencilState(&savedDSState, &savedStencilRef); + ctx->VSGetShader(&savedVS, nullptr, nullptr); + ctx->PSGetShader(&savedPS, nullptr, nullptr); + ctx->GSGetShader(&savedGS, nullptr, nullptr); + ctx->HSGetShader(&savedHS, nullptr, nullptr); + ctx->DSGetShader(&savedDS, nullptr, nullptr); + ctx->RSGetState(&savedRS); + ctx->PSGetSamplers(0, 1, &savedSampler0); + ctx->PSGetShaderResources(0, 1, &savedSRV0); + ctx->IAGetInputLayout(&savedIL); + ctx->IAGetVertexBuffers(0, D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT, savedVB, savedVBStride, savedVBOffset); + ctx->IAGetIndexBuffer(&savedIB, &savedIBFormat, &savedIBOffset); + ctx->IAGetPrimitiveTopology(&savedTopology); +} + +PerfMode::FullscreenPassScope::~FullscreenPassScope() +{ + // Null the SRV slot before restoring to break any potential SRV-vs-RTV + // hazard from the pass we just ran (matches the explicit null-pass the + // previous inline code did). 
+ ID3D11ShaderResourceView* nullSRV[] = { nullptr }; + ctx->PSSetShaderResources(0, 1, nullSRV); + ctx->PSSetShaderResources(0, 1, &savedSRV0); + + ctx->OMSetRenderTargets(1, &savedRTV, savedDSV); + if (numVP > 0) + ctx->RSSetViewports(numVP, savedVP); + ctx->OMSetBlendState(savedBlend, savedBlendFactor, savedSampleMask); + ctx->OMSetDepthStencilState(savedDSState, savedStencilRef); + ctx->VSSetShader(savedVS, nullptr, 0); + ctx->PSSetShader(savedPS, nullptr, 0); + ctx->GSSetShader(savedGS, nullptr, 0); + ctx->HSSetShader(savedHS, nullptr, 0); + ctx->DSSetShader(savedDS, nullptr, 0); + ctx->RSSetState(savedRS); + ctx->PSSetSamplers(0, 1, &savedSampler0); + ctx->IASetInputLayout(savedIL); + ctx->IASetVertexBuffers(0, D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT, savedVB, savedVBStride, savedVBOffset); + ctx->IASetIndexBuffer(savedIB, savedIBFormat, savedIBOffset); + ctx->IASetPrimitiveTopology(savedTopology); + + if (savedRTV) + savedRTV->Release(); + if (savedDSV) + savedDSV->Release(); + if (savedBlend) + savedBlend->Release(); + if (savedDSState) + savedDSState->Release(); + if (savedVS) + savedVS->Release(); + if (savedPS) + savedPS->Release(); + if (savedGS) + savedGS->Release(); + if (savedHS) + savedHS->Release(); + if (savedDS) + savedDS->Release(); + if (savedRS) + savedRS->Release(); + if (savedSampler0) + savedSampler0->Release(); + if (savedSRV0) + savedSRV0->Release(); + if (savedIL) + savedIL->Release(); + for (auto*& vb : savedVB) { + if (vb) + vb->Release(); + } + if (savedIB) + savedIB->Release(); +} + +void PerfMode::InstallRenderTargetSizeHook() +{ + if (!globals::game::isVR) + return; + + if (hookActive) + return; + + // Eager capture — get real HMD resolution BEFORE installing hook + auto* openvr = RE::BSOpenVR::GetSingleton(); + if (!openvr || !openvr->vrSystem) { + logger::error("[PerfMode] BSOpenVR or vrSystem not available — hook NOT installed"); + return; + } + + uint32_t w = 0, h = 0; + openvr->vrSystem->GetRecommendedRenderTargetSize(&w, 
&h); + if (w == 0 || h == 0) { + logger::error("[PerfMode] GetRecommendedRenderTargetSize returned {}x{} — hook NOT installed", w, h); + return; + } + + displayEyeWidth = w; + displayEyeHeight = h; + + // BSShaderRenderTargets::Create runs after SKSE feature settings load, so + // upscaling.settings.qualityMode here reflects the user-saved value. We + // snapshot the corresponding scale at install time and never re-read it — + // the engine's RT allocations happen once, so a later UI quality change + // can't shrink/grow RTs anyway. (Requires a game restart, same as DLSS.) + // + // Validate before division: a bad/corrupt JSON could put qualityMode + // outside FFX's range, returning 0/inf/NaN; that would propagate to bogus + // renderEye dimensions and silently mis-size every engine RT. Fail closed + // — leave hookActive=false so the rest of PerfMode is dormant and DLSS + // runs on dev's standard path. + const uint32_t qualityModeRaw = globals::features::upscaling.settings.qualityMode; + const uint32_t qualityMode = std::clamp(qualityModeRaw, 0, 4); // FfxFsr3QualityMode range + const float scale = ffxFsr3GetUpscaleRatioFromQualityMode(static_cast(qualityMode)); + if (!std::isfinite(scale) || scale <= 0.0f) { + logger::error("[PerfMode] FFX returned invalid upscale ratio {} for qualityMode {} (raw {}); hook NOT installed", scale, qualityMode, qualityModeRaw); + return; + } + renderEyeWidth = std::max(1, (uint32_t)(w / scale)); + renderEyeHeight = std::max(1, (uint32_t)(h / scale)); + + // Restart-required settings snapshot is latched by the render-target + // creation hook, but keep this robust to call-order changes. + globals::features::upscaling.bootSnapshot.LatchIfNeeded(globals::features::upscaling.settings); + + stl::write_vfunc<0x12, GetRenderTargetSize_Hook>(RE::VTABLE_BSOpenVR[0]); + + // Per-frame detours that used to live in Hooks.cpp. 
Both addresses are + // already detoured by core/other features; stl::detour_thunk chains + // (each new install wraps the prior thunk via its static func ptr). + if (!setDirtyStatesHookInstalled) { + stl::detour_thunk(REL::RelocationID(75580, 77386)); + setDirtyStatesHookInstalled = true; + } + if (!updateViewPortHookInstalled) { + stl::detour_thunk(REL::RelocationID(75455, 77240)); + updateViewPortHookInstalled = true; + } + + hookActive = true; +} + +void PerfMode::GetRenderTargetSize_Hook::thunk(RE::BSOpenVR* a_this, uint32_t* a_width, uint32_t* a_height) +{ + // Call original to get real HMD resolution + func(a_this, a_width, a_height); + + auto& perfMode = globals::features::upscaling.perfMode; + + *a_width = perfMode.renderEyeWidth; + *a_height = perfMode.renderEyeHeight; +} + +void PerfMode::SetupResources() +{ + if (!globals::game::isVR) + return; + + auto renderer = globals::game::renderer; + auto& mainRT = renderer->GetRuntimeData().renderTargets[RE::RENDER_TARGETS::kMAIN]; + if (!mainRT.texture) { + logger::error("[PerfMode] kMAIN texture not available in SetupResources"); + return; + } + + D3D11_TEXTURE2D_DESC mainDesc{}; + static_cast(mainRT.texture)->GetDesc(&mainDesc); + + D3D11_TEXTURE2D_DESC desc{}; + if (hookActive) { + desc.Width = displayEyeWidth * 2; + desc.Height = displayEyeHeight; + } else { + desc.Width = mainDesc.Width; + desc.Height = mainDesc.Height; + } + desc.MipLevels = 1; + desc.ArraySize = 1; + desc.Format = mainDesc.Format; + desc.SampleDesc.Count = 1; + desc.Usage = D3D11_USAGE_DEFAULT; + desc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET | D3D11_BIND_UNORDERED_ACCESS; + + auto device = globals::d3d::device; + HRESULT hr = device->CreateTexture2D(&desc, nullptr, testTexture.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create test texture: {:#x}", (uint32_t)hr); + return; + } + + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc{}; + srvDesc.Format = desc.Format; + srvDesc.ViewDimension = 
D3D11_SRV_DIMENSION_TEXTURE2D; + srvDesc.Texture2D.MipLevels = 1; + srvDesc.Texture2D.MostDetailedMip = 0; + hr = device->CreateShaderResourceView(testTexture.get(), &srvDesc, testTextureSRV.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create test texture SRV: {:#x}", (uint32_t)hr); + testTexture = nullptr; + testTextureUAV = nullptr; + return; + } + + // UAV for testTexture + { + D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc{}; + uavDesc.Format = desc.Format; + uavDesc.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D; + uavDesc.Texture2D.MipSlice = 0; + hr = device->CreateUnorderedAccessView(testTexture.get(), &uavDesc, testTextureUAV.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create testTexture UAV: {:#x}", (uint32_t)hr); + } + } + + // RTV for testTexture (ISRefraction output) + if (hookActive) { + D3D11_RENDER_TARGET_VIEW_DESC rtvDesc{}; + rtvDesc.Format = desc.Format; + rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D; + rtvDesc.Texture2D.MipSlice = 0; + hr = device->CreateRenderTargetView(testTexture.get(), &rtvDesc, testTextureRTV.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create testTexture RTV: {:#x}", (uint32_t)hr); + } + } + + // refraTempTex: copy of testTexture for ISRefraction input + if (hookActive) { + D3D11_TEXTURE2D_DESC refraDesc = desc; + refraDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE; + + hr = device->CreateTexture2D(&refraDesc, nullptr, refraTempTex.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create refraTempTex: {:#x}", (uint32_t)hr); + } else { + D3D11_SHADER_RESOURCE_VIEW_DESC refraSrvDesc{}; + refraSrvDesc.Format = refraDesc.Format; + refraSrvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + refraSrvDesc.Texture2D.MipLevels = 1; + refraSrvDesc.Texture2D.MostDetailedMip = 0; + hr = device->CreateShaderResourceView(refraTempTex.get(), &refraSrvDesc, refraTempSRV.put()); + if (FAILED(hr)) { + logger::error("[PerfMode] Failed to create refraTempSRV: 
{:#x}", (uint32_t)hr); + refraTempTex = nullptr; + } + } + } + + // Fake DepthStencil at DisplayRes, matching engine kMAIN DS format. + if (hookActive) { + auto& dsData = renderer->GetDepthStencilData(); + auto* mainDSTex = dsData.depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN].texture; + if (mainDSTex) { + D3D11_TEXTURE2D_DESC dsDesc{}; + mainDSTex->GetDesc(&dsDesc); + + D3D11_TEXTURE2D_DESC fakeDesc = dsDesc; + fakeDesc.Width = displayEyeWidth * 2; + fakeDesc.Height = displayEyeHeight; + + HRESULT hr2 = device->CreateTexture2D(&fakeDesc, nullptr, fakeDS.put()); + if (FAILED(hr2)) { + logger::error("[PerfMode] Failed to create fake DS texture: {:#x}", (uint32_t)hr2); + } else { + // Create DSV — format depends on typeless base format + D3D11_DEPTH_STENCIL_VIEW_DESC dsvDesc{}; + dsvDesc.ViewDimension = D3D11_DSV_DIMENSION_TEXTURE2D; + dsvDesc.Texture2D.MipSlice = 0; + + // Map typeless→typed DSV format + if (fakeDesc.Format == DXGI_FORMAT_R32G8X24_TYPELESS) + dsvDesc.Format = DXGI_FORMAT_D32_FLOAT_S8X24_UINT; + else if (fakeDesc.Format == DXGI_FORMAT_R24G8_TYPELESS) + dsvDesc.Format = DXGI_FORMAT_D24_UNORM_S8_UINT; + else if (fakeDesc.Format == DXGI_FORMAT_R32_TYPELESS) + dsvDesc.Format = DXGI_FORMAT_D32_FLOAT; + else if (fakeDesc.Format == DXGI_FORMAT_R16_TYPELESS) + dsvDesc.Format = DXGI_FORMAT_D16_UNORM; + else + dsvDesc.Format = fakeDesc.Format; // fallback: hope it's already a DS format + + hr2 = device->CreateDepthStencilView(fakeDS.get(), &dsvDesc, fakeDSV.put()); + if (FAILED(hr2)) { + logger::error("[PerfMode] Failed to create fake DSV: {:#x}", (uint32_t)hr2); + fakeDS = nullptr; + } + } + } else { + logger::warn("[PerfMode] kMAIN DS texture not available, skipping fake DS creation"); + } + } + + if (hookActive && fakeDSV) { + auto* ctx = globals::d3d::context; + ctx->ClearDepthStencilView(fakeDSV.get(), D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, 1.0f, 0); + } + + // IS shader hooks (must be installed AFTER FrameAnnotations) + if (hookActive && 
!tonemapHookInstalled) { + stl::write_vfunc<0x1, TonemapRender_Hook>(RE::VTABLE_BSImagespaceShaderHDRTonemapBlendCinematic[3]); + tonemapHookInstalled = true; + } + + if (hookActive && !refractionHookInstalled && testTextureRTV && refraTempSRV) { + stl::write_vfunc<0x1, RefractionRender_Hook>(RE::VTABLE_BSImagespaceShaderRefraction[3]); + refractionHookInstalled = true; + } + + // Menu-background fix: ISCopy is the entire menu post-chain (verified via + // RenderDoc on a baseline frame — single ISCopy draw, source → kPROJECTED- + // MENU 2048² / kMENUBG). Hook the same vtable slot FrameAnnotations uses + // for its passthrough annotation, then replay with a stretched VP when + // dest > source. No resource dependencies — pure VP/Draw replay. + if (hookActive && !isCopyHookInstalled) { + stl::write_vfunc<0x1, ISCopyRender_Hook>(RE::VTABLE_BSImagespaceShaderCopy[3]); + isCopyHookInstalled = true; + } + + if (hookActive && !uiPassHookInstalled && fakeDSV) { + stl::write_vfunc<0x2A, UIPassDispatch_Hook>(RE::VTABLE_BSShaderAccumulator[0]); + uiPassHookInstalled = true; + } + + // PlayerView end hook: chains after FrameAnnotations' Main_RenderPlayerView. + // Clears postChainDone so the pre-Present UI and the next frame use normal VP. 
+ if (hookActive && !playerViewHookInstalled) { + stl::detour_thunk(REL::RelocationID(35560, 36559)); + playerViewHookInstalled = true; + } + + // Downscale + blit shaders + if (hookActive && !boxDownscalePS) { + boxDownscalePS.attach(static_cast( + Util::CompileShader(L"Data/Shaders/Upscaling/PerfMode/BoxDownscalePS.hlsl", { { "PSHADER", "" } }, "ps_5_0"))); + if (!boxDownscalePS) + logger::error("[PerfMode] Failed to compile BoxDownscalePS"); + } + if (hookActive && !boxDownscaleVS) { + boxDownscaleVS.attach(static_cast( + Util::CompileShader(L"Data/Shaders/Upscaling/UpscaleVS.hlsl", { { "VSHADER", "" } }, "vs_5_0"))); + if (!boxDownscaleVS) + logger::error("[PerfMode] Failed to compile BoxDownscale VS"); + } + if (hookActive && !menuBlitPS) { + menuBlitPS.attach(static_cast( + Util::CompileShader(L"Data/Shaders/Upscaling/PerfMode/MenuBGBlitPS.hlsl", { { "PSHADER", "" } }, "ps_5_0"))); + if (!menuBlitPS) + logger::error("[PerfMode] Failed to compile MenuBGBlitPS"); + } + if (hookActive && !linearSampler) { + D3D11_SAMPLER_DESC sd{}; + sd.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR; + sd.AddressU = D3D11_TEXTURE_ADDRESS_CLAMP; + sd.AddressV = D3D11_TEXTURE_ADDRESS_CLAMP; + sd.AddressW = D3D11_TEXTURE_ADDRESS_CLAMP; + sd.MaxAnisotropy = 1; + sd.MaxLOD = D3D11_FLOAT32_MAX; + if (FAILED(device->CreateSamplerState(&sd, linearSampler.put()))) + logger::error("[PerfMode] Failed to create linear sampler"); + } + + // Fail-closed pipeline-ready gate + // hookActive only means "BSOpenVR size hook is live + the engine has been + // sized at RenderRes." If any of the downstream resources we depend on at + // Post time failed to create, we'd previously still claim ShouldHandlePost + // and walk into a null deref inside HandlePostProcessing/Tonemap/Refraction. + // Compute the readiness flag once, here — every consumer keys off it. 
+ // + // The minimum viable set for Post wrapping: + // testTexture + testTextureSRV — read by tonemap inner-swap (always) + // fakeDS + fakeDSV — bound as the 3k DS during Post + // boxDownscaleVS/PS + linearSampler — DownscaleToKMain needs these + // Refraction (refraTempTex/SRV + testTextureRTV) is optional: its hook + // gates on those resources at install time, so absence just means the + // refraction draw runs on the engine's 1k path — degraded but stable. + postPipelineReady = + hookActive && + testTexture && testTextureSRV && + fakeDS && fakeDSV && + boxDownscaleVS && boxDownscalePS && linearSampler; + + if (hookActive && !postPipelineReady) { + logger::error( + "[PerfMode] Post pipeline failed to initialize fully — Post wrap " + "disabled, engine RTs remain at RenderRes. Check upstream resource " + "creation errors above."); + } +} + +// ============================================================================ +// TonemapRender_Hook: IS shader hook for ISHDRTonemapBlendCinematic +// ============================================================================ +// Installed via stl::write_vfunc<0x1> on vtable[3], chains after FrameAnnotations. +// Inner layer of two-layer swap: swaps kMAIN SRV → testTextureSRV and +// kMAIN DS → fakeDS before tonemap Render(), restores after. + +void PerfMode::TonemapRender_Hook::thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param) +{ + auto& perfMode = globals::features::upscaling.perfMode; + + if (!perfMode.hookActive || !perfMode.testTextureSRV || !perfMode.fakeDSV) { + func(imageSpaceShader, shape, param); + return; + } + + // Menu/loading-screen path: the engine's bridge into kTOTAL assumes + // RT.size == kMAIN.size (true for DLAA, broken under DLSS presets where + // kMAIN is renderRes), so the BG ends up missing and OpenVR reprojects + // stale content as movement smears. 
Skip the gameplay SRV/DS hijack and + // let tonemap run untouched, then call MaybeBlitMenuBG to drive a + // one-shot Upscaling::Upscale() + MenuBGBlitPS blit of the resulting + // DLSS-reconstructed testTexture into kTOTAL. Per-frame guarded so it + // runs at most once per frame regardless of how many menu redraws fire. + if (globals::state && globals::state->IsMainOrLoadingMenuOpen()) { + func(imageSpaceShader, shape, param); + perfMode.MaybeBlitMenuBG(RE::RENDER_TARGETS::kTOTAL); + return; + } + + ZoneScoped; + TracyD3D11Zone(globals::state->tracyCtx, "PerfMode::TonemapRender"); + + auto renderer = RE::BSGraphics::Renderer::GetSingleton(); + auto& rtData = renderer->GetRuntimeData(); + auto& dsData = renderer->GetDepthStencilData(); + + // --- Swap kMAIN SRV → testTextureSRV (so tonemap reads 3k upscaled color) --- + auto& kmainRT = rtData.renderTargets[RE::RENDER_TARGETS::kMAIN]; + perfMode.savedKMainSRV = kmainRT.SRV; + kmainRT.SRV = perfMode.testTextureSRV.get(); + + // --- Also swap kMAIN_COPY SRV (refraction path reads this instead of kMAIN) --- + auto& kmainCopyRT = rtData.renderTargets[RE::RENDER_TARGETS::kMAIN_COPY]; + perfMode.savedKMainCopySRV = kmainCopyRT.SRV; + kmainCopyRT.SRV = perfMode.testTextureSRV.get(); + + // --- Swap kMAIN DS views → fakeDS (so 3k RT doesn't mismatch 1k DS) --- + auto& kmainDS = dsData.depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN]; + for (int i = 0; i < 8; i++) { + perfMode.savedKMainViews[i] = kmainDS.views[i]; + if (kmainDS.views[i]) + kmainDS.views[i] = perfMode.fakeDSV.get(); + } + for (int i = 0; i < 8; i++) { + perfMode.savedKMainReadOnlyViews[i] = kmainDS.readOnlyViews[i]; + if (kmainDS.readOnlyViews[i]) + kmainDS.readOnlyViews[i] = perfMode.fakeDSV.get(); + } + + // --- Call original (or FrameAnnotations chain) --- + func(imageSpaceShader, shape, param); + + // --- Restore kMAIN SRV --- + kmainRT.SRV = perfMode.savedKMainSRV; + perfMode.savedKMainSRV = nullptr; + + // --- Restore kMAIN_COPY SRV --- + 
kmainCopyRT.SRV = perfMode.savedKMainCopySRV; + perfMode.savedKMainCopySRV = nullptr; + + // --- Restore kMAIN DS views --- + for (int i = 0; i < 8; i++) + kmainDS.views[i] = perfMode.savedKMainViews[i]; + for (int i = 0; i < 8; i++) + kmainDS.readOnlyViews[i] = perfMode.savedKMainReadOnlyViews[i]; +} + +// ============================================================================ +// RefractionRender_Hook: IS shader hook for ISRefraction +// ============================================================================ +// Strategy: let func() run normally (1k refraction, kMAIN→kMAIN_COPY). +// After func() returns, D3D11 state is sticky (PS/CB/sampler/IA all still bound). +// We replay the draw with our own RT (testTexture 3k), VP (3k), and SRV (refraTempTex 3k). + +void PerfMode::RefractionRender_Hook::thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param) +{ + auto& perfMode = globals::features::upscaling.perfMode; + + if (!perfMode.hookActive || !perfMode.testTextureRTV || !perfMode.refraTempSRV) { + func(imageSpaceShader, shape, param); + return; + } + + ZoneScoped; + TracyD3D11Zone(globals::state->tracyCtx, "PerfMode::RefractionRender"); + + // --- Pass 1: engine's normal 1k refraction (untouched) --- + func(imageSpaceShader, shape, param); + + // --- Pass 2: our 3k refraction replay --- + // func() left PS/CB/sampler/IA/VB/IB all bound on the D3D context. + // We only change RT, VP, and t0 SRV, then DrawIndexed with the same geometry. + + auto* context = globals::d3d::context; + + // Save current RT so we can restore after our draw + ID3D11RenderTargetView* savedRTV = nullptr; + ID3D11DepthStencilView* savedDSV = nullptr; + context->OMGetRenderTargets(1, &savedRTV, &savedDSV); + + // Save the full viewport stack rather than a single VP — RSSetViewports/ + // RSGetViewports work on arrays sized up to D3D11_VIEWPORT_AND_SCISSORRECT_ + // OBJECT_COUNT_PER_PIPELINE (16). 
Truncating to one would silently drop + // extra bound viewports if a later engine pass relied on multi-VP. + UINT numVP = D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE; + D3D11_VIEWPORT savedVP[D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE] = {}; + context->RSGetViewports(&numVP, savedVP); + + // Save current t0 SRV (kMAIN.SRV used by ISRefraction as scene input) + ID3D11ShaderResourceView* savedSRV0 = nullptr; + context->PSGetShaderResources(0, 1, &savedSRV0); + + // Set 3k output: testTexture RTV, no DS needed for fullscreen IS shader + ID3D11RenderTargetView* rtv3k = perfMode.testTextureRTV.get(); + context->OMSetRenderTargets(1, &rtv3k, nullptr); + + // Set 3k VP + D3D11_VIEWPORT vp3k = {}; + vp3k.TopLeftX = 0.0f; + vp3k.TopLeftY = 0.0f; + vp3k.Width = static_cast(perfMode.displayEyeWidth * 2); + vp3k.Height = static_cast(perfMode.displayEyeHeight); + vp3k.MinDepth = 0.0f; + vp3k.MaxDepth = 1.0f; + context->RSSetViewports(1, &vp3k); + + // Set 3k input: refraTempTex as t0 (scene color for refraction sampling) + ID3D11ShaderResourceView* srv3k = perfMode.refraTempSRV.get(); + context->PSSetShaderResources(0, 1, &srv3k); + + // Draw with the same geometry (BSTriShape fullscreen quad, IA still bound) + context->DrawIndexed(6, 0, 0); + + // --- Restore D3D state so engine continues normally --- + context->OMSetRenderTargets(1, &savedRTV, savedDSV); + // RSGetViewports may have returned 0 if the prior pass left no viewport + // bound; skip the restore in that case rather than pushing a zero-init VP. 
+ if (numVP > 0) { + context->RSSetViewports(numVP, savedVP); + } + context->PSSetShaderResources(0, 1, &savedSRV0); + + // Release COM refs from Get calls + if (savedRTV) + savedRTV->Release(); + if (savedDSV) + savedDSV->Release(); + if (savedSRV0) + savedSRV0->Release(); +} + +// ============================================================================ +// ISCopyRender_Hook: stretch ISCopy when source < dest (menu compositor fix) +// ============================================================================ +// The VR menu compositor uses a single ISCopy draw to blit the rendered scene +// (kMAIN — RenderRes under PerfMode) into a fixed-size projection surface +// (kPROJECTEDMENU 2048², or kMENUBG which PerfMode enlarges to DisplayRes). +// Engine ISCopy uses a 1:1 viewport sized to the source, so the small source +// is stamped into the top-left of the larger dest. Symptom: "main menu image +// looks downscaled." +// +// Fix: after func() runs, if dest > current VP we replay the draw with the +// VP expanded to the dest's full dims. ISCopy's PS/CB/IA/sampler are sticky +// on the D3D context after func() returns (same pattern RefractionRender_Hook +// relies on), so the replay needs only a viewport change + DrawIndexed and +// the engine's clamp-sampler stretches the source naturally. +// +// In-game ISCopy (where source.w == dest.w under PerfMode — kMAIN renderRes +// → kMAIN_COPY renderRes) takes the early-out branch and the engine's draw +// is the final pixel. + +void PerfMode::ISCopyRender_Hook::thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param) +{ + auto& perfMode = globals::features::upscaling.perfMode; + + // Inactive / non-VR: passthrough. + if (!perfMode.hookActive) { + func(imageSpaceShader, shape, param); + return; + } + + // Let the engine draw first. 
After func() returns the IS shader's PS, CB, + // sampler, IA layout, vertex/index buffers, and topology are all still + // bound on the context (sticky D3D11 state). We only need to override the + // viewport for the replay draw. + func(imageSpaceShader, shape, param); + + auto* context = globals::d3d::context; + + // Inspect the current RTV's dest dimensions. + ID3D11RenderTargetView* curRTV = nullptr; + ID3D11DepthStencilView* curDSV = nullptr; + context->OMGetRenderTargets(1, &curRTV, &curDSV); + if (!curRTV) { + if (curDSV) + curDSV->Release(); + return; + } + + ID3D11Resource* rtRes = nullptr; + curRTV->GetResource(&rtRes); + if (!rtRes) { + curRTV->Release(); + if (curDSV) + curDSV->Release(); + return; + } + + D3D11_TEXTURE2D_DESC rtDesc{}; + static_cast(rtRes)->GetDesc(&rtDesc); + rtRes->Release(); + + // Current VP (the one func() used) — full array so we can restore exactly. + UINT numVP = D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE; + D3D11_VIEWPORT savedVP[D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE] = {}; + context->RSGetViewports(&numVP, savedVP); + + // Only intervene when either axis of the engine's VP is smaller than the + // dest. The +1 guard avoids float-equality issues (VPs are floats, RT + // dims are uint32). Width-only would miss the case where the engine + // binds a taller-than-VP RT (e.g., a square 2048² panel against a + // renderRes-height VP). + bool needsStretch = numVP > 0 && + (rtDesc.Width > static_cast(savedVP[0].Width + 1.0f) || + rtDesc.Height > static_cast(savedVP[0].Height + 1.0f)); + + if (needsStretch) { + ZoneScoped; + TracyD3D11Zone(globals::state->tracyCtx, "PerfMode::ISCopyStretch"); + + // Replay viewport: full dest extent, preserve depth range from the + // original so anything sampling depth (unlikely for ISCopy but safe) + // keeps the same Z behavior. 
+ D3D11_VIEWPORT stretchVP = savedVP[0]; + stretchVP.TopLeftX = 0.0f; + stretchVP.TopLeftY = 0.0f; + stretchVP.Width = static_cast(rtDesc.Width); + stretchVP.Height = static_cast(rtDesc.Height); + context->RSSetViewports(1, &stretchVP); + + // ISCopy is a fullscreen quad drawn as a triangle list (6 indices). + // Same index count RefractionRender_Hook uses for the same reason — + // both replay the IS shader's standard fullscreen geometry. + context->DrawIndexed(6, 0, 0); + + // Restore engine's VP so any state inspector downstream sees what it + // expects. numVP guaranteed > 0 inside this branch. + context->RSSetViewports(numVP, savedVP); + } + + curRTV->Release(); + if (curDSV) + curDSV->Release(); +} + +// ============================================================================ +// UIPassDispatch_Hook: swap KMAIN DS → fakeDS for UI pass (renderMode==24) +// ============================================================================ +// UI pass draws VR HUD to kMENUBG (now 3k). Engine binds KMAIN(DS) as DS, +// which is still 1k → size mismatch. Swap to fakeDS (3k) before, restore after. 
+ +void PerfMode::UIPassDispatch_Hook::thunk(RE::BSGraphics::BSShaderAccumulator* shaderAccumulator, uint32_t renderFlags) +{ + auto& perfMode = globals::features::upscaling.perfMode; + + // Only intercept renderMode==24 (UI pass) when hook is active + auto& rtData = shaderAccumulator->GetRuntimeData(); + if (!perfMode.hookActive || !perfMode.fakeDSV || rtData.renderMode != 24) { + func(shaderAccumulator, renderFlags); + return; + } + + auto renderer = RE::BSGraphics::Renderer::GetSingleton(); + auto& dsData = renderer->GetDepthStencilData(); + auto& kmainDS = dsData.depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN]; + + // Save original KMAIN DS views and swap to fakeDS + ID3D11DepthStencilView* savedViews[8] = {}; + ID3D11DepthStencilView* savedReadOnlyViews[8] = {}; + for (int i = 0; i < 8; i++) { + savedViews[i] = kmainDS.views[i]; + if (kmainDS.views[i]) + kmainDS.views[i] = perfMode.fakeDSV.get(); + } + for (int i = 0; i < 8; i++) { + savedReadOnlyViews[i] = kmainDS.readOnlyViews[i]; + if (kmainDS.readOnlyViews[i]) + kmainDS.readOnlyViews[i] = perfMode.fakeDSV.get(); + } + + // Force engine to re-bind DS from struct + globals::game::stateUpdateFlags->set(RE::BSGraphics::ShaderFlags::DIRTY_RENDERTARGET); + + // Force 3k VP: engine may not call UpdateViewPort during UI pass, + // so we directly set shadowState viewport to DisplayRes and mark dirty. 
+ auto* ss = globals::game::shadowState; + D3D11_VIEWPORT savedVP = {}; + if (ss) { + auto& vp = ss->GetVRRuntimeData().viewPort; + savedVP = vp; + vp.TopLeftX = 0.0f; + vp.TopLeftY = 0.0f; + vp.Width = static_cast<float>(perfMode.displayEyeWidth * 2); + vp.Height = static_cast<float>(perfMode.displayEyeHeight); + vp.MinDepth = 0.0f; + vp.MaxDepth = 1.0f; + globals::game::stateUpdateFlags->set(RE::BSGraphics::ShaderFlags::DIRTY_VIEWPORT); + } + + // Skip VP compression in UpdateViewPort hook during UI pass + perfMode.postInterceptActive = true; + + func(shaderAccumulator, renderFlags); + + perfMode.postInterceptActive = false; + + // Restore viewport + if (ss) { + ss->GetVRRuntimeData().viewPort = savedVP; + globals::game::stateUpdateFlags->set(RE::BSGraphics::ShaderFlags::DIRTY_VIEWPORT); + } + + // Restore original KMAIN DS views + for (int i = 0; i < 8; i++) + kmainDS.views[i] = savedViews[i]; + for (int i = 0; i < 8; i++) + kmainDS.readOnlyViews[i] = savedReadOnlyViews[i]; + + // Re-dirty so subsequent passes get correct DS + globals::game::stateUpdateFlags->set(RE::BSGraphics::ShaderFlags::DIRTY_RENDERTARGET); +} + +// ============================================================================ +// PlayerViewRender_Hook: clear postChainDone at PlayerView end +// ============================================================================ +// PlayerView covers the entire VR pipeline (World→Post→UI→Submit). +// After func() returns, clear postChainDone so the pre-Present UI chain +// and the next frame use normal VP compression. 
+ +void PerfMode::PlayerViewRender_Hook::thunk(void* a1, bool a2, bool a3) +{ + func(a1, a2, a3); + + globals::features::upscaling.perfMode.ClearPostChainDone(); +} + +// ============================================================================ +// BSGraphics_SetDirtyStates_Hook +// ============================================================================ +// Wraps DS swap around the engine's RT/DS flush so enlarged-RT draws don't +// rasterizer-clip to the smaller kMAIN DS bounds. +void PerfMode::BSGraphics_SetDirtyStates_Hook::thunk(bool isCompute) +{ + bool swapped = false; + if (!isCompute) + swapped = globals::features::upscaling.perfMode.MaybeSwapDSForEnlargedRT(); + func(isCompute); + if (swapped) + globals::features::upscaling.perfMode.RestoreSwappedDS(); +} + +// ============================================================================ +// BSGraphics_Renderer_UpdateViewPort_Hook +// ============================================================================ +// Post-corrects the engine viewport when the bound RT and the requested VP +// don't agree about render-vs-display extent. Was originally in Hooks.cpp. +void PerfMode::BSGraphics_Renderer_UpdateViewPort_Hook::thunk(RE::BSGraphics::Renderer* a_this, uint32_t a_width, uint32_t a_height, bool a_forceMatchRT) +{ + func(a_this, a_width, a_height, a_forceMatchRT); + + auto& perfMode = globals::features::upscaling.perfMode; + if (!perfMode.IsHookActive()) + return; + + // During Post intercept enlarged kTEMP/kTOTAL already get the right VP + // from func() because of their inflated RT dims — don't second-guess it. 
+ if (perfMode.IsPostInterceptActive()) + return; + + auto* ss = globals::game::shadowState; + if (!ss) + return; + auto& vp = ss->GetVRRuntimeData().viewPort; + const uint32_t displayW = perfMode.GetDisplayEyeWidth() * 2; + const uint32_t displayH = perfMode.GetDisplayEyeHeight(); + const uint32_t renderW = perfMode.GetRenderEyeWidth() * 2; + const uint32_t renderH = perfMode.GetRenderEyeHeight(); + + // After the Post chain, UI / submit-prep draws target enlarged kTOTAL + // at displayRes — expand any renderRes VP the engine sets back up. + // The fade Draw(30) bypasses this path entirely (direct D3D RSSet- + // Viewports) and is handled by the Draw vfunc hook in Globals.cpp. + if (perfMode.IsPostChainDone()) { + if (static_cast(vp.Width) == renderW && + static_cast(vp.Height) == renderH) { + vp.Width = static_cast(displayW); + vp.Height = static_cast(displayH); + } + return; + } + + // Honor forceMatchRT for displayRes-enlarged RTs — shrinking VP there + // leaves menu content in a renderRes corner of kTOTAL. + if (a_forceMatchRT) + return; + + // Same risk on the non-forceMatchRT path: the menu compositor calls + // UpdateViewPort(displayW, displayH, false) directly with screen-space + // dims, and compressing those would clip the BG. + { + const uint32_t rtIdx = static_cast(ss->GetVRRuntimeData().renderTargets[0]); + if (rtIdx == RE::RENDER_TARGETS::kTOTAL || + rtIdx == RE::RENDER_TARGETS::kMENUBG || + rtIdx == RE::RENDER_TARGETS::kIMAGESPACE_TEMP_COPY) + return; + } + + // Normal world/depth path: compress displayRes → renderRes so draws + // stay inside the renderRes-sized kMAIN family. 
+ if (static_cast(vp.Width) == displayW && + static_cast(vp.Height) == displayH) { + vp.Width = static_cast(renderW); + vp.Height = static_cast(renderH); + } +} + +// ============================================================================ +// BeginPostIntercept / EndPostIntercept +// ============================================================================ +// Outer layer of two-layer swap: swaps kMAIN_COPY DS → fakeDS before the +// entire Post chain (covers the copy step #10 which binds kMAIN_COPY DS). +// Inner layer (tonemap hook) handles kMAIN DS + kMAIN SRV for step #9. + +void PerfMode::BeginPostIntercept() +{ + if (!hookActive || !fakeDSV) + return; + + ZoneScoped; + auto state = globals::state; + state->BeginPerfEvent("PerfMode::BeginPostIntercept"); + + auto renderer = RE::BSGraphics::Renderer::GetSingleton(); + auto& dsData = renderer->GetDepthStencilData(); + auto& kmainCopyDS = dsData.depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN_COPY]; + + postInterceptActive = true; + + // Swap kMAIN_COPY DS views → fakeDS + for (int i = 0; i < 8; i++) { + savedKMainCopyViews[i] = kmainCopyDS.views[i]; + if (kmainCopyDS.views[i]) + kmainCopyDS.views[i] = fakeDSV.get(); + } + for (int i = 0; i < 8; i++) { + savedKMainCopyReadOnlyViews[i] = kmainCopyDS.readOnlyViews[i]; + if (kmainCopyDS.readOnlyViews[i]) + kmainCopyDS.readOnlyViews[i] = fakeDSV.get(); + } + + state->EndPerfEvent(); +} + +void PerfMode::EndPostIntercept() +{ + if (!hookActive || !fakeDSV) + return; + + ZoneScoped; + auto state = globals::state; + state->BeginPerfEvent("PerfMode::EndPostIntercept"); + + auto renderer = RE::BSGraphics::Renderer::GetSingleton(); + auto& dsData = renderer->GetDepthStencilData(); + auto& kmainCopyDS = dsData.depthStencils[RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN_COPY]; + + postInterceptActive = false; + postChainDone = true; + + // Restore kMAIN_COPY DS views + for (int i = 0; i < 8; i++) + kmainCopyDS.views[i] = savedKMainCopyViews[i]; + for (int i = 0; i < 8; 
i++) + kmainCopyDS.readOnlyViews[i] = savedKMainCopyReadOnlyViews[i]; + + state->EndPerfEvent(); +} + +// ============================================================================ +// DownscaleToKMain: Box 3×3 downscale testTexture (3k) → kMAIN (1k) +// ============================================================================ +// Called before the Post chain so the HDR pyramid builds from AA'd DLSS output +// instead of the raw 1k render, eliminating shimmer in bloom/exposure. +// Only kMAIN needs writing: +// - No refraction: kMAIN is the pyramid input directly. +// - With refraction: engine composites kMAIN → kMAIN_COPY, which enters pyramid. + +void PerfMode::DownscaleToKMain() +{ + if (!hookActive || !testTextureSRV || !boxDownscalePS || !boxDownscaleVS || !linearSampler) + return; + + ZoneScoped; + auto state = globals::state; + auto renderer = globals::game::renderer; + auto context = globals::d3d::context; + auto& rtData = renderer->GetRuntimeData(); + + auto& kmain = rtData.renderTargets[RE::RENDER_TARGETS::kMAIN]; + + // Bail before opening the perf event so we don't leak a dangling + // Begin without End on the null-RTV early-return path. 
+ if (!kmain.RTV) + return; + + state->BeginPerfEvent("PerfMode::DownscaleToKMain"); + TracyD3D11Zone(state->tracyCtx, "PerfMode::DownscaleToKMain"); + + { + FullscreenPassScope stateScope(context); + + // IA: fullscreen triangle (no vertex/index buffers) + context->IASetInputLayout(nullptr); + context->IASetVertexBuffers(0, 0, nullptr, nullptr, nullptr); + context->IASetIndexBuffer(nullptr, DXGI_FORMAT_UNKNOWN, 0); + context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST); + + // Shaders — clear GS/HS/DS to prevent pipeline interference + context->VSSetShader(boxDownscaleVS.get(), nullptr, 0); + context->PSSetShader(boxDownscalePS.get(), nullptr, 0); + context->GSSetShader(nullptr, nullptr, 0); + context->HSSetShader(nullptr, nullptr, 0); + context->DSSetShader(nullptr, nullptr, 0); + + ID3D11ShaderResourceView* srvs[] = { testTextureSRV.get() }; + context->PSSetShaderResources(0, 1, srvs); + ID3D11SamplerState* samplers[] = { linearSampler.get() }; + context->PSSetSamplers(0, 1, samplers); + + // Opaque overwrite: no blending, no depth test, default rasterizer + context->OMSetBlendState(nullptr, nullptr, 0xffffffff); + context->OMSetDepthStencilState(nullptr, 0); + context->RSSetState(nullptr); + + // Viewport at RenderRes SBS (1k) + D3D11_VIEWPORT vp = {}; + vp.Width = static_cast(renderEyeWidth * 2); + vp.Height = static_cast(renderEyeHeight); + vp.MaxDepth = 1.0f; + context->RSSetViewports(1, &vp); + + ID3D11RenderTargetView* rtv = kmain.RTV; + context->OMSetRenderTargets(1, &rtv, nullptr); + context->Draw(3, 0); + } + + state->EndPerfEvent(); +} + +// Bridge the DLSS-reconstructed menu BG into kTOTAL/kMENUBG. 
Driven from +// TonemapRender_Hook post-func in menu/loading state: the engine's tonemap +// shader's UV math assumes RT.size == kMAIN.size (true for DLAA, broken +// under DLSS), so we run our own DLSS evaluate against the engine's +// menu-state inputs (jitter via Main_UpdateJitter, depth via menu BG pre- +// pass, motion vectors as ISTemporalAA reads them) and blit testTexture → +// dest. One-shot per frame via blittedFrameId (Present doesn't fire here +// and PlayerView doesn't fire in main-menu, so the frame-id guard is the +// only reliable per-frame boundary). +void PerfMode::MaybeBlitMenuBG(uint32_t boundRTIdx) +{ + const uint32_t currentFrame = globals::state ? globals::state->frameCount : 0; + if (!hookActive || blittedFrameId == currentFrame || !menuBlitPS || !boxDownscaleVS || !linearSampler) + return; + if (!testTexture || !testTextureSRV) + return; + if (!globals::state || !globals::state->IsMainOrLoadingMenuOpen()) + return; + if (boundRTIdx != RE::RENDER_TARGETS::kTOTAL && + boundRTIdx != RE::RENDER_TARGETS::kMENUBG) + return; + + auto renderer = globals::game::renderer; + auto& rtData = renderer->GetRuntimeData(); + auto& dest = rtData.renderTargets[boundRTIdx]; + if (!dest.RTV || !dest.texture) + return; + + ZoneScoped; + auto state = globals::state; + auto* context = globals::d3d::context; + state->BeginPerfEvent("PerfMode::MenuBGBlit"); + TracyD3D11Zone(state->tracyCtx, "PerfMode::MenuBGBlit"); + + globals::features::upscaling.Upscale(); + + { + FullscreenPassScope stateScope(context); + + // IA: fullscreen triangle, no VB/IB + context->IASetInputLayout(nullptr); + context->IASetVertexBuffers(0, 0, nullptr, nullptr, nullptr); + context->IASetIndexBuffer(nullptr, DXGI_FORMAT_UNKNOWN, 0); + context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST); + + context->VSSetShader(boxDownscaleVS.get(), nullptr, 0); + context->PSSetShader(menuBlitPS.get(), nullptr, 0); + context->GSSetShader(nullptr, nullptr, 0); + 
context->HSSetShader(nullptr, nullptr, 0); + context->DSSetShader(nullptr, nullptr, 0); + + ID3D11ShaderResourceView* srvs[] = { testTextureSRV.get() }; + context->PSSetShaderResources(0, 1, srvs); + ID3D11SamplerState* samplers[] = { linearSampler.get() }; + context->PSSetSamplers(0, 1, samplers); + + context->OMSetBlendState(nullptr, nullptr, 0xffffffff); + context->OMSetDepthStencilState(nullptr, 0); + context->RSSetState(nullptr); + + D3D11_TEXTURE2D_DESC destDesc{}; + static_cast(dest.texture)->GetDesc(&destDesc); + D3D11_VIEWPORT vp = {}; + vp.Width = static_cast(destDesc.Width); + vp.Height = static_cast(destDesc.Height); + vp.MaxDepth = 1.0f; + context->RSSetViewports(1, &vp); + + ID3D11RenderTargetView* rtv = dest.RTV; + context->OMSetRenderTargets(1, &rtv, nullptr); + context->Draw(3, 0); + } + + blittedFrameId = currentFrame; + state->EndPerfEvent(); +} + +void PerfMode::HandlePostProcessing(const std::function& enginePost) +{ + ZoneScoped; + auto state = globals::state; + state->BeginPerfEvent("PerfMode::HandlePostProcessing"); + + // Copy testTexture → refraTempTex before Post, so ISRefraction can read 3k scene + if (refraTempTex) { + globals::d3d::context->CopyResource(refraTempTex.get(), testTexture.get()); + } + + // Downscale testTexture (3k AA'd) → kMAIN (1k) so the HDR pyramid and + // bloom compute from anti-aliased content instead of raw 1k render. + DownscaleToKMain(); + + // Underwater mask analytical repair. Engine RTs (depth, mask) are at + // renderRes under PerfMode, so the full-resolution path of UpscaleDepth + // would apply — but routing through UpscaleDepth here leaves pipeline + // state dirty and the trailing enginePost() loses kMAIN. Drive the + // mask-only draw directly inside our own FullscreenPassScope so the + // inbound engine state is restored on exit. 
+ { + FullscreenPassScope scope(globals::d3d::context); + globals::features::upscaling.RunUnderwaterMaskRepair(); + } + + // Outer layer: swap kMAIN_COPY DS + SRV for refraction path coverage + BeginPostIntercept(); + + // Run full engine Post chain; IS shader hook handles tonemap step (#9) swap/restore + enginePost(); + + // Restore kMAIN_COPY DS + EndPostIntercept(); + + state->EndPerfEvent(); +} + +bool PerfMode::MaybeSwapDSForEnlargedRT() +{ + if (!hookActive || postInterceptActive) + return false; + if (autoSwapDSIdx != UINT32_MAX) + return false; // re-entry guard + + auto* ss = globals::game::shadowState; + if (!ss) + return false; + auto& srd = ss->GetVRRuntimeData(); + + // Only the three RTs PerfMode_MaybeEnlargeRT inflates to displayRes. + const uint32_t rtIdx = static_cast(srd.renderTargets[0]); + if (rtIdx != RE::RENDER_TARGETS::kTOTAL && + rtIdx != RE::RENDER_TARGETS::kMENUBG && + rtIdx != RE::RENDER_TARGETS::kIMAGESPACE_TEMP_COPY) + return false; + + const uint32_t dsIdx = static_cast(srd.depthStencil); + if (dsIdx != RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN && + dsIdx != RE::RENDER_TARGETS_DEPTHSTENCIL::kMAIN_COPY) + return false; + + auto renderer = globals::game::renderer; + auto& dsData = renderer->GetDepthStencilData(); + auto& bound = dsData.depthStencils[dsIdx]; + + // Unbind DS entirely rather than rebind to fakeDSV. fakeDSV is cleared + // once at init with stencil=0; ISHDRTonemapBlendCinematic (the menu's + // kMAIN→kTOTAL bridge at event 931 in the baseline capture) reads + // stencil to mask sky vs. world, so a wrong stencil value discards every + // pixel and the BG never reaches kTOTAL. Fullscreen IS shaders and the + // menu UI quad don't actually depth-test, so nullptr DS is safe and + // sidesteps the stencil-content mismatch. Swap pattern matches UIPass- + // Dispatch_Hook (all 8 view slots). 
+ for (int i = 0; i < 8; ++i) { + autoSwapSavedViews[i] = bound.views[i]; + if (bound.views[i]) + bound.views[i] = nullptr; + } + for (int i = 0; i < 8; ++i) { + autoSwapSavedReadOnlyViews[i] = bound.readOnlyViews[i]; + if (bound.readOnlyViews[i]) + bound.readOnlyViews[i] = nullptr; + } + autoSwapDSIdx = dsIdx; + return true; +} + +void PerfMode::RestoreSwappedDS() +{ + if (autoSwapDSIdx == UINT32_MAX) + return; + auto renderer = globals::game::renderer; + auto& dsData = renderer->GetDepthStencilData(); + auto& bound = dsData.depthStencils[autoSwapDSIdx]; + for (int i = 0; i < 8; ++i) + bound.views[i] = autoSwapSavedViews[i]; + for (int i = 0; i < 8; ++i) + bound.readOnlyViews[i] = autoSwapSavedReadOnlyViews[i]; + autoSwapDSIdx = UINT32_MAX; +} + +// ============================================================================ +// CreateRenderTarget enlarge — install + per-site thunks +// ============================================================================ +// Three specific call sites inside BSShaderRenderTargets::Create produce the +// displayRes-enlarged RTs (kMENUBG, kIMAGESPACE_TEMP_COPY, kTOTAL). Offsets +// identified in Ghidra (see CreateRT_k* labels inside the renamed +// BSShaderRenderTargets__Create function in SkyrimVR.exe). VR-only. + +// ============================================================================ +// ID3D11DeviceContext_Draw_Hook (vtable index 13) +// ============================================================================ +// Engine fade-overlay Draw(30) fires after the Post chain and before Submit. +// Under PerfMode the draw's VP is computed at renderRes while the RT (kTOTAL) +// is displayRes — partial-screen "black stamp" without this swap. Gate on +// VertexCount==30 + isVR keeps the cost a single comparison on flat / non- +// fade draws. 
+void PerfMode::ID3D11DeviceContext_Draw_Hook::thunk(ID3D11DeviceContext* This, UINT VertexCount, UINT StartVertexLocation) +{ + if (VertexCount == 30 && globals::game::isVR) { + auto& perfMode = globals::features::upscaling.perfMode; + if (perfMode.IsHookActive() && perfMode.IsPostChainDone()) { + UINT numVP = D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE; + D3D11_VIEWPORT savedVP[D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE] = {}; + This->RSGetViewports(&numVP, savedVP); + + D3D11_VIEWPORT vp{}; + vp.Width = static_cast<float>(perfMode.GetDisplayEyeWidth() * 2); + vp.Height = static_cast<float>(perfMode.GetDisplayEyeHeight()); + vp.MinDepth = 0.0f; + vp.MaxDepth = 1.0f; + This->RSSetViewports(1, &vp); + + func(This, VertexCount, StartVertexLocation); + + if (numVP > 0) + This->RSSetViewports(numVP, savedVP); + return; + } + } + func(This, VertexCount, StartVertexLocation); +} + +void PerfMode::InstallFadeOverlayHook(ID3D11DeviceContext* context) +{ + if (!globals::game::isVR || !context) + return; + stl::detour_vfunc<13, ID3D11DeviceContext_Draw_Hook>(context); +} + +void PerfMode::InstallCreateRTThunks() +{ + if (!REL::Module::IsVR()) + return; + auto vrBase = REL::RelocationID(100458, 107175).address(); + stl::write_thunk_call(vrBase + 0x6cc); + stl::write_thunk_call(vrBase + 0x7a3); + stl::write_thunk_call(vrBase + 0x1547); +} + +void PerfMode::BeginCreateRTEnlarge() +{ + if (!hookActive) + return; + enlargeWidth = displayEyeWidth * 2; + enlargeHeight = displayEyeHeight; + enlargeActive = true; +} + +void PerfMode::EndCreateRTEnlarge() +{ + enlargeActive = false; +} + +namespace +{ + void EnlargeProps(RE::BSGraphics::RenderTargetProperties* a_props) + { + auto& dp = globals::features::upscaling.perfMode; + if (!dp.IsCreateRTEnlargeActive()) + return; + a_props->width = dp.GetEnlargeWidth(); + a_props->height = dp.GetEnlargeHeight(); + } +} + +void PerfMode::CreateRT_MenuBG_Hook::thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET 
a_target, RE::BSGraphics::RenderTargetProperties* a_properties) +{ + EnlargeProps(a_properties); + func(a_this, a_target, a_properties); +} + +void PerfMode::CreateRT_ImagespaceTempCopy_Hook::thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET a_target, RE::BSGraphics::RenderTargetProperties* a_properties) +{ + EnlargeProps(a_properties); + func(a_this, a_target, a_properties); +} + +void PerfMode::CreateRT_Total_Hook::thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET a_target, RE::BSGraphics::RenderTargetProperties* a_properties) +{ + EnlargeProps(a_properties); + func(a_this, a_target, a_properties); +} + +void PerfMode::DrawSettings() +{ + // PerfMode has no user-facing settings of its own — enablement is gated + // at install time by whether the BSShaderRenderTargets::Create hook ran + // successfully. A future PR may surface diagnostic info here. +} diff --git a/src/Features/Upscaling/PerfMode.h b/src/Features/Upscaling/PerfMode.h new file mode 100644 index 0000000000..0905d61b71 --- /dev/null +++ b/src/Features/Upscaling/PerfMode.h @@ -0,0 +1,372 @@ +#pragma once + +// ============================================================================ +// PerfMode — render-target size hook + post-processing interception +// ============================================================================ +// +// Opt-in VR upscaling feature. Hooks BSOpenVR::GetRenderTargetSize so all +// engine render targets are allocated at a small RenderRes while DLSS writes +// its output to a private DisplayRes testTexture. Ships standalone. +// +// Benefits: +// - VRAM and bandwidth savings proportional to the quality-mode scale ratio. +// - UpscaleRT is no longer needed. +// - Game menus are no longer occluded by the upscaler output. +// +// Current limitations: +// - Post-processing still runs on renderRes kMAIN via a 3x3-box downscale +// of testTexture (see BoxDownscalePS.hlsl). Performance is good and +// visual loss is minimal. 
Once the post chain is rewritten to consume +// testTexture natively the downscale can be removed. +// - Main menu / pause backgrounds render through a path that doesn't pass +// through Main_PostProcessing. We bridge them via ISCopyRender_Hook + +// MaybeBlitMenuBG: ISCopy's destination viewport is stretched to the +// full dest dims so the source covers the panel, and MaybeBlitMenuBG +// drives a one-shot Upscaling::Upscale() + MenuBGBlitPS into the bound +// menu RT so the BG sees a DLSS-reconstructed image instead of a raw +// renderRes stretch. Both paths are one-shot per frame. +// +// ============================================================================ + +#include +#include +#include +#include + +struct PerfMode +{ + void SetupResources(); + void DrawSettings(); + + // Phase 1: standalone test texture that receives Upscaling output instead of kMAIN. + // Returns nullptr when not ready. + ID3D11Texture2D* GetTestTexture() const { return testTexture.get(); } + ID3D11ShaderResourceView* GetTestTextureSRV() const { return testTextureSRV.get(); } + ID3D11UnorderedAccessView* GetTestTextureUAV() const { return testTextureUAV.get(); } + ID3D11Texture2D* GetRefraTempTex() const { return refraTempTex.get(); } + ID3D11ShaderResourceView* GetRefraTempSRV() const { return refraTempSRV.get(); } + + // Phase 2: resolution hook status + bool IsHookActive() const { return hookActive; } + bool IsPostInterceptActive() const { return postInterceptActive; } + bool IsPostChainDone() const { return postChainDone; } + void ClearPostChainDone() { postChainDone = false; } + uint32_t GetDisplayEyeWidth() const { return displayEyeWidth; } + uint32_t GetDisplayEyeHeight() const { return displayEyeHeight; } + uint32_t GetRenderEyeWidth() const { return renderEyeWidth; } + uint32_t GetRenderEyeHeight() const { return renderEyeHeight; } + + // Phase 3: real HMD display resolution in SBS format (e.g. 
3072×1632) + // Used by Upscaling pipeline to override polluted screenSize (which equals RenderRes after hook) + float2 GetDisplayScreenSize() const + { + return { static_cast<float>(displayEyeWidth * 2), static_cast<float>(displayEyeHeight) }; + } + + // Phase 2: called from BSShaderRenderTargets_Create::thunk (before func()) + // where BSOpenVR is guaranteed to be available + void InstallRenderTargetSizeHook(); + + // Hybrid Post: tonemap interception via IS shader hooks + // Call BeginPostIntercept() before func(), EndPostIntercept() after. + void BeginPostIntercept(); + void EndPostIntercept(); + + // Downscale testTexture (3k AA'd DLSS output) → kMAIN (1k) + // so the HDR pyramid builds from anti-aliased content instead of raw 1k render. + // Only kMAIN: no-refra reads kMAIN directly; with-refra engine copies kMAIN→kMAIN_COPY. + void DownscaleToKMain(); + + // Post hybrid entry point: called from Upscaling's Main_PostProcessing::thunk. + // Wraps the engine Post chain with PerfMode's two-layer struct swap. + // Keyed on postPipelineReady (set at the end of SetupResources) so a + // partial-init state can't slip past the gate into a null deref. The + // runtime upscaler-method gate is enforced separately by callers (the + // engine's BSOpenVR hook is install-time, so a mid-session DLSS→FSR swap + // would leave hookActive=true but testTexture stale — see Upscaling.cpp). + bool ShouldHandlePost() const { return postPipelineReady; } + void HandlePostProcessing(const std::function& enginePost); + + // Fake 3k DepthStencil for Post pass DS swap + ID3D11DepthStencilView* GetFakeDSV() const { return fakeDSV.get(); } + + // Bridge the DLSS-reconstructed menu BG (testTexture, displayRes) into + // the bound enlarged RT so OpenVR submit sees both BG + UI compositor + // output. One-shot per frame; gated via blittedFrameId. 
+ void MaybeBlitMenuBG(uint32_t boundRTIdx); + + // Generic DS swap for draws binding an enlarged RT against kMAIN/kMAIN + // _COPY DS — without this the rasterizer clips to the smaller DS and + // fills only the renderRes corner of the enlarged RT. Skipped when + // postInterceptActive (HandlePostProcessing already redirects DS). + bool MaybeSwapDSForEnlargedRT(); + void RestoreSwappedDS(); + + // Install the 3 named per-site CreateRenderTarget thunks (kMENUBG, + // kIMAGESPACE_TEMP_COPY, kTOTAL — VR-only) at startup. Called from + // Hooks::Install in the BSShaderRenderTargets::Create install sequence. + // Thunks are no-ops outside the BeginEnlarge/EndEnlarge window so flat + // Skyrim is unaffected. + void InstallCreateRTThunks(); + + // Install the Draw vfunc detour (D3D11DeviceContext vtable index 13) + // that fixes the scene-fade overlay viewport. Called from Globals:: + // InstallD3DHooks. VR-only; thunk early-outs unless VertexCount==30 + // and the hook is live, so cost is one comparison per Draw call when + // PerfMode isn't active. + void InstallFadeOverlayHook(ID3D11DeviceContext* context); + + // Enlarge window — set true around the engine's BSShaderRenderTargets:: + // Create call from Hooks.cpp's wrapper. The 3 installed thunks read + // enlargeActive/Width/Height directly. + void BeginCreateRTEnlarge(); + void EndCreateRTEnlarge(); + bool IsCreateRTEnlargeActive() const { return enlargeActive; } + uint32_t GetEnlargeWidth() const { return enlargeWidth; } + uint32_t GetEnlargeHeight() const { return enlargeHeight; } + +private: + // RAII snapshot of the D3D11 pipeline state our fullscreen passes + // (DownscaleToKMain, MaybeBlitMenuBG) overwrite: OM RT/DS, viewport array, + // blend, depth-stencil, VS/PS/GS/HS/DS/RS, PS sampler+SRV slot 0, IA + // layout/VB/IB/topology. Captured on ctor, restored + Released on dtor. + // We're sandwiched between engine passes that don't fully rebind, so any + // leaked binding shows up as corruption downstream. 
+ struct FullscreenPassScope + { + explicit FullscreenPassScope(ID3D11DeviceContext* a_context); + ~FullscreenPassScope(); + FullscreenPassScope(const FullscreenPassScope&) = delete; + FullscreenPassScope& operator=(const FullscreenPassScope&) = delete; + + private: + ID3D11DeviceContext* ctx = nullptr; + ID3D11RenderTargetView* savedRTV = nullptr; + ID3D11DepthStencilView* savedDSV = nullptr; + UINT numVP = D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE; + D3D11_VIEWPORT savedVP[D3D11_VIEWPORT_AND_SCISSORRECT_OBJECT_COUNT_PER_PIPELINE] = {}; + ID3D11BlendState* savedBlend = nullptr; + FLOAT savedBlendFactor[4] = {}; + UINT savedSampleMask = 0; + ID3D11DepthStencilState* savedDSState = nullptr; + UINT savedStencilRef = 0; + ID3D11VertexShader* savedVS = nullptr; + ID3D11PixelShader* savedPS = nullptr; + ID3D11GeometryShader* savedGS = nullptr; + ID3D11HullShader* savedHS = nullptr; + ID3D11DomainShader* savedDS = nullptr; + ID3D11RasterizerState* savedRS = nullptr; + ID3D11SamplerState* savedSampler0 = nullptr; + ID3D11ShaderResourceView* savedSRV0 = nullptr; + ID3D11InputLayout* savedIL = nullptr; + ID3D11Buffer* savedVB[D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT] = {}; + UINT savedVBStride[D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT] = {}; + UINT savedVBOffset[D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT] = {}; + ID3D11Buffer* savedIB = nullptr; + DXGI_FORMAT savedIBFormat = DXGI_FORMAT_UNKNOWN; + UINT savedIBOffset = 0; + D3D11_PRIMITIVE_TOPOLOGY savedTopology = D3D11_PRIMITIVE_TOPOLOGY_UNDEFINED; + }; + + // Phase 1 + winrt::com_ptr testTexture; + winrt::com_ptr testTextureSRV; + winrt::com_ptr testTextureUAV; + + // Phase 2: resolution hook state + bool hookActive = false; + + // Set at the end of SetupResources after every critical Post resource + // (textures, views, fake DS, downscale shaders, sampler) successfully + // initialized. 
ShouldHandlePost() returns this — a partial-init state + // (e.g., refraTempTex OOM after testTexture succeeds) flips this to false + // and the engine Post chain runs unwrapped on the small kMAIN, which is + // visually degraded but stable. + bool postPipelineReady = false; + + // Post intercept phase flag: when true, VP post-correction is skipped + // so enlarged kTEMP/kTOTAL get correct 3k VP from engine. + bool postInterceptActive = false; + + // Post-chain-done flag: set true after EndPostIntercept, cleared at + // PlayerView end by PlayerViewRender_Hook. When true, UpdateViewPort + // hook expands VP to displayRes so draws after the Post chain + // (UI composition, scene fade, submit prep) use the correct + // display-res VP. + bool postChainDone = false; + + uint32_t displayEyeWidth = 0; + uint32_t displayEyeHeight = 0; + uint32_t renderEyeWidth = 0; + uint32_t renderEyeHeight = 0; + + // Phase 2: vtable hook for BSOpenVR::GetRenderTargetSize (vfunc 0x12) + struct GetRenderTargetSize_Hook + { + static void thunk(RE::BSOpenVR* a_this, uint32_t* a_width, uint32_t* a_height); + static inline REL::Relocation func; + }; + + // IS shader hook: ISHDRTonemapBlendCinematic (Render vfunc 0x1 on vtable[3]) + // Chains after FrameAnnotations (if active). Swaps kMAIN SRV + kMAIN DS before + // tonemap, restores after. + struct TonemapRender_Hook + { + static void thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param); + static inline REL::Relocation func; + }; + bool tonemapHookInstalled = false; + + // IS shader hook: ISRefraction (Render vfunc 0x1 on vtable[3]) + // Replay DrawIndexed: func() runs 1k refraction normally, then replays 3k draw with sticky D3D state. 
+ struct RefractionRender_Hook + { + static void thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param); + static inline REL::Relocation func; + }; + bool refractionHookInstalled = false; + + // IS shader hook: ISCopy (Render vfunc 0x1 on vtable[3]). + // The VR main menu / pause compositor uses a single ISCopy draw from kMAIN + // (RenderRes when PerfMode is active) into kPROJECTEDMENU (fixed 2048²) or + // kMENUBG (DisplayRes via enlargement). With a 1:1 viewport the small + // source gets stamped into the top-left of the larger dest — that's the + // "main menu looks downscaled" bug. Strategy: let func() draw normally, + // then if dest > VP, replay the draw with the viewport stretched to the + // dest's full dims so the sampler-clamped source is rescaled across the + // whole panel. ISCopy's PS/CB/IA stay sticky on the context after func(), + // so the replay only needs a VP change + a DrawIndexed. + struct ISCopyRender_Hook + { + static void thunk(void* imageSpaceShader, RE::BSTriShape* shape, RE::ImageSpaceEffectParam* param); + static inline REL::Relocation func; + }; + bool isCopyHookInstalled = false; + + // UI pass hook: FinishAccumulatingDispatch (vfunc 0x2A on BSShaderAccumulator) + // When renderMode==24 (UI pass), swaps kMAIN DS → fakeDS so 3k kMENUBG gets 3k depth. + struct UIPassDispatch_Hook + { + static void thunk(RE::BSGraphics::BSShaderAccumulator* shaderAccumulator, uint32_t renderFlags); + static inline REL::Relocation func; + }; + bool uiPassHookInstalled = false; + + // PlayerView end hook: Main_RenderPlayerView (REL 35560/36559) + // Clears postChainDone after the entire VR pipeline (World→Post→UI→Submit) + // so the pre-Present UI chain and the next frame use normal VP compression. 
+ struct PlayerViewRender_Hook + { + static void thunk(void* a1, bool a2, bool a3); + static inline REL::Relocation func; + }; + bool playerViewHookInstalled = false; + + // Chains via stl::detour_thunk on the same address Hooks.cpp + Terrain- + // Blending already detour. Wraps MaybeSwapDSForEnlargedRT around the + // engine's RT/DS flush; runs after the prior thunk so PerfMode's swap + // is the innermost wrap. + struct BSGraphics_SetDirtyStates_Hook + { + static void thunk(bool isCompute); + static inline REL::Relocation func; + }; + bool setDirtyStatesHookInstalled = false; + + // D3D11 Draw vfunc detour. Engine's scene-fade overlay is a Draw(30) + // that fires after the Post chain and before Submit. Under PerfMode + // the draw's VP/vertices are computed at renderRes while the RT + // (kTOTAL) is displayRes — produces a partial-screen "black stamp" + // without this swap. + struct ID3D11DeviceContext_Draw_Hook + { + static void thunk(ID3D11DeviceContext* This, UINT VertexCount, UINT StartVertexLocation); + static inline REL::Relocation func; + }; + + // Post-corrects the engine viewport whenever it differs from our + // enlarged RTs. Chains via stl::detour_thunk. + struct BSGraphics_Renderer_UpdateViewPort_Hook + { + static void thunk(RE::BSGraphics::Renderer* a_this, uint32_t a_width, uint32_t a_height, bool a_forceMatchRT); + static inline REL::Relocation func; + }; + bool updateViewPortHookInstalled = false; + + // Refraction: 3k temp texture (copy of testTexture) for ISRefraction input + winrt::com_ptr refraTempTex; + winrt::com_ptr refraTempSRV; + // Refraction: RTV for testTexture (ISRefraction 3k output target) + winrt::com_ptr testTextureRTV; + + // Two-layer swap: saved pointers for restore. + // Outer layer (BeginPostIntercept/EndPostIntercept): kMAIN_COPY DS views + // only — the engine writes the post chain's DS through kMAIN_COPY's + // depth slot, so we redirect it at the start/end of Post. 
+ // Inner layer (TonemapRender_Hook): kMAIN + kMAIN_COPY SRVs and kMAIN DS + // views — the tonemap shader reads from kMAIN SRV (and the refraction + // path reads from kMAIN_COPY SRV); both need to point at testTextureSRV + // so the tonemap consumes the AA'd 3k DLSS output instead of the small + // kMAIN. savedKMainCopySRV is captured/restored by the inner layer, not + // the outer one + ID3D11DepthStencilView* savedKMainCopyViews[8] = {}; + ID3D11DepthStencilView* savedKMainCopyReadOnlyViews[8] = {}; + ID3D11ShaderResourceView* savedKMainCopySRV = nullptr; + ID3D11DepthStencilView* savedKMainViews[8] = {}; + ID3D11DepthStencilView* savedKMainReadOnlyViews[8] = {}; + ID3D11ShaderResourceView* savedKMainSRV = nullptr; + + // Fake 3k DepthStencil (DisplayRes, same format as engine kMAIN DS) + winrt::com_ptr fakeDS; + winrt::com_ptr fakeDSV; + + // autoSwapDSIdx == UINT32_MAX → no active swap; otherwise it's the + // kMAIN/kMAIN_COPY slot whose views[] were rewritten and must be + // restored on the matching RestoreSwappedDS(). + ID3D11DepthStencilView* autoSwapSavedViews[8] = {}; + ID3D11DepthStencilView* autoSwapSavedReadOnlyViews[8] = {}; + uint32_t autoSwapDSIdx = UINT32_MAX; + + // Downscale pass: Box 3×3 downscale testTexture (3k) → kMAIN (1k). + // (Named "boxDownscale" — earlier revisions called this "bilinearCopy" + // when the implementation was a true bilinear sample. It became a 9-tap + // box during development; the rename happened pre-release.) + winrt::com_ptr boxDownscalePS; + winrt::com_ptr boxDownscaleVS; + winrt::com_ptr linearSampler; + + // Menu BG blit — fullscreen sample of testTexture into kTOTAL/kMENUBG + // with format conversion (R16G16B16A16_FLOAT → R8G8B8A8_UNORM via the + // RTV view). Reuses boxDownscaleVS + linearSampler. blittedFrameId is + // the per-frame one-shot guard, compared against state->frameCount + // (PlayerView doesn't fire in main menu, so a flag-clear hook wouldn't + // reliably reset across all states). 
+ winrt::com_ptr menuBlitPS; + uint32_t blittedFrameId = UINT32_MAX; + + // CreateRenderTarget enlarge window — see BeginCreateRTEnlarge. + bool enlargeActive = false; + uint32_t enlargeWidth = 0; + uint32_t enlargeHeight = 0; + + // Per-site CreateRenderTarget thunks. Each fires from a single + // installed call site within BSShaderRenderTargets::Create and overrides + // the RT's allocation dimensions when the enlarge window is active. + // Static `func` ptrs are per-struct so each chains the original + // CreateRenderTarget call independently. + struct CreateRT_MenuBG_Hook + { + static void thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET a_target, RE::BSGraphics::RenderTargetProperties* a_properties); + static inline REL::Relocation func; + }; + struct CreateRT_ImagespaceTempCopy_Hook + { + static void thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET a_target, RE::BSGraphics::RenderTargetProperties* a_properties); + static inline REL::Relocation func; + }; + struct CreateRT_Total_Hook + { + static void thunk(RE::BSGraphics::Renderer* a_this, RE::RENDER_TARGETS::RENDER_TARGET a_target, RE::BSGraphics::RenderTargetProperties* a_properties); + static inline REL::Relocation func; + }; +}; diff --git a/src/Features/Upscaling/Streamline.cpp b/src/Features/Upscaling/Streamline.cpp index d7517c679f..3fc429d2e3 100644 --- a/src/Features/Upscaling/Streamline.cpp +++ b/src/Features/Upscaling/Streamline.cpp @@ -11,6 +11,7 @@ #include "../../Util.h" #include "../Upscaling.h" #include "DX12SwapChain.h" +#include "PerfMode.h" namespace { @@ -416,12 +417,13 @@ bool Streamline::IsRTXAndBelow40Series(IDXGIAdapter* a_adapter) return false; } -void Streamline::SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width) +void Streamline::SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width, uint32_t height) { sl::DLSSOptions dlssOptions{}; - // Map quality mode to DLSS mode - uint32_t qualityMode = 
globals::features::upscaling.settings.qualityMode; + // Boot qualityMode under PerfMode — DLSS dispatch must match the + // renderRes the engine was sized for at install. + uint32_t qualityMode = globals::features::upscaling.perfMode.IsHookActive() ? globals::features::upscaling.bootSnapshot.Boot(&Upscaling::Settings::qualityMode) : globals::features::upscaling.settings.qualityMode; switch (qualityMode) { case 1: dlssOptions.mode = sl::DLSSMode::eMaxQuality; @@ -442,8 +444,18 @@ void Streamline::SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width) auto state = globals::state; + // PerfMode bridge: state->screenSize.y is polluted to RenderRes by the + // BSOpenVR size hook; use perfMode's snapshot of the real DisplayRes when + // the hook is live so DLSS is created at the right scale. The width arg + // is already display-correct (caller computes from displaySize). + auto& perfMode = globals::features::upscaling.perfMode; + const bool dlssperfActive = perfMode.IsHookActive() && perfMode.GetTestTexture(); + dlssOptions.outputWidth = width; - dlssOptions.outputHeight = (uint)state->screenSize.y; + // height==0 → caller is the standard upscale path; use full per-eye DisplayRes height. + // Non-zero is the FoveatedRender subrect height — must match extentOut.height or NGX + // produces zeroed output. See SetDLSSOptions decl in Streamline.h for the rationale. + dlssOptions.outputHeight = height != 0 ? height : (dlssperfActive ? 
(uint)perfMode.GetDisplayScreenSize().y : (uint)state->screenSize.y); // Detect HDR from kMAIN format at runtime -- VR kMAIN may be 8-bit while SE is FP16 { @@ -506,7 +518,8 @@ void Streamline::SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width) void Streamline::EvaluateDLSS(sl::ViewportHandle vp, uint32_t eyeIndex, ID3D11Resource* colorIn, ID3D11Resource* colorOut, ID3D11Resource* depth, ID3D11Resource* mvec, ID3D11Resource* reactiveMask, ID3D11Resource* transparencyMask, - const sl::Extent& extentIn, const sl::Extent& extentOut, uint32_t outputWidth) + const sl::Extent& extentIn, const sl::Extent& extentOut, uint32_t outputWidth, + uint32_t outputHeight) { auto context = globals::d3d::context; @@ -543,7 +556,7 @@ void Streamline::EvaluateDLSS(sl::ViewportHandle vp, uint32_t eyeIndex, } }; - SetDLSSOptions(vp, outputWidth); + SetDLSSOptions(vp, outputWidth, outputHeight); sl::ResourceTag tags[] = { { &colorInRes, sl::kBufferTypeScalingInputColor, sl::ResourceLifecycle::eOnlyValidNow, &extentIn }, @@ -597,11 +610,24 @@ void Streamline::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r auto screenSize = state->screenSize; auto renderSize = Util::ConvertToDynamic(screenSize); - // When RCAS sharpening is active, direct DLSS output to sharpenerTexture so RCAS can - // sharpen directly into kMAIN.UAV without a CopyResource round-trip. + // PerfMode bridge: when the BSOpenVR size hook is live, state->screenSize + // is polluted to RenderRes (the spoofed HMD recommended size). DLSS must + // be told the TRUE DisplayRes for its output extent, otherwise NGX rejects + // the evaluate as InvalidParameter (0xbad00005) because the configured + // quality-scale doesn't match the actual extent ratio. The upscale also + // has to write into perfMode's private DisplayRes testTexture instead of + // the now-RenderRes kMAIN. 
auto& upscaling = globals::features::upscaling; + auto& perfMode = globals::features::upscaling.perfMode; + const bool dlssperfActive = perfMode.IsHookActive() && perfMode.GetTestTexture(); + const auto displaySize = dlssperfActive ? perfMode.GetDisplayScreenSize() : screenSize; + + // When RCAS sharpening is active, direct DLSS output to sharpenerTexture so RCAS can + // sharpen directly into kMAIN.UAV without a CopyResource round-trip. PerfMode + // bypasses the sharpener entirely (writes DLSS output straight into testTexture). ID3D11Resource* colorOut = - (upscaling.settings.sharpnessDLSS > 0.0f && upscaling.sharpenerTexture) ? upscaling.sharpenerTexture->resource.get() : a_upscalingTexture; + dlssperfActive ? static_cast<ID3D11Resource*>(perfMode.GetTestTexture()) : + ((upscaling.settings.sharpnessDLSS > 0.0f && upscaling.sharpenerTexture) ? upscaling.sharpenerTexture->resource.get() : a_upscalingTexture); // VR stereo DLSS: NGX D3D11 only accepts zero-offset subrects. Non-zero offsets return // FAIL_InvalidParameter because Streamline's dlssEntry.cpp never sets @@ -617,8 +643,8 @@ void Streamline::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r if (globals::game::isVR) { auto context = globals::d3d::context; - uint32_t eyeWidthOut = (uint32_t)(screenSize.x / 2); - uint32_t eyeHeightOut = (uint32_t)screenSize.y; + uint32_t eyeWidthOut = (uint32_t)(displaySize.x / 2); + uint32_t eyeHeightOut = (uint32_t)displaySize.y; uint32_t eyeWidthIn = (uint32_t)(renderSize.x / 2); uint32_t eyeHeightIn = (uint32_t)renderSize.y; @@ -676,12 +702,12 @@ void Streamline::Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_r } else { // Non-VR: Simple full-texture upscale. 
sl::Extent extentIn{ 0, 0, (uint)renderSize.x, (uint)renderSize.y }; - sl::Extent extentOut{ 0, 0, (uint)screenSize.x, (uint)screenSize.y }; + sl::Extent extentOut{ 0, 0, (uint)displaySize.x, (uint)displaySize.y }; EvaluateDLSS(viewport, 0, a_upscalingTexture, colorOut, depthTexture.texture, a_motionVectors, a_reactiveMask, a_transparencyCompositionMask, - extentIn, extentOut, (uint)screenSize.x); + extentIn, extentOut, (uint)displaySize.x); } } diff --git a/src/Features/Upscaling/Streamline.h b/src/Features/Upscaling/Streamline.h index f173dd1bde..ea8088f7bf 100644 --- a/src/Features/Upscaling/Streamline.h +++ b/src/Features/Upscaling/Streamline.h @@ -28,7 +28,6 @@ class Streamline inline std::string GetShortName() { return "Streamline"; } - bool enabledAtBoot = false; bool initialized = false; bool triedInitialization = false; @@ -87,11 +86,17 @@ class Streamline ReflexOptionsCache reflexOptionsCache{}; uint32_t lastReflexSleepFrame = UINT32_MAX; - // Helper: Execute DLSS for a single viewport with given resources + // Helper: Execute DLSS for a single viewport with given resources. + // outputHeight defaults to 0 → SetDLSSOptions uses full per-eye DisplayRes height + // (matches the standard upscale path where every eval is full eye). FoveatedRender's + // subrect path must pass the actual subrect height so DLSS isn't configured for + // `subOutW × eyeHeightOut` while extentOut says `subOutW × subOutH` — that mismatch + // makes NGX return zeroed output and the subrect region renders black. 
void EvaluateDLSS(sl::ViewportHandle vp, uint32_t eyeIndex, ID3D11Resource* colorIn, ID3D11Resource* colorOut, ID3D11Resource* depth, ID3D11Resource* mvec, ID3D11Resource* reactiveMask, ID3D11Resource* transparencyMask, - const sl::Extent& extentIn, const sl::Extent& extentOut, uint32_t outputWidth); + const sl::Extent& extentIn, const sl::Extent& extentOut, uint32_t outputWidth, + uint32_t outputHeight = 0); // Cached DLL version info for Streamline plugin directory static std::vector> dllVersions; @@ -107,7 +112,9 @@ class Streamline bool IsRTXAndBelow40Series(IDXGIAdapter* a_adapter); - void SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width); + // height = 0 → use full per-eye DisplayRes height (default for the standard + // upscale path). Non-zero is the subrect height the FoveatedRender route needs. + void SetDLSSOptions(sl::ViewportHandle p_viewport, uint32_t width, uint32_t height = 0); void Upscale(ID3D11Resource* a_upscalingTexture, ID3D11Resource* a_reactiveMask, ID3D11Resource* a_transparencyCompositionMask, ID3D11Resource* a_motionVectors); void UpdateReflex(); diff --git a/src/Features/VR.cpp b/src/Features/VR.cpp index 6f0b676a5d..49a5f099f3 100644 --- a/src/Features/VR.cpp +++ b/src/Features/VR.cpp @@ -126,7 +126,7 @@ void VR::SetupResources() if (globals::state->IsDeveloperMode()) { logger::info("OpenVR not natively compatible, but developer mode is active - VR menus enabled"); } else { - logger::info("OpenVR version is incompatible. Community Shaders VR menus will be disabled for stability"); + logger::info("OpenVR version is incompatible. 
Open Shaders VR menus will be disabled for stability"); } } } else { @@ -136,6 +136,8 @@ void VR::SetupResources() void VR::PostPostLoad() { + stereoOpt.LatchBootSnapshot(); + gDepthBufferCulling = reinterpret_cast(REL::Offset(0x1EC6B88).address()); if (!gDepthBufferCulling) { static bool s_defaultDepthBufferCulling = false; @@ -292,3 +294,18 @@ void VR::Reset() { stereoOpt.Reset(); } + +float VR::GetHMDRefreshRate() const +{ + if (!globals::game::isVR) + return 0.0f; + auto* openvr = RE::BSOpenVR::GetSingleton(); + if (!openvr || !openvr->vrSystem) + return 0.0f; + vr::ETrackedPropertyError err = vr::TrackedProp_Success; + float hz = openvr->vrSystem->GetFloatTrackedDeviceProperty( + vr::k_unTrackedDeviceIndex_Hmd, + vr::Prop_DisplayFrequency_Float, + &err); + return (err == vr::TrackedProp_Success && hz > 1.0f) ? hz : 0.0f; +} diff --git a/src/Features/VR.h b/src/Features/VR.h index 459ca8287b..e910a38db8 100644 --- a/src/Features/VR.h +++ b/src/Features/VR.h @@ -31,7 +31,7 @@ using ButtonCombo = InputCombo; * - Drag-and-drop overlay repositioning * * The VR class follows the singleton pattern and integrates with the OpenVR API - * to provide seamless VR experience within the Community Shaders framework. + * to provide seamless VR experience within the Open Shaders framework. 
* * @example * ```cpp @@ -100,7 +100,7 @@ struct VR : OverlayFeature virtual std::pair> GetFeatureSummary() override { return { - "Provides VR-specific optimizations and enhancements for Community Shaders, improving performance and visual quality in virtual reality environments.", + "Provides VR-specific optimizations and enhancements for Open Shaders, improving performance and visual quality in virtual reality environments.", { "Depth buffer culling optimization for VR performance", "In-scene overlay menu with HMD/Controller/Fixed World attach modes", "VR controller input with customizable button mappings", @@ -548,6 +548,10 @@ struct VR : OverlayFeature void DetectOpenVRInfo(); bool IsOpenVRCompatible() const; + /// Returns the HMD display refresh rate in Hz, or 0.0 if unavailable. + /// Queries IVRSystem via the game's already-loaded OpenVR DLL — no extra linking required. + float GetHMDRefreshRate() const; + private: //============================================================================= // PRIVATE HELPERS diff --git a/src/Features/VR/Input.cpp b/src/Features/VR/Input.cpp index b4c2278115..43e7528ee4 100644 --- a/src/Features/VR/Input.cpp +++ b/src/Features/VR/Input.cpp @@ -74,13 +74,13 @@ void VR::UpdateOverlayMenuStateFromInput() }; std::vector mappings = { - // Open Community Shaders menu when closed + // Open Open Shaders menu when closed { [&]() { return CheckCombo(settings.VRMenuOpenKeys) && !isEnabled; }, [&]() { isEnabled = true; } }, - // Close Community Shaders menu when open + // Close Open Shaders menu when open { [&]() { return CheckCombo(settings.VRMenuCloseKeys) && isEnabled; }, diff --git a/src/Features/VR/SettingsUI.cpp b/src/Features/VR/SettingsUI.cpp index 08819d2084..b237379377 100644 --- a/src/Features/VR/SettingsUI.cpp +++ b/src/Features/VR/SettingsUI.cpp @@ -109,7 +109,7 @@ void VR::DrawOverlay() ImGui::Begin("HowToUseOverlay", nullptr, ImGuiWindowFlags_NoTitleBar | ImGuiWindowFlags_AlwaysAutoResize | 
ImGuiWindowFlags_NoSavedSettings | ImGuiWindowFlags_NoFocusOnAppearing | ImGuiWindowFlags_NoNav);
     ImGui::PushTextWrapPos(ImGui::GetCursorPos().x + 500.0f * scale);
-    ImGui::TextWrapped("How to Use VR Community Shaders Menu:");
+    ImGui::TextWrapped("How to Use VR Open Shaders Menu:");
     ImGui::Separator();
     ImGui::TextWrapped("You must open the Main Menu or Tween Menu before VR controls work.");
     ImGui::Spacing();
@@ -158,12 +158,12 @@ namespace
     if (ImGui::BeginTable("MenuInstructionsTable", 2, ImGuiTableFlags_Borders | ImGuiTableFlags_RowBg)) {
         ImGui::TableNextRow();
         ImGui::TableSetColumnIndex(0);
-        ImGui::Text("Open Community Shaders Menu:");
+        ImGui::Text("Open the Open Shaders Menu:");
         ImGui::TableSetColumnIndex(1);
         Util::DrawButtonCombo(settings.VRMenuOpenKeys, true);
         ImGui::TableNextRow();
         ImGui::TableSetColumnIndex(0);
-        ImGui::Text("Close Community Shaders Menu:");
+        ImGui::Text("Close the Open Shaders Menu:");
         ImGui::TableSetColumnIndex(1);
         Util::DrawButtonCombo(settings.VRMenuCloseKeys, true);
         ImGui::EndTable();
@@ -530,8 +530,8 @@ namespace
     }
     ImGui::Separator();
     const char* comboTypes[] = {
-        "Open Community Shaders Menu",
-        "Close Community Shaders Menu",
+        "Open the Open Shaders Menu",
+        "Close the Open Shaders Menu",
         "Open VR Overlay",
         "Close VR Overlay"
     };
@@ -587,8 +587,8 @@ namespace
         const char* controllerRequirement;
     };
     std::vector keyBindingConfigs = {
-        { "Open Community Shaders Menu", settings.VRMenuOpenKeys, "Button combination to open the Community Shaders menu", "Primary" },
-        { "Close Community Shaders Menu", settings.VRMenuCloseKeys, "Button combination to close the Community Shaders menu", "Both" },
+        { "Open the Open Shaders Menu", settings.VRMenuOpenKeys, "Button combination to open the Open Shaders menu", "Primary" },
+        { "Close the Open Shaders Menu", settings.VRMenuCloseKeys, "Button combination to close the Open Shaders menu", "Both" },
         { "Open VR Overlay", settings.VROverlayOpenKeys, "Button combination to open the VR overlay", "Primary" },
         { "Close VR Overlay", settings.VROverlayCloseKeys, "Button combination to close the VR overlay", "Secondary" }
     };
diff --git a/src/Features/VRStereoOptimizations.cpp b/src/Features/VRStereoOptimizations.cpp
index 63d5da8942..8f230c593b 100644
--- a/src/Features/VRStereoOptimizations.cpp
+++ b/src/Features/VRStereoOptimizations.cpp
@@ -259,11 +259,8 @@ void VRStereoOptimizations::DrawSettings()
     settings.stereoMode = static_cast(currentMode);
     Util::AddTooltip("Reprojects Eye 0 (left) pixels into Eye 1 (right) using depth and motion data,\nskipping redundant full shading where the views overlap.\nReduces GPU cost in VR by shading each pixel fewer times per frame.");
-    if (globals::game::isVR && settings.stereoMode == StereoMode::Enable && !loaded) {
-        const auto& themeSettings = Menu::GetSingleton()->GetTheme();
-        ImGui::TextColored(themeSettings.StatusPalette.RestartNeeded,
-            "Restart is required to enable VR stereo reprojection.");
-    }
+    if (globals::game::isVR)
+        Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::stereoMode);
     if (settings.stereoMode == StereoMode::Off)
         return;
diff --git a/src/Features/VRStereoOptimizations.h b/src/Features/VRStereoOptimizations.h
index 4f324395ce..d3b564eef3 100644
--- a/src/Features/VRStereoOptimizations.h
+++ b/src/Features/VRStereoOptimizations.h
@@ -3,6 +3,7 @@
 #include
 using json = nlohmann::json;
+#include "Utils/BootSnapshot.h"
 #include
 #include
 #include
@@ -95,6 +96,16 @@ struct VRStereoOptimizations
     } settings;
+    // stereoMode is restart-gated: the stencil/CS resources are only set up
+    // when `loaded` is true at boot, and toggling mid-session can't install
+    // them. Latched from VR::PostPostLoad.
+    inline static constexpr Util::Settings::RestartTable kRestartFields{ {
+        UTIL_RESTART_FIELD(Settings, stereoMode, "VR Stereo Reprojection"),
+    } };
+    Util::Settings::BootSnapshot bootSnapshot{ kRestartFields };
+
+    void LatchBootSnapshot() { bootSnapshot.LatchIfNeeded(settings); }
+
     //=============================================================================
     // GPU CONSTANT BUFFER (must match HLSL cbuffer layout exactly)
     //=============================================================================
diff --git a/src/Features/VolumetricLighting.cpp b/src/Features/VolumetricLighting.cpp
index d52d725ee3..c7f9739194 100644
--- a/src/Features/VolumetricLighting.cpp
+++ b/src/Features/VolumetricLighting.cpp
@@ -3,6 +3,7 @@
 #include "InteriorSun.h"
 #include "ShaderCache.h"
 #include "State.h"
+#include "Utils/UI.h"
 NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE_WITH_DEFAULT(
     VolumetricLighting::TextureSize,
@@ -23,12 +24,16 @@ void VolumetricLighting::DrawSettings()
 {
     if (ImGui::Checkbox("Enable Volumetric Lighting in Exteriors", &settings.ExteriorEnabled))
         SetupVL();
+    if (globals::game::isVR)
+        Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::ExteriorEnabled);
     if (settings.ExteriorEnabled)
         DrawVolumetricLightingSettings(settings.ExteriorQuality, settings.ExteriorCustomSize, false, !inInterior);
     if (ImGui::Checkbox("Enable Volumetric Lighting in Interiors", &settings.InteriorEnabled))
         SetupVL();
+    if (globals::game::isVR)
+        Util::UI::DrawSettingDiff(bootSnapshot, settings, &Settings::InteriorEnabled);
     if (settings.InteriorEnabled)
         DrawVolumetricLightingSettings(settings.InteriorQuality, settings.InteriorCustomSize, true, inInterior);
@@ -147,7 +152,7 @@ void VolumetricLighting::DataLoaded()
     const static auto address = REL::Offset{ 0x1ec6b88 }.address();
     bool& bDepthBufferCulling = *reinterpret_cast(address);
-    if (REL::Module::IsVR() && bDepthBufferCulling && shaderCache->IsDiskCache()) {
+    if (globals::game::isVR && bDepthBufferCulling &&
shaderCache->IsDiskCache()) {
         // clear cache to fix bug caused by bDepthBufferCulling
         logger::info("Force clearing cache due to bDepthBufferCulling");
         shaderCache->Clear();
@@ -156,7 +161,8 @@ void VolumetricLighting::DataLoaded()
 void VolumetricLighting::PostPostLoad()
 {
-    if (REL::Module::IsVR()) {
+    bootSnapshot.LatchIfNeeded(settings);
+    if (globals::game::isVR) {
         if (settings.ExteriorEnabled || settings.InteriorEnabled)
             EnableBooleanSettings(hiddenVRSettings, GetName());
         auto address = REL::RelocationID(100475, 0).address() + 0x45b;  // AE not needed, VR only hook
@@ -337,4 +343,4 @@ void VolumetricLighting::CopyResource::thunk(ID3D11DeviceContext* a_this, ID3D11
     if (!(Util::IsDynamicResolution() && singleton.bEnableVolumetricLighting)) {
         a_this->CopyResource(a_renderTarget, a_renderTargetSource);
     }
-}
\ No newline at end of file
+}
diff --git a/src/Features/VolumetricLighting.h b/src/Features/VolumetricLighting.h
index e5251c52ef..4c9c7bbc15 100644
--- a/src/Features/VolumetricLighting.h
+++ b/src/Features/VolumetricLighting.h
@@ -1,5 +1,7 @@
 #pragma once
+#include "Utils/BootSnapshot.h"
+
 struct VolumetricLighting : Feature
 {
 public:
@@ -22,7 +24,20 @@ struct VolumetricLighting : Feature
     Settings settings;
-    bool enabledAtBoot = false;
+    inline static constexpr Util::Settings::RestartTable kRestartFields{ {
+        UTIL_RESTART_FIELD(Settings, ExteriorEnabled, "Volumetric Lighting (Exterior)"),
+        UTIL_RESTART_FIELD(Settings, InteriorEnabled, "Volumetric Lighting (Interior)"),
+    } };
+    Util::Settings::BootSnapshot bootSnapshot{ kRestartFields };
+
+    std::span GetRestartRequiredFields() const override
+    {
+        // VR-only: enabling VL relies on startup-only game setting initialization.
+        return globals::game::isVR ?
std::span{ kRestartFields.data(), kRestartFields.size() } : std::span{};
+    }
+    const void* GetBootValue(std::string_view jsonKey) const override { return bootSnapshot.RawBoot(jsonKey); }
+    const void* GetSettingsBlob() const override { return &settings; }
+    size_t GetSettingsBlobSize() const override { return sizeof(settings); }
     virtual inline std::string GetName() override { return "Volumetric Lighting"; }
     virtual inline std::string GetShortName() override { return "VolumetricLighting"; }
diff --git a/src/Features/WeatherEditor.cpp b/src/Features/WeatherEditor.cpp
index 8f6ff39e43..2272b870a6 100644
--- a/src/Features/WeatherEditor.cpp
+++ b/src/Features/WeatherEditor.cpp
@@ -559,7 +559,6 @@ void WeatherEditor::DisplayWindInfo(RE::TESWeather* weather)
     auto sky = globals::game::sky;
     if (!weather || (weather->data.windSpeed <= 0 && (!sky || sky->windSpeed <= 0.0f)))
         return;
-    const auto& theme = Menu::GetSingleton()->GetTheme();
     float windSpeedDisplay = weather->data.windSpeed / 255.0f;
     ImGui::BulletText("Weather Wind Speed: %.2f (raw %d)", windSpeedDisplay, weather->data.windSpeed);
     if (auto _tt = Util::HoverTooltipWrapper()) {
@@ -603,7 +602,7 @@ void WeatherEditor::DisplayWindInfo(RE::TESWeather* weather)
             windRelation = "Left crosswind";
         }
         ImGui::SameLine();
-        ImGui::TextColored(theme.StatusPalette.RestartNeeded, "(%s)", windRelation);
+        Util::Text::RestartNeeded("(%s)", windRelation);
         if (auto _tt = Util::HoverTooltipWrapper()) {
             Util::DrawMultiLineTooltip({
                 "Wind relative to player direction:",
diff --git a/src/Globals.cpp b/src/Globals.cpp
index 850d0f9dce..b62d509246 100644
--- a/src/Globals.cpp
+++ b/src/Globals.cpp
@@ -17,6 +17,7 @@
 #include "Features/LightLimitFix.h"
 #include "Features/LinearLighting.h"
 #include "Features/PerformanceOverlay.h"
+#include "Features/RemoteControl.h"
 #include "Features/RenderDoc.h"
 #include "Features/ScreenSpaceGI.h"
 #include "Features/ScreenSpaceShadows.h"
@@ -85,6 +86,7 @@ namespace globals
         Upscaling upscaling{};
         HDRDisplay
hdrDisplay{};
         RenderDoc renderDoc{};
+        RemoteControl remoteControl{};
         ScreenshotFeature screenshotFeature{};
         WeatherEditor weatherEditor{};
         ExponentialHeightFog exponentialHeightFog{};
@@ -92,6 +94,8 @@ namespace globals
         namespace llf
         {
+            void** normalDepthBuffer = nullptr;
+            void** readOnlyDepthBuffer = nullptr;
         }
     }
@@ -131,6 +135,11 @@ namespace globals
         D3D11_MAPPED_SUBRESOURCE* mappedFrameBuffer = nullptr;
         FrameBufferCache frameBufferCached{};
+
+        int32_t* frameCounter = nullptr;
+        int* viewWidth = nullptr;
+        int* viewHeight = nullptr;
+        bool* drawStereo = nullptr;
     }
     static void RefreshTES()
@@ -192,6 +201,12 @@ namespace globals
             perFrame = { REL::RelocationID(524768, 411384) };
             currentAccumulator = { REL::RelocationID(527650, 414600) };
+
+            frameCounter = reinterpret_cast(REL::RelocationID(525008, 411489).address());
+            viewWidth = reinterpret_cast(REL::RelocationID(524978, 411459).address());
+            viewHeight = reinterpret_cast(REL::RelocationID(524979, 411460).address());
+            if (REL::Module::IsVR())
+                drawStereo = reinterpret_cast(REL::RelocationID(524907, 411393).address() + sizeof(void*));
         }
         {
@@ -402,5 +417,16 @@ namespace globals
             stl::detour_vfunc<36, ID3D11DeviceContext_OMSetDepthStencilState>(a_context);
             stl::detour_vfunc<53, ID3D11DeviceContext_ClearDepthStencilView>(a_context);
         }
+
+        // Scene-fade overlay Draw(30) detour — only useful when PerfMode will
+        // actually go active. The hook's thunk already early-outs on
+        // VertexCount != 30 || !hookActive, but skipping the vtable patch
+        // entirely when the user has PerfMode off avoids a foreign-interop
+        // surface other context-vfunc hookers could trip on. Gated by the
+        // persisted intent, not IsHookActive(), because the hook is installed
+        // here at D3D init time while IsHookActive() only flips true later
+        // inside BSShaderRenderTargets::Create.
+        if (globals::game::isVR && globals::features::upscaling.settings.renderAtUpscaleRes)
+            globals::features::upscaling.perfMode.InstallFadeOverlayHook(a_context);
     }
 }
diff --git a/src/Globals.h b/src/Globals.h
index a5e262a768..a2ca155e0b 100644
--- a/src/Globals.h
+++ b/src/Globals.h
@@ -41,6 +41,7 @@ class State;
 class Deferred;
 struct TruePBR;
 class RenderDoc;
+class RemoteControl;
 class Menu;
 namespace SIE
@@ -92,6 +93,7 @@ namespace globals
         extern Upscaling upscaling;
         extern HDRDisplay hdrDisplay;
         extern RenderDoc renderDoc;
+        extern RemoteControl remoteControl;
         extern ScreenshotFeature screenshotFeature;
         extern WeatherEditor weatherEditor;
         extern ExponentialHeightFog exponentialHeightFog;
@@ -99,6 +101,8 @@ namespace globals
         namespace llf
         {
+            extern void** normalDepthBuffer;
+            extern void** readOnlyDepthBuffer;
         }
     }
@@ -244,6 +248,11 @@ namespace globals
         extern D3D11_MAPPED_SUBRESOURCE* mappedFrameBuffer;
         extern FrameBufferCache frameBufferCached;
+
+        extern int32_t* frameCounter;
+        extern int* viewWidth;
+        extern int* viewHeight;
+        extern bool* drawStereo;
     }
     namespace rtti
diff --git a/src/Hooks.cpp b/src/Hooks.cpp
index afe2d4cda7..35963c05f5 100644
--- a/src/Hooks.cpp
+++ b/src/Hooks.cpp
@@ -14,6 +14,7 @@
 #include "Features/InteriorSun.h"
 #include "Features/LightLimitFix.h"
 #include "Features/Upscaling.h"
+#include "Features/Upscaling/FoveatedRender/Bridge.h"
 #include "Features/VR.h"
 #include "Features/VolumetricLighting.h"
@@ -369,9 +370,55 @@ struct BSShaderRenderTargets_Create
     static void thunk()
     {
         Util::SetGameSettingValue("iNumFocusShadow:Display", iNumFocusShadow, 0);
+
+        // Restart-required settings snapshot. Latch once as soon as engine
+        // rendering state begins initializing (pre-RT allocation) so UI/MCP
+        // can diff "active at boot" vs "selected".
+        globals::features::upscaling.bootSnapshot.LatchIfNeeded(globals::features::upscaling.settings);
+
+        // PerfMode: install the BSOpenVR render-target-size hook before the
+        // engine creates its render targets. This is the only place where
+        // BSOpenVR is guaranteed available AND we can still influence RT
+        // allocation. Gated on user opt-in via Upscaling::Settings AND on
+        // the resolved upscale path being one that can write its output to
+        // a separate displayRes target (DLSS via Streamline, FSR via
+        // FidelityFX). TAA/NONE have no upscale output to redirect, and a
+        // stale config can leave renderAtUpscaleRes=true after the user
+        // switched methods or after DLSS becomes unsupported on this GPU.
+        const auto resolvedUpscaleMethod = globals::features::upscaling.GetUpscaleMethod();
+        const bool methodSupportsPerfMode =
+            resolvedUpscaleMethod == Upscaling::UpscaleMethod::kDLSS ||
+            resolvedUpscaleMethod == Upscaling::UpscaleMethod::kFSR;
+        const bool dlssperfShouldRun =
+            globals::game::isVR &&
+            globals::features::upscaling.settings.renderAtUpscaleRes &&
+            methodSupportsPerfMode;
+
+        if (dlssperfShouldRun) {
+            globals::features::upscaling.perfMode.InstallRenderTargetSizeHook();
+        }
+
+        // Open PerfMode's enlarge window across the engine's Create() so
+        // its 3 per-site thunks override props for the displayRes RTs.
+        auto& perfMode = globals::features::upscaling.perfMode;
+        perfMode.BeginCreateRTEnlarge();
         func();
+        perfMode.EndCreateRTEnlarge();
+
         globals::ReInit();
         globals::state->Setup();
+
+        // PerfMode is not in the Feature list (it's a worker driven by the
+        // upscaling toggle), so SetupResources runs here directly.
+        if (perfMode.IsHookActive())
+            perfMode.SetupResources();
+
+        // PR-3 MVP-B: latch FoveatedRender enable + qualityMode at the moment
+        // the engine is fully initialized but before the first frame.
After
+        // this point, live setting changes won't be honored mid-game (matches
+        // Streamline's DLSS option lifecycle — quality changes need a full
+        // resource recreate the user has to opt into).
+        FoveatedRenderImpl::Bridge::BootSequence();
     }
     static inline REL::Relocation func;
 };
@@ -867,6 +914,8 @@ namespace Hooks
         stl::write_thunk_call(REL::RelocationID(100458, 107175).address() + REL::Relocate(0xA25, 0xA25, 0xCD2));
         stl::write_thunk_call(REL::RelocationID(100458, 107175).address() + REL::Relocate(0xA59, 0xA59, 0xD13));
+        globals::features::upscaling.perfMode.InstallCreateRTThunks();
+
 #ifdef TRACY_ENABLE
         stl::write_thunk_call(REL::RelocationID(35551, 36544).address() + REL::Relocate(0x11F, 0x160));
 #endif
diff --git a/src/Menu.cpp b/src/Menu.cpp
index cb6b9c253e..1e5d7596e0 100644
--- a/src/Menu.cpp
+++ b/src/Menu.cpp
@@ -291,7 +291,6 @@ Menu::~Menu()
     uiIcons.applyToGame.Release();
     uiIcons.pauseTime.Release();
     uiIcons.undo.Release();
-    uiIcons.discord.Release();
     uiIcons.characters.Release();
     uiIcons.display.Release();
     uiIcons.grass.Release();
@@ -639,7 +638,7 @@ void Menu::Init()
 }
 /**
- * @brief Main UI rendering coordinator for the Community Shaders menu
+ * @brief Main UI rendering coordinator for the Open Shaders menu
  *
  * This method serves as the primary entry point for rendering the entire menu interface.
  * It handles window setup, docking configuration, and delegates rendering to specialized
@@ -669,8 +668,10 @@ void Menu::DrawSettings()
     resetLayout = false;
     auto versionStr = Util::GetFormattedVersion(Plugin::VERSION);
     auto expectedTag = std::format("v{}", versionStr);
-    auto displayTitle = Plugin::BUILD_DESCRIBE == expectedTag ? std::format("Community Shaders {}", versionStr) : std::format("Community Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE);
-    // Use ### to keep a stable window ID regardless of build suffix, preserving docking state
+    auto displayTitle = Plugin::BUILD_DESCRIBE == expectedTag ?
std::format("Open Shaders {}", versionStr) : std::format("Open Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE);
+    // Use ### to keep a stable window ID regardless of build suffix or display
+    // branding, preserving docking state. The literal "CommunityShaders" ID is
+    // load-bearing: changing it would discard users' existing docking layouts.
     auto title = std::format("{}###CommunityShaders", displayTitle);
     // Determine window flags based on docking state
diff --git a/src/Menu.h b/src/Menu.h
index 2e2b05774d..0f51be0110 100644
--- a/src/Menu.h
+++ b/src/Menu.h
@@ -56,7 +56,7 @@ class Menu
     enum class FontRole : std::uint8_t
     {
         Body = 0,    // Default UI text
-        Title,       // Large title text (e.g., "Community Shaders" header)
+        Title,       // Large title text (e.g., "Open Shaders" header)
         Heading,     // Section headers (tabs, category labels)
         Subheading,  // Subsection headers (feature names, separators)
         Subtext,     // Smaller secondary text (descriptions, about content)
@@ -209,9 +209,6 @@ class Menu
         UIIcon freeCamera;  // Free camera preview icon (weather editor)
         UIIcon playMode;    // Play mode preview icon (weather editor)
-        // Social media/external link icons
-        UIIcon discord;
-
         // Category icons
         UIIcon characters;
         UIIcon display;
@@ -401,19 +398,19 @@ class Menu
     {
         std::vector ToggleKey = { InputCombo::Keyboard(VK_END) };
         std::vector SkipCompilationKey = { InputCombo::Keyboard(VK_ESCAPE) };
-        std::vector EffectToggleKey = { InputCombo::Keyboard(VK_MULTIPLY) };  // toggle all effects
-        std::vector OverlayToggleKey = { InputCombo::Keyboard(VK_F10) };  // Global overlay toggle key for all overlays
-        std::vector ShaderBlockPrevKey = { InputCombo::Keyboard(VK_PRIOR) };  // Debug: cycle backward through shaders (PageUp)
-        std::vector ShaderBlockNextKey = { InputCombo::Keyboard(VK_NEXT) };  // Debug: cycle forward through shaders (PageDown)
+        std::vector EffectToggleKey = { InputCombo::Keyboard(VK_MULTIPLY) };  // toggle all effects
+        std::vector OverlayToggleKey = {
InputCombo::Keyboard(VK_F10) };  // Global overlay toggle key for all overlays
+        std::vector ShaderBlockPrevKey = { InputCombo::Keyboard(VK_PRIOR) };  // Debug: cycle backward through shaders (PageUp)
+        std::vector ShaderBlockNextKey = { InputCombo::Keyboard(VK_NEXT) };  // Debug: cycle forward through shaders (PageDown)
         std::vector WeatherEditorToggleKey = { InputCombo::Keyboard(VK_SHIFT), InputCombo::Keyboard(VK_END) };  // Weather Editor toggle key
-        std::vector ScreenshotKey = { InputCombo::Keyboard(VK_SNAPSHOT) };  // Screenshot capture key
-        bool EnableShaderBlocking = false;  // Enable shader blocking hotkeys for debugging
-        bool FirstTimeSetupCompleted = false;  // Track if first-time setup has been completed
-        bool SkipClearCacheConfirmation = false;  // Skip confirmation dialog when clearing shader cache
-        bool AutoHideFeatureList = false;  // Auto-hide left feature list panel, show on hover
-        bool SkipConstraintWarning = false;  // Skip popup when a setting change creates new constraints
-        bool RequireShiftToDock = true;  // Require holding Shift to dock windows
-        bool UseResolutionFont = true;  // When true, runtime font size scales with screen resolution; when persisted to theme files, FontSize is zeroed for backward compatibility
+        std::vector ScreenshotKey = { InputCombo::Keyboard(VK_SNAPSHOT) };  // Screenshot capture key
+        bool EnableShaderBlocking = false;  // Enable shader blocking hotkeys for debugging
+        bool FirstTimeSetupCompleted = false;  // Track if first-time setup has been completed
+        bool SkipClearCacheConfirmation = false;  // Skip confirmation dialog when clearing shader cache
+        bool AutoHideFeatureList = false;  // Auto-hide left feature list panel, show on hover
+        bool SkipConstraintWarning = false;  // Skip popup when a setting change creates new constraints
+        bool RequireShiftToDock = true;  // Require holding Shift to dock windows
+        bool UseResolutionFont = true;  // When true, runtime font size scales with screen resolution; when persisted to theme
files, FontSize is zeroed for backward compatibility
         ThemeSettings Theme;
         std::string SelectedThemePreset = "";  // Currently selected theme preset (empty = custom/user theme)
     };
diff --git a/src/Menu/AdvancedSettingsRenderer.cpp b/src/Menu/AdvancedSettingsRenderer.cpp
index 8bd217b20d..0738816f32 100644
--- a/src/Menu/AdvancedSettingsRenderer.cpp
+++ b/src/Menu/AdvancedSettingsRenderer.cpp
@@ -20,18 +20,20 @@ void AdvancedSettingsRenderer::RenderAdvancedSettings(
     const std::function& drawDisableAtBootSettings)
 {
-    // Use TabBar system - tabs sorted alphabetically
+    // Tabs ordered alphabetically; each tab is grouped by purpose, not audience.
+    // Shaders = configure & inspect shader compilation
+    // Diagnostics = log/inspect runtime state & block individual shaders
+    // Disable at Boot = user-facing failsafe toggles
+    // Testing = A/B harness + dev-mode test scaffolding
     if (ImGui::BeginTabBar("##AdvancedSettingsTabs", ImGuiTabBarFlags_None)) {
-        // Developer Tab
-        if (MenuFonts::BeginTabItemWithFont("Developer", Menu::FontRole::Subheading)) {
-            if (ImGui::BeginChild("##DeveloperContent", ImVec2(0, 0), false)) {
-                RenderDeveloperSection();
+        if (MenuFonts::BeginTabItemWithFont("Diagnostics", Menu::FontRole::Subheading)) {
+            if (ImGui::BeginChild("##DiagnosticsContent", ImVec2(0, 0), false)) {
+                RenderDiagnosticsSection();
             }
             ImGui::EndChild();
             ImGui::EndTabItem();
         }
-        // Disable at Boot Tab
         if (MenuFonts::BeginTabItemWithFont("Disable at Boot", Menu::FontRole::Subheading)) {
             if (ImGui::BeginChild("##DisableAtBootContent", ImVec2(0, 0), false)) {
                 RenderDisableAtBootSection(drawDisableAtBootSettings);
@@ -40,27 +42,16 @@ void AdvancedSettingsRenderer::RenderAdvancedSettings(
             ImGui::EndTabItem();
         }
-        // Logging Tab
-        if (MenuFonts::BeginTabItemWithFont("Logging", Menu::FontRole::Subheading)) {
-            if (ImGui::BeginChild("##LoggingContent", ImVec2(0, 0), false)) {
-                RenderLoggingSection();
+        if (MenuFonts::BeginTabItemWithFont("Shaders", Menu::FontRole::Subheading)) {
+            if (ImGui::BeginChild("##ShadersContent", ImVec2(0, 0), false)) {
+                RenderShadersSection();
             }
             ImGui::EndChild();
             ImGui::EndTabItem();
         }
-        // Shader Debug Tab
-        if (MenuFonts::BeginTabItemWithFont("Shader Debug", Menu::FontRole::Subheading)) {
-            if (ImGui::BeginChild("##ShaderDebugContent", ImVec2(0, 0), false)) {
-                RenderShaderDebugSection();
-            }
-            ImGui::EndChild();
-            ImGui::EndTabItem();
-        }
-
-        // Testing Tab (for A/B Testing and related settings)
         if (MenuFonts::BeginTabItemWithFont("Testing", Menu::FontRole::Subheading)) {
-            if (ImGui::BeginChild("##Testing", ImVec2(0, 0), false)) {
+            if (ImGui::BeginChild("##TestingContent", ImVec2(0, 0), false)) {
                 RenderTestingSection();
             }
             ImGui::EndChild();
@@ -71,29 +62,44 @@ void AdvancedSettingsRenderer::RenderAdvancedSettings(
     }
 }
-void AdvancedSettingsRenderer::RenderLoggingSection()
+// -----------------------------------------------------------------------------
+// Shaders tab
+// -----------------------------------------------------------------------------
+
+void AdvancedSettingsRenderer::RenderShadersSection()
+{
+    RenderShaderCompileFlags();
+
+    ImGui::Spacing();
+    ImGui::Separator();
+    ImGui::Spacing();
+
+    RenderShaderThreading();
+
+    ImGui::Spacing();
+    ImGui::Separator();
+    ImGui::Spacing();
+
+    RenderShaderCacheControls();
+
+    ImGui::Spacing();
+    ImGui::Separator();
+    ImGui::Spacing();
+
+    RenderShaderReplacementTable();
+
+    ImGui::Spacing();
+    ImGui::Separator();
+    ImGui::Spacing();
+
+    RenderShaderCompileStatistics();
+}
+
+void AdvancedSettingsRenderer::RenderShaderCompileFlags()
 {
     auto shaderCache = globals::shaderCache;
-    // Log Level selection
-    spdlog::level::level_enum logLevel = globals::state->GetLogLevel();
-    const char* items[] = {
-        "trace",
-        "debug",
-        "info",
-        "warn",
-        "err",
-        "critical",
-        "off"
-    };
-    static int item_current = static_cast(logLevel);
-    if (ImGui::Combo("Log Level", &item_current, items, IM_ARRAYSIZE(items))) {
-        ImGui::SameLine();
-        globals::state->SetLogLevel(static_cast(item_current));
-    }
-    if (auto _tt = Util::HoverTooltipWrapper()) {
-        ImGui::Text("Log level. Trace is most verbose. Default is info.");
-    }
+    Util::DrawSectionHeader("Compile Flags");
     // Shader Defines input
     auto& shaderDefines = globals::state->shaderDefinesString;
@@ -110,46 +116,97 @@ void AdvancedSettingsRenderer::RenderLoggingSection()
         ImGui::Text("Defines for Shader Compiler. Semicolon \";\" separated. Clear with space. Rebuild shaders after making change. Compute Shaders require a restart to recompile.");
     }
-    ImGui::Spacing();
+    // Half-precision (partial precision) shader compile flag
+    bool partialPrecision = globals::state->enablePartialPrecision.load(std::memory_order_relaxed);
+    if (ImGui::Checkbox("Half Precision (Partial Precision)", &partialPrecision)) {
+        globals::state->enablePartialPrecision.store(partialPrecision, std::memory_order_relaxed);
+        // Force a recompile so the flag actually takes effect on subsequent shader builds.
+        shaderCache->Clear();
+    }
+    if (auto _tt = Util::HoverTooltipWrapper()) {
+        ImGui::Text(
+            "Adds D3DCOMPILE_PARTIAL_PRECISION to the shader compiler flags.\n"
+            "Lets fxc downgrade unmarked float ops to FP16 where it can prove safety, "
+            "on top of the existing min16float type hints.\n"
+            "On FP16-capable GPUs (Pascal+ / GCN+ / Skylake+) this can halve register "
+            "pressure and double ALU throughput, but it can also introduce minor visual "
+            "differences in shaders that haven't been audited for precision sensitivity.\n"
+            "Toggling this clears the shader cache and triggers a full recompile.");
+    }
-    // Compiler Thread controls
-    ImGui::SliderInt("Compiler Threads", &shaderCache->compilationThreadCount, 1, static_cast(std::thread::hardware_concurrency()));
+    // Avoid flow control compiler flag (transient — not saved to config because the
+    // right setting depends on the current scene, not the user).
+    bool avoidFlowControl = globals::state->enableAvoidFlowControl.load(std::memory_order_relaxed);
+    if (ImGui::Checkbox("Avoid Flow Control", &avoidFlowControl)) {
+        globals::state->enableAvoidFlowControl.store(avoidFlowControl, std::memory_order_relaxed);
+        // Force a recompile so the flag actually takes effect on subsequent shader builds.
+        shaderCache->Clear();
+    }
+    if (auto _tt = Util::HoverTooltipWrapper()) {
+        ImGui::Text(
+            "Adds D3DCOMPILE_AVOID_FLOW_CONTROL to the shader compiler flags.\n"
+            "Forces fxc to flatten branches into predicated ops rather than emitting "
+            "dynamic flow control. Often a win for short branch bodies and uniformly-"
+            "taken branches; usually a loss for long divergent branches that vanilla "
+            "flow control would skip entirely.\n"
+            "Resets every launch. Toggling this clears the shader cache and triggers a "
+            "full recompile.");
+    }
+}
+
+void AdvancedSettingsRenderer::RenderShaderThreading()
+{
+    auto shaderCache = globals::shaderCache;
+
+    Util::DrawSectionHeader("Threading");
+
+    // hardware_concurrency() is permitted to return 0 if the implementation can't
+    // detect it. Fall back to the actual compile-pool thread count we ended up
+    // using at startup (which itself defaults to a sensible value when the OS
+    // query fails), then clamp to at least 1 so the slider range (min=1, max=N)
+    // stays valid and ImGui doesn't assert.
+    const uint32_t hwThreads = std::thread::hardware_concurrency();
+    const int32_t poolThreads = static_cast(shaderCache->compilationPool.get_thread_count());
+    const int32_t maxThreads = std::max({ 1, poolThreads, static_cast(hwThreads) });
+
+    // Snap the persisted values back into the valid range — a stale config can
+    // otherwise leave compilationThreadCount above maxThreads, which would
+    // render the slider in an out-of-range state.
+    shaderCache->compilationThreadCount = std::clamp(shaderCache->compilationThreadCount, 1, maxThreads);
+    shaderCache->backgroundCompilationThreadCount = std::clamp(shaderCache->backgroundCompilationThreadCount, 1, maxThreads);
+
+    ImGui::SliderInt("Compiler Threads", &shaderCache->compilationThreadCount, 1, maxThreads);
     if (auto _tt = Util::HoverTooltipWrapper()) {
         ImGui::Text(
             "Number of threads used to compile shaders at startup. "
             "Defaults to all logical cores minus one for OS headroom (E-cores included). "
             "Higher values finish compilation faster but may make the system less responsive.");
     }
-    ImGui::SliderInt("Background Compiler Threads", &shaderCache->backgroundCompilationThreadCount, 1, static_cast(std::thread::hardware_concurrency()));
+    ImGui::SliderInt("Background Compiler Threads", &shaderCache->backgroundCompilationThreadCount, 1, maxThreads);
     if (auto _tt = Util::HoverTooltipWrapper()) {
         ImGui::Text(
             "Number of threads used to compile shaders during gameplay. "
             "Defaults to half of performance cores to avoid impacting the render thread. "
             "Higher values finish compilation faster but may cause stuttering.");
     }
-
-    ImGui::Columns(2, nullptr, false);
-
-    // Dump Ini Settings button
-    if (ImGui::Button("Dump Ini Settings", { -1, 0 })) {
-        Util::DumpSettingsOptions();
-    }
-
-    ImGui::NextColumn();
-
-    // Open Logs button
-    std::filesystem::path logPath = Util::PathHelpers::GetLogPath();
-    if (!logPath.empty() && ImGui::Button("Open Logs", { -1, 0 })) {
-        ShellExecuteA(NULL, "open", logPath.string().c_str(), NULL, NULL, SW_SHOWNORMAL);
-    }
-
-    ImGui::Columns(1);
 }
-void AdvancedSettingsRenderer::RenderShaderDebugSection()
+void AdvancedSettingsRenderer::RenderShaderCacheControls()
 {
     auto shaderCache = globals::shaderCache;
-    auto state = globals::state;
+
+    Util::DrawSectionHeader("Cache & File Watcher");
+
+    // File Watcher option
+    bool useFileWatcher = shaderCache->UseFileWatcher();
+    if (ImGui::Checkbox("Enable File Watcher", &useFileWatcher)) {
+        shaderCache->SetFileWatcher(useFileWatcher);
+    }
+    if (auto _tt = Util::HoverTooltipWrapper()) {
+        ImGui::Text(
+            "Automatically recompile shaders on file change. "
+            "Intended for development.");
+    }
     // Dump Shaders option
     bool useDump = shaderCache->IsDump();
@@ -167,12 +224,12 @@ void AdvancedSettingsRenderer::RenderShaderDebugSection()
     if (auto _tt = Util::HoverTooltipWrapper()) {
         ImGui::Text("Clear all compiled shaders from memory.
Forces recompilation of all shaders on next use.");
     }
+}
-    ImGui::Spacing();
-    ImGui::Separator();
-    ImGui::Spacing();
+void AdvancedSettingsRenderer::RenderShaderReplacementTable()
+{
+    auto state = globals::state;
-    // Shader Replacement section
     Util::DrawSectionHeader("Replace Original Shaders");
     if (ImGui::BeginTable("##ReplaceToggles", 3, ImGuiTableFlags_SizingStretchSame)) {
@@ -213,16 +270,212 @@ void AdvancedSettingsRenderer::RenderShaderDebugSection()
         }
         ImGui::EndTable();
     }
+}
+
+void AdvancedSettingsRenderer::RenderShaderCompileStatistics()
+{
+    auto shaderCache = globals::shaderCache;
-    // Only show shader blocking section in developer mode
-    if (!globals::state->IsDeveloperMode()) {
+    if (!ImGui::TreeNodeEx("Statistics", ImGuiTreeNodeFlags_DefaultOpen)) {
         return;
     }
+    ImGui::Text("Shader Compiler : %s", shaderCache->GetShaderStatsString().c_str());
+
+    // Derived parallelism metrics are computed lazily on demand and only shown
+    // once compilation has completed to avoid per-frame analysis while compiling.
+ if (!shaderCache->IsCompiling()) { + auto parallelism = shaderCache->GetParallelismStats(); + if (parallelism.has_value()) { + const auto& p = parallelism.value(); + ImGui::Spacing(); + ImGui::TextDisabled("Parallelism (derived from %zu compiled tasks)", p.sampleCount); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Computed lazily from the last completed build."); + ImGui::Text("Only evaluated when this Statistics section is open."); + } + ImGui::Text("Work (W, sum of task wall times): %s", Util::FormatDuration(p.workMs).c_str()); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Total compile work: sum of all per-shader wall-clock compile times."); + ImGui::Text("This is not CPU time; it is accumulated task elapsed time."); + ImGui::Text("Equivalent serial time on one worker if overhead stayed the same."); + } + ImGui::Text("Span (S, longest): %s", Util::FormatDuration(p.spanMs).c_str()); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Critical-path lower bound, approximated by the single slowest shader."); + ImGui::Text("Even infinite cores cannot finish faster than this."); + } + ImGui::Text("Makespan (T_p): %s", Util::FormatDuration(p.makespanMs).c_str()); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Observed wall-clock duration for the full shader build."); + } + ImGui::Text("Queue wait (avg/max): %s / %s", + Util::FormatDuration(p.avgQueueWaitMs).c_str(), + Util::FormatDuration(p.maxQueueWaitMs).c_str()); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Time spent waiting in the ready queue before a worker started compilation."); + ImGui::Text("Useful for identifying scheduler-induced delay separate from compile cost."); + } + ImGui::Text("Average parallelism (W/S): %.2fx", p.avgParallelism); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Average useful concurrency in this workload."); + ImGui::Text("Roughly the worker count where adding more cores gives diminishing 
returns."); + } + ImGui::Text("Infinite-core efficiency (S/T_p): %.1f%%", 100.0 * p.infiniteCoreEfficiency); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("How close runtime is to the infinite-core lower bound."); + ImGui::Text("100%% means T_p == S."); + } + ImGui::Text("Infinite-core gap: %.1f%%", p.infiniteCoreGapPercent); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Distance from ideal infinite-core time."); + ImGui::Text("Defined as 100 * (1 - S / T_p). Lower is better."); + } + + ImGui::Spacing(); + ImGui::TextDisabled("Infinite-core efficiency"); + float efficiency = static_cast<float>(std::clamp(p.infiniteCoreEfficiency, 0.0, 1.0)); + ImGui::ProgressBar(efficiency, ImVec2(-1.0f, 0.0f), std::format("{:.1f}% efficient / {:.1f}% gap", 100.0 * p.infiniteCoreEfficiency, p.infiniteCoreGapPercent).c_str()); + + ImGui::Spacing(); + ImGui::TextDisabled("Relative durations (normalized)"); + double maxMs = std::max({ p.workMs, p.spanMs, p.makespanMs, 1.0 }); + auto drawRelativeBar = [maxMs](const char* label, double value) { + float ratio = static_cast<float>(std::clamp(value / maxMs, 0.0, 1.0)); + ImGui::TextUnformatted(label); + ImGui::SameLine(); + ImGui::ProgressBar(ratio, ImVec2(-1.0f, 0.0f), std::format("{} ({:.1f}%)", Util::FormatDuration(value), 100.0 * ratio).c_str()); + }; + drawRelativeBar("Span (S)", p.spanMs); + drawRelativeBar("Makespan (T_p)", p.makespanMs); + drawRelativeBar("Work (W)", p.workMs); + } + } + + // Top-3 slowest shaders from the last build + auto topSlow = shaderCache->GetTopSlowTasks(3); + if (!topSlow.empty()) { + ImGui::Spacing(); + ImGui::TextDisabled("Top %zu Slowest Shaders (last build)", topSlow.size()); + for (size_t i = 0; i < topSlow.size(); ++i) { + const auto& rec = topSlow[i]; + ImGui::Text("#%zu %s (weight %d)", i + 1, + Util::FormatDuration(rec.elapsedMs).c_str(), rec.priority); + ImGui::SameLine(); + ImGui::TextDisabled("%s", rec.key.c_str()); + if (ImGui::IsItemHovered()) { + if (auto _tt = 
Util::HoverTooltipWrapper()) { + ImGui::Text("%s", rec.key.c_str()); + } + } + // Allow copying the full key with a right-click + if (ImGui::BeginPopupContextItem(std::format("##slowcopy{}", i).c_str())) { + if (ImGui::MenuItem("Copy key")) { + ImGui::SetClipboardText(rec.key.c_str()); + } + ImGui::EndPopup(); + } + } + } + + ImGui::TreePop(); +} + +// ----------------------------------------------------------------------------- +// Diagnostics tab +// ----------------------------------------------------------------------------- + +void AdvancedSettingsRenderer::RenderDiagnosticsSection() +{ + RenderLoggingControls(); + ImGui::Spacing(); ImGui::Separator(); ImGui::Spacing(); + RenderRuntimeDebugControls(); + + // Shader blocking only meaningful in developer mode (matches prior behavior). + if (globals::state->IsDeveloperMode()) { + ImGui::Spacing(); + ImGui::Separator(); + ImGui::Spacing(); + + RenderShaderBlockingPanel(); + } +} + +void AdvancedSettingsRenderer::RenderLoggingControls() +{ + Util::DrawSectionHeader("Logging"); + + // Log Level selection. Resync from state every frame so external changes + // (config reload, console command, another caller of SetLogLevel) don't + // leave the combo displaying a stale selection. + spdlog::level::level_enum logLevel = globals::state->GetLogLevel(); + const char* items[] = { + "trace", + "debug", + "info", + "warn", + "err", + "critical", + "off" + }; + int item_current = static_cast<int>(logLevel); + if (ImGui::Combo("Log Level", &item_current, items, IM_ARRAYSIZE(items))) { + globals::state->SetLogLevel(static_cast<spdlog::level::level_enum>(item_current)); + } + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Log level. Trace is most verbose. 
Default is info."); + } + + ImGui::Columns(2, nullptr, false); + + // Dump Ini Settings button + if (ImGui::Button("Dump Ini Settings", { -1, 0 })) { + Util::DumpSettingsOptions(); + } + + ImGui::NextColumn(); + + // Open Logs button + std::filesystem::path logPath = Util::PathHelpers::GetLogPath(); + if (!logPath.empty() && ImGui::Button("Open Logs", { -1, 0 })) { + ShellExecuteA(NULL, "open", logPath.string().c_str(), NULL, NULL, SW_SHOWNORMAL); + } + + ImGui::Columns(1); +} + +void AdvancedSettingsRenderer::RenderRuntimeDebugControls() +{ + Util::DrawSectionHeader("Runtime Debug"); + + // Frame annotations toggle + ImGui::Checkbox("Frame Annotations", &globals::state->frameAnnotations); + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("Enable detailed frame annotations for debugging render passes and draw calls."); + } + + // Debug addresses section + if (ImGui::TreeNodeEx("Addresses")) { + auto Renderer = globals::game::renderer; + auto BSShaderAccumulator = *globals::game::currentAccumulator.get(); + auto RendererShadowState = globals::game::shadowState; + ADDRESS_NODE(Renderer) + ADDRESS_NODE(BSShaderAccumulator) + ADDRESS_NODE(RendererShadowState) + ImGui::TreePop(); + } +} + +void AdvancedSettingsRenderer::RenderShaderBlockingPanel() +{ + auto shaderCache = globals::shaderCache; + + Util::DrawSectionHeader("Shader Blocking"); + // Show blocked shader status as a regular section if (!shaderCache->blockedKey.empty()) { // Create a visually distinct box for the blocked shader info with rounded corners and border @@ -286,8 +539,8 @@ void AdvancedSettingsRenderer::RenderShaderDebugSection() ImGui::PopStyleColor(); // ChildBg } - // Shader Debug section - if (ImGui::CollapsingHeader("Shader Debug")) { + // Blocking hotkeys + enable toggle + { auto menu = globals::menu; auto& menuSettings = menu->GetSettings(); auto& themeSettings = menuSettings.Theme; @@ -338,9 +591,11 @@ void AdvancedSettingsRenderer::RenderShaderDebugSection() } } - // Active 
shaders list - if (ImGui::CollapsingHeader("Active Shaders", ImGuiTreeNodeFlags_DefaultOpen)) { - ImGui::Text("Active Shaders (Used Recently)"); + // Active shaders list, rendered inline; the parent panel already says + // "Shader Blocking", so a nested CollapsingHeader was redundant noise. + { + ImGui::Spacing(); + Util::DrawSectionHeader("Active Shaders (Used Recently)"); if (auto _tt = Util::HoverTooltipWrapper()) { ImGui::Text( "List of shaders that have been used in recent frames. " @@ -504,189 +759,31 @@ void AdvancedSettingsRenderer::RenderShaderDebugSection() } } +// ----------------------------------------------------------------------------- +// Disable at Boot tab +// ----------------------------------------------------------------------------- + void AdvancedSettingsRenderer::RenderDisableAtBootSection(const std::function<void()>& drawDisableAtBootSettings) { drawDisableAtBootSettings(); } -void AdvancedSettingsRenderer::RenderDeveloperSection() -{ - auto shaderCache = globals::shaderCache; - - // File Watcher option (moved from Advanced/Logging) - bool useFileWatcher = shaderCache->UseFileWatcher(); - if (ImGui::Checkbox("Enable File Watcher", &useFileWatcher)) { - shaderCache->SetFileWatcher(useFileWatcher); - } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text( - "Automatically recompile shaders on file change. 
" - "Intended for developing."); - } - - // Debug addresses section (moved from Advanced/Logging) - if (ImGui::TreeNodeEx("Addresses")) { - auto Renderer = globals::game::renderer; - auto BSShaderAccumulator = *globals::game::currentAccumulator.get(); - auto RendererShadowState = globals::game::shadowState; - ADDRESS_NODE(Renderer) - ADDRESS_NODE(BSShaderAccumulator) - ADDRESS_NODE(RendererShadowState) - ImGui::TreePop(); - } - - // Statistics section (moved from Advanced/Logging) - if (ImGui::TreeNodeEx("Statistics", ImGuiTreeNodeFlags_DefaultOpen)) { - ImGui::Text(std::format("Shader Compiler : {}", shaderCache->GetShaderStatsString()).c_str()); - - // Derived parallelism metrics are computed lazily on demand and only shown - // once compilation has completed to avoid per-frame analysis while compiling. - if (!shaderCache->IsCompiling()) { - auto parallelism = shaderCache->GetParallelismStats(); - if (parallelism.has_value()) { - const auto& p = parallelism.value(); - ImGui::Spacing(); - ImGui::TextDisabled("Parallelism (derived from %zu compiled tasks)", p.sampleCount); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Computed lazily from the last completed build."); - ImGui::Text("Only evaluated when this Statistics section is open."); - } - ImGui::Text("Work (W, sum of task wall times): %s", Util::FormatDuration(p.workMs).c_str()); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Total compile work: sum of all per-shader wall-clock compile times."); - ImGui::Text("This is not CPU time; it is accumulated task elapsed time."); - ImGui::Text("Equivalent serial time on one worker if overhead stayed the same."); - } - ImGui::Text("Span (S, longest): %s", Util::FormatDuration(p.spanMs).c_str()); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Critical-path lower bound, approximated by the single slowest shader."); - ImGui::Text("Even infinite cores cannot finish faster than this."); - } - ImGui::Text("Makespan (T_p): %s", 
Util::FormatDuration(p.makespanMs).c_str()); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Observed wall-clock duration for the full shader build."); - } - ImGui::Text("Queue wait (avg/max): %s / %s", - Util::FormatDuration(p.avgQueueWaitMs).c_str(), - Util::FormatDuration(p.maxQueueWaitMs).c_str()); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Time spent waiting in the ready queue before a worker started compilation."); - ImGui::Text("Useful for identifying scheduler-induced delay separate from compile cost."); - } - ImGui::Text("Average parallelism (W/S): %.2fx", p.avgParallelism); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Average useful concurrency in this workload."); - ImGui::Text("Roughly the worker count where adding more cores gives diminishing returns."); - } - ImGui::Text("Infinite-core efficiency (S/T_p): %.1f%%", 100.0 * p.infiniteCoreEfficiency); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("How close runtime is to the infinite-core lower bound."); - ImGui::Text("100%% means T_p == S."); - } - ImGui::Text("Infinite-core gap: %.1f%%", p.infiniteCoreGapPercent); - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Distance from ideal infinite-core time."); - ImGui::Text("Defined as 100 * (1 - S / T_p). 
Lower is better."); - } - - ImGui::Spacing(); - ImGui::TextDisabled("Infinite-core efficiency"); - float efficiency = static_cast<float>(std::clamp(p.infiniteCoreEfficiency, 0.0, 1.0)); - ImGui::ProgressBar(efficiency, ImVec2(-1.0f, 0.0f), std::format("{:.1f}% efficient / {:.1f}% gap", 100.0 * p.infiniteCoreEfficiency, p.infiniteCoreGapPercent).c_str()); - - ImGui::Spacing(); - ImGui::TextDisabled("Relative durations (normalized)"); - double maxMs = std::max({ p.workMs, p.spanMs, p.makespanMs, 1.0 }); - auto drawRelativeBar = [maxMs](const char* label, double value) { - float ratio = static_cast<float>(std::clamp(value / maxMs, 0.0, 1.0)); - ImGui::TextUnformatted(label); - ImGui::SameLine(); - ImGui::ProgressBar(ratio, ImVec2(-1.0f, 0.0f), std::format("{} ({:.1f}%)", Util::FormatDuration(value), 100.0 * ratio).c_str()); - }; - drawRelativeBar("Span (S)", p.spanMs); - drawRelativeBar("Makespan (T_p)", p.makespanMs); - drawRelativeBar("Work (W)", p.workMs); - } - } - - // Top-3 slowest shaders from the last build - auto topSlow = shaderCache->GetTopSlowTasks(3); - if (!topSlow.empty()) { - ImGui::Spacing(); - ImGui::TextDisabled("Top %zu Slowest Shaders (last build)", topSlow.size()); - for (size_t i = 0; i < topSlow.size(); ++i) { - const auto& rec = topSlow[i]; - ImGui::Text("#%zu %s (weight %d)", i + 1, - Util::FormatDuration(rec.elapsedMs).c_str(), rec.priority); - ImGui::SameLine(); - ImGui::TextDisabled("%s", rec.key.c_str()); - if (ImGui::IsItemHovered()) { - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("%s", rec.key.c_str()); - } - } - // Allow copying the full key with a right-click - if (ImGui::BeginPopupContextItem(std::format("##slowcopy{}", i).c_str())) { - if (ImGui::MenuItem("Copy key")) { - ImGui::SetClipboardText(rec.key.c_str()); - } - ImGui::EndPopup(); - } - } - } - - ImGui::TreePop(); - } - - // Frame annotations toggle (moved from Advanced/Logging) - ImGui::Checkbox("Frame Annotations", &globals::state->frameAnnotations); - if (auto _tt = 
Util::HoverTooltipWrapper()) { - ImGui::Text("Enable detailed frame annotations for debugging render passes and draw calls."); - } +// ----------------------------------------------------------------------------- +// Testing tab +// ----------------------------------------------------------------------------- - // Half-precision (partial precision) shader compile flag - bool partialPrecision = globals::state->enablePartialPrecision.load(std::memory_order_relaxed); - if (ImGui::Checkbox("Half Precision (Partial Precision)", &partialPrecision)) { - globals::state->enablePartialPrecision.store(partialPrecision, std::memory_order_relaxed); - // Force a recompile so the flag actually takes effect on subsequent shader builds. - globals::shaderCache->Clear(); - } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text( - "Adds D3DCOMPILE_PARTIAL_PRECISION to the shader compiler flags.\n" - "Lets fxc downgrade unmarked float ops to FP16 where it can prove safety, " - "on top of the existing min16float type hints.\n" - "On FP16-capable GPUs (Pascal+ / GCN+ / Skylake+) this can halve register " - "pressure and double ALU throughput, but it can also introduce minor visual " - "differences in shaders that haven't been audited for precision sensitivity.\n" - "Toggling this clears the shader cache and triggers a full recompile."); - } - - // Avoid flow control compiler flag (transient — not saved to config because the - // right setting depends on the current scene, not the user). - bool avoidFlowControl = globals::state->enableAvoidFlowControl.load(std::memory_order_relaxed); - if (ImGui::Checkbox("Avoid Flow Control", &avoidFlowControl)) { - globals::state->enableAvoidFlowControl.store(avoidFlowControl, std::memory_order_relaxed); - // Force a recompile so the flag actually takes effect on subsequent shader builds. 
- globals::shaderCache->Clear(); - } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text( - "Adds D3DCOMPILE_AVOID_FLOW_CONTROL to the shader compiler flags.\n" - "Forces fxc to flatten branches into predicated ops rather than emitting " - "dynamic flow control. Often a win for short branch bodies and uniformly-" - "taken branches; usually a loss for long divergent branches that vanilla " - "flow control would skip entirely.\n" - "Resets every launch. Toggling this clears the shader cache and triggers a " - "full recompile."); - } - - ImGui::Spacing(); - ImGui::Separator(); - ImGui::Spacing(); +void AdvancedSettingsRenderer::RenderTestingSection() +{ + // A/B Testing settings + auto* abTestingManager = ABTestingManager::GetSingleton(); + abTestingManager->DrawSettingsUI(); - // Developer Mode Testing Section + // Developer Mode Testing UI + scene-prep button (previously on the "Developer" tab) if (globals::state->IsDeveloperMode()) { + ImGui::Spacing(); + ImGui::Separator(); + ImGui::Spacing(); + FeatureIssues::Test::DrawDeveloperModeTestingUI(); ImGui::Spacing(); @@ -704,10 +801,3 @@ void AdvancedSettingsRenderer::RenderDeveloperSection() } } } - -void AdvancedSettingsRenderer::RenderTestingSection() -{ - // A/B Testing settings - auto* abTestingManager = ABTestingManager::GetSingleton(); - abTestingManager->DrawSettingsUI(); -} diff --git a/src/Menu/AdvancedSettingsRenderer.h b/src/Menu/AdvancedSettingsRenderer.h index 086486c246..eaf6cac0e4 100644 --- a/src/Menu/AdvancedSettingsRenderer.h +++ b/src/Menu/AdvancedSettingsRenderer.h @@ -13,9 +13,19 @@ class AdvancedSettingsRenderer const std::function<void()>& drawDisableAtBootSettings); private: - static void RenderLoggingSection(); - static void RenderShaderDebugSection(); + static void RenderShadersSection(); + static void RenderDiagnosticsSection(); static void RenderDisableAtBootSection(const std::function<void()>& drawDisableAtBootSettings); - static void RenderDeveloperSection(); static void 
RenderTestingSection(); -}; \ No newline at end of file + + // Helpers used by the sections above + static void RenderShaderCompileFlags(); + static void RenderShaderThreading(); + static void RenderShaderCacheControls(); + static void RenderShaderReplacementTable(); + static void RenderShaderCompileStatistics(); + + static void RenderLoggingControls(); + static void RenderRuntimeDebugControls(); + static void RenderShaderBlockingPanel(); +}; diff --git a/src/Menu/FeatureListRenderer.cpp b/src/Menu/FeatureListRenderer.cpp index 978984eb7d..c8f66ffef0 100644 --- a/src/Menu/FeatureListRenderer.cpp +++ b/src/Menu/FeatureListRenderer.cpp @@ -509,7 +509,11 @@ void FeatureListRenderer::ListMenuVisitor::operator()(Feature* feat) if (isDisabled) { textColor = themeSettings.StatusPalette.Disable; } else if (isLoaded) { - textColor = ImGui::GetStyleColorVec4(ImGuiCol_Text); + // Loaded feature with staged but-not-yet-applied restart-gated + // settings tints the same green as a feature pending re-enable. + // Same semantic from the user's POV: "this feature has unmade + // changes that take effect on restart." + textColor = feat->HasAnyPendingRestart() ? themeSettings.StatusPalette.RestartNeeded : ImGui::GetStyleColorVec4(ImGuiCol_Text); } else if (hasFailedMessage) { textColor = feat->version.empty() ? themeSettings.StatusPalette.Disable : themeSettings.StatusPalette.Error; } else { diff --git a/src/Menu/HomePageRenderer.cpp b/src/Menu/HomePageRenderer.cpp index df8dde47ec..85d3fde71e 100644 --- a/src/Menu/HomePageRenderer.cpp +++ b/src/Menu/HomePageRenderer.cpp @@ -66,7 +66,7 @@ void HomePageRenderer::RenderWelcomeSection() ImVec2 windowSize = ImGui::GetWindowSize(); auto versionStr = Util::GetFormattedVersion(Plugin::VERSION); auto expectedTag = std::format("v{}", versionStr); - std::string titleWithVersion = Plugin::BUILD_DESCRIBE == expectedTag ? 
std::format("Welcome to Community Shaders {}", versionStr) : std::format("Welcome to Community Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE); + std::string titleWithVersion = Plugin::BUILD_DESCRIBE == expectedTag ? std::format("Welcome to Open Shaders {}", versionStr) : std::format("Welcome to Open Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE); ImVec2 titleSize = ImGui::CalcTextSize(titleWithVersion.c_str()); ImGui::SetCursorPosX((windowSize.x - titleSize.x) * 0.5f); ImGui::Text("%s", titleWithVersion.c_str()); @@ -83,7 +83,7 @@ void HomePageRenderer::RenderWelcomeSection() // Intro text - centered const char* introText = - "Community Shaders provides advanced graphics enhancements for Skyrim.\n" + "Open Shaders is a fork of Community Shaders providing advanced graphics enhancements for Skyrim.\n" "This comprehensive collection of features brings modern rendering techniques\n" "to enhance your visual experience."; ImVec2 introSize = ImGui::CalcTextSize(introText); @@ -92,56 +92,6 @@ void HomePageRenderer::RenderWelcomeSection() ImGui::Spacing(); - // Discord banner - centered with proper error checking - auto menu = Menu::GetSingleton(); - bool discordIconAvailable = false; - - // Check if menu exists, has icons, and Discord icon is loaded - if (menu && menu->uiIcons.discord.texture != nullptr && - menu->uiIcons.discord.size.x > 0 && menu->uiIcons.discord.size.y > 0) { - discordIconAvailable = true; - } - - if (discordIconAvailable) { - // Calculate scaled icon size based on window width, with min/max constraints - ImVec2 originalSize = ImVec2(menu->uiIcons.discord.size.x, menu->uiIcons.discord.size.y); - - // Compute width based on window size with constraints and padding (handles very small windows) - float ratioWidth = windowSize.x * DISCORD_BANNER_TARGET_WIDTH_RATIO; - float aspectRatio = originalSize.y / originalSize.x; - float maxAllowed = std::max(1.0f, windowSize.x - DISCORD_BANNER_PADDING_MARGIN); - float upperBound = 
std::min(DISCORD_BANNER_MAX_WIDTH, maxAllowed); - float lowerBound = std::min(DISCORD_BANNER_MIN_WIDTH, upperBound); - float targetWidth = std::clamp(ratioWidth, lowerBound, upperBound); - - ImVec2 iconSize = ImVec2(targetWidth, targetWidth * aspectRatio); - ImGui::SetCursorPosX((windowSize.x - iconSize.x) * 0.5f); - - // Push style to remove border - ImGui::PushStyleVar(ImGuiStyleVar_FrameBorderSize, 0.0f); - ImGui::PushStyleColor(ImGuiCol_Button, ImVec4(0, 0, 0, 0)); // Transparent background - ImGui::PushStyleColor(ImGuiCol_ButtonHovered, ImVec4(0.1f, 0.1f, 0.1f, 0.3f)); // Subtle hover - ImGui::PushStyleColor(ImGuiCol_ButtonActive, ImVec4(0.2f, 0.2f, 0.2f, 0.5f)); // Subtle click - - if (ImGui::ImageButton("##DiscordButton", menu->uiIcons.discord.texture, iconSize)) { - ShellExecuteA(NULL, "open", DISCORD_URL, NULL, NULL, SW_SHOWNORMAL); - } - - // Pop the style changes - ImGui::PopStyleColor(3); - ImGui::PopStyleVar(); - - Util::AddTooltip("Join Community Shaders Discord Server"); - } else { - // Fallback button when Discord icon is not available - float buttonWidth = DISCORD_BANNER_MIN_WIDTH * scale; - ImGui::SetCursorPosX((windowSize.x - buttonWidth) * 0.5f); - if (ImGui::Button("Join Discord Server", ImVec2(buttonWidth, 0))) { - ShellExecuteA(NULL, "open", DISCORD_URL, NULL, NULL, SW_SHOWNORMAL); - } - Util::AddTooltip("Join Community Shaders Discord Server"); - } - ImGui::PopStyleVar(); } @@ -153,26 +103,22 @@ void HomePageRenderer::RenderQuickLinksSection() ImGui::SetCursorPosX((windowSize.x - titleSize.x) * 0.5f); ImGui::Text("Quick Links"); - ImGui::Columns(4, nullptr, false); + // The Nexus button points at upstream Community Shaders until the + // Open Shaders Nexus mod page exists; swap when ready. 
+ ImGui::Columns(3, nullptr, false); - // External links in a row if (ImGui::Button("Nexus Mods", ImVec2(-1, 0))) { - ShellExecuteA(NULL, "open", "https://www.nexusmods.com/skyrimspecialedition/mods/86492", NULL, NULL, SW_SHOWNORMAL); + ShellExecuteA(NULL, "open", "https://www.nexusmods.com/skyrimspecialedition/mods/180419", NULL, NULL, SW_SHOWNORMAL); } ImGui::NextColumn(); if (ImGui::Button("GitHub", ImVec2(-1, 0))) { - ShellExecuteA(NULL, "open", "https://github.com/doodlum/skyrim-community-shaders", NULL, NULL, SW_SHOWNORMAL); - } - - ImGui::NextColumn(); - if (ImGui::Button("Wiki", ImVec2(-1, 0))) { - ShellExecuteA(NULL, "open", "https://modding.wiki/en/skyrim/developers/community-shaders", NULL, NULL, SW_SHOWNORMAL); + ShellExecuteA(NULL, "open", "https://github.com/alandtse/open-shaders", NULL, NULL, SW_SHOWNORMAL); } ImGui::NextColumn(); if (ImGui::Button("Developer Wiki", ImVec2(-1, 0))) { - ShellExecuteA(NULL, "open", "https://github.com/doodlum/skyrim-community-shaders/wiki", NULL, NULL, SW_SHOWNORMAL); + ShellExecuteA(NULL, "open", "https://github.com/alandtse/open-shaders/wiki", NULL, NULL, SW_SHOWNORMAL); } ImGui::Columns(1); @@ -188,11 +134,14 @@ void HomePageRenderer::RenderFAQSection() ImGui::Separator(); // FAQ items with collapsible headers - if (ImGui::CollapsingHeader("What is Community Shaders?")) { + if (ImGui::CollapsingHeader("What is Open Shaders?")) { ImGui::TextWrapped( - "Community Shaders is a comprehensive graphics enhancement framework for Skyrim that " - "provides advanced lighting, materials, and visual effects. It's designed to be modular, " - "allowing you to enable only the features you want while maintaining good performance."); + "Open Shaders is a fork of Community Shaders that ships features the upstream project " + "has not yet released. Both projects are comprehensive graphics enhancement frameworks " + "for Skyrim that provide advanced lighting, materials, and visual effects. 
They're " + "designed to be modular, letting you enable only the features you want while " + "maintaining good performance. This fork preserves the upstream runtime layout so user " + "settings and themes are compatible."); } if (ImGui::CollapsingHeader("How do I configure features?")) { @@ -223,34 +172,36 @@ void HomePageRenderer::RenderFAQSection() "tab also includes upscaling options that can improve performance."); } - if (ImGui::CollapsingHeader("Is Community Shaders compatible with ENB?")) { + if (ImGui::CollapsingHeader("Is Open Shaders compatible with ENB?")) { ImGui::TextWrapped( - "No, Community Shaders is not compatible with ENB. Community Shaders will automatically " - "disable itself if ENB is detected."); + "No, Open Shaders (like upstream Community Shaders) is not compatible with ENB. The " + "plugin will automatically disable itself if ENB is detected."); } if (ImGui::CollapsingHeader("The menu hotkey isn't working!")) { ImGui::TextWrapped( - "By default, Community Shaders uses the END key to open this menu. If your keyboard " + "By default, Open Shaders uses the END key to open this menu. If your keyboard " "doesn't have an END key or it's not working, you can change it in the General > Keybindings tab. " "You can also edit the hotkey in the JSON configuration files."); } - if (ImGui::CollapsingHeader("I would like to help develop Community Shaders.")) { + if (ImGui::CollapsingHeader("I would like to help develop Open Shaders.")) { ImGui::TextWrapped( - "We're always looking for talented developers to join the team! Check out our GitHub wiki " - "for contribution guidelines and join our Discord server to connect with the development team. " - "Whether you're interested in shader programming, C++ development, or documentation, there's " - "always something to contribute."); + "Open Shaders is open source. 
Check out the upstream GitHub wiki for contribution " + "guidelines on the shared architecture; open issues and PRs against this repository for " + "fork-specific work, or against upstream Community Shaders for changes that benefit both " + "projects. Whether you're interested in shader programming, C++ development, or " + "documentation, there's always something to contribute."); } - if (ImGui::CollapsingHeader("Is Community Shaders open source?")) { + if (ImGui::CollapsingHeader("Is Open Shaders open source?")) { ImGui::TextWrapped( - "Yes! Community Shaders is completely open source and available on GitHub. You can view " - "the source code, report issues, suggest features, and contribute to the project. " - "The project is licensed under GPL, ensuring it remains free and open for everyone." - " Branding materials and assets (icons, nexus branding, typography, etc) are not covered by the GPL Licence." - " Any included assets may not be used without explicit permission."); + "Yes! Open Shaders is completely open source and available on GitHub, as is upstream " + "Community Shaders. You can view the source code, report issues, suggest features, and " + "contribute to either project. Both are licensed under GPL, ensuring they remain free and " + "open for everyone. Branding materials and assets (icons, Nexus branding, typography, etc.) " + "are not covered by the GPL Licence. 
Any included assets may not be used without explicit " + "permission."); } } @@ -444,7 +395,7 @@ void HomePageRenderer::RenderFirstTimeSetupDialog() // Version text - two lines, both centered (reduced spacing between lines) const char* versionLine1 = "This appears to be a new install, update, or"; - const char* versionLine2 = "reinstallation of Community Shaders."; + const char* versionLine2 = "reinstallation of Open Shaders."; centerText(versionLine1); ImGui::Text("%s", versionLine1); diff --git a/src/Menu/HomePageRenderer.h b/src/Menu/HomePageRenderer.h index cc36dd1fd7..c4cee59dbe 100644 --- a/src/Menu/HomePageRenderer.h +++ b/src/Menu/HomePageRenderer.h @@ -7,7 +7,6 @@ class HomePageRenderer { public: // Constants - static constexpr const char* DISCORD_URL = "https://discord.com/invite/nkrQybAsyy"; static constexpr float TITLE_FONT_SCALE = 2.0f; static constexpr float HOTKEY_TEXT_SCALE = 1.6f; static constexpr float HOTKEY_TEXT_SCALE_CAPTURING = 2.0f; @@ -22,12 +21,6 @@ class HomePageRenderer static constexpr float DIALOG_CORNER_ROUNDING = 6.0f; static constexpr float DIALOG_LINE_TIGHTEN = 3.0f; - // Discord banner scaling constants - static constexpr float DISCORD_BANNER_TARGET_WIDTH_RATIO = 0.85f; - static constexpr float DISCORD_BANNER_MIN_WIDTH = 150.0f; - static constexpr float DISCORD_BANNER_MAX_WIDTH = 1200.0f; - static constexpr float DISCORD_BANNER_PADDING_MARGIN = 40.0f; - static void RenderHomePage(); // First-time setup management diff --git a/src/Menu/IconLoader.cpp b/src/Menu/IconLoader.cpp index d12f3ce69a..8c37c472e1 100644 --- a/src/Menu/IconLoader.cpp +++ b/src/Menu/IconLoader.cpp @@ -104,7 +104,6 @@ namespace Util::IconLoader { std::string(iconFolder) + "\\delete.png", &menu->uiIcons.deleteSettings.texture, &menu->uiIcons.deleteSettings.size }, { logoPath, &menu->uiIcons.logo.texture, &menu->uiIcons.logo.size }, { std::string(iconFolder) + "\\restore-settings.png", &menu->uiIcons.featureSettingRevert.texture, 
&menu->uiIcons.featureSettingRevert.size }, - { std::string(iconFolder) + "\\discord.png", &menu->uiIcons.discord.texture, &menu->uiIcons.discord.size }, { std::string(iconFolder) + "\\apply-to-game.png", &menu->uiIcons.applyToGame.texture, &menu->uiIcons.applyToGame.size }, { std::string(iconFolder) + "\\pause.png", &menu->uiIcons.pauseTime.texture, &menu->uiIcons.pauseTime.size }, { std::string(iconFolder) + "\\undo.png", &menu->uiIcons.undo.texture, &menu->uiIcons.undo.size }, diff --git a/src/Menu/MenuHeaderRenderer.cpp b/src/Menu/MenuHeaderRenderer.cpp index 27ec1a1703..0caec58d0a 100644 --- a/src/Menu/MenuHeaderRenderer.cpp +++ b/src/Menu/MenuHeaderRenderer.cpp @@ -25,7 +25,7 @@ void MenuHeaderRenderer::RenderHeader(bool isDocked, bool showLogo, bool canShow auto versionStr = Util::GetFormattedVersion(Plugin::VERSION); auto expectedTag = std::format("v{}", versionStr); - auto title = Plugin::BUILD_DESCRIBE == expectedTag ? std::format("Community Shaders {}", versionStr) : std::format("Community Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE); + auto title = Plugin::BUILD_DESCRIBE == expectedTag ? std::format("Open Shaders {}", versionStr) : std::format("Open Shaders {} [{}]", versionStr, Plugin::BUILD_DESCRIBE); auto actionIcons = BuildActionIcons(canShowIcons, uiIcons); if (isDocked) { diff --git a/src/Menu/OverlayRenderer.cpp b/src/Menu/OverlayRenderer.cpp index 01d5f619e9..ed50ae1a3b 100644 --- a/src/Menu/OverlayRenderer.cpp +++ b/src/Menu/OverlayRenderer.cpp @@ -29,7 +29,11 @@ namespace std::unordered_map s_windowOverlapAlpha; constexpr ImGuiWindowFlags SKIP_WINDOW_FLAGS = ImGuiWindowFlags_Tooltip | ImGuiWindowFlags_NoBackground | ImGuiWindowFlags_NoMove; - constexpr const char* MAIN_WINDOW_PREFIX = "Community Shaders"; + // Prefix-match against the display title set by Menu.cpp ("Open Shaders "). 
+ // Must track that title — if the display name changes the prefix here must + // change too, or IsMainWindow() will silently start returning false and + // overlay-alpha logic loses its anchor. + constexpr const char* MAIN_WINDOW_PREFIX = "Open Shaders"; bool IsMainWindow(ImGuiWindow* win) { return win->Name && strncmp(win->Name, MAIN_WINDOW_PREFIX, strlen(MAIN_WINDOW_PREFIX)) == 0; } diff --git a/src/Menu/SettingsTabRenderer.cpp b/src/Menu/SettingsTabRenderer.cpp index 4caf0a1aa0..d639d3ffd7 100644 --- a/src/Menu/SettingsTabRenderer.cpp +++ b/src/Menu/SettingsTabRenderer.cpp @@ -249,16 +249,6 @@ void SettingsTabRenderer::RenderShadersTab() ImGui::Text("Skips a shader being replaced if it hasn't been compiled yet. Also makes compilation blazingly fast!"); } - // Skip confirmation when clearing shader cache - auto& menuSettings = globals::menu->GetSettings(); - bool skipConfirmation = menuSettings.SkipClearCacheConfirmation; - if (ImGui::Checkbox("Skip Clear Cache Dialogue", &skipConfirmation)) { - menuSettings.SkipClearCacheConfirmation = skipConfirmation; - } - if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("When checked, the shader cache will be cleared immediately without asking for confirmation."); - } - if (shaderCache->GetTotalTasks() > 0) { ImGui::Text("Last shader cache build duration: %s", shaderCache->GetShaderStatsString(true, true).c_str()); @@ -431,7 +421,7 @@ void SettingsTabRenderer::RenderBehaviorTab() globals::menu->pendingIconReload = true; } if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Uses monochrome version of the Community Shaders logo"); + ImGui::Text("Uses monochrome version of the logo"); } ImGui::Unindent(); } @@ -443,7 +433,7 @@ void SettingsTabRenderer::RenderBehaviorTab() ImGui::Checkbox("Center Header Title", &themeSettings.CenterHeader); if (auto _tt = Util::HoverTooltipWrapper()) { - ImGui::Text("Centers the Community Shaders title and logo in the header title bar"); + ImGui::Text("Centers the title and 
logo in the header title bar"); } ImGui::Checkbox("Auto-hide Feature List", &globals::menu->GetSettings().AutoHideFeatureList); @@ -463,6 +453,16 @@ void SettingsTabRenderer::RenderBehaviorTab() ImGui::TextUnformatted("Time in seconds to wait before a tooltip appears when hovering over an item."); } + // Skip confirmation when clearing shader cache (UI behavior, not a shader setting). + auto& menuSettings = globals::menu->GetSettings(); + bool skipConfirmation = menuSettings.SkipClearCacheConfirmation; + if (ImGui::Checkbox("Skip Clear Cache Confirmation", &skipConfirmation)) { + menuSettings.SkipClearCacheConfirmation = skipConfirmation; + } + if (auto _tt = Util::HoverTooltipWrapper()) { + ImGui::Text("When checked, the shader cache will be cleared immediately without asking for confirmation."); + } + SeparatorTextWithFont("Visual Effects", Menu::FontRole::Subheading); if (ImGui::Checkbox("Background Blur", &themeSettings.BackgroundBlurEnabled)) { diff --git a/src/Menu/ThemeManager.h b/src/Menu/ThemeManager.h index ea9ed43208..866810bb20 100644 --- a/src/Menu/ThemeManager.h +++ b/src/Menu/ThemeManager.h @@ -12,7 +12,7 @@ using json = nlohmann::json; /** - * @brief Manages hot-swappable theme system for Community Shaders menu + * @brief Manages hot-swappable theme system for the Open Shaders menu * * THEME JSON SCHEMA: * ================== diff --git a/src/SettingsOverrideManager.h b/src/SettingsOverrideManager.h index ae17a1b899..cca9a56ba0 100644 --- a/src/SettingsOverrideManager.h +++ b/src/SettingsOverrideManager.h @@ -10,7 +10,7 @@ using json = nlohmann::json; /** - * @brief Manages layered JSON override system for Community Shaders features + * @brief Manages layered JSON override system for Open Shaders features * * This class handles discovery and application of feature setting overrides * from external mod files without requiring changes to existing feature code. 
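
The prefix match that `OverlayRenderer.cpp` performs earlier in this diff can be sketched in isolation. This is a minimal sketch under stated assumptions, not the repository's code: `kMainWindowPrefix` and `HasMainWindowPrefix` are hypothetical names, and the function takes a plain C string rather than an `ImGuiWindow*` so it can be exercised without ImGui.

```cpp
#include <cstring>

// Hypothetical stand-in for MAIN_WINDOW_PREFIX in OverlayRenderer.cpp.
constexpr const char* kMainWindowPrefix = "Open Shaders";

// Returns true when `name` is non-null and begins with the display-title
// prefix. Same strncmp-based comparison as the diff's IsMainWindow(), but
// operating on the name string directly (assumption made for testability).
inline bool HasMainWindowPrefix(const char* name)
{
    return name && std::strncmp(name, kMainWindowPrefix, std::strlen(kMainWindowPrefix)) == 0;
}
```

A name that still starts with the old title, such as `"Community Shaders 1.0"`, fails this check without any error being raised, which is the silent failure the comment on `MAIN_WINDOW_PREFIX` warns about when the display title and the prefix drift apart.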
diff --git a/src/ShaderCache.cpp b/src/ShaderCache.cpp index 38bd325651..a8d27b4b4c 100644 --- a/src/ShaderCache.cpp +++ b/src/ShaderCache.cpp @@ -1794,7 +1794,7 @@ namespace SIE // VR only shaders // Disable BSImagespaceShaderCopyDepthBuffer since we don't have it REed and it causes issues with cache and upscaling - // https://github.com/doodlum/skyrim-community-shaders/issues/1552 + // https://github.com/community-shaders/skyrim-community-shaders/issues/1552 // { "BSImagespaceShaderCopyDepthBuffer", RE::ImageSpaceManager::GetCurrentIndex(ISCopyDepthBuffer) }, // { "BSImagespaceShaderCopyDepthBuffer_DR", RE::ImageSpaceManager::GetCurrentIndex(ISCopyDepthBuffer_DR) }, // { "BSImagespaceShaderCopyDepthBufferTargetSize", RE::ImageSpaceManager::GetCurrentIndex(ISCopyDepthBufferTargetSize) }, @@ -2980,7 +2980,7 @@ namespace SIE // still reads high briefly, which would otherwise underflow uint64_t (logs as ~2^64-1). const uint64_t total = compilationSet.totalTasks.load(std::memory_order_relaxed); const uint64_t done = compilationSet.completedTasks.load(std::memory_order_relaxed) + - compilationSet.failedTasks.load(std::memory_order_relaxed); + compilationSet.failedTasks.load(std::memory_order_relaxed); // This task has already finished running, but Complete(task) has not yet updated the counters. // Include the current task in the local progress snapshot so the logged remaining count is accurate. const uint64_t doneIncludingCurrent = (done < total) ? (done + 1) : total; diff --git a/src/State.cpp b/src/State.cpp index 753f2ce18c..48d184ad89 100644 --- a/src/State.cpp +++ b/src/State.cpp @@ -186,6 +186,8 @@ void State::Reset() lastVertexDescriptor = 0; std::memset(&permutationDataPrevious, 0xFF, sizeof(PermutationCB)); frameCount++; + // Publish for off-thread readers (e.g. the MCP listener thread). 
+ frameCountAtomic.store(frameCount, std::memory_order_relaxed); if (auto* imageSpaceManager = RE::ImageSpaceManager::GetSingleton()) { GET_INSTANCE_MEMBER(BSImagespaceShaderApplyReflections, imageSpaceManager); diff --git a/src/State.h b/src/State.h index 2dd811d445..5d0987e0c2 100644 --- a/src/State.h +++ b/src/State.h @@ -102,7 +102,7 @@ class State std::vector>* GetDefines(); /* - * Whether a_type is currently enabled in Community Shaders + * Whether a_type is currently enabled in Open Shaders * * @param a_type The type of shader to check * @return Whether the shader has been enabled. @@ -110,7 +110,7 @@ class State bool ShaderEnabled(const RE::BSShader::Type a_type); /* - * Whether a_shader is currently enabled in Community Shaders + * Whether a_shader is currently enabled in Open Shaders * * @param a_shader The shader to check * @return Whether the shader has been enabled. @@ -268,6 +268,10 @@ class State Util::FrameChecker frameChecker; uint frameCount = 0; + // Thread-safe mirror of frameCount maintained by the render thread. + // Off-thread readers (MCP listener, future telemetry) must read this + // instead of touching frameCount directly to avoid a data race. 
+ std::atomic frameCountAtomic{ 0 }; // Skyrim constants float2 screenSize = {}; diff --git a/src/Utils/BootSnapshot.h b/src/Utils/BootSnapshot.h new file mode 100644 index 0000000000..86ae038c63 --- /dev/null +++ b/src/Utils/BootSnapshot.h @@ -0,0 +1,139 @@ +#pragma once + +#include "Utils/RestartSettings.h" + +#include +#include +#include +#include +#include +#include + +namespace Util::Settings +{ + namespace detail + { + template + size_t MemberOffset(T SettingsT::* member) noexcept + { + static_assert(std::is_default_constructible_v); + SettingsT tmp{}; + const auto* base = reinterpret_cast(&tmp); + const auto* field = reinterpret_cast(&(tmp.*member)); + return static_cast(field - base); + } + } + + template + class BootSnapshot + { + public: + template + explicit constexpr BootSnapshot(const RestartTable& table) noexcept : + table_(table.data()), tableSize_(N) + { + static_assert(std::is_standard_layout_v, "BootSnapshot requires standard-layout Settings for offsetof-based tables."); + static_assert(std::is_copy_assignable_v, "BootSnapshot requires copy-assignable Settings."); + static_assert(std::is_default_constructible_v, + "BootSnapshot requires default-constructible Settings (bootCopy_ default-inits and detail::MemberOffset constructs a temporary)."); + } + + void Latch(const SettingsT& live) noexcept(std::is_nothrow_copy_assignable_v) + { + if constexpr (std::is_trivially_copyable_v) { + // Trivially-copyable fast path: memcpy preserves padding bytes + // verbatim. Matters when a registered restart field's *type* + // contains padding -- HasPendingChange's memcmp would otherwise + // see false-positive diffs from uninitialized padding bytes. + std::memcpy(&bootCopy_, &live, sizeof(SettingsT)); + } else { + // Copy-assign for Settings with non-trivial members (e.g. the + // std::string formula fields in ShadowCasterManager::Settings). 
+ // Registered restart fields must still be trivially comparable + // for HasPendingChange's per-field memcmp to be meaningful; + // std::string in the outer struct is fine as long as it isn't + // registered (it isn't -- formulas are runtime-tunable). + bootCopy_ = live; + } + latched_ = true; + } + + void LatchIfNeeded(const SettingsT& live) noexcept(noexcept(std::declval().Latch(live))) + { + if (!latched_) { + Latch(live); + } + } + + bool IsLatched() const noexcept { return latched_; } + + std::span Fields() const noexcept + { + return { table_, tableSize_ }; + } + + const void* RawBoot(std::string_view jsonKey) const noexcept + { + if (!latched_) { + return nullptr; + } + const auto* field = FindRestartField(Fields(), jsonKey); + if (!field) { + return nullptr; + } + return reinterpret_cast(&bootCopy_) + field->offset; + } + + template + const T& Boot(T SettingsT::* member) const noexcept + { + static const T kZero{}; + if (!latched_) { + return kZero; + } + const size_t offset = detail::MemberOffset(member); + return *reinterpret_cast(reinterpret_cast(&bootCopy_) + offset); + } + + template + bool HasPendingChange(const SettingsT& live, T SettingsT::* member) const noexcept + { + if (!latched_) { + return false; + } + const size_t offset = detail::MemberOffset(member); + return std::memcmp(reinterpret_cast(&bootCopy_) + offset, + reinterpret_cast(&live) + offset, + sizeof(T)) != 0; + } + + bool HasPendingChange(const SettingsT& live, const RestartFieldInfo& field) const noexcept + { + if (!latched_ || !field.jsonKey) { + return false; + } + return std::memcmp(reinterpret_cast(&bootCopy_) + field.offset, + reinterpret_cast(&live) + field.offset, + field.size) != 0; + } + + template + const RestartFieldInfo* FindField(T SettingsT::* member) const noexcept + { + const size_t offset = detail::MemberOffset(member); + const size_t size = sizeof(T); + for (const auto& field : Fields()) { + if (field.offset == offset && field.size == size) { + return &field; + } + 
} + return nullptr; + } + + private: + SettingsT bootCopy_{}; + const RestartFieldInfo* table_ = nullptr; + size_t tableSize_ = 0; + bool latched_ = false; + }; +} diff --git a/src/Utils/RestartSettings.h b/src/Utils/RestartSettings.h new file mode 100644 index 0000000000..1a5f217b82 --- /dev/null +++ b/src/Utils/RestartSettings.h @@ -0,0 +1,40 @@ +#pragma once + +#include +#include +#include +#include + +namespace Util::Settings +{ + // Type-erased field descriptor for restart-gated settings. + // + // `jsonKey` must match the NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE field name so + // MCP/RemoteControl can refer to it without per-feature glue. + struct RestartFieldInfo + { + const char* jsonKey = nullptr; + const char* label = nullptr; + size_t offset = 0; + size_t size = 0; + }; + + template + using RestartTable = std::array; + + inline constexpr const RestartFieldInfo* FindRestartField(std::span fields, std::string_view jsonKey) noexcept + { + for (const auto& field : fields) { + if (field.jsonKey && jsonKey == field.jsonKey) { + return &field; + } + } + return nullptr; + } +} + +// Convenience macro for building a RestartFieldInfo entry without duplicating +// the member name string. Requires SettingsT to be standard-layout. +#define UTIL_RESTART_FIELD(SettingsT, member, userLabel) \ + Util::Settings::RestartFieldInfo{ #member, userLabel, offsetof(SettingsT, member), sizeof(decltype(SettingsT::member)) } + diff --git a/src/Utils/Subrect.cpp b/src/Utils/Subrect.cpp index 2a9fb590ef..84d0eb61bb 100644 --- a/src/Utils/Subrect.cpp +++ b/src/Utils/Subrect.cpp @@ -1,8 +1,14 @@ #include "Utils/Subrect.h" #include +#include +#include #include +// OpaquePreviewBlendCallback lives in Subrect_PreviewBlend.cpp — that TU +// reaches into the plugin's d3d singletons, which the unit-test target +// (tests/cpp pulls Subrect.cpp standalone) can't link against. 
+ namespace { Util::Subrect::UVRegion ClampUV(Util::Subrect::UVRegion uv) @@ -39,6 +45,16 @@ namespace return ClampUV(uv); } + Util::Subrect::UVRegion MirrorUVHorizontal(const Util::Subrect::UVRegion& uv) + { + // HMD nose-side overlap: left-eye nose-side region is on the right + // half of the eye texture; mirror around x=0.5 maps it to the + // right-eye's left half. + Util::Subrect::UVRegion mirrored = uv; + mirrored.x = 1.0f - uv.x - uv.w; + return ClampUV(mirrored); + } + json SaveUVToJson(const Util::Subrect::UVRegion& uv) { return { uv.x, uv.y, uv.w, uv.h }; @@ -70,6 +86,30 @@ namespace Util::Subrect if (a_json.contains("CropH")) currentUV.h = a_json["CropH"]; + const bool hasExplicitLeft = + a_json.contains("CropX") || a_json.contains("CropY") || + a_json.contains("CropW") || a_json.contains("CropH"); + // Require the full quartet before declaring the right-eye UV explicit. + // A partial config (e.g. only CropRightW present) would otherwise reuse + // stale values for the missing components and silently suppress the + // left→right auto-mirror fallback. With AND semantics, partial keys + // behave as "not explicit" and the mirror still runs. + const bool hasExplicitRight = + a_json.contains("CropRightX") && a_json.contains("CropRightY") && + a_json.contains("CropRightW") && a_json.contains("CropRightH"); + if (a_json.contains("CropRightX")) + currentRightUV.x = a_json["CropRightX"]; + if (a_json.contains("CropRightY")) + currentRightUV.y = a_json["CropRightY"]; + if (a_json.contains("CropRightW")) + currentRightUV.w = a_json["CropRightW"]; + if (a_json.contains("CropRightH")) + currentRightUV.h = a_json["CropRightH"]; + // Reset every load — a later LoadSettings without CropRight* keys + // should let SetStereoEnabled(true) auto-mirror again rather than + // preserving stale state from a prior load. 
+ rightUVLoadedFromJson = hasExplicitRight; + if (a_json.contains("CropPresets") && a_json["CropPresets"].is_array()) { presets.clear(); for (auto& entry : a_json["CropPresets"]) { @@ -78,6 +118,19 @@ namespace Util::Subrect if (entry.contains("uv")) { preset.uv = LoadUVArray(entry["uv"]); } + // Right-eye UV is optional in JSON; leave nullopt when absent so + // ApplyPreset auto-mirrors the left eye on demand. Explicit + // right_uv in JSON wins over any mirror — but only when it + // looks structurally valid. LoadUVArray falls back to a + // full-frame UV on malformed input, so without this guard a + // bad `right_uv` payload would suppress auto-mirroring AND + // land the right eye as full-frame, which is the worst of + // both worlds. + if (entry.contains("right_uv") && + entry["right_uv"].is_array() && + entry["right_uv"].size() == 4) { + preset.rightUV = LoadUVArray(entry["right_uv"]); + } presets.push_back(std::move(preset)); } } @@ -85,6 +138,14 @@ namespace Util::Subrect EnsureDefaultPreset(); ClampCurrentUV(); + // Legacy upgrade: if the JSON has the mono crop keys but no right-eye + // keys, mirror left → right so existing user settings transition + // cleanly. If neither side is present, leave currentRightUV alone so + // EnsureDefaultPreset's seeded right-eye value survives. + if (stereoEnabled && hasExplicitLeft && !hasExplicitRight) { + SyncRightUV(); + } + if (a_json.contains("SelectedPresetIndex")) { selectedPresetIndex = a_json["SelectedPresetIndex"]; if (selectedPresetIndex >= 0 && selectedPresetIndex < static_cast(presets.size())) { @@ -102,11 +163,32 @@ namespace Util::Subrect a_json["CropW"] = currentUV.w; a_json["CropH"] = currentUV.h; + if (stereoEnabled) { + a_json["CropRightX"] = currentRightUV.x; + a_json["CropRightY"] = currentRightUV.y; + a_json["CropRightW"] = currentRightUV.w; + a_json["CropRightH"] = currentRightUV.h; + } else { + // Caller may pass a JSON object with prior stereo keys (e.g. 
a + // host that re-saves into the same in-memory config). Drop them + // so the next load doesn't look like it had explicit stereo data. + a_json.erase("CropRightX"); + a_json.erase("CropRightY"); + a_json.erase("CropRightW"); + a_json.erase("CropRightH"); + } + json presetsJson = json::array(); for (const auto& preset : presets) { json entry; entry["name"] = preset.name; entry["uv"] = SaveUVToJson(preset.uv); + // Only serialize right_uv when stereo is enabled AND we have an + // explicit value to persist. A nullopt preset implicitly means + // "auto-mirror at apply time" and shouldn't be locked into JSON. + if (stereoEnabled && preset.rightUV.has_value()) { + entry["right_uv"] = SaveUVToJson(*preset.rightUV); + } presetsJson.push_back(std::move(entry)); } a_json["CropPresets"] = presetsJson; @@ -118,6 +200,21 @@ namespace Util::Subrect seededDefaults = std::move(defaults); } + void Controller::SetStereoEnabled(bool enabled) + { + if (stereoEnabled == enabled) { + return; + } + stereoEnabled = enabled; + // Only auto-mirror left→right when the right-eye UV hasn't been + // explicitly loaded from JSON. Otherwise a caller that does + // `LoadSettings` (stereo off) then `SetStereoEnabled(true)` would + // silently overwrite a deliberate persisted right-eye crop. + if (stereoEnabled && !rightUVLoadedFromJson) { + SyncRightUV(); + } + } + void Controller::DrawEditor(ID3D11ShaderResourceView* previewSrv, ID3D11Texture2D* previewTexture, float uvVisibleWidth, float uvStartX, ImDrawCallback imageRenderCallback) { // Hosts that render without first calling LoadSettings would otherwise @@ -148,7 +245,17 @@ namespace Util::Subrect if (ImGui::Button("Save Preset")) { std::string presetName = newPresetName; if (!presetName.empty()) { - presets.push_back(Preset{ .name = presetName, .uv = currentUV }); + // Preserve the right-eye UV only when stereo is on. 
In mono + // mode currentRightUV is not tracked against currentUV, so + // snapshotting it would falsely mark the preset as having an + // explicit right eye and disable the auto-mirror fallback + // once stereo is later enabled. Leave rightUV as nullopt in + // mono — ApplyPreset will mirror left at apply time. + Preset newPreset{ .name = presetName, .uv = currentUV }; + if (stereoEnabled) { + newPreset.rightUV = currentRightUV; + } + presets.push_back(std::move(newPreset)); selectedPresetIndex = static_cast(presets.size()) - 1; newPresetName[0] = '\0'; } @@ -177,6 +284,9 @@ namespace Util::Subrect if (changed) { selectedPresetIndex = -1; ClampCurrentUV(); + if (stereoEnabled) { + SyncRightUV(); + } } ImGui::Spacing(); @@ -235,6 +345,9 @@ namespace Util::Subrect currentUV.w = maxX - minX; currentUV.h = maxY - minY; ClampCurrentUV(); + if (stereoEnabled) { + SyncRightUV(); + } if (!ImGui::IsMouseDown(ImGuiMouseButton_Left)) { isDraggingCrop = false; @@ -253,6 +366,23 @@ namespace Util::Subrect return UVToPixelRegion(currentUV, width, height); } + StereoPixelRegions Controller::GetStereoPixelRegions(uint32_t fullWidth, uint32_t fullHeight) const + { + // Degenerate inputs would underflow UVToPixelRegion's `width - 1` / + // `height - 1` computations into huge values. Fail safe with empty + // regions so callers can detect the bad-input case via .w == 0. + if (fullWidth < 2 || fullHeight == 0) { + return { PixelRegion{ 0, 0, 0, 0 }, PixelRegion{ 0, 0, 0, 0 } }; + } + // Each eye occupies half the SBS texture width. In mono mode, both + // eyes report the same region so callers don't need to branch. + const uint32_t eyeWidth = fullWidth / 2; + StereoPixelRegions regions; + regions.leftEye = UVToPixelRegion(currentUV, eyeWidth, fullHeight); + regions.rightEye = UVToPixelRegion(stereoEnabled ? 
currentRightUV : currentUV, eyeWidth, fullHeight); + return regions; + } + void Controller::EnsureDefaultPreset() { if (!presets.empty()) { @@ -263,6 +393,9 @@ namespace Util::Subrect // currentUV must match what the combo shows as selected; otherwise // the first preset appears chosen but the crop region stays full-frame. currentUV = presets[0].uv; + // nullopt rightUV means "auto-mirror" — match the same fallback + // ApplyPreset uses below. + currentRightUV = presets[0].rightUV.value_or(MirrorUVHorizontal(currentUV)); selectedPresetIndex = 0; } else { presets.push_back(Preset{ .name = "Full Frame", .uv = DefaultUV() }); @@ -272,6 +405,7 @@ namespace Util::Subrect void Controller::ClampCurrentUV() { currentUV = ClampUV(currentUV); + currentRightUV = ClampUV(currentRightUV); } void Controller::ApplyPreset(int index) @@ -279,6 +413,15 @@ namespace Util::Subrect EnsureDefaultPreset(); selectedPresetIndex = std::clamp(index, 0, static_cast(presets.size()) - 1); currentUV = presets[selectedPresetIndex].uv; + // Nullopt right-UV → mirror left around x=0.5. This is the safe default + // for presets created without a stereo-specific intent (e.g. via + // SeedDefaultPresets with only .name + .uv specified). + currentRightUV = presets[selectedPresetIndex].rightUV.value_or(MirrorUVHorizontal(currentUV)); ClampCurrentUV(); } + + void Controller::SyncRightUV() + { + currentRightUV = MirrorUVHorizontal(currentUV); + } } // namespace Util::Subrect diff --git a/src/Utils/Subrect.h b/src/Utils/Subrect.h index f05934c8be..c5c600f257 100644 --- a/src/Utils/Subrect.h +++ b/src/Utils/Subrect.h @@ -1,8 +1,22 @@ #pragma once +#include #include +#include +#include +#include #include +// Forward-declared so the header doesn't drag in . The plugin's PCH +// brings the real types into scope at definition sites. +struct ID3D11ShaderResourceView; +struct ID3D11Texture2D; + +// Mirrors the global `using json = nlohmann::json;` from the plugin PCH so +// the header builds standalone (e.g. 
in unit-test targets that don't +// precompile PCH). Identical aliases in the same scope are well-defined. +using json = nlohmann::json; + namespace Util::Subrect { struct UVRegion @@ -21,15 +35,33 @@ namespace Util::Subrect uint32_t h = 1; }; + struct StereoPixelRegions + { + PixelRegion leftEye; + PixelRegion rightEye; + }; + struct Preset { std::string name; - UVRegion uv; + UVRegion uv; // Left-eye UV when stereo is enabled; sole UV otherwise. + // Right-eye UV. `std::nullopt` means "no explicit right eye — auto-mirror + // left around x=0.5 when stereo is enabled". A default-constructed + // `UVRegion{}` (full frame) would otherwise be ambiguous: it could mean + // "the user wants full frame" or "the caller didn't supply one", and + // the silent-full-frame case bites SeedDefaultPresets callers that only + // fill `.name` and `.uv`. + std::optional rightUV{}; }; // "User picks a sub-rectangle of an image" controller. Crop UV is in [0,1] // of the source the caller passes to GetPixelRegion(). Hosts that want // preset-based eye selection seed Left/Right/Full Frame via SeedDefaultPresets. + // + // Stereo: hosts that consume a side-by-side stereo texture call + // SetStereoEnabled(true) to track a separate right-eye UV. Right-eye UV + // auto-mirrors left around x=0.5 unless explicitly edited; this matches + // HMD nose-side overlap symmetry. class Controller { public: @@ -40,11 +72,19 @@ namespace Util::Subrect // CropPresets entry yet. Empty-case only - user edits/deletions persist. void SeedDefaultPresets(std::vector defaults); + // Toggles right-eye UV tracking. Off by default (mono). + // When enabled, edits to the primary UV auto-mirror to the right-eye + // UV (around x=0.5), and SaveSettings emits the extra right-eye keys. + void SetStereoEnabled(bool enabled); + bool IsStereoEnabled() const { return stereoEnabled; } + // uvStartX/uvVisibleWidth window the preview onto a sub-region of the // texture; crop UV stays in [0,1] of that window. 
imageRenderCallback, // when non-null, is queued via ImDrawList::AddCallback around the // preview Image draw (paired with ImDrawCallback_ResetRenderState) so - // hosts can override blend state for the image specifically. + // hosts can override blend state for the image specifically. Pass + // OpaquePreviewBlendCallback when the preview texture is an RT with + // non-1 alpha (kMAIN, etc.) to suppress menu-background bleed-through. void DrawEditor(ID3D11ShaderResourceView* previewSrv, ID3D11Texture2D* previewTexture, float uvVisibleWidth = 1.0f, float uvStartX = 0.0f, ImDrawCallback imageRenderCallback = nullptr); @@ -52,7 +92,17 @@ namespace Util::Subrect // Resolves the crop UV against an arbitrary pixel size. PixelRegion GetPixelRegion(uint32_t width, uint32_t height) const; + // In stereo mode, resolves both eyes' UVs against an SBS texture by + // dividing width by 2. In mono mode, both eyes resolve from currentUV. + // + // Coordinate space: both leftEye.x and rightEye.x are in PER-EYE + // space (i.e. x in [0, fullWidth/2)) — the right eye is NOT + // pre-offset by eyeWidth. Callers that draw into the full SBS + // texture must add `fullWidth / 2` to rightEye.x themselves. + StereoPixelRegions GetStereoPixelRegions(uint32_t fullWidth, uint32_t fullHeight) const; + const UVRegion& GetUV() const { return currentUV; } + const UVRegion& GetRightEyeUV() const { return stereoEnabled ? currentRightUV : currentUV; } private: std::vector presets; @@ -61,6 +111,13 @@ namespace Util::Subrect char newPresetName[64] = ""; UVRegion currentUV{}; + UVRegion currentRightUV{}; + bool stereoEnabled = false; + // True once LoadSettings sees an explicit CropRight* key. Suppresses + // the auto-mirror in SetStereoEnabled(true) so a deliberate JSON + // right-eye crop survives a mono→stereo transition that happens + // after the load. 
+ bool rightUVLoadedFromJson = false; bool isDraggingCrop = false; float dragStartUV[2] = { 0.0f, 0.0f }; @@ -68,5 +125,23 @@ namespace Util::Subrect void EnsureDefaultPreset(); void ClampCurrentUV(); void ApplyPreset(int index); + void SyncRightUV(); }; + + // Opaque-RGB blend state callback for Controller::DrawEditor. Pass when the + // preview SRV is a render target with non-1 alpha (kMAIN, kTOTAL, etc.). + // ImGui's default SRC_ALPHA blend would let the menu background bleed + // through where the source alpha is < 1, making the preview look like a + // transparency mask. This callback switches to opaque RGB-only writes + // around the Image draw; DrawEditor queues ImDrawCallback_ResetRenderState + // immediately after to restore default state. + // + // Two non-obvious regression risks if reimplemented: + // 1. BlendEnable must stay FALSE — SRC_ALPHA causes the bleed-through. + // 2. WriteMask must exclude alpha (RGB only). In VR, Skyrim's menu UI + // shader recomposites the menu plate over the SBS framebuffer with + // alpha blending; writing texture alpha into the menu plate RT + // produces a cutout visible only through the HMD. RGB-only writes + // leave the plate's pre-cleared alpha=1 in place. + void OpaquePreviewBlendCallback(const ImDrawList*, const ImDrawCmd*); } // namespace Util::Subrect diff --git a/src/Utils/Subrect_PreviewBlend.cpp b/src/Utils/Subrect_PreviewBlend.cpp new file mode 100644 index 0000000000..287eeed0cb --- /dev/null +++ b/src/Utils/Subrect_PreviewBlend.cpp @@ -0,0 +1,42 @@ +// Separate TU for Util::Subrect::OpaquePreviewBlendCallback. Split from +// Subrect.cpp because this needs the plugin's d3d singletons (globals::d3d), +// and Subrect.cpp is also compiled standalone by the unit-test target +// (tests/cpp/CMakeLists.txt) which has no PCH and no D3D context to bind. +// Plugin builds pick this up automatically via the src/*.cpp GLOB_RECURSE. 
+ +#include "Globals.h" +#include "Utils/D3D.h" +#include "Utils/Subrect.h" + +#include +#include +#include + +namespace Util::Subrect +{ + void OpaquePreviewBlendCallback(const ImDrawList*, const ImDrawCmd*) + { + auto* device = globals::d3d::device; + auto* context = globals::d3d::context; + if (!device || !context) { + return; + } + + static winrt::com_ptr opaqueBlend; + if (!opaqueBlend) { + D3D11_BLEND_DESC desc{}; + desc.RenderTarget[0].BlendEnable = FALSE; + desc.RenderTarget[0].RenderTargetWriteMask = + D3D11_COLOR_WRITE_ENABLE_RED | + D3D11_COLOR_WRITE_ENABLE_GREEN | + D3D11_COLOR_WRITE_ENABLE_BLUE; + if (FAILED(device->CreateBlendState(&desc, opaqueBlend.put()))) { + return; + } + Util::SetResourceName(opaqueBlend.get(), "Subrect::OpaquePreviewBlend"); + } + if (opaqueBlend) { + context->OMSetBlendState(opaqueBlend.get(), nullptr, 0xFFFFFFFF); + } + } +} // namespace Util::Subrect diff --git a/src/Utils/UI.cpp b/src/Utils/UI.cpp index 404d7cfdb8..f105e48e1f 100644 --- a/src/Utils/UI.cpp +++ b/src/Utils/UI.cpp @@ -1424,6 +1424,11 @@ namespace Util return globals::menu->GetTheme().StatusPalette.Disable; } + ImVec4 GetRestartNeeded() + { + return globals::menu->GetTheme().StatusPalette.RestartNeeded; + } + } namespace Text @@ -1469,6 +1474,8 @@ namespace Util UTIL_TEXT_WRAPPED(WrappedInfo, GetInfo) UTIL_TEXT(Disabled, GetDisabled) UTIL_TEXT_WRAPPED(WrappedDisabled, GetDisabled) + UTIL_TEXT(RestartNeeded, GetRestartNeeded) + UTIL_TEXT_WRAPPED(WrappedRestartNeeded, GetRestartNeeded) #undef UTIL_TEXT #undef UTIL_TEXT_WRAPPED diff --git a/src/Utils/UI.h b/src/Utils/UI.h index 7a7aec368a..77289ada38 100644 --- a/src/Utils/UI.h +++ b/src/Utils/UI.h @@ -1,15 +1,19 @@ #pragma once #include #include // For FLT_MAX +#include #include #include #include +#include #include +#include #include #include // For WPARAM and virtual key constants #include "../FeatureConstraints.h" #include "../Menu/Fonts.h" +#include "Utils/BootSnapshot.h" #include "Utils/Input.h" // Forward 
declarations @@ -622,7 +626,13 @@ namespace Util const std::vector& footerRows = {}, const ImVec2& outerSize = ImVec2(0, 0)) { - ImGuiTableFlags flags = ImGuiTableFlags_Borders | ImGuiTableFlags_RowBg | ImGuiTableFlags_Sortable | ImGuiTableFlags_Resizable | ImGuiTableFlags_SizingStretchProp; + // ScrollY makes the table scroll internally when its bounded + // outerSize is smaller than its content. For unbounded tables + // (outerSize.y==0, auto-sized below) the size always fits the rows + // so the scrollbar stays hidden -- adding the flag is harmless in + // that case and lets bounded callers (overlay's host window) keep + // content above the table visible regardless of row count. + ImGuiTableFlags flags = ImGuiTableFlags_Borders | ImGuiTableFlags_RowBg | ImGuiTableFlags_Sortable | ImGuiTableFlags_Resizable | ImGuiTableFlags_SizingStretchProp | ImGuiTableFlags_ScrollY; ImVec2 tableSize = outerSize; if (outerSize.y == 0.0f) { size_t totalRows = rows.size() + footerRows.size(); @@ -953,6 +963,126 @@ namespace Util void WrappedInfo(const char* fmt, ...) IM_FMTARGS(1); void Disabled(const char* fmt, ...) IM_FMTARGS(1); void WrappedDisabled(const char* fmt, ...) IM_FMTARGS(1); + void RestartNeeded(const char* fmt, ...) IM_FMTARGS(1); + void WrappedRestartNeeded(const char* fmt, ...) IM_FMTARGS(1); + } + + // Restart-required settings UI helpers. + namespace UI + { + template + inline void DrawSettingDiff(const Util::Settings::BootSnapshot& snapshot, const SettingsT& live, T SettingsT::* field) + { + if (!snapshot.IsLatched()) { + return; + } + const auto* info = snapshot.FindField(field); + if (!info) { + return; + } + if (!snapshot.HasPendingChange(live, field)) { + return; + } + + if constexpr (std::is_same_v) { + const bool boot = snapshot.Boot(field); + Util::Text::RestartNeeded( + "Pending restart: %s changed (active = %s, selected = %s).", + info->label, + boot ? "on" : "off", + (live.*field) ? 
"on" : "off"); + return; + } + if constexpr (std::is_integral_v) { + const auto boot = static_cast(snapshot.Boot(field)); + const auto selected = static_cast(live.*field); + Util::Text::RestartNeeded( + "Pending restart: %s changed (active = %lld, selected = %lld).", + info->label, + boot, + selected); + return; + } + if constexpr (std::is_floating_point_v) { + const double boot = static_cast(snapshot.Boot(field)); + const double selected = static_cast(live.*field); + Util::Text::RestartNeeded( + "Pending restart: %s changed (active = %.3f, selected = %.3f).", + info->label, + boot, + selected); + return; + } + + Util::Text::RestartNeeded("Pending restart: %s changed.", info->label); + } + + template + inline void DrawPendingBanners(const Util::Settings::BootSnapshot& snapshot, + const SettingsT& live, + std::span fields) + { + if (!snapshot.IsLatched()) { + return; + } + for (const auto& field : fields) { + if (snapshot.HasPendingChange(live, field)) { + Util::Text::RestartNeeded("Pending restart: %s changed.", field.label); + } + } + } + + // One-call wrapper for a restart-gated ImGui control. Call IMMEDIATELY + // AFTER the control (Checkbox / SliderInt / Combo / etc.) so the + // HoverTooltipWrapper attaches to that control. Does two things: + // 1. If hovered, renders a tooltip via HoverTooltipWrapper (the + // codebase's dominant tooltip pattern, used 296+ times -- gives + // a consistent Subtext font role and viewport-clamped placement) + // with the standard "Requires a game restart to change." suffix + // appended after the caller-supplied body. + // 2. Calls DrawSettingDiff outside the hover scope to render the + // "Pending restart" banner when the live value diverges from + // the boot snapshot. + // + // Two overloads: + // - `const char* body` for the simple single-string case. + // - Callable `body` for multi-line tooltips that already use + // HoverTooltipWrapper-style content (multiple ImGui::Text / + // TextWrapped calls). 
The callable runs inside the + // HoverTooltipWrapper RAII scope so callers can use any ImGui + // text primitive. + // + // Pass nullptr / empty body to render just the suffix. + template + inline void RestartGatedAnnotate(const Util::Settings::BootSnapshot& snapshot, + const SettingsT& live, + T SettingsT::* field, + const char* tooltipBody = nullptr) + { + if (auto _tt = Util::HoverTooltipWrapper()) { + if (tooltipBody && tooltipBody[0]) { + ImGui::TextUnformatted(tooltipBody); + ImGui::Spacing(); + } + ImGui::TextUnformatted("Requires a game restart to change."); + } + DrawSettingDiff(snapshot, live, field); + } + + template + requires std::invocable + inline void RestartGatedAnnotate(const Util::Settings::BootSnapshot& snapshot, + const SettingsT& live, + T SettingsT::* field, + Body&& body) + { + if (auto _tt = Util::HoverTooltipWrapper()) { + body(); + ImGui::Spacing(); + ImGui::TextUnformatted("Requires a game restart to change."); + } + DrawSettingDiff(snapshot, live, field); + } } /** diff --git a/src/WeatherEditor/EditorWindow.cpp b/src/WeatherEditor/EditorWindow.cpp index 7f84372e48..9eb12a0fc5 100644 --- a/src/WeatherEditor/EditorWindow.cpp +++ b/src/WeatherEditor/EditorWindow.cpp @@ -260,7 +260,8 @@ void EditorWindow::ShowObjectsWindow() }; // Build active records for the current category tab - struct ActiveRecord { + struct ActiveRecord + { std::string label; std::string suffix; RE::FormID formId; @@ -285,15 +286,17 @@ void EditorWindow::ShowObjectsWindow() }; auto addSingle = [&](RE::TESForm* form, const WidgetVec& widgets, std::string suffix = "") { - if (!form) return; + if (!form) + return; auto id = form->GetFormID(); activeRecords.push_back({ ResolveEditorId(form, widgets), std::move(suffix), id, openByFormId(id, &widgets) }); }; - auto addTOD = [&](auto* (&fields)[RE::TESWeather::ColorTimes::kTotal], const WidgetVec& widgets) { + auto addTOD = [&](auto*(&fields)[RE::TESWeather::ColorTimes::kTotal], const WidgetVec& widgets) { for (int tod = 0; 
tod < RE::TESWeather::ColorTimes::kTotal; ++tod) { auto* form = fields[tod]; - if (!form) continue; + if (!form) + continue; auto id = form->GetFormID(); bool already = std::any_of(activeRecords.begin(), activeRecords.end(), [&](const ActiveRecord& r) { return r.formId == id; }); @@ -303,7 +306,8 @@ void EditorWindow::ShowObjectsWindow() }; auto addWeather = [&](RE::TESWeather* weatherRecord, std::string suffix = "") { - if (!weatherRecord) return; + if (!weatherRecord) + return; auto id = weatherRecord->GetFormID(); activeRecords.push_back({ ResolveEditorId(weatherRecord, weatherWidgets), std::move(suffix), id, openByFormId(id, &weatherWidgets) }); }; @@ -313,7 +317,8 @@ void EditorWindow::ShowObjectsWindow() if (sky && sky->lastWeather != weather) addWeather(sky->lastWeather, "transitioning"); } else if (m_selectedCategory == "ImageSpace") { - if (weather) addTOD(weather->imageSpaces, imageSpaceWidgets); + if (weather) + addTOD(weather->imageSpaces, imageSpaceWidgets); } else if (m_selectedCategory == "Lighting Template") { auto* player = RE::PlayerCharacter::GetSingleton(); if (player && player->parentCell) @@ -339,13 +344,17 @@ void EditorWindow::ShowObjectsWindow() } }); } } else if (m_selectedCategory == "Volumetric Lighting") { - if (weather) addTOD(weather->volumetricLighting, volumetricLightingWidgets); + if (weather) + addTOD(weather->volumetricLighting, volumetricLightingWidgets); } else if (m_selectedCategory == "Shader Particle Geometry") { - if (weather) addSingle(weather->precipitationData, precipitationWidgets); + if (weather) + addSingle(weather->precipitationData, precipitationWidgets); } else if (m_selectedCategory == "Lens Flare") { - if (weather) addSingle(weather->sunGlareLensFlare, lensFlareWidgets); + if (weather) + addSingle(weather->sunGlareLensFlare, lensFlareWidgets); } else if (m_selectedCategory == "Visual Effect") { - if (weather) addSingle(weather->referenceEffect, referenceEffectWidgets); + if (weather) + 
addSingle(weather->referenceEffect, referenceEffectWidgets); } // Fall back to current weather when the active category has no active record @@ -359,9 +368,7 @@ void EditorWindow::ShowObjectsWindow() if (!activeRecords.empty()) { const auto& theme = Menu::GetSingleton()->GetTheme(); - ImGui::PushStyleColor(ImGuiCol_Text, theme.StatusPalette.RestartNeeded); - ImGui::Text("Active:"); - ImGui::PopStyleColor(); + Util::Text::RestartNeeded("Active:"); ImGui::SameLine(); const float recordX = ImGui::GetCursorPosX(); @@ -1859,8 +1866,8 @@ void EditorWindow::DrawTimeControls() const float framePadX = ImGui::GetStyle().FramePadding.x * 2.0f; const float buttonWidth = std::max({ ImGui::CalcTextSize("Resume Time").x, - ImGui::CalcTextSize("Pause Time").x, - ImGui::CalcTextSize("Reset Speed").x }) + + ImGui::CalcTextSize("Pause Time").x, + ImGui::CalcTextSize("Reset Speed").x }) + framePadX; if (ImGui::Button(timePaused ? "Resume Time" : "Pause Time", ImVec2(buttonWidth, 0))) TogglePause(); diff --git a/src/XSEPlugin.cpp b/src/XSEPlugin.cpp index f1d7f6ac98..44b94558c8 100644 --- a/src/XSEPlugin.cpp +++ b/src/XSEPlugin.cpp @@ -53,7 +53,7 @@ extern "C" DLLEXPORT bool SKSEAPI SKSEPlugin_Load(const SKSE::LoadInterface* a_s InitializeLog(); logger::info("Loaded {} {}", Plugin::NAME, Plugin::VERSION.string()); SKSE::Init(a_skse); - SKSE::AllocTrampoline(1 << 10); + SKSE::AllocTrampoline(1 << 12); return Load(); } @@ -106,7 +106,7 @@ void MessageHandler(SKSE::MessagingInterface::Message* message) { for (auto it = errors.begin(); it != errors.end(); ++it) { auto& errorMessage = *it; - RE::DebugMessageBox(std::format("Community Shaders\n{}, will disable all hooks and features", errorMessage).c_str()); + RE::DebugMessageBox(std::format("Open Shaders\n{}, will disable all hooks and features", errorMessage).c_str()); } if (errors.empty()) { diff --git a/tests/cpp/CMakeLists.txt b/tests/cpp/CMakeLists.txt new file mode 100644 index 0000000000..769dcc84d0 --- /dev/null +++ 
b/tests/cpp/CMakeLists.txt @@ -0,0 +1,80 @@ +# C++ unit tests for plugin utility code. +# +# Catch2-based. Disjoint from tests/shaders (which tests HLSL via +# ShaderTestFramework). Add test sources here for any plugin code that +# doesn't require a live DirectX device or ImGui context. + +cmake_minimum_required(VERSION 4.2) + +option(BUILD_CPP_TESTS "Build C++ unit tests for plugin utilities" ON) + +if(NOT BUILD_CPP_TESTS) + message(STATUS "C++ unit tests disabled") + return() +endif() + +message(STATUS "Configuring C++ unit tests...") + +include(FetchContent) + +# Reused by tests/shaders if that target is also enabled. FetchContent dedupes +# by name, so the same Catch2 source tree serves both. +FetchContent_Declare( + catch2 + GIT_REPOSITORY https://github.com/catchorg/Catch2.git + GIT_TAG v3.11.0 +) +FetchContent_MakeAvailable(Catch2) + +# MSVC 14.50+ (VS 2026) ICE workaround copied from tests/shaders. +if(MSVC AND MSVC_VERSION GREATER_EQUAL 1950) + if(TARGET Catch2) + target_compile_options(Catch2 PRIVATE /Od) + endif() +endif() + +find_package(nlohmann_json CONFIG REQUIRED) +find_package(imgui CONFIG REQUIRED) + +add_executable(cpp_tests + test_main.cpp + test_bootsnapshot.cpp + test_subrect.cpp + # Compile the unit-under-test directly into the test binary so we don't + # depend on the plugin DLL build (which pulls in FFX/Streamline/etc.). + "${CMAKE_SOURCE_DIR}/src/Utils/Subrect.cpp" +) + +set_property(TARGET cpp_tests PROPERTY CXX_STANDARD 23) +set_property(TARGET cpp_tests PROPERTY CXX_STANDARD_REQUIRED ON) + +target_include_directories(cpp_tests PRIVATE + "${CMAKE_SOURCE_DIR}/src" +) + +target_link_libraries(cpp_tests PRIVATE + Catch2::Catch2 + nlohmann_json::nlohmann_json + imgui::imgui +) + +# windows.h (pulled in via d3d11.h) defines min/max macros that break the +# std::min/std::max calls in Subrect.cpp. The plugin's PCH suppresses +# this implicitly; the test target needs it explicitly. 
+target_compile_definitions(cpp_tests PRIVATE + NOMINMAX + WIN32_LEAN_AND_MEAN +) + +enable_testing() +add_test( + NAME CppUtilTests + COMMAND cpp_tests --reporter compact + WORKING_DIRECTORY $ +) +set_tests_properties(CppUtilTests PROPERTIES + TIMEOUT 60 + LABELS "Cpp;UnitTests" +) + +message(STATUS "C++ unit tests configured successfully") diff --git a/tests/cpp/test_bootsnapshot.cpp b/tests/cpp/test_bootsnapshot.cpp new file mode 100644 index 0000000000..ea5f05228b --- /dev/null +++ b/tests/cpp/test_bootsnapshot.cpp @@ -0,0 +1,127 @@ +// Unit tests for Util::Settings::BootSnapshot (restart-required settings diff). + +#include "Utils/BootSnapshot.h" + +#include + +#include +#include + +namespace +{ + struct TestSettings + { + uint32_t mode = 0; + bool enabled = false; + float value = 0.0f; + }; + + inline constexpr Util::Settings::RestartTable kFields{ { + UTIL_RESTART_FIELD(TestSettings, mode, "Mode"), + UTIL_RESTART_FIELD(TestSettings, enabled, "Enabled"), + } }; +} + +TEST_CASE("BootSnapshot starts unlatched and ignores diffs", "[bootsnapshot]") +{ + Util::Settings::BootSnapshot snap{ kFields }; + TestSettings live{}; + live.mode = 3; + live.enabled = true; + + REQUIRE_FALSE(snap.IsLatched()); + REQUIRE(snap.RawBoot("mode") == nullptr); + REQUIRE_FALSE(snap.HasPendingChange(live, &TestSettings::mode)); +} + +TEST_CASE("BootSnapshot detects member changes after latch", "[bootsnapshot]") +{ + Util::Settings::BootSnapshot snap{ kFields }; + TestSettings boot{}; + boot.mode = 1; + boot.enabled = false; + + snap.Latch(boot); + REQUIRE(snap.IsLatched()); + REQUIRE(snap.Boot(&TestSettings::mode) == 1); + REQUIRE(snap.Boot(&TestSettings::enabled) == false); + + TestSettings live = boot; + REQUIRE_FALSE(snap.HasPendingChange(live, &TestSettings::mode)); + + live.mode = 2; + REQUIRE(snap.HasPendingChange(live, &TestSettings::mode)); + REQUIRE_FALSE(snap.HasPendingChange(live, &TestSettings::enabled)); +} + +TEST_CASE("BootSnapshot exposes field metadata by member", 
"[bootsnapshot]") +{ + Util::Settings::BootSnapshot snap{ kFields }; + const auto* info = snap.FindField(&TestSettings::enabled); + REQUIRE(info != nullptr); + REQUIRE(std::string(info->jsonKey) == "enabled"); + REQUIRE(std::string(info->label) == "Enabled"); +} + +namespace +{ + // Settings with a non-trivial member (std::string) -- not trivially- + // copyable but still copy-assignable. Mirrors real cases like + // ShadowCasterManager::Settings which carries exprtk formula strings + // alongside the POD restart-gated fields. + struct SettingsWithString + { + int32_t shadowLightCount = 0; + bool enabled = false; + std::string formula = "default"; // not registered as restart-gated + }; + + inline constexpr Util::Settings::RestartTable kStringFields{ { + UTIL_RESTART_FIELD(SettingsWithString, shadowLightCount, "Shadow Light Count"), + UTIL_RESTART_FIELD(SettingsWithString, enabled, "Enabled"), + } }; +} + +TEST_CASE("BootSnapshot deep-copies non-trivial members on Latch", "[bootsnapshot]") +{ + // Regression: the original BootSnapshot static_asserted trivially-copyable + // and Latch() used memcpy, which would shallow-copy std::string internals + // (corrupting the boot snapshot's string when the live string later + // reallocated). After the relaxation, Latch uses copy-assign so the + // non-trivial member is deep-copied. The POD restart-gated fields still + // drive HasPendingChange via memcmp. + Util::Settings::BootSnapshot snap{ kStringFields }; + SettingsWithString boot{}; + boot.shadowLightCount = 16; + boot.enabled = true; + const std::string originalFormula = "lightradius * lightintensity"; + boot.formula = originalFormula; + + snap.Latch(boot); + REQUIRE(snap.IsLatched()); + REQUIRE(snap.Boot(&SettingsWithString::shadowLightCount) == 16); + REQUIRE(snap.Boot(&SettingsWithString::enabled) == true); + // Read the snapshot's std::string directly. 
Catches shallow-copy + // regressions: if Latch were still doing a memcpy of SettingsWithString, + // the boot copy would hold a stale pointer into live.formula's heap + // buffer, and reading it (especially after live.formula reallocates) + // would crash or return garbage. The POD-only assertions don't cover + // this on their own. + REQUIRE(snap.Boot(&SettingsWithString::formula) == originalFormula); + + // Mutating the live struct (including reallocating its string) must NOT + // disturb the boot copy or produce false-positive diffs for unregistered + // fields. Force a string reallocation by growing it well past SSO size, + // then verify the boot copy's string is unchanged. + SettingsWithString live = boot; + live.formula = std::string(256, 'x'); + REQUIRE_FALSE(snap.HasPendingChange(live, &SettingsWithString::shadowLightCount)); + REQUIRE_FALSE(snap.HasPendingChange(live, &SettingsWithString::enabled)); + REQUIRE(snap.Boot(&SettingsWithString::formula) == originalFormula); + + // Now flip a registered POD field; the diff fires. + live.shadowLightCount = 32; + REQUIRE(snap.HasPendingChange(live, &SettingsWithString::shadowLightCount)); + REQUIRE_FALSE(snap.HasPendingChange(live, &SettingsWithString::enabled)); + REQUIRE(snap.Boot(&SettingsWithString::formula) == originalFormula); +} diff --git a/tests/cpp/test_main.cpp b/tests/cpp/test_main.cpp new file mode 100644 index 0000000000..2c8aefc944 --- /dev/null +++ b/tests/cpp/test_main.cpp @@ -0,0 +1,9 @@ +// Explicit main() for C++ unit tests. +// Catch2WithMain has linker issues with CMake 4.0 (see tests/shaders/minimal_test.cpp); +// we provide our own session entry point. +#include <catch2/catch_session.hpp> + +int main(int argc, char* argv[]) +{ + return Catch::Session().run(argc, argv); +} diff --git a/tests/cpp/test_subrect.cpp b/tests/cpp/test_subrect.cpp new file mode 100644 index 0000000000..b4908c8632 --- /dev/null +++ b/tests/cpp/test_subrect.cpp @@ -0,0 +1,418 @@ +// Unit tests for Util::Subrect::Controller. 
+// + // Focus: API contract for the stereo extension (PR-1 of the DLSS-PR-PLAN decomposition). +// Cannot exercise DrawEditor (needs an ImGui context); covers everything else: +// JSON load/save round-trips, mirror math, preset apply, mono/stereo back-compat. + +#include <nlohmann/json.hpp> +using json = nlohmann::json; + +#include <imgui.h> // ImDrawCallback declared in Subrect.h signature + +#include "Utils/Subrect.h" + +#include <catch2/catch_test_macros.hpp> + +#include <cmath> // std::abs +#include <utility> // std::pair +#include <vector> // std::vector — degenerate-dimensions cases enumerate (w,h) pairs + +using Util::Subrect::Controller; +using Util::Subrect::Preset; +using Util::Subrect::UVRegion; + +namespace +{ + bool UVApprox(const UVRegion& a, const UVRegion& b, float eps = 1e-5f) + { + return std::abs(a.x - b.x) < eps && std::abs(a.y - b.y) < eps && + std::abs(a.w - b.w) < eps && std::abs(a.h - b.h) < eps; + } +} + +TEST_CASE("Controller defaults to mono mode", "[subrect]") +{ + Controller c; + REQUIRE_FALSE(c.IsStereoEnabled()); + // Right-eye accessor folds onto primary UV in mono mode. + REQUIRE(UVApprox(c.GetUV(), c.GetRightEyeUV())); +} + +TEST_CASE("SaveSettings in mono mode emits no right-eye keys", "[subrect][backcompat]") +{ + // Pre-stereo screenshot JSON shape must round-trip bit-identically: this is + // the core back-compat contract for the existing ScreenshotFeature consumer. + // EnsureDefaultPreset is lazy (only runs in LoadSettings/ApplyPreset/DrawEditor), + // so prime it with an empty load to realize the placeholder preset. 
+ Controller c; + c.LoadSettings(json::object()); + json out; + c.SaveSettings(out); + + REQUIRE(out.contains("CropX")); + REQUIRE(out.contains("CropY")); + REQUIRE(out.contains("CropW")); + REQUIRE(out.contains("CropH")); + REQUIRE_FALSE(out.contains("CropRightX")); + REQUIRE_FALSE(out.contains("CropRightY")); + REQUIRE_FALSE(out.contains("CropRightW")); + REQUIRE_FALSE(out.contains("CropRightH")); + + REQUIRE(out["CropPresets"].is_array()); + REQUIRE(out["CropPresets"].size() >= 1); + for (const auto& entry : out["CropPresets"]) { + REQUIRE(entry.contains("uv")); + REQUIRE_FALSE(entry.contains("right_uv")); + } +} + +TEST_CASE("LoadSettings reads legacy mono JSON unchanged", "[subrect][backcompat]") +{ + // Replicates the screenshot feature's existing on-disk schema. + json in = { + { "CropX", 0.25f }, + { "CropY", 0.10f }, + { "CropW", 0.5f }, + { "CropH", 0.8f }, + { "CropPresets", json::array({ + { + { "name", "Full Frame" }, + { "uv", { 0.0f, 0.0f, 1.0f, 1.0f } }, + }, + }) }, + { "SelectedPresetIndex", -1 }, + }; + + Controller c; + c.LoadSettings(in); + + REQUIRE(UVApprox(c.GetUV(), { 0.25f, 0.10f, 0.5f, 0.8f })); + REQUIRE_FALSE(c.IsStereoEnabled()); +} + +TEST_CASE("SetStereoEnabled toggles mirror sync", "[subrect][stereo]") +{ + // Stereo enable on a fresh controller mirrors the default UV (which is + // {0,0,1,1}); the mirror of a full-frame is still full-frame. + Controller c; + c.SetStereoEnabled(true); + REQUIRE(c.IsStereoEnabled()); + + // Re-enable is a no-op (no double-sync, no state corruption). + c.SetStereoEnabled(true); + REQUIRE(c.IsStereoEnabled()); + + c.SetStereoEnabled(false); + REQUIRE_FALSE(c.IsStereoEnabled()); +} + +TEST_CASE("Stereo save then load round-trips right-eye keys", "[subrect][stereo]") +{ + // Seed presets with explicit asymmetric right-eye to confirm the on-disk + // schema preserves it (rather than always mirroring on load). 
+ Controller src; + src.SetStereoEnabled(true); + src.SeedDefaultPresets({ + Preset{ .name = "Asym", .uv = { 0.1f, 0.0f, 0.5f, 1.0f }, .rightUV = UVRegion{ 0.3f, 0.0f, 0.4f, 0.9f } }, + }); + // Realize the seed and copy it into currentUV/currentRightUV. + src.LoadSettings(json::object()); + + json saved; + src.SaveSettings(saved); + + REQUIRE(saved.contains("CropRightX")); + REQUIRE(saved["CropPresets"][0].contains("right_uv")); + + Controller dst; + dst.SetStereoEnabled(true); + dst.LoadSettings(saved); + + // After load, applying the asymmetric preset must restore both eyes + // exactly (not mirror-overwrite the right eye). + REQUIRE(UVApprox(dst.GetUV(), { 0.1f, 0.0f, 0.5f, 1.0f })); + REQUIRE(UVApprox(dst.GetRightEyeUV(), { 0.3f, 0.0f, 0.4f, 0.9f })); +} + +TEST_CASE("Stereo load mirrors right-eye when JSON lacks CropRight keys", "[subrect][stereo][backcompat]") +{ + // A user upgrading from a mono build has CropX/Y/W/H but no CropRight*. + // In stereo mode, the controller mirrors the primary UV around x=0.5. + json legacy = { + { "CropX", 0.2f }, + { "CropY", 0.0f }, + { "CropW", 0.5f }, + { "CropH", 1.0f }, + }; + + Controller c; + c.SetStereoEnabled(true); + c.LoadSettings(legacy); + + REQUIRE(UVApprox(c.GetUV(), { 0.2f, 0.0f, 0.5f, 1.0f })); + // Mirror: x = 1 - 0.2 - 0.5 = 0.3 + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.3f, 0.0f, 0.5f, 1.0f })); +} + +TEST_CASE("GetStereoPixelRegions splits SBS width per eye", "[subrect][stereo]") +{ + // Stereo regions resolve against half-width per eye (the texture is SBS). + // Caller passes the full stereo texture size; controller divides W by 2. + Controller c; + c.SetStereoEnabled(true); + + // Make eyes asymmetric so we can tell them apart. + c.SeedDefaultPresets({ + Preset{ + .name = "Asym", + .uv = { 0.0f, 0.0f, 1.0f, 1.0f }, // full left eye + .rightUV = UVRegion{ 0.0f, 0.0f, 0.5f, 1.0f }, // left half of right eye + }, + }); + // Realize the seed so currentUV/currentRightUV pick up the preset values. 
+ c.LoadSettings(json::object()); + + const auto regions = c.GetStereoPixelRegions(2000, 1000); + // Left eye spans the full 1000-wide left half. + REQUIRE(regions.leftEye.x == 0); + REQUIRE(regions.leftEye.w == 1000); + REQUIRE(regions.leftEye.h == 1000); + // Right eye spans only half the 1000-wide right half = 500 px. + REQUIRE(regions.rightEye.w == 500); +} + +TEST_CASE("GetStereoPixelRegions in mono mode returns identical eyes", "[subrect][stereo]") +{ + // Mono callers can use the stereo accessor and both eyes will resolve from + // the primary UV — lets DLSS consumers stay agnostic of stereo state. + Controller c; + json in = { + { "CropX", 0.0f }, + { "CropY", 0.0f }, + { "CropW", 1.0f }, + { "CropH", 1.0f }, + }; + c.LoadSettings(in); + + const auto regions = c.GetStereoPixelRegions(2000, 1000); + REQUIRE(regions.leftEye.x == regions.rightEye.x); + REQUIRE(regions.leftEye.w == regions.rightEye.w); +} + +TEST_CASE("Stereo SaveSettings emits right_uv for every preset", "[subrect][stereo][regression]") +{ + // Regression for CodeRabbit Major @ scs#2356: the Save Preset button used + // to drop currentRightUV, so re-applying a saved preset would zero out the + // right eye. We can't drive the ImGui Save Preset button from a test, but + // we can stage the same end state (a Controller with stereo enabled and an + // in-memory preset whose rightUV differs from a mirror of left) and verify + // it round-trips both eyes through SaveSettings → LoadSettings. + Controller src; + src.SetStereoEnabled(true); + + // Stage a preset with an explicit asymmetric right_uv that doesn't equal + // MirrorUVHorizontal(uv) — otherwise we couldn't distinguish "right_uv was + // preserved" from "default fallback mirrored it back". + const UVRegion leftUV{ 0.10f, 0.0f, 0.40f, 1.0f }; + const UVRegion rightUV{ 0.55f, 0.0f, 0.35f, 1.0f }; + + // Build the preset object via direct mutation rather than a single nested + // initializer list. 
MSVC's C++23 module-aware parse cannot disambiguate + // `json::array({ json{ {k,v}, {k,v} } })` from the + // `initializer_list>` overload of `json`'s + // constructor, producing a misleading C3329 at the inner closing `)`. + // Building element-by-element sidesteps the ambiguity entirely. + json preset = json::object(); + preset["name"] = "Asymmetric"; + preset["uv"] = json::array({ leftUV.x, leftUV.y, leftUV.w, leftUV.h }); + preset["right_uv"] = json::array({ rightUV.x, rightUV.y, rightUV.w, rightUV.h }); + + json staged = { + { "CropX", leftUV.x }, + { "CropY", leftUV.y }, + { "CropW", leftUV.w }, + { "CropH", leftUV.h }, + { "CropRightX", rightUV.x }, + { "CropRightY", rightUV.y }, + { "CropRightW", rightUV.w }, + { "CropRightH", rightUV.h }, + { "CropPresets", json::array({ preset }) }, + { "SelectedPresetIndex", 0 } + }; + src.LoadSettings(staged); + + json saved; + src.SaveSettings(saved); + REQUIRE(saved["CropPresets"].is_array()); + REQUIRE_FALSE(saved["CropPresets"].empty()); + for (const auto& entry : saved["CropPresets"]) { + REQUIRE(entry.contains("right_uv")); + } + + // Round-trip into a fresh controller and confirm the asymmetric right UV + // survived. Before the fix the saved preset's right_uv would be missing + // and LoadSettings would mirror left → right, equalizing the eyes. + Controller dst; + dst.SetStereoEnabled(true); + dst.LoadSettings(saved); + REQUIRE(UVApprox(dst.GetRightEyeUV(), rightUV)); +} + +TEST_CASE("MirrorUVHorizontal symmetry via SetStereoEnabled", "[subrect][stereo][math]") +{ + // {0.4, *, 0.6, *} mirrors to {0, *, 0.6, *} — the nose-side overlap case. + // Exercised through SetStereoEnabled since MirrorUVHorizontal is private. 
+ Controller c; + json in = { + { "CropX", 0.4f }, + { "CropY", 0.0f }, + { "CropW", 0.6f }, + { "CropH", 1.0f }, + }; + c.LoadSettings(in); + REQUIRE_FALSE(c.IsStereoEnabled()); + + c.SetStereoEnabled(true); + // x = 1 - 0.4 - 0.6 = 0.0 + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.0f, 0.0f, 0.6f, 1.0f })); +} + +TEST_CASE("SetStereoEnabled preserves explicit right UV loaded earlier", "[subrect][stereo][regression]") +{ + // Regression for the call-order trap: LoadSettings (mono) → SetStereoEnabled(true) + // must NOT overwrite an explicit CropRight* value with the mirror of the left eye. + Controller c; + const UVRegion explicitRight{ 0.62f, 0.10f, 0.30f, 0.80f }; + json in = { + { "CropX", 0.10f }, + { "CropY", 0.00f }, + { "CropW", 0.40f }, + { "CropH", 1.00f }, + { "CropRightX", explicitRight.x }, + { "CropRightY", explicitRight.y }, + { "CropRightW", explicitRight.w }, + { "CropRightH", explicitRight.h }, + }; + c.LoadSettings(in); + REQUIRE_FALSE(c.IsStereoEnabled()); + + c.SetStereoEnabled(true); + REQUIRE(UVApprox(c.GetRightEyeUV(), explicitRight)); +} + +TEST_CASE("Reload without CropRight* re-enables auto-mirror", "[subrect][stereo][regression]") +{ + // The rightUVLoadedFromJson flag must reset every LoadSettings — otherwise + // once a config with CropRight* was loaded, later loads without those + // keys would keep suppressing the mirror. + Controller c; + const UVRegion explicitRight{ 0.62f, 0.10f, 0.30f, 0.80f }; + json withRight = { + { "CropX", 0.10f }, { "CropY", 0.00f }, { "CropW", 0.40f }, { "CropH", 1.00f }, + { "CropRightX", explicitRight.x }, { "CropRightY", explicitRight.y }, + { "CropRightW", explicitRight.w }, { "CropRightH", explicitRight.h } + }; + c.LoadSettings(withRight); + + // Now load a fresh config WITHOUT CropRight*. The flag should reset to + // false so a subsequent SetStereoEnabled(true) auto-mirrors. 
+ json withoutRight = { + { "CropX", 0.20f }, { "CropY", 0.00f }, { "CropW", 0.60f }, { "CropH", 1.00f } + }; + c.LoadSettings(withoutRight); + c.SetStereoEnabled(true); + // Mirror of {0.20, 0, 0.60, 1.0} is {1 - 0.20 - 0.60, *, 0.60, *} = {0.20, *, 0.60, *}. + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.20f, 0.0f, 0.60f, 1.0f })); +} + +TEST_CASE("SaveSettings erases stale CropRight* in mono mode", "[subrect][stereo][regression]") +{ + // When a host re-uses an in-memory JSON object that previously held + // stereo keys, the mono save must clear them or the next load looks + // like it still had explicit stereo data. + json carry = { + { "CropRightX", 0.5f }, { "CropRightY", 0.0f }, + { "CropRightW", 0.5f }, { "CropRightH", 1.0f } + }; + Controller c; + REQUIRE_FALSE(c.IsStereoEnabled()); + c.SaveSettings(carry); + REQUIRE_FALSE(carry.contains("CropRightX")); + REQUIRE_FALSE(carry.contains("CropRightY")); + REQUIRE_FALSE(carry.contains("CropRightW")); + REQUIRE_FALSE(carry.contains("CropRightH")); +} + +TEST_CASE("GetStereoPixelRegions returns empty for degenerate dimensions", "[subrect][stereo][edge]") +{ + // fullWidth/2 == 0 for widths 0 or 1; UVToPixelRegion's `width - 1` + // would underflow into a huge coord without the guard. + Controller c; + c.SetStereoEnabled(true); + for (auto [w, h] : std::vector>{ { 0, 100 }, { 1, 100 }, { 100, 0 } }) { + const auto regions = c.GetStereoPixelRegions(w, h); + REQUIRE(regions.leftEye.w == 0); + REQUIRE(regions.rightEye.w == 0); + } +} + +TEST_CASE("Seeded preset without explicit rightUV auto-mirrors in stereo", "[subrect][stereo][regression]") +{ + // Regression for the silent-full-frame bug: a Preset built with only + // .name + .uv (rightUV omitted, defaulting to std::nullopt) should + // auto-mirror the left eye when stereo is enabled, NOT show full frame + // for the right eye. 
+ Controller c; + c.SetStereoEnabled(true); + c.SeedDefaultPresets({ + Preset{ .name = "Left Half", .uv = { 0.0f, 0.0f, 0.5f, 1.0f } }, + }); + c.LoadSettings(json::object()); + + // Mirror of {0, 0, 0.5, 1.0} around x=0.5 is {0.5, 0, 0.5, 1.0}. + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.5f, 0.0f, 0.5f, 1.0f })); +} + +TEST_CASE("Partial CropRight* keys still allow auto-mirror", "[subrect][stereo][regression]") +{ + // Only one of the four right-eye keys was provided. The old OR semantics + // would mark this as "explicit right eye" and suppress the mirror on + // SetStereoEnabled(true), leaving currentRightUV with mixed stale + loaded + // components. The fixed AND semantics treat partial as not-explicit so + // the mirror still runs. + Controller c; + json partial = { + { "CropX", 0.20f }, { "CropY", 0.00f }, { "CropW", 0.60f }, { "CropH", 1.00f }, + { "CropRightW", 0.50f } // a single right-eye key — incomplete quartet + }; + c.LoadSettings(partial); + c.SetStereoEnabled(true); + // Mirror of {0.20, 0, 0.60, 1.0} is {0.20, 0, 0.60, 1.0} — confirms the + // mirror ran rather than landing the half-loaded right UV. + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.20f, 0.0f, 0.60f, 1.0f })); +} + +TEST_CASE("Malformed preset right_uv falls back to auto-mirror", "[subrect][stereo][regression]") +{ + // LoadUVArray returns a default full-frame UV on malformed input. Without + // shape validation, a bad `right_uv` payload would land the right eye as + // full-frame AND suppress the auto-mirror — the worst of both worlds. + // With validation, malformed input is ignored and the mirror takes over. + Controller c; + c.SetStereoEnabled(true); + + // Same C++23-modules-friendly construction as above (see the regression + // test for the rationale). 
+ json badPreset = json::object(); + badPreset["name"] = "Bad"; + badPreset["uv"] = json::array({ 0.10f, 0.0f, 0.40f, 1.0f }); + badPreset["right_uv"] = "not an array"; // malformed + + json bad = { + { "CropPresets", json::array({ badPreset }) }, + { "SelectedPresetIndex", 0 } + }; + c.LoadSettings(bad); + // Auto-mirror of {0.10, 0, 0.40, 1.0} = {0.50, 0, 0.40, 1.0}, NOT {0,0,1,1}. + REQUIRE(UVApprox(c.GetRightEyeUV(), { 0.50f, 0.0f, 0.40f, 1.0f })); +} diff --git a/tools/feature_version_audit.py b/tools/feature_version_audit.py index 3c3f50f74a..a5df777b37 100644 --- a/tools/feature_version_audit.py +++ b/tools/feature_version_audit.py @@ -787,7 +787,7 @@ def get_feature_key(feature_dir, feature_meta_map): commit_link = "" if bump_commit: author_str = f" ({bump_author})" if bump_author else "" - commit_link = f"[link](https://github.com/doodlum/skyrim-community-shaders/commit/{bump_commit}){author_str}" + commit_link = f"[link](https://github.com/community-shaders/skyrim-community-shaders/commit/{bump_commit}){author_str}" def bold(val): return f"**{val}**" if is_attention and val != '' and val != '-' else val @@ -875,7 +875,7 @@ def boldmeta(val, missing=missing): nexus_link = f"[Nexus]({meta['mod_link']})" if meta and meta['mod_link'] else ("**Missing metadata**" if not meta else "") author = get_commit_author(commit) if commit else None author_str = f" ({author})" if author else "" - commit_link = f"[link](https://github.com/doodlum/skyrim-community-shaders/commit/{commit}){author_str}" if commit else "" + commit_link = f"[link](https://github.com/community-shaders/skyrim-community-shaders/commit/{commit}){author_str}" if commit else "" lines.append(f"| {boldmeta(name)} | {boldmeta(ver)} | {nexus_link} | {commit_link} |") return lines diff --git a/vcpkg.json b/vcpkg.json index 1d6bd1ab78..1c5a5b57b2 100644 --- a/vcpkg.json +++ b/vcpkg.json @@ -17,6 +17,7 @@ "directxtex", "eastl", "efsw", + "exprtk", { "name": "imgui", "features": ["dx11-binding", 
"win32-binding", "docking-experimental"]