perf: reduce overhead of hot functions #1290

Merged
doodlum merged 53 commits into dev from messy-opts on Jul 23, 2025

Collaborator

@doodlum doodlum commented Jul 21, 2025

Extreme optimisations for SetDirtyStates to effectively remove all overhead from having that hook on.

Optimisations for Terrain Helper and True PBR PSSetShaderResources.

Constant buffers are set separately instead of being checked every object render.

Restored permutation buffer masking to reduce number of uploads.

Permutation buffer optimised for best vectorisation.

Bitmasks are no longer reset to 0 each time; instead, the associated code enables and disables the relevant bits, so an upload is triggered only when actually needed.
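The upload-only-when-needed behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `PermutationTracker` class, the `upload` callback (standing in for the constant buffer update), and the exact field names and layout of `PermutationData` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>

// Hypothetical layout: four 32-bit words (16 bytes), so the change check is one fixed-size memcmp.
struct PermutationData
{
    uint32_t VertexShaderDescriptor = 0;
    uint32_t PixelShaderDescriptor = 0;
    uint32_t ExtraShaderDescriptor = 0;
    uint32_t ExtraFeatureDescriptor = 0;
};
static_assert(sizeof(PermutationData) == 16, "no padding expected");

// Hypothetical helper: callers toggle bits in `current`; Flush() uploads only on change.
class PermutationTracker
{
public:
    explicit PermutationTracker(std::function<void(const PermutationData&)> upload) :
        upload_(std::move(upload)) {}

    PermutationData current;

    // Returns true if an upload was issued, false if the data matched the last upload.
    bool Flush()
    {
        if (std::memcmp(&current, &previous_, sizeof(PermutationData)) == 0)
            return false;
        upload_(current);
        previous_ = current;
        return true;
    }

private:
    PermutationData previous_;  // starts zeroed, so an all-zero `current` is treated as already uploaded
    std::function<void(const PermutationData&)> upload_;
};
```

Because bits are set and cleared in place rather than rebuilt from zero, two consecutive calls with the same flags compare equal and skip the upload.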

RTTI optimisations.

Optimised GetCurrentAccumulator by accessing the address directly.

Optimised SetupPointLights.

Summary by CodeRabbit

  • New Features

    • Added new debugging and performance tracking methods to improve insight into rendering performance.
    • Introduced new hooks for enhanced rendering state management.
  • Bug Fixes

    • Improved accuracy and consistency in shader descriptor and feature flag handling across rendering features.
  • Refactor

    • Centralized and streamlined shader permutation and descriptor tracking for better maintainability.
    • Optimized shader resource updates by batching consecutive texture updates, reducing API calls and improving performance.
    • Removed unused or redundant flags and variables for a cleaner codebase.
    • Simplified and removed legacy hooks and related code for light limit fixes and frame buffer setup.
    • Updated shader accumulator access to use global pointers for improved consistency.
  • Chores

    • Updated internal flag layouts and enums to reflect a refined descriptor bit structure.
    • Added global RTTI pointers and shader accumulator relocation for improved type safety and access consistency.

Contributor

coderabbitai Bot commented Jul 21, 2025

"""

Walkthrough

This update centralizes and refactors shader permutation and feature descriptor tracking by consolidating related flags and variables into unified bitfields within a permutationData structure. It removes obsolete flags, restructures hooks, optimizes shader resource setting with batching, introduces debugging and timing methods, and streamlines state reset and update logic.

Changes

| File(s) | Change Summary |
|---|---|
| package/Shaders/Common/Permutation.hlsli, src/State.h | Removed IsDecal flag, shifted IsTree bit position, updated enums and struct for permutation descriptors. |
| src/Deferred.cpp, src/Deferred.h | Replaced boolean flags with bitfields, added new hooks, updated installation logic, removed obsolete members. |
| src/Features/ExtendedTranslucency.cpp, src/Features/TerrainHelper.cpp | Updated feature descriptor handling to use nested permutationData bitfields instead of direct state fields. |
| src/Features/SubsurfaceScattering.cpp, src/Hooks.cpp | Changed extra descriptor flag manipulation to use permutationData.ExtraShaderDescriptor with explicit clearing. |
| src/State.cpp, src/State.h | Refactored draw and debug logic, added new methods, removed unused variables, improved permutation buffer update. |
| src/TruePBR.cpp, src/Features/TerrainHelper.cpp | Optimized pixel shader resource setting by batching consecutive slots, removed per-frame buffer setup hook. |
| src/Features/LightLimitFix.cpp, src/Features/LightLimitFix.h | Simplified BSLightingShader_SetupGeometry_GeometrySetupConstantPointLights signature, removed related hook. |
| src/Features/ScreenSpaceShadows.cpp, src/Features/TerrainShadows.cpp | Replaced static method calls with global pointer dereference for current shader accumulator retrieval. |
| src/Globals.cpp, src/Globals.h | Added global relocation pointer for current shader accumulator and RTTI pointers. |
| src/Menu.cpp | Updated current shader accumulator retrieval to use global pointer instead of static method call. |
| src/XSEPlugin.cpp | Updated VR address library version check from 0.181.0 to 0.182.0. |

Sequence Diagram(s)

sequenceDiagram
    participant GameLoop
    participant State
    participant Deferred
    participant Shader
    participant TruePBR

    GameLoop->>State: Draw()
    State->>State: Compare permutationData with permutationDataPrevious
    alt Data changed
        State->>Shader: Update permutation constant buffer
    end
    State->>Deferred: CopyShadowData() if utility shader
    State->>State: Reset updateShader

    GameLoop->>State: DrawDebug()
    State->>State: Setup shader resources
    State->>State: Update permutation buffer if needed
    State->>State: Debug()
    State->>Deferred: CopyShadowData() if utility shader
    State->>State: Reset updateShader

    State->>State: Debug()
    State->>State: Smooth frame/draw call counters
    State->>State: Track timings and draw call counts
    State->>State: Emit performance events if enabled

Estimated code review effort

4 (~90 minutes)

Suggested reviewers

  • davo0411
  • alandtse

"""


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22341ba and 5bd236f.

📒 Files selected for processing (1)
  • src/Hooks.cpp (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Hooks.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Validate shader compilation (VR, .github/configs/shader-validation-vr.yaml)
  • GitHub Check: Validate shader compilation (Flatrim, .github/configs/shader-validation.yaml)
  • GitHub Check: Build plugin and addons

doodlum and others added 2 commits July 21, 2025 04:36
Automated formatting by clang-format, prettier, and other hooks.
See https://pre-commit.ci for details.

github-actions Bot commented Jul 21, 2025

Using provided base ref: 08384fc
Using base ref: 08384fc
Base commit date: 2025-07-21T16:09:46+01:00 (Monday, July 21, 2025 04:09 PM)
No actionable suggestions for changed features.

@doodlum doodlum marked this pull request as ready for review July 21, 2025 16:12
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
src/State.h (1)

84-85: Add documentation for new public methods.

These new methods lack documentation explaining their purpose. Based on the implementation, DrawDebug() performs shader updates with debug instrumentation, while Debug() handles frame timing and performance metrics.

Consider adding documentation:

+	// Performs shader updates with debug instrumentation enabled
 	void DrawDebug();
+	// Updates frame timing metrics and draw call statistics
 	void Debug();
src/Deferred.cpp (3)

769-777: Consider exception safety for bit clearing.

The current implementation clears the bit after the function call, but if func(a1) throws an exception, the bit won't be cleared. Also, maintaining both state->inWorld and the bit in ExtraShaderDescriptor seems redundant.

Consider using RAII or consolidating the state:

 void Deferred::Hooks::Main_RenderWorld::thunk(bool a1)
 {
 	auto* const state = globals::state;
-	state->permutationData.ExtraShaderDescriptor |= (uint32_t)State::ExtraShaderDescriptors::InWorld;
-	state->inWorld = true;
-	func(a1);
-	state->inWorld = false;
-	state->permutationData.ExtraShaderDescriptor &= ~(uint32_t)State::ExtraShaderDescriptors::InWorld;
+	struct InWorldGuard {
+		State* state;
+		InWorldGuard(State* s) : state(s) {
+			state->permutationData.ExtraShaderDescriptor |= (uint32_t)State::ExtraShaderDescriptors::InWorld;
+			state->inWorld = true;
+		}
+		~InWorldGuard() {
+			state->inWorld = false;
+			state->permutationData.ExtraShaderDescriptor &= ~(uint32_t)State::ExtraShaderDescriptors::InWorld;
+		}
+	} guard(state);
+	func(a1);
 };
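To make the exception-safety point concrete, here is a small self-contained sketch of the same guard idea. The `ScopedFlag` helper, the `kInWorld` bit value, and the throwing stand-in function are all hypothetical; a plain `uint32_t` stands in for `permutationData.ExtraShaderDescriptor`. Whether exceptions can actually propagate through the hooked function is not established in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: sets `bit` in `flags` on construction and clears it on destruction.
class ScopedFlag
{
public:
    ScopedFlag(uint32_t& flags, uint32_t bit) : flags_(flags), bit_(bit) { flags_ |= bit_; }
    ~ScopedFlag() { flags_ &= ~bit_; }
    ScopedFlag(const ScopedFlag&) = delete;
    ScopedFlag& operator=(const ScopedFlag&) = delete;

private:
    uint32_t& flags_;
    uint32_t bit_;
};

constexpr uint32_t kInWorld = 1u << 0;  // illustrative bit position, not the project's actual value

// Stand-in for the hooked function; throws to simulate a failure while the flag is set.
inline void RenderThatThrows(const uint32_t& flags)
{
    assert((flags & kInWorld) != 0);
    throw std::runtime_error("simulated failure");
}

// Returns the flag word after the guarded scope has been left via an exception.
inline uint32_t RunGuardedAndCatch()
{
    uint32_t flags = 0;
    try {
        ScopedFlag guard(flags, kInWorld);
        RenderThatThrows(flags);
    } catch (const std::runtime_error&) {
        // The guard's destructor has already run during stack unwinding.
    }
    return flags;
}
```

With the set/call/clear sequence written inline, the same exception would skip the clearing statement and leave the bit set.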

809-818: Apply consistent exception safety pattern.

This hook has the same exception safety issue as Main_RenderWorld. If the pattern suggested there is adopted, apply it here too for consistency.


828-841: Document the purpose and buffer slot assignments.

This new hook sets specific constant buffer slots but lacks documentation about why these specific slots (4-6 for PS, 5-6 for CS) are used.

Add a comment explaining the hook's purpose:

 void Deferred::Hooks::Renderer_ResetState::thunk(void* This)
 {
 	func(This);
 
+	// Reset shader constant buffers after renderer state reset
+	// PS slots: 4=permutation, 5=sharedData, 6=featureData
+	// CS slots: 5=sharedData, 6=featureData
 	auto* const state = globals::state;
 	auto* const context = globals::d3d::context;
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08384fc and 92d98f8.

⛔ Files ignored due to path filters (1)
  • package/Shaders/Lighting.hlsl is excluded by !**/*.hlsl
📒 Files selected for processing (10)
  • package/Shaders/Common/Permutation.hlsli (1 hunks)
  • src/Deferred.cpp (2 hunks)
  • src/Deferred.h (2 hunks)
  • src/Features/ExtendedTranslucency.cpp (2 hunks)
  • src/Features/SubsurfaceScattering.cpp (1 hunks)
  • src/Features/TerrainHelper.cpp (2 hunks)
  • src/Hooks.cpp (3 hunks)
  • src/State.cpp (2 hunks)
  • src/State.h (4 hunks)
  • src/TruePBR.cpp (1 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-05T05:20:45.823Z
Learning: In the skyrim-community-shaders repository, file deletion error handling improvements that replace existence checks and try-catch blocks with std::filesystem::remove error-code-based approaches are considered bug fixes rather than refactoring, as they address inadequate error handling and misleading log messages.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-06-24T07:17:36.604Z
Learning: When reviewing PRs, always clarify the scope if there are multiple related features or dependencies. WeatherPicker was a separate PR that was already merged, while this PR focuses specifically on WetnessEffects climate preset system enhancements.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use `uint` type instead of `bool` for compatibility reasons. This ensures consistent 4-byte size, proper 16-byte alignment in constant buffers, and cross-platform compatibility when passing data between C++ and HLSL shaders.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use `uint` type instead of `bool` for compatibility reasons. This ensures consistent size, alignment, and cross-platform compatibility when passing data to shader constant buffers.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#577
File: features/Wetness Effects/Shaders/WetnessEffects/WetnessEffects.hlsli:57-61
Timestamp: 2025-06-17T05:40:22.785Z
Learning: Default parameter values are supported in the HLSL compiler used by the skyrim-community-shaders project, contrary to standard HLSL (FXC/DXC) limitations.
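The `uint`-instead-of-`bool` convention recorded in the learnings above can be illustrated with a short sketch. The struct name and its fields are hypothetical and only show the size and alignment reasoning; they are not taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical settings struct mirrored by an HLSL constant buffer.
// Flags are uint32_t so each one occupies exactly 4 bytes, matching HLSL `uint`.
struct alignas(16) ExampleSettings
{
    uint32_t EnableEffect;  // boolean flag stored as 0 or 1
    uint32_t EnableDebug;   // boolean flag stored as 0 or 1
    float Intensity;
    float Pad0;             // explicit padding so the struct fills a full 16-byte register
};

// Constant buffer data is expected to be a multiple of 16 bytes.
static_assert(sizeof(ExampleSettings) == 16, "struct must fill exactly one 16-byte register");
// With 4-byte flags, the float lands at a predictable offset on both sides.
static_assert(offsetof(ExampleSettings, Intensity) == 8, "flags must occupy 4 bytes each");
```

Had the flags been declared as C++ `bool`, they would typically take one byte each, and the offsets of later members would no longer line up with the shader-side declaration.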
src/Features/ExtendedTranslucency.cpp (3)

Learnt from: alandtse
PR: #1157
File: src/Feature.cpp:42-49
Timestamp: 2025-06-17T09:27:49.594Z
Learning: In src/Feature.cpp, when an obsolete feature's INI file is deleted, the feature should be silently disabled without surfacing any issues to the user. This is the intended behavior because a deleted INI file for an obsolete feature indicates that the user has properly cleaned up the obsolete feature.

Learnt from: jiayev
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-18T15:21:03.641Z
Learning: In the skyrim-community-shaders rendering pipeline, materials with alpha < 1 or alpha blending enabled are rendered in non-deferred mode rather than deferred mode. This means issues with dynamic cubemaps on transparent materials are actually non-deferred rendering issues.

Learnt from: alandtse
PR: #577
File: features/Wetness Effects/Shaders/WetnessEffects/WetnessEffects.hlsli:57-61
Timestamp: 2025-06-17T05:40:22.785Z
Learning: Default parameter values are supported in the HLSL compiler used by the skyrim-community-shaders project, contrary to standard HLSL (FXC/DXC) limitations.

package/Shaders/Common/Permutation.hlsli (3)

Learnt from: alandtse
PR: #577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use uint type instead of bool for compatibility reasons. This ensures consistent 4-byte size, proper 16-byte alignment in constant buffers, and cross-platform compatibility when passing data between C++ and HLSL shaders.

Learnt from: alandtse
PR: #577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use uint type instead of bool for compatibility reasons. This ensures consistent size, alignment, and cross-platform compatibility when passing data to shader constant buffers.

Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-01T18:01:07.079Z
Learning: In the skyrim-community-shaders project, simple scalar constants in HLSL shaders use #define (e.g., #define NTHREADS 128), while more complex constants use static const within namespaces (e.g., Math namespace in Math.hlsli). For epsilon standardization, #define is the appropriate choice since epsilon values are simple scalar constants.

src/Features/TerrainHelper.cpp (5)

Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-05T05:20:45.823Z
Learning: In the skyrim-community-shaders repository, file deletion error handling improvements that replace existence checks and try-catch blocks with std::filesystem::remove error-code-based approaches are considered bug fixes rather than refactoring, as they address inadequate error handling and misleading log messages.

Learnt from: alandtse
PR: #577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use uint type instead of bool for compatibility reasons. This ensures consistent 4-byte size, proper 16-byte alignment in constant buffers, and cross-platform compatibility when passing data between C++ and HLSL shaders.

Learnt from: alandtse
PR: #577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use uint type instead of bool for compatibility reasons. This ensures consistent size, alignment, and cross-platform compatibility when passing data to shader constant buffers.

Learnt from: alandtse
PR: #577
File: features/Wetness Effects/Shaders/WetnessEffects/WetnessEffects.hlsli:57-61
Timestamp: 2025-06-17T05:40:22.785Z
Learning: Default parameter values are supported in the HLSL compiler used by the skyrim-community-shaders project, contrary to standard HLSL (FXC/DXC) limitations.

Learnt from: davo0411
PR: #1070
File: src/State.cpp:79-83
Timestamp: 2025-05-30T11:44:15.542Z
Learning: In the Skyrim Community Shaders project, the smoothDrawCalls array in the State class is declared as type double in src/State.h, which is the correct floating-point type for performing exponential moving average smoothing calculations.
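For readers unfamiliar with the smoothing mentioned in this learning, an exponential moving average over a `double` can be sketched as below. The `SmoothedCounter` type and the default `alpha` of 0.1 are illustrative assumptions, not the project's actual values.

```cpp
#include <cassert>
#include <cmath>

// Exponential moving average: each new sample pulls the stored value toward it by `alpha`.
struct SmoothedCounter
{
    double value = 0.0;
    double alpha = 0.1;  // illustrative smoothing factor, not the project's actual constant

    void Update(double sample)
    {
        value += alpha * (sample - value);
    }
};
```

Keeping `value` as a floating-point type matters here: with an integer type, small per-frame corrections would be truncated away.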

🧬 Code Graph Analysis (1)
src/State.h (1)
src/State.cpp (4)
  • DrawDebug (58-97)
  • DrawDebug (58-58)
  • Debug (99-149)
  • Debug (99-99)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build plugin and addons
  • GitHub Check: Validate shader compilation (VR, .github/configs/shader-validation-vr.yaml)
  • GitHub Check: Validate shader compilation (Flatrim, .github/configs/shader-validation.yaml)
🔇 Additional comments (16)
package/Shaders/Common/Permutation.hlsli (1)

63-63: All IsTree references updated; no IsDecal usages remain

Search across C++ and shader files confirms:

  • src/State.h defines IsTree = 1 << 4
  • src/Hooks.cpp clears/sets State::ExtraShaderDescriptors::IsTree consistently
  • No hard-coded 1 << 5 related to tree detection
  • No occurrences of IsDecal in any source or shader files
src/Features/ExtendedTranslucency.cpp (1)

24-24: Consistent refactoring to use centralized permutationData structure.

The changes correctly update all references from currentExtraFeatureDescriptor to permutationData.ExtraFeatureDescriptor, maintaining the same bitfield manipulation logic while centralizing the state management.

Also applies to: 41-41, 49-49

src/Features/SubsurfaceScattering.cpp (1)

372-376: Good improvement: Explicit flag clearing prevents stale state.

The change to explicitly clear the IsBeastRace flag when the condition is false is a defensive programming improvement. This ensures the flag state is always correct and prevents potential issues from stale flag values persisting across render passes.
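A minimal sketch of the set-or-clear pattern described here, assuming a hypothetical `SetFlag` helper. The `IsTree = 1 << 4` value matches what is quoted earlier in this review; the `IsBeastRace` bit position is purely illustrative.

```cpp
#include <cassert>
#include <cstdint>

// IsTree follows the value quoted in this review; IsBeastRace's position is an assumption.
enum class ExtraFlags : uint32_t
{
    IsBeastRace = 1u << 0,
    IsTree = 1u << 4,
};

// Hypothetical helper: sets or clears `flag` in `descriptor`, leaving all other bits untouched.
inline void SetFlag(uint32_t& descriptor, ExtraFlags flag, bool enabled)
{
    const uint32_t bit = static_cast<uint32_t>(flag);
    if (enabled)
        descriptor |= bit;
    else
        descriptor &= ~bit;
}
```

Since the descriptor is no longer zeroed between passes, the `else` branch is what prevents a flag set for one object from leaking into the next.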

src/Deferred.h (2)

131-141: New hooks support permutation state management.

The addition of Main_RenderFirstPersonView and Renderer_ResetState hooks aligns with the PR's optimization objectives by providing better control over shader permutation state during different rendering phases.


153-155: Appropriate VR compatibility check.

Good practice to conditionally install the Main_RenderFirstPersonView hook only for non-VR builds, preventing potential compatibility issues.

src/Features/TerrainHelper.cpp (2)

124-158: Excellent BitScanForward optimization for batching shader resources!

This implementation efficiently reduces D3D11 API calls by:

  1. Using _BitScanForward to find contiguous set bits in the modification mask
  2. Batching consecutive texture updates into single PSSetShaderResources calls
  3. Early exit when no textures are modified

This optimization directly addresses the PR's performance goals and should provide measurable improvements in rendering efficiency.
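
The batching idea can be sketched without any D3D11 dependency. The names `LowestSetBit` and `CollectDirtyRuns` are hypothetical, and `LowestSetBit` is a portable stand-in for the MSVC `_BitScanForward` intrinsic, not the PR's code. Each returned run would correspond to one `PSSetShaderResources(start, count, ...)` call.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Portable stand-in for _BitScanForward: index of the lowest set bit.
// Precondition: mask != 0.
inline unsigned LowestSetBit(std::uint32_t mask)
{
    unsigned index = 0;
    while ((mask & 1u) == 0u) {
        mask >>= 1;
        ++index;
    }
    return index;
}

// Splits a dirty-slot bitmask into (firstSlot, slotCount) runs of
// consecutive set bits. An empty mask yields no runs (the early-exit case).
inline std::vector<std::pair<unsigned, unsigned>> CollectDirtyRuns(std::uint32_t dirtyMask)
{
    std::vector<std::pair<unsigned, unsigned>> runs;
    while (dirtyMask != 0) {
        const unsigned start = LowestSetBit(dirtyMask);

        // Count how many consecutive bits are set starting at `start`.
        unsigned count = 0;
        while (start + count < 32 && ((dirtyMask >> (start + count)) & 1u) != 0u) {
            ++count;
        }
        runs.push_back({ start, count });

        // Clear the run so the next iteration finds the following one.
        const std::uint32_t runMask =
            (count == 32) ? 0xFFFFFFFFu : (((1u << count) - 1u) << start);
        dirtyMask &= ~runMask;
    }
    return runs;
}
```

For a mask of `0x6D` (slots 0, 2, 3, 5 and 6 dirty) this produces three runs instead of five individual slot updates.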


190-193: Consistent migration to centralized permutationData.

The changes correctly update the extra feature descriptor manipulation to use the centralized permutationData.ExtraFeatureDescriptor, maintaining consistency with the refactoring across other files.

src/Hooks.cpp (3)

127-129: LGTM: Centralized shader descriptor storage improves performance.

The consolidation of vertex and pixel shader descriptors into the permutationData structure aligns well with the performance optimization goals. The masking operation correctly filters out shader-specific bits from the pixel descriptor, which should reduce unnecessary processing overhead.

Also applies to: 135-136


177-177: LGTM: Proper centralization of extra shader descriptor flags.

The migration from standalone currentExtraDescriptor to the centralized permutationData.ExtraShaderDescriptor is well-implemented. The bitwise OR operation correctly sets the EffectShadows flag while preserving existing bits, and the logic condition remains unchanged.


193-193: LGTM: Correct bitfield implementation for tree flag management.

The conversion from a boolean isTree flag to bitwise operations on the centralized permutationData.ExtraShaderDescriptor is well-implemented. The clear-first-then-conditionally-set pattern ensures proper state management, and the tree object detection logic remains unchanged. This change effectively contributes to the performance optimization by reducing separate flag tracking.

Also applies to: 198-198

src/TruePBR.cpp (1)

1591-1633: Excellent performance optimization using bit scanning and batching.

The new implementation represents a significant performance improvement over the previous individual slot approach:

  1. Efficient bit scanning: _BitScanForward efficiently finds the first dirty texture slot
  2. Smart batching: Consecutive dirty slots are batched into single PSSetShaderResources calls, reducing D3D11 API overhead
  3. Early exit optimization: The [[likely]] attribute and the early return for empty masks optimize the common case
  4. Correct bit manipulation: The consecutive bit counting and mask clearing logic is implemented correctly

This aligns perfectly with the PR's performance optimization goals by eliminating unnecessary API calls when setting shader resources.

src/State.h (2)

214-216: LGTM! Good refactoring to centralize permutation state.

Consolidating the shader descriptors into PermutationCB instances improves code organization and makes state comparisons more efficient.


152-159: All IsDecal references removed

Grep searches across C++ and HLSL/HLSLI files returned no matches for IsDecal, so this breaking change is fully applied and no further updates are needed.

src/State.cpp (3)

41-44: LGTM! Efficient permutation buffer update logic.

The use of the equality operator to compare the entire PermutationCB struct is more efficient and cleaner than checking individual fields.
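
A minimal sketch of the compare-then-upload idea follows. `PermutationState` is a hypothetical stand-in for the PR's `PermutationCB`: the two `Extra...` field names appear in the review, while the other field names, the layout, and `UploadIfChanged` are assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for PermutationCB; layout and some names are assumed.
struct PermutationState
{
    std::uint32_t VertexShaderDescriptor = 0;
    std::uint32_t PixelShaderDescriptor = 0;
    std::uint32_t ExtraShaderDescriptor = 0;
    std::uint32_t ExtraFeatureDescriptor = 0;

    bool operator==(const PermutationState& other) const
    {
        return VertexShaderDescriptor == other.VertexShaderDescriptor &&
               PixelShaderDescriptor == other.PixelShaderDescriptor &&
               ExtraShaderDescriptor == other.ExtraShaderDescriptor &&
               ExtraFeatureDescriptor == other.ExtraFeatureDescriptor;
    }
};

// Returns true only when `current` differs from the last uploaded copy.
// In a renderer, the true branch is where the constant buffer update would go.
inline bool UploadIfChanged(const PermutationState& current, PermutationState& uploaded)
{
    if (current == uploaded) {
        return false;  // nothing changed, skip the upload
    }
    uploaded = current;
    return true;
}
```

Keeping the comparison in one operator means a newly added field only has to be handled in one place.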


99-149: Well-implemented performance tracking system.

The frame timing implementation using QueryPerformanceCounter and exponential moving average smoothing (95/5 split) is well designed. Good thread safety with mutex locking.


163-163: Good improvement to force permutation buffer update.

Using memset with 0xFF to invalidate the cached data is more reliable than a boolean flag, as it guarantees the comparison will fail and trigger an update.
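
A small sketch of the invalidation trick, using a hypothetical two-field `CachedDescriptors` struct rather than the PR's actual type. Note the assumption it relies on: the comparison is only certain to fail if the live state is never all 0xFF bytes itself.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical cached copy of the last uploaded descriptors.
struct CachedDescriptors
{
    std::uint32_t Shader;
    std::uint32_t Feature;
};

static_assert(std::is_trivially_copyable<CachedDescriptors>::value,
    "memset/memcmp are only safe on trivially copyable types");

// Fills the cache with 0xFF bytes so it no longer matches any state
// that is not itself all 0xFF.
inline void Invalidate(CachedDescriptors& cache)
{
    std::memset(&cache, 0xFF, sizeof(cache));
}

// Byte-wise comparison; valid here because the struct has no padding.
inline bool NeedsUpload(const CachedDescriptors& current, const CachedDescriptors& cache)
{
    return std::memcmp(&current, &cache, sizeof(cache)) != 0;
}
```

Compared with a separate "force update" boolean, this keeps the upload decision in a single comparison.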

Comment thread src/State.cpp Outdated
Comment thread src/State.cpp Outdated
Comment thread src/State.h
@github-actions

github-actions Bot commented Jul 21, 2025

✅ A pre-release build is available for this PR:
Download

Collaborator

@alandtse alandtse left a comment

Just one nit from me.

Comment thread package/Shaders/Lighting.hlsl Outdated
@doodlum doodlum merged commit 194a4c7 into dev Jul 23, 2025
10 of 11 checks passed
This was referenced Aug 11, 2025
@coderabbitai coderabbitai Bot mentioned this pull request Sep 4, 2025
@coderabbitai coderabbitai Bot mentioned this pull request Dec 28, 2025
@alandtse alandtse deleted the messy-opts branch February 6, 2026 05:20
@coderabbitai coderabbitai Bot mentioned this pull request Feb 13, 2026
3 participants