perf: reduce overhead of hot functions #1290
Automated formatting by clang-format, prettier, and other hooks. See https://pre-commit.ci for details.
Walkthrough
This update centralizes and refactors shader permutation and feature descriptor tracking by consolidating related flags and variables into unified bitfields within a …
Changes
Sequence Diagram(s)
sequenceDiagram
participant GameLoop
participant State
participant Deferred
participant Shader
participant TruePBR
GameLoop->>State: Draw()
State->>State: Compare permutationData with permutationDataPrevious
alt Data changed
State->>Shader: Update permutation constant buffer
end
State->>Deferred: CopyShadowData() if utility shader
State->>State: Reset updateShader
GameLoop->>State: DrawDebug()
State->>State: Setup shader resources
State->>State: Update permutation buffer if needed
State->>State: Debug()
State->>Deferred: CopyShadowData() if utility shader
State->>State: Reset updateShader
State->>State: Debug()
State->>State: Smooth frame/draw call counters
State->>State: Track timings and draw call counts
State->>State: Emit performance events if enabled
Estimated code review effort: 4 (~90 minutes)
Actionable comments posted: 3
🧹 Nitpick comments (4)
src/State.h (1)
84-85: Add documentation for new public methods.
These new methods lack documentation explaining their purpose. Based on the implementation, `DrawDebug()` performs shader updates with debug instrumentation, while `Debug()` handles frame timing and performance metrics.
Consider adding documentation:
+    // Performs shader updates with debug instrumentation enabled
     void DrawDebug();
+    // Updates frame timing metrics and draw call statistics
     void Debug();
src/Deferred.cpp (3)
769-777: Consider exception safety for bit clearing.
The current implementation clears the bit after the function call, but if `func(a1)` throws an exception, the bit won't be cleared. Also, maintaining both `state->inWorld` and the bit in `ExtraShaderDescriptor` seems redundant.
Consider using RAII or consolidating the state:
 void Deferred::Hooks::Main_RenderWorld::thunk(bool a1)
 {
     auto* const state = globals::state;
-    state->permutationData.ExtraShaderDescriptor |= (uint32_t)State::ExtraShaderDescriptors::InWorld;
-    state->inWorld = true;
-    func(a1);
-    state->inWorld = false;
-    state->permutationData.ExtraShaderDescriptor &= ~(uint32_t)State::ExtraShaderDescriptors::InWorld;
+    struct InWorldGuard {
+        State* state;
+        InWorldGuard(State* s) : state(s) {
+            state->permutationData.ExtraShaderDescriptor |= (uint32_t)State::ExtraShaderDescriptors::InWorld;
+            state->inWorld = true;
+        }
+        ~InWorldGuard() {
+            state->inWorld = false;
+            state->permutationData.ExtraShaderDescriptor &= ~(uint32_t)State::ExtraShaderDescriptors::InWorld;
+        }
+    } guard(state);
+    func(a1);
 };
809-818: Apply consistent exception safety pattern.
This hook has the same exception safety issue as `Main_RenderWorld`. If the pattern suggested there is adopted, apply it here too for consistency.
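As a self-contained illustration of the RAII approach suggested for these hooks, the sketch below sets a bit on construction and clears it on destruction, so the bit is cleared even when the wrapped call throws. `ScopedFlag`, `RunInWorld`, and the `InWorld = 1 << 0` value are hypothetical names and values chosen for this example; they are not taken from the project's code.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the real flag enum; only the InWorld name
// appears in the review, the bit position is an assumption.
enum class ExtraShaderDescriptors : uint32_t { InWorld = 1u << 0 };

// Sets a bit on construction and clears it on destruction, so the bit is
// cleared even if the guarded call exits via an exception.
class ScopedFlag
{
public:
    ScopedFlag(uint32_t& bits, ExtraShaderDescriptors flag) :
        bits_(bits), mask_(static_cast<uint32_t>(flag))
    {
        assert(mask_ != 0);
        bits_ |= mask_;
    }
    ~ScopedFlag() { bits_ &= ~mask_; }
    ScopedFlag(const ScopedFlag&) = delete;
    ScopedFlag& operator=(const ScopedFlag&) = delete;

private:
    uint32_t& bits_;
    uint32_t mask_;
};

// Runs fn with the InWorld bit set. Returns true if fn threw a std::exception.
template <class Fn>
bool RunInWorld(uint32_t& descriptor, Fn&& fn)
{
    try {
        ScopedFlag guard(descriptor, ExtraShaderDescriptors::InWorld);
        fn();
    } catch (const std::exception&) {
        return true;  // the guard's destructor has already cleared the bit
    }
    return false;
}
```

Because the cleanup lives in a destructor, the same guard type could wrap any hook that toggles a single descriptor bit around a call.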
828-841: Document the purpose and buffer slot assignments.
This new hook sets specific constant buffer slots but lacks documentation about why these specific slots (4-6 for PS, 5-6 for CS) are used.
Add a comment explaining the hook's purpose:
 void Deferred::Hooks::Renderer_ResetState::thunk(void* This)
 {
     func(This);
+    // Reset shader constant buffers after renderer state reset
+    // PS slots: 4=permutation, 5=sharedData, 6=featureData
+    // CS slots: 5=sharedData, 6=featureData
     auto* const state = globals::state;
     auto* const context = globals::d3d::context;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package/Shaders/Lighting.hlsl is excluded by !**/*.hlsl
📒 Files selected for processing (10)
- package/Shaders/Common/Permutation.hlsli (1 hunks)
- src/Deferred.cpp (2 hunks)
- src/Deferred.h (2 hunks)
- src/Features/ExtendedTranslucency.cpp (2 hunks)
- src/Features/SubsurfaceScattering.cpp (1 hunks)
- src/Features/TerrainHelper.cpp (2 hunks)
- src/Hooks.cpp (3 hunks)
- src/State.cpp (2 hunks)
- src/State.h (4 hunks)
- src/TruePBR.cpp (1 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-05T05:20:45.823Z
Learning: In the skyrim-community-shaders repository, file deletion error handling improvements that replace existence checks and try-catch blocks with std::filesystem::remove error-code-based approaches are considered bug fixes rather than refactoring, as they address inadequate error handling and misleading log messages.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-06-24T07:17:36.604Z
Learning: When reviewing PRs, always clarify the scope if there are multiple related features or dependencies. WeatherPicker was a separate PR that was already merged, while this PR focuses specifically on WetnessEffects climate preset system enhancements.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#577
File: src/Features/WetnessEffects.h:36-36
Timestamp: 2025-06-08T11:25:14.536Z
Learning: In the skyrim-community-shaders project, boolean flags in C++ structs that interface with HLSL shaders use `uint` type instead of `bool` for compatibility reasons. This ensures consistent 4-byte size, proper 16-byte alignment in constant buffers, and cross-platform compatibility when passing data between C++ and HLSL shaders.
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#577
File: features/Wetness Effects/Shaders/WetnessEffects/WetnessEffects.hlsli:57-61
Timestamp: 2025-06-17T05:40:22.785Z
Learning: Default parameter values are supported in the HLSL compiler used by the skyrim-community-shaders project, contrary to standard HLSL (FXC/DXC) limitations.
src/Features/ExtendedTranslucency.cpp (3)
Learnt from: alandtse
PR: #1157
File: src/Feature.cpp:42-49
Timestamp: 2025-06-17T09:27:49.594Z
Learning: In src/Feature.cpp, when an obsolete feature's INI file is deleted, the feature should be silently disabled without surfacing any issues to the user. This is the intended behavior because a deleted INI file for an obsolete feature indicates that the user has properly cleaned up the obsolete feature.
Learnt from: jiayev
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-18T15:21:03.641Z
Learning: In the skyrim-community-shaders rendering pipeline, materials with alpha < 1 or alpha blending enabled are rendered in non-deferred mode rather than deferred mode. This means issues with dynamic cubemaps on transparent materials are actually non-deferred rendering issues.
package/Shaders/Common/Permutation.hlsli (3)
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-01T18:01:07.079Z
Learning: In the skyrim-community-shaders project, simple scalar constants in HLSL shaders use #define (e.g., #define NTHREADS 128), while more complex constants use static const within namespaces (e.g., Math namespace in Math.hlsli). For epsilon standardization, #define is the appropriate choice since epsilon values are simple scalar constants.
src/Features/TerrainHelper.cpp (5)
Learnt from: davo0411
PR: #1070
File: src/State.cpp:79-83
Timestamp: 2025-05-30T11:44:15.542Z
Learning: In the Skyrim Community Shaders project, the smoothDrawCalls array in the State class is declared as type double in src/State.h, which is the correct floating-point type for performing exponential moving average smoothing calculations.
🧬 Code Graph Analysis (1)
src/State.h (1)
src/State.cpp (4)
DrawDebug (58-97), DrawDebug (58-58), Debug (99-149), Debug (99-99)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build plugin and addons
- GitHub Check: Validate shader compilation (VR, .github/configs/shader-validation-vr.yaml)
- GitHub Check: Validate shader compilation (Flatrim, .github/configs/shader-validation.yaml)
🔇 Additional comments (16)
package/Shaders/Common/Permutation.hlsli (1)
63-63: All `IsTree` references updated; no `IsDecal` usages remain
Search across C++ and shader files confirms:
- src/State.h defines `IsTree = 1 << 4`
- src/Hooks.cpp clears/sets `State::ExtraShaderDescriptors::IsTree` consistently
- No hard-coded `1 << 5` related to tree detection
- No occurrences of `IsDecal` in any source or shader files
src/Features/ExtendedTranslucency.cpp (1)
24-24: Consistent refactoring to use centralized permutationData structure.
The changes correctly update all references from `currentExtraFeatureDescriptor` to `permutationData.ExtraFeatureDescriptor`, maintaining the same bitfield manipulation logic while centralizing the state management.
Also applies to: 41-41, 49-49
src/Features/SubsurfaceScattering.cpp (1)
372-376: Good improvement: Explicit flag clearing prevents stale state.
The change to explicitly clear the `IsBeastRace` flag when the condition is false is a defensive programming improvement. This ensures the flag state is always correct and prevents potential issues from stale flag values persisting across render passes.
src/Deferred.h (2)
131-141: New hooks support permutation state management.
The addition of `Main_RenderFirstPersonView` and `Renderer_ResetState` hooks aligns with the PR's optimization objectives by providing better control over shader permutation state during different rendering phases.
153-155: Appropriate VR compatibility check.
Good practice to conditionally install the `Main_RenderFirstPersonView` hook only for non-VR builds, preventing potential compatibility issues.
src/Features/TerrainHelper.cpp (2)
124-158: Excellent BitScanForward optimization for batching shader resources!
This implementation efficiently reduces D3D11 API calls by:
- Using `_BitScanForward` to find contiguous set bits in the modification mask
- Batching consecutive texture updates into single `PSSetShaderResources` calls
- Early exit when no textures are modified
This optimization directly addresses the PR's performance goals and should provide measurable improvements in rendering efficiency.
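The batching idea described above can be sketched independently of D3D11. The example below is a minimal, portable illustration that splits a dirty-slot bitmask into runs of consecutive set bits; each run would map to one batched bind call. `LowestSetBit` and `ContiguousRuns` are hypothetical helpers written for this sketch (the PR itself uses the MSVC `_BitScanForward` intrinsic), and the actual bind call is omitted.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Portable stand-in for a bit-scan intrinsic: index of the lowest set bit.
// The caller must pass a non-zero mask.
inline uint32_t LowestSetBit(uint32_t mask)
{
    uint32_t index = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++index;
    }
    return index;
}

// Splits a dirty-slot bitmask into (firstSlot, count) runs of consecutive
// set bits. An empty mask yields no runs, which is the early-exit case.
std::vector<std::pair<uint32_t, uint32_t>> ContiguousRuns(uint32_t mask)
{
    std::vector<std::pair<uint32_t, uint32_t>> runs;
    while (mask != 0) {
        const uint32_t start = LowestSetBit(mask);
        uint32_t count = 0;
        // Extend the run while the next bit is set, clearing bits as they
        // are consumed so the outer loop terminates.
        while (start + count < 32 && (mask & (1u << (start + count))) != 0) {
            mask &= ~(1u << (start + count));
            ++count;
        }
        runs.emplace_back(start, count);
    }
    return runs;
}
```

For a mask with slots 0, 2, 3, 5 and 6 dirty, this yields three runs instead of five individual updates.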
190-193: Consistent migration to centralized permutationData.
The changes correctly update the extra feature descriptor manipulation to use the centralized `permutationData.ExtraFeatureDescriptor`, maintaining consistency with the refactoring across other files.
src/Hooks.cpp (3)
127-129: LGTM: Centralized shader descriptor storage improves performance.
The consolidation of vertex and pixel shader descriptors into the `permutationData` structure aligns well with the performance optimization goals. The masking operation correctly filters out shader-specific bits from the pixel descriptor, which should reduce unnecessary processing overhead.
Also applies to: 135-136
177-177: LGTM: Proper centralization of extra shader descriptor flags.
The migration from standalone `currentExtraDescriptor` to the centralized `permutationData.ExtraShaderDescriptor` is well-implemented. The bitwise OR operation correctly sets the `EffectShadows` flag while preserving existing bits, and the logic condition remains unchanged.
193-193: LGTM: Correct bitfield implementation for tree flag management.
The conversion from a boolean `isTree` flag to bitwise operations on the centralized `permutationData.ExtraShaderDescriptor` is well-implemented. The clear-first-then-conditionally-set pattern ensures proper state management, and the tree object detection logic remains unchanged. This change effectively contributes to the performance optimization by reducing separate flag tracking.
Also applies to: 198-198
src/TruePBR.cpp (1)
src/TruePBR.cpp (1)
1591-1633: Excellent performance optimization using bit scanning and batching.
The new implementation represents a significant performance improvement over the previous individual slot approach:
- Efficient bit scanning: `_BitScanForward` efficiently finds the first dirty texture slot
- Smart batching: Consecutive dirty slots are batched into single `PSSetShaderResources` calls, reducing D3D11 API overhead
- Early exit optimization: The `[[likely]]` attribute and early return for empty masks optimizes the common case
- Correct bit manipulation: The consecutive bit counting and mask clearing logic is implemented correctly
This aligns perfectly with the PR's performance optimization goals by eliminating unnecessary API calls when setting shader resources.
src/State.h (2)
214-216: LGTM! Good refactoring to centralize permutation state.
Consolidating the shader descriptors into `PermutationCB` instances improves code organization and makes state comparisons more efficient.
152-159: All `IsDecal` references removed
Grep searches across C++ and HLSL/HLSLI files returned no matches for `IsDecal`. No remaining references were found, so this breaking change is fully applied and no further updates are needed.
src/State.cpp (3)
41-44: LGTM! Efficient permutation buffer update logic.
The use of the equality operator to compare the entire `PermutationCB` struct is more efficient and cleaner than checking individual fields.
99-149: Well-implemented performance tracking system.
The frame timing implementation using QueryPerformanceCounter and exponential moving average smoothing (95/5 split) is well designed. Good thread safety with mutex locking.
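To make the 95/5 smoothing concrete, here is a minimal sketch of an exponential moving average step. `SmoothSample` and its `keep` parameter are hypothetical names for this example; the timing source and mutex locking mentioned in the review are left out.

```cpp
#include <cassert>

// One exponential-moving-average step: keep 95% of the previous smoothed
// value and blend in 5% of the new sample (for the default keep = 0.95).
inline double SmoothSample(double smoothed, double sample, double keep = 0.95)
{
    assert(keep >= 0.0 && keep <= 1.0);
    return smoothed * keep + sample * (1.0 - keep);
}
```

With a previous value of 100 and a new sample of 200, the smoothed value moves to about 105, so a single spike shifts the displayed figure only slightly.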
163-163: Good improvement to force permutation buffer update.
Using `memset` with 0xFF to invalidate the cached data is more reliable than a boolean flag, as it guarantees the comparison will fail and trigger an update.
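The compare-then-upload logic and the 0xFF invalidation can be sketched together without any graphics API. In the example below, the `PermutationCB` layout (four 32-bit words), `PermutationTracker`, `Flush`, and `Invalidate` are assumptions made for illustration; the real struct and upload path may differ. The invalidation relies on live data never having every bit set in all fields, which is assumed here rather than verified.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical layout: four 32-bit words, 16 bytes, no padding. Field names
// follow the descriptors mentioned in the review.
struct PermutationCB
{
    uint32_t VertexShaderDescriptor;
    uint32_t PixelShaderDescriptor;
    uint32_t ExtraShaderDescriptor;
    uint32_t ExtraFeatureDescriptor;

    // Whole-struct comparison; safe as a byte compare because there is no padding.
    friend bool operator==(const PermutationCB& a, const PermutationCB& b)
    {
        return std::memcmp(&a, &b, sizeof(PermutationCB)) == 0;
    }
};
static_assert(sizeof(PermutationCB) == 16, "layout assumed to be padding-free");

struct PermutationTracker
{
    PermutationCB current{};
    PermutationCB previous{};
    int uploads = 0;  // counts simulated constant-buffer uploads

    // Uploads only when the data differs from what was last uploaded.
    void Flush()
    {
        if (!(current == previous)) {
            ++uploads;  // a real implementation would update the GPU buffer here
            previous = current;
        }
    }

    // Forces the next Flush() to upload by filling the cached copy with 0xFF bytes.
    void Invalidate() { std::memset(&previous, 0xFF, sizeof(previous)); }
};
```

Repeated `Flush()` calls with unchanged data cost one comparison and no upload, which is the behaviour the review comments describe.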
✅ A pre-release build is available for this PR.
Extreme optimisations for SetDirtyStates, so that having that hook enabled adds effectively no overhead.
Optimisations for Terrain Helper and True PBR PSSetShaderResources.
Constant buffers are set separately instead of being checked on every object render.
Restored permutation buffer masking to reduce number of uploads.
Permutation buffer optimised for best vectorisation.
Bitmasks are no longer reset to 0 each time; instead, individual bits are enabled and disabled by the associated code, so an upload is triggered only when actually needed.
RTTI optimisations.
Optimised GetCurrentAccumulator by accessing the address directly.
Optimised SetupPointLights.