Skip to content

perf(shadercache): use faster clock#1275

Merged
alandtse merged 4 commits into
devfrom
fast-clock
Jul 17, 2025
Merged

perf(shadercache): use faster clock#1275
alandtse merged 4 commits into
devfrom
fast-clock

Conversation

@doodlum
Copy link
Copy Markdown
Collaborator

@doodlum doodlum commented Jul 16, 2025

Summary by CodeRabbit

  • Refactor

    • Improved timing accuracy for shader compilation tracking and ETA estimation by switching to Windows high-resolution performance counters.
    • Updated internal timing mechanisms for better performance and reliability on Windows systems.
  • Style

    • Minor parameter name adjustment for consistency in time-related methods.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jul 16, 2025

Walkthrough

The changes replace the use of C++'s std::chrono timing mechanisms with Windows-specific high-resolution timing via QueryPerformanceCounter and related APIs in the CompilationSet class. Member variables and time calculations are updated accordingly, and a constructor is added to initialize the new timing variables.

Changes

File(s) Change Summary
src/ShaderCache.cpp Replaced all uses of std::chrono::high_resolution_clock with Windows QueryPerformanceCounter for timing.
src/ShaderCache.h Added timing members (LARGE_INTEGER), a constructor, and removed std::chrono-based members. Minor parameter rename in GetHumanTime.

Poem

In the cache where shaders wait,
A clock once ticked, but now it's fate—
Windows counters count with care,
Measuring time with extra flair.
Goodbye chrono, hello speed,
High-res timing—just what we need!
🕰️✨


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 97e9798 and 577e687.

📒 Files selected for processing (1)
  • src/ShaderCache.cpp (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/ShaderCache.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build plugin and addons
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions
Copy link
Copy Markdown

Using provided base ref: 2c9e3c4
Using base ref: 2c9e3c4
Base commit date: 2025-07-16T17:39:38+01:00 (Wednesday, July 16, 2025 05:39 PM)
No actionable suggestions for changed features.

Automated formatting by clang-format, prettier, and other hooks.
See https://pre-commit.ci for details.
@alandtse alandtse marked this pull request as ready for review July 16, 2025 18:02
Copilot AI review requested due to automatic review settings July 16, 2025 18:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR migrates the ShaderCache timing system from C++ std::chrono to Windows QueryPerformanceCounter API for improved performance and precision in time measurements during shader compilation.

  • Replaces std::chrono::steady_clock with Windows QueryPerformanceCounter API
  • Updates time calculation logic to use high-resolution performance counters
  • Standardizes parameter naming from a_totalms to a_totalMs

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/ShaderCache.h Replaces chrono member variables with LARGE_INTEGER performance counter fields and adds constructor for initialization
src/ShaderCache.cpp Updates all timing calculations to use QueryPerformanceCounter and converts to milliseconds when needed

Comment thread src/ShaderCache.cpp
Comment thread src/ShaderCache.cpp

double CompilationSet::GetEta()
{
double totalMs = static_cast<double>(totalTime.QuadPart) * 1000.0 / frequency.QuadPart;
Copy link

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion from performance counter to milliseconds is duplicated in GetEta() and GetStatsString(). Consider extracting this calculation into a private helper method to reduce code duplication.

Copilot uses AI. Check for mistakes.
@alandtse
Copy link
Copy Markdown
Collaborator

@coderabbitai can you investigate if there are other parts of the code base that would benefit from a similar change in the timer and if so, create a new issue describing the refactor work, leave it unassigned, but tag as a good first issue and help wanted?

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jul 16, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@alandtse I'll investigate the codebase for other timing usage that could benefit from similar performance improvements. Let me search for std::chrono usage patterns.


🏁 Script executed:

#!/bin/bash
# Search for std::chrono timing usage in the codebase
echo "=== Searching for std::chrono high_resolution_clock usage ==="
rg "high_resolution_clock" -A 3 -B 3

echo -e "\n=== Searching for std::chrono steady_clock usage ==="
rg "steady_clock" -A 3 -B 3

echo -e "\n=== Searching for std::chrono system_clock usage ==="
rg "system_clock" -A 3 -B 3

echo -e "\n=== Searching for duration_cast usage ==="
rg "duration_cast" -A 2 -B 2

echo -e "\n=== Searching for std::chrono namespace usage ==="
rg "using namespace std::chrono" -A 5 -B 2

echo -e "\n=== Searching for performance-related timing patterns ==="
rg "(elapsed|duration|timer|timing|benchmark|perf)" -i --type cpp -A 2 -B 2

Length of output: 165280


Based on my analysis of the codebase, I found several areas that could benefit from similar high-resolution timing optimizations. Let me create a GitHub issue to document this refactor opportunity.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for timer performance optimization
gh issue create \
  --title "perf: optimize std::chrono timing to QueryPerformanceCounter in performance-critical paths" \
  --body "## Overview
This issue tracks opportunities to replace std::chrono timing mechanisms with Windows QueryPerformanceCounter for better performance in timing-critical code paths, following the pattern established in PR #1275.

## Background
PR #1275 successfully replaced std::chrono::high_resolution_clock with QueryPerformanceCounter in ShaderCache.cpp for shader compilation timing, demonstrating improved performance characteristics.

## Identified Opportunities

### Primary Target: State.cpp Frame Timing (High Priority)
**Location**: \`src/State.cpp\` and \`src/State.h\`
**Current Implementation**: Uses \`std::chrono::high_resolution_clock\` for per-frame shader timing measurement
**Impact**: Performance-critical - measures timing for each shader type during rendering

Key code locations:
- \`frameStartTime\` member variable (line ~87 in State.h)
- Frame timing initialization in \`UpdatePerFrameData()\` (line ~240 in State.cpp)
- Per-shader timing measurement in \`UpdatePerFrameData()\` (line ~315 in State.cpp)

### Secondary Targets:

#### ABTesting Timer (Medium Priority)
**Location**: \`src/Features/PerformanceOverlay/ABTesting/ABTesting.cpp\`
**Current Implementation**: Uses \`std::chrono::high_resolution_clock\` for A/B test interval timing
**Impact**: Used for performance testing intervals

#### PerformanceOverlay Update Timer (Low Priority)
**Location**: \`src/Features/PerformanceOverlay.cpp\`
**Current Implementation**: Uses \`std::chrono::steady_clock\` for overlay update intervals
**Impact**: Less critical but part of performance monitoring system

## Implementation Guidelines

Follow the pattern from PR #1275:
1. Add \`LARGE_INTEGER\` members for frequency, last measurement, and accumulated time
2. Use \`QueryPerformanceFrequency()\` for initialization
3. Use \`QueryPerformanceCounter()\` for measurements
4. Convert to milliseconds using: \`time.QuadPart * 1000.0 / frequency.QuadPart\`

## Files to Modify
- \`src/State.h\` - Update timing member variables
- \`src/State.cpp\` - Update timing calculations
- Potentially \`src/Features/PerformanceOverlay/ABTesting/ABTesting.cpp\` and \`src/Features/PerformanceOverlay.cpp\` for secondary targets

## Testing
- Verify frame timing accuracy is maintained
- Test performance impact on shader performance measurement
- Ensure no regression in A/B testing functionality (if addressed)

## Notes
- Focus on State.cpp frame timing first as it has the highest performance impact
- This change is Windows-specific but the project already uses Windows APIs extensively
- Consider the existing usage of QueryPerformanceCounter in Upscaling.cpp and DX12SwapChain.cpp as reference implementations" \
  --assignee "" \
  --label "good first issue,help wanted,performance,refactor"

Length of output: 2883


Let me try creating the issue again with the available labels:


🏁 Script executed:

#!/bin/bash
# Create the GitHub issue with correct labels
gh issue create \
  --title "perf: optimize std::chrono timing to QueryPerformanceCounter in performance-critical paths" \
  --body "## Overview
This issue tracks opportunities to replace std::chrono timing mechanisms with Windows QueryPerformanceCounter for better performance in timing-critical code paths, following the pattern established in PR #1275.

## Background
PR #1275 successfully replaced std::chrono::high_resolution_clock with QueryPerformanceCounter in ShaderCache.cpp for shader compilation timing, demonstrating improved performance characteristics.

## Identified Opportunities

### Primary Target: State.cpp Frame Timing (High Priority)
**Location**: \`src/State.cpp\` and \`src/State.h\`
**Current Implementation**: Uses \`std::chrono::high_resolution_clock\` for per-frame shader timing measurement
**Impact**: Performance-critical - measures timing for each shader type during rendering

Key code locations:
- \`frameStartTime\` member variable (line ~87 in State.h)
- Frame timing initialization in \`UpdatePerFrameData()\` (line ~240 in State.cpp)
- Per-shader timing measurement in \`UpdatePerFrameData()\` (line ~315 in State.cpp)

### Secondary Targets:

#### ABTesting Timer (Medium Priority)
**Location**: \`src/Features/PerformanceOverlay/ABTesting/ABTesting.cpp\`
**Current Implementation**: Uses \`std::chrono::high_resolution_clock\` for A/B test interval timing
**Impact**: Used for performance testing intervals

#### PerformanceOverlay Update Timer (Low Priority)
**Location**: \`src/Features/PerformanceOverlay.cpp\`
**Current Implementation**: Uses \`std::chrono::steady_clock\` for overlay update intervals
**Impact**: Less critical but part of performance monitoring system

## Implementation Guidelines

Follow the pattern from PR #1275:
1. Add \`LARGE_INTEGER\` members for frequency, last measurement, and accumulated time
2. Use \`QueryPerformanceFrequency()\` for initialization
3. Use \`QueryPerformanceCounter()\` for measurements
4. Convert to milliseconds using: \`time.QuadPart * 1000.0 / frequency.QuadPart\`

## Files to Modify
- \`src/State.h\` - Update timing member variables
- \`src/State.cpp\` - Update timing calculations
- Potentially \`src/Features/PerformanceOverlay/ABTesting/ABTesting.cpp\` and \`src/Features/PerformanceOverlay.cpp\` for secondary targets

## Testing
- Verify frame timing accuracy is maintained
- Test performance impact on shader performance measurement
- Ensure no regression in A/B testing functionality (if addressed)

## Notes
- Focus on State.cpp frame timing first as it has the highest performance impact
- This change is Windows-specific but the project already uses Windows APIs extensively
- Consider the existing usage of QueryPerformanceCounter in Upscaling.cpp and DX12SwapChain.cpp as reference implementations" \
  --label "good first issue,help wanted"

Length of output: 2870


@alandtse Perfect! I've successfully created GitHub issue #1276 documenting additional timer optimization opportunities.

After analyzing the codebase, I identified several areas that could benefit from similar QueryPerformanceCounter optimizations as implemented in this PR:

Primary Target (High Priority):

  • State.cpp frame timing - Currently uses std::chrono::high_resolution_clock for per-frame shader timing measurement. This is performance-critical as it measures timing for each shader type during rendering.

Secondary Targets:

  • ABTesting timer - Uses std::chrono::high_resolution_clock for A/B test interval timing
  • PerformanceOverlay update timer - Uses std::chrono::steady_clock for overlay update intervals

The issue has been tagged as "good first issue" and "help wanted" as requested, and left unassigned. It includes detailed implementation guidelines following the pattern established in PR #1275, specific file locations, and testing recommendations.

The State.cpp frame timing optimization would likely provide the most significant performance benefit since it's used for real-time per-frame shader performance measurement, similar to the shader compilation timing that was optimized in this PR.


🧠 Learnings used
Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-06-29T00:27:42.717Z
Learning: When reviewing PRs, the committer's description should inform analysis of changes and may take precedence to conclusions when reasonable, but the choice of commit type should still be independently evaluated for compliance with conventional commits. Both committer intent and conventional commit standards should be considered to determine if the classification is defensible.

Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-06-24T07:17:36.604Z
Learning: When reviewing PRs, always clarify the scope if there are multiple related features or dependencies. WeatherPicker was a separate PR that was already merged, while this PR focuses specifically on WetnessEffects climate preset system enhancements.

Learnt from: alandtse
PR: doodlum/skyrim-community-shaders#0
File: :0-0
Timestamp: 2025-07-05T05:20:45.823Z
Learning: In the skyrim-community-shaders repository, file deletion error handling improvements that replace existence checks and try-catch blocks with std::filesystem::remove error-code-based approaches are considered bug fixes rather than refactoring, as they address inadequate error handling and misleading log messages.

@alandtse alandtse marked this pull request as draft July 16, 2025 18:08
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jul 16, 2025

✅ A pre-release build is available for this PR:
Download

alandtse and others added 2 commits July 16, 2025 22:52
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Automated formatting by clang-format, prettier, and other hooks.
See https://pre-commit.ci for details.
@alandtse alandtse marked this pull request as ready for review July 17, 2025 06:04
@alandtse
Copy link
Copy Markdown
Collaborator

Tested in game. I'm not sure this buys us much since it's just the cache compilation measuring.

@alandtse alandtse merged commit 72ebc2a into dev Jul 17, 2025
25 checks passed
alandtse added a commit to alandtse/open-shaders that referenced this pull request Jul 20, 2025
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Alan Tse <alandtse@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@alandtse alandtse deleted the fast-clock branch February 6, 2026 05:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants