ci: eliminate VS install from shader validation, add parallel hlslkit #2394
Draft
doodlum wants to merge 3 commits into
…ompile

**Shader validation (~8-18 min saved per run)**
The shader-validation job previously ran setup-build-environment (VS install, vcpkg, cmake configure) just to copy shader files into the AIO layout. Replace the cmake-based prepare_shaders target with a direct robocopy step that needs no C++ toolchain at all.
- prepare-shaders action now copies with robocopy instead of cmake
- shader-validation jobs skip submodule checkout (not needed for HLSL)
- shader-validation jobs drop the setup-build-environment step entirely
- fxc.exe is already in the Windows SDK on windows-2025 runners

**Parallel shader compilation (~30-50% faster validation)**
- Add --jobs $(nproc) --strip-debug-defines to hlslkit-compile so all available CPU cores are used and per-shader debug overhead is dropped

**hlslkit pip caching**
- Pin hlslkit via .github/configs/hlslkit-requirements.txt
- Cache pip wheels with setup-python@v6 so repeated runs skip the git clone + build of hlslkit

**CMake cache key: stop hashing 79 MB of extern/**
- Use .git/modules/*/HEAD refs (3 small files) instead of extern/** to detect submodule SHA changes; same correctness, negligible I/O

**FastDev build preset (local dev)**
- New ALL-FastDev configure preset: ZIP_TO_DIST=OFF, AIO_ZIP_TO_DIST=OFF, BUILD_SHADER_TESTS=OFF for a leaner configure step
- New FastDev build preset: RelWithDebInfo config → no LTO/LTCG, 3-4x faster link step while keeping /O2 optimisation
- XSEPlugin.cmake: add CMAKE_INTERPROCEDURAL_OPTIMIZATION_RELWITHDEBINFO=OFF so cmake does not apply LTO to RelWithDebInfo even with IPO globally ON

**BuildRelease.bat: add --build-only flag**
- Skip cmake configure on repeated local builds when project files have not changed: BuildRelease.bat --build-only [preset]

**tools/benchmark-build.ps1**
- New script that times configure / compile / package / shader validation separately so each phase can be measured and compared across presets

https://claude.ai/code/session_01SWzRWgwr9UexPxa9hLrpmq
Allows testing workflow and action file changes on a feature branch without merging to dev first. workflow_dispatch resolves all files (including .github/actions/*) from the selected ref, unlike pull_request_target which always uses the base branch. Trigger via: Actions → Manual CI Test → Run workflow → select branch.
The comment said 'pinned by commit hash' but the actual URL had no @<hash>. Without a pin, the pip wheel cache key never changes even when hlslkit is updated upstream — cached runs silently use stale wheels. Pin to current HEAD (d30266d).
Summary
Benchmarked and optimized the three CI legs that run on every HLSL/C++ PR. The dominant win is restructuring the shader-validation jobs to skip the VS Build Tools install and cmake configure entirely; the jobs only needed those steps to copy shader files.
Measured baseline
Timings pulled from the last three real PRs that triggered shader validation (#2344, #2363, #2386):
Where the shader-validation time goes (before)
Each of the two validation jobs (Flatrim, VR) ran this setup sequence before doing any actual validation:
- `actions/checkout` with `submodules: recursive` (79 MB extern/)
- `prepare_shaders` target (just copies files)
- `pip install git+…hlslkit` (no cache)
- `hlslkit-compile` (single-core, 212 permutations)

Changes and expected improvement
1. Remove VS install + cmake from shader-validation jobs (`_shared-build.yaml`, `prepare-shaders/action.yml`)
The `prepare_shaders` cmake target only copies HLSL files from `package/Shaders/` and `features/*/Shaders/` into `build/ALL/aio/Shaders/`. There is no C++ compilation. Replace it with a direct `robocopy` step that runs in ~30 s and needs no toolchain at all.
- Drop `setup-build-environment` (VS + vcpkg + cmake) from both validation jobs
- Drop `submodules: recursive`; the extern/ C++ deps are irrelevant here
- `fxc.exe` ships with the Windows SDK already on `windows-2025` runners

Expected saving: ~12–17 min per validation job.
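To make the point that this step is a plain file copy, here is a minimal Python sketch of the same operation. It is an illustration only, not the robocopy invocation used in the action: the function name `prepare_shaders`, the assumption that every source tree is merged directly into the destination root, and the demo file names are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_shaders(repo_root: Path, dest: Path) -> int:
    """Copy every file under the shader source roots into dest; return the count."""
    sources = [repo_root / "package" / "Shaders"]
    # One Shaders/ directory per feature, mirroring the features/*/Shaders/ glob.
    sources += sorted(repo_root.glob("features/*/Shaders"))

    copied = 0
    for src_root in sources:
        for src in src_root.rglob("*"):
            if src.is_file():
                # Assumption: relative paths are preserved under a shared root.
                target = dest / src.relative_to(src_root)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target)
                copied += 1
    return copied


if __name__ == "__main__":
    # Demo on a throwaway tree; the file names below are made up.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "package/Shaders").mkdir(parents=True)
        (root / "package/Shaders/Base.hlsl").write_text("// base")
        (root / "features/Demo/Shaders/Demo").mkdir(parents=True)
        (root / "features/Demo/Shaders/Demo/Demo.hlsli").write_text("// feature")
        print(prepare_shaders(root, root / "build/ALL/aio/Shaders"))
```

Nothing here touches a compiler, vcpkg, or cmake, which is why the validation jobs can drop the build-environment setup.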
2. Parallel shader compilation (`_shared-build.yaml`)
`windows-2025` runners have 4 logical cores. `--jobs $(nproc)` parallelises the 212-permutation fxc.exe invocations. `--strip-debug-defines` removes the per-shader `/Zi` debug-info overhead.

Expected saving: ~6–9 min (from ~12 min at 1 core to ~4–5 min at 4 cores).
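The effect of a `--jobs` style option can be pictured with a generic worker pool. The sketch below is not hlslkit's implementation, which this PR does not show; `compile_all`, `ideal_wall_minutes`, and the stand-in `fake_compile` are hypothetical names used only for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def compile_all(permutations, compile_one, jobs=None):
    """Run compile_one over every permutation with `jobs` workers, keeping input order."""
    jobs = jobs or os.cpu_count() or 1
    # Threads are enough when each task mostly waits on an external compiler process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(compile_one, permutations))


def ideal_wall_minutes(serial_minutes, jobs):
    """Best-case wall time if the work scaled perfectly across workers."""
    return serial_minutes / jobs


if __name__ == "__main__":
    def fake_compile(name):
        # Stand-in for one fxc.exe invocation; returns (permutation, exit code).
        return (name, 0)

    results = compile_all([f"perm{i}" for i in range(212)], fake_compile, jobs=4)
    print(len(results), ideal_wall_minutes(12, 4))
```

Perfect scaling of ~12 serial minutes over 4 cores would give 3 minutes, so the ~4–5 min estimate above leaves room for process start-up and uneven permutation cost.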
3. pip caching for hlslkit (`prepare-shaders/action.yml`, `.github/configs/hlslkit-requirements.txt`)
Previously: `pip install git+https://github.com/alandtse/hlslkit.git` on every run (fresh git clone + build, no cache).
Now: pin via `hlslkit-requirements.txt`, cache built wheel with `setup-python@v6 cache: pip`.

Expected saving: ~1–2 min on warm cache runs.
4. CMake cache key: stop hashing 79 MB of extern/ (`setup-build-environment/action.yaml`)
Hash three 41-byte `.git/modules/*/HEAD` files instead of scanning the entire extern/ tree. Same correctness (detects any submodule SHA change), negligible I/O. Relevant for cpp-build and shader-unit-tests jobs.
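The reasoning behind the smaller key can be sketched in Python: a detached submodule's `HEAD` file holds its 40-character commit SHA plus a newline, so hashing those files changes exactly when a submodule pointer moves. The function name `submodule_cache_key` and the demo submodule names are hypothetical, and the real action computes its key in workflow syntax rather than Python.

```python
import hashlib
import tempfile
from pathlib import Path


def submodule_cache_key(repo_root: Path) -> str:
    """Hash the HEAD file of each submodule found under .git/modules/."""
    digest = hashlib.sha256()
    for head in sorted(repo_root.glob(".git/modules/*/HEAD")):
        # Include the submodule name so swapping two SHAs still changes the key.
        digest.update(head.parent.name.encode())
        digest.update(b"\0")
        digest.update(head.read_bytes())
    return digest.hexdigest()


if __name__ == "__main__":
    # Demo with two made-up submodules in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name, sha in (("libA", "1" * 40), ("libB", "2" * 40)):
            module_dir = root / ".git" / "modules" / name
            module_dir.mkdir(parents=True)
            (module_dir / "HEAD").write_text(sha + "\n")
        before = submodule_cache_key(root)
        (root / ".git/modules/libA/HEAD").write_text("3" * 40 + "\n")
        print(before != submodule_cache_key(root))
```

Only a few dozen bytes per submodule are read, regardless of how large the checked-out trees are.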
5. FastDev local build preset (`CMakePresets.json`, `cmake/XSEPlugin.cmake`)
New `FastDev` build preset: `RelWithDebInfo` config (keeps `/O2`) with LTO/LTCG disabled, no zip packaging. Linker is ~3–4× faster for rapid iteration.
`XSEPlugin.cmake` gains `CMAKE_INTERPROCEDURAL_OPTIMIZATION_RELWITHDEBINFO OFF` so cmake doesn't silently re-enable LTO for that config even with the global IPO flag set.

6. `BuildRelease.bat --build-only`
Skip cmake reconfigure on repeat local builds when project files haven't changed: `BuildRelease.bat --build-only [preset]`
7. `tools/benchmark-build.ps1`
Measures configure / compile / package / shader-validation times individually so each phase can be compared before/after any future build-system change.
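The per-phase timing idea can be sketched as follows. This is not the PowerShell script itself, whose contents and options are not shown here; the `timed` helper and the sleeps standing in for real build phases are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(phase, results):
    """Record the wall-clock seconds spent inside the with-block under `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the phase raises, so failed runs still show timings.
        results[phase] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}
    for phase in ("configure", "compile", "package", "shader-validation"):
        with timed(phase, timings):
            time.sleep(0.01)  # stand-in for the real build step
    for phase, seconds in timings.items():
        print(f"{phase:<18} {seconds:.3f} s")
```

Keeping each phase in its own entry makes it possible to compare, say, link time between presets without the configure step skewing the total.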
After timings
Test plan
- … `windows-2025`
- `FastDev` preset builds locally on Windows (manual, requires VS)

https://claude.ai/code/session_01SWzRWgwr9UexPxa9hLrpmq
Generated by Claude Code