Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Binary Archives as a pipeline cache on the metal backend #3719

Merged
merged 24 commits into from
Apr 19, 2021

Conversation

expenses
Copy link
Contributor

@expenses expenses commented Apr 7, 2021

See #3716. This PR builds on-top of gfx-rs/metal-rs#207 to add an implementation of get_pipeline_cache_data for the metal backend. This PR is not ready for reviews.

Todo:

  • Get rid of the Mutex around the pipeline cache and replace is_empty with an AtomicBool, as unsafe impl Sync is probably fine for the binary archive.
  • Check the availability of binary archives on the target with device.supports_binary_archive().
  • Potentially use a second cache to store SPIR-V -> MSL transformations.
  • Come up with a test suite to test pipeline caches. Something with hundreds/thousands of pipelines (maybe using shaders from shadertoy?) would be nice.

PR checklist:

  • make succeeds (on *nix)
  • make reftests succeeds
  • tested examples with the following backends: metal

@expenses
Copy link
Contributor Author

expenses commented Apr 9, 2021

I've added most of the features that I want for this now; what's left is a bit of a clean up, optimization as far as serialization is concerned and to merge gfx-rs/naga#686 and https://github.com/grovesNL/spirv_cross/compare/master...expenses:serialize-feature?expand=1.

@expenses
Copy link
Contributor Author

Paused for a little bit because I'd really like the SPV->MSL cache to be done for naga, but testing this requires naga compilation of the shadertoy shaders to work a little better. gfx-rs/naga#693 fixes most of the issues.

@kvark
Copy link
Member

kvark commented Apr 12, 2021

This is also going to help gfx-portability. In particular, as we are finishing up the Naga fixes needed to get Dota2 running, we'll be benchmarking again, and having the working shader cache will boost it significantly.

Comment on lines 1477 to 1481
Ok(bincode::serialize(&n::SerializablePipelineCache {
binary_archive: &binary_archive()?,
spv_to_msl: n::serialize_spv_to_msl_cache(&cache.spv_to_msl),
})
.unwrap())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be made faster by not using bincode and making it zero-copy.

Comment on lines 1432 to 1437
let pipeline_cache: n::SerializablePipelineCache = bincode::deserialize(data).unwrap();

Ok(n::PipelineCache {
binary_archive: create_binary_archive(&pipeline_cache.binary_archive)?,
spv_to_msl: n::load_spv_to_msl_cache(pipeline_cache.spv_to_msl),
})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be made faster by not using bincode and making it zero-copy.

@expenses expenses marked this pull request as ready for review April 13, 2021 21:38
@expenses expenses changed the title [Draft] Use Binary Archives as a pipeline cache on the metal backend Use Binary Archives as a pipeline cache on the metal backend Apr 13, 2021
@expenses
Copy link
Contributor Author

I think this is about ready for review now. gfx-rs/naga#716 needs to be merged and a gfx-22 tag of naga needs to be made before merging.

@expenses
Copy link
Contributor Author

We're still parsing and validating the SPIR-V code with naga unnecessarily, as we should be able to skip over that if the SPIR-V is in the cache. Parsing and validating only seems to take about 40ms for 8000 modules though, so I don't think this is a huge problem.

@expenses
Copy link
Contributor Author

Parsing and validating only seems to take about 40ms for 8000 modules though, so I don't think this is a huge problem.

Writing out the MSL shaders with a fresh pipeline cache also only takes 50ms for 8000 modules, so it appears that naga is fast enough that there's not too much benefit to having a SPV->MSL cache in the first place.

@expenses
Copy link
Contributor Author

expenses commented Apr 13, 2021

Writing out the MSL shaders with a fresh pipeline cache also only takes 50ms for 8000 modules, so it appears that naga is fast enough that there's not too much benefit to having a SPV->MSL cache in the first place.

Here are some timings for 8286 pipelines (less than the number in #3716 (comment) because some fail to verify):

cleared spv->msl cache: 8.50 5.87 5.44 5.46 6.42
hot spv->msl cache: 5.48 4.89 4.78 4.78 4.64

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some top quality work, thank you for kicking this forward!

examples/quad/main.rs Outdated Show resolved Hide resolved
@@ -15,7 +15,7 @@ edition = "2018"
[features]
default = []
signpost = []
cross = ["spirv_cross", "auxil", "naga/spv-out"]
cross = ["spirv_cross", "auxil"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is naga/spv-out removed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we're now enabling it by default for create_shader_module_from_naga.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed things around now and we only use naga/spv-out if we're using the pipeline-cache feature or the cross feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so for pipeline caching, we basically have to always go through SPIR-V? It's not the worst thing, I imagine wgpu will just not have this feature enabled for the near future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily, I could probably look into reworking things so we cache the hash of the WSL as well.

src/backend/metal/Cargo.toml Outdated Show resolved Hide resolved
src/backend/metal/src/device.rs Outdated Show resolved Hide resolved
src/backend/metal/src/device.rs Outdated Show resolved Hide resolved
src/backend/metal/src/device.rs Outdated Show resolved Hide resolved
src/backend/metal/src/native.rs Outdated Show resolved Hide resolved
@expenses
Copy link
Contributor Author

I timed it again and having a binary archive pipeline cache definitely makes a difference when the system cache has been cleaned.

wiped system cache, with binary archive: 293.24 288.50
wiped system cache, no binary archive: 892.77 889.07

@expenses expenses requested a review from kvark April 14, 2021 17:25
Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!
I noted a few ideas here, but overall happy with this landing as soon as naga's gfx-22 train is up.

src/backend/metal/src/device.rs Outdated Show resolved Hide resolved
src/backend/metal/src/device.rs Outdated Show resolved Hide resolved
src/backend/metal/src/native.rs Outdated Show resolved Hide resolved
@kvark
Copy link
Member

kvark commented Apr 19, 2021

@expenses Naga's "gfx-22" is up, please update the PR to use it (and make sure every commit is testable).

@kvark
Copy link
Member

kvark commented Apr 19, 2021

@expenses can we make it so every commit is testable? If not sure, let's squash everything.

@expenses
Copy link
Contributor Author

Okay, I'll do a squash then I think.

@expenses
Copy link
Contributor Author

I'm trying to find a good way to do with while having the same commit message that doing a squash merge on github does. Maybe just doing it through the site would be better?

@kvark kvark merged commit 6999171 into gfx-rs:master Apr 19, 2021
bors bot added a commit to gfx-rs/wgpu that referenced this pull request Apr 19, 2021
1339: Update naga to gfx-22, add interpolation sampling validation r=kvark a=kvark

**Connections**
Includes gfx-rs/gfx#3729, gfx-rs/gfx#3719, gfx-rs/naga#689, gfx-rs/naga#700, and other fixes

**Description**
Update gfx and naga to latest (gfx-22 tag).
The hope is for this to be the last tag release before 0.8

**Testing**
Tested on wgpu-rs examples

Co-authored-by: Dzmitry Malyshau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants