Always decompose nvcc compilations #2300
…usted). Always decompose via `nvcc --dryrun`, then cache and report the host compiler call as a CUDA compilation
    Ok((
        command,
        None,
        // Never assume the outer `nvcc` call is cacheable. We must decompose the nvcc call into
        // its constituent subcommands with `--dryrun` and only cache the final build product.
        //
        // Always decomposing `nvcc --dryrun` is the only way to ensure caching nvcc invocations
        // is fully sound, because the `nvcc -E` preprocessor output is not sufficient to detect
        // all source code changes.
        //
        // Specifically, `nvcc -E` always defines __CUDA_ARCH__, which means changes to host-only
        // code guarded by an `#ifndef __CUDA_ARCH__` will _not_ be captured in `nvcc -E` output.
        Cacheable::No,
    ))
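
The `#ifndef __CUDA_ARCH__` problem described in the comment above can be illustrated with a toy model. The sketch below is not sccache code: `toy_preprocess` and `same_preprocessed_output` are hypothetical helpers that understand only `#ifndef NAME` / `#endif`, which is enough to show why a cache key derived from preprocessor output misses host-only edits when `__CUDA_ARCH__` is defined.

```rust
use std::collections::HashSet;

/// Toy preprocessor (hypothetical, for illustration only): handles just
/// `#ifndef NAME` and `#endif`, and copies every other active line through.
fn toy_preprocess(src: &str, defined: &HashSet<&str>) -> String {
    let mut out = String::new();
    // One flag per open `#ifndef`: is the enclosed region emitted?
    let mut active: Vec<bool> = Vec::new();
    for line in src.lines() {
        let trimmed = line.trim();
        if let Some(name) = trimmed.strip_prefix("#ifndef ") {
            let parent = active.last().copied().unwrap_or(true);
            active.push(parent && !defined.contains(name.trim()));
        } else if trimmed == "#endif" {
            active.pop();
        } else if active.last().copied().unwrap_or(true) {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}

/// True when two sources are indistinguishable after toy preprocessing,
/// i.e. a cache keyed on that output would treat them as the same input.
fn same_preprocessed_output(a: &str, b: &str, defined: &[&str]) -> bool {
    let set: HashSet<&str> = defined.iter().copied().collect();
    toy_preprocess(a, &set) == toy_preprocess(b, &set)
}
```

Given two sources that differ only inside an `#ifndef __CUDA_ARCH__` block, `same_preprocessed_output(v1, v2, &["__CUDA_ARCH__"])` returns `true` in this model, so the host-only change would be invisible to a key built from that output. With no macros defined, the same call returns `false`.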
We might now be able to get away with doing less in the preprocess()
function, since we effectively don't care about most of what it does.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main    #2300       +/-   ##
    ==========================================
    - Coverage   30.91%        0   -30.92%
    ==========================================
      Files          53        0       -53
      Lines       20112        0    -20112
      Branches     9755        0     -9755
    ==========================================
    - Hits         6217        0     -6217
    + Misses       7922        0     -7922
    + Partials     5973        0     -5973
sccache 0.9.1 should be dealing with `nvcc -E` correctly; see mozilla/sccache#2300.
If this works as expected, we can get rid of this code: https://github.com/pytorch/pytorch/pull/142813/files
Pull Request resolved: #145012
Approved by: https://github.com/malfet
Never cache the outer CUDA compilation (because `nvcc -E` can't be trusted).
Always decompose via `nvcc --dryrun`, then cache and report the host compiler call as a CUDA compilation.
Fixes #2299.
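
As a rough illustration of what "decompose via `nvcc --dryrun`" involves, here is a minimal sketch of splitting dry-run output into environment assignments and subcommands. It is not the implementation in this PR. It assumes that each dry-run step is printed on a line prefixed with `#$ `, and the names `DryrunLine`, `parse_dryrun`, and `is_ident` are hypothetical. Splitting arguments on whitespace is a simplification that ignores shell quoting.

```rust
/// One entry recovered from dry-run output (hypothetical type).
#[derive(Debug, PartialEq)]
enum DryrunLine {
    /// A `NAME=value` environment assignment.
    Env { name: String, value: String },
    /// A subcommand, naively split on whitespace (quoting is not handled).
    Command(Vec<String>),
}

/// True if `s` looks like an environment variable name.
fn is_ident(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Keep only lines carrying the assumed `#$ ` prefix and classify each one.
fn parse_dryrun(output: &str) -> Vec<DryrunLine> {
    output
        .lines()
        .filter_map(|line| line.strip_prefix("#$ "))
        .map(|rest| match rest.split_once('=') {
            // `NAME=value` where NAME is identifier-like: an env assignment.
            Some((name, value)) if is_ident(name) => DryrunLine::Env {
                name: name.to_string(),
                value: value.to_string(),
            },
            // Anything else is treated as a subcommand invocation.
            _ => DryrunLine::Command(
                rest.split_whitespace().map(String::from).collect(),
            ),
        })
        .collect()
}
```

A caller could then walk the `Command` entries in order and decide per subcommand whether its output is worth caching, while leaving the outer `nvcc` call uncached as the description requires.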