Always decompose nvcc compilations #2300

trxcllnt · 2024-12-21T21:07:47Z

Never cache the outer CUDA compilation (because `nvcc -E` can't be trusted).

Always decompose via `nvcc --dryrun`, then cache and report the host compiler call as a CUDA compilation.
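
For readers unfamiliar with the decomposition step, here is a minimal sketch of splitting `nvcc --dryrun` output into environment assignments and subcommands. It assumes that dryrun lines carry a `#$ ` prefix. The `DryrunLine` type, the `parse_dryrun` and `is_env_name` helpers, and the sample text are all hypothetical and hand-written for illustration. This is not the code sccache uses.

```rust
/// One entry recovered from `nvcc --dryrun` output.
/// Hypothetical type for this sketch, not an sccache type.
#[derive(Debug, PartialEq)]
enum DryrunLine {
    /// An environment assignment of the form `NAME=value`.
    Env { name: String, value: String },
    /// A subcommand line, kept as an unparsed string.
    Command(String),
}

/// Returns true if `s` looks like an environment variable name.
fn is_env_name(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Keeps only lines that start with the assumed `#$ ` prefix and classifies each one.
fn parse_dryrun(output: &str) -> Vec<DryrunLine> {
    output
        .lines()
        .filter_map(|line| line.strip_prefix("#$ "))
        .map(|rest| match rest.split_once('=') {
            // `NAME=value` where NAME contains no spaces or dashes: an assignment.
            Some((name, value)) if is_env_name(name) => DryrunLine::Env {
                name: name.to_string(),
                value: value.to_string(),
            },
            // Anything else (including `-DFOO=1` arguments) is a command.
            _ => DryrunLine::Command(rest.trim().to_string()),
        })
        .collect()
}

// Hand-written, heavily abbreviated sample. Not real nvcc output.
const SAMPLE: &str = "\
#$ _HERE_=/usr/local/cuda/bin
#$ gcc -D__CUDA_ARCH__=520 -E x.cu
this line has no prefix and is ignored
";

fn main() {
    for entry in parse_dryrun(SAMPLE) {
        println!("{:?}", entry);
    }
}
```

With the subcommands separated like this, a cache can key on the individual host compiler call rather than on the outer `nvcc` invocation, which is the approach the description above argues for.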


trxcllnt · 2024-12-21T21:13:41Z

src/compiler/nvcc.rs

+    Ok((
+        command,
+        None,
+        // Never assume the outer `nvcc` call is cacheable. We must decompose the nvcc call into
+        // its constituent subcommands with `--dryrun` and only cache the final build product.
+        //
+        // Always decomposing `nvcc --dryrun` is the only way to ensure caching nvcc invocations
+        // is fully sound, because the `nvcc -E` preprocessor output is not sufficient to detect
+        // all source code changes.
+        //
+        // Specifically, `nvcc -E` always defines __CUDA_ARCH__, which means changes to host-only
+        // code guarded by an `#ifndef __CUDA_ARCH__` will _not_ be captured in `nvcc -E` output.
+        Cacheable::No,
+    ))


We might now be able to get away with doing less in the preprocess() function, since we effectively don't care about most of what it does.
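
The `__CUDA_ARCH__` problem described in the code comment above can be illustrated with a toy model. The sketch below is not sccache's preprocessing or hashing logic. `toy_preprocess` and `toy_cache_key` are hypothetical helpers that handle only a single, non-nested `#ifndef __CUDA_ARCH__` guard, and the two sources are hand-written. It shows that when the macro is always defined, an edit to host-only code leaves the preprocessed text, and any key derived from it, unchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Two hand-written CUDA-like sources that differ only in the host-only branch.
const BEFORE: &str = "\
__host__ __device__ int f() {
#ifndef __CUDA_ARCH__
  return 1; // host-only path
#endif
  return 0;
}
";

const AFTER: &str = "\
__host__ __device__ int f() {
#ifndef __CUDA_ARCH__
  return 2; // host-only path, edited
#endif
  return 0;
}
";

/// Toy stand-in for a preprocessor pass (hypothetical helper).
/// Lines inside an `#ifndef __CUDA_ARCH__` ... `#endif` block are dropped
/// when `cuda_arch_defined` is true and kept otherwise.
fn toy_preprocess(source: &str, cuda_arch_defined: bool) -> String {
    let mut out = String::new();
    let mut in_guard = false;
    for line in source.lines() {
        let trimmed = line.trim();
        if trimmed == "#ifndef __CUDA_ARCH__" {
            in_guard = true;
            continue;
        }
        if in_guard && trimmed == "#endif" {
            in_guard = false;
            continue;
        }
        if in_guard && cuda_arch_defined {
            continue; // guarded host-only code vanishes from the output
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}

/// Toy cache key: a hash of the preprocessed text only (hypothetical helper).
fn toy_cache_key(preprocessed: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    preprocessed.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let key_before = toy_cache_key(&toy_preprocess(BEFORE, true));
    let key_after = toy_cache_key(&toy_preprocess(AFTER, true));
    // With the macro defined, the host-only edit is invisible to the key.
    println!("keys match with __CUDA_ARCH__ defined: {}", key_before == key_after);
}
```

In this model a cache keyed on the outer preprocessed output would return a stale object after the host-only edit, which is why the change marks the outer call `Cacheable::No`.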

codecov-commenter · 2024-12-21T21:26:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (0cc0c62) to head (a03b43c).
Report is 123 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #2300       +/-   ##
==========================================
- Coverage   30.91%       0   -30.92%     
==========================================
  Files          53       0       -53     
  Lines       20112       0    -20112     
  Branches     9755       0     -9755     
==========================================
- Hits         6217       0     -6217     
+ Misses       7922       0     -7922     
+ Partials     5973       0     -5973



sccache 0.9.1 should be dealing with `nvcc -E` correctly; see mozilla/sccache#2300.

If this works as expected, we can get rid of this code: https://github.com/pytorch/pytorch/pull/142813/files

Pull Request resolved: #145012
Approved by: https://github.com/malfet

Never cache the outer CUDA compilation (because nvcc -E can't be trusted). Always decompose via `nvcc --dryrun`, then cache and report the host compiler call as a CUDA compilation

a03b43c

trxcllnt mentioned this pull request Dec 21, 2024

[BUG]: sccache is causing mis-compiles NVIDIA/cccl#3103

Open

1 task


sylvestre merged commit 709309e into mozilla:main Dec 21, 2024
59 checks passed

trxcllnt mentioned this pull request Dec 23, 2024

Add test for #2299 #2301

Merged

wdvr added a commit to pytorch/pytorch that referenced this pull request Jan 16, 2025

upgrade to sccache 0.9.1 - dealing with nvcc -E correctly

31db637

see mozilla/sccache#2300

wdvr mentioned this pull request Jan 16, 2025

upgrade to sccache 0.9.1 - dealing with nvcc -E correctly pytorch/pytorch#145012

Closed

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Jan 17, 2025

upgrade to sccache 0.9.1 - dealing with nvcc -E correctly

9863b98

see mozilla/sccache#2300
