-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang][deps] Add module map describing compiled module to file dependencies. #160226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][deps] Add module map describing compiled module to file dependencies. #160226
Conversation
…dencies. When we add the module map describing the compiled module to the command line, add it to the file dependencies as well. Discovered while working on reproducers where a command line input was missing in the captured files as it wasn't considered a dependency.
|
@llvm/pr-subscribers-clang Author: Volodymyr Sapsai (vsapsai) ChangesWhen we add the module map describing the compiled module to the command line, add it to the file dependencies as well. Discovered while working on reproducers where a command line input was missing in the captured files as it wasn't considered a dependency. Full diff: https://github.com/llvm/llvm-project/pull/160226.diff 5 Files Affected:
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index d67178c153e88..9109ce789ee4c 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -444,9 +444,11 @@ void ModuleDepCollector::applyDiscoveredDependencies(CompilerInvocation &CI) {
if (OptionalFileEntryRef CurrentModuleMap =
PP.getHeaderSearchInfo()
.getModuleMap()
- .getModuleMapFileForUniquing(CurrentModule))
+ .getModuleMapFileForUniquing(CurrentModule)) {
CI.getFrontendOpts().ModuleMapFiles.emplace_back(
CurrentModuleMap->getNameAsRequested());
+ Consumer.handleFileDependency(CurrentModuleMap->getNameAsRequested());
+ }
SmallVector<ModuleID> DirectDeps;
for (const auto &KV : ModularDeps)
diff --git a/clang/test/ClangScanDeps/modules-fmodule-name-no-module-built.m b/clang/test/ClangScanDeps/modules-fmodule-name-no-module-built.m
index cfe29c2bf7cdb..1fdcbc51c8e1f 100644
--- a/clang/test/ClangScanDeps/modules-fmodule-name-no-module-built.m
+++ b/clang/test/ClangScanDeps/modules-fmodule-name-no-module-built.m
@@ -46,7 +46,8 @@
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/modules-fmodule-name-no-module-built.m",
// CHECK-NEXT: "[[PREFIX]]/Inputs/header3.h",
-// CHECK-NEXT: "[[PREFIX]]/Inputs/header.h"
+// CHECK-NEXT: "[[PREFIX]]/Inputs/header.h",
+// CHECK-NEXT: "Inputs/module.modulemap"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/modules-fmodule-name-no-module-built.m"
// CHECK-NEXT: }
diff --git a/clang/test/ClangScanDeps/modules-header-sharing.m b/clang/test/ClangScanDeps/modules-header-sharing.m
index 31ef351ec38b7..005b22dc6902a 100644
--- a/clang/test/ClangScanDeps/modules-header-sharing.m
+++ b/clang/test/ClangScanDeps/modules-header-sharing.m
@@ -79,7 +79,8 @@
// CHECK: ],
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/tu.m",
-// CHECK-NEXT: "[[PREFIX]]/shared/H.h"
+// CHECK-NEXT: "[[PREFIX]]/shared/H.h",
+// CHECK-NEXT: "[[PREFIX]]/frameworks/A.framework/Modules/module.modulemap"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/tu.m"
// CHECK-NEXT: }
diff --git a/clang/test/ClangScanDeps/modules-implementation-module-map.c b/clang/test/ClangScanDeps/modules-implementation-module-map.c
index b7637d0c9143a..a7170aab2448c 100644
--- a/clang/test/ClangScanDeps/modules-implementation-module-map.c
+++ b/clang/test/ClangScanDeps/modules-implementation-module-map.c
@@ -28,7 +28,8 @@ framework module FWPrivate { header "private.h" }
// CHECK: "-fmodule-name=FWPrivate",
// CHECK: ],
// CHECK: "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/tu.m"
+// CHECK-NEXT: "[[PREFIX]]/tu.m",
+// CHECK-NEXT: "[[PREFIX]]/frameworks/FW.framework/Modules/module.private.modulemap"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/tu.m"
// CHECK-NEXT: }
diff --git a/clang/test/ClangScanDeps/modules-implementation-private.m b/clang/test/ClangScanDeps/modules-implementation-private.m
index b376073f4b9ee..210fbfb424aca 100644
--- a/clang/test/ClangScanDeps/modules-implementation-private.m
+++ b/clang/test/ClangScanDeps/modules-implementation-private.m
@@ -65,7 +65,8 @@
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/tu.m",
// CHECK-NEXT: "[[PREFIX]]/frameworks/FW.framework/PrivateHeaders/Missed.h",
-// CHECK-NEXT: "[[PREFIX]]/frameworks/FW.framework/Headers/FW.h"
+// CHECK-NEXT: "[[PREFIX]]/frameworks/FW.framework/Headers/FW.h",
+// CHECK-NEXT: "[[PREFIX]]/frameworks/FW.framework/Modules/module.modulemap"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/tu.m"
// CHECK-NEXT: }
|
|
I don't think it is worth creating a separate test case as it seems to be covered by existing ones. |
| // CHECK-NEXT: "[[PREFIX]]/Inputs/header3.h", | ||
| // CHECK-NEXT: "[[PREFIX]]/Inputs/header.h" | ||
| // CHECK-NEXT: "[[PREFIX]]/Inputs/header.h", | ||
| // CHECK-NEXT: "Inputs/module.modulemap" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little bit suspicious that here we use a relative path to the modulemap. But it is the same value we provide on the command line. cc @benlangmuir as the author of 73dba2f.
|
Ping. I'd like to know if it aligns with the architecture and direction of ModuleDepCollector.cpp. This is a pretty simple change but I don't know how it fits into a bigger picture. |
| .getModuleMapFileForUniquing(CurrentModule)) { | ||
| CI.getFrontendOpts().ModuleMapFiles.emplace_back( | ||
| CurrentModuleMap->getNameAsRequested()); | ||
| Consumer.handleFileDependency(CurrentModuleMap->getNameAsRequested()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All other file dependencies go through addFileDep which handles making the file path match the expected style (absolute, with native path separators). That's why you're seeing a relative modulemap path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. Switched to addFileDep as this looks like a common approach and Consumer.handleFileDependency is invoked in 1 place only.
| .getModuleMapFileForUniquing(CurrentModule)) { | ||
| CI.getFrontendOpts().ModuleMapFiles.emplace_back( | ||
| CurrentModuleMap->getNameAsRequested()); | ||
| Consumer.handleFileDependency(CurrentModuleMap->getNameAsRequested()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be just ->getName() - the build system won't know how to map between the as-requested path and the on-disk path.
I'd also prefer if we did this in ModuleDepCollectorPP::EndOfMainFile(). This function should just modify the CI with the information collected so far, not perform more logic and reporting. It may also get called multiple times for a driver command, and re-invoking the Consumer doesn't make sense IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the module map dependency reporting to ModuleDepCollectorPP::EndOfMainFile. Thanks for pointing out the better structuring.
I don't know, I'm still sticking with ->getNameAsRequested() because of 73dba2f. But don't have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @jansvoboda11 is right. The difference here is that this is an output path for consumption by external tools, which don't know how to map the requested path to the real on-disk location, while the commit you referenced is changing an input to the compiler, which needs to have the virtual path and knows how to map it as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested ->getName() and it fails with my use case. I'm going to try to add a test capturing this difference as the current tests pass with both getName and getNameAsRequested.
I suspect the problem might be with symlinks where getName follows a symlink and getNameAsRequested doesn't. So when we have getNameAsRequested on the command line we cannot find a module map at that location as reproducers don't capture symlinks. Maybe reproducers should capture the symlink structure but so far it wasn't required.
| if (OptionalFileEntryRef CurrentModuleMap = | ||
| PP.getHeaderSearchInfo().getModuleMap().getModuleMapFileForUniquing( | ||
| CurrentModule)) | ||
| MDC.addFileDep(CurrentModuleMap->getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With ->getName() I have
Failed Tests (1):
Clang :: ClangScanDeps/modules-current-modulemap-file-dep.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failure is expected. The "file-deps" field is intended for consumption by tools that might not understand Clang's virtual filesystem overlays, so the only thing they can reason about is the on-disk path for files ("external-contents" in the VFS overlay file). That's of course different from what we put on the command-line for Clang itself, as Ben explained above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's take a step back. What use cases do you have for "file-deps"? Because I've realized that for reproducers I'm not interested in the output paths to process them somehow. Actually, I want to take the paths and provide them to the compiler as the input.
I'm considering if we should have 2 flavours of file dependencies - one before all the file system remappings and one after clang applied the remappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build system uses file-deps to check their modification time during incremental builds and figure out what needs to be recompiled.
What do you mean by input and output paths? All file dependencies are inputs currently.
How do you plan to feed the list of files to the compiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by input and output paths? All file dependencies are inputs currently.
In my mind the difference between input and output paths is after the remappings and before the remappings.
For example, a VFS overlay claims a file at "include/a.h" maps to "build/b.h". The input path is "include/a.h" as that's what the compiler is using and is able to resolve. The output path would be "build/b.h" which corresponds to a concrete file on disk that can be copied around, checked for modifications, etc. This difference is more relevant when input paths are purely virtual and don't exist on disk. But in my testing both "include/a.h" and "build/b.h" would exist on disk [with the same content] so using input paths makes a lot of sense.
How do you plan to feed the list of files to the compiler?
Basically, I have a reproducer root where I copy files from "file-deps" and use VFS overlay with 'overlay-relative': 'true' to make those files accessible through the old absolute paths.
For example, if clang uses a file "/sources/include/CompilerInvocation.h" it would get copied to "/tmp/vfs/sources/include/CompilerInvocation.h". And I'll have "/tmp/vfs/overlay.yaml" telling to read "/sources/include/CompilerInvocation.h" from "VFS-directory/sources/include/CompilerInvocation.h". Specific paths are different but I've simplified them for the sake of the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The better situation might be to mention in file dependencies both input and output paths. This way you can use an actual file and have in VFS "input -> output" mapping. Though it seems more correct to mention in file-deps all the [intermediate] remappers as their change is supposed to invalidate the build. But I haven't really considered how it can be represented, what is the effort to do so, and what are the downsides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that makes sense. I think we can make the scanner also produce what you call input paths. I wouldn't want to do that by default in the scans during the actual build, since it implies a bunch of extra allocations for the paths that (I'm assuming) aren't needed during a normal (non-crashing) build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have experimented with it a little bit and looks like having output paths (like now) and remappers (specifically VFS overlays) in file-deps is enough. With these clang can find which output path corresponds to an input path.
These are preliminary results, I'm going to implement a more appropriate solution in this direction and see how it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After landing #167824 ->getName() aka "output path" works correctly with the reproducers. So let's go in this direction.
|
There was some CI breakage, details are in #162384 Got to merge "main" after the fix was made. |
|
Ping. I believe I've addressed all the comments, especially |
jansvoboda11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
🐧 Linux x64 Test Results
|
…dencies. (llvm#160226) When we add the module map describing the compiled module to the command line, add it to the file dependencies as well. Discovered while working on reproducers where a command line input was missing in the captured files as it wasn't considered a dependency.
…dencies. (llvm#160226) When we add the module map describing the compiled module to the command line, add it to the file dependencies as well. Discovered while working on reproducers where a command line input was missing in the captured files as it wasn't considered a dependency.
When we add the module map describing the compiled module to the command line, add it to the file dependencies as well.
Discovered while working on reproducers where a command line input was missing in the captured files as it wasn't considered a dependency.