-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More extensive record-keeping during module loading/compilation #23898
Conversation
This allows one to obtain the file dependencies of a non-precompiled package.
a23e416
to
ec870cd
Compare
6b2306d
to
2163e3d
Compare
@vtjnash, if possible it would be great to have you look this over, since this makes a number of changes to the format of the cache files. |
9a1d74c
to
15569d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is time that we document the cachefile format.
IIUC this now also stores the source text of a dependent files in the cache file so that it is easier accessible for down-stream consumers?
@test map(x -> x[1], sort(deps)) == [Foo_file, joinpath(dir, "bar.jl"), joinpath(dir, "foo.jl")] | ||
@test map(x -> x[1], sort(discard_module.(deps))) == [Foo_file, joinpath(dir, "bar.jl"), joinpath(dir, "foo.jl")] | ||
srctxt = Base.read_dependency_src(cachefile, Foo_file) | ||
@test !isempty(srctxt) && srctxt == read(Foo_file, String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test for a manually included dependency. If I understand one could end up parsing the list of dependent files and end up with an ("__external__", "external.so", mtime)
and `Base.read_dependency_src(cachefile, "external.so") should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but IMO that's fine. This is an internal method and consumers can make sure they don't try to read "__external__"
dependencies. (Reading the src text is not something that's done unless it is requested.) Revise doesn't intend to track external dependencies, because it's hard to know what you're supposed to do about them.
But if you know of packages that are using this feature, I would be happy to go look at how they're using external dependencies; that could end up changing my mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything that would lead to a stale cachefile should probably prompt Revise to to reload that file :).
In my use cases I have stateful information from that external dependency (if it exists or not, API version, ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would love a link to your use cases, if they are in public repos.
It would be fine to trigger revise(module)
when an external dependency updates, but it seems like it can't be robust: what if the external dependency is a single file in another package? How is Revise supposed to know whether it has to revise that package first? (In some cases it wouldn't but in others it would be necessary.)
In either case, I don't think Revise will need access to the source text of external dependencies. If, as you say, it's an .so
file, Revise can't do anything with it anyway. In fact it occurs to me that we probably shouldn't store external dependencies in the src-text cache; it would be crazy to cache, say, libkde.so
in a *.ji
file 😄.
I will change the format so that each source-text file has its name at the beginning, so that one doesn't have to "count" from the list of dependencies while skipping over #__external__
dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if that might happen ;) yeah that would be crazy.
base/loading.jl
Outdated
@@ -230,7 +234,7 @@ This is only needed if your module depends on a file that is not used via `inclu | |||
no effect outside of compilation. | |||
""" | |||
function include_dependency(path::AbstractString) | |||
_include_dependency(path) | |||
_include_dependency("__external__", path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently __external__
is a perfectly valid module name, albeit odd.
Solutions I see:
- make
__external__
a reserved module name. - add a flag to signify a external file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about I change it to #__external__
? That seems pretty much impossible unless someone goes to heroic effort to generate it, @eval using $(Symbol("#__external__"))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good even though I would love a verification step when we write out the module, but maybe that is easier to add to Revise.
This ensures that it's possible to re-evaluate modified code on a file-by-file basis (as with Revise)
15569d9
to
7476cbc
Compare
Barring objections, I'm planning to merge this by the end of the week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
src/dump.c
Outdated
ios_write(&f, jl_string_data(dep), slen); | ||
posfile = ios_pos(&f); | ||
write_uint64(&f, 0); // placeholder for length of this file in bytes | ||
ios_file(&srctext, jl_string_data(dep), 1, 0, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs error handling. Alternatively, might be nice to store the contents as they are used. Also might be nice to use some standard format. For example, LLVM provides llvm::writeArchive
(http://llvm.org/doxygen/ArchiveWriter_8h_source.html#l00041) for copying a list of files (paths and/or buffers) into the ar format (although LLVM may need a small patch to disable throwing an error when the symbol table cannot be created when it is not needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added some simple error handling, hopefully it's what you had in mind.
The LLVM interaction seems a little scarier, since I'm less familiar with it. Dumb question: how do you even call out to LLVM from a *.c
file?
The thing I like about storing this in the *.ji
file is that they are guaranteed to be validated/invalidated together. Otherwise, I agree with you that a standard format has advantages.
7476cbc
to
ef28c11
Compare
I was looking at whether I should call |
Currently Revise has to try to parse source code to figure out what files a package depends on and which module a file should be evaluated into. This is fragile. The two commits here should allow Revise to work sensibly on Julia 1.0:
include
so that other packages can log these dependencies (important for__precompile__(false)
packages)__precompile__(true)
packages)EDIT: now it also saves the raw source text (as a string) to the cache file, addressing #23448.