Improving package load times #4373
I'm against adding features to deal with this. We just have to make it faster.
Features like pre-compiling things are reasonable, however.
I'm with Jeff on this one. I also disagree that code generation is a good way to go about wrapping APIs, but that's a different can of worms. We should just get to static pre-compilation already ;).
At least a couple seconds (I would guess maybe 3-4 out of the 10 seconds) are just in the front end. We could create
Similar to Python's .pyc files? Please don't make jlcache usable without the source file without renaming the cache file. I once spent 3 hours figuring out why a Python file caused trouble after it was deleted.
Don't worry – we've all been bitten by that and will not make the same mistake.
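The failure mode described above (a stale cache outliving its deleted or edited source) can be avoided by validating the cache against the source before trusting it. A minimal Python sketch of that invalidation logic, with illustrative file-naming conventions that are not from any actual Julia implementation:

```python
import hashlib
import os

def source_digest(src_path):
    """Hash the source file; this hash is the cache validity key."""
    with open(src_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def cache_is_usable(src_path, cache_path, digest_path):
    """A cache file is usable only if the source still exists and its
    current digest matches the digest recorded when the cache was built."""
    if not os.path.exists(src_path):
        # Source was deleted: never silently keep serving the cache.
        return False
    if not (os.path.exists(cache_path) and os.path.exists(digest_path)):
        return False
    with open(digest_path) as f:
        recorded = f.read().strip()
    return recorded == source_digest(src_path)
```

Keying on a content hash rather than only a timestamp also catches the case where a file is restored or copied with an old mtime.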
The LLVM blog describes using the MCJIT to cache and pre-compile objects (http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html). "However, MCJIT provides a mechanism for caching generated object images. Once we’ve compiled a module, we can store the image and never have to compile it again. This is not available with the JIT execution engine and gives MCJIT a significant performance advantage when a library is used in multiple invocations of the program." If the MCJIT and JIT can talk to each other, would this be a promising route?
Yes, that's the idea. The problem here is not that we don't know what needs to be done; we do. The problem is that it is a significant amount of work and nobody has done it yet.
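The mechanism the blog post describes boils down to a simple contract: after compiling a module, hand the object image to a cache; before compiling, ask the cache for an existing image and skip codegen on a hit. A toy Python sketch of that contract follows; the names mirror LLVM's `ObjectCache` interface loosely, and the "compilation" step is a stand-in, not real codegen:

```python
class ObjectCache:
    """Toy in-memory analogue of an MCJIT-style object cache:
    stores compiled object images keyed by module identity."""
    def __init__(self):
        self._images = {}
        self.compiles = 0  # counts real compilations, for demonstration

    def get(self, module_key):
        """Return a cached image, or None on a miss."""
        return self._images.get(module_key)

    def notify_compiled(self, module_key, image):
        """Record a freshly compiled image for later reuse."""
        self._images[module_key] = image

def jit_compile(module_key, source, cache):
    """Return an 'object image' for the module, compiling only on a miss."""
    image = cache.get(module_key)
    if image is None:
        cache.compiles += 1
        image = ("compiled:" + source).encode()  # stand-in for codegen
        cache.notify_compiled(module_key, image)
    return image
```

Persisting `_images` to disk, keyed by a source hash, is what turns this into cross-process precompilation.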
It seems like there are several issues discussed here that are already open elsewhere. In terms of package loading times, there is the usable, though fairly undocumented, technique using What else would be needed to close this issue?
I don't think this can be closed yet. The IMHO a nice solution would be if any module (or package) could be compiled into its own .so file that is cached somewhere. If this is feasible, one could autogenerate these files on either the installation of a package or when
Requiring users to modify Julia source (e.g., The system should work out of the box, which means that packages should be able to load reasonably fast without any modification to Julia base files. I agree that we should probably cache pre-compiled images, invalidating the cache whenever the source changes in any way.
Can we put heads together and figure out some form of package caching/precompilation first thing after getting LLVM 3.5 running?
+1
+100. It's so bad that we are actually hesitant to merge a very large PR into JuMP because of the impact on loading times...
Gadfly doesn't really need Datetime or DataFrames, which cuts out a big part of that graph. I suspect other parts aren't really required by the core of Gadfly either.
Oh, if it doesn't need DataFrames then that'd be pretty cool.
#7977, which was on the 0.4-projects milestone, was closed as a dup. Should this be put on the 0.4-projects milestone in its stead?
Is putting this as milestone 0.5 up for discussion? I'm not doing the work myself, so I don't want to dictate priorities, but I'm kinda distressed by the idea that package loading will be this slow until... December?
sounds like we need a faster 0.4 release then? there's more than one way to skin a cat :) but, in fact, i tried to remove a lot of "nice to haves" from the 0.4 list specifically to help with schedule
Please do not edit the 0.4 milestone without discussion.
Should we centralize an overall scope discussion somewhere, either as an issue or on julia-dev?
My own personal opinion: of things that seem "close" for 0.4 (#8745), in terms of importance I'd say there's nothing even remotely in the same league. A substantial improvement in package loading times would be my vote for the very first item mentioned in the announcement for whatever release this makes it into (unless the debugger gets merged in the same release). If there's stuff we can do to help, please do let us know. I guess I should start checking out #8745 and playing with it.
(I should add that multithreading might also be a competitor for the top spot...)
My take: if it's 3 months more time, we should release 0.4 without precompilation and make it the only goal for 0.5, to be released asap. Maybe those who have the skills to finish it could comment on a realistic time schedule (not me).
If it's a non-breaking feature, it could even be introduced in a minor release.
Faster package loading is no longer just a "nice to have" feature. It's frankly quite difficult to claim that Julia is a fast language and then have the second thing a user tries (after
@jiahao: I think we all agree on that. But we still have to do realistic release planning. It does not help to wait for a feature when it is not realistic to get it in. So it would be good if Jameson, Jeff, and Keno make a clear decision here.
That's just as much the case now as it was a year ago. We're rate limited on implementation labor (and code review? in the cases of already-open PRs) for big core features that everyone knows need to be done. Decisions and/or plans should probably start to be made pretty soon whether the 0.4.0 roadmap is going to be feature-defined or schedule-defined. If the former, by which features (and expect it to take a while), or if the latter, by what target (and expect it to not have as many finished features as everyone would like).
I thought that there was agreement on a time-based schedule (e.g. by @StefanKarpinski https://groups.google.com/d/msg/julia-users/aqGvjGLVaLk/CI7p8R8XZGEJ)
Any updates on the current status of improving package load times? I'm interested to see if I could help with anything. My LLVM skills are getting rusty and it'd be good to have an excuse to work on them again. ;) Would that be a question to ask @vtjnash directly?
agreed. this issue doesn't have much to do with llvm. but see #9336 for a list of llvm36 issues to burn down. helping improve/fix debugging info would also be a huge win (tracking inline functions for line number tables and emitting the debugging symbol tables for seeing the values of local variables in lldb/gdb).
Keno also has a pile of LLVM patches up for review that are moving slowly, not sure if us bumping them will make them go any faster.
@ViralBShah Great, sounds like some work on the LLVM debugging info would be helpful. I will need to look into how the Julia front-end handles the debugging info. @vtjnash, it looks like the serialization code is only serializing the AST, or did I miss how it's serializing the JIT'ed code?
For debug info, look at "step 5" in (other questions should probably go to julia-dev)
Is there an open issue for this? This is still laughably slow. Even after precompilation. Even after it is already loaded!
This is on a fairly high-spec MacBook Pro. I never had to wait 10 seconds to make a plot in Matlab...
There are several open issues for it; see the "latency" label. |
Some background first:
To improve the above, I just wanted to sound out if either of the following approaches is feasible/makes sense:

Approach 1

`Base` provides a function `syms_on_demand(syms::Vector{Symbol}, load_sym_cb::Function)`. A package calls `syms_on_demand` with a list of symbols that it wants to be defined/loaded only when used. This list is recorded by the julia interpreter. `load_sym_cb(s::Symbol)` is a callback that executes an appropriate `@eval` for the specified symbol.

Approach 2

A macro `@eval_on_demand <symbols> <code block>` does the same as above, i.e., it registers the symbols (and associates them with the particular module), but does not evaluate the code block till required.

I am not familiar with the intricate details/issues related to code generation, but just thought I'd put this up for discussion. Also, any alternate suggestions for improving the load time of AWS.jl are welcome.