Further speedups in flisp? Possible, or not? #8310
Oh, and the C line numbers need to be taken with a grain of salt; they're close, but they sometimes seem to be off by a bit. See #7986.
For the record:
What's required to reproduce the problem? E.g., could the …
That particular file (or one like it) will come soon as part of #8235, but I'm still working on it. Until then, the auto-generated file created by #7977 (comment) might be helpful (I haven't recently generated a profile specifically for that function, can do upon request). But in my experience, flisp dominates loading time for any file or package I've tried (less so now, but still true).
Oh, and @ArchRobison, if you don't know the history in #7977, grab a cup of coffee and read from top to bottom---it's a real thriller. Jeff wrote the final chapters in commits that, as far as I know, ended with f522f8f.
I tested the package load times (timing …)
That's a pretty awesome test, @jakebolewski! Images has come out with a pretty radical overhaul since I made those measurements; in particular, the new code is riddled with ….

But in your numbers there's little that shows anything like a 2x improvement. Hmm. A little disappointing, I have to say. It almost makes me wonder if there was something wrong with my test. Did you always do each package with a cold start of julia?
I just tested Images again (after …)
In loading the file profiled above, there seem to be a couple of key recursive loops, which show up as:
Those two types of recursive calling account for almost all the time in ….
The times were measured from a cold start. Might as well profile the "user experience" rather than isolating out the dependencies.

@timholy flisp uses a virtual stack machine interpreter. SP is the stack pointer. What you were measuring is the cost of function application :-)
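For readers unfamiliar with the design: in a stack-machine interpreter, every call pays a fixed dispatch-and-frame cost, so heavily recursive Scheme code shows up in profiles as time spent in the apply path rather than in any one "hot" primitive. A toy sketch (Python, with invented opcode names; this is not flisp's actual code):

```python
# Toy stack-machine interpreter illustrating why function application
# dominates: every call pushes a return frame and re-enters the
# dispatch loop. Opcodes (PUSH, ADD, CALL, RET) are made up.

def run(code, args):
    stack = list(args)   # operand stack; "SP" is implicitly len(stack)
    frames = []          # saved return addresses: the per-call overhead
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op[0] == "PUSH":
            stack.append(op[1])
        elif op[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op[0] == "CALL":
            frames.append(pc + 1)   # remember where to resume
            pc = op[1]
            continue
        elif op[0] == "RET":
            if not frames:
                return stack.pop()
            pc = frames.pop()
            continue
        pc += 1
    return stack.pop()

# (PUSH 2) (PUSH 3) (ADD) (RET) — a leaf "function" at address 0
prog = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("RET",)]
print(run(prog, []))  # → 5
```

Every `CALL`/`RET` pair costs a frame push/pop and two extra trips through the dispatch `while` loop, which is exactly the overhead a JIT would eliminate.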
I was pretty sure that had to be it. So unless recursion proper has some overhead, then the real time-sinks are scattered throughout lines 1096-1889 of ….
Now flisp needs a JIT too!
I looked at the specific lines in ….
@jakebolewski, any idea how to solve #7280 (comment)? Since macro application appears to be done in Scheme, it's out of my territory. If we could develop a list of what other modules the parsing of a given file depends on, then the path towards merging #5061 would be easier.
I had the same thought about JITing flisp. Or rewrite the .scm parts in C++. (Yes, heresy, but it ducks the bootstrap problem that rewriting the .scm as .jl brings on.) I suspect that caching is the right answer, since skipping/reusing work beats speeding up the worker.
Macros don't really live in a file; they live in some runtime module, so invalidating a cache based on file changes doesn't seem correct to me. To me this all ties into #265.
As long as we can say "rebuild package X if package Y changes" I think we're covered. Maybe we'll rebuild more often than we strictly need to, but currently we "rebuild" on every startup, so the bar for progress is very, very low 😄. I personally don't think we need to worry about #265 to solve this problem---just a datestamp on every package and dependency chains. I suppose we could rely on the REQUIRE file, but that works only for registered packages.
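The "datestamp plus dependency chains" rule described above can be prototyped in a few lines: rebuild a package if any package in its transitive dependency set is newer than its cached build. A hypothetical sketch (function and dictionary names are invented; this is not Julia's actual caching logic):

```python
# Hypothetical datestamp-based invalidation: rebuild package `pkg` if
# any transitive dependency's source is newer than pkg's cache entry.

def needs_rebuild(pkg, cache_time, mtime, deps):
    # cache_time: timestamp when pkg's cached form was written
    # mtime:      dict package -> source last-modified timestamp
    # deps:       dict package -> list of direct dependencies
    seen, stack = set(), [pkg]
    while stack:
        p = stack.pop()
        if p in seen:
            continue
        seen.add(p)
        if mtime[p] > cache_time:
            return True          # some source changed after caching
        stack.extend(deps.get(p, []))
    return False

deps = {"Images": ["Color", "FixedPointNumbers"],
        "Color": [], "FixedPointNumbers": []}
mtime = {"Images": 100, "Color": 150, "FixedPointNumbers": 90}

print(needs_rebuild("Images", 120, mtime, deps))  # → True (Color changed at 150)
print(needs_rebuild("Images", 200, mtime, deps))  # → False (cache newer than everything)
```

As the comment notes, this errs on the side of rebuilding too often (any change in a dependency invalidates everything downstream), which matches the "low bar for progress" point: false rebuilds are still far cheaper than re-parsing everything on every startup.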
Aside from caching, I think the next improvements will come from: …
While I sympathize with the 3rd, for most packages that doesn't yet seem to be the first-order problem. Of course, once you start using the package it becomes the problem, but I think it will make many people's day if just ….

In terms of prioritizing lowering vs parsing, I found #7977 (comment) interesting, although #7977 (comment) seems to suggest the ratio might not be quite that high. Not quite sure what the difference between those tests was.

From my perspective, following Arjan's advice, I'd say we should be looking to achieve an approximate 20x speedup from where we are now---Gadfly in 1 second is my personal mantra, although I recognize we may only arrive at that point in stages. But I'm a little skeptical that tweaking the front end will take us most of the way there. My suspicion is we should implement caching first, then go back and see how much of a speedup we can get on the front end. That will still be very relevant, for all those times when the cache has to be invalidated. Time and talent permitting, I'm happy to help as best I can with this.
Ok, that is reasonable. A good first step is to save fully-expanded top-level expressions if they contain no macro calls, and otherwise just save the parsed (not expanded) form. #5061 is nearly there. It would be nice to come up with a way to cache expressions generated by ….

One case I care a lot about is the productivity of package developers like yourself; you will still have to wait to reload after making changes to a big package. That's why caching is not the be-all and end-all to me, though of course it's still good to have.
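The two-tier rule proposed here (store fully-expanded forms only for macro-free top-level expressions, parsed-only forms otherwise) can be sketched roughly as follows. This is a hypothetical illustration with invented helper names; a real implementation would walk the lowered AST rather than inspect source strings:

```python
# Sketch of the proposed two-tier expression cache.
# `contains_macro` is a crude stand-in for an AST walk that looks
# for macrocall nodes; the expression strings are placeholder source.

def contains_macro(src):
    return "@" in src            # stand-in: real check walks the parse tree

def cache_toplevel(exprs):
    cache = []
    for src in exprs:
        if contains_macro(src):
            # Macro results depend on runtime module state, so only
            # the parsed form is safe to reuse; re-expand at load time.
            cache.append(("parsed", src))
        else:
            # No macros: the fully-expanded form can be reused as-is.
            cache.append(("expanded", src))
    return cache

entries = cache_toplevel(["f(x) = x + 1", "@inline g(x) = 2x"])
print([kind for kind, _ in entries])  # → ['expanded', 'parsed']
```

The split directly addresses @jakebolewski's objection above: macro-free expressions are safe to cache aggressively, while macro-containing ones defer expansion to load time, when the defining module's state is known.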
To answer the question posed by this issue more directly: no, I do not think flisp can be sped up significantly without major effort, e.g. compiling to native code. As I said above, I think we'll get more from optimizing the code in the .scm files, or replacing flisp. Rewriting in julia has advantages beyond just speed, making that option attractive. I will close since this is more of a question than an issue in itself.
Upon further reflection, @jakebolewski's comment about #265 started making me nervous about how inlining across packages might work. Seems almost like the macro case. Will this cause trouble?

In general, some earlier benchmarking I did indicated that the Julia side of macros seems to take almost no time, although macros do slow things down. Presumably this is again on the flisp side.

And I'm very glad to hear about your interest in keeping package developers happy! I am prepared to wait after changes to my packages, and happy about getting faster loading of packages that mine depend on. Obviously it would be great to make it all fast. The news about flisp is not unexpected, and it's fine to close this.
Also, if it's possible to do the caching on a file-by-file basis, rather than package-by-package basis, then for most developers it will dramatically decrease load times of large packages. Most large packages have many files, and most changes don't touch most files. |
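File-by-file granularity would mean keeping one cache entry per source file and re-processing only files whose modification time changed. A hypothetical sketch using Python's stdlib (nothing here reflects how Julia actually implements caching; the `.jl` filenames and `stale_files` helper are invented for illustration):

```python
import os
import tempfile

def stale_files(files, cache_mtimes):
    # Re-process only files whose on-disk mtime is strictly newer than
    # the mtime recorded when their cache entry was written.
    return [f for f in files if cache_mtimes.get(f, 0) < os.path.getmtime(f)]

# Demo: two "source files"; only one is edited after caching.
with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.jl")
    b = os.path.join(d, "b.jl")
    for f in (a, b):
        with open(f, "w") as fh:
            fh.write("# source\n")
    cached = {f: os.path.getmtime(f) for f in (a, b)}  # snapshot at cache time
    os.utime(a, (cached[a] + 10, cached[a] + 10))       # simulate editing a.jl
    stale = [os.path.basename(f) for f in stale_files([a, b], cached)]

print(stale)  # → ['a.jl']
```

With this granularity, a one-file edit to a fifty-file package re-parses one file, which is the "dramatic decrease" the comment above is after.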
Since @JeffBezanson improved flisp, loading times have gone down by a factor of 2 (on top of the 2-fold improvement he got at the end of the 0.3 cycle). In case there's room for yet further improvement, here's a link to profiling `include`ing a file that takes approximately 12 seconds to load (it's autogenerated by an up-and-coming version of #8235 that outputs all the defined methods to a file). You may want to download the file locally and then view it with your browser. It takes a long time to render (minutes, see timholy/ProfileView.jl#18), but if you just wait it out until it finally finishes rendering, interaction is reasonably fast.
Bottom line: flisp still accounts for about 75% of the total time. That's a huge improvement from where we started. Is there room for further improvement, or at this point do we need to turn our attention to file caching?
Load times continue to be a favorite topic on julia-users.