Increase performance and decrease memory footprint #539
Comments
IrDsymbol::resetAll takes up 13% of processing time when passing several modules! Optimized that. The real problem is memory consumption. The command allocates 12GB of memory!
Is the impact of using
I also thought about that, but it's a vector of pointers to IrDsymbol spread out in the Dsymbols of the AST.
Ah yes, of course – makes sense because of locality, I guess.
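For readers following the container discussion, here is a minimal sketch of the "vector of pointers" idea: every instance registers itself in one global vector, and resetAll walks that vector. The names `IrSymVec`, `irState`, `registry` and `liveCount` are hypothetical stand-ins, not LDC's actual code. The back-to-front search on removal is an assumption based on the observation later in this thread that removals are mostly from the back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for IrDsymbol; the real struct carries IR state.
struct IrSymVec {
    int irState = 0;  // placeholder for per-symbol codegen state

    IrSymVec() { registry().push_back(this); }
    IrSymVec(const IrSymVec &other) : irState(other.irState) {
        registry().push_back(this);
    }
    IrSymVec &operator=(const IrSymVec &) = default;

    ~IrSymVec() {
        std::vector<IrSymVec *> &r = registry();
        // Search from the back: recently created symbols tend to die first.
        auto it = std::find(r.rbegin(), r.rend(), this);
        assert(it != r.rend());
        // Order does not matter, so overwrite with the last entry and pop.
        *it = r.back();
        r.pop_back();
    }

    // Clear the per-symbol state of every live instance.
    static void resetAll() {
        for (IrSymVec *s : registry())
            s->irState = 0;
    }

    static std::size_t liveCount() { return registry().size(); }

private:
    // Function-local static avoids initialization-order problems.
    static std::vector<IrSymVec *> &registry() {
        static std::vector<IrSymVec *> instances;
        return instances;
    }
};
```

The pointers are contiguous, but the objects they point to are still scattered across the AST, so resetAll still touches one cache line per symbol.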
OK, as expected, clang is able to use vector instructions either way and VS isn't.
@Trass3r: As these are non-virtual structs anyway, I don't really care either way, let's get this in. Do you have any concrete data points on how much of an impact the change has overall?
For the command above, it reduces runtime by 10-15% (370s vs 310s).
@Trass3r: You edited your earlier post, right? If not, I now feel pretty stupid for asking for numbers. ;) Regarding memory usage, the same problem occurs with DMD, right? (i.e., you mean frontend semantic analysis with "first phase")
Yeah I enhanced that post right after I put it in ;)
Has anyone ever evaluated the "quality" of the IR emitted by ldc? The better it is, the less effort for LLVM.
Not particularly thoroughly, no. I have been noticing a few issues such as unnecessary allocas/loads being emitted, but never really worked on tracking them down, as I would always discover them while working on some serious codegen bug. Improvements in this area would also directly benefit debug executable performance (i.e. -O0 builds).
I started a rough comparison between ldc and clang some time ago. Basically, the IR generated by clang at
@redstar: Clang must only do this in simple cases then. Duplicating SSA formation (mem2reg, … ) in each frontend doesn't make a lot of sense, and I clearly remember Chandler saying that in several of his talks.
That being said, maybe we can easily track whether we assign a parameter during semantic analysis and elide some loads/stores if we don't – I'd need to have a look at the code to be sure. Then again, IIRC the ABI transformation stuff is also tied into the parameter allocas.
I've posted an alternative to std::set / std::list / std::vector for IrDsymbol: https://github.com/Safety0ff/ldc/compare/intdlist
Ok, I've fixed the bug, I used the same command as Trass3r minus std/datetime.d (I only have 11 gigs of RAM.)
Trass3r's code gives a time of: 3m59s
Alright, I created an intrusive SList version, using Trass3r's conclusion that removals aren't random, they're mostly from the back: https://github.com/Safety0ff/ldc/compare/intslist
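To illustrate the intrusive singly linked list idea, here is a rough sketch; it is not the code from the linked branch. The names `IrSymNode`, `irState`, `head` and `liveCount` are hypothetical. It assumes new nodes are pushed at the head, so that removing a recently created node only walks a few links.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for IrDsymbol with an embedded (intrusive) link.
struct IrSymNode {
    int irState = 0;            // placeholder for per-symbol codegen state
    IrSymNode *next = nullptr;  // link lives inside the object: no extra allocation

    // Push onto the front of the global list.
    IrSymNode() : next(head()) { head() = this; }
    IrSymNode(const IrSymNode &) = delete;
    IrSymNode &operator=(const IrSymNode &) = delete;

    // Unlink by walking from the head; the newest nodes are found first.
    ~IrSymNode() {
        IrSymNode **link = &head();
        while (*link != this) {
            assert(*link != nullptr);  // every live node must be in the list
            link = &(*link)->next;
        }
        *link = next;
    }

    // Clear the per-symbol state of every live instance.
    static void resetAll() {
        for (IrSymNode *n = head(); n != nullptr; n = n->next)
            n->irState = 0;
    }

    static std::size_t liveCount() {
        std::size_t count = 0;
        for (IrSymNode *n = head(); n != nullptr; n = n->next)
            ++count;
        return count;
    }

private:
    static IrSymNode *&head() {
        static IrSymNode *h = nullptr;
        return h;
    }
};
```

Removal is O(1) when the newest node dies first and degrades to a linear walk for old nodes, which is the trade-off the "mostly from the back" observation is meant to justify.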
An intrusive linked list is indeed the optimal structure for this (?!)
EDIT: nvm.
Yes 1. Some template symbols.
Hmm, yea.
Actually, this might be an indication that the design needs to be rethought. OTOH, IrDsymbol shouldn't be accessed before codegen, and at that point, the frontend certainly doesn't create or destroy any Dsymbols anymore.
Trying to estimate what there is to gain from going full lazy, I hacked something together here: https://github.com/Safety0ff/ldc/compare/lazyhack (it's not pretty)
OK, well I found out my build settings were bonk for all my previous timings, here are the correct numbers:
I started on a lazy
Ok, I finished it, but it's pretty ugly: https://github.com/Safety0ff/ldc/compare/lazymap
Anyways, for now I think we should go with @Trass3r's vector for simplicity and later it can be decided if one of the more invasive solutions is desired.
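One possible reading of the "lazy" direction is a side table that creates IR state only when codegen first asks for it. The sketch below is an assumption about that general shape, not a reproduction of the lazyhack or lazymap branches; `Dsym`, `IrState` and `LazyIrTable` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

struct Dsym {};                       // hypothetical stand-in for a frontend Dsymbol
struct IrState { int value = 0; };    // hypothetical per-symbol codegen state

// Side table keyed by symbol address; entries appear on first access.
class LazyIrTable {
public:
    // operator[] default-constructs the state if the symbol is new.
    IrState &get(const Dsym *sym) { return table_[sym]; }

    bool has(const Dsym *sym) const { return table_.count(sym) != 0; }

    // Resetting between modules drops all state in one call.
    void resetAll() { table_.clear(); }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<const Dsym *, IrState> table_;
};
```

Under this scheme, symbols that codegen never touches cost no IR memory and need no registration or removal, at the price of a hash lookup on each access.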
Add various constants/types for Linux/PPC.
As reported on the newsgroup, LDC is around 4 times slower than DMD.
My personal experience is that LDC is really fast for small files but speed decreases dramatically with larger files, e.g. std.datetime and std.array. At the same time, LDC consumes a huge amount of memory (compare with issue #438)! A possible explanation for the decreased speed could be that the system has to swap out memory to satisfy LDC's requirements.
Memory and performance profiling must be performed to find the root cause(s).