Optimize #10
Comments
The stack trace is profiled using flamegraph for both the
What flamegraphs tell us is which functions are on the stack the most. The results are committed to the profile branch in the
Regarding
Expectedly, we see that
The most common descendants on the stack are (this does not and should not add up to 100%, since the children contribute to their parents' percentages):
Conclusion: These are the areas we should look into optimizing. Thoughts and proposed steps forward:
Would love to hear your thoughts and get feedback, @naumenkogs.
Did the compiler optimize the code for these measurements? It would be great to add these results to a README file with explanations. I think this is good enough. I think we should still focus on RAM for now, though.
Making |
Yes, it is release because flamegraph defaults to release.
Can you elaborate on these two sentences? They seem to be conflicting because the first sentence says
I have been thinking about the right path forward when it comes to optimization... When it comes to what we are optimizing for, we said that we are trying to get the RAM usage lower. However, this is still vague on my end. I want to make sure we are optimizing for a specific case. There are many, many possibilities for what systems end users will be running this program on. The systems could be anything: a 2 GB machine vs. a 4 GB machine vs. a 4 GB machine plus tons of swap, etc.
Right now, it seems like there are other things that may be more valuable, like writing more tests for correctness and getting people to use it. Once people use it, we can figure out what (if any) problems they are encountering and then optimize for those specific issues. For immediate actions, I think we should write more tests to ensure correctness. This way we ensure we are not introducing any bugs as we optimize. That being said, I want to inflate the gz files in the
In terms of loading the data in chunks in an attempt to reduce RAM usage, it might not be the right optimization to try first, but I am still curious to see the difference in the resulting flamegraphs, so we can still go for it.
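To make the "load the data in chunks" idea concrete, here is a rough sketch of streaming input one line at a time instead of holding the whole file in memory. It is not taken from asmap-rs: the `prefix|AS path` line format and the name `origins_streaming` are assumptions made up for illustration, and the real dump format may differ.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// Hypothetical record format, assumed for illustration only:
/// one `prefix|space-separated AS path` entry per line.
///
/// Reads entries one line at a time and keeps only the origin (last) ASN
/// per prefix, so peak memory follows the size of the output map rather
/// than the size of the input.
pub fn origins_streaming<R: BufRead>(reader: R) -> io::Result<HashMap<String, u32>> {
    let mut origins = HashMap::new();
    for line in reader.lines() {
        let line = line?;
        // Lines that do not match the assumed `prefix|path` shape are skipped.
        if let Some((prefix, path)) = line.split_once('|') {
            // The origin AS is taken to be the last hop in the path.
            if let Some(last) = path.split_whitespace().last() {
                if let Ok(asn) = last.parse::<u32>() {
                    origins.insert(prefix.to_string(), asn);
                }
            }
        }
    }
    Ok(origins)
}
```

Any `BufRead` works as input, for example a `BufReader` wrapped around a file or around a gzip decoder, so each line buffer is dropped before the next one is read.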
I said algorithm, meaning all the logical loops (as opposed to parse method which just converts strings into numbers). The latter may be worth optimizing, the former is too small in the overall run time to bother.
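As an illustration of the kind of string-to-number parsing meant here, a minimal sketch follows. The function name `parse_as_path` and the space-separated input format are assumptions, not the project's actual code. It parses borrowed sub-slices directly, so no intermediate `String` is allocated per hop.

```rust
/// Parses a space-separated AS path such as "3333 1103 13335" into numbers.
///
/// Works on borrowed sub-slices of `path`, so the only allocation is the
/// returned `Vec`. Returns `None` if any hop is not a plain decimal `u32`;
/// AS sets such as "{1,2}" are deliberately not handled in this sketch.
pub fn parse_as_path(path: &str) -> Option<Vec<u32>> {
    path.split_ascii_whitespace()
        .map(|hop| hop.parse::<u32>().ok())
        .collect()
}
```

Collecting into `Option<Vec<u32>>` stops at the first hop that fails to parse, which keeps the error path cheap.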
I would say we are optimizing for reducing runtime. You can obviously ignore my suggestion and optimize other things.
Yes.
I wouldn't rely on this. I would be happy if 1,000 people independently use this during 2020 (although pre-made asmap created by a Bitcoin Core maintainer with this program will be shipped with release so that's cool). And of those 1,000, I'd expect maybe 10 reports or something. Most of the people will simply give up if they face any difficulties.
I'm not sure if that works out, but yeah, sounds good if true.
I think there still may be some miscommunication here because
Everything sounds good to me.
Alright, so I was testing the script over all 25 files on this Google Cloud machine: n1-standard-2 (2 vCPUs, 7.5 GB memory, plus 10 GB of added swap). I think it's reasonable to expect a smooth workflow on this kind of hardware. After more than 6 hours your script had eaten all the memory, so I couldn't ssh back into the machine. Maybe it would have finished soon after, but since I couldn't track the progress I gave up. This is the RAM issue I was talking about. My code, on the other hand, successfully finished after 200 minutes.
Currently asmap-rs takes 15 GB of RAM; we need to get that down to 4 GB.
Rationale from @naumenkogs: "Ideally, any user should have the right to be paranoid and do everything on their own, including constructing their own asmap locally. Obviously, requiring constructing asmaps on the phone is not reasonable, but maybe comparing to compiling Bitcoin Core is reasonable. This takes 2-4Gb of RAM."
For profiling and narrowing down the inefficient code:
- benchmark feature for tests
- cargo flamegraph (requires bin to work)
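As a lightweight complement to the two tools above, a stage can also be timed by hand on stable Rust. This is only a sketch: it uses a plain wall-clock timer rather than the benchmark feature or flamegraph, and the helper name `timed` is made up for illustration.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the wall-clock time taken.
/// A single run is noisy, so this is only a coarse first look at which stage
/// is slow, not a substitute for a proper benchmark or a flamegraph.
pub fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping each stage (for example parsing versus the main loops) in `timed` gives a quick per-stage breakdown before reaching for the heavier tools.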