Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

General performance improvements #200

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

General performance improvements #200

wants to merge 2 commits into from

Conversation

diegohavenstein
Copy link

Hi Simon,

I have studied the code a bit. I found several spots where a conditional is inside a loop (or several loops), where this could be avoided by using a array to store possible values and accessing the right index by evaluating a boolean expression.

I have marked the spots in the code here. This is not a real "pull request", since I did not change the code logic yet. But if you agree with me that the improvement could be noticeable in the numbers afterwards, I can proceed to change the code accordingly and push the changes afterwards to create a useful pull request.

Regards,
Diego

@sdsllitebot
Copy link
Collaborator

Can one of the admins verify this patch?

@eloj
Copy link

eloj commented Sep 12, 2014

I'm not Simon, but I'd be more compelled by seeing at least some benchmark first. It's very hard to look at code and come up with micro-optimization that actually turn out to be worthwhile optimizations in practice, doubly so when code quality must also be considered. But that's me.

@simongog
Copy link
Owner

Hi Diego,

wow, thanks for reviewing the code so carefully. As @eloj already mentioned, the code should be fast and intelligible. In the past I discovered that many of these micro-optimizations do not really improve execution time (assuming the code is compiled with optimizations) and sometimes the branch is explicitly there since reading and writing through proxy objects is actually more expensive than the branch. So we have to be careful. I think this would be an example for this:
https://github.com/diegohavenstein/sdsl-lite/blob/general_performance_improvements/include/sdsl/suffix_array_algorithm.hpp#L481

How would you replace this code?

Things like exchanging if (x > max_val) max_val=x by std::max are ok, since both versions are easy to understand. I will give you detailed comments on your proposals in a few days.

@diegohavenstein
Copy link
Author

Hi Simon,

yes, reading the proxy object regardless of the outcome of the branch could be expensive. In these cases I will not do anything, since I don't know the exact properties of these objects. But there are also lots of spots, like
https://github.com/diegohavenstein/sdsl-lite/blob/general_performance_improvements/include/sdsl/construct_sa_se.hpp#L242

where the branch could hurt. Maybe it is better if I see some profiling results first, maybe these functions are not heavily used and the optimizations are useless at the end. I will have a look these days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants