Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement for lower/upper bound #1223

Merged
merged 2 commits into from
Dec 19, 2017

Conversation

zivsha
Copy link
Contributor

@zivsha zivsha commented Nov 8, 2017

Changed updateQueries() to use member functions of multiset instead of std::lower_bound and std::upper_bound

Without this change, playing a bag file containing over 200K messages (under a single topic) yields in a major performance hit.
Running performance diagnostic shows that most of the time is spent in std::advance and std::distance which are used inside std::lower_bound and std::upper_bound.

From cppreference - lower_bound:

Complexity
The number of comparisons performed is logarithmic in the distance between first and last (At most log2(last - first) + O(1) comparisons). However, for non-RandomAccessIterators, the number of iterator increments is linear.

Changing to std::multiset<IndexEntry>::lower_bound results in a major performance improvement.
I have not measure the improvement, but I have a bag file containing a video of a stopwatch. In my code, for each sensor_msgs::Image I try to find a matching (single) message.Just by looking at the playback, I can see that the FPS is now fast enough to show a real second, whereas without this change, 1 second in the video (time it took for the stop watch to advanced by 1 second) took about 17 seconds in the real world!

Changed updateQueries() to use member functions of `multiset` instead of `std::lower_bound` and `std::upper_bound`
@racko
Copy link
Contributor

racko commented Nov 8, 2017

Huh ... multiset::lower_bound does a tree search, which is "obviously efficient", while doing a (generic) binary search on the iterators of a tree structure is a bad idea: Even computing the mid of the range requires you to iterate over all elements to count them, which makes this operation linear instead of logarithmic :)

Changed the calls to lower/upper bounds so they dont use the "new" C++11 extended initializer lists.
@clalancette
Copy link
Contributor

This code looks OK to me, and seems to be an obvious performance improvement. @mikepurvis any thoughts?

@dirk-thomas
Copy link
Member

Thank you for the patch.

@dirk-thomas dirk-thomas merged commit bd51f97 into ros:lunar-devel Dec 19, 2017
dirk-thomas pushed a commit that referenced this pull request Feb 9, 2018
* Performance improvement for lower/upper bound

Changed updateQueries() to use member functions of `multiset` instead of `std::lower_bound` and `std::upper_bound`

* Fixed warnings for C++98

Changed the calls to lower/upper bounds so they dont use the "new" C++11 extended initializer lists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants