-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RNET-1131: optimize many OR'd terms on UUID/ObjectId queries #7582
Conversation
Pull Request Test Coverage Report for Build james.stone_528Details
💛 - Coveralls |
src/realm/query_engine.hpp
Outdated
void combine_conditions(bool ignore_indexes) | ||
{ | ||
// Although ColKey is not unique per table, it is not important to consider | ||
// the table when sorting here because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incomplete sentence
src/realm/query_engine.hpp
Outdated
struct ConditionType { | ||
ConditionType(const ParentNode* node) | ||
: m_col(node->m_condition_column_key.value) | ||
, m_type_hash(typeid(node).hash_code()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no guarantee that type_info::hash_code() is distinct for different types (e.g. return 0;
is a legal implementation), so comparing only the hashes doesn't work. There doesn't appear to be any reason to use the hash code, though, as type_info supports ordering (via before()) and equality comparisons directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how we should deal with the problem that Thomas points out. I can see that including the type_info directly in the ConditionType makes it hard to use in the FlatMap.
Otherwise very nice.
@@ -84,7 +84,7 @@ inline ObjectId TestValueGenerator::convert_for_test<ObjectId>(int64_t v) | |||
uint64_t cur = static_cast<uint64_t>(v); | |||
for (size_t i = 0; i < 24; ++i) { | |||
value += char(hex_digits[cur % 16]); | |||
cur -= (cur % 16); | |||
cur = cur >> 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test code is used in the benchmarks and was generating many of the same values because as soon as a value is a multiple of 16 we'd subtract 0 which left all remaining values filled with 0s. This skewed results. It might affect a few of the other ObjectId benchmarks a bit compared to our historic results, but is much more realistic.
std::sort(m_conditions.begin(), m_conditions.end(), [](auto& a, auto& b) { | ||
return a->m_condition_column_key < b->m_condition_column_key; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous sorting code didn't consider the node type in the sort. This might have led to the query not being optimized/combined even if it should have been. The affected queries would have multiple conditions on the same column such as c == 1 or c == 2 or c == 3 or c > 5
. If the sort put the c > 5
node between any of the ==
nodes, then not all of the equality nodes would have been combined. The query would have evaluated correctly, just not at peak performance, so I don't think it is worth a changelog note.
{ | ||
} | ||
int64_t m_col; | ||
std::type_index m_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed to use a std::type_index
which I think should be more robust than using the hash.
I have added an optimization for RQL
|
@jedelbo could I please get your re-review on the last commit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few non-blocking details.
Fixes realm/realm-dotnet#3566
Added the ability for the UUID/ObjectId equality query node to consume conditions.
As the number of conditions increases, the outer loop to individually check each condition increases more than checking each value against a hash set of condition values. This is true even if the UUID/ObjectId property is indexed.
Benchmark results as compared to the same benchmarks on master:
☑️ ToDos
bindgen/spec.yml
, if public C++ API changed