Added operator[] support. #4
- It is documented that find() and insert() are faster, but this exists
  mainly for convenience.

- operator[] is basically just as fast as calling find/insert for queries
  and insertions. Updates may be slower, since they are performed by

      // a unified update/insert will be nice. upsert is not doing
      // anything particularly different from this.
      // and this is currently biased towards inserting a new value
      // than updating an existing value
      while (!owner->insert(key, value) && !owner->update(key, value));

  Note that this implementation is not exactly STL compliant. To maintain
  performance and avoid hitting the hash table too many times, the
  reference object is *lazy*. In other words,

  - operator[] does not actually perform an insert. It returns a
    reference object pointing to the requested key.
  - On table[i] = val (reference::operator=(mapped_type)), an
    insert / update is called, and the reference becomes eager.
  - On val = table[i] (operator mapped_type()), a find / insert is
    called, and the reference becomes eager.
  - On table[i] (i.e. no operation performed), the destructor
    (reference::~reference()) is called immediately, in which case
    insert(key, mapped_type()) is called.

  Thus, with normal usage, this should behave pretty much exactly like a
  regular reference. Issues might occur when the lifetime of the
  reference exceeds its usual lifespan, as in

      auto i = table[i]

  where the reference object lives longer than we would like. To avoid
  this, the above is banned: making the default constructor, copy
  constructor, and assignment operator of the reference object private
  turns it into a compilation error. Annoyingly, that means

      table[i] = table[j]

  will also cause a compilation error.

- Having a fused find/insert and a fused insert/update would make the
  operator[] implementation *much* nicer, avoiding the silly lazy
  reference issue. What would be nice to have is:

      // returns mapped value if key exists, or if key does not exist,
      // inserts (key, value) and returns value.
      // equivalent to:
      // \code
      // while(1) {
      //   mapped_type qval;
      //   if (find(key, qval)) return qval;
      //   else if (insert(key, value)) return value;
      // }
      // \endcode
      mapped_type find_or_insert(key_type key, mapped_type value)

  and

      // Equivalent to:
      // \code
      // while (!owner->insert(key, value) &&
      //        !owner->update(key, value));
      // \endcode
      //
      // or,
      //
      // \code
      // upsert(key, [](auto i){return i;}, value);
      // \endcode
      // (though the upsert is not exactly faster than the while loop)
      mapped_type insert_or_update(key_type key, mapped_type value)

- Changed the update functions to templatize around the Updater. This is
  generally faster, since it permits the lambda to be inlined rather
  than type-erased through std::function.
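To make the lazy reference behaviour described above concrete, here is a minimal sketch. It is not the code from this pull request: `toy_table` is a hypothetical single-threaded stand-in built on `std::unordered_map`, and only the bool-returning `find`/`insert`/`update` shape and the three trigger points (assignment, conversion, destructor) are taken from the description. The copy ban is approximated with a private move constructor, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical single-threaded stand-in for the hash table; only the
// bool-returning find/insert/update shape is taken from the description.
template <class K, class V>
class toy_table {
 public:
  bool find(const K& k, V& out) const {
    auto it = map_.find(k);
    if (it == map_.end()) return false;
    out = it->second;
    return true;
  }
  bool insert(const K& k, const V& v) { return map_.emplace(k, v).second; }
  bool update(const K& k, const V& v) {
    auto it = map_.find(k);
    if (it == map_.end()) return false;
    it->second = v;
    return true;
  }

  class reference {
   public:
    // table[k] = v: insert, falling back to update; the reference is now eager.
    reference& operator=(const V& v) {
      while (!owner_->insert(key_, v) && !owner_->update(key_, v)) {}
      eager_ = true;
      return *this;
    }
    // v = table[k]: find, inserting a default value if the key is missing.
    operator V() {
      V v{};
      while (!owner_->find(key_, v) && !owner_->insert(key_, v)) {}
      eager_ = true;
      return v;
    }
    // Bare table[k]: nothing has touched the table yet, so insert a default.
    ~reference() {
      if (!eager_) owner_->insert(key_, V());
    }

   private:
    friend class toy_table;
    reference(toy_table* owner, const K& key) : owner_(owner), key_(key) {}
    // Private move constructor: only the table can hand out references.
    // Declaring it also suppresses the implicit copy operations.
    reference(reference&& other)
        : owner_(other.owner_), key_(std::move(other.key_)), eager_(other.eager_) {
      other.eager_ = true;  // a moved-from proxy must not insert on destruction
    }
    toy_table* owner_;
    K key_;
    bool eager_ = false;
  };

  // Does not touch the table; work is deferred to the reference object.
  reference operator[](const K& k) { return reference(this, k); }

 private:
  std::unordered_map<K, V> map_;
};

// Usage: call from main() to exercise the three trigger points.
inline void lazy_reference_demo() {
  toy_table<std::string, int> t;
  int out = 0;
  t["a"] = 5;  // assignment inserts
  t["a"] = 7;  // second assignment updates
  assert(t.find("a", out) && out == 7);
  int v = t["b"];  // conversion: key is missing, so a default is inserted
  assert(v == 0 && t.find("b", out));
  t["c"];  // bare access: the destructor inserts a default
  assert(t.find("c", out) && out == 0);
}
```

One caveat with this approach: under C++17's guaranteed copy elision, `auto r = t["k"];` compiles even though the move constructor is private, so the compile-time ban depends on the language standard in use.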
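The last point, templatizing around the Updater, can be illustrated with a small side-by-side sketch. The class `counter_table`, the function names, and the `void(int&)` updater signature are all hypothetical and chosen only for illustration; the actual signatures in the library may differ.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical single-threaded table; the function names and the updater
// signature below are assumptions for illustration only.
class counter_table {
 public:
  bool insert(const std::string& key, int value) {
    return map_.emplace(key, value).second;
  }

  // Type-erased flavour: the callable is wrapped in std::function, so each
  // invocation goes through an indirect call that is harder to inline.
  bool update_fn_erased(const std::string& key,
                        const std::function<void(int&)>& fn) {
    auto it = map_.find(key);
    if (it == map_.end()) return false;
    fn(it->second);
    return true;
  }

  // Templated flavour: Updater is deduced as the lambda's own closure type,
  // so the call to fn is a direct call the compiler can readily inline.
  template <typename Updater>
  bool update_fn(const std::string& key, Updater fn) {
    auto it = map_.find(key);
    if (it == map_.end()) return false;
    fn(it->second);
    return true;
  }

  int at(const std::string& key) const { return map_.at(key); }

 private:
  std::unordered_map<std::string, int> map_;
};
```

Both flavours are called the same way, e.g. `t.update_fn("hits", [](int& v) { v += 10; })`; the difference is only in how the callable reaches the function body.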
I like this idea of having a wrapper class around the table entry to return
Performance-wise, I think it is likely to be the same (hoping the compiler
In general, I think this is good, and it will improve the readability and
Thoughts? Manu, Dave, Michael?
On Mon, Sep 8, 2014 at 4:53 PM, Yucheng Low [email protected]
Yeah I think it would make a nice addition. Looking at it briefly, there
-Manu
On Mon, Sep 8, 2014 at 6:25 PM, Bin Fan [email protected] wrote:
Hey @ylow, Sorry I took so long to look over the pull request. I decided to rework your implementation quite a bit, so rather than asking you to re-implement the changes I made on this pull request, I'm going to send a pull request of my changes to your forked repo and you can take a look at that. If we can agree on those changes, then we can merge into the efficient repository. Thanks,
No worries. Please do go ahead. I hacked this together as a proof of concept. As mentioned in the comments, given a couple of additional primitives, it can be built quite a bit cleaner.
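The "additional primitives" mentioned here are presumably the `find_or_insert` and `insert_or_update` proposed in the description. A rough sketch of both, written as free function templates over the retry loops given there, might look like the following. `toy_table` is a hypothetical single-threaded stand-in, and the free-function form is an assumption of this sketch; the description proposes them as member functions.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical single-threaded stand-in exposing the bool-returning
// find/insert/update calls used in the description.
template <class K, class V>
class toy_table {
 public:
  using key_type = K;
  using mapped_type = V;

  bool find(const K& k, V& out) const {
    auto it = map_.find(k);
    if (it == map_.end()) return false;
    out = it->second;
    return true;
  }
  bool insert(const K& k, const V& v) { return map_.emplace(k, v).second; }
  bool update(const K& k, const V& v) {
    auto it = map_.find(k);
    if (it == map_.end()) return false;
    it->second = v;
    return true;
  }

 private:
  std::unordered_map<K, V> map_;
};

// Returns the mapped value if key exists; otherwise inserts (key, value)
// and returns value. Retries if another writer races between the two calls.
template <class Table>
typename Table::mapped_type find_or_insert(
    Table& t, const typename Table::key_type& key,
    const typename Table::mapped_type& value) {
  for (;;) {
    typename Table::mapped_type qval;
    if (t.find(key, qval)) return qval;
    if (t.insert(key, value)) return value;
  }
}

// Inserts (key, value), or overwrites the existing value; biased towards
// insertion, like the while loop in the description.
template <class Table>
typename Table::mapped_type insert_or_update(
    Table& t, const typename Table::key_type& key,
    const typename Table::mapped_type& value) {
  while (!t.insert(key, value) && !t.update(key, value)) {}
  return value;
}
```

With these two in place, operator[] assignment could call `insert_or_update` and the conversion operator could call `find_or_insert`, each as a single step.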
I merged your changes and mine into the master branch